Caching basics


  1. Caching is used in operating systems, CDNs, DNS, applications (websites such as Amazon and Google), and heavily in games, to reduce the latency of reading and writing media content.
Best practices:
  1. Validity: cached data should still match the source of truth
  2. High hit rate
  3. Handling of cache misses
  4. TTL (time to live) on cached entries
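
To make TTL and hit rate concrete, here is a minimal sketch of a cache whose entries expire after a fixed time and which counts hits and misses. The class name `TTLCache`, its methods, and the injectable `clock` parameter are all assumptions made for illustration, not part of any particular library.

```python
import time


class TTLCache:
    """Tiny cache where each entry expires ttl_seconds after it is stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested without sleeping
        self.store = {}         # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def put(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and self.clock() < entry[1]:
            self.hits += 1
            return entry[0]
        # Missing or expired: drop any stale entry and count a miss.
        self.store.pop(key, None)
        self.misses += 1
        return None

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

On a miss, the caller would normally load the value from the DB and `put` it back, so the TTL bounds how long a stale value can be served.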
Features/Estimation:
  1. Terabytes of cached data
  2. 50k to 1M queries per second (QPS)
  3. Approx. 1 ms latency
  4. LRU eviction policy
  5. 100% availability (as the design target)
  6. Scalability
Cache access patterns:
  1. Write-through: The write goes through the cache and on to the DB. The ack is sent back only after the data is saved in both the cache and the DB.
  2. Write-around:
    1. The write goes around the cache, directly to the DB. The ack is sent back once the DB write completes.
    2. The data is not written to the cache during the write.
    3. The first read of that data misses the cache; the data is then loaded from the DB into the cache.
  3. Write-back:
    1. The data is written to the cache and the ack is sent right away.
    2. A separate service later syncs the data from the cache to the DB.
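
The three write policies above can be sketched side by side. This is a minimal illustration, assuming two plain dictionaries stand in for the cache and the DB; the class `WritePolicies`, the `dirty` set, and the `flush` method (playing the role of the sync service) are hypothetical names. Dropping a stale cached copy on write-around is also an assumption, added so a later read cannot return old data.

```python
class WritePolicies:
    def __init__(self):
        self.cache = {}
        self.db = {}          # stand-in for the real database
        self.dirty = set()    # keys written to the cache but not yet to the DB

    def write_through(self, key, value):
        # Ack only after both the cache and the DB hold the value.
        self.cache[key] = value
        self.db[key] = value
        return "ack"

    def write_around(self, key, value):
        # Skip the cache on write; invalidate any stale copy (assumption).
        self.db[key] = value
        self.cache.pop(key, None)
        return "ack"

    def write_back(self, key, value):
        # Ack as soon as the cache holds the value; the DB is updated later.
        self.cache[key] = value
        self.dirty.add(key)
        return "ack"

    def flush(self):
        # What the background sync service would do periodically.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)      # cache miss: load from the DB
        if value is not None:
            self.cache[key] = value
        return value
```

The trade-off shows up in when the ack is returned: write-back is fastest but can lose data if the cache dies before `flush` runs, while write-through pays for two writes on every update.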
Data structure: 
  1. A hash table is used to implement the cache.
  2. We need a hashing function (e.g. x % n), key-value pairs, and n buckets numbered from 0 to n-1.
  3. Collision handling:
    1. Chaining: save the colliding key/value pairs in a linked list.
    2. Linear probing
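
As a rough sketch of the chaining approach, the table below hashes each key to one of n buckets and keeps colliding pairs in the same bucket. The name `ChainedHashTable` is hypothetical, and a Python list stands in for the linked list in each bucket to keep the example short.

```python
class ChainedHashTable:
    def __init__(self, n=8):
        self.n = n
        # One chain per bucket; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(n)]

    def _index(self, key):
        # Hashing function of the form x % n, giving a bucket in 0..n-1.
        return hash(key) % self.n

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))   # new key (possibly a collision)

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default
```

With n = 4, the keys 1 and 5 both land in bucket 1 and share a chain. Linear probing would instead store the second key in the next free bucket.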
Cache eviction policy: 
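The policy decides when and what to remove from the hash table.
LRU (Least Recently Used): when the cache is full, evict the entry that was used least recently.
A doubly linked list keeps the entries in order of use, and the hash table points to the list nodes.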
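
A minimal sketch of LRU eviction built from a hash map plus a doubly linked list follows. The names `Node` and `LRUCache` are chosen for illustration, and the sketch assumes a capacity of at least 1.

```python
class Node:
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity      # assumed to be >= 1
        self.map = {}                 # key -> Node
        self.head = Node()            # sentinel on the most recently used side
        self.tail = Node()            # sentinel on the least recently used side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None
        self._unlink(node)            # a read makes the entry most recent
        self._push_front(node)
        return node.value

    def put(self, key, value):
        node = self.map.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:
            lru = self.tail.prev      # least recently used entry
            self._unlink(lru)
            del self.map[lru.key]
        node = Node(key, value)
        self.map[key] = node
        self._push_front(node)
```

Both `get` and `put` run in constant time: the map finds the node, and the list moves or removes it without scanning.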

Fault tolerance/Persistence in cache:
  1. Regular-interval snapshots
    1. A service takes a copy of the hash table and dumps it to a file saved on the hard disk.
  2. Log reconstruction
    1. All read/write/delete operations made on the hash map are also appended to a log file. The logging is asynchronous.
    2. The log file is persisted to the hard disk, so the cache can be rebuilt by replaying it.
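
The two persistence mechanisms above can be combined: restore the last snapshot, then replay the log written since. The sketch below is only an illustration under several assumptions: the class `PersistentCache`, the JSON file formats, and the file paths are hypothetical; the log is written synchronously for simplicity, where the notes above call for asynchronous logging; and reads are not logged because they do not change the state.

```python
import json
import os
import tempfile


class PersistentCache:
    def __init__(self, snapshot_path, log_path):
        self.snapshot_path = snapshot_path
        self.log_path = log_path
        self.data = {}

    def _log(self, record):
        # Append one operation per line (synchronous here for simplicity).
        with open(self.log_path, "a") as f:
            f.write(json.dumps(record) + "\n")

    def put(self, key, value):
        self.data[key] = value
        self._log({"op": "put", "key": key, "value": value})

    def delete(self, key):
        self.data.pop(key, None)
        self._log({"op": "delete", "key": key})

    def snapshot(self):
        # Dump the whole table, then truncate the log it makes redundant.
        with open(self.snapshot_path, "w") as f:
            json.dump(self.data, f)
        open(self.log_path, "w").close()

    def recover(self):
        self.data = {}
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    record = json.loads(line)
                    if record["op"] == "put":
                        self.data[record["key"]] = record["value"]
                    else:
                        self.data.pop(record["key"], None)
```

A usage example, writing to a temporary directory:

```python
d = tempfile.mkdtemp()
cache = PersistentCache(os.path.join(d, "snap.json"), os.path.join(d, "ops.log"))
cache.put("a", 1)
cache.snapshot()
cache.put("b", 2)

restored = PersistentCache(cache.snapshot_path, cache.log_path)
restored.recover()   # snapshot gives "a", the log replay adds "b"
```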

