- Caching is used in the OS, CDNs, DNS, and applications (websites such as Amazon and Google), and is used heavily in games to reduce the read/write latency of media content.
Best practices:
- Validity: keep cached data consistent with the source of truth.
- High hit rate: serve most reads from the cache.
- Cache miss handling: on a miss, load the data from the DB and populate the cache.
- TTL (time to live): expire entries after a fixed period so stale data is eventually dropped.
Features/Estimation:
- On the order of a terabyte of data to cache.
- 50K to 1M queries per second (QPS).
- Approximately 1 ms latency per request.
- LRU eviction policy.
- Close to 100% availability.
- Scalability.
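A rough back-of-envelope sketch of how these numbers translate into cluster size. The per-node figures (72 GB of usable RAM, 10K QPS per node) are illustrative assumptions, not from the notes above.

```python
import math

TOTAL_DATA_GB = 1024        # ~1 TB of data to cache (from the estimate above)
PEAK_QPS = 1_000_000        # upper end of the 50K-1M QPS range
RAM_PER_NODE_GB = 72        # assumed usable RAM per cache node (illustrative)
QPS_PER_NODE = 10_000       # assumed per-node throughput at ~1 ms latency (illustrative)

nodes_for_storage = math.ceil(TOTAL_DATA_GB / RAM_PER_NODE_GB)   # 15 nodes
nodes_for_throughput = math.ceil(PEAK_QPS / QPS_PER_NODE)        # 100 nodes

# Throughput, not storage, dominates here: the cluster needs max(15, 100) = 100 nodes.
print(max(nodes_for_storage, nodes_for_throughput))
```

Under these assumptions throughput is the bottleneck, which is typical for read-heavy caches.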
Cache access patterns:
- Write through: The write goes through the cache and then to the DB. The ack is sent back only after the data is saved in both the cache and the DB.
- Write Around:
- The write bypasses the cache and goes to the DB directly. The ack is sent back once the write to the DB completes.
- The data is not written to the cache during the write.
- On the first subsequent read, a cache miss occurs and the data is loaded from the DB into the cache.
- Write back:
- Data is written only to the cache and the ack is sent back immediately.
- A background service later syncs the data from the cache to the DB (a sketch of all three patterns follows this list).
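A minimal Python sketch of the three write paths, using plain dicts to stand in for the cache and the DB; the function names and the in-memory `db` are illustrative, not a real cache API.

```python
# Minimal illustration of write-through, write-around, and write-back.
# `cache` and `db` are plain dicts standing in for a real cache and database.

cache: dict = {}
db: dict = {}
dirty_keys: set = set()   # keys written to the cache but not yet flushed to the DB


def write_through(key, value):
    """Write to the cache and the DB; ack only after both succeed."""
    cache[key] = value
    db[key] = value


def write_around(key, value):
    """Write only to the DB; the cache is populated lazily on the next read miss."""
    db[key] = value
    cache.pop(key, None)   # drop any stale cached copy


def write_back(key, value):
    """Write only to the cache and ack immediately; a sync job flushes to the DB later."""
    cache[key] = value
    dirty_keys.add(key)


def flush_dirty():
    """Background sync job for write-back: push dirty entries to the DB."""
    while dirty_keys:
        key = dirty_keys.pop()
        db[key] = cache[key]


def read(key):
    """Read path shared by all patterns: on a miss, load from the DB into the cache."""
    if key not in cache:
        cache[key] = db[key]   # cache miss -> load from DB
    return cache[key]
```

Write-back gives the lowest write latency but risks losing un-flushed data if the cache node dies, which is why the persistence techniques later in these notes matter.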
Data structure:
- A hash table is used to implement the cache.
- We need a hashing function (x % n), key-value pairs, and buckets numbered 0 to n-1.
- Collision handling:
- Chaining: store the colliding key/value pairs in a linked list per bucket (sketch below).
- Linear probing: on a collision, store the entry in the next free bucket.
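A minimal hash-table sketch with modulo bucketing and chaining for collisions; Python lists of (key, value) pairs stand in for per-bucket linked lists to keep the example short.

```python
# Minimal hash table: hash(x) % n bucketing, chaining for collision handling.

class HashTable:
    def __init__(self, n_buckets: int = 16):
        self.n = n_buckets
        self.buckets = [[] for _ in range(self.n)]   # buckets numbered 0 .. n-1

    def _bucket(self, key) -> list:
        return self.buckets[hash(key) % self.n]      # hashing function: hash(x) % n

    def put(self, key, value) -> None:
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                             # key already present: update in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))                  # new key or collision: append to the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)                          # not present: cache miss
```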
Cache eviction policy:
How to decide when and what to remove from the hash table once the cache is full.
LRU: Least Recently Used
Implemented with a hash map plus a bi-directional (doubly) linked list: the most recently used entry sits at the head and the entry at the tail is evicted first (see the sketch below).
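A sketch of an LRU cache built exactly this way: a hash map for O(1) lookup and a doubly linked list to track recency. Class and method names are illustrative.

```python
# LRU cache: hash map (key -> node) + doubly linked list ordered from
# most recently used (head side) to least recently used (tail side).

class Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.map = {}                       # key -> Node
        self.head = Node()                  # sentinel on the most-recently-used side
        self.tail = Node()                  # sentinel on the least-recently-used side
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node: Node) -> None:
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node: Node) -> None:
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None                     # cache miss
        self._unlink(node)                  # mark as most recently used
        self._push_front(node)
        return node.value

    def put(self, key, value) -> None:
        node = self.map.get(key)
        if node is not None:                # update existing key and refresh recency
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:  # evict the least recently used entry
            lru = self.tail.prev
            self._unlink(lru)
            del self.map[lru.key]
        node = Node(key, value)
        self.map[key] = node
        self._push_front(node)
```

For example, with capacity 2: after put("a", 1), put("b", 2), get("a"), put("c", 3), the key "b" is evicted because it is the least recently used.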
Fault tolerance/Persistence in Cache:
- Regular-interval snapshot
- A service periodically takes a copy of the hash table and dumps it to a file on the hard disk (sketch below).
- Log reconstruction
- All read/write/delete operations made on the hash map are also appended, asynchronously, to a log file.
- This log file is persisted to the hard disk and replayed on top of the last snapshot to rebuild the cache after a crash.
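A small sketch of both persistence approaches: a periodic snapshot dump plus an append-only operation log that can rebuild the hash map on restart. The file names, JSON format, and synchronous writes are illustrative simplifications; a real cache would write the log asynchronously.

```python
# Persistence sketch: periodic snapshots + append-only operation log.
import json
import os

SNAPSHOT_FILE = "cache_snapshot.json"   # illustrative path
LOG_FILE = "cache_oplog.jsonl"          # illustrative path


def snapshot(table: dict) -> None:
    """Regular-interval snapshot: dump a copy of the whole hash table to disk."""
    with open(SNAPSHOT_FILE, "w") as f:
        json.dump(table, f)


def log_op(op: str, key: str, value=None) -> None:
    """Append a write/delete to the op log (done asynchronously in a real cache)."""
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")


def reconstruct() -> dict:
    """Rebuild the hash table: load the last snapshot, then replay the op log."""
    table = {}
    if os.path.exists(SNAPSHOT_FILE):
        with open(SNAPSHOT_FILE) as f:
            table = json.load(f)
    if os.path.exists(LOG_FILE):
        with open(LOG_FILE) as f:
            for line in f:
                entry = json.loads(line)
                if entry["op"] == "write":
                    table[entry["key"]] = entry["value"]
                elif entry["op"] == "delete":
                    table.pop(entry["key"], None)
    return table
```

Snapshots bound the recovery time, while the log captures everything written since the last snapshot, so the two are typically used together.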