Database scaling
- Horizontal scaling means adding more machines that serve requests in parallel.
- Path 1: Stay on the SQL database. It will become slow over time; to improve it, add more RAM and use sharding, denormalization, and SQL tuning.
- Path 2: A better way to handle scale is to denormalize right from the beginning or switch to a scalable NoSQL database. Even then, you will eventually need to introduce a cache.
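To make the sharding idea from Path 1 concrete, here is a minimal sketch of hash-based shard routing. The shard names and the `shard_for` helper are hypothetical and only illustrate the idea; they are not taken from the source articles.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to DB connections.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]

def shard_for(user_id: str) -> str:
    """Pick a shard by hashing the key, so the same user always maps to the same shard."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]
```

With plain modulo routing like this, changing the number of shards remaps most keys, which is one reason sharding adds operational complexity.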
Caching
- Users will see performance degrade when a lot of data is fetched from the database. A cache should be introduced in such cases.
- Use an in-memory cache such as Redis or Memcached, not file-based caching.
- Data is stored in RAM.
- Redis can serve hundreds of thousands of reads per second.
- Writes (including increments) are fast too.
- The cache sits between the application and the data storage.
- There are two caching patterns:
- Cached database queries
- Cached objects
A. Cached database queries
- Store the query and its result in the cache.
- The query is the key and the result is the value.
- Problem: If just one column or row changes, you need to remove every cached query result that might include it. That row or column might be used by many queries and be present in many results, and it is hard to tell which ones. So it's not an ideal approach.
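The invalidation problem above can be sketched as follows. This is only an illustration: a plain dict stands in for Redis or Memcached, an in-memory SQLite table stands in for the real database, and the `cached_query` and `update_city` helpers are made-up names.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Ann', 'Oslo'), (2, 'Bob', 'Rome')")

query_cache = {}  # stand-in for Redis/Memcached: query text -> result rows

def cached_query(sql):
    """Return cached rows for this exact query text; hit the DB only on a miss."""
    if sql not in query_cache:
        query_cache[sql] = db.execute(sql).fetchall()
    return query_cache[sql]

def update_city(user_id, city):
    """Change one cell. We cannot cheaply tell which cached queries include it,
    so the safe option is to throw away every cached result."""
    db.execute("UPDATE users SET city = ? WHERE id = ?", (city, user_id))
    query_cache.clear()

cached_query("SELECT name FROM users WHERE city = 'Oslo'")
cached_query("SELECT COUNT(*) FROM users")
update_city(1, "Rome")  # both cached results are dropped, even the unaffected count
```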
B. Cached objects
- Let a class assemble its data from the database, then store the complete class instance (or the assembled dataset) in the cache.
- If a DB value changes, you only need to discard the objects built from it, not every cached entry.
- So it's the preferred approach.
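By contrast, a cached object can be invalidated by its own key. The sketch below assumes dicts as stand-ins for both the database and the cache; the `User` class, the `user:<id>` key format, and the helper names are hypothetical.

```python
from dataclasses import dataclass

# Stand-ins: one dict for the database, one for Redis/Memcached.
DB_ROWS = {1: {"name": "Ann", "city": "Oslo"}, 2: {"name": "Bob", "city": "Rome"}}
object_cache = {}

@dataclass
class User:
    id: int
    name: str
    city: str

def get_user(user_id):
    """Assemble the object from the DB on a miss and cache the whole instance."""
    key = f"user:{user_id}"
    if key not in object_cache:
        row = DB_ROWS[user_id]
        object_cache[key] = User(user_id, row["name"], row["city"])
    return object_cache[key]

def update_city(user_id, city):
    """Write to the DB, then invalidate only the object that depends on this row."""
    DB_ROWS[user_id]["city"] = city
    object_cache.pop(f"user:{user_id}", None)
```

After `update_city(1, "Rome")`, only `user:1` is evicted; `user:2` stays cached.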
What to store in cache:
- Sessions
- User activity streams (like Twitter's)
- Fully rendered blog posts
- User <-> friend relationships
Types of asynchronism
A. For mostly static data that doesn't require a lot of pre-computation:
- Website pages that are built with a framework or CMS should be pre-rendered and stored as static HTML on AWS or a CDN.
- A cron job (or similar scheduled script) does the pre-rendering and pushes the pages to the CDN.
- This makes the site very responsive and lets it handle a large number of requests.
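A pre-rendering script of the kind a cron job might run could look roughly like this. It is a sketch under assumptions: the `prerender` function, the page template, and the post structure are invented for illustration, and the upload step to AWS or a CDN is left out.

```python
import html
from pathlib import Path
from string import Template

PAGE = Template("<html><head><title>$title</title></head><body>$body</body></html>")

def prerender(posts, out_dir):
    """Write one static HTML file per post and return the paths written.

    `posts` maps a slug to {"title": ..., "body": ...}; the body is assumed
    to already be trusted HTML, the title is escaped.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, post in posts.items():
        path = out / f"{slug}.html"
        page = PAGE.substitute(title=html.escape(post["title"]), body=post["body"])
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written
```

A scheduler would call `prerender(...)` periodically and then sync the output directory to the static host.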
B. For dynamic data that requires intensive computation:
- User comes to the site and requests an operation to be performed.
- The site tells the user that it's processing the task and notifies the user once the job is done.
- When the task arrives, it is placed in a queue.
- A worker process picks up the task from the queue and processes it.
- The worker finishes the job and informs the front end (FE) about it.
- The FE receives the signal and updates the user.
- Technologies used for queuing include Redis lists, RabbitMQ, and ActiveMQ.
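The queue-and-worker flow above can be sketched in a single process. This is only an illustration: Python's `queue.Queue` and a thread stand in for a real broker (Redis list, RabbitMQ, ActiveMQ) and a separate worker process, and the `submit`, `worker`, and `results` names are hypothetical.

```python
import queue
import threading

tasks = queue.Queue()  # stand-in for a Redis list or a RabbitMQ queue
results = {}           # job_id -> result; where the front end checks for finished jobs

def submit(job_id, n):
    """Called by the web request: enqueue the job and return immediately."""
    tasks.put((job_id, n))
    return {"job_id": job_id, "status": "processing"}

def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        job_id, n = item
        results[job_id] = sum(i * i for i in range(n))  # placeholder for heavy work
        tasks.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
```

The request handler returns the "processing" acknowledgement right away; the front end later finds the finished job in `results` (in a real system, via polling or a push notification).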
Source:
http://www.lecloud.net/post/7295452622/scalability-for-dummies-part-1-clones
http://www.lecloud.net/post/7994751381/scalability-for-dummies-part-2-database
http://www.lecloud.net/post/9246290032/scalability-for-dummies-part-3-cache
http://www.lecloud.net/post/9699762917/scalability-for-dummies-part-4-asynchronism