- Uber calculates the ETA by taking into account several factors such as turns, traffic signals, stops, and road traffic.
- Uber has a backup data center that is used in case of a DC failure. However, Uber never copies data to the backup DC.
Dispatch service:
- Uses Google's S2 library to find the cars within a specific radius.
- Google S2 takes a region and divides it into small cells of 1x1 m, which makes the region easier to manage.
- Uses consistent hashing (a ring structure) to distribute the work across servers, which make server-to-server calls to each other.
- Uses a gossip protocol so that each server knows the responsibilities of every other server.
- Advantages:
- Easy to add or remove servers.
- Balances out the load.
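The consistent-hashing idea above can be sketched in a few lines. This is a minimal illustration, not Uber's implementation: the `HashRing` class, the `server-a`/`cell-0` style names, and the choice of 100 virtual nodes per server are all assumptions made for the example. It shows the two advantages listed: adding a server only moves the keys that the new server takes over, and virtual nodes spread the load.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes per server."""

    def __init__(self, replicas: int = 100):
        self.replicas = replicas      # virtual nodes per server (assumed value)
        self._positions = []          # sorted ring positions
        self._owner = {}              # ring position -> server name

    def add_server(self, server: str) -> None:
        for i in range(self.replicas):
            pos = _hash(f"{server}#{i}")
            if pos not in self._owner:
                bisect.insort(self._positions, pos)
                self._owner[pos] = server

    def remove_server(self, server: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def server_for(self, key: str) -> str:
        """Return the first server clockwise from the key's ring position."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]


ring = HashRing()
for name in ("server-a", "server-b", "server-c"):
    ring.add_server(name)

cells = [f"cell-{i}" for i in range(1000)]
before = {c: ring.server_for(c) for c in cells}

ring.add_server("server-d")
after = {c: ring.server_for(c) for c in cells}

# Only cells taken over by the new server change owner.
moved = [c for c in cells if before[c] != after[c]]
```

Removing `server-d` again restores the earlier assignment, since the remaining ring positions are unchanged.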
Core components:
- Load Balancer
- Then the request is sent to the Kafka REST service
- Then the request is forwarded to Kafka
- Kafka sends the request to the application server and to the NoSQL/SQL DBs.
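To make the last step concrete, here is a small in-memory sketch of the log-and-offset model that lets one stream of requests feed several consumers independently. It is not the Kafka client API; the `Topic` class and the consumer names `app-server` and `db-writer` are hypothetical and exist only for this illustration.

```python
class Topic:
    """Hypothetical in-memory stand-in for a single-partition topic."""

    def __init__(self):
        self._log = []        # append-only list of messages
        self._offsets = {}    # consumer name -> next offset to read

    def publish(self, message):
        """Append a message and return its offset in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, consumer, max_messages=10):
        """Return the next unread messages for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._log[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


requests = Topic()
requests.publish({"rider": "r1", "cell": "cell-42"})
requests.publish({"rider": "r2", "cell": "cell-7"})

# Each consumer keeps its own offset, so both see every request.
for_app_server = requests.poll("app-server")
for_db_writer = requests.poll("db-writer")
```

Because offsets are tracked per consumer, a slow database writer does not hold back the application server.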
WorkFlows:
- The user makes a request: LB -> WebSocket -> Application Server.
- The Demand module within the service passes the cell to the Supply module.
- The Supply module then calls the Google S2 library to find the cars within a radius x of that cell.
- The servers communicate with each other to collect the ETAs and provide them to the Supply module, which in turn passes them to the Demand module.
- When new cities are added, servers are added along with their cell data.
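The cell-based lookup in this workflow can be sketched as follows. The example deliberately swaps S2 for a plain latitude/longitude grid, so it does not use or imitate the S2 API; `SupplyIndex`, `cell_of`, and the cell size `CELL_DEG` are hypothetical names and values chosen for illustration. It ranks cars by straight-line distance, whereas the notes describe ranking by ETA. It also ignores the poles and the antimeridian.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01           # assumed grid cell size in degrees; not an S2 value
EARTH_RADIUS_KM = 6371.0


def cell_of(lat, lng):
    """Grid cell (row, col) containing the point."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class SupplyIndex:
    """Hypothetical index of car positions, bucketed by grid cell."""

    def __init__(self):
        self._cars_by_cell = defaultdict(dict)   # cell -> {car_id: (lat, lng)}
        self._cell_of_car = {}                   # car_id -> current cell

    def update(self, car_id, lat, lng):
        """Record a car's latest position, moving it between cells if needed."""
        old = self._cell_of_car.get(car_id)
        if old is not None:
            self._cars_by_cell[old].pop(car_id, None)
        cell = cell_of(lat, lng)
        self._cars_by_cell[cell][car_id] = (lat, lng)
        self._cell_of_car[car_id] = cell

    def nearby(self, lat, lng, radius_km):
        """Car ids within radius_km of the point, nearest first."""
        km_per_deg = math.pi * EARTH_RADIUS_KM / 180.0
        lat_span = math.ceil(radius_km / (km_per_deg * CELL_DEG))
        lng_scale = max(math.cos(math.radians(lat)), 0.01)
        # One extra column as a margin, since longitude spacing varies with latitude.
        lng_span = math.ceil(radius_km / (km_per_deg * lng_scale * CELL_DEG)) + 1
        row, col = cell_of(lat, lng)
        hits = []
        for r in range(row - lat_span, row + lat_span + 1):
            for c in range(col - lng_span, col + lng_span + 1):
                for car_id, (clat, clng) in self._cars_by_cell.get((r, c), {}).items():
                    d = haversine_km(lat, lng, clat, clng)
                    if d <= radius_km:
                        hits.append((d, car_id))
        return [car_id for _, car_id in sorted(hits)]


index = SupplyIndex()
index.update("car-1", 12.9720, 77.5950)
index.update("car-2", 12.9800, 77.6000)
index.update("car-3", 13.1000, 77.7000)

close_cars = index.nearby(12.9716, 77.5946, radius_km=2.0)
```

Only the cells overlapping the search radius are scanned, which is the point of bucketing cars by cell in the first place.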
Analytics Workflow:
- Take data from NoSQL DBs and dump it into Hadoop.
- Use tools like Hive and Pig to get desired data.
Technologies:
- Supply/Demand components are written in Node.js, as it suits asynchronous messaging and is event-driven.
- NoSQL DBs like Cassandra to help with:
- Scalability
- No downtime
- Hadoop analytics tools to build analysis data.
- Spark/Storm frameworks for real-time, distributed stream analysis to detect what is currently trending.
- Logstash/Kibana with Elasticsearch for log search. Dashboards can be built to check system health.