System design - Design a system that provides cricket score in real-time

 Problem:

Design a system that broadcast cricket system score to millions of users online. 

Requirement:

Consider 10 millions of users online accessing the server at the same time. Each want to get the live update of cricket score. Latency requirement is 100 millisec not more than that. There can be n no of matches happening at the same time. 

Functional:

1. Each user can login to the system using his credentials.
2. User can go to dashboard showing multiple matches going on with basic details like run, wicket, current playing teams.
3. Once user selects a match going on it will show the details of who played runs, who is currently playing, who is serving balls etc.

Non-Functional:
1. Througput of the system should be high handling at least 1 million of connections. For this use a high end machine having more RAM and CPU.
2. Need a LB for balancing the load. Since we are going to use Server send events in HTTP as transport medium we need a L7 LB. Use least connection as algorithm used for load balancing.
3. High availability is needed so we use redudant nodes of server to handle load. Use kubernetes autohealing functionality to restart the system if a node fails. 
4. Scalability is needed if the load increases for this we auto scale if the connection/request of a node increases more than 1 million or it a node CPU goes to 80%. We can use Kubernetes horizontal scale or vertical scale features.
5. Data storage we can use postgresql since we need high write load we can use it. Mostly read will not be there since we store most of the data in cache like Redis. Same data can be broadcast to all clients.
6. If there is a server failure we need to handle all the websocket connection from that server to all other servers. Which may increase back pressure for all the systems. So we may need to keep redundant systems.

Design:
We have multiple browser that sends HTTP GET request for live score once this connection is accepted at server a HTTP connection is established between client and server. Server keeps track of all the active web clients as listeners. 
LB takes care of layer 7 load balancing hence it has intelligence to route the call to same notification server. This can use least connection with round robin as LB algorithm with sticky session to maintain the HTTP connection with same server.
All the cricket data is stored in Postgresql DB with horizontal sharding to shard data based on geo location since some of the local matches are region specific and having those data across other region will increase latency.
We use Redis for caching when user request the home page or individual matches score all the data will be intially queried from DB and stored in Redis cache. Hence subsequent request will be retrieved from cache if not found will retreive from DB. Caching strategy is write-back cache with eviction policy least recently used.
We also use Redis for Pub/Sub when the score came from the backend server Receiver which will get all the scores posted from external servers or third party servers or operator manually. Once score is received it needs to be broad cast to all the Notification servers. For this each notification server subscribe to a match topic for example Football is a topic, cricket is a topic. The details of matches and scores, wicket details etc will be received as payload. Each notification server has list of listeners which will be iterated and data will be send to all the listeners. We can group the listeners based on the matches like India Vs England etc.
In case of failure/network disconnection/server restart the client will lose the connection. So client should have a logic to retry HTTP GET for the score page user is visiting hence the a new connection will be LB and send to different server. We can keep a server redudant in passive mode so that when a server failed this server can be put into active. Use any failover/disaster recovery tool.

Flow:
When user lands to home page it will send a HTTP GET request with host-header /crickinfo/home page which will hit the notification server that queries the DB and returns the web page. Client JS script will again send HTTP GET request to /circkinfo/livescore with Accept:text/event-stream. Notification server will add the connection as a listener in a list maintained for livescore home page.This connection is persistence hence server can send live stream of data when ever it receives. When user navigates to another page we need to close the HTTP GET connection and re-establish another connection for another page /crickinfo/Indiamatches.

We can cache the data in redis hence n/w latency to fetch the data from DB can be minimized.

When there is a data from external server it will be received by Receiver Server in our backend this will do some data cleaning or reformat the data that can consumed by notification server it will publish the data to topic like Criketmatch which will be published to all notification servers. Also it will write the data to Postgresql DB for future use. Once data is received it will check for the Match details to identify the matches. Once it identifies the match will iterate the list of active listeners/web connections and send the SSE events in UTF-8 to all the clients in JSON format. We can send multiple data in single SSE event to client like for e.g in home page we can club multiple matches data and send it in one event.
















Comments

Popular posts from this blog

Decorators in python - Part 1

Class Decorator practical uses - Part 2

Public key cryptogrphy - How certificate validation works using certificate chain