Problem statement: Build a scalable chat infrastructure using Django/Python, assuming that it will have to scale horizontally.
I am still waiting on the other specifics about the product expectations, but here is what I understand matters:
a. Acceptable Latency between messages/events
b. Number of open connections/user possible per server
c. Size of messages/payload (more on this in a minute).
This is what my solution needs to look like (generic, not specific to django/python)
Client Web Socket Component (CWSC)
A group of servers that can:
a. Accept an incoming user's connection request over the WebSocket protocol and initiate a sticky session.
b. Accept messages/events from a user.
c. Receive messages/events addressed to the connected user and pass them on to the frontend client.
d. Store the messages in a persistent db (PostgreSQL).
e. Broadcast the message to a pub/sub style channel for other servers to listen to.
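To make responsibilities (b), (d) and (e) concrete, here is a minimal sketch of what a single server might do when a user sends a message. Everything in it is hypothetical: `handle_incoming`, the `db` list standing in for PostgreSQL, and the `publish` callable standing in for the pub/sub channel are illustrative names, not part of any library.

```python
import json
import time


def handle_incoming(sender, recipient, text, db, publish):
    """Persist a chat message, then hand it to the pub/sub layer.

    db      -- hypothetical stand-in for PostgreSQL (any object with .append)
    publish -- hypothetical stand-in for the pub/sub channel (a callable)
    """
    message = {
        "sender": sender,
        "recipient": recipient,
        "text": text,
        "sent_at": time.time(),
    }
    db.append(message)            # step (d): store before broadcasting
    publish(json.dumps(message))  # step (e): broadcast to other servers
    return message


stored, published = [], []
handle_incoming("alice", "bob", "hi", stored, published.append)
```

Storing before publishing is one possible ordering; it means a message that reaches other servers is already persisted, at the cost of adding the db write to the delivery latency.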
Central Redis User-Server Map
Since we are assuming a distributed system, two users might not be connected to the same server. In that case we need to know the receiving user's server so that it can be added as metadata to the message broadcast by the CWSC. The receiving server can use this metadata to filter for the messages meant for it, since the pub/sub channel broadcasts to all servers subscribed to a particular channel.
This db would hold a mapping of user to server. Whenever a new connection request happens (step a above), the server would also add an entry to the Redis db.
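As a rough illustration of that map, here is a sketch that uses a plain dict in place of Redis so it runs on its own. The class name `UserServerMap`, its methods, and the `target_server` metadata key are all assumptions made for this example; against a real Redis instance the three operations would correspond to simple SET, GET and DEL commands on a per-user key.

```python
class UserServerMap:
    """In-memory stand-in for the central Redis user-to-server map."""

    def __init__(self):
        self._servers = {}  # user_id -> server_id

    def register(self, user_id, server_id):
        # Called when a server accepts a user's WebSocket connection.
        self._servers[user_id] = server_id

    def unregister(self, user_id, server_id):
        # Only remove the entry if it still points at this server, so a stale
        # disconnect cannot erase a newer connection made on another server.
        if self._servers.get(user_id) == server_id:
            del self._servers[user_id]

    def lookup(self, user_id):
        # Returns None when the user is not connected anywhere.
        return self._servers.get(user_id)


user_map = UserServerMap()
user_map.register("bob", "server-2")

# The sending server tags the message with the recipient's server before broadcasting.
message = {"recipient": "bob", "text": "hi"}
message["target_server"] = user_map.lookup(message["recipient"])
```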
Pub/Sub Channel
This is where messages/events are broadcast. Assuming, for simplicity, only one type of event, there would be only one channel, which all servers would subscribe to. Other channels could carry other event types (message-read events, picture-sending events, etc.).
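The broadcast-then-filter idea can be sketched as follows, again with an in-memory stand-in rather than a real pub/sub backend. `Broker`, `Server`, and the `target_server` key are hypothetical names for this example only.

```python
class Broker:
    """In-memory stand-in for a pub/sub channel that fans out to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


class Server:
    def __init__(self, server_id, broker):
        self.server_id = server_id
        self.delivered = []  # messages passed on to locally connected clients
        broker.subscribe(self.on_message)

    def on_message(self, message):
        # Every server sees every message; drop the ones meant for other servers.
        if message.get("target_server") != self.server_id:
            return
        self.delivered.append(message)


broker = Broker()
server_1 = Server("server-1", broker)
server_2 = Server("server-2", broker)
broker.publish({"recipient": "bob", "text": "hi", "target_server": "server-2"})
```

With a single shared channel, every server pays the cost of receiving every message, so the filtering work grows with total traffic rather than with each server's own users. A channel per server would be one way to avoid that, at the price of more subscriptions to manage.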
Here is my understanding of how this can be done using django-channels:
1. Django Channels is the CWSC component here. It provides a handy framework for writing the implementation.
2. The pub/sub channel is the channel layer concept within django-channels, using channels_redis as the backend.
3. The central Redis user-server map is just something I came up with for my design and can be a very simple EC2 server running Redis.
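For point 2, the Redis-backed channel layer is enabled through the `CHANNEL_LAYERS` setting, using the package published as `channels_redis`. A minimal settings fragment might look like this; the host and port are placeholder assumptions and would point at whatever Redis instance the deployment actually uses.

```python
# settings.py fragment; the Redis host and port below are placeholder assumptions.
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            "hosts": [("127.0.0.1", 6379)],
        },
    },
}
```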
Here are my questions:
1. How does using django-channels compare to using the socket module in Python directly, in terms of performance and scalability? I am not looking for a silver bullet and understand that it all depends on product specifics, but is it reasonable to expect it to scale to half a million users without us having to open up the package and modify it? My understanding is that django-channels is built on Python's socket module but in a Django-friendly manner.
2. Does channels_redis use Redis pub/sub underneath?
3. Can I use something like Amazon SQS instead of channels_redis? If so, any relevant resources/packages would be great.
4. Is there something else you would want me to know?