I am developing a GPS-based live-tracking mobile app where users can see the real-time position of other nearby users. I am trying to figure out what the server side of such an app would look like. Specifically, how can I handle concurrent communication with 10,000+ clients using a Python server?
I have three approaches in mind, but no experience with how viable any of them is:
1. Use a standard WSGI Python app with an HTTP REST API, running under a few ordinary sync workers (say, 8 gunicorn workers).
2. Use some kind of WebSocket / ASGI / async Python implementation to provide persistent connections, with the same logic behind it.
3. Use the MQTT protocol on the clients with some kind of MQTT broker, and split the server side into a WSGI REST API (authentication) and an MQTT client (location updates).
Which of these methods is viable, if this is possible in Python at all? From what I've seen, WebSockets in Python are mostly benchmarked up to hundreds of connections, maybe up to 1,000 concurrent connections. This is in stark contrast to frameworks like Phoenix/Elixir, which has been benchmarked at 2M concurrent connections on a single box. So I believe 2. is not a viable path.
Would 1. or 3. work reliably with 10,000+ concurrent users?
Device-to-datacenter communication can use REST, MQTT, or both, with a set of Kafka connectors for each protocol running as dynamically scalable microservices in the DMZ.
If you are going to run MQTT over the public internet, you will likely need to tunnel it through secure WebSockets (WSS) to keep it secure and still be able to traverse firewalls that might otherwise block raw MQTT on TCP port 1883.
Hi Zsolt,
Referring to a few lines from your problem statement: “I am developing a GPS-based live-tracking mobile app where users can see the real-time position of other nearby users. I am trying to figure out what the server side of such an app would look like. Specifically, how can I handle concurrent communication with 10,000+ clients using a Python server?”
If I understand the goal correctly, you are looking for a solution along similar lines to the demo link I sent you earlier, the one where you can “Draw the boundary of the vicinity you are interested in tracking the users within”. I would like to refer you to the paper that describes the mechanics of the boundary creation and how it translates easily into simple lat/lon-based topics for subscriptions. The beauty of the solution lies in its simplicity - http://worldcomp-proceedings.com/proc/p2016/ICM3967.pdf
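To make the lat/lon topic idea a bit more concrete, here is a minimal sketch of one possible mapping. It is not the scheme from the paper: the fixed grid, the cell size, the `geo` topic prefix, and all function names are assumptions made for illustration. Each position is published to the topic of the grid cell that contains it, and a drawn boundary (simplified here to a bounding box) becomes a list of cell topics to subscribe to.

```python
import math

CELLS_PER_DEGREE = 100  # assumed grid resolution: cells of 0.01 degree
PREFIX = "geo"          # hypothetical topic prefix


def cell_index(lat, lon):
    """Return the integer (row, col) of the grid cell containing a position."""
    return math.floor(lat * CELLS_PER_DEGREE), math.floor(lon * CELLS_PER_DEGREE)


def topic_for_cell(row, col):
    """Build the topic name for one grid cell."""
    return f"{PREFIX}/{row}/{col}"


def topic_for_position(lat, lon):
    """Topic a client would publish its location update to."""
    return topic_for_cell(*cell_index(lat, lon))


def topics_for_boundary(south, west, north, east):
    """Topics to subscribe to for every cell overlapping a bounding box.

    Simplification: the box must not cross the antimeridian.
    """
    row_min, col_min = cell_index(south, west)
    row_max, col_max = cell_index(north, east)
    return [
        topic_for_cell(row, col)
        for row in range(row_min, row_max + 1)
        for col in range(col_min, col_max + 1)
    ]


# A user in the area publishes here ...
publish_topic = topic_for_position(47.4979, 19.0402)
# ... and a client watching the surrounding box subscribes to these.
subscriptions = topics_for_boundary(47.495, 19.035, 47.505, 19.045)
```

With this layout the broker does the spatial filtering: a client only receives updates published to the cells it subscribed to. The cell size trades off the number of subscriptions per client against how many irrelevant updates each cell delivers.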
If you are interested in fanning out the information to streaming analytics or data-at-rest analytics, you can use the Wire Tap pattern, where information published on topics is also spooled to a queue. You can choose an open wire protocol like AMQP for richer data-processing functionality within the core infrastructure and use MQTT for mobile-to-edge communication: the right tool for the right job.
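The Wire Tap idea can be sketched without any real broker. The example below is a toy, in-memory stand-in, not an MQTT or AMQP client: the `TinyBroker` class, its method names, and the topic string are all hypothetical. It only shows the shape of the pattern, where every published message is delivered to its normal subscribers and a copy is also spooled to a queue for analytics.

```python
import queue
from collections import defaultdict


class TinyBroker:
    """In-memory stand-in for a topic broker (illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks
        self._taps = []                        # queues receiving a copy of every message

    def subscribe(self, topic, callback):
        """Register a callback for messages on one exact topic."""
        self._subscribers[topic].append(callback)

    def add_wire_tap(self, tap_queue):
        """Attach a queue that gets a copy of everything published."""
        self._taps.append(tap_queue)

    def publish(self, topic, payload):
        # Wire tap: spool a copy without changing the main delivery path.
        for tap in self._taps:
            tap.put((topic, payload))
        # Normal delivery to the subscribers of this topic.
        for callback in self._subscribers.get(topic, ()):
            callback(topic, payload)


broker = TinyBroker()
analytics_queue = queue.Queue()  # would be drained by an analytics consumer
broker.add_wire_tap(analytics_queue)

received = []
broker.subscribe("geo/4749/1904", lambda topic, payload: received.append(payload))

update = {"user": "u1", "lat": 47.4979, "lon": 19.0402}
broker.publish("geo/4749/1904", update)
```

In a real deployment the tap would normally be configured on the broker side, so mobile clients and live subscribers are unaffected by whatever the analytics consumers do with the queue.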