While trying to build a large async application, we tried the following:
- Erlang : Painful to maintain because of the syntax and the programming paradigm, although you might be able to complete your task in fewer than 50 lines. In our experience it was the fastest and most reliable option. Making Erlang talk to your Python code (or any other application) is where it really hurts.
- Python multiprocessing (and packages built on it) : Easy to program and maintain, but the workers tend to turn into zombies when the Python process runs for a long time. Long-running Python code always seems to develop weird zombie and memory issues, and I am skeptical of anyone who claims to have fully debugged the leak (gunicorn has a max-requests setting that restarts a worker after it has handled that many requests, for similar reasons). The other problem is efficiency.
- Twisted : If you can figure out the documentation, you are good to go. It should be a good fit if you only want to make some API calls.
- Gevent : Simple and easy to maintain. It benchmarked very well for us on CPython with pure-Python code.
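- NodeJS : Simple JS code and good modules available for web requests. The one catch is multi-core: a single Node process runs your JavaScript on one thread, so using several cores means running several processes.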
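The worker-recycling idea mentioned for Python multiprocessing (the same idea as gunicorn's max-requests) can be sketched with the standard library alone. This is only an illustration, not what we ran in production: `handle` and `run_batch` are hypothetical names, and the numbers are arbitrary.

```python
import os
from multiprocessing import Pool


def handle(task):
    # Hypothetical unit of work. Returning the worker PID makes it
    # visible that worker processes are being replaced over time.
    return task * task, os.getpid()


def run_batch(tasks, workers=2, max_tasks=5):
    # maxtasksperchild tells the pool to replace each worker process
    # after it has completed that many tasks, so anything the worker
    # leaked is released when it exits.
    with Pool(processes=workers, maxtasksperchild=max_tasks) as pool:
        # chunksize=1 so that every item counts as one task.
        return pool.map(handle, tasks, chunksize=1)


if __name__ == "__main__":
    results = run_batch(range(20))
    squares = [square for square, _ in results]
    pids = {pid for _, pid in results}
    print(squares[:5], "distinct worker processes:", len(pids))
```

Recycling does not fix a leak, it only bounds how long a leaky worker lives, which is usually enough to keep a long-running service healthy.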
We ended up using Gevent, ZeroMQ and NodeJS together. Scaling with ZeroMQ queues across distributed machines and cores is painless. Python keeps the application logic sane. NodeJS handles the async web operations (scraping, API calls, XMPP, etc.). In theory this setup scales to as many cores as you can give it, limited only by the number of sockets and the memory available.