Async Web3Py ?


Arnon Marcus

Apr 16, 2013, 10:32:57 AM
to web...@googlegroups.com
First a little background...

I am in the process of researching and evaluating async/concurrency solutions for my company's web-servers.

Here is the state-of-the-art of concurrency in python:
* An awesome 45min lightning talk about the internals of the GIL and its effects on threading.
It's not pretty...

Then there's the evolution of asynchronous frameworks:
* A superb walk-through of the history of evolving approaches to using generators for async.
We're still not there yet...

And so we now have a proliferation of inconsistent approaches to doing non-blocking IO in Python...
What to do?

Our benevolent dictator to the rescue:
* The creator of Python himself has been working on a solution, code-named "Tulip".
An answer to all our problems... :)

The bad news?
It's not done yet... It's slated for Python 3.4, due in Feb. 2014...

The good news?
It can be used as a library for Python 3.3!
* Also, work is being done to shim it into Python 2.7 via existing event-loop frameworks as polyfills...

So, I am suggesting this be taken seriously as a research-project for future developments of web3py.
:)

Niphlod

Apr 16, 2013, 10:45:02 AM
to web...@googlegroups.com
web2(3)py builds "on top" of things: taking an unfinished library that can change from time to time as a foundation basically leads to "if tulip dies, so does web2(3)py". So, until it's committed to stable Python, no one would pick it up as a base reference to build a framework on.

Benoitc is the author of gunicorn, which already works, pretty awesomely and without hiccups.

As things stand, there are very few places in which web2(3)py needs improvement to support "the async way".
All the blocking points can be circumvented with web2(3)py as it is, by running it inside an evented environment and leveraging specific "async-friendly" drivers for the db interaction.

If the issue is "choose a different standard webserver than rocket", we tried not so long ago to review all the possible options, but for the time being there's no Python webserver capable of doing everything cherrypy or rocket do, so it would cripple a lot of installation environments.
That being said, if something that builds on top of tulip comes up and supports wsgi, it can be easily added to anyserver.py.

Arnon Marcus

Apr 16, 2013, 12:03:41 PM
to web...@googlegroups.com
Well, as web3py is going to be python3-based, it may use Tulip in its module-form... But perhaps you are right about not rushing it, perhaps web4py would be a better target... :) (or rather, web3py 2.x...)

But my point is that PEP 380 in Python 3.3 is already a very solid basis for building an event-loop kind of web framework that is single-threaded with non-blocking I/O...
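
To make the PEP 380 point a bit more concrete, here is a minimal sketch of how `yield from` lets plain generators act as cooperative tasks on a single thread. The names `read_chunk`, `handler` and `run` are made up for illustration, and the bare `yield` only stands in for "waiting on I/O"; a real loop would park the task until a socket is ready.

```python
from collections import deque


def read_chunk(name, n):
    # Hypothetical stand-in for a non-blocking read: every bare `yield`
    # hands control back to the scheduler, as if waiting for I/O.
    for _ in range(n):
        yield
    return "%s:%d chunks" % (name, n)


def handler(name, n, results):
    # `yield from` (PEP 380) delegates to the sub-generator and receives
    # its return value once the sub-generator finishes.
    data = yield from read_chunk(name, n)
    results.append(data)


def run(tasks):
    # Minimal round-robin loop over generator-based tasks, one thread only.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)  # task is still waiting, reschedule it


results = []
run([handler("a", 3, results), handler("b", 1, results)])
print(results)
```

The shorter task finishes first even though it was started second, which is the whole idea: nobody blocks anybody else, and no threads are involved.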

The current recommendation for production-deployment of web2py is Apache, which (as I understand it) has its mod_wsgi spawn a python-thread for each request. This has all the performance-penalties mentioned in the first video I posted (relating to the GIL), in addition to the famous 10K barrier of machine resource and system-thread-limits.

Saying that Rocket and Cherrypy are the main focus is irrelevant.
It may stay like that for educational/experimentation use-cases, but it has no relevance to production-deployments.

But I think the main thing that bugs me is that whenever I start talking about async and non-blocking-I/O in this group, everybody immediately assumes I am just talking about either front-end web-serving, or back-end database-connections... I am considering I/O as also being ipc or even in-proc communications - having web2py able to communicate with other server-side services, and/or different connection-sessions within itself for "push" enabled "collaborative" use-cases (whether based on WS, SSE, LP, or any other...). I am talking about transforming web2py's internals to being async/event-loop capable, within a single-threaded deployment story.
It is frustrating to me that people are not getting this message... This IS the direction that web-development is moving into around the world, and if web3py will not keep up with this trend, it may not even see the light of day before being left in the dust of history... And I would deem that a very sad day for all of us... Web2py is keen on backwards-compatibility - web3py is not - so it is an opportunity for restructuring some internals and joining web3py with the rest of the second decade of the 21st century... (if not spear-heading it...)




--
 
---
You received this message because you are subscribed to a topic in the Google Groups "web2py-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/web2py/ExLCcJzS79k/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to web2py+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Niphlod

Apr 16, 2013, 12:18:34 PM
to web...@googlegroups.com

> But my point is that PEP 380 in Python 3.3 is already a very solid basis for building an event-loop kind of web framework that is single-threaded with non-blocking I/O...

It's just a PEP describing how to interact in an eventloop. That being said, python misses just the wsgi hook (read: a unique way to interact with all different implementations) to make it usable. It's not web2py's job to be the next event-looped webserver.
 

> The current recommendation for production-deployment of web2py is Apache, which (as I understand it) has its mod_wsgi spawn a python-thread for each request. This has all the performance-penalties mentioned in the first video I posted (relating to the GIL), in addition to the famous 10K barrier of machine resource and system-thread-limits.

Whaaaaat ? if this is reported somewhere please point it out, we'll remove it ASAP. Apache with mod_wsgi is a resource-hog that right now has far better counterparts. Here too, that being said, if there's an apache instance managing something else in your stack, then it could be useful to run a python webserver in it, but it's the only case.
 
> Saying that Rocket and Cherrypy are the main focus is irrelevant.
> It may stay like that for educational/experimentation use-cases, but it has no relevance to production-deployments.

Glad we agree that web2py is not meant to be used only in threaded webservers.
 

> But I think the main thing that bugs me is that whenever I start talking about async and non-blocking-I/O in this group, everybody immediately assumes I am just talking about either front-end web-serving, or back-end database-connections... I am considering I/O as also being ipc or even in-proc communications - having web2py able to communicate with other server-side services, and/or different connection-sessions within itself for "push" enabled "collaborative" use-cases (whether based on WS, SSE, LP, or any other...). I am talking about transforming web2py's internals to being async/event-loop capable, within a single-threaded deployment story.
>
> It is frustrating to me that people are not getting this message... This IS the direction that web-development is moving into around the world, and if web3py will not keep up with this trend, it may not even see the light of day before being left in the dust of history... And I would deem that a very sad day for all of us... Web2py is keen on backwards-compatibility - web3py is not - so it is an opportunity for restructuring some internals and joining web3py with the rest of the second decade of the 21st century... (if not spear-heading it...)


But you are missing that you didn't present any large use case for that (meaning IPC in general, not related to the web client-server patterns).
 
"IPC" is just something you agree on in your stack to be the common ground. There's no way you'll find, e.g., Erlang talking to Python through their respective native APIs, nor Python to Node's javascript modules.

Given that there are a lot of choices on the "external to both", we're back to the beginning: it's far more productive to code your own information-exchange messenger than to come up with a silver-bullet implementation that fits all IPC paradigms, and force web2(3)py users to have that particular tech in their stack.

António Ramos

Apr 16, 2013, 1:06:07 PM
to web...@googlegroups.com
and then there was nodejs

:(




2013/4/16 Niphlod <nip...@gmail.com>


Niphlod

Apr 16, 2013, 2:08:51 PM
to web...@googlegroups.com
that has the same exact issues on the "IPC" matter.
If python had only the "twisted-way" of doing things, your code would already be "async-ready" ^_^ .
ATM I'm not that sure where I looked that up - so please bear with me if this is incorrect - but a little while ago running node on multiple processes sharing resources wasn't possible.

Vasile Ermicioi

Apr 16, 2013, 3:00:30 PM
to web...@googlegroups.com
I believe that the future of python is pypy :)
last pypy release (2.0 beta 2) has builtin jit-ed stackless and greenlets

also there is 
- pyuv built on top of libuv the platform layer for NodeJS,
- uvent: a gevent core implemented using libuv






Arnon Marcus

Apr 16, 2013, 4:43:13 PM
to web...@googlegroups.com
Yup. As noted, there is a proliferation of c-extensions binding to various event-loop implementations - no "native" one, not until 2014, that is... :)

Still, it shows how doable things are even now for web2py the way it is...

I'm not sure if Pypy is relevant to web2py... Can it run on it? I mean, the problem is obviously that c-extensions for c-python cannot run in RPython (Pypy), right? I mean, what is the status of, say, database-drivers?

As for ipc use-cases, these are more related to 0mq stuff... External back-end sockets 

But what I think most still don't see is about SAPs:
Look, it doesn't really matter how you turn it around - SAPs are the future, for the most part.
It doesn't really matter what protocol is used - web2py needs to be non-blocking/evented in order to ride this SAP wave... One way or another, sessions need to interact - there needs to exist a built-in loop-back path of messages coming from one session and being distributed into other long-standing-sessions for pushing back to their respective clients on the connection they are holding - social-network applications and collaborative-views are on the rise, and server-side frameworks need to adapt accordingly.
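
As a rough picture of that loop-back path, here is a minimal single-threaded sketch. It assumes the standard `asyncio` module (what Tulip eventually shipped as in Python 3.4), and the `Hub` class, `long_poll` and the session names are hypothetical; nothing here is a web2py API.

```python
import asyncio


class Hub:
    """Hypothetical in-process message hub, not part of web2py."""

    def __init__(self):
        self.subscribers = {}  # session id -> asyncio.Queue

    def subscribe(self, session_id):
        queue = asyncio.Queue()
        self.subscribers[session_id] = queue
        return queue

    def publish(self, sender_id, message):
        # Fan the message out to every session except the sender.
        for session_id, queue in self.subscribers.items():
            if session_id != sender_id:
                queue.put_nowait((sender_id, message))


async def long_poll(queue):
    # A long-standing session parks here without holding a thread.
    return await queue.get()


async def demo():
    hub = Hub()
    alice = hub.subscribe("alice")
    bob = hub.subscribe("bob")
    carol = hub.subscribe("carol")
    waiting = [asyncio.ensure_future(long_poll(q)) for q in (bob, carol)]
    await asyncio.sleep(0)  # let both listeners start waiting
    hub.publish("alice", "hi")
    received = await asyncio.gather(*waiting)
    return received, alice.empty()


received, alice_empty = asyncio.run(demo())
print(received, alice_empty)
```

Each waiting session costs one queue and one suspended coroutine rather than one thread, which is what makes many long-held connections affordable.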
Currently, there is no way to accomplish this with web2py without threading (which is diabolic in python). The only other way to do that in a single-threaded way, is via an external connection - whether that being a message-queue like redis, or a secondary web-framework that is evented.
In the first case, you would need some kind of co-routine implementation in the controller-actions, which would then require some kind of centralized-scheduler for this to be done in a sane way. Some greenlet-based integration is thus a viable option.
In the second case, we are talking IPC. Then 0MQ shines, with pub/sub-interoperability with external message-queues or secondary web-frameworks.
In a threaded-web2py story, you could implement a pub/sub topology with zmq-sockets that have queues, with threads calling I/O-blocking socket-listeners within controller-actions.
But the whole point is to get away from threads...
I think the best approach is to use a combination of simple generator-actions in controllers, in conjunction with non-blocking greenlet'ed-0MQ sockets - this can actually be done today... In either threaded/non-threaded story... You don't even need to monkey-patch all of web2py - just the 0MQ sockets...
I just think that this should be integrated natively into web2py, as it should be as simple as adding a decorator for controller-actions to make them "async". It should be like any other service-implementation in web2py. The generator-stuff could be part of this integration also, so you could reuse your existing controller-actions without changing the code...
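
A minimal sketch of what such a decorator could look like. The name `async_action` is hypothetical and this is not an existing web2py service. The driver here simply steps the generator to completion, where a real scheduler would park the action at each `yield` and resume it when its socket has data.

```python
import functools
import inspect


def async_action(func):
    """Hypothetical decorator for controller-actions (not a web2py API)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if not inspect.isgenerator(result):
            return result  # ordinary action: behaviour is unchanged
        # Fallback driver: step the generator until it returns.
        # A real integration would hand it to an event loop instead.
        try:
            while True:
                next(result)
        except StopIteration as stop:
            return stop.value  # the generator's `return` value

    return wrapper


@async_action
def index():
    yield  # pretend to wait for a message on a non-blocking socket
    return dict(message="hello")


@async_action
def plain():
    return dict(message="plain")


print(index(), plain())
```

Existing actions keep working untouched, and the callers of a decorated action cannot tell which driver sits behind the decorator.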

Arnon Marcus

Apr 16, 2013, 4:52:07 PM
to web...@googlegroups.com
This way, it could be future-proofed. As web2py looms, and 2014 arrives, you would only need to modify the internals of this integration to support PEP 380 (Python 3.3's "yield from") and/or the higher-level Tulip implementation (Python 3.4) - all without breaking existing code - the controller-actions would still use the same decorator syntax - it's just their back-room implementation that would be different.

Here is what is being done today that can be implemented, even for web2py 2.x:

Massimo Di Pierro

Apr 16, 2013, 5:04:25 PM
to web...@googlegroups.com
Hello Arnon,

I am very interested in the things you are suggesting. I do not think web2py will be an async framework because in order to make an async component in Python we need to wait for 3.4. I do not think we have a choice.

The problems with building an async framework are:
1) scalability (how do you deploy many servers behind a load balancer and make them work together)
2) security (it can be done but it would be different than web2py's).
3) most of the code runs client side (ember.js) so why build the server in Python at all? A good case must be made. Perhaps the JS should be generated from Python?

Niphlod

Apr 16, 2013, 5:04:59 PM
to web...@googlegroups.com
you can do it with web2py right now without any change. You just need to code your own zmq hooks.

As you stated earlier, until something gets packed into stable python, there's no way around the GIL.

tl;dr: anyone wanting to run python in an evented loop needs to work around the GIL and choose their own implementation (gevent, eventlet, pulsar, twisted, etc). If you need an evented loop, run web2py on it with anyserver.py, it will be alive and kicking as it is right now.
The missing part is messaging, and there are several stable modules/techs/external tools that you can leverage and that web2py can definitely not track (hence the previous recommendation to pack them as a plugin, if you wish) until they get somewhat standardized inside python itself.



Derek

Apr 16, 2013, 5:50:59 PM
to web...@googlegroups.com


On Tuesday, April 16, 2013 1:43:13 PM UTC-7, Arnon Marcus wrote:
> Yup. As noted, there is a proliferation of c-extensions binding to various event-loop implementations - no "native" one, not until 2014, that is... :)
>
> Still, it shows how doable things are even now for web2py the way it is...
>
> I'm not sure if Pypy is relevant to web2py... Can it run on it? I mean, the problem is obviously that c-extensions for c-python cannot run in RPython (Pypy), right? I mean, what is the status of, say, database-drivers?

Yes, it runs just fine on PyPy, thanks to Massimo's planning on having almost everything as a 'python only' module. PyPyodbc was added not too long ago, which allows you to use odbc databases in a pure python environment. Also, pypy does have some support for c extensions. It's not fast, but it works.

Niphlod

Apr 16, 2013, 5:59:25 PM
to web...@googlegroups.com
the obvious suspects sqlite (packed into pypy), postgresql (through mariano's pg8000, shipped with web2py), mysql (through pymysql, shipped with web2py) work on pypy.

Recently we added pypy as an additional environment in our CI environment (so it's a "close-watched-fellow-environment") that leverages travis-ci: trunk runs fine on it.

Arnon Marcus

Apr 17, 2013, 8:57:26 AM
to web...@googlegroups.com
Just noticed, I was writing SAP instead of SPA the whole time... :)

I get what you are saying - the underlying unspoken agenda here is this: "Web2py should remain Pure-Python in its default implementation"
This places us in a discussion that is assuming a threaded-deployment, as it is also the default-implementation that would be publicized.
So then, any argument about using evented-single-threaded deployment is no longer relevant within the boundaries of this discussion.
In this case, we are inevitably going to refer to Tornado, which IS ALSO a pure-python implementation, and yet is doing non-blocking-I/O, by using threads.
The way I saw it in Tornado, it is wrapping async-style code that is abstracted away from the developers - classes of a kind they call "YieldPoints" (Callback, Task, Future):
Guido has basically "stolen" these classes for Tulip...
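
To illustrate the kind of wrapping meant here, below is a minimal sketch written against the standard `asyncio` module (the form Tulip eventually took in the standard library) rather than Tornado's own classes. The callback-style function `fetch_with_callback` is a made-up stand-in for any API that reports its result later through a callback.

```python
import asyncio


def fetch_with_callback(loop, value, callback):
    # Hypothetical callback-style API: delivers its result a bit later
    # by calling `callback` from the event loop.
    loop.call_later(0.01, callback, value * 2)


def fetch(value):
    # Wrap the callback API in a Future, so the caller can simply wait
    # on it instead of passing callbacks around.
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    fetch_with_callback(loop, value, future.set_result)
    return future


async def action():
    doubled = await fetch(21)  # reads like blocking code, but does not block
    return doubled


answer = asyncio.run(action())
print(answer)
```

The developer only sees `await fetch(21)`; the callback plumbing is hidden inside the wrapper, which is the abstraction being described.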

So essentially, if you run web2py on-top of tornado's I/O-loop, you should be able to use these YieldPoint classes within the controller-actions.
The question that remains open to me is, can you do it even "without" tornado's I/O-loop... (?)
If you can, then it should be relatively straight-forward to implement a service that wraps around these classes, and "decorate" controller-actions with it.
If you can't, then we are in a plugin/contrib story... Can it include a "service"? Really, what I am asking for could be expressed and answered by a simple service...
Can a plug-in/contrib be a web2py-service?

Ricardo Pedroso

Apr 17, 2013, 11:06:49 AM
to web...@googlegroups.com
I didn't follow this discussion closely enough, so probably what I have to say
was already said, but ...

> So essentially, if you run web2py on-top of tornado's I/O-loop, you should
> be able to use these YieldPoint classes within the controller-actions.

I guess not, but I may be wrong. Using web2py served by tornado you need
to use the tornado WSGIContainer that invalidates the async nature of tornado.

Tornado is (at least was) not WSGI compliant to be able to be async.
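
A small sketch of why that is, using only the standard library. A WSGI application is a plain synchronous callable, so whoever calls it has to wait until it returns. Here `slow_app` and `call_wsgi` are made-up names, and the `time.sleep` stands in for a blocking database call.

```python
import time
from wsgiref.util import setup_testing_defaults


def slow_app(environ, start_response):
    # Ordinary WSGI application; the sleep stands in for blocking work.
    time.sleep(0.05)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done"]


def call_wsgi(app):
    # What any WSGI server or container has to do: call the app
    # and wait for the iterable it returns.
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    started = time.monotonic()
    body = b"".join(app(environ, start_response))
    elapsed = time.monotonic() - started
    return captured["status"], body, elapsed


status, body, elapsed = call_wsgi(slow_app)
print(status, body, round(elapsed, 3))
```

If that call happens on the thread that runs an I/O loop, the loop cannot serve anything else for the whole duration of the call.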

> The question that remains open to me is, can you do it even "without"
> tornado's I/O-loop... (?)

Yes, with Greenlets. Can be done today without any change to web2py
internals or other WSGI framework.
And probably is, currently, the only option for WSGI Applications.

See bottle docs about it: http://bottlepy.org/docs/dev/async.html

It's just a question of how you implement and how you deploy.


Ricardo