Server-Sent Events


Arnon Marcus

Apr 4, 2013, 8:29:56 PM4/4/13
to web...@googlegroups.com
The only thread on this in this group has been deleted.

I have a few questions:

1. The way I understand this, an implementation would be via a controller-action that receives the event-stream request, then responds with a "200 OK" and the text/event-stream MIME type, to affirm the connection. But from that point onward, new responses should be sent over the same open connection. How is the response-object generated and sent without a request? How does it know where to send it to?

2. I would like to make a shared collaborative view for multiple users, where any change made by one is reflected automatically for all the others. How would I go about doing that? Since web2py executes on each request, I would have to hold connection-data for all open connections in some semi-persistent location - would I have to use some external/internal caching? Is there some automatic session-saving already built into web2py that can be useful?

Derek

Apr 4, 2013, 8:46:57 PM4/4/13
to web...@googlegroups.com
1. How? Usually by a yield somewhere. Where? The already existing open connection knows where.
2. Slow down there, you're overthinking it. Web2py comes with this.

Look at web2py/gluon/contrib/comet_messaging.py
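To make the "yield" answer concrete, here is a minimal, hedged sketch (the function name and message source are hypothetical, not web2py API): a web2py action can return a generator, and the web server writes each yielded chunk down the still-open connection, which is how "responses" reach the client without a new request.

```python
def event_stream(messages, retry_ms=2000):
    # Hypothetical sketch: format each message as a text/event-stream
    # frame.  In a real controller you would first set
    #   response.headers['Content-Type'] = 'text/event-stream'
    # and return this generator from the action; the server keeps the
    # connection open and flushes every yielded chunk to the client.
    yield "retry: %d\n\n" % retry_ms  # ask clients to reconnect after 2s on drop
    for n, msg in enumerate(messages):
        yield "id: %d\ndata: %s\n\n" % (n, msg)

# Consume the generator to see the frames that would go down the wire.
frames = list(event_stream(["hello", "world"]))
```

The open connection itself is the addressing: the server never "sends a new response", it just keeps appending frames to the one it already has.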

And...

watch this video by Bruno Rocha

Chris May

Apr 6, 2013, 2:10:13 PM4/6/13
to web...@googlegroups.com
Wow. It's cool that web2py ships with a websockets implementation. I was also thinking like Arnon about how to implement server-sent events with web2py.

As such, I do want to add, maybe for posterity's sake, that websockets serve a different use case than server-sent events, and are more complicated.

But with so much already figured out, maybe it is better to use websockets.

Arnon Marcus

Apr 7, 2013, 7:35:23 AM4/7/13
I will elaborate.

I have 2 different interests:

The first one has to do with my personal wish to understand the inner workings of the implementation of the messaging options in web2py, on the architectural level. I do NOT want to reinvent the wheel by any means, but I do want to understand the architectural structure of interaction between the various optional components, so I can better judge which component options can co-exist/communicate with which others.

The second interest is to be able to decide on the best approach to messaging using web2py for my use-case, which I will elaborate on more here.


As for the first interest, I would like to understand how web2py interacts with a client in any non-standard request-response methodology.
I have been researching SSE and WebSockets quite extensively in the past few days, as well as other messaging protocols and libraries, like XMPP, AMQP, RabbitMQ and ZeroMQ.
The way I understand it, from a performance and scalability stand-point, best practice would suggest some kind of separate process built on a single-threaded, non-blocking-io, event-loop type of web-server. This sent me researching Eventlet, gEvent, Gunicorn, Tornado, Mongrel2, Twisted, and the like. I also know that web2py has an option of running via a gEvent server, and I know that this would require some extensions to the libraries I use, such as green_psycopg for my postgres driver, and uGreen for my uWSGI server. This might be advisable regardless of my interest in messaging, but might have side-effects for other modules I use that don't support gEvent's use of coroutines.

There are 2 questions I need to answer for myself:

1. In case all is done within web2py, and given that it is by default not a non-blocking-event-loop type of system, then whether it is listening on a long-poll, a server-sent-event stream, a web-socket, or an amqp/0mq socket, how does/can it handle long-lasting requests/connections without blocking all the other "regular" HTTP requests?

2. In case it can't, this means a separate process needs to be working in parallel - a separate Python interpreter means another PVM (python-virtual-machine) process.
How does web2py interact with external processes of other servers?
An inter-thread communication socket?
An Inter-process communication socket?
Sub-process spawning?
A coroutine?
An os-socket?
A TCP socket?
Maybe a ZeroMQ socket is in order? :)
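For what it's worth, the simplest of the options above that crosses interpreter (and machine) boundaries is a plain socket with some framing on top; everything from ZeroMQ on down is, at bottom, a refinement of this. A hedged stdlib sketch (the length-prefix framing is illustrative, not anything web2py itself does):

```python
import socket
import struct

def send_msg(sock, payload):
    # Length-prefix framing: 4-byte big-endian length, then the bytes.
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_exact(sock, n):
    # TCP is a byte stream, so keep reading until exactly n bytes arrive.
    data = b""
    while len(data) < n:
        data += sock.recv(n - len(data))
    return data

def recv_msg(sock):
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# A socketpair stands in here for two separate OS processes.
a, b = socket.socketpair()
send_msg(a, b"hello from another process")
received = recv_msg(b)
```

Libraries like ZeroMQ take care of exactly this kind of framing, reconnection, and routing, which is why they keep coming up in this discussion.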


As for the second interest, I'll elaborate on my use-case(s):

I have a web-application I am designing, for collaborative project management.
There are three main use-cases I am targeting:

1. It should have a topic/subscription-based messaging system, where users can both subscribe themselves to topics and add/remove other users as "topic-watchers" if they want to join them to a conversation and have them notified immediately of any new update to that topic (obviously only topic-owners can add/remove watchers, and other watchers can only remove themselves from a subscription to a topic they are watching). This requires a pub/sub fan-out messaging topology.
- For this use-case, an SSE transport should suffice, perhaps with an XMPP protocol layered on top. ZeroMQ also seems attractive...

2. It should have collaborative screens, like a Gantt chart and a scheduling run-chart, where each user of such a view is "subscribed" to it, has his changes reflected immediately for all other users currently "subscribed" to that view, and receives any changes that any other user makes to that view.
- For this use-case, web-sockets seem ideal, but a ZeroMQ socket seems attractive as well...

3. It has a CMS system that should interact with 3rd-party desktop applications. For most uses, an internal application's view would interact with web2py, either through an RPC'ish connection (like xmlrpc/jsonrpc), a RESTful architecture (web2py's REST API), or some kind of messaging architecture (XMPP or ZeroMQ).
In addition, I might choose to add the same pub/sub topic-based messaging system into the inner-app view.
- For this use-case, I am currently using web2py's built-in xmlrpc and amfrpc, but ZeroRPC looks attractive as an xmlrpc replacement - it's a Python RPC layered on top of ZeroMQ sockets.
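At its core, the pub/sub fan-out topology of use-case 1 is just a table from topic to watchers, whatever transport ends up carrying it. A hedged in-memory sketch (class and names are hypothetical; persistence, auth, and transport are deliberately omitted):

```python
from collections import defaultdict

class TopicBroker(object):
    """Minimal pub/sub fan-out sketch: topic -> watchers -> per-user inbox."""

    def __init__(self):
        self.watchers = defaultdict(set)   # topic -> set of user ids
        self.inboxes = defaultdict(list)   # user id -> delivered messages

    def subscribe(self, topic, user):
        self.watchers[topic].add(user)

    def unsubscribe(self, topic, user):
        self.watchers[topic].discard(user)

    def publish(self, topic, message):
        # Fan the message out to every current watcher of the topic.
        for user in self.watchers[topic]:
            self.inboxes[user].append((topic, message))

broker = TopicBroker()
broker.subscribe("gantt-review", "alice")
broker.subscribe("gantt-review", "bob")
broker.publish("gantt-review", "deadline moved to Friday")
```

In a real deployment the inboxes would be open SSE/websocket connections rather than lists, and the owner-only add/remove rule would be enforced inside subscribe/unsubscribe.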

I would also like to use Redis as a caching mechanism for my web2py application, and this suggests it could also act as a persistence layer for a centralized messaging broker.

- For the desktop-application integration, I am currently leaning towards using some non-blocking event-loop type of server, which would use Redis as both session-cache and general data cache. Web2py can interact with it as needed, to fill the cache with results from my database.

- For the browser-targeted use-cases, I am not sure which road I should take... I would like to stay within the confines of web2py so I can have DAL access and store the messages and collaborative-view changes, but I would also like to take advantage of pub/sub libraries for managing the queues and communications, and would rather this communication not block the other regular HTTP traffic that comes to web2py... So it's a dilemma... I would appreciate any suggestions...

Ideally, I would create another web2py server, running on top of gEvent, and have it talk to the same database using the same DAL object (or at least the same model file) that the main web2py instance uses. This way I don't need the 2 web2py instances interacting at all. Each would be targeted at different use-cases of the same application. It could ideally also handle all the desktop-application communications through ZeroRPC. For the browser fronts, this architecture might have issues I am not thinking about, such as cross-origin issues...

Arnon Marcus

Apr 7, 2013, 7:51:11 AM4/7/13
to web...@googlegroups.com
As for the Tornado-web-socket example, I looked at the code, and couldn't figure out how this would work in a production environment...
Does it spawn a separate python interpreter for Tornado?
If so, how does it meld with web2py's controllers? It is unclear how this works...
What serves the Tornado web-app in production? Apache? How?

As for the comet file - I can't find it - it seems it no longer exists in the new version of the web2py source code....

As for running web2py via gEvent - how should one deploy this in production?
Can it work with Apache the same way the wsgi-handler does?
Does it require/suggest a "gEvent"ed uWSGI under NginX ?

This is all very bewildering...
The documentation is very lacking...

Niphlod

Apr 7, 2013, 8:43:05 AM4/7/13
to web...@googlegroups.com


On Sunday, April 7, 2013 1:51:11 PM UTC+2, Arnon Marcus wrote:
As for the Tornado-web-socket example, I looked at the code, and couldn't figure out how this would work in a production environment...
Does it spawn a separate python interpreter for Tornado?
 
yes, it's very clear that there are two separate shells.

If so, how does it meld with we2py's controllers? It is unclear how this works...

none, effectively.  it's a separate process that handles just the messaging part..... usually the users' browsers (through whatever you use, usually a javascript component) ask for messages from it or send them to it. You can also send messages within your web2py code .....

It's meant to handle all the "messaging" part, leaving web2py working in the "usual" way.
 
What serves the Tornado web-app in production? Apache? How?

Tornado usually "stands" by itself. Gunicorn too. Apache doesn't handle long-standing connections well; usually, if you need something in front, the way to go is nginx.
 

As for the comet file - I can't find it - it seems it no longer exist in the new version of the web2py source-code....

???? comet was referenced only in the very first implementation.... web2py.js holds a websocket component.
 

As for running web2py via gEvent - how should one deploy this in production?

with anyserver.py
 
Can it work with Apache the same way the wsgi-handler does?
 
Nope. I'm not sure you can find any gevented "something" running behind apache.
 
Does it require/suggest a "gEvent"ed uWSGI under NginX ?
 
If you want to run a gevented web2py, it's another story altogether; that's not "pertinent" to what websocket_messaging.py is.
BTW, you'd need to "adopt" gevent-friendly libraries, etc etc etc if you don't want to rely just on monkey patching, and code all your app with gevent-like statements if you want to "exploit" the real potential of gevent.

 
This is all very bewildering...
The documentation is very lacking...

I think you'd need to identify correctly what your requirements are and stop thinking that all of what you want can be achieved in a single process.
"Threaded" and "forked" webservers can (with few limits) be assumed "equal" as regards the chosen programming style, but long-standing connections are very different.
That's why the messaging part is abstracted away in a lot of implementations: you do what needs to be done in your "normal" environment and leverage the "messaging" with another "component", maximizing the usefulness of an MVC framework and leaving the ultra-specific messaging implementation to something external altogether, that does just what it's meant for.


Arnon Marcus

Apr 7, 2013, 10:03:58 AM4/7/13
to web...@googlegroups.com
Thanks, Niphlod.
Have you read my previous comment? The one just before the one you commented on?

Niphlod

Apr 7, 2013, 11:21:12 AM4/7/13
to web...@googlegroups.com
yep, but you have clearly just started researching and have a lot of ideas going around.
as far as message passing between processes is concerned, you mentioned a lot of good "players".
as far as message passing in a web application, there are not so many:
- ajax (bi-directional, long-polling)
- sse (one direction only, but you can still pass from client to server with ajax)
- websocket (bi-directional)
one nice implementation that abstracts away the support of the "x" technology is socket.io: if you want to code something that works across different browser types and versions, go for that route.
I'd ditch zeromq, amqp and xmpp if you need to have something working in the browser: it's true that there are js clients for those, but the amount of boilerplate required would cripple your productivity.

Arnon Marcus

Apr 7, 2013, 12:28:10 PM4/7/13
How about engine.io? or SockJS?
Most libraries have fallbacks/polyfills/shims or whatever...

The thing is, it seems I would need some kind of centralized broker, if I want to share the messaging code across all use-cases, and I DO want the messages committed, in most cases, so I am not looking for a "direct" browser-to-browser channel:

1. Browser<->Browser : Commit all traffic to the database (pub/sub-chat AND collaborative-views)
2. Browser<->Desktop-App : Commit all traffic to the database (pub/sub-chat only)
3. Desktop-App<->Server (RPC/REST) : Don't commit anything to the database...

So, it seems that:
For use-case 1 - The best solution is a non-blocking web-server with SSE and a connection to the database.
For use-case 2 - The best solution is a dual-fronting (SSE + 0MQ) and a connection to the database.
For use-case 3 - I only need a 0MQ for web2py, or falling back to xmlrpc/jsonrpc/REST...

As for caching, I think I would need Redis as a stand-alone "third" service...
I think I read somewhere that it has some kind of messaging support by itself - acting as a proxy... I think AMQP was the protocol...

What do you think? 

Niphlod

Apr 7, 2013, 12:52:20 PM4/7/13
to web...@googlegroups.com


On Sunday, April 7, 2013 6:25:06 PM UTC+2, Arnon Marcus wrote:
How about engine.io? or SockJS?

as I was saying, you're reading too much too soon, just naming buzzwords without actually **thinking** about what you need.
Engine.io is just the "protocol" for socket.io.
Sockjs is another abstraction layer on top of websockets; it has the same exact agenda as socket.io (with a bit less flexibility)
 
Most libraries have fallbacks/polyfills/shims or whatever...
 
read it all carefully. the one "whatever" is not there, you have to code it yourself. The world is full of "that was the right tool until I needed that extra bit" :P
 

The thing is, it seems I would need some kind of centralized broker, if I want to share the messaging code across all use-cases, and I DO want the messages committed, in most cases, so I am not looking for a "direct" browser-to-browser channel:

A "centralized broker" is a thing that (eventually) stores messages and routes them where you want them to go. All of the proposed solutions, including websocket_messaging.py, take care of that.
 
1. Browser<->Browser : Commit all traffic to the database (pub/sub-chat AND collaborative-views)
2. Browser<->Desktop-App : Commit all traffic to the database (pub/sub-chat only)
3. Desktop-App<->Server (RPC/REST) : Don't commit anything to the database...

Where you need to commit is not the central point. The point of messaging is where your messages need to originate and where they need to go.
Additionally, you have to check whether what you want to do is feasible with the tech you choose.
Storing what needs to be stored is a layer on top.
 

So, it seems that:
For use-case 1 - The best solution is a non-blocking web-server with SSE and a connection to the database.

so you say.... the solution can very well be a normal webserver serving the pages and a non-blocking one being the message-passer.
 
For use-case 2 - The best solution is a dual-fronting (SSE + 0MQ) and a connection to the database.

Don't know a single bit of what you'll use to code your desktop app. Given that you have to rely on connectivity I really won't go for a desktop client that basically does what your browser application does already.
 
For use-case 3 - I only need a 0MQ for web2py, or falling back to xmlrpc/jsonrpc/REST...

As for caching, I think I would need Redis as a stand-alone "third" service...
I think I read somewhere that it has some kind of messaging support by itself - acting as a proxy... I think AMQP was the protocol...


read it again. It has pub/sub support, but no AMQP whatsoever, although you can find libraries on top of it that abstract away the difference.

Arnon Marcus

Apr 7, 2013, 3:23:16 PM4/7/13
to web...@googlegroups.com
Don't want to start a flame-fest, but I feel like I am under fire here, and unjustly so...

as I was saying, you're reading too much too soon, just naming buzzwords without actually **thinking** about what you need.

I admit I don't have experience with many of the things I was writing about, but I don't think I am ill-informed or have an erroneous understanding of things. I did broad-spectrum research, and went just deep enough into each component option to get the "gist" of it and see what it's all about.
 
Engine.io is just the "protocol" for socket.io.

I have watched a presentation by the guy who wrote these components. He defines engine.io as the "core" of a new version of socket.io that is in the works. I watched another presentation by the guy who wrote SocketStream (a light-weight inter-connect layer for server and client for node.js), and he said his framework is built upon socket.io, but that he is probably going to "replace" that with engine.io - which would make sense, correlating with the other presentation. It suggests a linear progression - socket.io = mature but might be bloated/have things he does not necessarily need, and engine.io = a newer core that is being extracted out of the older socket.io for use-cases that want more flexibility and control with fewer higher-level features.
 
Sockjs is another abstraction layer on top of websockets; it has the same exact agenda as socket.io (with a bit less flexibility)

I know that, that's why I said I would have to choose between the two. 
 
read it all carefully. the one "whatever" is not there, you have to code it yourself. The world is full of "that was the right tool until I needed that extra bit" :P

I mean that, from what I gather, SockJS initially had no fallback strategy for older browsers, which makes socket.io a better choice for cross-browser compatibility - from what I hear this is changing, or has already changed. You might argue for "writing yourself" the stuff that you need "extra" from a library, but I don't see a strong enough argument for saying the same thing about cross-browser compatibility - I don't think I should write "that" myself...
As for writing stuff myself, sure, I could do that. But why reinvent the wheel? Sure, if I find something I need that is not supported in a library, I might choose to extend it, or choose another library - I would try any of those before writing my own library though... I don't think there is anything wrong with starting with an existing library, "gradually" finding out its limitations "as the need arises", and doing the necessary refactoring "if and when needed" - I think it's a sane and logical strategy, not to mention efficient and economical. Saying that I "might" get stuck because of a library, so I might as well write my own, sounds like "premature optimization" to me... You could take that argument all the way, and write an operating system yourself... Good luck with that...

 
A "centralized broker" is a thing that (eventually) store messages and route them where you want them to go. All of the proposed solutions, included websocket_messaging.py, take care of that.

This I really don't understand...
By "centralized" I meant "a single place that has the routing code to maintain".
If I use websocket_messaging.py AND I use 0MQ on another server, it means I have two codebases dealing with message-routing in 2 different places. I would rather they exist in a single place so I can build the routing topology as a single layer on top, in a single place.
If, for example, I go the RabbitMQ way, it has support for web-sockets as well, so I would guess it can centralize topology-definition for me, for both protocols.
I know that 0MQ is a broker-less topology, so wherever I put a 0MQ socket, that same place would "ideally" route messages for both 0MQ sockets and web-sockets. Both protocols, in this case, are topology-neutral, and low-level enough, that I can build a layer on top that defines the topology for both (which targets should be fanned-out, which pub/sub'ed, who is currently subscribed to which channel, etc.). Because 0MQ, as opposed to AMQP, is a "library", and not a "protocol", the topology definition is done in code anyway... So where for AMQP I would define it in, say, XML format or something, for 0MQ I am going to "write" the topology myself in any case. Granted, as it is a broker-less architecture, I would have to define it in many "clients", but I can "simulate" a broker if I choose to, so I would rather do that if I build a centralized messaging server.


It's not where you need to commit the central-point. The point of messaging is where your messages need to be originated and where they need to go.
Additionally, you have to check if what you want to do is feasible with the tech you choose.
Store what needs to be stored is a layer on top.

Here you lost me completely...
Obviously the main part of messaging is the routing topology - I am well aware of that.
But if I architect the components in a way that clients communicate among themselves with no centralized location, it would be sub-optimal for storing the messages' data from disparate places - it would mean more hops in the message route, and might eventually mean coding the "storing" code in multiple places. If I have a centralized message-broker, its topology may include a filtering of which messages should be stored and where, and may have, for example, a dedicated queue for an outgoing channel that goes out to store the data - this way it may even be aggregated before submitting the request to store the data, so there would be less database traffic down the line.

 
so you say.... the solution can be very well be a normal webserver serving the pages and a non-blocking one to be the message-passer.

What I mean by that is that if the non-blocking server for the messaging, which would also do the routing topology, were just another web2py server running via gEvent, it could do the database commits by itself. And since it is web2py, I could reuse the DAL code I have in the model of the main one - so I would not have to learn a new ORM system, or devise a channel for talking to the main web2py just for the database commits.


Don't know a single bit of what you'll use to code your desktop app. Given that you have to rely on connectivity I really won't go for a desktop client that basically does what your browser application does already. 

As I said, it's also a CMS (content-management-system): at my work-place we have a few desktop applications for which we write plug-ins for version-control and task-pipelining, so all the file-paths that the plug-in saves are generated and stored by the main web2py application.
We also want to make a way for our workers to "chat" from within the desktop app, through the same web2py app. Think of it like an ALM story - like Mylyn for Eclipse, or the similar task-server for Visual Studio - you have workers working within a desktop app, but collaborating on the same project through a centralized server, via tasks and task-comments.
 

read it again. It has pub/sub support, but no AMPQ whatsoever, although you can find libraries that on top of it abstracts away the difference.

I got that impression from this:

Logstash treats Redis as a message-broker - with output messaging - I don't know how exactly...

Arnon Marcus

Apr 7, 2013, 3:46:44 PM4/7/13
to web...@googlegroups.com
Here is an interesting presentation I am currently watching, describing a messaging architecture using ZeroMQ + Redis:

Niphlod

Apr 7, 2013, 4:40:37 PM4/7/13
to web...@googlegroups.com


On Sunday, April 7, 2013 9:23:16 PM UTC+2, Arnon Marcus wrote:
Don't wan't to start a flame-fest, but I feel like I am under fire here, and unjustly so...


I was not trying to, I'm just noticing how much this discussion is starting to involve a lot of things that are "offtopic". It's one thing searching for answers (and expecting them) on a specific topic and another to try to follow every bit of your proposed libraries/solutions/frameworks. The more you add to the "offtopic list" the less people will answer.
When I recommended socket.io for full compatibility, I had implicitly discarded all other solutions that may be similar, because it's the most complete one and fits nicely with Python, given that it's the only technology where you're able to leverage a wsgi app (through gevent and gevent-socketio).
That being said, if your interest is "academic" you may as well code your own new transport in C++. Here on web2py-users I tend to recommend ready-to-use-and-complete solutions involving both Python and the web world as much as I can, because it's not the "let's try something new" group :P
 
as I was saying, you're reading too much too soon, just naming buzzwords without actually **thinking** about what you need.

I admit I don't have experience with many of the things I was writing about, but I don't think I am ill-informed or have an erroneous understanding of things. I did broad-spectrum research, and went just deep enough into each component option to get the "gist" of it and see what it's all about.

On the "home page" of engine.io on GitHub:

Engine is the implementation of transport-based cross-browser/cross-device bi-directional communication layer for Socket.IO.

that should be the core concept to grasp in a broad-spectrum research, presentations aside.
 
Here you lost me completely...
Obviously the main part of messaging is the routing topology - I am well aware of that.
But if I architect the components in a way that clients communicate among themselves with no centralized location, it would be sub-optimal for storing the messages' data from disparate places - it would mean more hops in the message route, and might eventually mean coding the "storing" code in multiple places. If I have a centralized message-broker, its topology may include a filtering of which messages should be stored and where, and may have, for example, a dedicated queue for an outgoing channel that goes out to store the data - this way it may even be aggregated before submitting the request to store the data, so there would be less database traffic down the line.

we started from sse and added websockets.
then went to 0mq, which is the only implementation without a central broker. Believe me, I'm starting to lose you as well ^_^
From my POV, the most pressing argument is that you need to choose either "single endpoint for messages" or "0mq" .... the latter choice will end up trimming all the possibilities to one.
 
What I mean by that is that if the non-blocking server for the messaging, which would also do the routing topology, were just another web2py server running via gEvent, it could do the database commits by itself. And since it is web2py, I could reuse the DAL code I have in the model of the main one - so I would not have to learn a new ORM system, or devise a channel for talking to the main web2py just for the database commits.

what I meant is that very few applications need "realtime interaction" on the client --> server route....
e.g. in your "calendaring" example, the times the user will receive messages that supposedly hold the information about appointments sent by other clients will be far more than the times the client will send its own appointment to the server.....
In this use-case you could leverage sse or websockets to receive messages on the page, and let the client send what his appointment is "in the usual way" to a normal webserver; then web2py would send that message to all the other clients, passing through the "tornado broker".
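That pattern can be sketched like this (hedged: `db_insert` and `broadcast` are stand-ins for the DAL insert and for whatever call hands the message to the external broker process; they are not real web2py APIs):

```python
import json

def event_payload(kind, data):
    # Serialize one update so every subscribed browser can apply it.
    return json.dumps({"type": kind, "data": data})

def save_and_broadcast(db_insert, broadcast, appointment):
    # The client POSTs "in the usual way"; web2py does its normal
    # request/response work (validate, commit), then hands the update
    # to the long-lived broker process, which pushes it out to all the
    # other connected clients.
    record_id = db_insert(appointment)
    broadcast(event_payload("appointment", appointment))
    return record_id

# Demo with stand-ins for the DAL insert and the broker call.
sent = []
rid = save_and_broadcast(lambda a: 42, sent.append, {"title": "standup"})
```

The point of the split is that web2py never holds a long-lived connection itself; it only fires one short call at the broker per change.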


I got that impression from this:

Logstash treats Redis as a message-broker - with output messaging - I don't know how exactly...

saying "they use redis as a message broker" is not the same as saying "it has amqp support".
Again, researching on the "gist" of the features provided: if you search for "redis messaging" the first result on google leads to http://redis.io/topics/pubsub
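To underline the point that Redis pub/sub is its own (very simple) protocol rather than AMQP: SUBSCRIBE and PUBLISH are plain RESP frames on the wire. A hedged sketch of just the request encoding (the server side and the client library are omitted):

```python
def resp_command(*args):
    # Encode one Redis request in RESP: an array header ("*<count>"),
    # then each argument as a length-prefixed bulk string ("$<len>").
    parts = ["*%d\r\n" % len(args)]
    for arg in args:
        parts.append("$%d\r\n%s\r\n" % (len(arg), arg))
    return "".join(parts)

# What a client actually writes for: PUBLISH news hi
wire = resp_command("PUBLISH", "news", "hi")
```

Real clients (redis-py and friends) produce exactly this framing for you; the sketch is only to show how little protocol there is under Redis pub/sub compared to AMQP.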


Arnon Marcus

Apr 7, 2013, 6:44:08 PM4/7/13
to web...@googlegroups.com


I was not trying to, I'm just noticing how much this discussion is starting to involve a lot of things that are "offtopic". It's one thing searching for answers (and expecting them) on a specific topic and another to try to follow every bit of your proposed libraries/solutions/frameworks. The more you add to the "offtopic list" the less people will answer. ...  we started from sse and added websockets. then went to 0mq, which is the only implementation without a central broker. Believe me, I'm starting to lose you as well ^_^

Point taken. I did want to go over all of these, to see what people have to say. I started with a specific topic, and since I was asked for my use-case, and since this is a thread I started, I gave myself the liberty to derail off-topic. You are right, I should have split these into separate topics.
 
When I recommended socket.io for full compatibility, I had implicitly discarded all other solutions that may be similar, because it's the most complete one and fits nicely with Python, given that it's the only technology where you're able to leverage a wsgi app (through gevent and gevent-socketio).

That's good, me too, but I am currently familiarizing myself with this huge world, so my research is broad enough to include whatever is going on, and then I check what implementations exist, and which (if any) are the "fullest" for Python.
 
That being said, if your interest is "academic" you may as well code your own new transport in C++. Here on web2py-users I tend to recommend ready-to-use-and-complete solutions involving both Python and the web world as much as I can, because it's not the "let's try something new" group :P

Well, all of the options I gave have a good Python implementation - except perhaps the web-socket JavaScript frameworks that are mainly for node.js...

On the "home page" of engine.io on github.

Engine is the implementation of transport-based cross-browser/cross-device bi-directional communication layer for Socket.IO.

that should be the core concept to grasp on a broad-spectrum research, presentations aside.

You can argue semantics, I think it's irrelevant...
An "implementation" is more of a "library" to my ear than a "protocol"... anyways...

From my POV, the most pressing argument is that you need to choose either "single endpoint for messages" or "0mq" .... the latter choice will end up trimming all the possibilities to one.
 

Hmmmm I think I already made that choice, and backed it with examples and use-cases... I am explicitly after a centralized messaging server - as a co-server to the main web2py one. The question is "how will it be structured". It could be:

1. RabbitMQ      : AMQP + WebSockets
2. Tornado       : 0MQ  + WebSockets
3. web2py/gEvent : 0MQ  + WebSockets
 
what I meant is that very few applications need "realtime interaction" on the client --> server route....
e.g. in your "calendaring" example, the times the user will receive messages that supposedly hold the information about appointments sent by other clients will be far more than the times the client will send its own appointment to the server.....
In this use-case you could leverage sse or websockets to receive messages on the page, and let the client send what his appointment is "in the usual way" to a normal webserver; then web2py would send that message to all the other clients, passing through the "tornado broker".

I see what you are saying... In fact that was my original idea - I like SSE better for many reasons. But it IS less popular, and has even less browser support than websockets... For example, last time I checked, it still had a problem with CORS... Most browsers just didn't implement that feature for SSE, even though it's in the spec...
Also, with the incentive of unifying messaging, I find there would be more higher-level libraries generalizing 0MQ / AMQP over websockets than over SSE.
For example:

Also, I still didn't get a clear answer as to how to implement SSE in web2py...
You said that the comet-thing no longer exists, as "websockets" were already included in web2py.js, which, if I remember correctly, is referenced in the main application layout. But what about SSE? I mean, sure, it's just an HTTP request at the start, but there is a different model for "responding"... How is web2py built for doing that? Is it keeping the session afloat for that connection, if it gets the correct MIME-type? Will I just be able to reuse the same controller-action for consecutive replies? Can I explicitly call it from another controller, from a different session? Where should a "yield" be placed? There is ZERO documentation about this in the web2py book, and there was only one thread about this in this group, which had an attached "example application" packed in a w2p file that I couldn't use for some reason...

Another reason, is that 0MQ / AMQP already implement all the necessary architecture of queues, routing, addressing and subscriptions...
Sure, I could "emulate" 0MQ/AMQP semantic-components, by using Redis for "queues", "sessions" for "bindings" and "controller-actions" for "exchanges"... It would just mean having to write more code and learn more low-level stuff - and the whole point of these things is that since there are standards and libraries, I should be able to stay afloat at the higher level.


saying "they use redis as a message broker" is not the same as saying "it has AMQP support".
Again, researching on the "gist" of the features provided, if you search for "redis messaging" the first result on google leads to http://redis.io/topics/pubsub

Obviously I got confused by how the logstash documentation wrote about this...
I just didn't get around to checking Redis's pub/sub yet...  

Niphlod

Apr 8, 2013, 4:21:13 AM
to web...@googlegroups.com


You said that the comet-thing no longer exists, as "websockets" were already included in web2py.js, which, if I remember correctly, is referenced in the main application layout. But what about SSE? I mean, sure, it's just an HTTP request at the start, but there is a different model for "responding"...

Nope, or maybe I expressed myself badly: that implementation started out named "comet messaging" but turned into "websocket messaging" at the first iteration.
web2py.js has a usable implementation for it and gluon/contrib/websocket_messaging.py is 200 lines of which 70 are comments; it's easy to hack into.
 
How is web2py built for doing that? Is it keeping the session afloat for that connection, if it gets the correct MIME-type? Will I just be able to reuse the same controller-action for consecutive replies?

given that there's no "web2py+sse" package around, but only that app, you should wait/ask for whoever made that app ^_^
Of course the SSE implementation can easily be done within web2py, but given that the issue is that the connection stays open, you should run web2py in an evented environment.
From what I can see without having tested anything, you just return the text/event-stream content-type and in a loop you yield the message "segment".
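A minimal sketch of that idea (everything here is illustrative, not from web2py's codebase: the helper name sse_frame and the events action are made up; the framing just follows the SSE wire format of "field: value" lines terminated by a blank line):

```python
def sse_frame(data, event_id=None, event=None):
    # SSE wire format: "field: value" lines; a blank line ends one message.
    lines = []
    if event_id is not None:
        lines.append("id: %s" % event_id)
    if event is not None:
        lines.append("event: %s" % event)
    for chunk in str(data).split("\n"):
        lines.append("data: %s" % chunk)
    return "\n".join(lines) + "\n\n"

# In a web2py action (sketch) you would set the content type and return a
# generator - the server then streams every yielded block over the
# still-open connection:
#
# def events():
#     response.headers['Content-Type'] = 'text/event-stream'
#     def stream():
#         n = 0
#         while True:
#             n += 1
#             yield sse_frame("tick %s" % n, event_id=n)
#     return stream()
```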

Can I explicitly call it from another controller, from a different session? Where should a "yield" be placed? There is ZERO documentation about this in the web2py book, and there was only one thread about this in this group, which had an attached "example application" packed in a w2p file that I couldn't use for some reason... 

Of course you need to disable session locking when accessing that controller, with session.forget(response): basically that controller is "held captive" as soon as the user connects to it.
I didn't get what you mean by "can I explicitly call it": with either websockets or SSE, as soon as the user hits the page, a connection is established and remains open. There's no request/response cycle, just a request coming in and an (eventually) infinite response out.
Where you yield is at your discretion, but at least you should yield at the end of a single message.

Arnon Marcus

Apr 8, 2013, 8:34:32 AM
to web...@googlegroups.com

Nope, or maybe I expressed myself badly: that implementation started out named "comet messaging" but turned into "websocket messaging" at the first iteration.
web2py.js has a usable implementation for it and gluon/contrib/websocket_messaging.py is 200 lines of which 70 are comments; it's easy to hack into.
 

As I said, I've already gone over the websocket_messaging.py file - it has dealings with WebSockets - NOT SSE (!) - and via Tornado, NOT web2py...

I didn't get what you mean by "can I explicitly call it": with either websockets or SSE, as soon as the user hits the page, a connection is established and remains open. There's no request/response cycle, just a request coming in and an (eventually) infinite response out.

What I mean is that once the connection is open, and, say, is handled by a "session", then from that moment on, my usage of this connection would be "pushing" through that connection onto the browser. The usage of the "push" would obviously be from another controller.
I mean, let's take the "chat" use-case :
User "A" logs into a chat-view, and that sends a GET request to a controller-action whose job is to open an SSE connection for that user - a long-lasting session - let's call it "The SSE action". Then user "B" logs into the same view on his side, and the same thing happens for him. Now we have 2 outgoing sessions open - one for each user - 2 "SSE Actions" are waiting to send more responses - each to their respective recipients.
Now, user "A" writes a comment, and "submits" it. This sends a POST request to a different controller-action that saves the comment to the database - let's call it "The Submission Action". This controller-action is different from the SSE action, and may theoretically even belong to a different controller (say, the system may have chat-views in multiple pages...).
My question is, then :
"Can a submission-action 'call' an SSE-action that belongs to a different controller, and has a different session/request/response object(s)? If so How?".
I hope it's more clear now...

Arnon Marcus

Apr 8, 2013, 8:41:51 AM
to web...@googlegroups.com
Oh, and I forgot the most important aspect:
"How can an active submission-action of user A, locate the correct session that holds the correct SSE-connection of user B? And how can it use that SSE-action with that session of user B?"

Niphlod

Apr 8, 2013, 8:51:09 AM
to web...@googlegroups.com

As I said, I've already gone over the websocket_messaging.py file - it has dealings with WebSockets - NOT SSE (!) - and via Tornado, NOT web2py...

We all got that. It's an external process, but it's implemented already, it "just works", has a simple yet powerful routing algo, and it's secure.
With SSE you have to do it yourself.
 

I didn't get what you mean by "can I explicitly call it": with either websockets or SSE, as soon as the user hits the page, a connection is established and remains open. There's no request/response cycle, just a request coming in and an (eventually) infinite response out.

What I mean is that once the connection is open, and, say, is handled by a "session", then from that moment on, my usage of this connection would be "pushing" through that connection onto the browser. The usage of the "push" would obviously be from another controller.
I mean, let's take the "chat" use-case :
User "A" logs into a chat-view, and that sends a GET request to a controller-action whose job is to open an SSE connection for that user - a long-lasting session - let's call it "The SSE action". Then user "B" logs into the same view on his side, and the same thing happens for him. Now we have 2 outgoing sessions open - one for each user - 2 "SSE Actions" are waiting to send more responses - each to their respective recipients.
Now, user "A" writes a comment, and "submits" it. This sends a POST request to a different controller-action that saves the comment to the database - let's call it "The Submission Action". This controller-action is different from the SSE action, and may theoretically even belong to a different controller (say, the system may have chat-views in multiple pages...).

This is exactly the example shown in the videos about websocket_messaging.py: the user receives updates through the ws, and sends his message to the default web2py installation with a simple ajax post. web2py then queues that message to tornado, which informs all connected users of the new message on the ws channel.

 
My question is, then :
"Can a submission-action 'call' an SSE-action that belongs to a different controller, and has a different session/request/response object(s)? If so How?".


On the SSE side, you'd have some controller that basically does:

def events():
    initialization_of_sse
    while True:
        yield send_a_message

you have to think about security, routing, etc. by yourself.

Basically in that while True loop you'd likely want to inspect your "storage" (redis, ram, dict, database, whatever) to see if there's a new message for the user.
You can't "exit" from there and resume it.... all the logic needs to happen inside that yield(ing) loop.
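As a toy illustration of that loop (everything here is hypothetical scaffolding: an in-memory dict stands in for redis/a database table, and max_polls exists only so a demo can terminate - a real stream loops forever):

```python
import time

# Hypothetical in-memory store standing in for redis / cache.ram / a db
# table: maps a user id to the messages queued for that user.
MESSAGES = {}

def push_message(user_id, text):
    # Called from the "submission" action: just park the message.
    MESSAGES.setdefault(user_id, []).append(text)

def event_stream(user_id, poll_interval=1.0, max_polls=None):
    # The body of the SSE action: drain anything queued for this user and
    # yield it as an SSE "data:" block, otherwise sleep a bit and re-check.
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        for text in MESSAGES.pop(user_id, []):
            yield "data: %s\n\n" % text
        time.sleep(poll_interval)
```

In the web2py sketch above, the SSE action would simply do `return event_stream(auth.user_id)` after setting the text/event-stream header.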

Arnon Marcus

Apr 8, 2013, 11:48:31 AM
to web...@googlegroups.com
Look, I appreciate you're trying to help-out, but it seems you are answering the questions you know the answers to, instead of the questions I ask.
It's OK to say that you don't know the answer. You are not alone in this user-group, perhaps someone else does.

We all got that. It's an external process, but it's implemented already, it "just works", has a simple yet powerful routing algo, and it's secure.
With SSE you have to do it yourself.
 

I know that there is a "somewhat-working" solution for web-sockets, using Tornado.
I know it would be better to use it, instead of trying to make SSE work in web2py by myself.
In the long-term I'll probably do something like that.

But as you said, not all scenarios require a web-socket - sometimes SSE does what I need.
And as it is HTTP-based, I thought it should have been easy to implement in web2py.

This is exactly the example shown in the videos about websocket_messaging.py: the user receives updates through the ws, and sends his message to the default web2py installation with a simple ajax post. web2py then queues that message to tornado, which informs all connected users of the new message on the ws channel.


Again, that is not an answer to my questions. My questions were referring to how web2py can implement SSE, not how Tornado can implement web-sockets and have web2py push stuff into it.

On the SSE side, you'd have some controller that basically does:

def events():
    initialization_of_sse
    while True:
        yield send_a_message

you have to think about security, routing, etc. by yourself.

Basically in that while True loop you'd likely want to inspect your "storage" (redis, ram, dict, database, whatever) to see if there's a new message for the user.
You can't "exit" from there and resume it.... all the logic needs to happen inside that yield(ing) loop.

That is answering the question : "How does web2py keep a long-lasting connection".
That is NOT answering the question: "How can a different controller-action activate this"

I found a way to extract the web2py-SSE example; here are the relevant parts (I bolded the important stuff):

Controller:

# -*- coding: utf-8 -*-
import time
from gluon.contenttype import contenttype

### required - do not delete
def user(): return dict(form=auth())
def download(): return response.download(request, db)
def call(): return service()
### end requires

def index():
    return dict()

def error():
    return dict()

def sse():
    return dict()

def buildMsg(eid, msg):
    mmsg = "id: %s\n" % eid
    mmsg += "data: {\n"
    mmsg += "data: \"msg\": \"%s\", \n" % msg
    mmsg += "data: \"id\": %s\n" % eid
    mmsg += "data: }\n\n"
    return mmsg

def sent_server_event():
    response.headers['Content-Type'] = 'text/event-stream'
    response.headers['Cache-Control'] = 'no-cache'
    def sendMsg():
        startedAt = time.time()  # http://www.epochconverter.com/
        while True:
            messaggio = buildMsg(startedAt, time.time())
            yield messaggio
            time.sleep(5)
            if (time.time() - startedAt) > 10:
                break
    return sendMsg()

def event_sender():
    response.headers['Content-Type'] = 'text/event-stream'
    response.headers['Cache-Control'] = 'no-cache'
    mtime = time.time()
    return 'data:' + str(mtime)


View (script-part):

if (!window.DOMTokenList) {
  Element.prototype.containsClass = function(name) {
    return new RegExp("(?:^|\\s+)" + name + "(?:\\s+|$)").test(this.className);
  };

  Element.prototype.addClass = function(name) {
    if (!this.containsClass(name)) {
      var c = this.className;
      this.className = c ? [c, name].join(' ') : name;
    }
  };

  Element.prototype.removeClass = function(name) {
    if (this.containsClass(name)) {
      var c = this.className;
      this.className = c.replace(
          new RegExp("(?:^|\\s+)" + name + "(?:\\s+|$)", "g"), "");
    }
  };
}

// sse.php sends messages with text/event-stream mimetype.
var source = new EventSource('{{=URL("sent_server_event")}}');

function Logger(id) {
  this.el = document.getElementById(id);
}

Logger.prototype.log = function(msg, opt_class) {
  var fragment = document.createDocumentFragment();
  var p = document.createElement('p');
  p.className = opt_class || 'info';
  p.textContent = msg;
  fragment.appendChild(p);
  this.el.appendChild(fragment);
};

Logger.prototype.clear = function() {
  this.el.textContent = '';
};

var logger = new Logger('log');

function closeConnection() {
  source.close();
  logger.log('> Connection was closed');
  updateConnectionStatus('Disconnected', false);
}

function updateConnectionStatus(msg, connected) {
  var el = document.querySelector('#connection');
  if (connected) {
    if (el.classList) {
      el.classList.add('connected');
      el.classList.remove('disconnected');
    } else {
      el.addClass('connected');
      el.removeClass('disconnected');
    }
  } else {
    if (el.classList) {
      el.classList.remove('connected');
      el.classList.add('disconnected');
    } else {
      el.removeClass('connected');
      el.addClass('disconnected');
    }
  }
  el.innerHTML = msg + '<div></div>';
}

source.addEventListener('message', function(event) {
  //console.log(event.data)
  var data = JSON.parse(event.data);

  var d = new Date(data.msg * 1e3);
  var timeStr = [d.getHours(), d.getMinutes(), d.getSeconds()].join(':');

  coolclock.render(d.getHours(), d.getMinutes(), d.getSeconds());

  logger.log('lastEventID: ' + event.lastEventId +
             ', server time: ' + timeStr, 'msg');
}, false);

source.addEventListener('open', function(event) {
  logger.log('> Connection was opened');
  updateConnectionStatus('Connected', true);
}, false);

source.addEventListener('error', function(event) {
  if (event.eventPhase == 2) { //EventSource.CLOSED
    logger.log('> Connection was closed');
    updateConnectionStatus('Disconnected', false);
  }
}, false);

var coolclock = CoolClock.findAndCreateClocks();


Now, I can see that it's ported from php, and that there is some unused stuff in the controller - probably as this is a rough proof-of-concept only...
Now, what this example is doing, basically, is establishing an SSE connection with a web2py controller, that yields a time-stamp a few times, then exits out of the loop.
Meaning, it generates a few responses for each single connection, sleeping 5 seconds in between; then the loop is broken, so web2py stops sending more responses.
This closes the connection, and 3 seconds later (as is defined in the SSE spec), the connection re-establishes itself, and so on.
There is also an option to close the connection manually, from the client side.

 That's all fine and dandy...

But it answers NONE of the questions I asked...

There is no inter-controller/action communication in here, there is no way to POST something from the client to the server, that will call a different action in web2py, which will then invoke another yield of the SSE action, thus intentionally spawning another response over the existing connection....
And what if there are multiple connections to multiple clients? The only way to differentiate between them would be via their sessions.
Now, the way I understand this, it's a fundamental "executional" limitation of web2py - it has no concurrency, so each invocation of web2py's wsgi-handler is in fact a single-process-single-thread type of scenario, so there could never exist multiple sessions that are handled at the same time... Unless another process/thread is being spawned by the web-server itself. In that case, there would have to be some sort of inter-process/inter-thread communication going on, in order for one session in one thread to invoke an action in a separate session on a different thread. The only way around this would obviously be using web2py over something like gEvent. But the question would then still remain: whether it's in-proc/in-thread/cross-sub-routine communication, or inter-proc/inter-thread communication, there would STILL be a need to route across "sessions". In a sense, the controller-action's execution-run-time would have to be bound to the session that invoked it.
Am I understanding this correctly?
If so, it's not a small matter - it's a mismatch of fundamental execution-architecture. There is a critical component missing.
If it INDEED does not exist in web2py, I would like it to be said up-front CLEARLY.
This way, a discussion about future possibilities can be started, perhaps for web3py.
I would then also not have to waste time and effort digging through undocumented territories, and half-assed "proofs-of-concept" that, evidently, prove that the concept does not work, and go around it to show some completely useless use-case...

This is all very disappointing...

Arnon Marcus

Apr 8, 2013, 12:03:52 PM
to web...@googlegroups.com
The reason this capability is useless, is that it limits the usage of an SSE connection way too much...
There is no reason I should have to stick my entire code for dealing with updates to a section of my app within a single action...
It makes no sense at all...

If I have an action that updates some data for a "related" section of my application, then using this mechanism I would have to copy it in its entirety into the while-loop...
This is unacceptable... I could have dozens of such actions, spread across my application, each dealing with a different "aspect" of the component that is presented in the SSE-enabled view... There is a relational database in the back - the entire application is segregated in accordance with these relations. I can have many views that deal with related pieces of information, each belonging to a different controller, with its own dedicated update-action within that controller. Then I might have a view that I want to make SSE-enabled, which I want to be updated "in reaction to" changes in "other views".... I am not talking about "chat" only... We have complex business-logic and varied business-application views. We need more than a simple stand-alone controller that can only be called by the client, and can talk to (and be called from) no one else in web2py...

I suggest that a very serious look be done on 0MQ for inter-process/inter-thread-communication - it could be the solution for web2py to become SSE capable - maybe even Web-Sockets capable...

Niphlod

Apr 8, 2013, 12:16:38 PM


On Monday, April 8, 2013 5:48:31 PM UTC+2, Arnon Marcus wrote:
Look, I appreciate you're trying to help-out, but it seems you are answering the questions you know the answers to, instead of the questions I ask.
It's OK to say that you don't know the answer. You are not alone in this user-group, perhaps someone else does.

 
ok, back to the "sse only" discussion......... 


That is answering the question : "How does web2py keep a long-lasting connection".
That is NOT answering the question: "How can a different controller-action activate this"

Why does this have to be activated in another controller-action?
The point of SSE is that as soon as your page does the eventsource(url) stuff, the connection is instantiated and (reconnections aside) never closed.
Sure, you can have your js code attaching/detaching to different urls, but you're going to increase the overall implementation weight.
Inside that loop you can do, e.g.

    msg = db(db.messages.recipient == auth.user_id).select().first()
    yield msg

it's just a simple example that shows how, in the same loop, you can be "influenced" by other actions (i.e. actions that have submitted a record to the messages table)
 
 But it answers NONE of the questions I asked...

There is no inter-controller/action communication in here, there is no way to POST something from the client to the server, that will call a different action in web2py, which will then invoke another yield of the SSE action, thus intentionally-spawning another response over the existing connection....
oh my.... SSE are unidirectional, so of course the example shows you just the server --> client part and not the client-->server one.
you can do the client--> server part as usual with an ajax post.

 
And what if there are multiple connections to multiple clients? The only way to differentiate between them would be via their sessions.
Now, the way I understand this, it's a fundamental "executional" limitation of web2py - it has no concurrency, so each invocation of web2py's wsgi-handler is in fact a single-process-single-thread type of scenario, so there could never exist multiple sessions that are handled at the same time....

 
No framework supports multiple connections in the same thread/coroutine/process. Every different implementation either spawns a process, or a thread, or a greenlet to do the work. Of course the more lightweight the better, so greenlets win hands down.
That being said, web2py is obviously concurrent: no site would be deployed with web2py if that were in fact happening.
Concurrency is achieved with threads with the default webserver, but no one is posing a limit on what you want to run web2py in.
I really don't see what the missing component in web2py is (aside from rocket not being able to support a lot of users with SSE). As soon as you run it in a gevent wsgiserver (or the usual gevented gunicorn) you can do whatever you want.
In the loop you have to build your own routing; that is all that's needed.


EDIT: you don't need to have one-and-only-one SSE-capable controller.
You just need to code into a single one of them what is required by the view that will call it (i.e. you can have a page for a chat that will "call" the sse that deals with the chat, the page of the calendar that listens to the calendar sse, and so on)

Arnon Marcus

Apr 8, 2013, 1:11:57 PM
to web...@googlegroups.com

oh my.... SSE are unidirectional, so of course the example shows you just the server --> client part and not the client-->server one.
you can do the client--> server part as usual with an ajax post.

(I would appreciate you refraining from using expressions with condescending implications such as "oh my...")
I know it's uni-directional... It's not the point...
I mean that another view from another controller would invoke, say, an ajax call to ITS OWN CONTROLLER's action, which would THEN invoke the SSE-enabled-action in the other controller.


EDIT: you don't need to have one-and-only-one SSE-capable controller.
You just need to code into a single one of them what is required by the view that will call it (i.e. you can have a page for a chat that will "call" the sse that deals with the chat, the page of the calendar that listens to the calendar sse, and so on)

Now you are getting closer... Of course I understand that I can have more than a single SSE-enabled controller-action, but as you said - this would mean that, say, a "chat" view may ONLY invoke a "chat" SSE-enabled-controller-action, and a "calendar" view may ONLY invoke a "calendar" SSE-enabled-controller-action...
What if I want 2 users to collaborate on the same data, using different views, and still get real-time updates?
Let's say we have 2 views, a calendar, and a scheduling-run-chart - Different views of the same (or partially-shared) data, for different use-cases.
How can I have one updating the calendar, and getting live-updates from another user updating the schedule (and vice-versa) ?
If it is not clear verbally, perhaps a picture is in order...

I'm attaching a picture - the "missing-part" in web2py from this stand-point, are the "green arrows"...

Arnon Marcus

Apr 8, 2013, 2:51:44 PM
to web...@googlegroups.com
That's what I'm talking about... :)

http://vimeo.com/41410528

And this is from 2011...

When will web2py support 0MQ ???

Ach...

Can web2py subscribe to Redis publishing?

Niphlod

Apr 8, 2013, 3:37:52 PM
to web...@googlegroups.com


On Monday, April 8, 2013 7:11:57 PM UTC+2, Arnon Marcus wrote:

oh my.... SSE are unidirectional, so of course the example shows you just the server --> client part and not the client-->server one.
you can do the client--> server part as usual with an ajax post.

(I would appreciate you refraining from using expressions with condescending implications such as "oh my...")

Sorry, it wasn't my intention.... I'm not a native English speaker and writing does not always convey "emotions" like a face-to-face discussion. By all means, feel free to put a :-) everywhere ....
it's just that seeing web2py "bashed" with expressions like "the example is half-assed" or "something crucial is missing", for something that clearly is not a problem of web2py itself, "sounds bad". I'm trying to follow you and explain/give an alternative to the problem(s) you're pointing to.
 


EDIT: you don't need to have one-and-only-one SSE-capable controller.
You just need to code into a single one of them what is required by the view that will call it (i.e. you can have a page for a chat that will "call" the sse that deals with the chat, the page of the calendar that listens to the calendar sse, and so on)

Now you are getting closer... Of course I understand that I can have more than a single SSE-enabled controller-action, but as you said - this would mean that, say, a "chat" view may ONLY invoke a "chat" SSE-enabled-controller-action, and a "calendar" view may ONLY invoke a "calendar" SSE-enabled-controller-action...
What if I want 2 users to collaborate on the same data, using different views, and still get real-time updates?
Let's say we have 2 views, a calendar, and a scheduling-run-chart - Different views of the same (or partially-shared) data, for different use-cases.
How can I have one updating the calendar, and getting live-updates from another user updating the schedule (and vice-versa) ?
If it is not clear verbally, perhaps a picture is in order...


Picture definitely helps.

What needs to be cleared up is this (sorry if this repeats something that is already clear): you can have as many SSE "hooks" defined in a single page as you want, but **usually** you'd want to have a single one for each page and send different events (in the SSE specs, a different "event:" key) through the same connection, because doing so you can "allocate" a single connection per user. That being said, if your machine can hold 1000 connections and you have no more than 50 users, use as many "hooks" as you wish.

Every SSE "hook" will for all intents and purposes "hold a greenlet captive" for the whole duration of the "streaming" of responses.
Not to stress the core concept too much, but in the end you choose SSE over a recurring ajax poll just to "spare" the reconnection times.
Given that a "greenlet" will be constantly active to send events to the client (a greenlet per page per user), you can't "expect" the normal request/response cycle: the "method" to hold a connection open is not to return something, it's yielding "small blocks" in a never-ending loop.
This "requires" that the logic, e.g. to check for new messages, "happens" in that while loop.

Whatever you choose to implement that logic is up to you: when I said "database, redis, cache, etc" I just pointed out some of the possible implementations:
- a messages table on a database
- a key in the cache.ram
- a list in redis
but it may be as well leveraging the pubsub features of redis.

So, let's take a messages table: you define topics, type of events, content of the event, recipients....
user 1 opens the page /app/default/index.html that has a piece of javascript to hook to the sse on /app/sse/index .
user 2 opens /app/default/index.html and you want him to book an appointment.
When he books it, web2py receives the booking (a normal ajax post in response to a click on a button) and stores it into the messages table.
Inside your "yielding" loop on /app/sse/index you check for new appointments every 3 seconds and user 1 receives the update.
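The messages-table walkthrough above, sketched with a plain list standing in for the db.messages table (the field names id/recipient/body are made up; a real implementation would run a DAL query inside the yielding loop, tracking the last id already delivered on this connection):

```python
# A plain list of dicts standing in for a db.messages table;
# field names (id, recipient, body) are illustrative.
messages_table = []

def book_appointment(recipient, body):
    # The "submission" action: a normal POST handler inserting one row.
    row = {"id": len(messages_table) + 1, "recipient": recipient, "body": body}
    messages_table.append(row)
    return row["id"]

def poll_new_messages(recipient, last_id):
    # One pass of the SSE yielding loop: pick up only the rows newer than
    # what this connection has already delivered, for this recipient.
    rows = [r for r in messages_table
            if r["id"] > last_id and r["recipient"] == recipient]
    new_last_id = rows[-1]["id"] if rows else last_id
    return rows, new_last_id
```

Inside the real SSE action, poll_new_messages would run every few seconds (with a time.sleep in between), and each returned row would be yielded as a "data:" block.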

Let's take instead a pubsub redis topic:
user 1 opens the page /app/default/index.html that has a piece of javascript to hook to the sse on /app/sse/index .
user 2 opens /app/default/index.html and you want him to book an appointment. When he books it, web2py receives the booking and stores it into the redis topic. Inside your "yielding" loop on /app/sse/index you subscribe to the topic and wait for redis to send you a payload that user 1 receives.

As stated before, you "need" to build your own routing mechanism: if you leverage redis pubsub some things can be easier than a database table, but you could as well store each message in a flat file and read from that ....
The yielding loop can just as well "subscribe" to different redis topics as watch for different types of records in your messages table.
Now, taking your graph as example, the green arrows can be done:
- with the "schedule" controller putting a record into redis or into a table, so the "controller 2 sse", when it checks for new updates, can "see" the added ones
- on the other end, the controller 2 that sets the updates can put a record into what the "schedule controller sse" is watching over

Basically, if you need that kind of functionality, where a shared state is needed between two different connections, you need a place both of them can look into. The "normal" action can have a request/response cycle that closes as soon as the new "event" is submitted to the "message queue", while the "sse" action needs to check for new messages in the "queue" every once in a while, never returning from it (because as soon as you return, "wsgi dictates" that the connection closes).

The "every once in a while" is a loop.
If the backend you choose for storage doesn't have the ability to notify something like "hey, I have a new entity for you", you need to loop and sleep a bit (e.g. a database table).
That's more or less what web2py's scheduler worker(s) do: they use a table to communicate their state to the other workers (so they can coordinate among each other), and the web2py "web" process uses those tables to communicate from/to the workers (to --> queueing new tasks, from --> looking for stored results).
 
When you start 4 workers and a webserver, you have 5 processes that know what is happening in the other 4 just by looking into those shared tables. It's not a much different "paradigm" from your separate-controllers "situation": they just need a place to speak to each other.

Redis pubsub has a "blocking call": this means that the method itself sleeps "automatically" until a new entity is available, in which case you can avoid the sleep() call altogether.
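That poll-vs-block difference can be illustrated with the standard library's Queue as a stand-in for a redis pubsub channel (all names here are illustrative): get() parks the consumer until a publisher puts something, so the SSE loop needs no sleep() at all.

```python
try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2, the era of this thread

# Stand-in for a redis pubsub channel: a thread-safe FIFO.
channel = queue.Queue()

def publish(msg):
    # What the normal "submission" action would do after saving the record.
    channel.put(msg)

def blocking_event_stream(max_events):
    # The SSE loop: no sleep() needed, get() blocks until a message arrives.
    # max_events exists only so a demo can terminate.
    for _ in range(max_events):
        yield "data: %s\n\n" % channel.get()
```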


Arnon Marcus

Apr 8, 2013, 5:13:52 PM
to web...@googlegroups.com
I think a picture is worth "more" than a thousand words... ;)

Thanks for clearing that up - I get it now. It is still disappointing that the only way to do this is by "polling"... It's not solving the problem, just moving it around. It's fundamentally (in terms of execution model) no different from using "long-polling" in the client instead of SSE... In both cases you get this scenario:

Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : Yes, here you go...
Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : No
Side A : Are we there yet?
Side B : No
....

The whole point of SSE is to avoid that execution model...
You alluded to Redis's "push" mechanism - I've read your link on Redis's Pub/Sub protocol, but couldn't find how the push is being done.
I'm currently looking into the Python-client implementation options there, but let's assume that there is a way to listen from Python to Redis - where do I put that? Inside the while-loop?
And how does this "generator-instance returned from a controller" work from an execution-model perspective? What happens when it's sleeping? Isn't the Python runtime blocked? I mean, the controller-action "itself" is NOT a generator - it "returns" a generator-instance. It is returning an object. That object has a ".next()" method... Great. Now what happens? Does web2py recognize it as a generator-instance by its type/methods? Then it does a ".next()" call and issues the result within a response with the response headers? What happens then? It sleeps, right? What happens during that sleep? And after it finishes sleeping, it does not yield another value by itself - a generator is not a self-activating agency - it needs to be called explicitly - only then will it replay the loop and yield another result.
This part is still unclear...

Paolo Caruccio

unread,
Apr 8, 2013, 5:50:42 PM4/8/13
to web...@googlegroups.com
When I wrote the small app "SSE_clock" I was searching for a replacement for some "long-polling javascript code" that I was using in order to push db-table update notifications to clients. I abandoned the project for lack of browser support.
Anyway, the application is a simple translation from PHP to Python. The original demo's aim is to show that SSEs reconnect automatically and that it is possible to send multiple events on a single connection. Attached you'll find the original PHP code to compare with the Python version.
However, SSE has other features not discussed in the clock example.
Below are some links that I collected during my research:

sse.7z

Niphlod

unread,
Apr 8, 2013, 6:55:24 PM4/8/13
to web...@googlegroups.com
ok, we are getting closer! ^__^


Thanks for clearing that up - I get it now. It is still disappointing that the only way to do this is by "polling"... It's not solving the problem, just moving it around. It's fundamentally (in terms of execution model) no different from using "long-polling" in the client instead of SSE... In both cases you get this scenario:

Polling has to be done if your backend doesn't notify you that there is a new message.
The difference (traffic-wise) between long-polling ajax and SSE is everything that comes with a new http connection: dns resolving, socket opening, proxies, headers (client-side) and permission checking, session inspecting, validation (server-side).
The point is not short-circuiting the execution model, but short-circuiting the need to establish a new connection.
That being said, if you choose a backend that notifies you when a new message arrives, you can also short-circuit the "polling part".
 
The whole point of SSE is to avoid that execution model...
You alluded to Redis's "push" mechanism - I've read your link on Redis's Pub/Sub protocol, but couldn't find how the push is being done.

The pubsub pattern basically does:
- publish a new message
- all the subscribers receive that message
on redis, the publish part is a method that returns as soon as you've sent the message

publish(channel, message)

the subscribe part is a method that listens "blocking" and "resumes" as soon as a message is received (so, if there are no new messages, it blocks until there is one)
so, in your SSE action you should do something like (pseudo-code)

a = pubsub.subscribe(channel)
while True:
      yield a.listen()
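In real (non-pseudo) code, using the redis-py client, the SSE action body might look roughly like the sketch below. The channel name and the `format_sse` helper are invented, and the generator assumes a Redis server is reachable (only the formatting helper runs standalone):

```python
def format_sse(data, event=None):
    # Serialize one message into SSE wire format:
    # an optional "event:" line, a "data:" line, then a blank line.
    out = ""
    if event:
        out += "event: %s\n" % event
    out += "data: %s\n\n" % data
    return out

def sse_events(channel="updates"):
    """Sketch of an SSE action body: blocks on listen(), no sleep() needed."""
    import redis  # assumes the redis-py package and a local Redis server
    p = redis.StrictRedis().pubsub()
    p.subscribe(channel)
    for message in p.listen():
        # listen() also yields subscribe confirmations; skip those.
        if message["type"] == "message":
            yield format_sse(message["data"])

print(format_sse("hi", event="tick"))
```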

 
I'm currently looking into the Python-client implementation options there, but let's assume that there is a way to listen from Python to Redis - where do I put that? Inside the while-loop?
And how does this "generator-instance returned from a controller" work from an execution-model perspective? What happens when it's sleeping? Isn't the Python runtime blocked? I mean, the controller-action "itself" is NOT a generator - it "returns" a generator-instance. It is returning an object. That object has a ".next()" method... Great. Now what happens? Does web2py recognize it as a generator-instance by its type/methods? Then it does a ".next()" call and issues the result within a response with the response headers? What happens then? It sleeps, right? What happens during that sleep? And after it finishes sleeping, it does not yield another value by itself - a generator is not a self-activating agency - it needs to be called explicitly - only then will it replay the loop and yield another result.

In a gevent environment the coroutine "context switching" happens when you put that thread to sleep. This is done in several standard libs wherever IO is done. Additionally, if you monkey-patched web2py (as anyserver.py does), every sleep() call effectively calls gevent.sleep(), which is "coroutine-friendly": while that coroutine sleeps, the execution of other coroutines can go forward. So yes, a sleep() blocks the execution, but only of that greenlet, letting other greenlets pick up from where they were put to sleep.
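That interleaving can be mimicked with plain generators - a toy round-robin scheduler, purely for illustration (gevent does this transparently, with real greenlets rather than generators, so you never write the scheduler yourself):

```python
def worker(name, steps, log):
    # Each yield plays the role of gevent.sleep(): the worker
    # voluntarily gives control back to the scheduler.
    for i in range(steps):
        log.append("%s:%d" % (name, i))
        yield

def run_round_robin(tasks):
    """Tiny scheduler: advance each task in turn until all finish."""
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)
            tasks.append(task)  # not done yet, requeue it
        except StopIteration:
            pass  # this "greenlet" has finished

log = []
run_round_robin([worker("a", 2, log), worker("b", 2, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1'] - the two workers interleave
```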

As per the WSGI spec, if the body is an iterator, the body is returned in a chunked-like manner to the client: this enables the yielding loop to "stream" pieces of information while keeping the connection open.
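This also answers the ".next()" question: it is the webserver, not web2py, that drives the generator. A toy model of what the WSGI layer does with an iterator body (all names invented):

```python
def sse_action():
    # A controller-style action: it returns a generator, not a string.
    def _stream():
        for i in range(3):
            yield "data: tick %d\n\n" % i
    return _stream()

def toy_wsgi_server(body_iter, send):
    # When the body is an iterator, the server pulls chunks one at a
    # time and pushes each over the (still open) socket.
    for chunk in body_iter:
        send(chunk)

sent = []
toy_wsgi_server(sse_action(), sent.append)
print(len(sent))  # 3 chunks were streamed over one connection
```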
You can yield with the default threaded webserver, but it's implemented as a pool of threads with a maximum value: as soon as there are as many open connections as webserver threads, no other connection can be established.
 
On gevent, on the other hand, a new greenlet is spawned for every request, and given that they are lighter, there's (virtually) no upper bound: that's why an evented environment is recommended (not required, but "highly appreciated" nonetheless) when serving long-standing connections.


Arnon Marcus

unread,
Apr 9, 2013, 5:18:46 AM4/9/13
to web...@googlegroups.com


On Tuesday, April 9, 2013 12:50:42 AM UTC+3, Paolo Caruccio wrote:
When I wrote the small app "SSE_clock" I was searching for a replacement for some "long-polling javascript code" that I was using in order to push db-table update notifications to clients. I abandoned the project for lack of browser support.
Anyway, the application is a simple translation from PHP to Python. The original demo's aim is to show that SSEs reconnect automatically and that it is possible to send multiple events on a single connection. Attached you'll find the original PHP code to compare with the Python version.
However SSE has other features not discussed in the clock example. 


I know all about SSE's features, and have already gone through some of these links, but thank you.
My reaction was not about the client-side example, but about the server-side's limitations in web2py. 

Arnon Marcus

unread,
Apr 9, 2013, 6:43:27 AM4/9/13
to

the subscribe part is a method that listens "blocking" and "resumes" as soon as a message is received (so, it blocks if there are no new messages until there is a new one)
so, in your SSE action you should do something like (pseudo-code)


Yup. That's my point - without gevent, it would be blocking, and thus useless... While it is blocking, web2py is essentially stalled - it's like your main server just got stuck, and is not receiving anything else from that moment on - it will not be able to get out of the loop - ever - until you manually crash it...
 
a = pubsub.subscribe(channel)
while True:
      yield a.listen()

In a gevent environment the coroutine "context switching" happens when you put that thread to sleep. This is done in several standard libs wherever IO is done. Additionally, if you monkey-patched web2py (as anyserver.py does), every sleep() call effectively calls gevent.sleep(), which is "coroutine-friendly": while that coroutine sleeps, the execution of other coroutines can go forward. So yes, a sleep() blocks the execution, but only of that greenlet, letting other greenlets pick up from where they were put to sleep.


Yes, I have been learning about gevent's implementation of coroutines - it's basically using a C-compiled module that plays with moving things between the stack and the heap of the PVM process. I also know how event loops work with Python's coroutines (generators). I meant that the example application, which is supposed to show web2py's capability of using SSE on the server in a default deployment, without gevent, misled me into believing that web2py "by itself" can support SSE on the server for "real-world" use-cases - and it can't - so I got upset and disappointed.
 
As per wsgi specs if the body is an iterator the body is returned in a chunked-like manner to the client: this enables the yielding loop to "stream" pieces of information while keeping the connection open.
You can yield with the default threaded webserver, but it's implemented as a pool of threads with a maximum value: as soon as there are as many open connections as webserver threads, no other connection can be established.

That's actually the interesting bit - it is as I suspected - this is good news! :)
 
 
On gevent, on the other hand, a new greenlet is spawned for every request, and given that they are lighter, there's (virtually) no upper bound: that's why an evented environment is recommended (not required, but "highly appreciated" nonetheless) when serving long-standing connections.


Well, I wouldn't say that...
It "IS" required if you don't want to inject a "polling" loop into your controllers...

Niphlod

unread,
Apr 9, 2013, 7:16:17 AM4/9/13
to web...@googlegroups.com
ok, all clear!
One point though: with a threaded webserver web2py can manage as many connections as there are free threads: it doesn't block everything at the first SSE yielding loop, it just stops serving new connections as soon as there are n open connections, with n == max number of threads.
I can't test it right now, but rocket (the "embedded" webserver of web2py), unless told otherwise via the maxthreads option, opens a new thread for every connection. Of course this can't scale up to 1000 connections, but it's nonetheless sufficient for testing purposes or a small userbase.

Arnon Marcus

unread,
Apr 9, 2013, 4:56:51 PM4/9/13
to web...@googlegroups.com
The first yield WILL block the thread, but, as you say, only the thread of that connection. So the inter-thread communication would then be solved via another "shared" process - Redis - which will act as a message broker, listening for submissions and pushing publications to subscribers.
I guess I can live with that, for now, our user-base is small enough I think...
Apache is doing the same, right?

P.S.: Here is a nice lecture about concurrency and coroutines in Python:






Paolo Caruccio

unread,
Apr 9, 2013, 5:50:41 PM4/9/13
to web...@googlegroups.com
just a crazy question: what if you wrap the EventSource in a web worker?

Niphlod

unread,
Apr 9, 2013, 5:53:19 PM4/9/13
to web...@googlegroups.com


On Tuesday, April 9, 2013 10:56:51 PM UTC+2, Arnon Marcus wrote:
The first yield WILL block the thread, but, as you say, only the thread of that connection. So the inter-thread communication would then be solved via another "shared" process - Redis - which will act as a message broker, listening for submissions and pushing publications to subscribers.

redis is not required either, but that's what happens, yes.
 
 
I guess I can live with that, for now, our user-base is small enough I think...
Apache is doing the same, right?

 
depends on the implementation chosen, but basically yes: you're holding either a thread or a process for as long as the SSE connection is alive.

Arnon Marcus

unread,
Apr 9, 2013, 6:53:01 PM4/9/13
to web...@googlegroups.com
Well, again, Redis IS required for inter-controller communication... (the notorious "green arrows" in my picture...) Which is, to me, a basic requirement for most production use-cases...

So, to sum-up :
- For inter-controller communication, you need an external message broker (Redis/RabbitMQ).
- To avoid "polling" the message broker, you need concurrency (threads/processes/eventlets).

Now we can move on to Socket.IO:

What integration for it (if any), already exists "within" web2py for a "gevent'ed-deployment story" ?
I don't know or care much for Tornado... From what I gather, it is similar to Twisted in terms of asynchronous-coding requirements...
The way I understand it, unless there is some special integration code, using socket.io would usually require running an independent gevent'ed Socket.IO server - and routing "/socket.io/*" URIs in the web-server to it... It would then deal with all of the browsers' client-side socket.io interactions, and interoperate with web2py via a message broker (as noted above).

Am I understanding this correctly?


Arnon Marcus

unread,
Apr 9, 2013, 7:02:19 PM4/9/13
to web...@googlegroups.com


On Tuesday, April 9, 2013 2:50:41 PM UTC-7, Paolo Caruccio wrote:
just a crazy question: what about if you wrap the eventsource in a web worker?


Not sure I'm following you on this one...
How would that help?
The browser is already a non-blocking event machine...
Web workers are not aimed at solving IO blockage - they are for solving long, computation-heavy blockage...
Our predicament is on the server, not the client...

Niphlod

unread,
Apr 10, 2013, 3:38:01 AM4/10/13
to web...@googlegroups.com


On Wednesday, April 10, 2013 12:53:01 AM UTC+2, Arnon Marcus wrote:
Well, again, Redis IS required for inter-controller communication... (the notorious "green arrows" in my picture...) Which is, to me, a trivial requirement for most production use-cases...

I use redis too in standard deployments; that note was only for the next pair of eyes coming to this thread.
 

So, to sum-up :
- For inter-controller communication, you need an external message broker (Redis/RabbitMQ).
- To avoid "polling" the message broker, you need concurrency (threads/processes/eventlets).

Now we can move on to Socket.IO:

yeah!
 

What integration for it (if any), already exists "within" web2py for a "gevent'ed-deployment story" ?

None within web2py, but the gevent-socketio library is advertised as "simple to inject".
 
I don't know or care much for Tornado... From what I gather, it is similar to twisted in terms of asynchronous-coding requirements...

well, tornado is a "friendlier" implementation of the twisted event loop for webservers. Given though that the "partnership" between gevent and socketio is stronger, I'd go for gevent too.
 
The way I understand it, unless there is some special-integration code, then using socket.io, would usually require running an independent gEvent'ed Socket.IO server - and routing "/socket.io/*" URI's in the web-server to it... It will then deal with all browser's "client-socket.io" interactions, and inter-operate with web2py via a message-broker (as noted above).

Am I understanding this correctly?

yes. And even if it were a single process (i.e. you successfully embedded gevent-socketio within web2py), with your requirements a message broker would be needed anyway: it's up to you whether you want a single process serving "two masters" (standard traffic and socket.io traffic) or two separate processes handling their own business.
 