Sticky sessions in a distributed environment


Louis Amon

Oct 24, 2014, 5:41:06 AM
to web...@googlegroups.com
I am trying to scale up my application deployed on Heroku by increasing the number of dynos and am currently confronted with the issue of handling sessions in a distributed environment.

The regular solution (storing sessions in the database) does not seem to work anymore when multiple dynos run concurrently: clients get asked to log in at every request.
I have no idea why this doesn't work, since databases are supposed to be shared between dynos on Heroku, but as far as I know there are 2 possible ways to manage scalable sticky sessions:

  1. Memcache: I couldn't use gluon/contrib to test this because the MemcacheClient does not allow authentication in a connection string (as required by services like Memcached, MemCachier...)
  2. Redis: same issue. The Redis client does not seem to work well with auth-based services that are available on Heroku (e.g. RedisCloud)


Any idea why db-based sessions do not stick out of the box on Heroku, and/or how to use a cloud-based service to achieve session stickiness?

Massimo Di Pierro

Oct 24, 2014, 9:24:41 AM
to web...@googlegroups.com

<quote>
The Heroku routing infrastructure does not support “sticky sessions”. Requests from clients will be distributed randomly to all dynos running your application.
</quote>

Louis Amon

Oct 24, 2014, 5:22:51 PM
to web...@googlegroups.com
Indeed the very structure of Heroku makes it difficult to retain session coherence.

But is there a way to work around this?

I mean if Heroku’s structure makes it so that web2py gets completely confused about which dyno is which, then any performance gain attained through a distributed architecture is practically useless (unless you do not need sessions).

Is it safe to say that web2py as a framework is not fit to work with Heroku unless you only need one dyno?


--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to a topic in the Google Groups "web2py-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/web2py/aRIVySTv6hE/unsubscribe.
To unsubscribe from this group and all its topics, send an email to web2py+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Anthony

Oct 24, 2014, 5:48:15 PM
to web...@googlegroups.com
With sessions in a shared database, you shouldn't need sticky sessions, as the sessions are accessed in one central location. Not sure why it's not working in this case, but it bears further investigation.

Anthony

Niphlod

Oct 24, 2014, 7:00:28 PM
to
+1 on Anthony. Dynos are meant to scale as "serving frontends".
The data(base) that dyno1 sees NEEDS to be the same data(base) that dyno2 sees, or consistency is not assured (and there's little point in being "distributed" when the only consistency comes from replicated static content).
Being distributed doesn't mean being scalable by default. Being "easily" scalable needs consistency at the infrastructure level, and that's what - supposedly - Heroku gets paid for.


PS (on the Redis issue): I'm working towards a small rewrite of redis_cache and redis_session that will avoid having separate methods (limited to the Redis implementation that was available at the time), with the added bonus of NOT requiring a separate connection for cache and session. That should solve the issue because - ideally - if you can use service xyz with the redis library, you should be able to use it in web2py too. That being said, what is missing from the current implementation? We already have "password", which should be enough to connect to RedisCloud.

Derek

Oct 28, 2014, 6:39:37 PM
to web...@googlegroups.com
Store your sessions in cookies?

To store sessions in cookies instead you can do:

session.connect(request,response,cookie_key='yoursecret',compression_level=None)

Here cookie_key is a symmetric encryption key. compression_level is an optional zlib compression level.

While sessions in cookies are often recommended for scalability reasons, they are limited in size. Large sessions will result in broken cookies.

Louis Amon

Oct 29, 2014, 12:59:05 PM
to web...@googlegroups.com
I tried to apply Derek’s advice by storing sessions in cookies. Same problem: as long as my web app runs on 1 dyno everything works fine, but if I scale it to 2 or more then sessions start to break.

I also +1 what Anthony said: whether it is stored in a database or in a cookie, a session should be consistent no matter how much you scale your system. This issue is really weird (and annoying).

If you would like to replicate it, here’s how:

  1. Deploy any app to Heroku using web2py’s Rocket web server
  2. Scale your Heroku app to 2 or more (you don’t have to register a means of payment to do it)
  3. Go to yourapp.com/appadmin
  4. Navigate through appadmin (at least 5-10 requests)
  5. You will notice that you get randomly disconnected, and have to enter your password multiple times


Louis Amon

Nov 25, 2014, 5:10:17 AM
to web...@googlegroups.com
I've been in contact with Heroku's support regarding this issue, and here is what they told me:

This doesn't appear to be a problem coming from our platform. Any request coming into your app will have session cookies passed.
The way sessions stored on databases work is by setting a cookie with an id, and then looking for the session data related to that id.


I looked at the cookies in my browser's development tools and found that indeed, there are two cookies: session_id_APPNAME and session_id_admin. These cookies and their values persist across requests, even with multiple dynos.


I now wonder about web2py's session management, especially regarding Auth: how exactly does it decide whether a user is an existing user or a new user?


If it works as this person from Heroku support described, then it is weird, because my database has exactly ONE entry in the table "web2py_session_APPNAME" and that is exactly the value of the cookie found in my browser.

Anthony

Nov 25, 2014, 7:43:21 AM
to web...@googlegroups.com
> I checked into my browser's development tools to check cookies and found that indeed, there are two cookies: session_id_APPNAME and session_id_admin. These cookies and their values are persistent through requests, even with multiple dynos.
>
> I now wonder about web2py's session management and especially regarding Auth: How exactly does it decide whether a user is an existing user or a new user?

There are two separate issues -- checking to see if there is a currently active session, and separately checking for login. web2py determines if there is a session by checking for a session cookie and seeing if it has a database record (or file) with a matching session ID. For login, it checks whether there is an "auth" object in the session, and if so, whether it is expired or not.

Can you first determine whether sessions are working (independent of auth/login)? If you save some value to the session, can you retrieve it on subsequent requests? If so, the problem isn't with the session per se, but specifically with Auth. We may need to see some code to figure out what's going on.
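
One way to run this check is a throwaway action that only touches the session and never touches Auth. The sketch below is hypothetical: bump_counter is an illustrative helper, not part of web2py, and it assumes only that the session object behaves like a dictionary (web2py's session is a Storage, which is a dict subclass). If the counter keeps climbing while requests are served by different dynos, session storage is shared correctly and the problem is specific to login.

```python
def bump_counter(session, key="hits"):
    """Increment a per-session counter and return its new value.

    `session` is any dict-like object. In a web2py controller you would
    pass the global `session` (assumption: it supports .get() and item
    assignment, as gluon.storage.Storage does).
    """
    current = session.get(key) or 0
    session[key] = current + 1
    return session[key]


# Hypothetical controller action (web2py injects `session` at runtime):
#
#     def session_check():
#         return dict(hits=bump_counter(session))
```

Reloading such a page a dozen times on a multi-dyno deployment should show a steadily increasing number; a counter that keeps falling back to 1 points at session storage rather than Auth.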

Anthony

Louis Amon

Nov 26, 2014, 6:43:26 AM
to web...@googlegroups.com
Ok I think I found where the problem lies:

In applications/admin/models/access.py we have this structure:

if request.env.web2py_runtime_gae:
    session_db = DAL('gae')
    session.connect(request, response, db=session_db)
    hosts = (http_host, )
    is_gae = True
else:
    is_gae = False


What it basically does is either connect sessions to a database based on GAE or fall back to the default session management (which is file-based if I'm not mistaken).

On Heroku, each dyno has its own ephemeral filesystem, so admin sessions written by one dyno cannot be found when the next request lands on another dyno.
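
The failure mode described here can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the Dyno class, the dictionaries standing in for a sessions folder and a database table, and the purely random router. It is not web2py or Heroku code; it only shows why per-dyno file storage loses logins while a shared store does not.

```python
import random


class Dyno:
    """A web process with its own private (ephemeral) session storage."""

    def __init__(self, name, shared_store=None):
        self.name = name
        # A plain dict stands in for the dyno's local sessions/ folder.
        self.local_files = {}
        # When given, sessions go to this shared dict (stand-in for a DB).
        self.shared_store = shared_store

    def _store(self):
        if self.shared_store is None:
            return self.local_files
        return self.shared_store

    def login(self, session_id, user):
        self._store()[session_id] = {"user": user}

    def is_logged_in(self, session_id):
        return session_id in self._store()


def count_lost_sessions(dynos, requests=50, seed=0):
    """Log in once on the first dyno, then route requests at random.

    Returns how many requests hit a dyno that does not know the session,
    i.e. how many times the user would be asked to log in again.
    """
    rng = random.Random(seed)
    session_id = "abc123"
    dynos[0].login(session_id, "admin")
    return sum(
        0 if rng.choice(dynos).is_logged_in(session_id) else 1
        for _ in range(requests)
    )
```

With a single dyno, or with several dynos sharing one store, the count stays at zero; with two or more file-based dynos a share of the requests lands on a dyno that has never seen the session.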


The underlying problem is that we do not have a variable that tells us whether we are running on Heroku (something like request.env.web2py_runtime_heroku).

I've tried to locate where in the code this variable is set but no luck so far. 


I'm thinking of a patch that would look like this:

if request.env.web2py_runtime_gae:
    session_db = DAL('gae')
    session.connect(request, response, db=session_db)
    hosts = (http_host, )
    is_gae = True
elif request.env.web2py_runtime_heroku:
    session_db = ???
    session.connect(request, response, db=session_db)
else:
    is_gae = False


On GAE there has to be only one database so finding where to store sessions is easy.

On Heroku on the other hand... you can have multiple PostgreSQL databases that can be found in environment variables.

I don't see any way to define a clear rule about which database should store sessions.


What do you think?

Leonel Câmara

Nov 26, 2014, 7:01:52 AM
to web...@googlegroups.com
How will you get your db configuration in admin? I don't think it should have all these ifs and elifs especially for Heroku when this isn't Heroku-specific; this is about how you can configure the admin to store sessions in the DB.

Maybe just add an option in settings.cfg to configure the admin's session storage? Otherwise, I think people just modifying the admin model if they want that is ok too.

Louis Amon

Nov 26, 2014, 8:34:58 AM
to web...@googlegroups.com
I can’t find any documentation about settings.cfg.

How does it work? Where is it loaded? Is it application-specific?


I’m not sure setting a connection string in a file is the way to go with Heroku: you don’t really have those (they are dynamically generated).
When you create a db in Heroku you get a key like "HEROKU_POSTGRESQL_ONYX_URL", and the environment variable corresponding to this key holds the connection string that the DAL needs.

So basically my DAL connection looks something like:

db = DAL(os.environ['HEROKU_POSTGRESQL_ONYX_URL'])


I’ll probably edit the models in admin for now but it isn’t a very good solution because if I update web2py then my changes are lost...
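
One possible rule for picking the session database from the environment is sketched below. It rests on assumptions that are not confirmed in this thread: that Heroku exposes the primary database as DATABASE_URL, and that additional databases appear under names of the form HEROKU_POSTGRESQL_<COLOR>_URL. The helper name find_database_url is hypothetical.

```python
import os


def find_database_url(environ=None):
    """Return a connection string for the session database, or None.

    Assumptions: DATABASE_URL points at the primary database, and extra
    databases are exposed as HEROKU_POSTGRESQL_<COLOR>_URL variables.
    """
    environ = os.environ if environ is None else environ
    if environ.get("DATABASE_URL"):
        return environ["DATABASE_URL"]
    # Fall back to the alphabetically first colour-named database so the
    # choice is at least the same on every dyno.
    candidates = sorted(
        name for name in environ
        if name.startswith("HEROKU_POSTGRESQL_") and name.endswith("_URL")
    )
    return environ[candidates[0]] if candidates else None


# Hypothetical use in a web2py model (DAL, session, request and response
# are supplied by the framework):
#
#     uri = find_database_url()
#     if uri:
#         session.connect(request, response, db=DAL(uri))
```

Because the lookup happens at request time, nothing has to be written to a config file, which sidesteps the dynamically generated connection strings.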



Anthony

Nov 26, 2014, 10:15:50 AM
to web...@googlegroups.com
On Wednesday, November 26, 2014 8:34:58 AM UTC-5, Louis Amon wrote:
> I can’t find any documentation about settings.cfg.
>
> How does it work? Where is it loaded? Is it application-specific?

settings.cfg is in /web2py/applications/admin/ and is specific to the admin app. Currently it only holds editor settings and is only read in when editing. I suppose we could add a setting to specify where to store sessions.

Anthony

Louis Amon

Nov 26, 2014, 12:15:42 PM
to web...@googlegroups.com
After a whole day of struggling I came to the conclusion that this is more of a design flaw than a bug:

Appadmin is designed to check credentials with another application (admin).

The default behaviour of the credential checking system does not allow any modification and has to be done with gluon/fileutils.py (i.e. a file-based session management of the 'admin' application).

The only exception is Google App Engine (it is written in the code with a big 'if' statement) which uses Google's SDK.


I think it is a shame that the implementation of GAE in web2py is so restrictive: it would be so much better if the code were written to be more open to other cloud services. Many services are emerging these days (e.g. http://www.paasify.it/vendors) and Google is not necessarily the best answer for everyone. At least it is not the only answer, that's for sure.

On the whole "session management" subject : I think it is an interesting idea to be able to check for credentials through applications but as we allow sessions to be stored in a database for any application, 'admin' should also be able to manage sessions in a database... with all corresponding functions that are currently file-based (e.g. appadmin.py).



[tl;dr]

So far the only solution to my issue was to:
  1. Remove appadmin.py from my app
  2. Disable the 'admin' application
  3. Rewrite gluon/contrib/heroku.py (my current file is attached to this reply)
Now basically if you use the "detect_heroku" function then you can apply "session.connect(...)" in your model with no risk of sessions being lost, no matter how many dynos you deploy on Heroku.

That's the best solution I found so far, although far from perfect.
heroku.py

Niphlod

Nov 26, 2014, 2:48:49 PM
to web...@googlegroups.com
I agree, but let's remember that the "admin" app is not meant to be deployed in production: that's probably the reason why this "corner case" surfaced now instead of some time ago.

Louis Amon

Nov 28, 2014, 3:29:23 AM
to web...@googlegroups.com
Right.

Then why does the whole ticketing system depend on "admin"?

It is even said in the Deployment Recipes chapter of the doc: "You can later view the errors using the admin app, clicking on the "switch to: db" button at the top, with the same exact functionality as if they were stored on the file system."

Same goes for appadmin.py and its permission-based "manage" feature, which is described as role-based database access. Why would you define roles in a local environment? Obviously this is meant to go to production in a secure back-office...


Admin is designed as more than a testing tool for a local or development environment, yet it is not meant to be deployed in production.


I now get why many blogs and forums discussing "web2py vs other framework" always end up mentioning that developing on web2py starts as a very thrilling experience but often leaves a bitter taste.

Leonel Câmara

Nov 28, 2014, 6:15:57 AM
to web...@googlegroups.com
> I now get why many blogs and forums discussing "web2py vs other framework" always end up mentioning that developing on web2py starts as a very thrilling experience but often leaves a bitter taste.

You could have suggested that we make admin more production ready than it already is, maybe even contribute towards it, instead you make a cheap jab. Are you 12 or something?

Louis Amon

Nov 28, 2014, 7:04:59 AM
to web...@googlegroups.com
I don't think commenting about my age is any more mature than what you claim I would be...


Anyway, contributing is among my current preoccupations, as I mentioned in another post.


The thing is: my disappointment here lies in the architecture, not the features.

There is little contribution to be made regarding an architectural issue (except a big refactoring of course, which you can't expect a regular contributor to do on his own).


Web2py provides a lot of features, and they are supposed to be adaptable and easy to include in your own application. In many ways, I find this is not true.

My personal experience is that I put a lot of time and effort to try and work as close to web2py as I could, and now when I try to make my app evolve I end up either cutting from web2py (i.e. rebuilding the whole feature with custom code instead of using web2py's) or deleting the feature altogether.

Perhaps web2py is too feature-intensive to be truly adaptable, perhaps it is too education-oriented to work quite well in other domains (mine would be e-commerce)...


I think it is fair to share personal experiences regarding the framework (good or bad), as long as they are relevant.

Leonel Câmara

Nov 28, 2014, 10:31:08 AM
to web...@googlegroups.com
Any project will eventually outgrow any framework it uses. When that happens you can either contribute to the framework, if it makes sense, or just do it yourself for that application. The advantage is that with web2py you can use the included tools until you reach that moment, whereas with a more bare-bones framework you don't even get that option.

> I think it is fair to share personal experiences regarding the framework (good or bad), as long as they are relevant.

Sure, but keep it technical.

Niphlod

Nov 28, 2014, 12:20:54 PM
to web...@googlegroups.com
Ticketing doesn't depend on admin, and you - again, usually - don't want to see errors on the production server.
Just copy errors over and view them on your development instance (or let the script tickets2email.py send the errors over to you).

And... in a production server appadmin.py isn't needed.

BUT... you're right. Given that it kinda does work on GAE, it should kinda work on other diskless PaaS as well. It just hasn't received that much attention yet.


Anthony

Nov 28, 2014, 4:39:05 PM
to web...@googlegroups.com
Louis, I think you're getting a little pushback because you make a rather extreme criticism in the face of a fairly small issue that (a) you can easily work around right now and (b) would not be difficult to fix in the framework. Furthermore, the issue at hand affects features of web2py (i.e., admin and appadmin) that most other frameworks don't even have.

Note, aside from adding a few lines in the admin app to have its sessions stored in the db, another option might be to edit the handler file used to start web2py. In that file, you should be able to do something like this:

from gluon.settings import global_settings
global_settings.db_sessions = True

Your handler file shouldn't change on web2py upgrade, so this setting will remain in effect.

If all you really need is appadmin, then you could also directly edit the appadmin.py controller so it uses the Auth of its own app rather than the admin app (this won't change when you upgrade web2py because the appadmin.py file is part of the app code and is not automatically overwritten on upgrade). In fact, this is already how the appadmin "manage" feature works, so if that feature is adequate for you, you don't need to make any changes at all.

As for error tickets, if you want them to go directly into the db instead of initially being written to the filesystem, I think you can add the following in a model:

request.tickets_db = db

To then view the tickets via admin, you need a "ticket_storage.txt" file in the app's /private folder that includes the URI of the database. I suppose this poses a problem on Heroku, so we may need to come up with an alternative means of getting the URI.

More generally, perhaps we could add a feature that would allow a flag to be set indicating that the runtime environment lacks a persistent filesystem, in which case, sessions and error tickets would automatically go in the database. The framework obviously already includes the needed functionality, as this is what is done on GAE -- we just need to generalize the setting so it can be used on any platform.
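
A rough sketch of what such a generalized flag could look like follows. None of these names exist in web2py: WEB2PY_EPHEMERAL_FS is a made-up environment variable used as an explicit override, and the fallback relies on the assumption that Heroku sets a DYNO variable inside every dyno. It only illustrates the idea of one switch driving both session and ticket storage.

```python
import os


def lacks_persistent_filesystem(environ=None):
    """Guess whether the runtime has no durable local disk.

    WEB2PY_EPHEMERAL_FS is a hypothetical explicit override. Without it,
    the presence of DYNO is taken as a hint that we run on Heroku
    (assumption: Heroku sets DYNO, e.g. "web.1", on running dynos).
    """
    environ = os.environ if environ is None else environ
    flag = environ.get("WEB2PY_EPHEMERAL_FS", "").strip().lower()
    if flag in ("1", "true", "yes"):
        return True
    if flag in ("0", "false", "no"):
        return False
    return bool(environ.get("DYNO"))


def choose_storage(environ=None):
    """Return 'db' or 'file', to be used for sessions and tickets alike."""
    return "db" if lacks_persistent_filesystem(environ) else "file"
```

The explicit override comes first so that a deployment can always state the answer itself instead of depending on platform detection.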


> The thing is: my disappointment here lies in the architecture, not the features.
>
> There is little contribution to be done when regarding an architectural issue (except a big refactoring of course, which you can't expect a regular contributor to do on his own).

I don't think this is an architectural issue or requires a big refactoring. We already have the functionality in place to work on GAE -- it just needs to be tweaked to apply more generally.
 
> I think it is fair to share personal experiences regarding the framework (good or bad), as long as they are relevant.

Of course, but consider the tone. Every improvement to web2py is contributed by a volunteer.
 
> Same goes for appadmin.py and its permission-based "manage" feature which is described as a role-based database access. Why would you define roles in a local environment? Obviously this is meant to go on production in a secure back-office...

As noted above, the "manage" feature of appadmin does not in fact delegate authorization to the admin app. It instead relies directly on the app's own Auth system (which is how it is able to use role-based authorization).

In any case, I agree that to the degree that it is feasible, it should be possible to run admin and appadmin in production. They are probably not the ideal tools in "serious" production environments, but they have their place as a convenience in some settings. We should be realistic, though -- if you don't have a persistent filesystem, the admin app will have its limitations.

Anthony

Louis Amon

Dec 4, 2014, 9:21:22 AM
to web...@googlegroups.com
> Note, aside from adding a few lines in the admin app to have its sessions stored in the db, another option might be to edit the handler file used to start web2py. In that file, you should be able to do something like this:
>
>     from gluon.settings import global_settings
>     global_settings.db_sessions = True

This does not work because db_sessions is only considered if response.session_storage_type == 'db' (i.e. if the db attribute is set in session.connect(...)).
Admin uses the default session.connect(...) so it does not set a db to store sessions into. Global settings won't change that (sadly).

The only reason why admin does store sessions in db on GAE is this snippet of code in its models (access.py):

if request.env.web2py_runtime_gae:
    session_db = DAL('gae')
    session.connect(request, response, db=session_db)



Storing tickets in db with request.tickets_db sounds neat but, as you mentioned, it can't be done on Heroku because of the ephemeral filesystem.

My solution so far was to build a custom application to which all errors are routed (using routes_onerror). I built some code (largely based on scripts/tickets2email.py) to render tickets into HTML emails that are sent directly to app administrators when an error 500 is hit.
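
The rendering step of such a setup could look roughly like the sketch below. It is not the code from this post (which was not shared) and not tickets2email.py. It assumes a ticket is available as a dictionary with string fields such as 'layer', 'output' and 'traceback', and the function name and addresses are hypothetical. Only the standard library is used, and actually sending the message (for example via smtplib) is left out.

```python
from email.mime.text import MIMEText
from html import escape


def ticket_to_email(ticket_id, ticket, sender, recipients):
    """Build an HTML email message from an error-ticket dictionary.

    Assumption: `ticket` has fields such as 'layer', 'output' and
    'traceback'; missing fields are rendered as empty blocks.
    """
    sections = "".join(
        "<h3>{}</h3><pre>{}</pre>".format(
            escape(field), escape(str(ticket.get(field, "")))
        )
        for field in ("layer", "output", "traceback")
    )
    body = "<html><body><h2>Ticket {}</h2>{}</body></html>".format(
        escape(ticket_id), sections
    )
    message = MIMEText(body, "html", "utf-8")
    message["Subject"] = "[web2py] error ticket {}".format(ticket_id)
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    return message
```

Escaping every field matters here because tracebacks routinely contain angle brackets that would otherwise be interpreted as markup by the mail client.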

The reason for this is that the ephemeral system on Heroku can be reset randomly at any given time, so sending tickets through the bundled script would sometimes send "ticket missing" emails.

Anthony

Dec 4, 2014, 11:37:04 AM
to web...@googlegroups.com
On Thursday, December 4, 2014 9:21:22 AM UTC-5, Louis Amon wrote:
>> Note, aside from adding a few lines in the admin app to have its sessions stored in the db, another option might be to edit the handler file used to start web2py. In that file, you should be able to do something like this:
>>
>>     from gluon.settings import global_settings
>>     global_settings.db_sessions = True
>
> This does not work because db_sessions is only considered if response.session_storage_type == 'db' (i.e. if the db attribute is set in session.connect(...)).
> Admin uses the default session.connect(...) so it does not set a db to store sessions into. Global settings won't change that (sadly).

Good point. So, I suppose you would have to edit admin no matter what. Still, it's a fairly simple edit.

> The reason for this is that the ephemeral system on Heroku can be reset randomly at any given time, so sending tickets through the bundled script would sometimes send "ticket missing" emails.

I'm not too familiar with Heroku, but according to the little documentation I have seen, the filesystem should persist until the dyno restarts or is shut down. Does it really just reset randomly?

Anthony

Louis Amon

Dec 4, 2014, 1:37:22 PM
to web...@googlegroups.com
> I'm not too familiar with Heroku, but according to the little documentation I have seen, the filesystem should persist until the dyno restarts or is shut down. Does it really just reset randomly?

In theory you would be correct. The doc says that it resets on restarts or upon stopping a dyno.

I've had a few Heroku apps running in the last few months though, and I've noticed that tickets stored on the ephemeral filesystem during a weekend are often found missing on Monday...

In a similar fashion, if you use "admin" to edit some source code on the fly (before putting it in a commit for example), you'll often find that your changes won't persist more than a few minutes.

Louis Amon

May 4, 2015, 11:56:36 AM
to web...@googlegroups.com
Update

Heroku now provides an elegant solution to this issue: session affinity


Basically, it sets a special cookie that tends to distribute users towards the dyno they've previously been served with.

I've tested it and it does solve the initial issue, namely allowing a file-based session system (admin & appadmin) in a concurrent server environment.
This is also a great performance improvement as you can rely less on Memcache and more on RAM-based caching.

The only caveat I've found so far is that when you scale your dyno pool up or down, sessions end up being redistributed on the fly to even things out between dynos.
This shouldn't be an issue if your main application doesn't rely too heavily on the filesystem though.
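
The cookie-based pinning described in this update can be pictured with a toy router. This is an illustrative model only, not Heroku's implementation: the class, the "affinity" cookie name and the re-pinning rule are all assumptions, and the real router may rebalance clients more aggressively when the pool changes.

```python
import random


class AffinityRouter:
    """Toy router that pins each client to a dyno via a cookie value."""

    def __init__(self, dyno_count, seed=0):
        self.dyno_count = dyno_count
        self._rng = random.Random(seed)

    def route(self, cookies):
        """Return the dyno index for this request and update the cookie."""
        pinned = cookies.get("affinity")
        if pinned is None or pinned >= self.dyno_count:
            # First visit, or the pinned dyno no longer exists: pick anew.
            pinned = self._rng.randrange(self.dyno_count)
            cookies["affinity"] = pinned
        return pinned

    def scale(self, dyno_count):
        """Change the pool size; stale pins are fixed on the next request."""
        self.dyno_count = dyno_count
```

As long as the pool size is stable, a client keeps hitting the same dyno, so file-based sessions are found again; after scaling down, a client whose dyno disappeared is sent elsewhere and its file-based session is gone.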