smartcache


Massimo DiPierro

Mar 17, 2015, 2:33:21 PM
to web2py-d...@googlegroups.com
This came up briefly but without a proper discussion. I propose that when web2py stores sessions on disk, the disk sessions can optionally be buffered in RAM using smartramcache.py (https://github.com/mdipierro/w3/blob/master/modules/smartramcache.py).

This will keep a fixed number of sessions in RAM. If the session changes, the cache is refreshed and the record is updated. If the session has not changed and the data is in the cache, it is read from the cache. The cache is limited in size and only keeps the N most recent records.

This would not work behind a load balancer unless the sessions are sticky.

Do you see any problem? Should it be on or off by default for sessions in db?

Massimo

Leonel Câmara

Mar 17, 2015, 3:27:53 PM
to web2py-d...@googlegroups.com
It's bugged, because times doesn't really track the dict like it should. How to reproduce: 

>>> c = SmartRamCache(cache_size=3)
>>> c.get('foo', lambda: 'bar')
'bar'
>>> c.get('foo', lambda: 'bar', force=True)
'bar'
>>> c.get('foo', lambda: 'bar', force=True)
'bar'
>>> c.get('baz', lambda: 'bar')
'bar'
>>> c.get('baz', lambda: 'bar', force=True)
'bar'
>>> c.get('baz', lambda: 'bar', force=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 20, in get
KeyError: 'foo'

It would probably be better to just use an OrderedDict.
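A minimal sketch of the OrderedDict approach, assuming the same get(key, func, force=False) call and cache_size argument used in the session above. The class name BoundedCache and its internals are illustrative, not the code in smartramcache.py. Because the ordering lives in the same structure as the values, there is no separate bookkeeping that can drift out of sync.

```python
from collections import OrderedDict


class BoundedCache(object):
    """Illustrative stand-in for SmartRamCache; not the original code."""

    def __init__(self, cache_size=1000):
        self.cache_size = cache_size
        self.data = OrderedDict()  # key -> value, oldest stored entry first

    def get(self, key, func, force=False):
        """Return the cached value, calling func() on a miss or when forced."""
        if not force and key in self.data:
            return self.data[key]
        value = func()
        # Pop before re-inserting so a refreshed key moves to the newest
        # position; pop with a default never raises KeyError.
        self.data.pop(key, None)
        self.data[key] = value
        while len(self.data) > self.cache_size:
            self.data.popitem(last=False)  # evict the oldest stored entry
        return value
```

Replaying the sequence above against this sketch (three calls for 'foo', then three for 'baz') returns 'bar' each time, since a forced refresh only ever touches the key being refreshed.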

I think using a RAM cache can be a good idea for sessions, if we are careful about flushing changes to the session and handle logouts and session timeouts appropriately.

Massimo DiPierro

Mar 17, 2015, 3:30:01 PM
to web2py-d...@googlegroups.com
Implementation aside (yes, we should use OrderedDict), what do you think of the general idea?

--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Leonel Câmara

Mar 17, 2015, 3:35:52 PM
to web2py-d...@googlegroups.com
I think it's great, I'm annoyed that I haven't thought about it, it makes a lot of sense, and it could help web2py gain a few crucial milliseconds per request. If we can refactor Session in the process of implementing this - even better.


Massimo DiPierro

Mar 17, 2015, 3:39:29 PM
to web2py-d...@googlegroups.com
We have refactored session many times. It is a very delicate piece of code and works very well now. I am not very keen on refactoring it. The only thing I would like to do is somehow decouple it from the web2py request/response objects so we can make it work with other frameworks too.




Niphlod

Mar 17, 2015, 3:52:38 PM
to web2py-d...@googlegroups.com
Isn't a proper LRU better suited for the job? How about concurrency?
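For illustration, a rough sketch of what a proper LRU with basic thread safety could look like. The LockedLRU name and its layout are hypothetical. Unlike a cache that only tracks when a record was stored, a hit here also marks the entry as most recently used, and a lock guards every access.

```python
import threading
from collections import OrderedDict


class LockedLRU(object):
    """Hypothetical thread-safe LRU cache; names are illustrative only."""

    def __init__(self, cache_size=1000):
        self.cache_size = cache_size
        self.data = OrderedDict()     # least recently used entry first
        self.lock = threading.Lock()  # guards every access to self.data

    def get(self, key, func, force=False):
        with self.lock:
            if not force and key in self.data:
                value = self.data.pop(key)  # hit: re-inserted below as newest
            else:
                value = func()              # miss or forced refresh
                self.data.pop(key, None)
            self.data[key] = value
            while len(self.data) > self.cache_size:
                self.data.popitem(last=False)  # evict the least recently used
            return value
```

Note that func() runs while the lock is held, so slow loads are serialized, and that a threading.Lock only protects threads inside one process. It does nothing for several worker processes.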

Niphlod

Mar 17, 2015, 4:55:06 PM
to web2py-d...@googlegroups.com
And processes restarted by a hypervisor?

Michele Comitini

Mar 17, 2015, 5:06:07 PM
to web2py-developers

Each process would have a different copy of the session cache; I do not think it is a good road to travel.

https://docs.python.org/2.7/library/multiprocessing.html#sharing-state-between-processes

Having left the Python 2.5 legacy behind, it would be nice to at least use the above, but we cannot unless we create the Array or Value early in the WSGI chain, before the forks start. This needs some investigation.

A viable fallback could be:

https://docs.python.org/2.7/library/mmap.html

That is portable enough and gives a lot of freedom, at the expense of allowing disasters.
A good shared memory service is offered by many WSGI adapters, such as the excellent uWSGI.
What is needed is a SharedMemoryDict that can eventually be extended to use backends other than mmap.
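As a very rough sketch of the idea, here is a toy SharedMemoryDict on top of an anonymous mmap. Only the class name comes from the message above; the layout (a length header followed by one pickled dict), the default size, and the methods are assumptions made for illustration. On Unix an anonymous map created before the workers fork is inherited by them, which is why it would have to be set up early.

```python
import mmap
import pickle
import struct

HEADER = struct.Struct("I")  # payload length, stored at the start of the map


class SharedMemoryDict(object):
    """Toy dict-like store whose contents live in an anonymous mmap region."""

    def __init__(self, size=65536):
        self.size = size
        self.buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
        self._dump({})

    def _load(self):
        # Read the length header, then unpickle exactly that many bytes.
        (length,) = HEADER.unpack(self.buf[:HEADER.size])
        return pickle.loads(self.buf[HEADER.size:HEADER.size + length])

    def _dump(self, data):
        payload = pickle.dumps(data, 2)
        if HEADER.size + len(payload) > self.size:
            raise MemoryError("shared region is full")
        self.buf[:HEADER.size] = HEADER.pack(len(payload))
        self.buf[HEADER.size:HEADER.size + len(payload)] = payload

    def __getitem__(self, key):
        return self._load()[key]

    def __setitem__(self, key, value):
        # Rewrites the whole dict on every update; simple but not efficient.
        data = self._load()
        data[key] = value
        self._dump(data)

    def __contains__(self, key):
        return key in self._load()
```

This sketch has no locking at all, so two processes writing at once can corrupt the region. That is exactly the kind of disaster mentioned above, and a real version would need cross-process synchronization.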


NB: The same should be used for managing the compiled Python code that is now kept in dictionaries.



Niphlod

Mar 18, 2015, 6:10:44 AM
to web2py-d...@googlegroups.com
I shudder at the number of tests we would need to carry out on it :D

Leonel Câmara

Mar 20, 2015, 9:39:00 AM
to web2py-d...@googlegroups.com
Couldn't we use memcache for this? Basically the sessions would still be saved on disk or in the db, but they would be cached as they're retrieved, and the cache entry would be deleted whenever the session is modified.
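The flow described here could look roughly like the sketch below. FakeMemcache is an in-process stand-in that only mimics get/set/delete so the example runs on its own, and CachedSessionStore with its dict-like backend is likewise hypothetical, not web2py's session code.

```python
class FakeMemcache(object):
    """In-process stand-in exposing get/set/delete; not a real client."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


class CachedSessionStore(object):
    """Hypothetical wrapper: sessions persist in backend, cache is a helper."""

    def __init__(self, backend, cache):
        self.backend = backend  # dict-like stand-in for disk/db storage
        self.cache = cache
        self.backend_reads = 0  # counts how often we had to hit the backend

    def load(self, session_id):
        data = self.cache.get(session_id)
        if data is None:
            # Cache miss: read from the persistent store and cache the result.
            self.backend_reads += 1
            data = self.backend[session_id]
            self.cache.set(session_id, data)
        return data

    def save(self, session_id, data):
        # Persist first, then drop the cached copy so the next load re-reads it.
        self.backend[session_id] = data
        self.cache.delete(session_id)
```

Because the persistent store stays authoritative, losing a cache entry only costs one extra read on the next request.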

Niphlod

Mar 20, 2015, 10:02:55 AM
to web2py-d...@googlegroups.com
There are already facilities to store sessions in memcache and in redis.

Michele Comitini

Mar 20, 2015, 10:06:08 AM
to web2py-developers
Yes, as a *cache* memcache would be ok, but it is somewhat slower and needs some system configuration.
Memcache cannot be used as a storage facility for sessions, because they would be deleted when memcache is full, even if they have not expired.




Massimo DiPierro

Mar 20, 2015, 10:17:43 AM
to web2py-d...@googlegroups.com
The problem with cache.ram and cache.disk is that they do not provide constant size. They can suffer from memory leaks unless the developer takes care of cleaning the cache explicitly.

Niphlod

Mar 20, 2015, 12:54:03 PM
to web2py-d...@googlegroups.com

On Friday, March 20, 2015 at 3:17:43 PM UTC+1, Massimo Di Pierro wrote:
The problem with cache.ram and cache.disk is that they do not provide constant size. They can suffer from memory leak unless the developer takes care of cleaning cache explicitly.

We know it, and we love/hate that implementation detail. cache.disk at least is multiple-reader friendly now, although I should really test it.

However, while cache.ram is EXPLICITLY marked as not working in multiprocessor environments, the backend used for storing sessions NEEDS to work in multiprocessor environments.
 


On Mar 20, 2015, at 9:06 AM, Michele Comitini <michele....@gmail.com> wrote:

yes as a *cache* memcache would be ok, but somewhat slower and needs some system configuration.
I'd urge you to bench redis :-P. Point taken if you're using more than your RAM size to store session data, but that's really not the issue at hand.

Michele Comitini

Mar 20, 2015, 1:07:12 PM
to web2py-developers
Yes, redis can be used to persist sessions and is as fast as memcache. I use memcache a lot for caching DAL queries, and sessions get thrown out fast :)
Many systems come with memcache preconfigured with a ridiculous 64MB of RAM; that would kill any benefit and make you lose all of your sessions.
