[web2py] simple update of an cache.ram python dict


Richard

unread,
Dec 21, 2015, 10:22:45 PM12/21/15
to web2py-users
Hello,

I am still on 2.9.5. I have a simple dict cached in RAM which never expires, and which I update when new key/value pairs are added to the system... Mainly the dict contains ids and their representations...

It works flawlessly in dev, but once I pushed to prod, it seems that the cached dict takes time to really update... Here is how I manage the creation and update of this dict:

def set_id_represent(update_id_represent_if_elapsed_time=None, id=None):
    """
    Calling this function will create in globals the "id_represent" variable if the call is
    made without id. If id is passed, it will update the id_represent dictionary with the new
    id and its representation.
    :param id:
    :param update_id_represent_if_elapsed_time:
    """
    if 'id_represent' not in globals():
        global id_represent
        id_represent = \
            cache.ram('id_represent',
                      lambda: {r.id: r.represent_field
                               for r in db().select(db.table_name.id,
                                                    db.table_name.represent_field,
                                                    orderby=db.table_name.represent_field)},
                      time_expire=update_id_represent_if_elapsed_time)
    elif isinstance(id, (int, long)):
        id_represent_query = \
            db(db.table_name.id == id).select(db.table_name.id,
                                              db.table_name.represent_field,
                                              orderby=db.table_name.represent_field)
        id_represent.update({r.id: r.represent_field for r in id_represent_query})
    if id:
        return id_represent[id]

set_id_represent(update_id_represent_if_elapsed_time=None)

Then, when I want to update the cached dict with a new k, v:

set_id_represent(id=my_id)

I have made some tests with prints after the id_represent.update(...) above, from the function call, and the dict seems to be updated... The function that calls set_id_represent(id=id) doesn't fail, but when we want to access a page which uses id_represent[some_id], they all fail for a couple of minutes... As if the cached dict doesn't get updated immediately...

Thanks for any pointer...

Richard





Richard Vézina

unread,
Jan 4, 2016, 4:18:12 PM1/4/16
to web2py-users
UP here!

Any help would be appreciated...

Richard

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Richard Vézina

unread,
Jan 13, 2016, 11:26:16 AM1/13/16
to web2py-users
Hello,

Still struggling with this. I don't understand why the cached dict is not updated in real time...

It gets updated, but there is a strange delay.

Thanks

Richard

Anthony

unread,
Jan 13, 2016, 12:19:22 PM1/13/16
to web2py-users
Are you using nginx/uwsgi? If so, I believe cache.ram would not be shared across the different uwsgi worker processes. You might consider switching to the Redis cache.
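A toy model of what Anthony describes (`WorkerRamCache` is an invented name, not web2py API): each uwsgi worker is a separate OS process with its own interpreter, so each holds its own private cache dictionary, and an update made in one worker is invisible to the others.

```python
# Two instances of this class stand in for two uwsgi worker processes.
class WorkerRamCache:
    def __init__(self):
        self.storage = {}  # lives only inside this "process"

    def __call__(self, key, compute):
        if key not in self.storage:
            self.storage[key] = compute()
        return self.storage[key]  # a reference, not a copy

worker_a = WorkerRamCache()  # uwsgi worker 1
worker_b = WorkerRamCache()  # uwsgi worker 2

rep_a = worker_a('id_represent', lambda: {1: 'one'})
worker_b('id_represent', lambda: {1: 'one'})

# A request handled by worker 1 updates worker 1's dict in place...
rep_a.update({2: 'two'})

# ...worker 1 sees the new key on subsequent requests, worker 2 never does.
assert worker_a('id_represent', lambda: {})[2] == 'two'
assert 2 not in worker_b('id_represent', lambda: {})
```

Requests are load-balanced across workers, so some requests hit a worker whose dict was updated and some hit one whose dict was not, which matches the "works for a while, then fails" symptom.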

Anthony

Richard Vézina

unread,
Jan 13, 2016, 1:01:03 PM1/13/16
to web2py-users
Ha!!

Yes nginx/uwsgi...

That's what I was suspecting; it looks like the dict was kind of unique per user...

I think we should leave a note somewhere in the book about this issue...

I was in the process of exploring Redis cache or memcache just to see if there was any improvement. I will look into Redis.

Thank you Anthony.

Richard


Richard Vézina

unread,
Jan 13, 2016, 3:18:10 PM1/13/16
to web2py-users
Redis keys stick forever...

I figured out most of how to use the Niphold contrib, though it seems that created cached elements stay in Redis forever... I restarted the server and they are still there...

My issue looks like it is still there... There may be something wrong in my logic...

It's as if the cached dict can't just be updated as usual:

dict.update({key: value})

Richard

Anthony

unread,
Jan 13, 2016, 4:36:15 PM1/13/16
to web2py-users
On Wednesday, January 13, 2016 at 3:18:10 PM UTC-5, Richard wrote:
Redis keys stick forever...

I figured out most of how to use the Niphold contrib, though it seems that created cached elements stay in Redis forever... I restarted the server and they are still there...

Are you saying you re-started the web server (e.g., uwsgi) or the entire VM? I don't think Redis would be affected by the former.

Anthony

Niphlod

unread,
Jan 13, 2016, 4:42:29 PM1/13/16
to web2py-users
hem....... cache.ram behaves differently from ANY other backend because it just stores a reference to the computed value.
That's why you can do a dict.update() without explicitly setting a new cached value....... but you'd need to if you see it from a "flow" perspective.

What you are trying to do, in pseudo code, is really:

if something_happens:
    search_cached_value
    if cached_value is not there:
        calculate cached_value for all
        store it in cache
    if just_a_piece isn't there:
        fetch_cached_value
        update_a_piece_of_it
        store updated_cached_value into cache

and that is what you need to do with anything that is not cache.ram (i.e. fetch "all", update a piece, store "the new all")

This difference in behaviour, and the lack of proper shortcuts, is one of the things I hate most about web2py's cache (which has some pros, but lots of cons, as in your case). The API for this kind of operation isn't really straightforward...


Wrapping things up....

if 'foo' not in globals():
    fetching = cache.whatever_but_not_ram('thekey', lambda: 1, time_expire=None)  # special case to never refresh the value
    if fetching == 1:
        # nobody ever stored something here
        huge_thing_to_compute = cache.whatever_but_not_ram('thekey', lambda: long_cache_func(), time_expire=0)  # special case to ALWAYS refresh the value
    else:
        # something is stored here
        if needs_updating:
            fetching[just_a_piece] = 'newpiecevalue'
            cache.whatever_but_not_ram('thekey', lambda: fetching, time_expire=0)


BTW: am I the only one thinking that you're overcomplicating the issue? Just store the pieces in cache and let them refresh when needed, without dealing with the whole object and its invalidation (plus the global escalation of local variables, etc etc etc)

BTW2: if you ask to store keys indefinitely, redis is happy to persist them. While cache.ram and cache.disk will effectively NEVER delete the values, redis has a strict upper limit of one day (24*60*60) to avoid stale records everywhere.
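Niphlod's fetch/update/store flow can be sketched with a fake serializing backend (`cache_fetch`, `cache_store`, and `get_or_update` are invented names standing in for any non-RAM backend such as Redis, memcache, or disk):

```python
import pickle

_backend = {}        # stands in for the external cache server
MISSING = object()   # sentinel: "nobody ever stored something here"

def cache_fetch(key):
    raw = _backend.get(key)
    return MISSING if raw is None else pickle.loads(raw)  # fresh copy on every read

def cache_store(key, value):
    _backend[key] = pickle.dumps(value)

def get_or_update(key, compute_all, piece=None):
    cached = cache_fetch(key)
    if cached is MISSING:
        # calculate cached_value for all, store it in cache
        cached = compute_all()
        cache_store(key, cached)
    elif piece is not None and piece[0] not in cached:
        # update a piece of the local copy, then store the whole thing back
        cached[piece[0]] = piece[1]
        cache_store(key, cached)
    return cached

get_or_update('id_represent', lambda: {1: 'one'})
get_or_update('id_represent', lambda: {}, piece=(2, 'two'))
assert cache_fetch('id_represent') == {1: 'one', 2: 'two'}
```

The key point is the explicit `cache_store` after the in-place update: with a serializing backend, mutating the fetched copy alone changes nothing in the store.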

Niphlod

unread,
Jan 13, 2016, 4:56:19 PM1/13/16
to web...@googlegroups.com
errata corrige on BTW2: on redis, time_expire=None results in the key being stored at most one day. You can always do time_expire=30*24*60*60 for 30 days' worth.

things_to_know: only cache.disk and cache.ram can effectively cache a value indefinitely and enable that strange behaviour of retrieving the previously cached element - the one you marked to expire after 5 seconds - one day later.
Web2py's cache API is effectively incompatible with any high-performance caching system, unless the user takes upon himself the burden of clearing the cache regularly (which is never the case, because cache invalidation is one of the hardest things to master).
Both memcache and redis take a stricter approach, i.e. the value will effectively expire at the time set when it was first stored. So, you can only fetch a cached value one day later if you originally marked it as expiring one day later. Redis allows a leeway of 120 seconds to accommodate 90% of the behavioural use cases of cache.ram and cache.disk.

so, if you need to use cache.redis or cache.memcache, you need to implement another step.
fetch
calculate_new_value
if was_there:
    delete_previously_stored_in_cache
cache_new_value

Richard Vézina

unread,
Jan 14, 2016, 10:00:36 AM1/14/16
to web2py-users
Hello Simone,

Thanks for jumping into this thread... I understand what you are saying... The thing is, I need this dict to be global, and I thought it would be cleaner to have these dicts created and updated by the same function.

So, my main issue with both cache.ram and cache.redis is that new id representations never get added to the dict "permanently". In the case of cache.ram, the issue may come from what Anthony explained, because I use uwsgi/nginx. But I have made some tests with Redis and the issue is still there, though maybe for a different reason, I don't know. I mean, if I update the Redis cached dict from the shell and try to retrieve the representation by passing the key to the dict, it works, but it looks like this only works in the shell. In the case of Redis, I may have to recompute the whole dict based on what you explained, which would not provide any performance improvement, because what I am trying to prevent is exactly the creation of the dictionary, which requires a lot of computing for nothing each time a new record gets created. There may be something I don't understand about how to refresh the Redis cache, or in what you explained.



On Wed, Jan 13, 2016 at 4:56 PM, Niphlod <nip...@gmail.com> wrote:
errata corrige on BTW2: on redis, time_expire=None results in the key being stored at most one day. You can always do time_expire=30*24*60*60 for 30 days' worth.

things_to_know: only cache.disk and cache.ram can effectively cache a value indefinitely and enable that strange behaviour of retrieving the previously cached element - the one you marked to expire after 5 seconds - one day later.
Web2py's cache API is effectively incompatible with any high-performance caching system, unless the user takes upon himself the burden of clearing the cache regularly (which is never the case, because cache invalidation is one of the hardest things to master).
Both memcache and redis take a stricter approach, i.e. the value will effectively expire at the time set when it was first stored. So, you can only fetch a cached value one day later if you originally marked it as expiring one day later. Redis allows a leeway of 120 seconds to accommodate 90% of the behavioural use cases of cache.ram and cache.disk.


Anthony

unread,
Jan 14, 2016, 10:39:42 AM1/14/16
to web2py-users
So, my main issue with both cache.ram and cache.redis is that new id representations never get added to the dict "permanently". In the case of cache.ram, the issue may come from what Anthony explained, because I use uwsgi/nginx. But I have made some tests with Redis and the issue is still there, though maybe for a different reason, I don't know. I mean, if I update the Redis cached dict from the shell and try to retrieve the representation by passing the key to the dict, it works, but it looks like this only works in the shell. In the case of Redis, I may have to recompute the whole dict based on what you explained, which would not provide any performance improvement, because what I am trying to prevent is exactly the creation of the dictionary, which requires a lot of computing for nothing each time a new record gets created. There may be something I don't understand about how to refresh the Redis cache, or in what you explained.

The point is that when you retrieve something cached anywhere but RAM, you are getting a copy of the object. If you then update that copy in your Python code, that does nothing to update the value that is stored in the cache. So, if you want to update the cached value, you have to explicitly put the new copy of the entire object back into the cache.

Anthony
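A minimal sketch of that copy semantics, using a dict plus pickle to stand in for Redis (names here are illustrative, not web2py API):

```python
import pickle

# Every read deserializes a brand-new object, so mutating that
# copy never changes what is stored in the backend.
store = {}

def cache_set(key, value):
    store[key] = pickle.dumps(value)

def cache_get(key):
    return pickle.loads(store[key])  # fresh copy on every read

cache_set('id_represent', {1: 'one'})
copy = cache_get('id_represent')
copy[2] = 'two'                            # updates only the local copy
assert 2 not in cache_get('id_represent')  # the store never saw it

cache_set('id_represent', copy)            # explicit write-back is required
assert cache_get('id_represent')[2] == 'two'
```

This is why `id_represent.update(...)` appears to work within one request (the local copy has the new key) but the change evaporates afterwards unless the whole dict is written back.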


Richard Vézina

unread,
Jan 14, 2016, 11:12:12 AM1/14/16
to web2py-users
This is true for any other cache except cache.ram right?

If so, there is no gain with cache.redis the way I use it...

@Anthony, are you sure about the issue with uwsgi/nginx and cache.ram dict update?

I guess I should start to look at how to get rid of these global dicts while not degrading system performance. There are surely places where I use these global vars that wouldn't suffer from a little query to the backend, but for grids, where the performance gain was the greatest or where simplified code was achieved with them, it will be difficult to stop using them...

Thanks

Richard



Richard Vézina

unread,
Jan 14, 2016, 11:21:58 AM1/14/16
to web2py-users
@Niphold, I just sent a PR with improvements, mainly docstrings and PEP8, for the cache redis contrib...

:D

Richard

Anthony

unread,
Jan 14, 2016, 12:06:22 PM1/14/16
to web2py-users
On Thursday, January 14, 2016 at 11:12:12 AM UTC-5, Richard wrote:
This is true for any other cache except cache.ram right?

Right. cache.ram works because it doesn't have to pickle a Python object and put it into external storage (and therefore create a fresh copy of the stored object via unpickling at retrieval time). Rather, it simply stores a pointer to the existing Python object within the current Python process. Of course, this limits cache.ram to a single process, so if your app is being served by multiple processes, each will have its own version of cache.ram.
 
If so, there is no gain with cache.redis the way I use it...

Well, the gain with Redis is that it will actually work, though you will have to adjust your code to save the whole dictionary back to the cache upon update.
 
@Anthony, are you sure about the issue with uwsgi/nginx and cache.ram dict update?

I think so. You might try configuring uwsgi to run a single process with multiple threads instead of using multiple processes. Not sure how that will impact performance.
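The single-process setup Anthony suggests could be sketched as a uwsgi configuration (illustrative values, adapt to your deployment):

```ini
; hypothetical uwsgi config: one worker process with several threads,
; so every request sees the same cache.ram instance
[uwsgi]
processes = 1
threads = 4
enable-threads = true
```

Note that a single process may limit throughput on multi-core machines, so profile before committing to this layout.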
 
I guess I should start to look at how to get rid of these global dicts while not degrading system performance. There are surely places where I use these global vars that wouldn't suffer from a little query to the backend, but for grids, where the performance gain was the greatest or where simplified code was achieved with them, it will be difficult to stop using them...

Is it really a problem to write the whole dictionary to Redis? How often are updates happening?

Anthony

Richard Vézina

unread,
Jan 14, 2016, 12:13:09 PM1/14/16
to web2py-users
Yes, this may be an option (updating the whole dict in Redis)... In the meantime I will try to get rid of them, if I can succeed in that...

:)

Thanks Anthony.

Richard


Richard Vézina

unread,
Jan 14, 2016, 3:40:09 PM1/14/16
to web2py-users
There is something I don't understand... I put a couple of print statements to see if my cached var was in globals(), and I discovered that my var was never there...

I am completely lost... If it passes through my "if", my dict will be recreated on each request...

:(

Richard

Richard Vézina

unread,
Jan 14, 2016, 3:48:00 PM1/14/16
to web2py-users
Forget about my last message, I was making a mistake in my print statements.

Richard Vézina

unread,
Jan 14, 2016, 9:19:44 PM1/14/16
to web2py-users
I don't understand something... I have this:

```python
    def set_dict_test():
        if 'dict123' not in globals():
            print 'dict123 not in globals : %s' % str('dict123' not in globals())
            global dict123
            dict123 = cache.ram('dict123', lambda: {1: 1, 2: 2, 3: 3}, time_expire=None)
            # dict123 = {1: 1, 2: 2, 3: 3}
        else:
            print 'no dict creation'
            print dict123

    set_dict_test()
    # dict123 = cache.ram('dict123', lambda: {1: 1, 2: 2, 3: 3}, time_expire=None)
    print 'AFTER function call... dict123 in globals : %s' % str('dict123' in globals())
    print dict123
```

What I don't understand is why, each time set_dict_test() is called, dict123 is never in globals().

If I don't use cache.ram, dict123 is in globals() and the function goes through the else: branch...

It's the same for cache.redis()

Richard

Anthony

unread,
Jan 15, 2016, 12:58:05 AM1/15/16
to web2py-users
If I don't use cache.ram, dict123 is in globals() and the function goes through the else: branch...

What do you mean by "if I don't use cache.ram"? Are you saying you are using cache.disk, or no cache at all? If the latter, how is dict123 in globals() (i.e., where do you define it)?

Anthony

Richard Vézina

unread,
Jan 15, 2016, 9:21:16 AM1/15/16
to web...@googlegroups.com
I always use cache.ram... The function and the function call are in a model, so the variable is accessible from every controller file...



With Redis, this works:

from gluon.contrib.redis_cache import RedisCache
cache.redis = RedisCache('localhost:6379', db=None, debug=True, with_lock=False, password=None)


def redis_set_id_represent(update_id_represent_if_elapsed_time=None, update_cache=None):
    """
    Calling this function will create in globals the "id_represent" variable. If "id_represent" already
    present in globals() we do nothing. To update redis.cache key "id_represent" the whole dictionary has to be
    recreate, to do so, someone should call this function with "update_chache=True".

    In module, to avoid cached dictionary to be recreated all the time this function should be call without passing
    "update_cache" parameter or by passing "update_cache=False".

    :param update_cache:
    :param update_id_represent_if_elapsed_time:
    """

    if 'id_represent' not in globals() or update_cache is not None:

        # print 'id_represent not in globals() or init_cache'

        # Clear Redis Key
        cache.redis.clear(regex='id_represent')

        # Delete dictionary from global
        if 'id_represent' in globals():
            del globals()['id_represent']

        # Query the database
        id_represent_query = \
            db().select(db.table_name.id,
                        db.table_name.represent_field,
                        orderby=db.table_name.represent_field)

        # Create new global and assign Redis cache variable
        global id_represent
        id_represent = cache.redis('id_represent',
                                   lambda: {r[0]: r[1] for r in id_represent_query},
                                   time_expire=update_id_represent_if_elapsed_time)

redis_set_id_represent(update_id_represent_if_elapsed_time=None)

Richard


Anthony

unread,
Jan 16, 2016, 6:11:53 PM1/16/16
to web...@googlegroups.com
It's not clear what you are trying to achieve. If this code is in a model file, then "id_represent" will never be in globals() before it is called (each request is executed in an isolated, ephemeral environment, so subsequent requests will not see the "id_represent" from previous requests). Also, why define "id_represent" as global and assign it within the function rather than simply returning it and assigning it in the model when calling the function (which will make it globally available in any controller or view)?

In other words, why not just use the cache as usual:

id_represent = cache.redis('id_represent',
                           lambda: db().select(db.mytable.id, db.mytable.field).as_dict(),
                           time_expire=expiration)

Or just cache the select directly:

id_represent = db().select(db.mytable.id, db.mytable.field,
                           cache=(cache.redis, expiration), cacheable=True)

In the latter case, lookups might be a little slower because you'll have to use the rows.find method, which scans the records, but you should do some profiling to see if this is really an issue for your use case.
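The dict-versus-scan trade-off can be sketched with plain Python (hypothetical field names; `find_by_id` models the linear scan that `rows.find` performs, it is not the web2py API itself):

```python
# Hypothetical cached table data.
rows = [{'id': i, 'represent_field': 'name-%d' % i} for i in range(1000)]

def find_by_id(rows, id):
    # rows.find-style lookup: O(n) scan over the cached records
    return next((r for r in rows if r['id'] == id), None)

# One-time index built from the same rows: O(1) lookups afterwards.
by_id = {r['id']: r['represent_field'] for r in rows}

assert find_by_id(rows, 500)['represent_field'] == 'name-500'
assert by_id[500] == 'name-500'
```

For a dict of this size either approach is fast; profiling, as Anthony says, is the only way to know whether the scan matters for a real workload.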

Anthony

Richard Vézina

unread,
Jan 18, 2016, 10:18:21 AM1/18/16
to web2py-users
The idea of making a function was to be DRY: on one hand, create the dictionary of id representations once... then use the same function to update it wherever a record gets created and a new representation has to be added. I could have done what you explained, a plain cache.ram() assignment and an update-cached-var function, so no global stuff...

In the case of cache.redis(), I can do the same thing, but since I will need a function anyway (or repeat the same code snippet everywhere, NOT DRY), I took this path...

Note: in the case of the cache.redis() function above, I can see that the cached Redis key is in globals()... So the issue (if it is an issue) is only with cache.ram(), which I find strange... But maybe it could explain why cache.ram() can't be used in the context of uwsgi/nginx...

Richard


