"The select
method has an optional cacheable
argument, normally set to False
. When cacheable=True
the resulting Rows
is serializable but The Row
s lack update_record
and delete_record
methods.
If you do not need these methods you can speed up selects a lot by setting the cacheable attribute:
rows = db(query).select(cacheable=True)" |
Hi everyone,
I discovered this in the book:

"The select method has an optional cacheable argument, normally set to False. When cacheable=True the resulting Rows is serializable, but the Rows lack update_record and delete_record methods. If you do not need these methods you can speed up selects a lot by setting the cacheable attribute:
rows = db(query).select(cacheable=True)"
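If it helps, here is a rough, hypothetical sketch in plain Python (no web2py, no real DAL; `raw_rows` and `wrap_with_methods` are invented for illustration) of why skipping the per-row helpers can matter: attaching closures to every row costs extra work, and closures make the result unpicklable, hence harder to cache:

```python
import pickle

# Hypothetical stand-in for a select result: plain dicts are roughly
# what you get with cacheable=True (no per-row helper methods).
raw_rows = [{"id": i, "name": "row%d" % i} for i in range(1000)]

def wrap_with_methods(rows):
    # Mimics the extra work of a normal select: each row carries a
    # per-record closure (in the spirit of update_record). Building
    # these costs time, and closures cannot be pickled.
    wrapped = []
    for r in rows:
        row = dict(r)
        row["update_record"] = lambda row=row, **kw: row.update(kw)
        wrapped.append(row)
    return wrapped

plain = list(raw_rows)            # the "cacheable" result
rich = wrap_with_methods(raw_rows)

data = pickle.dumps(plain)        # serializable: can go into a cache
try:
    pickle.dumps(rich)            # local closures are not picklable
except Exception as exc:
    print("cannot pickle:", type(exc).__name__)
```

Again, this is only an analogy for the book's claim, not web2py's actual implementation.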
What's the difference (what makes it faster) between a normal select and the select above?
Given the dynamic nature of a database app, is the caching mechanism reserved for static/pseudo-static content?
What I was wondering is: what type of data in a dynamic app demands caching? What typically needs caching in such a context?
Suppose I allowed every user to cache some of their individualized selects. Wouldn't that be too expensive in terms of consumed cache.disk/cache.ram?
--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I'm impressed, Anthony...
Well, all of these (Memcached, Redis) seem to require a lot of technicality and probably setting up your own deployment machine. I am not very enthusiastic about that option, since the internet is full of endless technical setup/config issues. Given what's being said, I see only two kinds of pages in my app I could cache: general information and maybe forms (no db().select()), all shared and uniform data.
There should be a simple way to achieve such a simple thing whatever the platform (PythonAnywhere, etc.). Is there one?
on these "philosophical" notes, boring for someone, exciting for yours truly, the are a few other pointers (just to point out that the concept of "caching" isn't really something to discard beforehand)...be aware that on "caching" there are probably thousands of posts and pages of books written on the matter.As everything, it's a process that has "steps". Let's spend 10 seconds of silence on the sentence "premature optimization is the root of all evil". And another 10 on "There are only two hard things in Computer Science: cache invalidation and naming things". Let those sink in.Ready? Let's go.Step #0 : assessmentConsider an app that has a page that shows the records of a table that YOU KNOW (as you are the developer) gets refreshed once a day (e.g. the temperature recorded for LA for the previous day)Or a page that shows the content of a row that never gets updated (e.g. a blogpost)Given that the single most expensive operation on a traditional webapp is the database (just think to web2py requesting the data, the database reading it from disk, preparing, sending it over the wire, web2py receiving it) developers should always find a way to spend the least possible time on those steps.Optimizing queries (and/or database tuning, normalization, etc). Reducing the number of needed queries to render a page. Requesting just the amount of data needed (paging). Those are HUUUUGE topics (again, zillion of posts, books, years of expertise to master, etc etc etc). 
But, of course, not having to issue a query at all short-circuits all of the above!

Still at step #0: as users come to your app, every request made to those pages triggers the round trip to the database, back and forth, always for the same data, over and over. Granted, 50 req/s certainly won't hurt performance, but once they get to 500, it becomes pretty obvious that "a" short-circuit could save LOTS of processing power.

When you face the problem of scaling to serve more concurrent requests, you either spawn more processes or add servers. Adding frontend servers is easy: the data is transactionally consistent as long as you have a single database instance. You put a load balancer in front of the frontends (it's relatively inexpensive) and go on. Scaling databases by adding servers is NEVER easy (again, the interwebs and libraries are full of evidence, and a big part of NoSQL's "shiny" features is indeed horizontal scaling, with its pros and cons).

Step #1: local caching

Back to your app without cache... wouldn't it be better to avoid calling the db for the same data 500 times per second? Sure. Cache it. Assuming you cache the database results, web2py still needs to compute the view, but that step, compared to the short-circuit, is orders of magnitude less expensive. (Yes, if you cache views you're sidestepping web2py's rendering too, but let's keep as few variables as possible for the sake of this discussion.)

And there you are, at the first iteration of step #1, using 1 MB more of RAM to avoid hitting the database. Cache it for just an hour, do the math on the simple example of 50 req/s, and you have saved 50*60*60 - 1 = 179999 round trips. You can use the savings to do 179999 round trips you actually NEED in other places of your app, keeping the same performance, without additional costs. Whoa!
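The arithmetic above can be checked with a toy cache. This is a minimal sketch of a cache.ram-style store (assumed semantics, not web2py's actual code; `TinyRamCache` and `expensive_select` are invented names): the wrapped function runs only on a miss or after the entry expires, so an hour of identical requests costs one round trip:

```python
import time

class TinyRamCache:
    """Minimal memoizing store: call it with a key, a function and a
    time-to-live; the function only runs on a miss or after expiry."""
    def __init__(self):
        self.store = {}

    def __call__(self, key, func, time_expire):
        now = time.time()
        hit = self.store.get(key)
        if hit is not None and now - hit[0] < time_expire:
            return hit[1]          # fresh enough: no round trip
        value = func()             # miss or expired: do the work
        self.store[key] = (now, value)
        return value

cache = TinyRamCache()
roundtrips = [0]

def expensive_select():
    roundtrips[0] += 1             # pretend this is the db round trip
    return [{"city": "LA", "temp": 21}]

# 180000 requests in the hour (50 req/s * 3600 s), one round trip:
for _ in range(180000):
    cache("yesterday_temp", expensive_select, time_expire=3600)
print(roundtrips[0])  # 1
```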
Step #300

You start caching here and there, and you use 500 MB of RAM. You're using cache.ram, everything is super-speedy, no third parties, just web2py features. Now you need to serve 100 req/s, so you spawn another process... whoopsie... 1 GB of RAM. Or another server: 500 MB on the first and 500 MB on the second... 500 are "clearly" wasted, as they are a copy of the "original" 500. And the second process (or server) still needs to do round trips if its local cache doesn't contain your already-cached-in-another-place query.

Also, something else creeps in... as your app grows, you start losing track of what you cached, when you cached it, and for how long it needs to stay cached... a record fetched on the first server at 8:00 AM could be updated in the meantime and fetched on the second server (because it isn't in its local cache) at 8:02 AM... you're effectively serving different versions from cache!

Step #301

To sidestep both issues, you use Redis or Memcached: they sit outside of the web2py process and consume only 500 MB. For one, two, a zillion processes. And they are a single source of truth. And they are as speedy as cache.ram (or at least of the same order of magnitude).
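The stale-read problem described above can be shown in a few lines. This is a hypothetical sketch in plain Python (dicts stand in for the database, two per-process cache.ram instances, and one shared Redis/Memcached-style store; none of it is real web2py or Redis code):

```python
db = {"post": "v1"}          # the single source of truth (the database)

local_a = {}                 # process A's private cache.ram
local_b = {}                 # process B's private cache.ram

def fetch(cache, key):
    # Serve from cache on a hit; fall through to the db on a miss.
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

fetch(local_a, "post")           # A caches "v1"
db["post"] = "v2"                # the record is updated in the meantime
a_view = fetch(local_a, "post")  # A still serves the stale "v1"
b_view = fetch(local_b, "post")  # B misses and caches the fresh "v2"
print(a_view, b_view)            # v1 v2 -> two versions served at once

shared = {}                  # one store shared by both processes
a_shared = fetch(shared, "post")
b_shared = fetch(shared, "post")
print(a_shared == b_shared)  # True: a single source of cached truth
```

(A real shared store still needs expiry/invalidation to catch up with the db, but at least every process agrees on what is cached.)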
On Wednesday, April 27, 2016 at 2:04:07 PM UTC+2, Anthony wrote:

On Wednesday, April 27, 2016 at 7:00:53 AM UTC-4, Pierre wrote:
> I'm impressed, Anthony...
> Well, all of these (Memcached, Redis) seem to require a lot of technicality and probably setting up your own deployment machine. I am not very enthusiastic about that option, since the internet is full of endless technical setup/config issues. Given what's being said, I see only two kinds of pages in my app I could cache: general information and maybe forms (no db().select()), all shared and uniform data.

I'm not sure what you mean by caching forms, but you probably don't want to do that (at least if we're talking about web2py forms, which each include a unique hidden _formkey field to protect against CSRF attacks).

> There should be a simple way to achieve such a simple thing whatever the platform (PythonAnywhere, etc.). Is there one?

You can just use cache.ram. If running uWSGI with multiple processes, you will have a separate cache for each, but that won't necessarily be a problem (just not as efficient as it could be). You could also try cache.disk and do some testing to see how it impacts performance.

More generally, caching is something you do to improve efficiency, which becomes important as you start to have lots of traffic. But if you've got enough traffic that efficiency becomes so important, you should probably be willing (and hopefully able) to put in the extra effort to set up something like Memcached or Redis. Until you hit that point, don't worry about it.

Anthony
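To get a feel for the cache.ram versus cache.disk tradeoff Anthony mentions, here is a rough standalone sketch (plain Python, not web2py's actual cache implementation; a dict plays cache.ram and a pickle file plays cache.disk) timing repeated reads from each:

```python
import os
import pickle
import tempfile
import timeit

payload = [{"id": i, "body": "x" * 100} for i in range(1000)]

ram = {"key": payload}            # a dict plays the role of cache.ram

path = os.path.join(tempfile.mkdtemp(), "cache.pickle")
with open(path, "wb") as f:       # a pickle file plays cache.disk
    pickle.dump(payload, f)

def from_ram():
    return ram["key"]

def from_disk():
    with open(path, "rb") as f:
        return pickle.load(f)

t_ram = timeit.timeit(from_ram, number=500)
t_disk = timeit.timeit(from_disk, number=500)
print(t_disk > t_ram)             # True: disk round trips cost more
```

If I read the book correctly, the two features discussed in this thread combine as db(query).select(cache=(cache.ram, 3600), cacheable=True), caching the result for an hour while also skipping the per-row helpers; treat that exact signature as something to double-check against the book rather than gospel from me.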