Faster download() function

122 views
Skip to first unread message

Anthony

unread,
Apr 10, 2013, 5:41:30 PM4/10/13
to web2py-d...@googlegroups.com
In the welcome app, should we make these changes to the download() function to improve performance:
 
@cache.client()
def download():
   
"""
    allows downloading of uploaded files
    http://..../[app]/default/download/[filename]
    """

    session
.forget(response)
   
return response.download(request, db)

Adds client-side caching by default, and unlocks the session file so multiple simultaneous downloads (e.g., multiple images loaded on a single page) don't block.

Anthony

Niphlod

unread,
Apr 10, 2013, 5:49:53 PM4/10/13
to web2py-d...@googlegroups.com
watch out, in theory if the download(ed) file is just for that user, it's better to use @cache.client(public=False), so it can't be stored by proxies.

Massimo DiPierro

unread,
Apr 10, 2013, 5:52:19 PM4/10/13
to web2py-d...@googlegroups.com
what about access control which may be performed by response.download?

--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Ricardo Pedroso

unread,
Apr 10, 2013, 5:53:45 PM4/10/13
to web2py-d...@googlegroups.com
Doesn't web2py send "Cache-Control: private" header already?

Anthony

unread,
Apr 10, 2013, 5:54:23 PM4/10/13
to web2py-d...@googlegroups.com
For what it's worth, I still think the naming of @cache.client() is confusing. You would think it is used for client-side caching, but in fact it is used for both client and server side caching (with additional server-side features not available in the standard @cache() decorator). I think it would be better to move the new server-side and client-side features to the standard @cache() decorator (maintaining the current default behavior), with @cache.client() simply being a shortcut for defaulting to turning on client-side caching (as well as changing some of the server-side default behavior when it is activated). Failing that, maybe just look for a more appropriate name, such as @cscache(), @cache.enhanced(), etc.

Anthony

Ricardo Pedroso

unread,
Apr 10, 2013, 5:54:42 PM4/10/13
to web2py-d...@googlegroups.com
I think it's a good idea... and probably put a small comment in front
of session.forget explaining why it's there.
Or leave the session.forget commented with the explanation

Anthony

unread,
Apr 10, 2013, 7:16:13 PM4/10/13
to web2py-d...@googlegroups.com
Would @cache.client(public=False) be sufficient in that case?

Anthony

Massimo DiPierro

unread,
Apr 10, 2013, 7:20:43 PM4/10/13
to web2py-d...@googlegroups.com
I think so. the access control would not be perfect because it may change but generally if a user has access once and loses access, I think we can still offer the cached version.

Anthony

unread,
Apr 10, 2013, 7:42:54 PM4/10/13
to web2py-d...@googlegroups.com
Default client-side caching is only for 5 minutes (actually, maybe we should make that longer).

Anthony

Massimo Di Pierro

unread,
Apr 11, 2013, 1:33:25 PM4/11/13
to web2py-d...@googlegroups.com
I do not have a strong opinion on this. What do you think it should be?

Niphlod

unread,
Apr 11, 2013, 3:02:53 PM4/11/13
to web2py-d...@googlegroups.com
This issue was kinda addressed before on the session.forget() part.
If you inspect the response.download() method, you'll see it uses session.forget already.

On the cache side, much depends on how the users handle the upload fields.
For apps where the uploads are something that gets rarely rendered by a browser (e.g. storing attachments), cache can't help that much.

If instead users use uploads just to store something than then gets shared a lot (e.g., pictures public on the home page, media on the wiki, etc) and without constraints, they should know better and use a folder that "falls" into the /static/ handling (so in production it can be offloaded).

In any case, decorating it with a @cache.client(public=False) is safe, cause it will send out headers that will prevent the client's browser to fetch that resource for the next 5 minutes (and that only client).
If permissions on those file change that fast, I assume a "dedicated" download replacement would be in the user app code: by the way the second a browser has fetched the resource I see no issues on send it the first time cached..... it's kinda a funny requirement to have a client download a file and then a few minutes back preventing him to download the same resource.

On the "let's move cache.client() to some other method, or use it by default", there's a problem: cache.client needs to execute the function completely before storing it into the cache, cause it needs to inspect response.status to know if cache headers can be set (yes, it can eventually suffer the "thundering herd" issue, but there's no other way to code it).

If you find a new name for that I have no absolute issue with that (I posted originally it as a separate module and I'm famous for having no actual sense for giving out names): just not consider it as a simple @cache replacement because the inner workings needed are different from a standard @cache decorator.



Anthony

unread,
Apr 11, 2013, 5:28:42 PM4/11/13
to web2py-d...@googlegroups.com
This issue was kinda addressed before on the session.forget() part.
If you inspect the response.download() method, you'll see it uses session.forget already.

Oops, hadn't noticed that - good catch.

Anthony

Anthony

unread,
Apr 11, 2013, 5:39:06 PM4/11/13
to web2py-d...@googlegroups.com
I suppose it makes sense to leave the default server-side caching at 5 minutes (if nothing else, for backward compatibility), but since @cache.client() is new, we can change the default there. I'd say somewhere from 15 to 60 minutes would make sense. Anyone else have an opinion?

Also, I notice that the docstring for Cache.__call__ recommends using Cache.client for functions that are actions, so how about changing the name of Cache.client to Cache.action? We can then promote using @cache.action() for caching actions and @cache() for caching other non-action functions. That seems to be the fundamental difference between the two -- @cache.action() includes features specifically for dealing with caching actions (e.g., checking response status, including user agent, sending client headers, etc.) that aren't relevant for caching other types of functions.

Anthony

Niphlod

unread,
Apr 11, 2013, 6:20:48 PM4/11/13
to web2py-d...@googlegroups.com
no probl with me. I just changed that line in the docstring that reported already the term "action" to associate it with a "page" response.

As always, I can't take the credit for coming up with the 'action' term ^____^

Niphlod

unread,
Apr 14, 2013, 9:25:12 AM4/14/13
to web2py-d...@googlegroups.com
there's more story into it...

https://groups.google.com/d/msg/web2py/kwcz1G8-T5E/HRbBT4iWJPsJ

Suggestions welcome, so in the next release we'll have a (stable) naming and conventions.

Anthony

unread,
Apr 14, 2013, 12:26:59 PM4/14/13
to
What if we cut

                        rtn = cache_model(cache_key, lambda : func(), time_expire=time_expire)
                   
else:
                        rtn
= func()

and then put the following right after all the headers are set:

                    if cache_model:
                        rtn
= cache_model(cache_key, lambda : func(), time_expire=time_expire)
                   
else:
                        rtn
= func()

That would set the headers before func() gets called.

Anthony

Niphlod

unread,
Apr 14, 2013, 12:40:36 PM4/14/13
to web2py-d...@googlegroups.com
there's always the problem that headers needs to be discarded (as the cached value) for a http(non-200ish) ... no?

Anthony

unread,
Apr 14, 2013, 1:10:39 PM4/14/13
to web2py-d...@googlegroups.com
Oh yeah, forgot we have to check the status first.

Massimo DiPierro

unread,
Apr 17, 2013, 8:24:49 PM4/17/13
to web2py-d...@googlegroups.com
So... we need to change this. Before we do? What shall I change "client" to something else?

Anthony

unread,
Apr 17, 2013, 8:33:00 PM4/17/13
to web2py-d...@googlegroups.com
My suggestion is to change cache.client() to cache.action() as it is intended specifically for caching actions (but not other types of function). It does both client and server caching, so calling it cache.client() is a bit misleading.

Anthony

Massimo DiPierro

unread,
Apr 17, 2013, 9:08:33 PM4/17/13
to web2py-d...@googlegroups.com
Yes, the name is more appropriate. Committing!

Niphlod

unread,
Apr 18, 2013, 3:49:17 AM4/18/13
to web2py-d...@googlegroups.com
remember to update the relative docs in the book.

PS: Do we need to extract a generic function to fiddle with cache headers only, to be used in such cases as the download function?
Basically it won't check for the status codes, but it will enable a leaner syntax for whatever method raises HTTP to trick web2py into serving a page. I'm a bit unsure on how "the trick is spread": @cache.action() was "patching a hole" in blogs/wiki/rss/public services, but internal modules do use "the trick" pretty often.
Can anyone think to something that can decorate a function that uses raise HTTP(), intercept that exception, fiddle with what's there to fiddle and "re-raise" the same HTTP() ? That would close the deal once and for all.

Anthony

unread,
Apr 18, 2013, 7:49:52 AM4/18/13
to web2py-d...@googlegroups.com
Can anyone think to something that can decorate a function that uses raise HTTP(), intercept that exception, fiddle with what's there to fiddle and "re-raise" the same HTTP() ? That would close the deal once and for all.

Haven't thought it through, bu could we wrap the call to func() in a try...except, catch the HTTP exception, and then re-raise it (after possibly setting the headers)? 

Anthony

Niphlod

unread,
Apr 18, 2013, 8:13:26 AM4/18/13
to web2py-d...@googlegroups.com
pseudo-code (aka "who does know how to fill the red bits?")
try:
    rtn = func()
    fiddle_with_headers = True
except HTTP:
    rtn = ?
    fiddle_with_headers = probably False
if fiddle_with_headers:
    fiddle_with_headers_yes()
raise HTTP(what?, rtn?, **headers)

Anthony

unread,
Apr 18, 2013, 9:09:32 AM4/18/13
to web2py-d...@googlegroups.com
Maybe something like this:

try:
    rtn
= func()
    fiddle_with_headers
= True
except HTTP as e:
    http
= HTTP(e.status, e.body, **e.headers)
    fiddle_with_headers
= probably False
else:
    http
= None
if fiddle_with_headers:
    fiddle_with_headers_yes
()
if http:
   
if fiddle_with_headers:
        http
.headers.update(**fiddled_headers)
   
raise http
else:
   
return rtn

However, we still have to account for the case where func() is called within the call to cache():

    rtn = cache_model(cache_key, lambda : func(), time_expire=time_expire)

Anthony

Niphlod

unread,
Apr 18, 2013, 3:51:53 PM4/18/13
to web2py-d...@googlegroups.com
ok, it seems that I tackled it: unfortunately it seems that my brain is so wrapped that whatever line I try to change to make it more readable ends in malfunctions: can you test it please (and optionally beautify it) ?

tried combinations of
@cache.action(quick='SVL') #send headers only if required
@cache.action(quick='SVL', cache_model=cache.ram) #send headers and cache only if required

and def test():
#..return 'a' #should "engage"

#..raise HTTP(200, [a]) #should "engage"

#..raise HTTP(404, [a]) #shouldn't "engage"

#..response.status = 404
#..return 'a' ## shouldn't "engage"




cache.action.txt

Niphlod

unread,
Apr 29, 2013, 3:13:00 PM4/29/13
to web2py-d...@googlegroups.com
ok, after countless trials and errors, I made a pull request on github fixing cache.action for whatever action needs to be decorated based on the code posted ealier. Thanks to all for chiming in, especially Anthony for the pseudo-code-almost-working implementation ^_^ (was indeed very very close to the actual one).

@massimo: I'll send a patch for inclusion for docs for web2py-book ASAP.

Reply all
Reply to author
Forward
0 new messages