RFC: models router

Anthony

Sep 1, 2012, 1:39:11 AM
to web2py-d...@googlegroups.com
This thread got me thinking. What do you think of this patch for a simple models router (a net addition of 8 lines of code)? In any model file, you can do:

response.models_to_run = ['list', 'of', 'regexes']

models_to_run can be a single regex string, a list of regex strings, or a compiled regex. The regexes are relative to '/models/'. It can be changed within any model file, which affects whichever model files come later in alphanumeric order. The default is:

response.models_to_run = [r'^\w+\.py$', r'^%s/\w+\.py$' % request.controller,
    r'^%s/%s/\w+\.py$' % (request.controller, request.function)]

which is equivalent to the current conditional models functionality. Because models_to_run can't be changed by user code until at least one model file is executed, the first model file always runs. Works for compiled models as well (newer style, not legacy).

Note, the path separator in the regexes should always be "/", even on Windows (before matching against file paths, it replaces os.path.sep with "/" in the file paths). This keeps the code portable across operating systems.
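
As an illustration of the matching rules described above, here is a minimal sketch. It is not the code from the patch: the function names (to_regex, default_models_to_run, select_models) are hypothetical, and joining a list of regexes with "|" and using match() are assumptions about how the three accepted forms might be handled.

```python
import re


def to_regex(models_to_run):
    """Normalize a regex string, a list of regex strings, or a compiled
    regex into one compiled regex (assumed behavior, not the patch itself)."""
    if isinstance(models_to_run, (list, tuple)):
        # Assumption: a list is treated as an alternation of its members.
        return re.compile('|'.join(models_to_run))
    if isinstance(models_to_run, str):
        return re.compile(models_to_run)
    return models_to_run  # assumed to be an already-compiled pattern


def default_models_to_run(controller, function):
    """Build the default list quoted in the post for a given request."""
    return [r'^\w+\.py$',
            r'^%s/\w+\.py$' % controller,
            r'^%s/%s/\w+\.py$' % (controller, function)]


def select_models(relative_paths, models_to_run, sep='/'):
    """Return the model paths (relative to the models folder) that match,
    in sorted order, after replacing the OS separator with '/'."""
    regex = to_regex(models_to_run)
    selected = []
    for path in sorted(relative_paths):
        normalized = path.replace(sep, '/')
        if regex.match(normalized):
            selected.append(normalized)
    return selected


if __name__ == '__main__':
    # Windows-style paths for a request to default/index.
    paths = ['db.py', 'default\\index\\extra.py', 'other\\stuff.py']
    patterns = default_models_to_run('default', 'index')
    print(select_models(paths, patterns, sep='\\'))
```

In this sketch the file under other/ is skipped because none of the three default patterns matches it, which mirrors the conditional models behavior the default is meant to reproduce.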

For now it requires regexes, but if we eventually want to offer a simplified syntax, I suppose we could develop a function like response.add_models() that could take some parameters and update response.models_to_run with the appropriate regexes.

Worthwhile?

Anthony
models_router.patch

Massimo DiPierro

Sep 1, 2012, 8:09:22 AM
to web2py-d...@googlegroups.com
I think this is an excellent idea. It is simple and does the job.
It adds a re.compile at every request. We should time that and make sure it is negligible.


--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
 
 
<models_router.patch>

Anthony

Sep 1, 2012, 9:35:22 PM
to web2py-d...@googlegroups.com
On Saturday, September 1, 2012 8:09:27 AM UTC-4, Massimo Di Pierro wrote:
I think this is an excellent idea. It is simple and does the job.
It adds a re.compile at every request. We should time that and make sure it is negligible.

Good point. It's actually an re.compile for every file in the models folder (and sub-folders). However, I believe when you call re.compile with a particular regex, the interpreter caches the compiled regex, so subsequent re.compile calls for the same regex are much faster. For a standard request with no custom models_to_run, there will be only one regex to compile. For requests that set one custom models_to_run, there will be two regexes to compile (the default regex, followed by the custom regex).

I used timeit to measure the time to compile the default models_to_run regex. To get the time to compile a new regex (not cached), I appended a random.random() number each time so each regex would be unique. In that case, it took about 0.6ms per regex -- a bit high, but not too bad. I then ran it without appending the random number in order to get the time to do a repeat compile, and that drops to about 1 microsecond.
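
The measurement described above could be reproduced roughly as follows. This is a sketch under assumptions, not the script that was actually run: the helper names are hypothetical, only the top-level default pattern is used as the sample, and the numbers will differ by machine and Python version. The "unique" timing also includes the small cost of generating and formatting the random number.

```python
import random
import re
import timeit

# Sample pattern: the top-level part of the default models_to_run.
PATTERN = r'^\w+\.py$'


def compile_unique():
    # Appending a random number makes every pattern string new, so the
    # re module cannot reuse a cached entry and must compile from scratch.
    return re.compile(PATTERN + str(random.random()))


def compile_repeat():
    # Same pattern string each time: after the first call, re.compile
    # can return the cached compiled regex.
    return re.compile(PATTERN)


def per_call_seconds(func, number=1000):
    """Average wall-clock seconds per call over `number` calls."""
    return timeit.timeit(func, number=number) / number


if __name__ == '__main__':
    print('uncached: %.3g s per compile' % per_call_seconds(compile_unique))
    print('cached:   %.3g s per compile' % per_call_seconds(compile_repeat))
```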

The question is, does the cached compiled regex survive across requests? If so, then assuming an app has a relatively small fixed number of distinct models_to_run regexes, once there have been a few requests forcing the regexes to get cached, the additional time on new requests should be trivial.

Anthony

Massimo DiPierro

Sep 1, 2012, 9:55:48 PM
to web2py-d...@googlegroups.com
How about we cache it ourselves in a global dict defined in the compileapp module, just to make sure?


Anthony

Sep 2, 2012, 12:19:01 AM
to web2py-d...@googlegroups.com
On Saturday, September 1, 2012 9:55:56 PM UTC-4, Massimo Di Pierro wrote:
How about we cache it ourselves in a global dict defined in the compileapp module, just to make sure?

Have a look at http://stackoverflow.com/questions/7520622/python-re-modules-cache-clearing. Looks like re caches only up to 100 entries (re._MAXCACHE is 100). So, might be a good idea to have a separate cache just for models_to_run (though we may also want to purge it when it hits some limit).
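
The cache limit mentioned above can be checked on a given interpreter with a short sketch like the one below. Note that _MAXCACHE is a private, undocumented attribute of the re module, so the value of 100 should be treated as specific to the Python version in use at the time; the helper name re_cache_limit is hypothetical.

```python
import re


def re_cache_limit():
    """Return the re module's internal cache limit, or None if this
    interpreter does not expose the private _MAXCACHE attribute."""
    return getattr(re, '_MAXCACHE', None)


if __name__ == '__main__':
    print('re cache limit on this interpreter:', re_cache_limit())
    # re.purge() is the documented way to empty the module's regex cache.
    re.purge()
```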

Anthony

Michele Comitini

Sep 2, 2012, 10:47:38 AM
to web2py-d...@googlegroups.com
We should not just solve the problem with this single re.compile. We
should find a way to address the re.compile problem more globally.
While profiling web2py I see that re.compile is one of the most
frequent and time consuming calls, more than session locking and other
logical issues. I have no clue how this can be improved, but
finding a solution would speed up everything!

mic




2012/9/2 Anthony <abas...@gmail.com>:

Massimo DiPierro

Sep 2, 2012, 10:52:20 AM
to web2py-d...@googlegroups.com
I am surprised by that. Can you help identify which re.compile calls are made at every request?

Massimo DiPierro

Sep 2, 2012, 3:13:16 PM
to web2py-d...@googlegroups.com
This is now in trunk, with caching of regexes. Can you please check that it works as intended for compiled apps as well?

On Sep 1, 2012, at 12:39 AM, Anthony wrote:


Anthony

Sep 2, 2012, 8:58:30 PM
to web2py-d...@googlegroups.com
Seems to work with the app compiled as well (as did the original version).

Should we include any logic to clear the cache if it gets too large (as the re module does), or is it not worth worrying about given the relatively small number of distinct regexes likely to exist in a given installation?

Anthony

Massimo DiPierro

Sep 2, 2012, 11:32:23 PM
to web2py-d...@googlegroups.com
Great!

As long as this is used only in compileapp.py, we do not need to worry about clearing the cache. It is almost impossible for it to grow.

Anthony

Sep 2, 2012, 11:49:06 PM
to web2py-d...@googlegroups.com
As long as this is used only in compileapp.py, we do not need to worry about clearing the cache. It is almost impossible for it to grow.

Nevertheless, I think I'll sleep better if we do something simple like:

import re

CACHED_REGEXES = {}
_MAXCACHE = 1000

def re_compile(regex):
    try:
        # Fast path: this regex string was compiled before.
        return CACHED_REGEXES[regex]
    except KeyError:
        # Drop the whole cache once it reaches the limit.
        if len(CACHED_REGEXES) >= _MAXCACHE:
            CACHED_REGEXES.clear()
        compiled_regex = CACHED_REGEXES[regex] = re.compile(regex)
        return compiled_regex

Am I being paranoid?

Anthony