BETA: new URL routing facility


Jonathan Lundell

Jan 3, 2011, 9:28:49 PM
to web...@googlegroups.com
The new URL routing facility that I described a few days ago is now in the trunk. It provides fairly powerful rewriting with very simple configuration and no regexes. The configuration is described below.

WARNING: this is beta-quality code. There are surely bugs in it, and the API/configuration will no doubt change a little. On the other hand, it's simple to configure, and I'd appreciate as much testing as possible.

You should see *no* change in routing behavior if you do not add a router entry to routes.py. Please notify me ASAP if you notice anything to the contrary.

Features:

* remove default application/controller names from URLs

* support domain<->app mapping (no visible app names)

* support language codes embedded in URLs: /app/en/ctlr/fcn/args

* handle static files, including root-based files like favicon.ico, automatically

* make full URL-legal character set available for args and vars
(This was the original driver for making the changes, since it was essentially impossible to
retrofit into the existing rewrite system. The secondary goal was to address 99% of required
functionality while keeping configuration as close to trivial as possible.)

* app-specific routing

The new logic is selected in routes.py. The old regex logic remains available, though not simultaneously.

The language feature is not yet tied into web2py's existing language support, so for now all you get is URL conversion and support for language-specific subdirectories in app/static.
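To make the URL conversion concrete, here is a minimal sketch of how a language code could be split out of an incoming path and put back into an outgoing one. The helper names (split_language, join_language) are made up for illustration and are not part of rewrite.py; the sketch assumes the application name has already been removed from the path.

```python
def split_language(parts, languages, default_language=None):
    """Return (language, remaining_parts) for an incoming path.

    parts is the list of path elements that follow the application name.
    """
    # Language support is disabled when the languages list is empty.
    if languages and parts and parts[0] in languages:
        return parts[0], parts[1:]
    return default_language, parts


def join_language(parts, language, languages, default_language=None):
    """Prepend the language code to an outgoing path unless it is the default."""
    if languages and language in languages and language != default_language:
        return [language] + parts
    return parts


# Incoming: /app/it/ctlr/fcn with the application already stripped.
print(split_language(['it', 'ctlr', 'fcn'], ['en', 'it', 'it-it'], 'en'))
# Outgoing: the default language is left out of the URL.
print(join_language(['ctlr', 'fcn'], 'en', ['en', 'it', 'it-it'], 'en'))
```

With languages=[] both helpers pass the path through untouched, which matches the "language support is disabled" case in the configuration notes.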

There's a new routing example file called router.example.py that has some documentation, reproduced below. As with routes.example.py, you copy it to routes.py and edit it for your own configuration.


# router is a dictionary of URL routing parameters.
#
# For each request, the effective router is the default router (below),
# updated by the base router (if any) from routes.py,
# updated by the relevant application-specific router (if any)
# from applications/app/routes.py.
#
# Optional members of base router:
#
#   default_application: default application name
#   applications: list of all recognized applications,
#       or 'ALL' to use all currently installed applications
#   map_domain: dict used to map domain names to application names
#
# These values may be overridden by app-specific routers:
#
#   default_controller: name of default controller
#   default_function: name of default function (all controllers)
#   root_static: list of static files accessed from root
#       (mapped to the selected application's static/ directory)
#
#
# Optional members of application-specific router:
#
# These values override those in the base router:
#
#   default_controller
#   default_function
#   root_static
#
# When these appear in the base router, they apply to the default application only:
#
#   controllers: list of valid controllers in selected app
#       or "DEFAULT" to use all controllers in the selected app plus 'static'
#       or [] to disable controller-name omission
#   languages: list of all supported languages
#   default_language
#       The language code (for example: en, it-it) optionally appears in the URL following
#       the application (which may be omitted). For incoming URLs, the code is copied to
#       request.language; for outgoing URLs it is taken from request.language.
#       If languages=[], language support is disabled.
#       The default_language, if any, is omitted from the URL.
#   check_args: set to False to suppress arg checking
#       request.raw_args always contains a list of raw args from the URL, not unquoted
#       request.args are the same values, unquoted
#       By default (check_args=True), args are required to match args_match.
#   acfe_match: regex for valid application, controller, function, extension /a/c/f.e
#   file_match: regex for valid file (used for static file names)
#   args_match: regex for valid args (see also check_args flag)
#
#
# The built-in default router supplies default values (undefined members are None):
#
#   router = dict(
#       default_application = 'init',
#       applications = 'ALL',
#       default_controller = 'default',
#       controllers = 'DEFAULT',
#       default_function = 'index',
#       root_static = ['favicon.ico', 'robots.txt'],
#       map_domain = dict(),
#       languages = [],
#       default_language = None,
#       check_args = True,
#       map_hyphen = True,
#       acfe_match = r'\w+$',               # legal app/ctlr/fcn/ext
#       file_match = r'(\w+[-=./]?)+$',     # legal file (path) name
#       args_match = r'([\w@ -]+[=.]?)+$',  # legal arg in args
#   )
#
# See rewrite.map_url_in() and rewrite.map_url_out() for implementation details.


# This simple router overrides only the default application name,
# but provides full rewrite functionality.
#
# router = dict(
# default_application = 'welcome',
# )

# This router supports the doctests below; it's not very realistic.
#
router = dict(
    applications = ['welcome', 'admin', 'app', 'myapp', 'bad!app'],
    default_application = 'myapp',
    controllers = ['myctlr', 'ctr'],
    default_controller = 'myctlr',
    default_function = 'myfunc',
    languages = ['en', 'it', 'it-it'],
    default_language = 'en',
    map_domain = {
        "domain1.com" : "app1",
        "domain2.com" : "app2"
    },
)

VP

Jan 3, 2011, 11:29:59 PM
to web2py-users
Questions:

1. Shouldn't app1 or app2 be one of the 5 apps specified in
"applications"?

2. Does this (controllers = ['myctlr', 'ctr']) mean that all apps (app1 and
app2 in particular) must have two controllers named "myctlr" and "ctr"?

Jonathan Lundell

Jan 4, 2011, 12:10:03 AM
to web...@googlegroups.com
On Jan 3, 2011, at 8:29 PM, VP wrote:
>
> Questions:
>
> 1. Shouldn't app1 or app2 be one of the 5 apps specified in
> "applications"?

This is a somewhat subtle point. The role of "applications" is to help identify ambiguities. If there's a controller or function name that collides with an application name, then we have to give the application name priority: if we see that name as the first element of a URL, it must be the application.

But in the example, app1 and app2 are domain-mapped apps, corresponding to domain1.com and domain2.com. My convention is that all accesses to app1 and app2 will come via domain1 & domain2, so we can ignore ambiguities associated with other domains. This is a somewhat arbitrary choice, since the developer might want to access those apps from (for example) localhost. In that case, you *would* want to include app1 and app2 in "applications". But for maximum URL reduction, you can leave them out.
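As a rough illustration of the rule above, the application for an incoming request could be chosen like this. It is a sketch under assumptions, not the code in rewrite.map_url_in(): pick_application is a hypothetical helper, 'applications' is assumed to be an explicit list ('ALL' already expanded), and the order in which an explicit application name and the domain mapping are tried is my guess.

```python
def pick_application(host, path, router):
    """Return (application, remaining_path_elements) for an incoming request."""
    parts = [p for p in path.split('/') if p]
    # A name listed in 'applications' always wins when it is the first element.
    if parts and parts[0] in router.get('applications', []):
        return parts[0], parts[1:]
    # Otherwise use the domain mapping, then the default application.
    app = router.get('map_domain', {}).get(host) or router.get('default_application', 'init')
    return app, parts


router = dict(
    applications=['welcome', 'admin', 'myapp'],
    default_application='myapp',
    map_domain={'domain1.com': 'app1', 'domain2.com': 'app2'},
)

print(pick_application('domain1.com', '/ctr/fcn', router))           # domain-mapped app
print(pick_application('localhost', '/admin/default/index', router))  # explicit app name
print(pick_application('localhost', '/ctr/fcn', router))              # default app
```

Because app1 is not in 'applications', a request to localhost/app1/... would not be recognized as an application name here, which is why you would add it to the list if you want to reach it from other hosts.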

>
> 2. Does this (controllers = ['myctlr', 'ctr']) mean that all apps (app1 and
> app2 in particular) must have two controllers named "myctlr" and "ctr"?

No. In the base router (the one in the root routes.py), "controllers" applies only to the default application. Values for "controllers" for other apps must be specified in app-specific routers.

I'll be checking in a new feature in a day or two that allows you to define app-specific routers in the root routes.py.

Wikus van de Merwe

Jan 10, 2011, 1:55:10 AM
to web...@googlegroups.com
Jonathan, can you explain a bit more about how the new routes would work, using some examples? What would the mappings below (just /c/f/args) look like with the new routes?
/_ah/xmpp/message/chat/  ->  /comm/jabber
/new-article  ->  /post/create
/about  ->  /article/2010/07/11/welcome-to-my-world
/article/2010/11/02/new-world-order  ->  /article?year=2010&month=11&day=02&title=new-world-order
/article/2010/11  ->  /article?year=2010&month=11
/author/smith  ->  /author?name=smith
/publisher/smith/update  ->  /publisher-update?name=smith


> I'll be checking in a new feature in a day or two that allows you to define app-specific routers in the root routes.py.

I thought the goal was per-application route control for better modularity/portability of apps. If so, global route management should be minimal and limited to the mapping to applications only. I don't think any local/per-application rules should be there. Why, in your proposal, are things like the controllers of the default app specified in the global routes instead of the local ones?

Jonathan Lundell

Jan 10, 2011, 2:08:36 AM
to web...@googlegroups.com
On Jan 9, 2011, at 10:55 PM, Wikus van de Merwe wrote:
> Jonathan, can you explain a bit more about how the new routes would work, using some examples? What would the mappings below (just /c/f/args) look like with the new routes?
> /_ah/xmpp/message/chat/ -> /comm/jabber
> /new-article -> /post/create
> /about -> /article/2010/07/11/welcome-to-my-world
> /article/2010/11/02/new-world-order -> /article?year=2010&month=11&day=02&title=new-world-order
> /article/2010/11 -> /article?year=2010&month=11
> /author/smith -> /author?name=smith
> /publisher/smith/update -> /publisher-update?name=smith

In general, transformations like this will have to use the existing regex mechanism.

Are you actually using those transformations? If the path on the left is the incoming URI, I'm not making sense of the paths on the right; we need to convert a URI into /a/c/f... for routing purposes.

>
> > I'll be checking in a new feature in a day or two that allows you to define app-specific routers in the root routes.py.
>
> I thought the goal was per-application route control for better modularity/portability of apps. If so, global route management should be minimal and limited to the mapping to applications only. I don't think any local/per-application rules should be there. Why, in your proposal, are things like the controllers of the default app specified in the global routes instead of the local ones?

It's an option only, for those who want to define system routing centrally.

pbreit

Jan 10, 2011, 2:41:23 AM
to web...@googlegroups.com
I'm a bit confused as well. Perhaps some more examples would help.

Also, I would agree that per-application routing should be the default. And further, the global routes file should be enabled by default and should provide the same routes as you currently get with a fresh install of Web2py (with any app-routes overriding).

Jonathan Lundell

Jan 10, 2011, 11:26:04 AM
to web...@googlegroups.com
On Jan 9, 2011, at 11:41 PM, pbreit wrote:
> I'm a bit confused as well. Perhaps some more examples would help.

What's your current routes.py?

A general question for the list: how many of you are using domain routing? That is, where you route a particular incoming domain to a particular route? And if you are, are you using autoroutes, or your own regexes?

How many of you are using routes.py for something else?

>
> Also, I would agree that per-application routing should be the default.

I'm not sure what you mean by this. You get per-app routing if you specify an app-specific route, either in the base router or the app-specific routes.py.

Wikus van de Merwe

Jan 10, 2011, 12:33:40 PM
to web...@googlegroups.com
I'm not using these routes. These were just made-up examples to challenge the proposed new routes. Now I understand that the new routes are not intended to be a replacement for the old ones, but rather a simplified alternative to handle the most common needs.

However, it still looks quite complicated to me. I never liked the fact that application specific routes are global. I think it would be much clearer if global routes were used to map to applications only (both based on URL paths and domains) and the app-specific local routes were used to map paths to controllers/functions. I also support pbreit here saying it should be the default, that is web2py on installation should have welcome and example apps with local routes. This separation makes things easier to understand and manage.

And answering your questions, I'm deploying my apps on GAE and I'm not using domains. I'm using routes.py to simplify urls so that it is easier to remember them or use in printed materials. For the rest I'm happy with the default /controller/function dispatching.

Massimo Di Pierro

Jan 10, 2011, 1:34:17 PM
to web2py-users
What do you mean by "I never liked the fact that application specific
routes are global"?
You can have per-app routing files. Jonathan implemented this and it
has been available for a while, although not documented in the printed
book.
Massimo


Jonathan Lundell

Jan 10, 2011, 1:44:26 PM
to web...@googlegroups.com
On Jan 10, 2011, at 9:33 AM, Wikus van de Merwe wrote:
> I'm not using these routes. These were just made-up examples to challenge the proposed new routes. Now I understand that the new routes are not intended to be a replacement for the old ones, but rather a simplified alternative to handle the most common needs.

Though by "most common" I mean "nearly all". If there's a real-world case that the new scheme doesn't handle, I'd like to know about it so that case can go on the list.

>
> However, it still looks quite complicated to me. I never liked the fact that application specific routes are global. I think it would be much clearer if global routes were used to map to applications only (both based on URL paths and domains) and the app-specific local routes were used to map paths to controllers/functions.

That's entirely up to the user. The reason I allow (not require) app-specific routes to be defined in the base routes.py is that the router for applications like the stock welcome/admin/examples is exactly:

routers = dict(
    BASE = dict(),
    admin = dict(),
    examples = dict(),
    welcome = dict(),
)

or, more compactly:

routers = {}

...and it seemed like an undue burden to oblige the user to create three more routes.py files for that. Nonetheless, you're free to do it if you want.


> I also support pbreit here saying it should be the default, that is web2py on installation should have welcome and example apps with local routes. This separation makes things easier to understand and manage.

I agree, except that I think it needs to be something like routes.standard.py, with the user copying it to routes.py. Otherwise, if the user customizes routes.py, it'll get overwritten at the next web2py update.

>
> And answering your questions, I'm deploying my apps on GAE and I'm not using domains. I'm using routes.py to simplify urls so that it is easier to remember them or use in printed materials. For the rest I'm happy with the default /controller/function dispatching.

In that case, with the new facility, all you need in routes.py is:

routers = {}

...if your default application is 'init'. Otherwise:

routers = dict(
    BASE = dict(default_application='myapp'),
)

...and you should get maximal URL simplification. (You can also specify default_controller and default_function if they're not default/index.)

pbreit

Jan 10, 2011, 2:31:15 PM
to web...@googlegroups.com
As usual, there's more to it than I first imagined!

To overcome the issue of routes being overwritten with an update, could there be a routes_default.py that ships with Web2py and can be overridden by routes.py in the same directory? Not super important.

I envision something like (pseudocode):

In /web2py/routes.py:

    default_app = 'init', 'welcome'  # try init first, then welcome
    default_controller = 'default'
    default_function = 'index'
    if exists(applications/*/routes.py)
        use(applications/*/routes.py) # would override default controller/function

But I guess hard-coding default routing in core can protect against accidents.

Domain-based routing does sound like it would be a good feature.

Do I understand correctly that regex and non-regex routing cannot be used at the same time? I'm guessing that's going to be a requested feature.


Jonathan Lundell

Jan 10, 2011, 2:43:54 PM
to web...@googlegroups.com
On Jan 10, 2011, at 11:31 AM, pbreit wrote:
> As usual, there's more to it than I first imagined!
>
> To overcome the issue of routes being overwritten with an update, could there be a routes_default.py that ships with Web2py and can be overridden by routes.py in the same directory? Not super important.

That's possible. The problem I see with it is that it changes the behavior (including URLs) of web2py for those users who are not using routes.py unless routes_default.py specifies exactly the same routing as not having a routes file at all. In which case, what's the point?

>
> I envision something like (pseudocode):
>
> In /web2py/routes.py:
>
> default_app = 'init', 'welcome' # try init first, then welcome
> default_controller = 'default'
> default_function = 'index'
> if exists(applications/*/routes.py)
> use(applications/*/routes.py) # would override default controller/function

That's basically the way it works.
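For readers who want to see the layering spelled out, here is a minimal sketch of how an effective router could be built from the built-in defaults, the BASE router, and an app-specific router. The function name effective_router and the trimmed-down DEFAULT_ROUTER are assumptions made for this example; the real logic lives in rewrite.py and may treat some keys differently (for instance, keys that only make sense in the base router).

```python
# A shortened copy of the built-in defaults, enough for the illustration.
DEFAULT_ROUTER = dict(
    default_application='init',
    default_controller='default',
    default_function='index',
    root_static=['favicon.ico', 'robots.txt'],
)


def effective_router(routers, app):
    """Layer the BASE router, then the app-specific router, over the defaults."""
    router = dict(DEFAULT_ROUTER)              # start from the built-in defaults
    router.update(routers.get('BASE', {}))     # base router from the root routes.py
    router.update(routers.get(app, {}))        # app-specific router, if any
    return router


routers = dict(
    BASE=dict(default_application='myapp'),
    myapp=dict(default_controller='main'),
)

print(effective_router(routers, 'myapp')['default_controller'])    # overridden by the app
print(effective_router(routers, 'welcome')['default_controller'])  # falls back to the default
```

With routers = {} every application simply gets the defaults, which is why an empty dict is enough to turn the new router on.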

>
> But I guess hard-coding default routing in core can protect against accidents.
>
> Domain-based routing does sound like it would be a good feature.
>
> Do I understand correctly that regex and non-regex routing cannot be used at the same time? I'm guessing that's going to be a requested feature.
>

That's true for now, anyway. Mixing them doesn't work because there ends up being redundant code in the base routes.py, and we have to know which one to use.

If there's a demand for it, I could probably add a regex capability to the new router. I'd need to think about that some.

Wikus van de Merwe

Jan 11, 2011, 5:39:45 PM
to web...@googlegroups.com
I meant "global by default". I was aware of the existence of the "routes_app" parameter. But my point here was more specifically about this mechanism not being used for the welcome and examples apps by default. This, and the lack of documentation in the book, makes it almost nonexistent for users.

It would be nice to improve on this with the new routes and have app-specific routes for one of the apps and default ones for the other. I know the welcome and examples apps don't need any special routes configuration, but to demonstrate how to use the routes, one of the apps could have its local routes defined. I mean some kind of example URLs, not necessarily used anywhere in the app, just to let users learn the routes mechanism by example.

It is indeed a bit problematic to keep the user routes and the default routes separated. As Jonathan suggested, having routes.standard.py loaded when no routes.py is present might be a solution. An alternative would be to have routes.example.py, which is not loaded at all and serves as an example only. In both cases, however, there should be a "demo" routes.py inside, e.g., the examples app folder.

I must admit that I missed the RC thread when I was writing my comments here, and I didn't understand the concept of the new routes well. After reading more and giving some thought to how the new system works, I think Jonathan did a really good job. The new routes seem to tackle almost all practically useful cases and are easier to use and customise.

To visualise that better let me specify some examples from my own routes which are already covered:
(".*:/favicon.ico", "/init/static/favicon.ico")  -  covered by default (root_static key)
(".*:/robots.txt", "/init/static/robots.txt")  -  covered by default (root_static key)
("/", "/init/default/index")  -  covered by default (default_controller and default_function keys)
("/(.+)", r"/init/\1")  -  covered by default (default_application key)

Now a few cases of url shortening that I'm using in old routes that maybe are worth considering in the new one:
1) ("/(%s)" % "|".join(DEFAULTS), r"/init/default/\1")  -  DEFAULTS is the list of functions in default controller (I want to skip the default controller name in URL)
2) ("/f1/(.+)", r"/init/default/f2/\1")  -  skipping the controller and mapping to a function in the default one (it can be turned into example 1 if f1 == f2)
3) ("/label", "/init/default/func/arg1/arg2/arg3/arg4")  -  I want to link with a short label to a deep element of the page

Other examples that I brought forward before might not be worth considering as they are very specific and could be represented differently anyway:
e.g. /author/smith  ->  /init/default/author?name=smith
This is equal to ("/func/(.+)", r"/init/default/func?arg1=\1") and could be ("/func/(.+)", r"/init/default/func/\1") which is the same as example 1.

One last question, as this is not entirely clear to me. Can I use the new routes to point to my app-specific routes defined with the old routes regex syntax? What about something like the autoroutes routes.conf syntax? Wouldn't it be useful to benefit from the new routes and still have regexes (although simplified to local app-specific paths) as a last resort?

Jonathan Lundell

Jan 11, 2011, 6:13:40 PM
to web...@googlegroups.com, Wikus van de Merwe
On Jan 11, 2011, at 2:39 PM, Wikus van de Merwe wrote:
> To visualise that better let me specify some examples from my own routes which are already covered:
> (".*:/favicon.ico", "/init/static/favicon.ico") - covered by default (root_static key)
> (".*:/robots.txt", "/init/static/robots.txt") - covered by default (root_static key)
> ("/", "/init/default/index") - covered by default (default_controller and default_function keys)
> ("/(.+)", r"/init/\1") - covered by default (default_application key)
>
> Now a few cases of url shortening that I'm using in old routes that maybe are worth considering in the new one:
> 1) ("/(%s)" % "|".join(DEFAULTS), r"/init/default/\1") - DEFAULTS is the list of functions in default controller (I want to skip the default controller name in URL)
> 2) ("/f1/(.+)", r"/init/default/f2/\1") - skipping the controller and mapping to a function in the default one (it can be turned into example 1 if f1 == f2)

That's handled, but from a different angle. Instead of listing the functions in the default controller, the new router lists the controllers in the application (partly because it's an easy list to generate automatically, and partly because the function list tends to change more often).

In the normal case, we have a list of all apps, and a list of all controllers in each app, with the lists generated automatically from the file system. Besides dropping the default application, we drop the default controller whenever it can be done unambiguously, so a URL like (say) /init/default/fcn is shortened to /fcn whenever we know that 'fcn' can't be mistaken for a controller or application. If it *does* collide with one of those, then we turn the controller name back on.

Example: suppose you have a function /init/default/init. This gets shortened to /default/init, which is unambiguous.

A side note: a consequence of the algorithm is that a URL that isn't shortened, or is less than completely shortened, is always acceptable. In particular, the full URL is always OK.
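The shortening rule described above can be sketched as follows. This is only an illustration under simplifying assumptions (a single controller list, no languages, no default-function omission); shorten_url and its parameters are hypothetical names, not the code in rewrite.map_url_out().

```python
def shorten_url(app, controller, function, applications, controllers,
                default_application='init', default_controller='default'):
    """Drop the default app and controller from /a/c/f when that is unambiguous."""
    parts = [app, controller, function]
    # Drop the default controller only if the function name cannot be
    # mistaken for a controller or an application.
    if (controller == default_controller
            and function not in controllers
            and function not in applications):
        parts = [app, function]
    # Drop the default application only if the element that follows it
    # cannot be mistaken for an application name.
    if app == default_application and parts[1] not in applications:
        parts = parts[1:]
    return '/' + '/'.join(parts)


apps = ['init', 'admin']
ctlrs = ['default', 'appadmin']

print(shorten_url('init', 'default', 'fcn', apps, ctlrs))   # /fcn
print(shorten_url('init', 'default', 'init', apps, ctlrs))  # /default/init
```

The second call is the /init/default/init case: the function name collides with an application, so the controller name stays in the URL.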

> 3) ("/label", "/init/default/func/arg1/arg2/arg3/arg4") - I want to link with a short label to a deep element of the page

We don't do that, though it'd be straightforward to implement, either as a lookup table or as a more general regex facility.

>
> Other examples that I brought forward before might not be worth considering as they are very specific and could be represented differently anyway:
> e.g. /author/smith -> /init/default/author?name=smith
> This is equal to ("/func/(.+)", r"/init/default/func?arg1=\1") and could be ("/func/(.+)", r"/init/default/func/\1") which is the same as example 1.

Not supported, but if I added a regex capability (as above), it could work.

BTW, this rewrite has problems in the legacy regex system because validity checking is fairly strict for args. So for example:

/author/Pétain

would be rejected, I think. That makes moving elements from the path to the query string somewhat less desirable than it might otherwise be. In the new router, you can relax the validity checking of the path elements, though of course you'd want to check them at the app level depending on what they're used for.
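A small sketch of what relaxing the check might look like, using the args_match pattern from the default router. Several things here are assumptions: parse_args is a made-up helper, the check is applied to the unquoted values, and re.ASCII is used so that \w behaves the way it did by default in the Python 2 of the time.

```python
import re
from urllib.parse import quote, unquote

# args_match from the default router; re.ASCII is an assumption that mimics
# Python 2's default (non-Unicode) meaning of \w.
ARGS_MATCH = re.compile(r'([\w@ -]+[=.]?)+$', re.ASCII)


def parse_args(raw_args, check_args=True):
    """Unquote raw URL args and optionally validate them against ARGS_MATCH."""
    args = [unquote(a) for a in raw_args]
    if check_args:
        for arg in args:
            if not ARGS_MATCH.match(arg):
                raise ValueError('invalid arg: %r' % arg)
    return args


raw = [quote('Pétain')]                   # as it would appear in the URL
print(parse_args(raw, check_args=False))  # accepted; the app must validate it itself
```

With check_args left at True the same call raises ValueError, so turning the check off moves the responsibility for validating args to the application.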

>
> One last question, as this is not entirely clear to me. Can I use the new routes to point to my app-specific routes defined with the old routes regex syntax? What about something like the autoroutes routes.conf syntax? Wouldn't it be useful to benefit from the new routes and still have regexes (although simplified to local app-specific paths) as a last resort?

Not supported.

You can have app-specific new-style routes, either in the base routes.py or the app-specific routes.py. But I don't have a way of invoking the old logic when the new logic is active.

I'd been thinking that a regex option in the new router would be the way to do it. It occurs to me that I might be able to let you specify on an app-specific basis, to use the regex system, but it'd be non-trivial to do it, because the new system uses completely different URL parsing (in part to avoid some of the parsing restrictions of the old system).
