reverse urls

Andrew Gwozdziewycz

Oct 18, 2009, 6:40:41 PM
to python-...@googlegroups.com
In Django there's a very useful DRY pattern called named url patterns,
which you can use to compute a url from its regular expression
(called reversing). I've started playing around with this idea at a
basic level in tornado, as I am extremely tired of changing urls in
multiple places. Anyway, it's on github,
http://github.com/apgwoz/tornado/commit/0b57fbcf992647b6674ad9e433bffffac900f83a
and I would appreciate feedback and ideas to make it better. As the
commit message says, it's my first crack at it, so again, it's basic.

The patch adds a couple of things:

1. a new attribute in Application instances called named_handlers
2. a new method in Application called reverse_url, which, given a name
   and a set of arguments, will attempt to create a url out of the
   looked-up "URLSpec" (I don't like the name URLSpec)
3. a new class URLSpec (again, I dislike the name), which creates a
   format string representing the URL based on the number of groups
   found in the pattern specified for dispatching
4. a new function added to templates: reverse_url(name, arg1, arg2...)

URLSpec acts like the tuples you pass now, so that there are minimal
changes to the add_handlers code in Application, and they do little work
up front to get this all to work.

URLSpec does have some limitations. The biggest one is that
counting parentheses isn't guaranteed to give an accurate count of
the number of groups, so I expect this to work only on the simplest
dispatching url patterns.
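
For readers who have not opened the commit, here is a rough sketch of the idea described above. It is not the code from the patch: the names SimpleURLSpec and NamedRoutes are made up for illustration, the handlers are placeholders, and the group handling is deliberately naive (flat, non-nested groups only).

```python
import re


class SimpleURLSpec(object):
    """Hypothetical stand-in for the patch's URLSpec (not the real class)."""

    def __init__(self, pattern, handler, name=None):
        self.regex = re.compile(pattern)
        self.handler = handler
        self.name = name
        # Naive: turn every flat "(...)" group into %s, once, up front.
        escaped = pattern.replace("%", "%%")
        self.format = re.sub(r"\([^()]*\)", "%s", escaped)
        self.group_count = self.regex.groups

    def reverse(self, *args):
        if len(args) != self.group_count:
            raise ValueError("expected %d args, got %d"
                             % (self.group_count, len(args)))
        return self.format % args


class NamedRoutes(object):
    """Hypothetical registry playing the role of named_handlers."""

    def __init__(self, specs):
        self.named_handlers = dict((s.name, s) for s in specs if s.name)

    def reverse_url(self, name, *args):
        return self.named_handlers[name].reverse(*args)


routes = NamedRoutes([
    SimpleURLSpec(r"/", object, name="home"),
    SimpleURLSpec(r"/entry/([^/]+)", object, name="entry"),
])
print(routes.reverse_url("entry", "hello-world"))  # /entry/hello-world
```

The format string is computed once per spec, so reversing a url at request time is a dictionary lookup plus one string format.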

Example usage taken from the blog demo:

from tornado.web import url

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            url(r"/", HomeHandler),
            url(r"/archive", ArchiveHandler),
            url(r"/feed", FeedHandler),
            url(r"/entry/([^/]+)", EntryHandler),
            (r"/compose", ComposeHandler),
            (r"/auth/login", AuthLoginHandler),
            (r"/auth/logout", AuthLogoutHandler),
        ]

Then in archive.html:

...
{% block body %}
  <ul class="archive">
    {% for entry in entries %}
      <li>
        <div class="title"><a href="{{ reverse_url("entry", entry.slug) }}">{{ escape(entry.title) }}</a></div>
        <div class="date">{{ locale.format_date(entry.published, full_format=True, shorter=True) }}</div>
      </li>
    {% end %}
  </ul>
{% end %}

--
http://apgwoz.com

Bret Taylor

Oct 18, 2009, 9:32:21 PM
to python-...@googlegroups.com
This is a neat idea. I will take a look at this change and experiment with Django to see how we can incorporate this into Tornado officially.

Thanks for taking the time to propose this and send a diff.

Bret

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Oct 19, 2009, 6:23:28 AM
to python-...@googlegroups.com
I wrote this for reversing urls.  It covers keyword arguments as well. Feel free to use it if you think it will make this patch better.

import re

def reverse_url_regex(regex, *args, **kwargs):
  r"""
  Performs a reverse lookup on a url regex used to build a
  WSGIApplication. Args/kwargs are applied to the regular
  expression. For example:

  >>> reverse_url_regex(re.compile(r'^foo'))
  'foo'
  >>> reverse_url_regex(re.compile(r'^foo/(\d+)/$'), 3)
  'foo/3/'
  >>> reverse_url_regex(re.compile(r'^foo/(?P<id>\d+)/$'), id=3)
  'foo/3/'
  >>> reverse_url_regex(re.compile(r'^foo/(?P<id>\d+)/(\w+)/$'), 'bar', id=3)
  'foo/3/bar/'

  Returns None if the args/kwargs are invalid for the regex.
  """

  try:
    pat = regex.pattern
    # Clean up the regex: strip a leading '^' and a trailing '$'.
    if pat.startswith('^'):
      pat = pat[1:]
    if pat.endswith('$'):
      pat = pat[:-1]
    # Escape all formatting patterns in the regex pattern.
    pat = pat.replace('%', '%%')
    # Substitute all the positional arguments.
    if args:
      pat = re.sub(r'\([^?][^)]*\)', '%s', pat) % args
    # Substitute all the keyword arguments.
    if kwargs:
      pat = re.sub(r'\(\?P<(\w+)>[^)]*\)', r'%(\1)s', pat) % kwargs
    # Check if the resulting string matches the original regex.
    if regex.match(pat):
      return pat
  except Exception as e:
    # Any formatting failure (e.g. wrong number of args) means "no match".
    print(e)

  return None
--

Alkis

Andrew Gwozdziewycz

Oct 19, 2009, 6:56:00 AM
to python-...@googlegroups.com
On Sun, Oct 18, 2009 at 9:32 PM, Bret Taylor <bta...@gmail.com> wrote:
> This is a neat idea. I will take a look at this change and experiment with
> Django to see how we can incorporate this into Tornado officially.
>
> Thanks for taking the time to propose this and send a diff.

Django does quite a bit, even going so far as to almost hand-compile the regex
so that it can throw an error if the arguments you specify do not correctly
match the url you're trying to reverse. I'm not sure this is exactly overkill,
as there is no reason to produce a url that won't actually be resolved, but as
a result they only support a subset of urls that can be reversed. It does,
however, handle nested () correctly, which is a win.
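
To make the two failure modes concrete, here is a small check using only the standard re module. The example patterns are made up for illustration; the substitution regex in the second case is the one from the reverse_url_regex function posted earlier in this thread.

```python
import re

# Case 1: counting "(" over-counts when the pattern contains
# non-capturing groups or escaped parentheses.
pattern = r"/wiki/(?:page|p)/([^/]+)\(draft\)"
naive_count = pattern.count("(")
real_count = re.compile(pattern).groups
print(naive_count, real_count)  # 3 1

# Case 2: a flat "(...)" substitution mangles nested groups.
nested = r"/entry/((\d+)-(\w+))"
flat = re.sub(r"\([^?][^)]*\)", "%s", nested)
print(flat)  # /entry/%s-%s)
```

Asking the compiled pattern for its .groups attribute gives a reliable count, but it does not by itself say where each group starts and ends, which is the part that needs a real parser once groups nest.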

Alkis' technique adds support for keyword args, which based on my
understanding aren't passed through to the handler as kwargs, correct? Is
there any plan for passing named capture groups as kwargs to request
handlers?

If so, it should be fairly simple to add what is needed to support that.

Thanks for checking this out!

--
http://www.apgwoz.com

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Oct 19, 2009, 7:02:53 AM
to python-...@googlegroups.com
Wouldn't things be quite a bit simpler if handlers were passed only keyword args? Is there interest in making a change like this?
--

Alkis

Andrew Gwozdziewycz

Oct 19, 2009, 7:13:06 AM
to python-...@googlegroups.com
2009/10/19 Alkis Evlogimenos ('Αλκης Ευλογημένος) <evlog...@gmail.com>:

> Wouldn't things be quite a bit simpler if handlers were passed only
> keyword args? Is there interest in making a change like this?

How so? The only way I can see this being true is with a regular
expression that has different groups based on position... something like:

/(([a-z]+)|([0-9]+)([a-z]+))/

And even then, match.groups() has one entry per capturing group, in
left-to-right order (None for groups that did not take part in the match),
meaning you'd have to accept all of them as positional arguments anyway. And

/((?P<word>[a-z]+)|(?P<number>[0-9]+)(?P<word>[a-z]+))/

is invalid, since there can't be multiple groups named the same thing.

Did you have some other reason in mind?
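
A quick way to check both patterns is to run them through the standard re module. This is only an illustration; the test strings "/abc/" and "/42abc/" are made up.

```python
import re

alt = r"/(([a-z]+)|([0-9]+)([a-z]+))/"

# One entry per capturing group, left to right; groups on the
# branch that did not match come back as None.
g1 = re.match(alt, "/abc/").groups()
g2 = re.match(alt, "/42abc/").groups()
print(g1)  # ('abc', 'abc', None, None)
print(g2)  # ('42abc', None, '42', 'abc')

# Reusing a group name is rejected when the pattern is compiled.
dup_error = None
try:
    re.compile(r"/((?P<word>[a-z]+)|(?P<number>[0-9]+)(?P<word>[a-z]+))/")
except re.error as exc:
    dup_error = exc
print("rejected:", dup_error is not None)  # rejected: True
```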

--
http://www.apgwoz.com

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Oct 19, 2009, 7:44:30 AM
to python-...@googlegroups.com
Is this a useful use case for webapps? (granted my experience is limited in this area). Any examples where it might be used?

After thinking it over again, I don't think this restriction will make it simpler.

--

Alkis

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Oct 19, 2009, 8:02:02 AM
to python-...@googlegroups.com
It will make this operation faster: because the format operator accepts either positional or keyword arguments, constraining the regex to one of the two makes the generated format string cacheable.


--

Alkis

Andrew Gwozdziewycz

Oct 19, 2009, 8:16:29 AM
to python-...@googlegroups.com
2009/10/19 Alkis Evlogimenos ('Αλκης Ευλογημένος) <evlog...@gmail.com>:

> It will make this operation faster: because the format operator accepts
> either positional or keyword arguments, constraining the regex to one of the
> two makes the generated format string cacheable.

The code that I used for reverse_url stores the format string in the
instance of the URLSpec, so that it computes it only once. Is this
what you mean by cachable? Or are you saying that Python could then
cache it somewhere?

Incidentally, I found this microbenchmark to be interesting. Using
named format string arguments adds quite a bit of overhead to string
creation:

$ python -m timeit -n 1000000 "'%(a)s %(b)s' % {'a': 'hello', 'b': 'world'}"
1000000 loops, best of 3: 0.902 usec per loop
$ python -m timeit -n 1000000 "'%s %s' % ('hello', 'world')"
1000000 loops, best of 3: 0.411 usec per loop
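
The same comparison can be run from a script with the standard timeit module. This is a sketch with two assumptions of mine: the values are passed in through variables (CPython may fold a constant expression such as '%s %s' % ('hello', 'world') at compile time, which would flatter the positional case), and the dict is built once in the setup, so only the formatting itself is timed. Numbers will vary by machine and Python version, so none are shown.

```python
import timeit

n = 100000
setup = "a, b = 'hello', 'world'; d = {'a': a, 'b': b}"

# Named placeholders, formatted from a prebuilt dict.
named = timeit.timeit("'%(a)s %(b)s' % d", setup=setup, number=n)
# Positional placeholders, formatted from a tuple of variables.
positional = timeit.timeit("'%s %s' % (a, b)", setup=setup, number=n)

print("named:      %.3f usec per loop" % (named / n * 1e6))
print("positional: %.3f usec per loop" % (positional / n * 1e6))
```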



--
http://www.apgwoz.com

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Oct 19, 2009, 8:19:07 AM
to python-...@googlegroups.com
You can cache it because named groups are not supported by that scheme, right? If you do support them you won't be able to.

--

Alkis

Andrew Gwozdziewycz

Oct 19, 2009, 10:26:43 AM
to python-...@googlegroups.com
2009/10/19 Alkis Evlogimenos ('Αλκης Ευλογημένος) <evlog...@gmail.com>:
> You can cache it because named groups are not supported by that scheme,
> right? If you do support them you won't be able to.

I'm not sure I follow...

--
http://www.apgwoz.com

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Oct 19, 2009, 10:34:33 AM
to python-...@googlegroups.com
If you support named groups you will need a format string that takes keyword arguments. The problem is that you can't mix positional and keyword arguments in a format string.
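
A minimal illustration of that restriction with Python's % operator. The format string is made up to mirror the 'foo/3/bar/' example from earlier in the thread.

```python
# One named and one positional placeholder in the same format string.
fmt = "foo/%(id)s/%s/"

error = None
try:
    fmt % ("bar",)  # a tuple cannot satisfy the named placeholder
except TypeError as exc:
    error = str(exc)
print(error)  # format requires a mapping
```

This is why reverse_url_regex applies the positional and keyword arguments in two separate formatting passes, and why a single cached format string needs every placeholder to be of the same kind.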
--

Alkis

Andrew Gwozdziewycz

Oct 19, 2009, 10:50:29 AM
to python-...@googlegroups.com
2009/10/19 Alkis Evlogimenos ('Αλκης Ευλογημένος) <evlog...@gmail.com>:
> If you support named groups you will need a format string that takes keyword
> arguments. The problem is that you can't mix positional and keyword
> arguments in a format string.

Ah ha! I see what you mean now. Yeah, I guess that is problematic, and
that's probably why Django converts its format string to named parameters
of the form _N, where N represents the positional group number. Sorry for
the misunderstanding.
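
Here is a sketch of that _N idea, to show how it makes one cached format string serve both kinds of argument. It is not Django's implementation: to_named_format and reverse are hypothetical helpers, numbering from 0 is my assumption, and only flat (non-nested) groups are handled.

```python
import re

# Matches a flat group, optionally named: (...) or (?P<name>...)
GROUP = re.compile(r"\((?:\?P<(\w+)>)?[^()]*\)")


def to_named_format(pattern):
    """Turn each flat group into %(name)s, or %(_N)s if it is unnamed."""
    counter = [0]

    def repl(match):
        name = match.group(1)
        if name is None:
            name = "_%d" % counter[0]
            counter[0] += 1
        return "%(" + name + ")s"

    return GROUP.sub(repl, pattern.replace("%", "%%"))


def reverse(fmt, *args, **kwargs):
    """Map positional args onto _0, _1, ... and merge in the kwargs."""
    values = dict(("_%d" % i, a) for i, a in enumerate(args))
    values.update(kwargs)
    return fmt % values


fmt = to_named_format(r"foo/(?P<id>\d+)/(\w+)/")
print(fmt)                        # foo/%(id)s/%(_0)s/
print(reverse(fmt, "bar", id=3))  # foo/3/bar/
```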

--
http://www.apgwoz.com

JGAllen23

Oct 19, 2009, 3:26:36 PM
to Tornado Web Server
The thing I like about Django urls is being able to name the url, so
you aren't tying the actual url path in the template. So you could
have something like
(r'/entry-blah-blah/(\d+)/', EntryHandler, name = 'entry')

and then in the template:
{{ url('entry', '123') }}

that way if you ever have to change the url, you don't have to go
through and change all your templates as well. Thoughts?


Elias Torres

Oct 19, 2009, 3:38:01 PM
to python-...@googlegroups.com
On Oct 19, 2009, at 3:26 PM, JGAllen23 wrote:

>
> The thing I like about Django urls is being able to name the url, so
> you aren't tying the actual url path in the template. So you could
> have something like
> (r'/entry-blah-blah/(\d+)/', EntryHandler, name = 'entry')
>
> and then in the template:
> {{ url('entry', '123') }}
>
> that way if you ever have to change the url, you don't have to go
> through and change all your templates as well. Thoughts?

Absolutely. +1

-Elias

Andrew Gwozdziewycz

Oct 19, 2009, 4:08:57 PM
to python-...@googlegroups.com
See the patch that I mentioned in the first post of this thread.
This is exactly what it does.

--
http://www.apgwoz.com

JGAllen23

Oct 19, 2009, 4:21:09 PM
to Tornado Web Server
Where do I put in the name of the handler? It looks like the
reverse_url call is based on the url path.

Andrew Gwozdziewycz

Oct 19, 2009, 4:30:18 PM
to python-...@googlegroups.com
Not at all.

In your handlers:

handlers = [
    url('/some/regex/([a-z]+)/', SomeHandler, name="some_handler"),
]

Then, {{ reverse_url('some_handler', 'arg1') }}. Note though that
there is no verification for parameters.

Woah... Upon looking back, I see that I made a mistake in what I
posted with the example:

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            url(r"/", HomeHandler, name="home"),
            url(r"/archive", ArchiveHandler, name="archive"),
            url(r"/feed", FeedHandler, name="feed"),
            url(r"/entry/([^/]+)", EntryHandler, name="entry"),
            (r"/compose", ComposeHandler),
            (r"/auth/login", AuthLoginHandler),
            (r"/auth/logout", AuthLogoutHandler),
        ]

Sorry about that!!

--
http://www.apgwoz.com

JGAllen23

Oct 19, 2009, 4:38:46 PM
to Tornado Web Server
Awesome. This is exactly what I was looking for (and was about to
build). Thanks

David Novakovic

Nov 8, 2009, 12:10:56 AM
to Tornado Web Server
Would love to see this integrated into tornado proper!

How is work proceeding on it? Is there some way I can contribute to
help out?

David