Patch: Named regular expression groups in URL patterns

Marc

Mar 11, 2010, 6:55:08 PM
to Tornado Web Server
Being able to use named regular expression groups in URL patterns
seems like a logical extension of the regular expression matching that
Tornado is already doing and it is very easy to add:

--- a/tornado/web.py
+++ b/tornado/web.py
@@ -1015,7 +1015,7 @@ class Application(object):
             RequestHandler._templates = None
             RequestHandler._static_hashes = {}

-        handler._execute(transforms, *args)
+        handler._execute(transforms, *args, **match.groupdict())
         return handler

     def reverse_url(self, name, *args):

Any reason not to do it?

-Marc

Marc

Mar 11, 2010, 7:15:18 PM
to Tornado Web Server
On Mar 11, 3:55 pm, Marc <msabr...@gmail.com> wrote:
> Being able to use named regular expression groups in URL patterns
> seems like a logical extension of the regular expression matching that
> Tornado is already doing and it is very easy to add:
>
> --- a/tornado/web.py
> +++ b/tornado/web.py
> @@ -1015,7 +1015,7 @@ class Application(object):
[snip]

Oops. I didn't test that very well. It fails when there is no match.
Here's one that works better and is more consistent with the existing
code style:

--- a/tornado/web.py
+++ b/tornado/web.py
@@ -995,6 +995,7 @@ class Application(object):
         transforms = [t(request) for t in self.transforms]
         handler = None
         args = []
+        kwargs = {}
         handlers = self._get_host_handlers(request)
         if not handlers:
             handler = RedirectHandler(
@@ -1005,6 +1006,7 @@ class Application(object):
                 if match:
                     handler = spec.handler_class(self, request, **spec.kwargs)
                     args = match.groups()
+                    kwargs = match.groupdict()
                     break
             if not handler:
                 handler = ErrorHandler(self, request, 404)
@@ -1015,7 +1017,7 @@ class Application(object):
             RequestHandler._templates = None
             RequestHandler._static_hashes = {}

-        handler._execute(transforms, *args)
+        handler._execute(transforms, *args, **kwargs)
         return handler

     def reverse_url(self, name, *args):

Hmmm. I should probably learn how to do a git pull request... :-)

-Marc
http://marc-abramowitz.com
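
For readers following along, one plausible way the first patch fails "when there is no match" is that `match` is `None` at the point where `match.groupdict()` is evaluated. The sketch below reproduces that with the standard `re` module only; the pattern and path are made up for illustration and are not taken from Tornado.

```python
import re

# Hypothetical URL pattern and path, chosen so the pattern does not match.
match = re.match(r"/items/(\d+)$", "/no-such-route")
print(match is None)

try:
    kwargs = match.groupdict()
    failed = False
except AttributeError:
    # None has no groupdict(); fall back to empty keyword arguments,
    # the same default the revised patch sets before the loop.
    kwargs = {}
    failed = True

print(failed, kwargs)
```

Initializing `kwargs = {}` up front, as the revised patch does, avoids touching `match` after the loop at all.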

Ben Darnell

Mar 11, 2010, 8:27:57 PM
to python-...@googlegroups.com
I like the idea, but it seems to break down because match.groups()
includes both named and unnamed groups. What did your handler method
signature look like?

>>> m=re.match(r'(?P<foo>foo)(?P<bar>bar)', 'foobar')
>>> m.groups(), m.groupdict()
(('foo', 'bar'), {'foo': 'foo', 'bar': 'bar'})
>>> def f(bar, foo): print bar, foo
...
>>> f(*m.groups(), **m.groupdict())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() got multiple values for keyword argument 'foo'
>>> def f2(foo, bar): print foo, bar
...
>>> f2(*m.groups(), **m.groupdict())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f2() got multiple values for keyword argument 'foo'

-Ben
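
The session above is Python 2. A rough Python 3 equivalent of the same experiment is sketched below; the function name `handler` is illustrative only, and the exact wording of the `TypeError` message varies between Python versions, so the sketch only captures it rather than asserting its text.

```python
import re

m = re.match(r"(?P<foo>foo)(?P<bar>bar)", "foobar")

# groups() returns every capturing group, named or not, so the named
# values show up both positionally and in groupdict().
print(m.groups())     # ('foo', 'bar')
print(m.groupdict())  # {'foo': 'foo', 'bar': 'bar'}


def handler(foo, bar):
    return foo, bar


error = None
try:
    # Each value is supplied twice: once positionally, once by keyword.
    handler(*m.groups(), **m.groupdict())
except TypeError as exc:
    error = str(exc)

print(error)
```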

Anton Bessonov

Mar 11, 2010, 8:41:04 PM
to python-...@googlegroups.com

Andrew Gwozdziewycz

Mar 12, 2010, 6:29:22 AM
to python-...@googlegroups.com


On Thu, Mar 11, 2010 at 8:41 PM, Anton Bessonov <exe...@googlemail.com> wrote:
> Too late!
>
> http://groups.google.com/group/python-tornado/browse_thread/thread/1d8a90ad5ec4aa90

His point is that the parameters captured by named groups could be passed to the handler as kwargs (which Django does); that currently isn't done. There are two slight problems with this: 1) they are already passed as regular positional parameters, and 2) reverse_url wasn't really designed to handle optional groups (or groups with quantifiers, for that matter, e.g. (\d+/)+) in URLs, making default arguments sort of useless. In fact, if you try it and it works, consider yourself lucky :).

 
> Any reason not to do it?

--
http://www.apgwoz.com
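
To make the point about optional and quantified groups concrete, here is a small sketch using only the standard `re` module (it does not call Tornado's `reverse_url`; the patterns and paths are invented for illustration). An optional named group yields `None` when it does not participate in the match, and a quantified group keeps only its last repetition, so the captured values alone do not describe the URL that was matched.

```python
import re

# Optional named group: absent from the path, so its value is None.
optional = re.match(r"/posts(?:/(?P<page>\d+))?$", "/posts")
print(optional.groupdict())  # {'page': None}

# Quantified group: only the final repetition is kept.
repeated = re.match(r"/((\d+)/)+$", "/1/2/3/")
print(repeated.groups())  # ('3/', '3')
```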

Marc

Mar 12, 2010, 9:30:20 PM
to Tornado Web Server
On Mar 11, 5:27 pm, Ben Darnell <ben.darn...@gmail.com> wrote:
> I like the idea, but it seems to break down because match.groups()
> includes both named and unnamed groups.  What did your handler method
> signature look like?

Good point. Well, my handler signature uses (self, *args, **kwargs),
and I just use that to populate a simple dictionary, so it doesn't
matter in my case that the named groups might be included twice in the
function args. That's why I didn't run into that problem.

Here's how Django does it:

http://docs.djangoproject.com/en/dev/topics/http/urls/#s-named-groups

They assume that you are going to use all named groups or all unnamed
groups and not mix them (which seems reasonable). In their words, "If
there are any named arguments, it will use those, ignoring non-named
arguments. Otherwise, it will pass all non-named arguments as
positional arguments.". Following a rule like that, it should be
pretty easy to implement. Something like:

--- a/tornado/web.py
+++ b/tornado/web.py
@@ -995,6 +995,7 @@ class Application(object):
         transforms = [t(request) for t in self.transforms]
         handler = None
         args = []
+        kwargs = {}
         handlers = self._get_host_handlers(request)
         if not handlers:
             handler = RedirectHandler(
@@ -1005,6 +1006,9 @@ class Application(object):
                 if match:
                     handler = spec.handler_class(self, request, **spec.kwargs)
                     args = match.groups()
+                    kwargs = match.groupdict()
+                    if kwargs:
+                        args = []
                     break
             if not handler:
                 handler = ErrorHandler(self, request, 404)
@@ -1015,7 +1019,7 @@ class Application(object):
             RequestHandler._templates = None
             RequestHandler._static_hashes = {}

-        handler._execute(transforms, *args)
+        handler._execute(transforms, *args, **kwargs)
         return handler

     def reverse_url(self, name, *args):


I tested the above with:

import tornado.web

class PathRegexHandler(tornado.web.RequestHandler):
    def get(self, thing_type, thing_name):
        self.write("thing_type = %r\n" % thing_type)
        self.write("thing_name = %r\n" % thing_name)

application = tornado.web.Application([
    (r'/unnamed_groups/([a-zA-Z0-9]+)/([a-zA-Z0-9]+)',
     PathRegexHandler),
    (r'/named_groups/(?P<thing_type>[a-zA-Z0-9]+)/'
     r'(?P<thing_name>[a-zA-Z0-9]+)', PathRegexHandler),
])

$ curl http://127.0.0.1:8000/unnamed_groups/animal/tiger
thing_type = 'animal'
thing_name = 'tiger'
$ curl http://127.0.0.1:8000/named_groups/animal/tiger
thing_type = 'animal'
thing_name = 'tiger'

How's that?

-Marc
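
The rule borrowed from Django (use named groups as keyword arguments if there are any, otherwise pass all groups positionally) can also be sketched outside Tornado with plain `re`. The helper `split_match_args` and the function `describe` are hypothetical names introduced for this illustration; they are not part of the patch or of Tornado.

```python
import re


def split_match_args(match):
    """Return (args, kwargs) for a handler call, preferring named groups."""
    kwargs = match.groupdict()
    if kwargs:
        # Named groups present: ignore positional groups entirely.
        return [], kwargs
    return list(match.groups()), {}


def describe(thing_type, thing_name):
    # Stand-in for a handler method such as get(self, thing_type, thing_name).
    return "%s/%s" % (thing_type, thing_name)


unnamed = re.compile(r"/unnamed_groups/([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$")
named = re.compile(
    r"/named_groups/(?P<thing_type>[a-zA-Z0-9]+)/(?P<thing_name>[a-zA-Z0-9]+)$"
)

for pattern, path in [
    (unnamed, "/unnamed_groups/animal/tiger"),
    (named, "/named_groups/animal/tiger"),
]:
    args, kwargs = split_match_args(pattern.match(path))
    print(args, kwargs, describe(*args, **kwargs))
```

Either way the stand-in handler receives each value exactly once, which is what avoids the duplicate-argument error discussed earlier in the thread.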

Ben Darnell

Mar 13, 2010, 1:18:13 PM
to python-...@googlegroups.com
Yeah, that sounds reasonable (maybe with a sanity check that, if
m.groupdict() is non-empty, it has the same length as m.groups()).

-Ben
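
A minimal sketch of the sanity check suggested above, again with plain `re`. The helper name `check_not_mixed` and the choice of `ValueError` are assumptions made for this illustration; the thread does not say how the committed code reports the problem.

```python
import re


def check_not_mixed(match):
    """Raise if the pattern mixes named and unnamed groups (hypothetical helper)."""
    named = match.groupdict()
    if named and len(named) != len(match.groups()):
        raise ValueError(
            "pattern %r mixes named and unnamed groups" % match.re.pattern
        )


all_named = re.match(r"/(?P<kind>\w+)/(?P<name>\w+)$", "/animal/tiger")
mixed = re.match(r"/(\w+)/(?P<name>\w+)$", "/animal/tiger")

check_not_mixed(all_named)  # passes silently: two groups, both named

try:
    check_not_mixed(mixed)
    rejected = False
except ValueError:
    rejected = True

print(rejected)
```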

Anton Bessonov

Mar 13, 2010, 1:45:25 PM
to python-...@googlegroups.com

Ben Darnell

Mar 17, 2010, 9:52:43 PM
to python-...@googlegroups.com
I've committed both of these changes.

Thanks,
-Ben

Marc

Mar 18, 2010, 2:30:47 AM
to Tornado Web Server
On Mar 17, 6:52 pm, Ben Darnell <ben.darn...@gmail.com> wrote:
> I've committed both of these changes.

Cool. Thanks!
