Unicode regex pattern in Webapp2 WSGIApplication URL Mapping

Jonathan

May 9, 2012, 2:08:30 AM
to google-a...@googlegroups.com
In Python's re module, when you set the re.UNICODE flag, the '\w' word pattern matches
the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database.

For example, it matches the characters a-z in English. It also matches accented Latin characters such as à (0x00E0) and ė (0x0117), and Asian word characters such as 我 (0x6211, meaning 'I') and 你 (0x4F60, meaning 'you'), which is brilliant for supporting other languages easily.
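
To make the behaviour described above concrete, here is a minimal sketch using only the standard re module. The sample words and the variable names are illustrative assumptions. The explicit character class [A-Za-z0-9_] stands in for what '\w' matches on Python 2.7 when re.UNICODE is not set; on Python 3, str patterns are Unicode-aware by default, so the flag is redundant there but harmless.

```python
# -*- coding: utf-8 -*-
import re

# Sample search terms (illustrative): "hello", "hablé", "我会明白".
# Written with escapes so the file encoding does not matter.
words = [u"hello", u"habl\u00e9", u"\u6211\u4f1a\u660e\u767d"]

# With re.UNICODE, \w follows the Unicode character database.
unicode_word = re.compile(r"^\w+$", re.UNICODE)

# ASCII-only equivalent of \w, i.e. the behaviour without the flag on Python 2.7.
ascii_word = re.compile(r"^[A-Za-z0-9_]+$")

unicode_matches = [bool(unicode_word.match(w)) for w in words]
ascii_matches = [bool(ascii_word.match(w)) for w in words]

print(unicode_matches)
print(ascii_matches)
```

Only the first word passes the ASCII-only pattern, while all three pass the Unicode-aware one.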

I want my URL pattern to match alphanumeric Unicode characters, for example:
  /Search/hello
  /Search/hablé
  /Search/我会明白

-->  app = webapp2.WSGIApplication([(r'/Search/[\w]+', SearchHandler)], ...)

But I cannot find a way to set the Unicode flag for the WSGIApplication regex pattern: the above pattern only matches the a-z characters and none of the other Unicode alphanumeric characters. I've tried including the inline Unicode flag (?u) in the pattern string, but that didn't work. Is there a way to set the Unicode flag?

Thanks in advance.  I'm using App Engine ver 1.6.5 and Python 2.7.  


pdknsk

May 12, 2012, 10:01:38 AM
to Google App Engine
What is the purpose of matching the search string when you don't seem
to use it? You'll probably figure this out easily once you capture the
matched string in a group and log it.
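
A rough sketch of the group-and-log suggestion above, using only the standard library rather than webapp2 itself. It assumes the request path arrives percent-encoded as UTF-8, which is how browsers usually send non-ASCII URLs; whether webapp2 hands the pattern a decoded path is not confirmed here. The names decode_path, SEARCH_ROUTE and extract_search_term are hypothetical helpers, not part of any library.

```python
# -*- coding: utf-8 -*-
import logging
import re

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2.7


def decode_path(raw_path):
    """Percent-decode a request path and return it as Unicode text.

    Assumes raw_path is a native str holding UTF-8 percent-escapes.
    """
    path = unquote(raw_path)
    if isinstance(path, bytes):  # Python 2: unquote returns a byte string
        path = path.decode("utf-8")
    return path


# The parentheses capture the search term so it can be inspected.
SEARCH_ROUTE = re.compile(r"^/Search/(\w+)$", re.UNICODE)


def extract_search_term(raw_path):
    """Return the captured search term, or None if the path does not match."""
    match = SEARCH_ROUTE.match(decode_path(raw_path))
    if match is None:
        return None
    term = match.group(1)
    logging.info("matched search term: %r", term)
    return term


print(extract_search_term("/Search/habl%C3%A9") is not None)
```

Logging the captured group this way shows exactly which text the pattern saw, which helps tell a decoding problem apart from a regex-flag problem. In a webapp2 tuple route, captured groups are passed to the handler method as positional arguments, so the same check could be done inside the handler.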

pdknsk

May 12, 2012, 10:16:07 AM
to Google App Engine
You're probably using self.request.path rather than groups, then. Ask
this question on Stack Overflow; it's easier to format the answer there.