Invalid character in URL query string causes UnicodeDecodeError exception to be raised

Ben Sizer

Jul 5, 2012, 9:38:18 AM
to pylons-...@googlegroups.com
Sorry for not having a full test case, but here's the basic overview:

 - Create a view that accesses request.params
 - Form a URL that uses that view and contains some query parameters that are not valid UTF-8 (e.g. a random byte string)
 - Try to access that URL on the Pyramid server
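The decoding step that fails in the traceback below can be approximated without Pyramid or WebOb. This is only a sketch under assumptions: it uses Python 3's standard `urllib.parse.parse_qsl` rather than WebOb's own `parse_qsl_text`, and `decode_strict` and the sample query string are hypothetical names chosen for illustration.

```python
from urllib.parse import parse_qsl

# Hypothetical query string: the value is the single byte 0xA7,
# percent-encoded. That byte is not a valid UTF-8 start byte.
query_string = "admin_auth=%A7"


def decode_strict(qs):
    """Parse a query string, refusing bytes that are not valid UTF-8."""
    return parse_qsl(qs, keep_blank_values=True,
                     encoding="utf-8", errors="strict")


try:
    decode_strict(query_string)
except UnicodeDecodeError as exc:
    # Same class of error as in the WebOb traceback.
    print("decode failed:", exc.reason)
```

Note that the standard library defaults to `errors="replace"`, so it only raises when strict decoding is requested explicitly, as it is here.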

The result, for me, is a traceback like this:

Traceback (most recent call last):
  File "/usr/lib/python2.7/wsgiref/handlers.py", line 85, in run
    self.result = application(self.environ, self.start_response)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/router.py", line 187, in __call__
    response = self.handle_request(request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/tweens.py", line 20, in excview_tween
    response = handler(request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/router.py", line 164, in handle_request
    response = view_callable(context, request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/config/views.py", line 333, in rendered_view
    result = view(context, request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/config/views.py", line 471, in _requestonly_view
    response = view(request)
  File "accountserver.py", line 297, in verify_account
    if "admin_auth" not in request.params or request.params["admin_auth"] != auth_val:
  File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 831, in params
    params = NestedMultiDict(self.GET, self.POST)
  File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 813, in GET
    vars = GetDict(data, env)
  File "/usr/local/lib/python2.7/dist-packages/webob/multidict.py", line 273, in __init__
    MultiDict.__init__(self, data)
  File "/usr/local/lib/python2.7/dist-packages/webob/multidict.py", line 37, in __init__
    items = list(args[0])
  File "/usr/local/lib/python2.7/dist-packages/webob/compat.py", line 125, in parse_qsl_text
    yield (x.decode(encoding), y.decode(encoding))
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa7 in position 0: invalid start byte

This looks like an error in WebOb rather than Pyramid as such, but it leaks out into my view functions, and it lets an external user raise exceptions on the server via malformed URLs. That -feels- like a security or potential DoS issue, even if the server (in my case, at least) stays up afterwards.

Can anybody who understands this better comment on the issue?

--
Ben Sizer


Michael Merickel

Jul 10, 2012, 2:13:42 PM
to pylons-...@googlegroups.com
Not everything will decode. WebOb attempts to decode things as UTF-8
by default, but you can change this if you wish with a custom request
factory. Anyway, you can't guess the encoding every time, and if decoding
fails then it'll raise an exception and return a 500 back to the
client. If you prefer something else, you can create a custom exception
view for UnicodeDecodeError or add try/except blocks in the request
factory to handle the error in the way you wish.
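As one possible reading of "handle the error in the way you wish", the sketch below tries a strict decode first and falls back to substituting U+FFFD for undecodable bytes. It is not WebOb's or Pyramid's API: it assumes Python 3's standard `urllib.parse.parse_qsl`, and `parse_query_lenient` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl


def parse_query_lenient(qs, encoding="utf-8"):
    """Return (pairs, clean).

    `clean` is False when some byte could not be decoded and was
    replaced with U+FFFD instead of raising.
    """
    try:
        pairs = parse_qsl(qs, keep_blank_values=True,
                          encoding=encoding, errors="strict")
        return pairs, True
    except UnicodeDecodeError:
        # Fall back to a lossy decode so the caller still gets values.
        pairs = parse_qsl(qs, keep_blank_values=True,
                          encoding=encoding, errors="replace")
        return pairs, False
```

The `clean` flag lets the caller decide whether to reject the request or carry on with the lossy values.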

Ben Sizer

Jul 10, 2012, 4:25:12 PM
to pylons-...@googlegroups.com
The exception is only raised once we try to access request.params, so the problem wouldn't be caught at the request factory stage unless there is some explicit code to read these values. I could have the factory explicitly look at .GET, I suppose. (By the way, the docs for using a request factory are not very clear on what form the provided callable should take (i.e. maybe a dict of environment options?), or on whether there are any expectations about what is returned.)

If I instead chose to write a custom exception view for UnicodeDecodeErrors, this would also get used if I accidentally introduced some Unicode-related errors into my code, when really I want to distinguish between bad user input and bad code.

I'm not sure what the right approach is here, given that there seems to be no standard on what a valid URI has to look like, but I would -suggest- that it would be handy if Pyramid could be told what encoding to expect and bail out early before it hits the application-specific view code. Perhaps a URI encoding could be another parameter to view_config?
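The "bail out early" idea can be sketched as a small WSGI middleware that sits in front of the application and answers 400 before any view code runs. This is an illustration under assumptions, not an existing Pyramid or WebOb feature: `RejectUndecodableQuery` is a hypothetical name, the code targets Python 3, and it only checks percent-encoded bytes in `QUERY_STRING`.

```python
from urllib.parse import parse_qsl


class RejectUndecodableQuery:
    """WSGI middleware: reply 400 if the query string fails to decode."""

    def __init__(self, app, encoding="utf-8"):
        self.app = app
        self.encoding = encoding

    def __call__(self, environ, start_response):
        qs = environ.get("QUERY_STRING", "")
        try:
            # Strict decode; the parsed result itself is discarded.
            parse_qsl(qs, keep_blank_values=True,
                      encoding=self.encoding, errors="strict")
        except UnicodeDecodeError:
            body = b"Bad Request: query string could not be decoded"
            start_response("400 Bad Request", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Query string is fine: hand the request to the wrapped app.
        return self.app(environ, start_response)
```

In a Pyramid project this could wrap the WSGI application returned by `Configurator.make_wsgi_app()`, so that malformed query strings are treated as bad user input and never reach the views.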

Chris McDonough

Jul 10, 2012, 4:41:17 PM
to pylons-...@googlegroups.com
On 07/10/2012 04:25 PM, Ben Sizer wrote:

> If I instead chose to write a custom exception view for
> UnicodeDecodeErrors, this would also get used if I accidentally
> introduced some Unicode-related errors into my code, when really I want
> to distinguish between bad user input and bad code.

I'd suggest submitting a patch to WebOb which causes it to catch
UnicodeDecodeError where such exceptions are likely to be raised and
reraise a *subclass* of UnicodeDecodeError (e.g. ParamsDecodeError).

- C
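The subclass suggestion can be sketched as follows. This is not an actual WebOb patch: it assumes Python 3 and the standard `urllib.parse.parse_qsl`, `ParamsDecodeError` is the name proposed above, and `parse_params` is a hypothetical stand-in for the place where WebOb decodes parameters.

```python
from urllib.parse import parse_qsl


class ParamsDecodeError(UnicodeDecodeError):
    """Request parameters could not be decoded (bad client input)."""


def parse_params(qs, encoding="utf-8"):
    """Parse a query string, raising ParamsDecodeError on bad bytes."""
    try:
        return parse_qsl(qs, keep_blank_values=True,
                         encoding=encoding, errors="strict")
    except UnicodeDecodeError as exc:
        # Re-raise as the subclass, preserving the original details.
        raise ParamsDecodeError(exc.encoding, exc.object,
                                exc.start, exc.end, exc.reason) from exc
```

Because `ParamsDecodeError` is still a `UnicodeDecodeError`, existing handlers keep working, while an exception view registered for the subclass alone would catch bad user input without also swallowing Unicode bugs in application code.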