Invalid character in URL query string causes UnicodeDecodeError exception to be raised

Ben Sizer

Jul 10, 2012, 9:03:28 AM
to paste...@googlegroups.com
I've copied most of this post over from one I made in pylons-discuss, which hasn't had any replies in 4 days. I'm not sure whether the problem is considered to be in WebOb or Pyramid, but maybe someone here will have an opinion.

Sorry for not having a full test case, but here's the basic overview:

 - Create a Pyramid view that accesses request.params
 - Form a URL that accesses that view and contains some query parameters that are not valid UTF-8 (e.g. a random byte string)
 - Using curl or similar, request that URL from the Pyramid server
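
In lieu of a full test case, here is a minimal sketch of the steps above. It uses only the Python 3 standard library rather than WebOb or Pyramid (the traceback is from Python 2.7). The host, port and path in the URL and the helper name decode_query_strict are hypothetical. The helper imitates the strict per-field decode that the traceback shows in parse_qsl_text; it is not WebOb's actual code.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# A byte string that is not valid UTF-8 (0xa7 cannot start a sequence).
raw_value = b"\xa7\xff\xfe"

# Percent-encode it so it can travel in a query string.
query = "admin_auth=" + quote_from_bytes(raw_value)

# Hypothetical URL; host, port and path are assumptions for illustration.
url = "http://localhost:6543/verify_account?" + query


def decode_query_strict(query_string, encoding="utf-8"):
    """Split a query string and strictly decode every name and value."""
    pairs = []
    for field in query_string.split("&"):
        name, _, value = field.partition("=")
        pairs.append((
            unquote_to_bytes(name).decode(encoding),
            unquote_to_bytes(value).decode(encoding),
        ))
    return pairs


try:
    decode_query_strict(query)
    raised = False
    error = None
except UnicodeDecodeError as exc:
    # Same failure class as in the traceback: the first byte is rejected.
    raised = True
    error = exc

print(url)
print("UnicodeDecodeError raised:", raised)
```

Requesting the printed URL with curl is then the third step.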

The result, for me, is a traceback like this:


Traceback (most recent call last):
  File "/usr/lib/python2.7/wsgiref/handlers.py", line 85, in run
    self.result = application(self.environ, self.start_response)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/router.py", line 187, in __call__
    response = self.handle_request(request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/tweens.py", line 20, in excview_tween
    response = handler(request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/router.py", line 164, in handle_request
    response = view_callable(context, request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/config/views.py", line 333, in rendered_view
    result = view(context, request)
  File "/usr/local/lib/python2.7/dist-packages/pyramid/config/views.py", line 471, in _requestonly_view
    response = view(request)
  File "accountserver.py", line 297, in verify_account
    if "admin_auth" not in request.params or request.params["admin_auth"] != auth_val:
  File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 831, in params
    params = NestedMultiDict(self.GET, self.POST)
  File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 813, in GET
    vars = GetDict(data, env)
  File "/usr/local/lib/python2.7/dist-packages/webob/multidict.py", line 273, in __init__
    MultiDict.__init__(self, data)
  File "/usr/local/lib/python2.7/dist-packages/webob/multidict.py", line 37, in __init__
    items = list(args[0])
  File "/usr/local/lib/python2.7/dist-packages/webob/compat.py", line 125, in parse_qsl_text
    yield (x.decode(encoding), y.decode(encoding))
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa7 in position 0: invalid start byte


Is it Pyramid's fault for letting invalid UTF-8 get into request.GET? Is it right to assume UTF-8? Is this exception documented, and should Pyramid be catching it?

I appreciate the URL is most likely invalid, but this does let an external user raise exceptions on the server via malformed URLs, which -feels- like a security or potential DoS issue, even if the server (in my case, at least) stays up afterwards.
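
One possible workaround, sketched under assumptions: a small WSGI middleware that rejects such requests with a 400 before the application ever touches request.params. This is not something Pyramid or WebOb is known to provide; the names RejectUndecodableQuery and query_is_valid_utf8 are hypothetical, and the code is written for Python 3 (where WSGI passes QUERY_STRING as a latin-1 decoded str) using only the standard library.

```python
from urllib.parse import unquote_to_bytes


def query_is_valid_utf8(query_string):
    """Return True if the percent-decoded query string is valid UTF-8."""
    # Recover the raw bytes of the WSGI string, then undo percent-encoding.
    raw = unquote_to_bytes(query_string.encode("latin-1"))
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


class RejectUndecodableQuery:
    """Hypothetical middleware: answer 400 instead of failing later."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if not query_is_valid_utf8(environ.get("QUERY_STRING", "")):
            body = b"Bad Request: query string is not valid UTF-8\n"
            start_response("400 Bad Request", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Query string decodes cleanly, so hand over to the wrapped app.
        return self.app(environ, start_response)
```

Wrapping would look like app = RejectUndecodableQuery(app), where app is whatever WSGI application the server is given. The design choice here is to treat undecodable input as a client error rather than an unhandled server exception.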

Can anybody with a better understanding of this comment on the issue?

--
Ben Sizer
