utf-8 cookie encoding problem


虞坤霖

Jan 27, 2015, 12:20:20 PM
to cherryp...@googlegroups.com
Hi!
I want to send a cookie with a Chinese value in the response, like this:

cookie = cherrypy.response.cookie
cookie['userid'] = '用户名'  # <-- Chinese value
cookie['userid']['path'] = '/'
cookie['userid']['max-age'] = 10
cookie['userid']['version'] = 1

But my browser didn't receive the cookie correctly.
Chrome received:
Set-Cookie: =?utf-8?b?dXNlcmlkPSLnlKjmiLflkI0iOyBNYXgtQWdlPTEwOyBQYXRoPS87IFZlcnNpb249MQ==?=
I think the whole header value has been base64-encoded.
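
To check that guess, here is a minimal sketch that decodes the header value Chrome received, using the standard library's RFC 2047 decoder. The variable names are only illustrative; the header string is copied from the output above.

```python
from email.header import decode_header

# The Set-Cookie value exactly as Chrome showed it.
raw_value = (
    "=?utf-8?b?dXNlcmlkPSLnlKjmiLflkI0iOyBNYXgtQWdlPTEwOyBQYXRoPS87"
    "IFZlcnNpb249MQ==?="
)

# decode_header returns a list of (data, charset) pairs, one per encoded word.
payload, charset = decode_header(raw_value)[0]
if isinstance(payload, bytes):
    payload = payload.decode(charset or "ascii")

print(payload)
```

The decoded text is the complete cookie string (name, value and attributes), so the whole header was wrapped as one RFC 2047 encoded word, not just the Chinese value.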

Then I found this:
cherrypy/_cprequest.py:
        cookie = self.cookie.output()
        if cookie:
            print(cookie)
            for line in cookie.split("\n"):
                if line.endswith("\r"):
                    # Python 2.4 emits cookies joined by LF but 2.5+ by CRLF.
                    line = line[:-1]
                name, value = line.split(": ", 1)
                if isinstance(name, unicodestr):
                    name = name.encode("ISO-8859-1")
                if isinstance(value, unicodestr):
                    value = headers.encode(value)  # <-- here
                h.append((name, value))

The definition of headers.encode is below:
cherrypy/lib/httputil.py
    def encode(cls, v):
        """Return the given header name or value, encoded for HTTP output."""
        for enc in cls.encodings:
            try:
                return v.encode(enc)
            except UnicodeEncodeError:
                continue

        if cls.protocol == (1, 1) and cls.use_rfc_2047:
            # Encode RFC-2047 TEXT
            # (e.g. u"\u8200" -> "=?utf-8?b?6IiA?=").
            # We do our own here instead of using the email module
            # because we never want to fold lines--folding has
            # been deprecated by the HTTP working group.
            v = b2a_base64(v.encode('utf-8'))
            return (ntob('=?utf-8?b?') + v.strip(ntob('\n')) + ntob('?='))  # <-- here

        raise ValueError("Could not encode header part %r using "
                         "any of the encodings %r." %
                         (v, cls.encodings))

This function falls back to base64 (RFC 2047) for v,
because none of the encodings in cls.encodings can represent the Chinese characters:
cherrypy/lib/httputil.py
    protocol = (1, 1)
    encodings = ["ISO-8859-1"]

Then I added 'utf-8' to encodings:
cherrypy/lib/httputil.py
    protocol = (1, 1)
    encodings = ["ISO-8859-1", "utf-8"]

Problem solved!
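
The effect of that change can be reproduced outside CherryPy. The following is a standalone sketch of the fallback logic quoted above, not CherryPy's actual code: `encode_header_value` and its parameters are hypothetical names, and the protocol check is left out for brevity.

```python
from binascii import b2a_base64


def encode_header_value(value, encodings=("ISO-8859-1",), use_rfc_2047=True):
    """Mimic the quoted encode(): try each encoding, then fall back to RFC 2047."""
    for enc in encodings:
        try:
            return value.encode(enc)
        except UnicodeEncodeError:
            continue

    if use_rfc_2047:
        # b2a_base64 appends a trailing newline, which is stripped here.
        payload = b2a_base64(value.encode("utf-8")).strip(b"\n")
        return b"=?utf-8?b?" + payload + b"?="

    raise ValueError("Could not encode %r using any of %r" % (value, encodings))


header = 'userid="用户名"; Max-Age=10; Path=/; Version=1'

# Default list: ISO-8859-1 cannot represent the value, so base64 is used.
default_result = encode_header_value(header)

# Patched list: UTF-8 succeeds, so the raw UTF-8 bytes go on the wire.
patched_result = encode_header_value(header, encodings=("ISO-8859-1", "utf-8"))

print(default_result)
print(patched_result)
```

With the default list the result is the same encoded word Chrome received; with "utf-8" appended, the header is sent as raw UTF-8 bytes instead.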

Now I want to ask: is there an API for adding encodings when I send a cookie in the response?
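
For comparison, one common alternative that needs no change to CherryPy is to percent-encode the value so the cookie stays pure ASCII. This is only a sketch of that general technique using the standard library, not an API mentioned in this thread; a plain `SimpleCookie` stands in for `cherrypy.response.cookie` here, and the receiving side is assumed to call `unquote` itself.

```python
from http.cookies import SimpleCookie
from urllib.parse import quote, unquote

user_name = "用户名"

# Percent-encode the UTF-8 bytes so the cookie value is plain ASCII.
encoded_value = quote(user_name)

# Stand-in for cherrypy.response.cookie (an assumption for this sketch).
cookie = SimpleCookie()
cookie["userid"] = encoded_value
cookie["userid"]["path"] = "/"
cookie["userid"]["max-age"] = 10

# Whoever reads the cookie back has to reverse the encoding.
restored = unquote(cookie["userid"].value)

print(encoded_value)
print(restored)
```

The trade-off is that every reader of the cookie, including any client-side script, must decode the value before using it.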

Tim Roberts

Jan 27, 2015, 1:27:26 PM
to cherryp...@googlegroups.com
虞坤霖 wrote:
Hi!
I want to send a cookie with a Chinese value in the response, like this:

cookie = cherrypy.response.cookie
cookie['userid'] = '用户名'  # <-- Chinese value
cookie['userid']['path'] = '/'
cookie['userid']['max-age'] = 10
cookie['userid']['version'] = 1

But my browser didn't receive the cookie correctly.
Chrome received:
Set-Cookie: =?utf-8?b?dXNlcmlkPSLnlKjmiLflkI0iOyBNYXgtQWdlPTEwOyBQYXRoPS87IFZlcnNpb249MQ==?=
I think the whole header value has been base64-encoded.

Yes, that's exactly the right way to send a non-ASCII header value.  HTTP headers have to be 7-bit clean.  Extended characters have to be encoded.

Why do you think this is a problem?  The cookie receiver should be able to decode this back to Unicode.
-- 
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.