
HeaderParseError


Thomas Guettler

Jul 4, 2011, 4:31:35 AM
Hi,

I get a HeaderParseError during decode_header(), but Thunderbird can
display the name.

>>> from email.header import decode_header
>>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
    raise HeaderParseError
email.errors.HeaderParseError


How can I parse this in Python?

Thomas

Same question on Stackoverflow:
http://stackoverflow.com/questions/6568596/headerparseerror-in-python

--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de

Peter Otten

Jul 4, 2011, 5:51:41 AM
Thomas Guettler wrote:

> I get a HeaderParseError during decode_header(), but Thunderbird can
> display the name.
>
> >>> from email.header import decode_header
> >>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
>     raise HeaderParseError
> email.errors.HeaderParseError
>
>
> How can I parse this in Python?

Trying to decode as much as possible:

>>> s = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
>>> for n in range(len(s), 0, -1):
...     try: t = s[:n].decode("base64")
...     except: pass
...     else: break
...
>>> n, t
(49, 'Anmeldung Netzanschluss S\x19\x1c\x9a[\x99\xcc\xdc\x0b\x9a\x9c\x19')
>>> print t.decode("iso-8859-1")
Anmeldung Netzanschluss S[ÌÜ

>>> s[n:]
'w==?='

The characters after "...Netzanschluss " look like garbage. What does
Thunderbird display?

Thomas Guettler

Jul 4, 2011, 6:38:39 AM

Hi Peter, Thunderbird shows this:

Anmeldung Netzanschluss Südring3p.jpg

Thomas

Peter Otten

Jul 4, 2011, 7:20:42 AM
to pytho...@python.org
Thomas Guettler wrote:

> On 04.07.2011 11:51, Peter Otten wrote:
>> Thomas Guettler wrote:
>>
>>> I get a HeaderParseError during decode_header(), but Thunderbird can
>>> display the name.
>>>
>>> >>> from email.header import decode_header
>>> >>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>>   File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
>>>     raise HeaderParseError
>>> email.errors.HeaderParseError
>>
>> The characters after "...Netzanschluss " look like garbage. What does
>> Thunderbird display?
>
> Hi Peter, Thunderbird shows this:
>
> Anmeldung Netzanschluss Südring3p.jpg

>>> a = u"Anmeldung Netzanschluss Südring3p.jpg".encode("iso-8859-1").encode("base64")

>>> b = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
>>> for i, (x, y) in enumerate(zip(a, b)):
...     if x != y: print i, x, y
...
33 / _
52
?
>>> b.decode("base64")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/encodings/base64_codec.py", line 42, in base64_decode
    output = base64.decodestring(input)
  File "/usr/lib/python2.6/base64.py", line 321, in decodestring
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
>>> b.replace("_", "/").decode("base64")
'Anmeldung Netzanschluss S\xfcdring3p.jpg'

Looks like you encountered a variant of base64 that uses "_" instead of "/"
for the character with value 63 (the last entry of the alphabet). The
Wikipedia page http://en.wikipedia.org/wiki/Base64 calls that base64url.
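
For readers on Python 3, where str has no decode("base64") method, the same check can be sketched with the standard base64 module. This is an illustration, not part of the original exchange: the variable names are made up, and the payload is the one from the question with the "=?iso-8859-1?B?" prefix and "?=" suffix stripped by hand.

```python
import base64

# Payload from the question, without the "=?iso-8859-1?B?" prefix
# and the "?=" suffix of the encoded word.
payload = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw=="

# urlsafe_b64decode() maps "-" to "+" and "_" to "/" before decoding,
# which is the same substitution as b.replace("_", "/") above.
raw = base64.urlsafe_b64decode(payload)

# The charset named in the encoded word is iso-8859-1.
filename = raw.decode("iso-8859-1")
```

After this runs, filename should hold the name that Thunderbird displays, "Anmeldung Netzanschluss Südring3p.jpg".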

You could try and make the email package accept that with a monkey patch
like the following:

# untested
import binascii

def a2b_base64(s):
    return binascii.a2b_base64(s.replace("_", "/"))

from email import base64mime
base64mime.a2b_base64 = a2b_base64

Alternatively monkey-patch the binascii module before you import the email
package.
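
A less invasive option than patching, sketched here for Python 3, is to rewrite the base64url characters inside "B" encoded words before the header reaches decode_header(). The helper names normalize_b_words and decode_header_lenient are hypothetical, not part of the email package, and the regular expression is a simplified assumption about what an encoded word looks like. "Q" encoded words are left alone, because "_" stands for a space there.

```python
import re
from email.header import decode_header

# Simplified pattern for one encoded word using the "B" (base64) encoding:
# =?charset?B?payload?=
_B_WORD = re.compile(r"=\?([^?]*)\?([bB])\?([^?]*)\?=")

# base64url alphabet characters mapped back to the standard alphabet.
_URLSAFE_TO_STANDARD = str.maketrans("-_", "+/")


def normalize_b_words(header):
    """Replace "-" and "_" in B-encoded payloads with "+" and "/"."""
    def fix(match):
        charset, encoding, payload = match.groups()
        payload = payload.translate(_URLSAFE_TO_STANDARD)
        return "=?%s?%s?%s?=" % (charset, encoding, payload)
    return _B_WORD.sub(fix, header)


def decode_header_lenient(header):
    """Decode a header to text after normalizing its B-encoded words."""
    parts = []
    for value, charset in decode_header(normalize_b_words(header)):
        if isinstance(value, bytes):
            # Undeclared charsets are treated as ASCII in this sketch.
            value = value.decode(charset or "ascii", errors="replace")
        parts.append(value)
    return "".join(parts)
```

Calling decode_header_lenient() on the header from the question should give the file name that Thunderbird shows. The trade-off is that a header which really is malformed in some other way still raises HeaderParseError, which seems preferable to changing the behaviour of binascii for the whole process.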


Thomas Guettler

Jul 5, 2011, 8:02:32 AM
On 04.07.2011 13:20, Peter Otten wrote:
> Thomas Guettler wrote:
>
>> On 04.07.2011 11:51, Peter Otten wrote:
>>> Thomas Guettler wrote:
>>>
>>>> I get a HeaderParseError during decode_header(), but Thunderbird can
>>>> display the name.

Hi,

I created a ticket: http://bugs.python.org/issue12489

Thomas Güttler
