httplib raises ValueError reading chunked content


philip2...@gmail.com

Mar 8, 2006, 6:21:11 PM
Hi all,
Has anyone ever seen Python 2.4.1's httplib choke when reading chunked
content? I'm using it via urllib2, and I ran into a particular server
that returns something that httplib doesn't expect. Specifically, in
the code below where the error occurs, line == ''.

Python 2.4.1 (#2, Oct 12 2005, 01:36:32)
[GCC 3.4.4 [FreeBSD] 20050518] on freebsd6
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> req = urllib2.Request("http://www.mistyshaven.com/")
>>> f = urllib2.urlopen(req)
>>> content = f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/socket.py", line 285, in read
    data = self._sock.recv(recv_size)
  File "/usr/local/lib/python2.4/httplib.py", line 456, in read
    return self._read_chunked(amt)
  File "/usr/local/lib/python2.4/httplib.py", line 495, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int():
>>>

I'm running Python 2.4.1 under FreeBSD 6.0. Interestingly, I can't
recreate the problem using Python 2.3 under OS X.

I've done a little digging for clues. First, the response headers
include:
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322

I reckon that if such a popular server were sending out broken chunked
content, it'd be a well-known problem, but that doesn't seem to be the
case. So I assume (big assumption) that it is sending correct
responses. Another clue is that the content fits all in one chunk.
Under my 2.3 installation (where I can fetch the content successfully),
len(content) == 0x303. The first chunk size reported by the server is
0x311, so I guess that adds up when one adds a fudge factor for \r\n
and so forth.
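
For reference, the two hex figures above work out as in the short sketch below. One caveat, offered as a side note rather than a diagnosis: in HTTP/1.1 the chunk size counts only the chunk's data bytes, not the \r\n after the size line or after the data, so the gap may have some other cause. The variable names are purely illustrative.

```python
# Figures quoted in the post above; names are illustrative only.
body_len = 0x303    # len(content) seen under Python 2.3
chunk_size = 0x311  # first chunk size reported by the server

difference = chunk_size - body_len
print(body_len, chunk_size, difference)  # 771 785 14
```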

My guess is that httplib is somehow reading the blank line that
signifies the end of chunked content as part of the content. I don't
know enough about debugging HTTP conversations to go any further. Can
anyone at least confirm the problem elsewhere?
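
To make the failure mode concrete, here is a minimal sketch of a chunked-body reader. It is not httplib's actual code: `read_chunked` and the sample byte strings are made up for illustration, and it is written for a current Python 3 rather than 2.4. It shows one way the size line can end up empty, namely the stream ending before the terminating zero-size chunk arrives. Whether that is what this particular server does is an assumption.

```python
import io

def read_chunked(stream):
    """Decode a chunked HTTP body from a binary file-like object (sketch)."""
    body = []
    while True:
        line = stream.readline()  # returns b"" once the stream is exhausted
        # Drop any chunk extension after ';' and surrounding whitespace.
        size_field = line.split(b";", 1)[0].strip()
        # An empty size_field raises ValueError, as in the traceback above.
        chunk_left = int(size_field.decode("ascii"), 16)
        if chunk_left == 0:
            break  # zero-size chunk marks the end of the body
        body.append(stream.read(chunk_left))
        stream.read(2)  # consume the CRLF that follows the chunk data
    return b"".join(body)

# Well-formed body: one 5-byte chunk, then the terminating zero-size chunk.
good = io.BytesIO(b"5\r\nhello\r\n0\r\n\r\n")
result = read_chunked(good)

# Hypothetical malformed body: the stream ends before the zero-size chunk.
truncated = io.BytesIO(b"5\r\nhello\r\n")
try:
    read_chunked(truncated)
    error = None
except ValueError as exc:
    error = exc
```

A more forgiving reader could treat an empty size line as end-of-body instead of raising, at the cost of silently accepting truncated responses.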

Thanks
Philip

PS - The email address with which this was posted is live; you can also
email Philip Semanchuk: my first name @ my last name .com

Etienne Desautels

Mar 8, 2006, 8:32:50 PM
to pytho...@python.org, phi...@semanchuk.com

Hi Philip,

> Hi all,
> Has anyone ever seen Python 2.4.1's httplib choke when reading chunked
> content?

Yes, it's a known bug. See for yourself:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=900744&group_id=5470

Etienne

Philip Semanchuk

Mar 8, 2006, 9:00:27 PM
to Etienne Desautels, pytho...@python.org

On Mar 8, 2006, at 8:32 PM, Etienne Desautels wrote:

>
> Hi Philip,
>
>> Hi all,
>> Has anyone ever seen Python 2.4.1's httplib choke when reading chunked
>> content?
>
> Yes, it's a known bug. See for yourself:
> https://sourceforge.net/tracker/?func=detail&atid=305470&aid=900744&group_id=5470

Merci beaucoup Etienne, I don't know why I couldn't find that in the
bug list. I will have to work on my searching skills.

Cheers
Philip
