If the file exists, the HEAD request works as expected: I get valid
headers back and can pull the ETag out with
getheader('ETag')[1:-1] (the slice trims the double quotes
from the string).
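A minimal sketch of that ETag handling, assuming the ETag is the hex MD5 of the object as described in this thread. The helper names unquote_etag and matches_local are hypothetical, not part of any S3 library. Using strip('"') instead of the [1:-1] slice also copes with getheader() returning None when the header is absent:

```python
import hashlib


def unquote_etag(etag):
    """Return the ETag without its surrounding double quotes, or None."""
    if etag is None:
        # getheader('ETag') returns None when the header is missing.
        return None
    return etag.strip('"')


def matches_local(etag, data):
    """True if the (quoted) ETag equals the hex MD5 of the local bytes."""
    return unquote_etag(etag) == hashlib.md5(data).hexdigest()
```

If matches_local() returns True, the upload can be skipped.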
The problem arises when I send a HEAD request for a file that does
not exist. As expected, Amazon sends back a 404 Not Found response;
however, my test scripts seem to hang. I ran Python with
trace.py, and it hangs here:
--- modulename: httplib, funcname: _read_chunked
httplib.py(536): assert self.chunked != _UNKNOWN
httplib.py(537): chunk_left = self.chunk_left
httplib.py(538): value = ''
httplib.py(542): while True:
httplib.py(543): if chunk_left is None:
httplib.py(544): line = self.fp.readline()
--- modulename: socket, funcname: readline
socket.py(321): data = self._rbuf
socket.py(322): if size < 0:
socket.py(324): if self._rbufsize <= 1:
socket.py(326): assert data == ""
socket.py(327): buffers = []
socket.py(328): recv = self._sock.recv
socket.py(329): while data != "\n":
socket.py(330): data = recv(1)
It eventually completes with an exception here:
File "C:\Python25\lib\httplib.py", line 509, in read
return self._read_chunked(amt)
File "C:\Python25\lib\httplib.py", line 548, in _read_chunked
chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: ''
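For what it's worth, the final ValueError is just what int() raises when it is handed an empty string, which is what the chunk-size read ends up with when no body ever arrives. A rough sketch of that parsing step follows; parse_chunk_size is a hypothetical helper for illustration, not httplib's actual code:

```python
def parse_chunk_size(line):
    """Parse a chunk-size line of a chunked body: hex digits, optional ';ext'."""
    size_field = line.split(';', 1)[0].strip()
    # int() raises ValueError if size_field is empty or not valid hex.
    return int(size_field, 16)


# A well-formed size line parses; an empty read fails like the traceback.
print(parse_chunk_size('1a\r\n'))
try:
    parse_chunk_size('')
except ValueError as exc:
    print('ValueError:', exc)
```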
For reference, Ethereal captured the following request and response:
HEAD <REMOVED> HTTP/1.1
Host: s3.amazonaws.com
Accept-Encoding: identity
Date: Tue, 13 Mar 2007 02:54:12 GMT
Authorization: AWS <REMOVED>

HTTP/1.1 404 Not Found
x-amz-request-id: E20B4C0D0C48B2EF
x-amz-id-2: <REMOVED>
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Tue, 13 Mar 2007 02:54:16 GMT
Server: AmazonS3
Am I doing something wrong? Is this a known issue? I am an
experienced developer, but pretty new to Python and dynamic languages
in general.
Thanks,
Patrick
> I am attempting to use a HEAD request against Amazon S3 to check
> whether a file exists and, if it does, parse the md5 hash from
> the ETag in the response to verify the contents of the file, so as to
> save the bandwidth of uploading files when it is not necessary.
> The problem arises when I send a HEAD request for a file that does
> not exist. As expected, Amazon sends back a 404 Not Found response;
> however, my test scripts seem to hang. I ran Python with
> trace.py, and it hangs here:
Yes, it's a known problem. See this message with a self-response:
http://mail.python.org/pipermail/python-list/2006-March/375087.html
--
Gabriel Genellina
Are there plans to include this fix in the standard Python libraries
or must I make the modifications myself (I'm running Python 2.5)?
>> Yes, it's a known problem. See this message with a self-response:
>> http://mail.python.org/pipermail/python-list/2006-March/375087.html
Submit a bug report, if not already done.
http://sourceforge.net/tracker/?group_id=5470
--
Gabriel Genellina
Bug already exists at:
https://sourceforge.net/tracker/index.php?func=detail&aid=1486335&group_id=5470&atid=105470
In the meantime, I implemented a workaround for my specific case in
the Amazon S3 library: I added a head() method that actually issues a
GET request with a very small byte range. This yields all the header
data that I need (the md5 hash in the ETag if the file exists, 404 Not
Found if it doesn't).
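Roughly, the workaround looks like the sketch below. It is not the exact code from the S3 library. The function name head_via_ranged_get and the extra_headers parameter (where the Date and Authorization headers would go) are hypothetical, and conn is assumed to behave like httplib.HTTPConnection (http.client.HTTPConnection on Python 3):

```python
def head_via_ranged_get(conn, path, extra_headers=None):
    """Emulate HEAD with a one-byte ranged GET.

    Returns the unquoted ETag, or None when the server answers 404.
    """
    headers = {'Range': 'bytes=0-0'}  # ask for the first byte only
    if extra_headers:
        headers.update(extra_headers)
    conn.request('GET', path, headers=headers)
    response = conn.getresponse()
    response.read()  # drain the small body so the connection can be reused
    if response.status == 404:
        return None
    if response.status not in (200, 206):
        # Anything else is reported rather than interpreted.
        raise IOError('unexpected status %d' % response.status)
    etag = response.getheader('ETag')
    return etag.strip('"') if etag else None
```

Because this is a GET, the 404 body is read as a normal response and the chunked read no longer hangs.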