A newbie here. I was wondering why the following fails on Python 2.6.2 (r262:71605) on win32. Am I doing something inappropriate?
Interestingly, it works in 3.1, but I would like to get it working in 2.6 as well.
Thanks in advance,
--Matt
import io
import shutil
import tempfile
import zipfile

with tempfile.TemporaryFile() as f:
    # (Real code retrieves archive via urllib2.urlopen().)
    zip = zipfile.ZipFile(f, mode='w')
    zip.writestr('unknowndir/src.txt', 'Hello, world!')
    zip.close()

    # (Pretend we just downloaded the zip file.)
    f.seek(0)

    # Result of urlopen() is not seekable, but ZipFile requires a
    # seekable file. Work around this by copying the file into a
    # memory stream.
    with io.BytesIO() as memio:
        shutil.copyfileobj(f, memio)
        zip = zipfile.ZipFile(file=memio)
        # Can't use zip.extract(), because I want to ignore paths
        # within archive.
        src = zip.open('unknowndir/src.txt')
        with open('dst.txt', mode='wb') as dst:
            shutil.copyfileobj(src, dst)
The last line raises an error:
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    shutil.copyfileobj(src, dst)
  File "C:\Python26\lib\shutil.py", line 27, in copyfileobj
    buf = fsrc.read(length)
  File "C:\Python26\lib\zipfile.py", line 594, in read
    bytes = self.fileobj.read(bytesToRead)
TypeError: integer argument expected, got 'long'
>     with io.BytesIO() as memio:
>         shutil.copyfileobj(f, memio)
>         zip = zipfile.ZipFile(file=memio)
>         # Can't use zip.extract(), because I want to ignore paths
>         # within archive.
>         src = zip.open('unknowndir/src.txt')
>         with open('dst.txt', mode='wb') as dst:
>             shutil.copyfileobj(src, dst)
>
>
> The last line throws an Error:
>
>
> Traceback (most recent call last):
>   File "test.py", line 25, in <module>
>     shutil.copyfileobj(src, dst)
>   File "C:\Python26\lib\shutil.py", line 27, in copyfileobj
>     buf = fsrc.read(length)
>   File "C:\Python26\lib\zipfile.py", line 594, in read
>     bytes = self.fileobj.read(bytesToRead)
> TypeError: integer argument expected, got 'long'
Try adding a length parameter to the copyfileobj call, so the copy is done
in small enough chunks.
--
Gabriel Genellina
Hmmm... I tried a variety of lengths (512, 1024, etc.) with no luck. Maybe this is a good opportunity for me to learn some Python debugging tools.
Thanks!
--Matt
It should hopefully work if you use cStringIO/StringIO instead of
BytesIO.
I think the issue is essentially that StringIO.read() will accept a
long object, while the backport of BytesIO to 2.6 does an explicit
check for int:
py> StringIO.StringIO("foo").read(long(1))
'f'
py> io.BytesIO("foo").read(long(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: integer argument expected, got 'long'
Should this be amended? Perhaps someone on core can consider it.
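For what it's worth, here is a rough sketch of how the StringIO workaround might be applied to the original script. It is only an illustration: the names MemoryFile and copy_member_flat are made up for this example, and the try/except import is just one way to let the same file run on both 2.x and 3.x.

```python
import shutil
import zipfile

try:
    from StringIO import StringIO as MemoryFile  # Python 2: read() accepts a long
except ImportError:
    from io import BytesIO as MemoryFile  # Python 3: no separate long type


def copy_member_flat(source, member_name, dst_path):
    """Copy one archive member to dst_path, ignoring its path in the archive.

    `source` is any readable binary file-like object; it does not have to
    be seekable (e.g. the result of urllib2.urlopen()).
    copy_member_flat is a hypothetical helper name, not a zipfile API.
    """
    memio = MemoryFile()
    shutil.copyfileobj(source, memio)  # buffer the whole archive in memory
    archive = zipfile.ZipFile(memio)   # ZipFile only needs a seekable object
    try:
        src = archive.open(member_name)
        with open(dst_path, 'wb') as dst:
            shutil.copyfileobj(src, dst)
    finally:
        archive.close()
```

Usage would be something like copy_member_flat(f, 'unknowndir/src.txt', 'dst.txt'). Note that this holds the entire archive in memory, so it is only reasonable for downloads of modest size.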
As for why the bytesToRead calculation in ZipExtFile.read() results in
a long, I've not yet looked at it closely.
Simple, actually:
In ZipExtFile.__init__():
    self.bytes_read = 0L
In ZipExtFile.read():
    bytesToRead = self.compress_size - self.bytes_read
13 - 0L == 13L
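The promotion above can be checked outside of zipfile with a small script. This is only a sketch: the variable names mirror the ones quoted from ZipExtFile but are otherwise illustrative, and the version check is there so the same file also runs on 3.x, where int and long are a single type (which would be consistent with the original code working on 3.1).

```python
import sys

# Python 2 has a separate `long` type; Python 3 folded it into `int`.
if sys.version_info[0] == 2:
    bytes_read = long(0)  # same value as the literal 0L
else:
    bytes_read = 0

compress_size = 13  # length of 'Hello, world!', as in the example above
bytes_to_read = compress_size - bytes_read

# Prints 'long' on Python 2 and 'int' on Python 3.
print(type(bytes_to_read).__name__)

# int() turns a small long back into a plain int on Python 2.
as_int = int(bytes_to_read)
print(type(as_int).__name__)
```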
It does! Excellent! You've saved me the trouble of a weekend debug session.
--Matt