Copying a ZipExtFile

Moore, Mathew L

Oct 23, 2009, 1:15:33 PM

to pytho...@python.org

Hello all,

A newbie here. I was wondering why the following fails on Python 2.6.2 (r262:71605) on win32. Am I doing something inappropriate?

Interestingly, it works in 3.1, but I would also like to get it working in 2.6.

Thanks in advance,
--Matt

import io
import shutil
import tempfile
import zipfile

with tempfile.TemporaryFile() as f:
    # (Real code retrieves archive via urllib2.urlopen().)
    zip = zipfile.ZipFile(f, mode='w')
    zip.writestr('unknowndir/src.txt', 'Hello, world!')
    zip.close()

    # (Pretend we just downloaded the zip file.)
    f.seek(0)

    # Result of urlopen() is not seekable, but ZipFile requires a
    # seekable file. Work around this by copying the file into a
    # memory stream.
    with io.BytesIO() as memio:
        shutil.copyfileobj(f, memio)
        zip = zipfile.ZipFile(file=memio)
        # Can't use zip.extract(), because I want to ignore paths
        # within archive.
        src = zip.open('unknowndir/src.txt')
        with open('dst.txt', mode='wb') as dst:
            shutil.copyfileobj(src, dst)

The last line throws an Error:

Traceback (most recent call last):
  File "test.py", line 25, in <module>
    shutil.copyfileobj(src, dst)
  File "C:\Python26\lib\shutil.py", line 27, in copyfileobj
    buf = fsrc.read(length)
  File "C:\Python26\lib\zipfile.py", line 594, in read
    bytes = self.fileobj.read(bytesToRead)
TypeError: integer argument expected, got 'long'

Gabriel Genellina

Oct 24, 2009, 1:49:28 AM

to pytho...@python.org

On Fri, 23 Oct 2009 14:15:33 -0300, Moore, Mathew L <Moo...@battelle.org>
wrote:

> with io.BytesIO() as memio:
>     shutil.copyfileobj(f, memio)
>     zip = zipfile.ZipFile(file=memio)
>     # Can't use zip.extract(), because I want to ignore paths
>     # within archive.
>     src = zip.open('unknowndir/src.txt')
>     with open('dst.txt', mode='wb') as dst:
>         shutil.copyfileobj(src, dst)
>
> The last line throws an Error:
>
> Traceback (most recent call last):
>   File "test.py", line 25, in <module>
>     shutil.copyfileobj(src, dst)
>   File "C:\Python26\lib\shutil.py", line 27, in copyfileobj
>     buf = fsrc.read(length)
>   File "C:\Python26\lib\zipfile.py", line 594, in read
>     bytes = self.fileobj.read(bytesToRead)
> TypeError: integer argument expected, got 'long'

Try adding a length parameter to the copyfileobj call, so the copy is done
in small enough chunks.
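
For illustration, a chunked copy might look like the sketch below. The 16 KB chunk size and the in-memory stand-ins for the archive member and the destination file are assumptions made only to keep the example self-contained. Whether a smaller chunk actually avoids the error depends on where the long value originates.

```python
import io
import shutil

# Stand-ins for the zip member (src) and the output file (dst);
# these are illustrative, not the objects from the original script.
src = io.BytesIO(b"Hello, world!" * 1000)
dst = io.BytesIO()

# The third argument is the buffer size: copyfileobj() calls
# src.read(length) repeatedly until the source is exhausted.
CHUNK_SIZE = 16 * 1024  # arbitrary value chosen for this sketch
shutil.copyfileobj(src, dst, CHUNK_SIZE)

copied = dst.getvalue()
```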

--
Gabriel Genellina

Moore, Mathew L

Oct 26, 2009, 10:36:14 AM

to pytho...@python.org

> On Fri, 23 Oct 2009 14:15:33 -0300, Moore, Mathew L
> <Moo...@battelle.org> wrote:

>
> > with io.BytesIO() as memio:
> >     shutil.copyfileobj(f, memio)
> >     zip = zipfile.ZipFile(file=memio)
> >     # Can't use zip.extract(), because I want to ignore paths
> >     # within archive.
> >     src = zip.open('unknowndir/src.txt')
> >     with open('dst.txt', mode='wb') as dst:
> >         shutil.copyfileobj(src, dst)
> >
> > The last line throws an Error:
> >
> > Traceback (most recent call last):
> >   File "test.py", line 25, in <module>
> >     shutil.copyfileobj(src, dst)
> >   File "C:\Python26\lib\shutil.py", line 27, in copyfileobj
> >     buf = fsrc.read(length)
> >   File "C:\Python26\lib\zipfile.py", line 594, in read
> >     bytes = self.fileobj.read(bytesToRead)
> > TypeError: integer argument expected, got 'long'
>
> Try adding a length parameter to the copyfileobj call, so the copy is
> done in small enough chunks.
>

Hmmm... I tried a variety of lengths (512, 1024, etc.) with no luck. Maybe this is a good opportunity for me to learn some Python debugging tools.

Thanks!
--Matt

ryles

Oct 28, 2009, 8:33:10 PM

It should hopefully work if you use cStringIO/StringIO instead of
BytesIO.
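
As a rough sketch of that swap (not tested against the original script), the in-memory buffer could be chosen as below. The alias `MemoryFile` and the in-memory `dst` are names made up for this example, and the fallback to `io.BytesIO` is only there so the snippet also runs on Python 3, where cStringIO does not exist.

```python
import shutil
import zipfile

try:
    # Python 2: cStringIO accepts long read sizes.
    from cStringIO import StringIO as MemoryFile
except ImportError:
    # Python 3: no cStringIO module; BytesIO is the byte buffer there.
    from io import BytesIO as MemoryFile

# Build a small archive in memory (stand-in for the downloaded file).
archive = MemoryFile()
zf = zipfile.ZipFile(archive, mode='w')
zf.writestr('unknowndir/src.txt', b'Hello, world!')
zf.close()
archive.seek(0)

# Copy the (pretend non-seekable) download into a seekable buffer.
memio = MemoryFile()
shutil.copyfileobj(archive, memio)

# Open the member by its archive path and copy it out, ignoring the path.
zf = zipfile.ZipFile(memio)
src = zf.open('unknowndir/src.txt')
dst = MemoryFile()  # the real script would use open('dst.txt', 'wb')
shutil.copyfileobj(src, dst)

extracted = dst.getvalue()
```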

I think the issue is essentially that StringIO.read() will accept a
long object, while the backport of BytesIO to 2.6 does an explicit
check for int:

py> StringIO.StringIO("foo").read(long(1))
'f'

py> io.BytesIO("foo").read(long(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: integer argument expected, got 'long'

Should this be amended? Perhaps someone on core can consider it.

As for why the bytesToRead calculation in ZipExtFile.read() results in
a long, I've not yet looked at it closely.

ryles

Oct 29, 2009, 2:27:18 AM

On Oct 28, 8:33 pm, ryles <ryle...@gmail.com> wrote:
> As for why the bytesToRead calculation in ZipExtFile.read() results in
> a long, I've not yet looked at it closely.

Simple, actually:

In ZipExtFile.__init__():

    self.bytes_read = 0L

In ZipExtFile.read():

    bytesToRead = self.compress_size - self.bytes_read

An int minus a long is a long:

    13 - 0L == 13L
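
If switching buffer types is not an option, one possible workaround is to coerce the size back to an int before it reaches BytesIO. The sketch below is only an illustration: `IntCoercingBytesIO` is a hypothetical name, and it has not been verified here that the 2.6 backport of BytesIO can be subclassed this way.

```python
import io


class IntCoercingBytesIO(io.BytesIO):
    """Hypothetical BytesIO subclass that converts read sizes to int."""

    def read(self, size=-1):
        # int() turns a small Python 2 long such as 13L into a plain int;
        # on Python 3 there is only one integer type, so this is a no-op.
        if size is not None:
            size = int(size)
        return io.BytesIO.read(self, size)


buf = IntCoercingBytesIO(b"Hello, world!")
first = buf.read(5)
rest = buf.read()
```

An instance of such a class would be passed to zipfile.ZipFile in place of the plain io.BytesIO object.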

Moore, Mathew L

Oct 29, 2009, 9:54:20 AM

to ryles, pytho...@python.org

> On October 28, 2009 8:33 PM, "ryles" wrote:
>
<snip>

> > with io.BytesIO() as memio:
> >     shutil.copyfileobj(f, memio)
> >     zip = zipfile.ZipFile(file=memio)
> >     # Can't use zip.extract(), because I want to ignore paths
> >     # within archive.
> >     src = zip.open('unknowndir/src.txt')
> >     with open('dst.txt', mode='wb') as dst:
> >         shutil.copyfileobj(src, dst)
> >
> > The last line throws an Error:
> >
> > Traceback (most recent call last):
> >   File "test.py", line 25, in <module>
> >     shutil.copyfileobj(src, dst)
> >   File "C:\Python26\lib\shutil.py", line 27, in copyfileobj
> >     buf = fsrc.read(length)
> >   File "C:\Python26\lib\zipfile.py", line 594, in read
> >     bytes = self.fileobj.read(bytesToRead)
> > TypeError: integer argument expected, got 'long'
>
> It should hopefully work if you use cStringIO/StringIO instead of
> BytesIO.
>

It does! Excellent! You've saved me the trouble of a weekend debug session.

--Matt