MNIST dataset - gzip: train-images-idx3-ubyte.gz: not in gzip format


gre...@eng.ucsd.edu

Dec 26, 2017, 12:21:45 PM
to Discuss
I'm trying to use:

from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)

but I am getting the error that the downloaded file is not in GZip format.

There are several bug reports on this, but none of them seem to solve the issue. 

I found train-images-idx3-ubyte.gz in the local directory MNIST_data, but even trying `gunzip` fails:

```
(tf) [bduser@param03 MNIST_data]$ gunzip
gzip: compressed data not read from a terminal. Use -f to force decompression.
For help, type: gzip -h
(tf) [bduser@param03 MNIST_data]$ gunzip train-images-idx3-ubyte.gz
gzip: train-images-idx3-ubyte.gz: not in gzip format
```

I suspect this is a problem with the original data file. Maybe the gzip format has changed, or perhaps my OS's version of gzip can't read this particular file?
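One way to tell whether the file on disk is gzip data at all, independent of the local `gunzip` build, is to look at its first two bytes: every gzip stream starts with 0x1f 0x8b (RFC 1952). The following is only a minimal sketch; the helper name `looks_like_gzip` is hypothetical and not part of TensorFlow.

```python
# Two-byte magic number that begins every gzip member (RFC 1952: ID1, ID2).
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number.

    This only inspects the header; it does not prove that the rest of
    the file is intact.
    """
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

If `looks_like_gzip('MNIST_data/train-images-idx3-ubyte.gz')` returns False, the content of the file is the likely culprit rather than the gzip version shipped with the OS.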

I am using TF 1.4 on CentOS Linux release 7.4.1708 (Core)

Thanks.
-Tony




Andrew Selle

Dec 27, 2017, 1:46:04 PM
to gre...@eng.ucsd.edu, Discuss
This seems to work for me, but please file a bug if it still doesn't work for you (and link to the bugs that you found). Try looking for text in the files it did create to see whether they contain an error message.
-A
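The suggestion to look for text inside the downloaded files could be sketched roughly as below. This assumes the bad file is small enough that its first couple of hundred bytes are informative; `describe_download` and its `preview_bytes` parameter are hypothetical names chosen for illustration.

```python
import os


def describe_download(path, preview_bytes=200):
    """Summarize a downloaded file: its size and, if it is not gzip,
    the start of its content decoded as text."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(preview_bytes)
    # gzip data always begins with the bytes 0x1f 0x8b.
    if head[:2] == b"\x1f\x8b":
        return "%s: %d bytes, gzip header present" % (path, size)
    # Undecodable bytes are replaced so the preview never raises.
    text = head.decode("utf-8", errors="replace")
    return "%s: %d bytes, not gzip; starts with: %r" % (path, size, text)
```

Printing the result for each file in `MNIST_data` should show whether a file holds an error page or message instead of the dataset.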

--
You received this message because you are subscribed to the Google Groups "Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to discuss+u...@tensorflow.org.
To post to this group, send email to dis...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/discuss/4dd4c2b5-2872-4253-9ebe-be0c3d55f1c7%40tensorflow.org.

G Reina

Dec 27, 2017, 3:54:56 PM
to Andrew Selle, Discuss
I am guessing it might be something specific to CentOS. The download reports success, but gunzip tells me the file is corrupted. Is anyone else able to test on CentOS?

Thanks.
-Tony


>>> from tensorflow.examples.tutorials.mnist import input_data
>>> mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
Successfully downloaded train-images-idx3-ubyte.gz 727 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bduser/miniconda2/envs/tf/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py", line 242, in read_data_sets
    train_images = extract_images(f)
  File "/home/bduser/miniconda2/envs/tf/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py", line 56, in extract_images
    magic = _read32(bytestream)
  File "/home/bduser/miniconda2/envs/tf/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py", line 38, in _read32
    return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
  File "/home/bduser/miniconda2/envs/tf/lib/python2.7/gzip.py", line 268, in read
    self._read(readsize)
  File "/home/bduser/miniconda2/envs/tf/lib/python2.7/gzip.py", line 303, in _read
    self._read_gzip_header()
  File "/home/bduser/miniconda2/envs/tf/lib/python2.7/gzip.py", line 197, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
>>> exit()
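The "727 bytes" in the log above suggests that what was saved is a short non-gzip response rather than the image archive. A rough sketch for clearing such files out of the cache directory follows. It assumes that `read_data_sets` downloads a file only when it is missing locally, so deleting the bad copy lets the next call fetch it again; the helper name `remove_invalid_gzip_files` is hypothetical.

```python
import gzip
import os


def remove_invalid_gzip_files(directory):
    """Delete every *.gz file in `directory` whose first 4 decompressed
    bytes cannot be read, and return the names of the deleted files."""
    removed = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".gz"):
            continue
        path = os.path.join(directory, name)
        try:
            with gzip.open(path, "rb") as f:
                # Same first read that fails in the traceback above:
                # the 4-byte magic number of the IDX file.
                valid = len(f.read(4)) == 4
        except (IOError, OSError, EOFError):
            valid = False
        if not valid:
            os.remove(path)
            removed.append(name)
    return removed
```

Calling `remove_invalid_gzip_files('MNIST_data')` and then re-running `input_data.read_data_sets('MNIST_data', one_hot=True)` only helps if the underlying download problem has been resolved; otherwise the same short file will be written again.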



Andrew Selle

Dec 27, 2017, 4:02:59 PM
to G Reina, Discuss
This is not the right forum for these kinds of queries. I've opened an issue on your behalf here:
I also added a comment that details why your file was not downloaded correctly. If you want to comment or receive notifications, use your GitHub account. Thank you.

Thanks,
-A
