Zlib::GzipReader doesn't work as expected

Thomas Wolf

Apr 25, 2012, 4:57:30 AM
Hi,
given 2 files:
cat 5lines.txt
5 lines
5 lines
5 lines
5 lines
5 lines

cat more5lines.txt
More 5 lines
More 5 lines
More 5 lines
More 5 lines
More 5 lines

These files are "gzip"ed as follows:
gzip < 5lines.txt > foo.gz
gzip < more5lines.txt >> foo.gz

zcat foo.gz:
5 lines
5 lines
5 lines
5 lines
5 lines
More 5 lines
More 5 lines
More 5 lines
More 5 lines
More 5 lines
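
For readers without the two text files, the same situation can be reproduced in memory. This is only a sketch: `gzip_member` is a hypothetical helper name (not from the thread), and the data is built with `Zlib::GzipWriter` rather than the `gzip` command. It shows that two gzip members written back to back form a valid file, and that a single `Zlib::GzipReader` returns only the first one.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: compress +text+ into one complete gzip member
# (header, deflate data, trailer).
def gzip_member(text)
  buffer = StringIO.new(''.b)            # binary in-memory target
  writer = Zlib::GzipWriter.new(buffer)
  writer.write(text)
  writer.close                           # flushes data and writes the trailer
  buffer.string
end

# Same layout as foo.gz: two members, one after the other.
data = gzip_member("5 lines\n" * 5) + gzip_member("More 5 lines\n" * 5)

# A single GzipReader stops at the end of the first member.
first_only = Zlib::GzipReader.new(StringIO.new(data)).read
print first_only
```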

This Ruby code reads only the first 5 lines:
#!/usr/bin/ruby
require "zlib"
filename = ARGV[0]

Zlib::GzipReader.open(filename) {|gz|
  print gz.read
}

./test.rb foo.gz
5 lines
5 lines
5 lines
5 lines
5 lines

How do I force Zlib::GzipReader to read the whole file?

Ruby versions: 1.8.7 and 1.9.0

Thanks and regards,
Thomas Wolf

Robert Klemme

Apr 25, 2012, 3:03:44 PM
That's a fairly common limitation of GZip libs (Java's standard lib also
has this limitation, or at least had it the last time I checked).

You might get away with wrapping the GzipReader around an open IO object
and wrapping another GzipReader when the first finishes.

Kind regards

robert

Simon Krahnke

Apr 25, 2012, 3:53:15 PM
* Thomas Wolf <tho...@viacanale.de> (10:57) wrote:

> These files are "gzip"ed as follows:
> gzip < 5lines.txt > foo.gz
> gzip < more5lines.txt >> foo.gz

So you have two streams of gzipped data in foo.gz.

And the ruby library reads only the first one.

> How do I force Zlib::GzipReader to read the whole file?

I don't know, read the source.

Regards, simon .... l

Simon Krahnke

Apr 25, 2012, 3:55:00 PM
* Robert Klemme <short...@googlemail.com> (21:03) wrote:

> You might get away with wrapping the GzipReader around an open IO object
> and wrapping another GzipReader when the first finishes.

Like this:

,----[ gz.rb ]
| #!/usr/bin/env ruby
|
| require 'zlib'
| require 'pp'
|
| filename = *ARGV
|
| File.open filename do | f |
|   gz1 = Zlib::GzipReader.new(f)
|   pp gz1.read
|   pp Zlib::GzipReader.new(f).read
| end
`----

Doesn't work.
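
The post does not say how this failed, so the following is an assumption about one likely cause: `Zlib::GzipReader` pulls data from the underlying IO in chunks, so after the first member is decompressed the IO may already be positioned past the start of the second member. A second reader created on the same IO then begins at the wrong offset. The sketch below makes that visible with an in-memory file; `gzip_member` is a hypothetical helper, and `unused` is the reader method that returns the bytes fetched beyond the current member.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(''.b)
  writer = Zlib::GzipWriter.new(buffer)
  writer.write(text)
  writer.close
  buffer.string
end

first  = gzip_member("5 lines\n" * 5)
second = gzip_member("More 5 lines\n" * 5)
io = StringIO.new(first + second)

gz = Zlib::GzipReader.new(io)
gz.read                    # decompresses the first member only
leftover   = gz.unused     # bytes the reader fetched past that member (or nil)
reader_pos = io.pos        # where a second GzipReader on +io+ would start

puts "first member ends at byte #{first.bytesize}"
puts "underlying IO is at byte  #{reader_pos}"
puts "read-ahead bytes:         #{leftover ? leftover.bytesize : 0}"
```

Subtracting the read-ahead length from the IO position gives the true end of the first member, which is where the next reader has to start.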

Regards, simon .... l

Thomas Wolf

Apr 26, 2012, 5:54:56 AM
On 25.04.2012 21:03, Robert Klemme wrote:
>> How do I force Zlib::GzipReader to read the whole file?
>
> That's a fairly common limitation of GZip libs (Java's standard lib also
> has this limitation, or at least had it the last time I checked).
>
> You might get away with wrapping the GzipReader around an open IO object
> and wrapping another GzipReader when the first finishes.

Thank you.

I found the following thread:
http://www.velocityreviews.com/forums/t866074-zlib-gzipreader-and-multiple-compressed-blobs-in-a-single-stream.html

and the following code from it works with Ruby 1.9.3p0:

require 'stringio'
require 'zlib'

def inflate(filename)
  File.open(filename) do |file|
    zio = file
    loop do
      io = Zlib::GzipReader.new zio
      puts io.read
      unused = io.unused
      io.finish
      break if unused.nil?
      zio.pos -= unused.length
    end
  end
end

inflate "foo.gz"

Regards,
Thomas

Simon Krahnke

Apr 26, 2012, 4:02:07 PM
* Thomas Wolf <tho...@viacanale.de> (11:54) wrote:

> require 'stringio'

This is unneeded.

> require 'zlib'
>
> def inflate(filename)
>   File.open(filename) do |file|
>     zio = file

You could just name the block parameter |zio| instead of |file| and get
rid of the assignment.

>     loop do
>       io = Zlib::GzipReader.new zio
>       puts io.read

puts here appends a "\n" whenever the data does not already end in one,
so it can alter the output; use print instead.

>       unused = io.unused
>       io.finish
>       break if unused.nil?
>       zio.pos -= unused.length
>     end
>   end
> end
>
> inflate "foo.gz"

Note that, as said in that thread, this works only for files and other
seekable sources, because zio.pos has to be moved backwards.

So "(seq 1 5 | gzip; seq 6 10 | gzip) | yourscript.rb" won't work.
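
One way around the seekability requirement, sketched here under the assumption that the whole compressed input fits in memory, is to read the source completely into a StringIO first and run the same unused/rewind loop on that buffer. `inflate_all` is a hypothetical name, not something from the thread or from the zlib library.

```ruby
require 'zlib'
require 'stringio'

# Decompress every gzip member found in +source+ (any object responding to
# #read, seekable or not) and return the concatenated plain text.
# Assumption: the compressed input is small enough to buffer in memory.
def inflate_all(source)
  buffered = StringIO.new(source.read.b)   # seekable copy of the raw bytes
  parts = []
  until buffered.eof?
    gz = Zlib::GzipReader.new(buffered)
    parts << gz.read                       # one member's worth of data
    unused = gz.unused                     # bytes fetched past this member
    gz.finish                              # close the reader, keep the IO open
    break if unused.nil?                   # nothing left after this member
    buffered.pos -= unused.bytesize        # rewind to the next member's start
  end
  parts.join
end
```

A script that ends with `print inflate_all($stdin.binmode)` could then sit at the end of a pipeline such as the one above. Newer Ruby releases also document a `Zlib::GzipReader.zcat` class method for multi-member input; whether it is available depends on the installed version, so check before relying on it.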

Regards, simon .... hth