gzip.Reader.Read does not fill the given buffer

Amit Lavon

Jun 9, 2020, 10:05:38 AM
to golang-nuts
Hi gophers,

Demonstration: https://play.golang.org/p/AY-fVWiOrFd

I am reading raw bytes from a gzip input. I found that it only reads up to chunks of 2^15 even though there is more data to be read.

Is that the intended behavior? I expected whatever internal buffering it may have to be invisible to the caller. If that's intended, how could I have anticipated that? The contract of Reader says "Read conventionally returns what is available instead of waiting for more" but it doesn't seem to be the case here. I call Read expecting to wait for gzip to do its internal processing and make the data available.

Happy to get your input.

Amit

Ronny Bangsund

Jun 9, 2020, 10:13:51 AM
to golang-nuts
On Tuesday, June 9, 2020 at 4:05:38 PM UTC+2, Amit Lavon wrote:
> I am reading raw bytes from a gzip input. I found that it only reads up to chunks of 2^15 even though there is more data to be read.
>
> Is that the intended behavior? I expected whatever internal buffering it may have to be invisible to the caller. If that's intended, how could I have anticipated that? The contract of Reader says "Read conventionally returns what is available instead of waiting for more" but it doesn't seem to be the case here.

We're at the mercy of what the filesystem prefers to give us, and in my experience it likes many small chunks on any OS. I've seen consistent returns as low as 4k from plain file reading, so 32k from gzip decompression seems good. Behind the scenes the OS may already have buffered your entire file if it's not too huge, so I wouldn't worry about it being slower to loop through small buffers repeatedly.

Brian Candler

Jun 9, 2020, 11:10:45 AM
to golang-nuts
There is io.ReadFull if you want to read as much as it can into the preallocated buffer; you'll get io.ErrUnexpectedEOF if it's less than this.

There is ioutil.ReadAll if you want to allocate memory to read the entire stream into a single buffer (assuming you have enough RAM) 
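
A minimal sketch of the difference, assuming an in-memory gzip stream. The helpers gzipBytes and readOnceAndFull are hypothetical names made up for this illustration; they are not from the thread or the linked playground. The sketch calls Read once, then io.ReadFull on a fresh reader with the same buffer size. How many bytes the single Read returns is up to the implementation, so no particular number is claimed here.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipBytes (hypothetical helper) compresses data in memory so the
// example needs no files.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readOnceAndFull (hypothetical helper) decompresses the same input twice:
// once with a single Read call and once with io.ReadFull. It returns the
// number of bytes each approach put into a buffer of bufSize bytes.
func readOnceAndFull(compressed []byte, bufSize int) (int, int, error) {
	buf := make([]byte, bufSize)

	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return 0, 0, err
	}
	// A single Read may return fewer than len(buf) bytes.
	single, err := zr.Read(buf)
	if err != nil && err != io.EOF {
		return 0, 0, err
	}

	zr, err = gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return 0, 0, err
	}
	// io.ReadFull keeps calling Read until buf is full or an error occurs.
	full, err := io.ReadFull(zr, buf)
	if err != nil {
		return single, full, err
	}
	return single, full, nil
}

func main() {
	compressed, err := gzipBytes(bytes.Repeat([]byte("gopher "), 20000))
	if err != nil {
		panic(err)
	}
	single, full, err := readOnceAndFull(compressed, 100000)
	if err != nil {
		panic(err)
	}
	fmt.Printf("one Read call: %d bytes; io.ReadFull: %d bytes\n", single, full)
}
```

If the stream holds fewer bytes than the buffer, io.ReadFull returns io.ErrUnexpectedEOF along with the count it did read (or io.EOF if it read nothing at all).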

Amit Lavon

Jun 9, 2020, 12:53:07 PM
to golang-nuts
Thank you!!
io.ReadFull is just what I needed (and what I actually expected from Reader.Read).
Why would I ever use Reader.Read rather than ReadFull?

--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/u-wNH3NyMHo/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/8ff054df-218c-4d52-8984-83b1c4444ed9o%40googlegroups.com.

Sam Whited

Jun 9, 2020, 1:03:19 PM
to golan...@googlegroups.com
Read lets you build pipelines, and it involves fewer expensive allocations
(i.e., you might not want to use ReadFull in the hot path of an important
project). You could use Read to read into a slice at different points,
or avoid reading the entirety of an expensive document into memory all at
once; you can implement buffering on top of it, etc.

It's probably pretty rare that you actually want to use ReadFull, or at
least, I don't find myself reaching for it very often.
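
As a rough illustration of layering buffering on top of Read without pulling the whole document into memory, the sketch below wraps a gzip reader in a bufio.Scanner and stops after a few lines. The helpers gzipString and firstLines are hypothetical names invented for this example, and the in-memory input is an assumption made to keep it self-contained.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipString (hypothetical helper) compresses s in memory.
func gzipString(s string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// firstLines (hypothetical helper) returns at most max lines from a gzip
// stream. The Scanner buffers on top of Read and pulls data only as needed,
// so the rest of the stream is never decompressed into memory.
func firstLines(compressed []byte, max int) ([]string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	sc := bufio.NewScanner(zr)
	var lines []string
	for len(lines) < max && sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, sc.Err()
}

func main() {
	compressed, err := gzipString("one\ntwo\nthree\nfour\n")
	if err != nil {
		panic(err)
	}
	lines, err := firstLines(compressed, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines)
}
```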

—Sam


On Tue, Jun 9, 2020, at 12:51, Amit Lavon wrote:
> Thank you!! io.ReadFull is just what I needed (and what I actually
> expected from Reader.Read). Why would I ever use Reader.Read rather
> than ReadFull?
>
> On Tue, Jun 9, 2020 at 6:11 PM Brian Candler
> <b.ca...@pobox.com> wrote:
> > There is io.ReadFull <https://golang.org/pkg/io/#ReadFull> if you
> > want to read as much as it can into the preallocated buffer; you'll
> > get io.ErrUnexpectedEOF if it's less than this.
> > https://play.golang.org/p/gIAX046vNvW
> >
> > There is ioutil.ReadAll <https://golang.org/pkg/io/ioutil/#ReadAll>
> > if you want to allocate memory to read the entire stream into a
> > single buffer (assuming you have enough RAM)
> > https://play.golang.org/p/Pg17S6A74SN

--
Sam Whited

Brian Candler

Jun 9, 2020, 4:39:52 PM
to golang-nuts
On Tuesday, 9 June 2020 17:53:07 UTC+1, Amit Lavon wrote:
> Why would I ever use Reader.Read rather than ReadFull?


Because it's often more efficient to process data in the "natural" chunks it comes in, rather than packing it out to fit a particular buffer size - in which case the final read may be split.

For example, with HTTP chunked encoding, I'd expect each Read to give me one chunk - as long as it fits in the provided buffer, of course.

Amit Lavon

Jun 9, 2020, 5:27:59 PM
to golang-nuts
Interesting points. So I guess ReadFull can be suitable for the consuming end of those bytes.

Thank you!


Ronny Bangsund

Jun 10, 2020, 10:44:46 AM
to golang-nuts
Reading in small chunks is handy for progress displays too.
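
A sketch of how a progress display might hook into chunked reads. The progressReader type and copyWithProgress function are hypothetical, written only for this illustration; they are not standard library types. The wrapper reports the running total after each Read that returns data.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// progressReader is a hypothetical wrapper (not a standard library type)
// that reports the running byte total after every Read that yields data.
type progressReader struct {
	r      io.Reader
	total  int64
	report func(total int64)
}

func (p *progressReader) Read(b []byte) (int, error) {
	n, err := p.r.Read(b)
	if n > 0 {
		p.total += int64(n)
		p.report(p.total)
	}
	return n, err
}

// copyWithProgress (hypothetical helper) drains src through a
// progressReader using a buffer of bufSize bytes. It returns the total
// byte count and the number of progress reports that were made.
func copyWithProgress(src string, bufSize int) (int64, int, error) {
	if bufSize <= 0 {
		return 0, 0, fmt.Errorf("bufSize must be positive, got %d", bufSize)
	}
	reports := 0
	pr := &progressReader{
		r:      strings.NewReader(src),
		report: func(int64) { reports++ },
	}
	buf := make([]byte, bufSize)
	for {
		_, err := pr.Read(buf)
		if err == io.EOF {
			return pr.total, reports, nil
		}
		if err != nil {
			return pr.total, reports, err
		}
	}
}

func main() {
	total, reports, err := copyWithProgress("0123456789", 4)
	if err != nil {
		panic(err)
	}
	fmt.Printf("read %d bytes, reported progress %d times\n", total, reports)
}
```

A smaller buffer gives more frequent updates at the cost of more Read calls.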

Michael Jones

Jun 10, 2020, 2:27:16 PM
to Ronny Bangsund, golang-nuts
Philosophy, this is one of the Go library’s most beautiful designs. It allows the one in charge—the program you write—to say what should happen, while allowing everything subordinate to be guided on how it should happen. 

This allows code to be adaptive to device optimal read sizes, network happenstance, the size of various buffers in code that you can’t see, etc. 

Maximum flexibility, simple formulaic coding:

While not done, Use this, Ask for more.  
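
The "while not done, use this, ask for more" formula could be sketched roughly as below. The functions countLines and countLinesIn are hypothetical names chosen for this illustration, and testing/iotest.OneByteReader is used only to simulate a source that hands back tiny chunks. Note that the bytes returned are used before the error is inspected, since a Read may return data together with an error.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// countLines (hypothetical helper) reads r to the end in whatever chunks
// Read returns. It returns the number of newline bytes seen and the number
// of Read calls that yielded data.
func countLines(r io.Reader, buf []byte) (int, int, error) {
	if len(buf) == 0 {
		return 0, 0, fmt.Errorf("buffer must not be empty")
	}
	lines, chunks := 0, 0
	for {
		n, err := r.Read(buf)
		if n > 0 {
			// Use this: process the bytes before looking at err.
			lines += bytes.Count(buf[:n], []byte{'\n'})
			chunks++
		}
		if err == io.EOF {
			return lines, chunks, nil // done
		}
		if err != nil {
			return lines, chunks, err
		}
		// Otherwise loop around and ask for more.
	}
}

// countLinesIn (hypothetical helper) runs countLines over a string,
// optionally forcing the source to return one byte per Read.
func countLinesIn(s string, bufSize int, oneByte bool) (int, int, error) {
	var r io.Reader = strings.NewReader(s)
	if oneByte {
		r = iotest.OneByteReader(r)
	}
	return countLines(r, make([]byte, bufSize))
}

func main() {
	for _, oneByte := range []bool{false, true} {
		lines, chunks, err := countLinesIn("a\nb\nc\n", 4, oneByte)
		if err != nil {
			panic(err)
		}
		fmt.Printf("oneByte=%v: %d lines in %d chunks\n", oneByte, lines, chunks)
	}
}
```

The line count is the same either way; only the number of chunks changes, which is the point of letting the source decide how much each Read delivers.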

On Wed, Jun 10, 2020 at 7:45 AM Ronny Bangsund <ronny.b...@gmail.com> wrote:
> Reading in small chunks is handy for progress displays too.

--
Michael T. Jones
michae...@gmail.com

Michael Jones

Jun 10, 2020, 2:37:02 PM
to Ronny Bangsund, golang-nuts
Philosophically 