io.Reader contract

Florian Weimer

Aug 25, 2011, 3:40:25 PM
to golan...@googlegroups.com
The io.Reader contract is somewhat non-standard (Read() can return
both data and an error), so I looked at callers which assume the
standard behavior (an error implies that no bytes were read). Here's
what I found:

bufio.Reader.Read() does not implement its documented interface if the
underlying Reader returns with n != 0, err != nil. Or at least, the
description is quite unclear.

debug/elf.NewFile() and debug/pe.NewFile() slightly misuse
io.ReaderAt. This is harmless because the input files should
extend beyond the read attempt.

fmt.readRune.readByte() has an interface which needs error buffering.
I think the current version is correct only with Readers which follow
the standard contract.

gob.decodeUintReader() incorrectly assumes the standard behavior from
an io.Reader.

html.Tokenizer.readByte() misuses the io.Reader interface because it
discards the count on error.

http.chunkedReader.Read() does not perform the CRLF check on EOF.
This could be an actual bug.

image/png.parseIDAT() assumes that if Reader.Read() returns the length
of the input buffer, there will be no error.

mail.qDecoder.Read() assumes the standard Reader.Read contract and
appears to be buggy.

src/pkg/net/dnsclient_unix.go:exchange() is a borderline case. The
Read() function called there is implemented in the same package and
seems to guarantee the standard contract.

src/pkg/os/sys_linux.go:Hostname() contains another borderline misuse.

src/cmd/godoc/utils.go:isTextFile() looks buggy. Small files may not
be flagged as text.


In Effective Go, this code snippet is wrong---the error test should
happen after incrementing n, and checking nbytes against 0 is
superfluous:

var n int
var err os.Error
for i := 0; i < 32; i++ {
    nbytes, e := f.Read(buf[i:i+1]) // Read one byte.
    if nbytes == 0 || e != nil {
        err = e
        break
    }
    n += nbytes
}


All in all, the list of issues is shorter than I expected. So maybe
the current non-standard contract is quite reasonable. I haven't
checked for loops which assume that the os.EOF error is sticky
(mime/multipart.Part.Read() contains an example), but I suspect that
there are many, so the "The next Read should return 0, os.EOF
regardless." recommendation of io.Reader should be a must.

I also discovered some unrelated strangeness. bufio.Reader.Peek()
clears the error as a side effect (and reads more data, which is a bit
surprising but makes sense). It doesn't seem to make good on its
promise to return "an error explaining why the read is short".

bflm

Aug 25, 2011, 4:43:56 PM
to golan...@googlegroups.com
On Thursday, August 25, 2011 9:40:25 PM UTC+2, Florian Weimer wrote:

I wonder what the 'non standard contract' and/or 'standard contract' is, as neither of them seems to be really defined in the post. Which criterion/authority/... defines what a standard/non standard contract in Go is?

j_al...@rocketmail.com

Aug 25, 2011, 4:56:43 PM
to golan...@googlegroups.com
> Which criterion/authority/... defines what a
> standard/non standard contract in Go is?

The contract is specified here: http://golang.org/pkg/io/#Reader

Florian Weimer

Aug 25, 2011, 5:04:46 PM
to golan...@googlegroups.com
* bflm:

Go's contract is unexpected if you have used POSIX, (parts of) the
Windows API, Java, Perl, and many other systems. Unexpected interface
properties lead to programmer errors. It's not always bad to break
with tradition, but this difference is rather subtle, and I'm not sure
if there is much benefit.

Paul Borman

Aug 25, 2011, 5:15:46 PM
to Florian Weimer, golan...@googlegroups.com
POSIX returns a single value.  Go's Read returns 2 values.  Clearly you cannot expect Go's Read to be POSIX.  As a very long time UNIX and POSIX programmer, I see Go's Read as doing what I would expect it should do.

Florian Weimer

Aug 25, 2011, 5:32:21 PM
to Paul Borman, golan...@googlegroups.com
* Paul Borman:

> POSIX returns a single value. Go's Read returns 2 values. Clearly you
> cannot expect Go's Read to be POSIX.

POSIX has the explicit return value and errno. Two values as well. 8-)

Russ Cox

Aug 25, 2011, 5:38:36 PM
to Florian Weimer, golan...@googlegroups.com
Thanks for looking at these.
Please feel free to send CLs fixing these issues.

The current contract is what it is because
sometimes you do get some data and then
an error. It's easy to handle correctly:

n, err := f.Read(buf)
process(buf[:n])
if err != nil {
    ...
}

Thanks.
Russ

Paul Borman

Aug 25, 2011, 5:57:26 PM
to Florian Weimer, golan...@googlegroups.com
Sorry, but that is not correct.

Upon successful completion, read() and pread() shall return a non-negative integer indicating the number of bytes actually read.  Otherwise, the functions shall return -1 and set errno to indicate the error.

Errno only has meaning if read returns -1.  Go is different in that you can get both data *and* an error.  POSIX does not handle this case.  This is not to say there are not some inconsistencies as you pointed out, just that I don't think there is an issue with the semantics of Go's Read.  You can't write a POSIX compliant program in Go.  There is no POSIX defined binding for Go.

Florian Weimer

Aug 26, 2011, 5:04:36 PM
to Paul Borman, golan...@googlegroups.com
* Paul Borman:

> Sorry, but that is not correct.
>
> Upon successful completion, read() and pread() shall return a
> non-negative integer indicating the number of bytes actually read.
> Otherwise, the functions shall return -1 and set errno to indicate the
> error.
>

> Errno only has meaning if read returns -1. Go is different in that you can
> get both data *and* an error. POSIX does not handle this case.

Correct, it's just that with errno, C programs could do the same, but
they don't. I suspected that programmers assume that Go follows
similar semantics (of an implied Either type), but I'm no longer sure
it's due to exposure to other APIs.

But curiously, the correct way to deal with Read is more similar to
the POSIX interface than the rest of Go: you need to look at the
returned length first, before checking the error value. The reasons
are very different, but the coding pattern is similar. On the other
hand, in most other parts of Go, the error checking idiom goes like
this:

result, err := Func()
if err != nil {
    return
}
// process result

While for Read, it's more like this:

result, err := Func()
// process result
if err != nil {
    return
}

This does seem to cause some confusion, and could be avoided if the
Read contract specified that it's either data or error.

Ian Lance Taylor

Aug 26, 2011, 5:38:45 PM
to Florian Weimer, Paul Borman, golan...@googlegroups.com
Florian Weimer <f...@deneb.enyo.de> writes:

> * Paul Borman:
>
>> Errno only has meaning if read returns -1. Go is different in that you can
>> get both data *and* an error. POSIX does not handle this case.
>
> Correct, it's just that with errno, C programs could do the same, but
> they don't.

To be pedantic, no, they couldn't. C library functions may change
errno, but they may never set it to 0. You can never tell whether a
function succeeded by looking only at errno.

Ian

Florian Weimer

Aug 27, 2011, 10:07:42 AM
to Ian Lance Taylor, Paul Borman, golan...@googlegroups.com
* Ian Lance Taylor:

> Florian Weimer <f...@deneb.enyo.de> writes:
>
>> * Paul Borman:
>>
>>> Errno only has meaning if read returns -1. Go is different in that you can
>>> get both data *and* an error. POSIX does not handle this case.
>>
>> Correct, it's just that with errno, C programs could do the same, but
>> they don't.
>
> To be pedantic, no, they couldn't. C library functions may change
> errno, but they may never set it to 0.

I'm sure some libraries do this. Even in the C library, there are a
couple of functions where a particular error state can only be
recognized if you set errno to 0 before the call (e.g., strtold).

> You can never tell whether a function succeeded by looking only at
> errno.

Correct, my comparison with C was a bit off. Read is more like C than
the rest of Go: For Read and C, you check the returned value first,
and then look at errno. For most other Go functions, you check the
error return first, and then the actual result. So my speculation
about the cause for the confusion was wrong, but I still think it's
there (and could perhaps be avoided by changing the contract for
Read).

tiny_dust

Aug 27, 2011, 10:36:51 AM
to golang-nuts
+1

it seems a good idea.

we can see 'n' as a hint. Reader returns non-empty bytes iff err != nil