bufio Scanner error handling


Tom Scrace

Mar 6, 2016, 11:48:36 PM
to golang-nuts
Hi,

Go beginner here.

I am curious about the design of the bufio.Scanner with regard to error handling.

Scanner lets you read lines of text like this:

stdinscanner := bufio.NewScanner(os.Stdin)
for stdinscanner.Scan() {
    scannedline := stdinscanner.Text()
    // do something with scannedline
}


Successive calls to the scanner's Scan() method return true (if there is a line to read) or false (if not) while advancing the scanner to the next line, which is made available by calling the Text() method. When all the lines are exhausted, Scan() returns false, thus ending the loop.

If the scanner hits an error, Scan() also returns false. In order to get the error, or even know that an error has happened, we must call the scanner's Err() method.

So, after we exit the loop, we have either hit the end of the input or encountered an error. In order to tell the difference we must remember to call Err(), lest we march ahead under the illusion that we have successfully scanned all the input:


stdinscanner := bufio.NewScanner(os.Stdin)
for stdinscanner.Scan() {
    scannedline := stdinscanner.Text()
    // do something with scannedline
}

// Err returns nil if the loop ended at end of input
err := stdinscanner.Err()
if err != nil {
    // handle error
}
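To make the failure mode concrete, here is a minimal sketch of a scan that stops early. The helper name countLines and the tiny buffer limit are assumptions chosen for illustration; the limit simply forces an over-long line to produce an error. The loop ends the same way in both cases, and only Err() reveals which one happened.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// countLines (hypothetical helper) scans input line by line and returns the
// number of lines read plus whatever error stopped the scan. The error is nil
// when the scan ended at end of input. maxLine caps the token size so that an
// over-long line makes Scan return false with an error.
func countLines(input string, maxLine int) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Buffer(make([]byte, 0, maxLine), maxLine)
	n := 0
	for sc.Scan() {
		n++
	}
	// Scan returned false: either end of input or an error.
	return n, sc.Err()
}

func main() {
	n, err := countLines("ab\ncd\n", 16)
	fmt.Println(n, err)

	// The third line is longer than the 16-byte limit, so the scan stops
	// after two lines. Without the Err check this looks like a normal end.
	n, err = countLines("ab\ncd\n"+strings.Repeat("x", 100)+"\n", 16)
	fmt.Println(n, errors.Is(err, bufio.ErrTooLong))
}
```

If the Err() call is dropped, both runs look identical to the caller apart from the line count.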

Elsewhere in Go I have seen errors returned in a way that makes ignoring them much harder (and perhaps impossible?). For example:

response, err := http.Get(url)

Here, the http package uses multiple return to give us the error at the same time as the hoped-for response. Since we have to assign the error to a variable, and do something with this variable - lest we incur the compiler's wrath - we must either handle it or at least consciously ignore it.

My question is why the choice was made in the case of bufio.Scanner not to do this:

line, err := stdinscanner.Text()

Thanks in advance. Sorry if I am missing something; only just getting started with Go.

Tom

andrey mirtchovski

Mar 6, 2016, 11:54:28 PM
to Tom Scrace, golang-nuts
this should explain things a little bit:

andrey mirtchovski

Mar 6, 2016, 11:55:42 PM
to Tom Scrace, golang-nuts
this should explain things a little bit:

https://blog.golang.org/errors-are-values

it uses bufio.Scanner as an example.

(apologies for the double-post. gmail shouldn't ever have a hotkey for
"sending email")

gaurav

Mar 7, 2016, 12:32:18 AM
to golang-nuts, t.sc...@gmail.com
I think the relevant section from the blog is:

This isn't very different, but there is one important distinction. In this code, the client must check for an error on every iteration, but in the real Scanner API, the error handling is abstracted away from the key API element, which is iterating over tokens. With the real API, the client's code therefore feels more natural: loop until done, then worry about errors. Error handling does not obscure the flow of control.

But, as Tom is pointing out, this way of dealing with error condition is very different from other places where the stress is put on exposing and handling errors more naturally and inline instead of as an afterthought (even at the cost of error handling obscuring the control flow).
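For comparison, here is a rough sketch of the other style, where every call hands back an error. The lineSource type and its Next method are hypothetical names invented for this illustration, built on bufio.Reader; they are not part of the standard library and not the code from the blog post.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// lineSource is a hypothetical reader whose Next method returns an error on
// every call, in contrast to bufio.Scanner's Scan/Err split.
type lineSource struct {
	r *bufio.Reader
}

// Next returns the next line without its trailing newline. It returns io.EOF
// once the input is exhausted.
func (s *lineSource) Next() (string, error) {
	line, err := s.r.ReadString('\n')
	if err == io.EOF && line != "" {
		// Final line without a newline; EOF is reported on the next call.
		return line, nil
	}
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(line, "\n"), nil
}

// readAll shows what the caller's loop looks like: the error must be
// inspected on every iteration, and end of input is one of the error cases.
func readAll(input string) ([]string, error) {
	src := &lineSource{r: bufio.NewReader(strings.NewReader(input))}
	var lines []string
	for {
		line, err := src.Next()
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
		lines = append(lines, line)
	}
}

func main() {
	lines, err := readAll("a\nb\nc")
	fmt.Println(lines, err)
}
```

The error cannot be overlooked here, but the loop body now carries two checks that the Scanner version defers to a single Err() call after the loop.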

Jakob Borg

Mar 7, 2016, 2:46:35 AM
to gaurav, golang-nuts, t.sc...@gmail.com
2016-03-07 6:32 GMT+01:00 gaurav <gaurava...@gmail.com>:
> But, as Tom is pointing out, this way of dealing with error condition is
> very different from other places where the stress is put on exposing and
> handling errors more naturally and inline instead of as an afterthought
> (even at the cost of error handling obscuring the control flow).

Indeed. But note also (Tom) that it's fairly common and idiomatic to
wrap and abstract the error handling of lower level operations to make
the API nicer, as bufio.Scanner does. See for example
http://talks.golang.org/2013/bestpractices.slide#6 which shows the
code:

func (g *Gopher) WriteTo(w io.Writer) (int64, error) {
    bw := &binWriter{w: w}
    bw.Write(int32(len(g.Name)))
    bw.Write([]byte(g.Name))
    bw.Write(int64(g.AgeYears))
    return bw.size, bw.err
}

Note that error handling is not done for each Write() call here, as
the binWriter type handles this and effectively turns all Write()
calls after an error into no-ops, making it very concise to use with
just an error check at the end.
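The binWriter type itself is not shown in the snippet above. A minimal sketch of what such a type could look like follows, assuming it wraps encoding/binary and keeps the first error; the limitedWriter and encode helpers are hypothetical and exist only to force a failure partway through.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// binWriter remembers the first error and the number of bytes written so far.
type binWriter struct {
	w    io.Writer
	size int64
	err  error
}

// Write encodes v in little-endian order. After the first failure every
// further call is a no-op, so the caller checks bw.err once at the end.
func (bw *binWriter) Write(v interface{}) {
	if bw.err != nil {
		return
	}
	if bw.err = binary.Write(bw.w, binary.LittleEndian, v); bw.err == nil {
		bw.size += int64(binary.Size(v))
	}
}

// limitedWriter (hypothetical) rejects any write that exceeds its budget.
type limitedWriter struct {
	remaining int
}

func (l *limitedWriter) Write(p []byte) (int, error) {
	if len(p) > l.remaining {
		return 0, errors.New("limitedWriter: out of space")
	}
	l.remaining -= len(p)
	return len(p), nil
}

// encode (hypothetical) mirrors the WriteTo body: three writes, one check.
func encode(name string, age int64, budget int) (int64, error) {
	bw := &binWriter{w: &limitedWriter{remaining: budget}}
	bw.Write(int32(len(name)))
	bw.Write([]byte(name))
	bw.Write(age)
	return bw.size, bw.err
}

func main() {
	fmt.Println(encode("gopher", 5, 100)) // all three writes fit
	fmt.Println(encode("gopher", 5, 4))   // second write fails, third is skipped
}
```

With a budget of 4 bytes only the length prefix is written; the name fails, and the age is never attempted because the stored error short-circuits the call.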

//jb

Tom Scrace

Mar 7, 2016, 2:55:05 AM
to golang-nuts

> https://blog.golang.org/errors-are-values
>
> it uses bufio.Scanner as an example.
>
> (apologies for the double-post. gmail shouldn't ever have a hotkey for
> "sending email")

Thanks Andrey. And I entirely sympathise with your gmail problem. It has bitten me many times. The 'Undo Send' gmail extension can help with this.


> But, as Tom is pointing out, this way of dealing with error condition is very different from other places where the stress is put on exposing and handling errors more naturally and inline instead of as an afterthought (even at the cost of error handling obscuring the control flow).

Indeed, Gaurav, this was my point. Thanks to the blog post I can now understand that the API was designed this way in order to preclude the need to check for errors every time we go through the loop.

I think I would have argued that this is a fairly marginal benefit. The error checking would still appear only once on the page after all. And although a comparison check would have to be executed with every iteration, this seems a very small price to pay for ensuring that errors do not go unchecked.

Tamás Gulácsi

Mar 7, 2016, 3:13:25 AM
to golang-nuts, gaurava...@gmail.com, t.sc...@gmail.com
Another nice pattern I've seen is a use of errorWriter to simplify error checking for Writes:

type errorWriter struct {
  errp *error
  w    io.Writer
}

// Write does nothing once an error has been recorded; otherwise it
// forwards to the wrapped writer and records any error it returns.
func (ew errorWriter) Write(p []byte) (int, error) {
  if err := *ew.errp; err != nil {
    return 0, err
  }
  var n int
  n, *ew.errp = ew.w.Write(p)
  return n, *ew.errp
}

// Usage:

func main() {
  var err error
  ew := errorWriter{errp: &err, w: os.Stdout}
  ew.Write([]byte("aaa"))
  ew.Write([]byte("bbb"))
  // ...
  if err != nil {
    log.Fatal(err)
  }
}


or without a pointer to the error.
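A sketch of the variant without the pointer might keep the error as a field on the writer and use a pointer receiver instead. The names stickyWriter, flakyWriter and writeParts are made up for this example; flakyWriter only exists to inject a failure on one chosen call.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// stickyWriter (hypothetical name) stores the first error in a field instead
// of writing it through a *error supplied by the caller.
type stickyWriter struct {
	w   io.Writer
	err error
}

// Write forwards to the wrapped writer until an error occurs; after that it
// returns the stored error without touching the wrapped writer again.
func (sw *stickyWriter) Write(p []byte) (int, error) {
	if sw.err != nil {
		return 0, sw.err
	}
	var n int
	n, sw.err = sw.w.Write(p)
	return n, sw.err
}

// flakyWriter (hypothetical) fails on exactly one call, counted from zero.
type flakyWriter struct {
	buf    bytes.Buffer
	calls  int
	failAt int
}

func (f *flakyWriter) Write(p []byte) (int, error) {
	call := f.calls
	f.calls++
	if call == f.failAt {
		return 0, errors.New("flakyWriter: injected failure")
	}
	return f.buf.Write(p)
}

// writeParts writes each part and checks the error once, after the last write.
// A negative failAt means no failure is injected.
func writeParts(parts []string, failAt int) (string, error) {
	fw := &flakyWriter{failAt: failAt}
	sw := &stickyWriter{w: fw}
	for _, p := range parts {
		sw.Write([]byte(p))
	}
	return fw.buf.String(), sw.err
}

func main() {
	fmt.Println(writeParts([]string{"aaa", "bbb", "ccc"}, -1))
	fmt.Println(writeParts([]string{"aaa", "bbb", "ccc"}, 1))
}
```

In the second call the write of "bbb" fails and "ccc" is never passed to the underlying writer, even though that write would have succeeded on its own.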

Tom Scrace

Mar 7, 2016, 4:23:00 AM
to golang-nuts, gaurava...@gmail.com, t.sc...@gmail.com
On Monday, March 7, 2016 at 7:46:35 AM UTC, Jakob Borg wrote:

> Indeed. But note also (Tom) that it's fairly common and idiomatic to
> wrap and abstract the error handling of lower level operations to make
> the API nicer, as bufio.Scanner does. See for example
> http://talks.golang.org/2013/bestpractices.slide#6 which shows the
> code:
>
>     func (g *Gopher) WriteTo(w io.Writer) (int64, error) {
>        bw := &binWriter{w: w}
>        bw.Write(int32(len(g.Name)))
>        bw.Write([]byte(g.Name))
>        bw.Write(int64(g.AgeYears))
>        return bw.size, bw.err
>     }
>
> Note that error handling is not done for each Write() call here, as
> the binWriter type handles this and effectively turns all Write()
> calls after an error into no-ops, making it very concise to use with
> just an error check at the end.


Thanks Jakob. That makes sense.

There is clearly a trade-off here between concision and unobtrusiveness on the one hand, and ensuring that errors are caught on the other hand.

To my mind, the latter is by far the more important. And - although I know this is not a popular view within the Go community - I think this is a trade-off that exceptions solve quite nicely.

Nonetheless I think there might be better ways to solve this. For example:

    func (g *Gopher) WriteTo(w io.Writer) (int64, error) {
       bw, bwErr := &binWriter{w: w}
       bw.Write(int32(len(g.Name)))
       bw.Write([]byte(g.Name))
       bw.Write(int64(g.AgeYears))
       return bw.size, bwErr
    }


Jakob Borg

Mar 7, 2016, 4:53:39 AM
to Tom Scrace, golang-nuts, Gaurav Agarwal
2016-03-07 10:23 GMT+01:00 Tom Scrace <t.sc...@gmail.com>:
> To my mind, the latter is by far the most important. And - although I know
> this is not a popular view within the Go community - I think this is a
> tradeoff that exceptions solve quite nicely.
>
> Nonetheless I think there might be better ways to solve this. For example:
>
> func (g *Gopher) WriteTo(w io.Writer) (int64, error) {
> bw, bwErr := &binWriter{w: w}
> bw.Write(int32(len(g.Name)))
> bw.Write([]byte(g.Name))
> bw.Write(int64(g.AgeYears))
> return bw.size, bwErr
> }

Here bwErr would need to be a *error, which would need to be
dereferenced in the return statement. That would be a very unusual API
and somewhat spooky action at a distance. You might think you could
return an object that implements the error interface and just update
that object behind the scenes, but since the convention is to compare
the error to nil that wouldn't fly.
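Both objections can be seen in a few lines. The sketch below uses made-up names (deferredErr, newDeferred, viaPointer) and is only meant to illustrate the nil-comparison problem and the pointer-to-error alternative, not any real API.

```go
package main

import (
	"errors"
	"fmt"
)

// deferredErr is a hypothetical error type whose message would be filled in
// later, "behind the scenes", if something went wrong.
type deferredErr struct {
	msg string
}

func (e *deferredErr) Error() string { return e.msg }

// newDeferred hands out the error object before anything has failed.
func newDeferred() error {
	return &deferredErr{}
}

// viaPointer mimics the *error variant: the work records its outcome through
// a pointer, and the value must be dereferenced after the work is done.
func viaPointer(fail bool) error {
	var err error
	errp := &err
	if fail {
		*errp = errors.New("write failed")
	}
	return *errp
}

func main() {
	// The interface value holds a non-nil pointer, so the conventional
	// err != nil check fires even though no failure has occurred.
	fmt.Println(newDeferred() != nil) // true

	// The pointer variant behaves correctly, but only because the
	// dereference happens after the work has finished.
	fmt.Println(viaPointer(false) == nil) // true
	fmt.Println(viaPointer(true) == nil)  // false
}
```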

If you're worried about the error not being checked, the above API
implies that the error would happen during the *creation* of binWriter
(which would need a constructor to return the two arguments) so you'd
probably see someone do the error check immediately after construction
(and then not later) and that wouldn't look immediately wrong as it
follows a common pattern. Except for the pointer-to-error type, that
is.

So all in all, I think the previous variant is significantly cleaner. :)

//jb

Tom Scrace

Mar 7, 2016, 6:09:35 AM
to golang-nuts, t.sc...@gmail.com, gaurava...@gmail.com


On Monday, March 7, 2016 at 9:53:39 AM UTC, Jakob Borg wrote:


> If you're worried about the error not being checked, the above API
> implies that the error would happen during the *creation* of binWriter
> (which would need a constructor to return the two arguments) so you'd
> probably see someone do the error check immediately after construction
> (and then not later) and that wouldn't look immediately wrong as it
> follows a common pattern. Except for the pointer-to-error type, that
> is.

Yes. Very good point.
 
> So all in all, I think the previous variant is significantly cleaner. :)

I am just concerned about errors going unnoticed at the point of creation. Anytime errors are allowed to pass unnoticed, and go on to cause problems in areas of the code remote from the point of error creation, it is a big problem in my mind. It makes debugging harder and can cause problems that are not even noticed at runtime.

Go has a good story to tell about this. Anytime an error might be created we do this:

value, err := someFunction()

In this case, the compiler forces us to do *something* with the error. Even if it is to explicitly ignore it. There is no way that the developer can forget about the possibility of the error.

But this all goes out the window if we're writing APIs that don't follow this pattern. In the interests of concision we're allowing for the possibility of unchecked errors, and we're back to relying on developers to remember every time to check the error. When the Go compiler has the power to do this for us, this seems like a bad tradeoff to me.

 Tom