bufio readability

1016 views
Skip to first unread message

Egon Elbre

unread,
Mar 30, 2014, 1:29:51 AM3/30/14
to golan...@googlegroups.com
I came across this https://michaelwhatcott.com/familiarity-admits-brevity/ which suggested that bufio.Reader and bufio.Writer is hard to read due to things being too much abbreviated, as an out of context example:

    nn := 0
    for len(s) > b.Available() && b.err == nil {
        n := copy(b.buf[b.n:], s)
        b.n += n
        nn += n
        s = s[n:]
        b.flush()
    }

The whole article seems to ignore the history of the code, but the code in "bufio" is indeed hard to read in places. I'm not sure how much attention this requires or how worthwhile the whole refactor would be, but I guess you can tell whether this warrants a new issue or not. (If needed I can make one)

+ egon

(PS. I'm not the author of the post and it doesn't necessarily reflect my opinions)

andrey mirtchovski

unread,
Mar 30, 2014, 1:46:31 AM3/30/14
to Egon Elbre, golang-dev
i believe the article's author was correct: familiarity does breed
brevity. witness similar code from Plan 9 that has been "in
production" since at least 2000 (when it was last modified):

http://plan9.bell-labs.com/sources/plan9/sys/src/libbio/bwrite.c

take away the (n == 0) block and you can see the example you posted
with sufficient squinting. some of the people behind Go have been
writing code like this for more than thirty years. in that sense,
brevity is the result of familiarity.

times are different now of course, camel-casing makes everything more
readable and we don't care about brevity for the sake of saving a byte
here and there, but a rewritten bufio.Writer seems foreign to those
familiar with the old code, as i realized when reading the article you
linked.

peterGo

unread,
Mar 30, 2014, 11:37:34 AM3/30/14
to golan...@googlegroups.com
Egon,

"Your task is to look at the first line after the function signature in order to discover what the variable named nn represents. Ready, Go!"

// WriteString writes a string.
// It returns the number of bytes written.
func (b *Writer) WriteString(s string) (int, error) { 
    nn := 0
    // . . .
    return nn, nil
}

nn is the number of bytes written.

That was easy!

Peter

Egon Elbre

unread,
Mar 30, 2014, 1:34:55 PM3/30/14
to golan...@googlegroups.com


On Sunday, March 30, 2014 6:37:34 PM UTC+3, peterGo wrote:
Egon,

"Your task is to look at the first line after the function signature in order to discover what the variable named nn represents. Ready, Go!"

As previously stated, I'm not the author of the post and it doesn't necessarily reflect my opinions.
For me it's quite easy to read, but then again, I've refactored/rewritten much, much, much, much worse code.

Given that someone had trouble reading the code it basically "proves" that it's hard to read for some people.
How much or in what way it should be improved is a different matter.
 

// WriteString writes a string.
// It returns the number of bytes written.
func (b *Writer) WriteString(s string) (int, error) { 
    nn := 0
    // . . .
    return nn, nil
}

nn is the number of bytes written.

That was easy!

Keep in mind that ease is subjective.

Andrew Gerrand

unread,
Mar 30, 2014, 5:44:21 PM3/30/14
to Egon Elbre, golang-dev
I'm not the author of package bufio but I have made changes to it.
I personally didn't find the short names to be a problem, but they are short. I can see why some people might have an issue with them.

As a matter of course we don't refactor standard library code without good reason.
The bufio code has been in use for years now.
Let's leave it alone.


--

---
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Russ Cox

unread,
Mar 30, 2014, 8:11:14 PM3/30/14
to Egon Elbre, golang-dev
The local variable name 'nn' is a little obscure; 'total' would have been better. But that's hardly evidence for the claims in the blog post.

I have written before (http://research.swtch.com/names):

A name's length should not exceed its information content. For a local variable, the name i conveys as much information as index or idx and is quicker to read. Similarly, i and j are a better pair of names for index variables than i1 and i2 (or, worse, index1 and index2), because they are easier to tell apart when skimming the program. Global names must convey relatively more information, because they appear in a larger variety of contexts. Even so, a short, precise name can say more than a long-winded one: compare acquire and take_ownership. Make every name tell.

By this reasoning, I believe that:

n := copy(w.buf[w.pos:], s)
w.pos += n
total += n

is perfectly fine, certainly easier to skim than the rewrite:

chunkLength := copy(writer.buffer[writer.unflushedBytes:], input)
writer.unflushedBytes += chunkLength
totalBytesWritten += chunkLength

The suffix 'Bytes' adds no information content: the whole package is about bytes. Likewise, in Writer's WriteString method, the suffix 'Written' adds nothing. The fact that bytes are being written need not be echoed in a variable name. Familiarity does admit brevity, and long-winded names hurt clarity. A helper method for these three lines is attractive mainly because the bloated names make the code hard to scan. Repeating the original three lines is no big deal.

The post's most telling sentence is this: "Intention-revealing, fully spelled, pronounceable names are a baseline for any professional, especially in the field of software development." I would have said the baseline for professional work is that the code be correct, be efficient, use appropriate algorithms, have good tests, have well documented APIs, and be delivered on time. The names of internal variables don't even make the list.

Finally, making a point by claiming to be unable to read a 17-line function due to one cryptic variable name is a dubious strategy. Being able to dive into an unfamiliar code base, find your way around, understand how things are typically done, and then learn how the code works is a crucial skill, one that eludes many 'professionals'. Working on the Go runtime, I have found it invaluable to consult the FreeBSD, Linux, and OS X kernel sources from time to time, along with other language runtimes (Haskell, Erlang, SBCL, ...). Web developers inevitably read other web developers' code, and so on. Different code bases have different conventions and idioms, and internalizing them is part of the process. Much better to spend time reading and learning from the code than blogging about what's wrong with the local variable names.

Russ

Lucio De Re

unread,
Mar 31, 2014, 4:54:36 AM3/31/14
to Russ Cox, Egon Elbre, golang-dev
... and to add to rsc's good points, "chunkLength" means nothing to an
Italian programmer, while "n" can much more readily be considered an
international idiom. And when it comes to idioms, Rob has made that
point successfully a long time ago, no need to flog that horse.

Lucio.


On 3/31/14, Russ Cox <r...@golang.org> wrote:
> The local variable name 'nn' is a little obscure; 'total' would have been
> better. But that's hardly evidence for the claims in the blog post.
>
> I have written before (http://research.swtch.com/names):
>
> A name's length should not exceed its information content. For a local
> variable, the name i conveys as much information as index or idx and is
> quicker to read. Similarly, i and j are a better pair of names for index
> variables than i1 and i2 (or, worse, index1 and index2), because they are
> easier to tell apart when skimming the program. Global names must convey
> relatively more information, because they appear in a larger variety of
> contexts. Even so, a short, precise name can say more than a long-winded
> one: compare acquire and take_ownership. Make every name tell.
>
>
> By this reasoning, I believe that:
>
> n := copy(w.buf[w.pos:], s)
> w.pos += n
> total += n
>
>
> is perfectly fine, certainly easier to skim than the rewrite:
>
> chunkLength := copy(writer.buffer[writer.unflushedBytes:], input)
> writer.unflushedBytes += chunkLength
> totalBytesWritten += chunkLength
>
>
> The suffix 'Bytes' adds no information content: the whole package is about
> bytes. Likewise, in Writer's WriteString method, the suffix 'Written' adds
> nothing. The fact that bytes are being written need not be echoed in a
> variable name. Familiarity *does* admit brevity, and long-winded names hurt
> clarity. A helper method for these three lines is attractive mainly because
> the bloated names make the code hard to scan. Repeating the original three
> lines is no big deal.
>
> The post's most telling sentence is this: "Intention-revealing, fully
> spelled, pronounceable names are a baseline for any professional,
> especially in the field of software development." I would have said the
> baseline for professional work is that the code be correct, be efficient,
> use appropriate algorithms, have good tests, have well documented APIs, and
> be delivered on time. The names of internal variables don't even make the
> list.
>
> Finally, making a point by claiming to be unable to read a 17-line function
> due to one cryptic variable name is a dubious strategy. Being able to dive
> into an unfamiliar code base, find your way around, understand how things
> are typically done, and then learn how the code works is a crucial skill,
> one that eludes many 'professionals'. Working on the Go runtime, I have
> found it invaluable to consult the FreeBSD, Linux, and OS X kernel sources
> from time to time, along with other language runtimes (Haskell, Erlang,
> SBCL, ...). Web developers inevitably read other web developers' code, and
> so on. Different code bases have different conventions and idioms, and
> internalizing them is part of the process. Much better to spend time
> reading and learning from the code than blogging about what's wrong with
> the local variable names.
>
> Russ
>

jonathan....@gmail.com

unread,
Apr 1, 2014, 12:05:52 PM4/1/14
to golan...@googlegroups.com

Wow. Reading the comments here is a little depressing. The post in question isn't really about bufio. The example he picked is just that—an example, but it's a good candidate to pick on.


The author isn't advocating a massive rewrite of the entire Go codebase. He's not going to submit a pull request or anything like that. What he's advocating is better naming.


He's making the point that compilers don't care about names and the length of the names. The resulting compilation doesn't care if your names are nn or total, so why not use total to begin with? To save a few bytes or a few keystrokes? So instead of naming a function fbn why not name it fibonacci? Doesn't that more clearly communicate the meaning? Sure, if you're not familiar with fibonacci then there's some additional overhead. But fibonacci is infinitely more Googleable than fbn.


30 years ago, fixed-width screens were the norm and so were rotary phones. Going over 80 characters on a line was a very big deal (this is a carry over from the punch card days from even before that). Just because you've done something for 30 years doesn't justify it's continued use. My 75-year-old parents still want to use rotary phones and are completely inept at cell phones. Is there any doubt today that using rotary phones is a thing of the past? These conventions are from a bygone era and the limitations that caused them have fallen away.


Everyone is talking about familiarity. If you want to get familiar, why not use the English language? Using English words imposes far less mental mapping than cryptic variable names such as fg or pp or what have you like minifiers and obfuscators do. Granted, programers who have limited English skills may have trouble, but that's another topic. Then again, if the variable name is a contrived acronym for something, it may be easier to read a full English word which can be punched into an online translator. A few years back I worked with a small outsourcing team in South America who delivered the source code to us in Spanish. I could easily read Spanish but others on my team couldn't. Were others on my team irresponsible or amateur because they had trouble understanding the code? No. They had trouble reading the names and the names conveyed a meaning they couldn't decipher. Don't even get me started on Japanese source code. Good luck reading that unless you read Japanese. If you really want to be concise and brief, method names and variable names could be single-character Japanese runes. That would make the source code files much more compact, but you would spend all day deciphering code rather than making improvements to it.


Of course you want to have correct code and good algorithms, but you always want to have those. They're a baseline for deliverable code. The post already mentions this.


The entire point of the post is this: When choosing between brevity and clarity, choose clarity—clarity of thought and expression that conveys meaning and understanding to those who maintain the codebase. Saving a few keystrokes is often at odds with this ideal.

Russ Cox

unread,
Apr 1, 2014, 12:47:35 PM4/1/14
to jonathan....@gmail.com, golang-dev
On Tue, Apr 1, 2014 at 12:05 PM, <jonathan....@gmail.com> wrote:

He's making the point that compilers don't care about names and the length of the names. The resulting compilation doesn't care if your names are nn or total, so why not use total to begin with? To save a few bytes or a few keystrokes? So instead of naming a function fbn why not name it fibonacci? Doesn't that more clearly communicate the meaning? Sure, if you're not familiar with fibonacci then there's some additional overhead. But fibonacci is infinitely more Googleable than fbn.


The method we are looking at is called WriteString, not wrstr, so I hope it is obvious to you that we agree with the general idea.

I object to your use of nn as a motivating example. It's a local variable name. We spend a lot of time working on exported API. I admitted in the first line of my reply that total would have been better than nn, but it's a local variable in a 15-line function body. There's just not enough time in the day to care. No one googles local variable names. The original CL that introduced the variable named nn (as part of Write, not WriteString) introduced the entire bufio package along with the new interfaces io.Read and io.Write (later renamed to io.Reader and io.Writer). I think we got an enormous amount right in that CL, and as you can see from the renaming of the interface and the receiver type, we went back and reconsidered nearly every exported name in every package, most multiple times, to make sure they were clear. I hope you will forgive us that we focused on the names everyone must use instead of the names of local variables that almost no one sees.

I also object to your hypothetical fibonacci vs fbn example. No one is suggesting to shorten fibonacci to fbn, especially not in exported API. You've gone from arguing using names that don't matter to arguing using names that don't exist. This is not a step forward. As an unexported name, I might shorten fibonacci to fib, though, especially since that is a very commonly used shortening - Google [fib function] works just fine.

The entire point of the post is this: When choosing between brevity and clarity, choose clarity—clarity of thought and expression that conveys meaning and understanding to those who maintain the codebase. Saving a few keystrokes is often at odds with this ideal.

The entire point of my reply is this: the opposite of brevity is length, not clarity. Being long-winded can hurt clarity as much as being brief can. The rewritten code is a perfect example. Yes, the names are longer, but they are also completely inconsistent with each other: chunkLength, unflushedBytes, totalBytesWritten. Why does one say Length and the other two say Bytes? Why is the past participle at the beginning of unflushedBytes but the end of totalBytesWritten? Why do we have to say Bytes on every line of a function that only deals with bytes in a package that only deals with bytes? Why do we have to say Written in a method that is all about writing, a method on a type named Writer? And why does only one of the variable names merit the 'Written' suffix? I'm sorry but this is not clear code. If you're going to use long names at the very least make them internally consistent.

Russ

mdwha...@gmail.com

unread,
Apr 1, 2014, 1:49:27 PM4/1/14
to golan...@googlegroups.com, jonathan....@gmail.com
Thanks for pointing out that there is a difference in the way you approach naming when exported API functions are concerned. I've often wondered about the reason for that difference. I disagree that internal/unexported state should be treated differently as far as choosing names is concerned. You're right that going too far toward verbosity is a problem. My general feeling is that a lot of go code goes too far toward brevity.

Michael

Aram Hăvărneanu

unread,
Apr 1, 2014, 3:32:23 PM4/1/14
to mdwha...@gmail.com, golang-dev, jonathan....@gmail.com
The list for unsolicited advice is -nuts, not -dev.

--
Aram Hăvărneanu

Egon Elbre

unread,
Apr 2, 2014, 1:30:12 AM4/2/14
to golan...@googlegroups.com, Egon Elbre


On Monday, March 31, 2014 12:44:21 AM UTC+3, Andrew Gerrand wrote:
I'm not the author of package bufio but I have made changes to it.
I personally didn't find the short names to be a problem, but they are short. I can see why some people might have an issue with them.

As a matter of course we don't refactor standard library code without good reason.
The bufio code has been in use for years now.
Let's leave it alone.


People read those packages to understand what good Go style looks like, so fixing them might be helpful. And, I understand that there should be a very good reason for refactoring.

So, just to clarify.

Should clarity and readability issues in internal code be reported?

That way it would be documented where readability issues are and when someone eventually goes to modify/improve/fix that package they know that readability could be fixed as well. So kind of like a targeted boy scout rule. I can understand that, it could lead to a flood of minor reports, which the issue tracker probably isn't quite good for, some simple list of "packages that need cleaning" would be probably better. But I digress... I just want to know whether to report those case in the future or not.

+ egon

Andrew Gerrand

unread,
Apr 2, 2014, 1:43:11 AM4/2/14
to Egon Elbre, golan...@googlegroups.com
No, thanks. It's not worth the time. 

ron minnich

unread,
Apr 2, 2014, 1:44:28 AM4/2/14
to Andrew Gerrand, Egon Elbre, golan...@googlegroups.com
On Tue, Apr 1, 2014 at 10:43 PM, Andrew Gerrand <a...@google.com> wrote:
> No, thanks. It's not worth the time.

Good. I found the "improved" version much less readable.

ron

Aram Hăvărneanu

unread,
Apr 2, 2014, 3:30:26 AM4/2/14
to ron minnich, Andrew Gerrand, Egon Elbre, golan...@googlegroups.com
On Wed, Apr 2, 2014 at 7:44 AM, ron minnich <rmin...@gmail.com> wrote:
> Good. I found the "improved" version much less readable.

Indeed.

> People read those packages to understand what good Go style looks like,
> so fixing them might be helpful.

Yes, the standard library is a good example of fine Go, but those
packages don't need any "fixing".

--
Aram Hăvărneanu
Reply all
Reply to author
Forward
0 new messages