[go-nuts] An extra warning about semi-colon insertion


James Fisher

Apr 26, 2010, 11:44:41 AM
to golang-nuts
I'm not sure where the appropriate place is for comments on the Go tutorial (http://golang.org/doc/go_tutorial.html#tmp_33), but there's a semi-colon problem I've stumbled across that I think people should be warned about (the tutorial at the moment only mentions the opening-brace warning).  It crops up when formatting a list over multiple lines, as follows.

package main
func main() {
  hw := []string{
    "hello",
    "world"
    }
  for i := 0; i < len(hw); i++ {
    print(hw[i])
    }
  }

The above when compiled has a semi-colon auto-inserted after "world".  The solution, from trial-and-error, is to insert another comma after it.  The reason this can be confusing is that some languages don't allow that final comma in lists, yet the multiple-line idiom is extremely common.
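
For reference, a minimal sketch of the version that compiles, assuming the only change needed is the extra comma after "world" (the helper name `words` is made up for illustration, and `fmt.Println` stands in for the built-in `print`):

```go
package main

import "fmt"

// words returns a list formatted over multiple lines. The comma after
// "world" is what stops a semicolon from being inserted at that newline.
func words() []string {
	return []string{
		"hello",
		"world",
	}
}

func main() {
	hw := words()
	for i := 0; i < len(hw); i++ {
		fmt.Println(hw[i])
	}
}
```

With the trailing comma in place, the closing brace can stay on its own line and items can be reordered without touching the punctuation.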

There's also an idiom in some cultures of inserting commas at the *start* of the line, like

  hw := []string{
    "hello"
  , "world"
  , "!"
    }

This obviously doesn't work either.

hotei

Apr 26, 2010, 1:49:25 PM
to golang-nuts
James,
Thanks for the 'heads-up' on this. I also scratched my head for a few
minutes over the same problem and finally just gave up and let gofmt
take over and move the trailing brace. I like the idea of Go ignoring
the trailing comma on 'world', as it was always a hassle to track the
final comma in C/C++. (I have a program where I change the order of
some lists very frequently to make test cases.) Did gofmt leave the
'world' alone (i.e. on a line by itself, with no close brace)?

Hotei

James Fisher

Apr 26, 2010, 7:56:49 PM
to hotei, golang-nuts
I should note another similar case:  multiple-line lists of operators, e.g.

s := "I"
  + "am"
  + "a"
  + "string"

is an error -- all operators should end the lines rather than begin them:

s := "I" +
    "am" +
    "a" +
    "string"

Steven

Apr 26, 2010, 11:38:46 PM
to James Fisher, hotei, golang-nuts
Yeah, basically, every time you put a newline in the middle of a statement, the line before it has to end with an appropriate continuation token:
'('  '{'  ',' '.' ':' ':=' '=' and any binary operator.

package main

import "fmt"

func main() {
v :=
[
5]int{
1,
2,
1 +
2,
2 *
2,
(
12 -
2) /
5,
}

fmt.
Println(
v[
1:
5],
)
}

ceving

Apr 27, 2010, 11:18:14 AM
to golang-nuts
On 27 Apr., 05:38, Steven <steven...@gmail.com> wrote:

> Yeah, basically, every time  you put a newline in the middle of a statement,
> it has to end with an appropriate continuation character.
> '('  '{'  ',' '.' ':' ':=' '=' and any binary operator.

The more I read about the semicolon injector the more I am convinced
that it is completely broken by design to invent a syntax which
requires semicolons in order to invent an heuristic injector to hide
them again.

Is there any good reason to do such kludge?

Corey Thomasson

Apr 27, 2010, 11:38:42 AM
to ceving, golang-nuts
It's talked about somewhere either on golang.org or in one of the talks.

If I understand correctly, gofmt already existed, so when they decided
to make terminating semicolons optional (they were required before),
it was much easier to implement in gofmt than to rewrite the lexer
(and doesn't break any old code still using semicolons). Also, I
believe some other languages employ a similar mechanism: semicolons
are only required where the lexer (or in Go's case, the formatter) can't
determine the state based on the end of the line, e.g.

x =
3

obviously the line cannot end in a binary operator, but
x = 3
y = {
3
}

How does a lexer differentiate between those two?

chris dollin

Apr 27, 2010, 12:36:18 PM
to Corey Thomasson, ceving, golang-nuts
On 27 April 2010 16:38, Corey Thomasson <cthom...@gmail.com> wrote:
> It's talked about somewhere either on golang.org or in one of the talks.
>
> If I understand correctly, gofmt already existed, so when they decided
> to make terminating semicolons optional (they were required before),
> it was much easier to implement in gofmt than to rewrite the lexer
> (and doesn't break any old code still using semicolons). Also, I
> believe some other languages employ a similar mechanism: semicolons
> are only required where the lexer (or in Go's case, the formatter) can't
> determine the state based on the end of the line, e.g.
>
> x =
> 3
>
> obviously the line cannot end in a binary operator, but
> x = 3
> y = {
> 3
> }
>
> How does a lexer differentiate between those two?

Identifiers can start statements, and constants can end
them, so a ; is plausible.

If y were . or +, then "clearly" that's not the start of a
statement, so 3 isn't the end of one, so ; would be
implausible.

--
Chris "allusive" Dollin

peterGo

Apr 27, 2010, 12:42:21 PM
to golang-nuts
Corey,

It applies these rules.
http://golang.org/doc/go_spec.html#Semicolons

Peter

chris dollin

Apr 27, 2010, 12:45:55 PM
to Corey Thomasson, ceving, golang-nuts
On 27 April 2010 17:36, chris dollin <ehog....@googlemail.com> wrote:


> Identifiers can start statements, and constants can end
> them, so a ; is plausible.
>
> If y were . or +, then "clearly" that's not the start of a
> statement, so 3 isn't the end of one, so ; would be
> implausible.

Clarification: that's how /a/ lexer can tell the difference, not
how the Go lexer does -- it has a much more brutal rule.

--
Chris "subtext" Dollin

Corey Thomasson

Apr 27, 2010, 1:00:28 PM
to peterGo, golang-nuts
I should've read what I typed before I sent it.

I've read most of the spec, and know why it does what it does. My
question was meant to be an answer to "Is there any good reason to do
such kludge?", not an actual question seeking answers.

On Tue, Apr 27, 2010 at 12:42 PM, peterGo <go.pe...@gmail.com> wrote:
> Corey,
>
> It applies these rules.
> http://golang.org/doc/go_spec.html#Semicolons
>
> Peter


chris dollin

Apr 27, 2010, 1:15:26 PM
to Corey Thomasson, peterGo, golang-nuts
On 27 April 2010 18:00, Corey Thomasson <cthom...@gmail.com> wrote:
> I should've read what I typed before I sent it.
>
> I've read most of the spec, and know why it does what it does. My
> question was meant to be an answer to "Is there any good reason to do
> such kludge?", not an actual question seeking answers.

I'd lay odds that if you take the semicolons out of the grammar,
it becomes ambiguous, or at least not LR(1) or whatever.

If there's ever a place where something might be an end of a
statement that might optionally be continued by the next token,
then your parser will have a problem. For example (and not
checking with the fine detail of Go's grammar, since this is
supposed to be an illustration), suppose statements can start
with expressions that may start with an open bracket, such as

  (*something).whatever

And that statements can end with expressions

  lhs = rhs

And now we write the one statement after the other

  lhs = rhs(*something) .whatever

When the parser is standing between rhs and (, is it
going to take the ( as part of a function application, or
as the beginning of the next statement? If the latter and
you meant the former, how are you going to say that?

; is how to say "done with that statement, here's another"
in those cases.

You have to do /something/. Whatever you do, someone
won't like it. Might even be the same someone.

--
Chris "example" Dollin

Russ Cox

Apr 27, 2010, 1:29:19 PM
to ceving, golang-nuts
> The more I read about the semicolon injector the more I am convinced
> that it is completely broken by design to invent a syntax which
> requires semicolons in order to invent an heuristic injector to hide
> them again.

The more I *use* the semicolon injector the more I am convinced
that it is an incredibly elegant solution to the statement
separator/terminator problem. I almost never forget to type
newline characters. I was pretty skeptical about the change
at the time, but it's definitely made coding in Go feel even more
lightweight to me.

> Is there any good reason to do such kludge

Stop reading and start coding; see for yourself.

Russ

Peter Williams

Apr 27, 2010, 9:33:13 PM
to Corey Thomasson, peterGo, golang-nuts
On 28/04/10 03:00, Corey Thomasson wrote:
> I should've read what i typed before I sent it.
>
> I've read most of the spec, and know why it does what it does. My
> question was meant to be an answer to "Is there any good reason to do
> such kludge?", not an actual question seeking answers.
>
> On Tue, Apr 27, 2010 at 12:42 PM, peterGo<go.pe...@gmail.com> wrote:
>> Corey,
>>
>> It applies these rules.
>> http://golang.org/doc/go_spec.html#Semicolons
>>
>> Peter
>
>

I share your concern. I think that semicolon injection is a very kludgy
way of handling optional semicolons. I would have thought that an
LALR(1) parser would be able to handle optional semicolons without the
need for such an error prone kludge.

It's suddenly making the language (partially) source format sensitive by
inserting semicolons in the wrong place unless the code is formatted a
specific way. As a Python programmer, I can cope with source format
sensitivity so I'm not against it per se but I think that it's a bad
idea in this case as it has all the appearance of being an accident.

Peter

chris dollin

Apr 28, 2010, 1:15:39 AM
to Peter Williams, Corey Thomasson, peterGo, golang-nuts
On 28 April 2010 02:33, Peter Williams <pwil...@gmail.com> wrote:

> I share your concern.  I think that semicolon injection is a very kludgy way of handling optional semicolons.  I would have thought that an LALR(1) parser would be able to handle optional semicolons without the need for such an error prone kludge.

See my earlier post for an illustration of why that's not so;
if the ends of statements can mix into the beginnings of
statements, LR(1) isn't enough. In fact, it may be the case
that /no/ parser will be enough -- the grammar might become
ambiguous.

--
Chris "allusive" Dollin

Peter Williams

Apr 28, 2010, 3:04:17 AM
to chris dollin, Corey Thomasson, peterGo, golang-nuts
On 28/04/10 15:15, chris dollin wrote:
> On 28 April 2010 02:33, Peter Williams <pwil...@gmail.com
> <mailto:pwil...@gmail.com>> wrote:
>
>
> I share your concern. I think that semicolon injection is a very
> kludgy way of handling optional semicolons. I would have thought
> that an LALR(1) parser would be able to handle optional semicolons
> without the need for such an error prone kludge.
>
>
> See my earlier post for an illustration of why that's not so;
> if the ends of statements can mix into the beginnings of
> statements, LR(1) isn't enough. In fact, it may be the case
> that /no/ parser will be enough -- the grammar might become
> ambiguous.

If the production terminator is defined as a semicolon or a newline then
it should be possible. And even easier if you replace the word
terminator with separator in the specification.

More importantly, the "insert semicolon" hack causes error messages that
are confusing unless you know that the optional semicolon feature is
implemented using this hack.

If you want a language to be format sensitive you should state this
upfront and the syntax description should make it very clear what the
acceptable format is. It shouldn't be the side effect of the
implementation (which I notice has now crept into the specification)
that pops up in the unlikeliest of places (from the POV of the
person reading the resultant error messages).

What really makes me giggle though is having this hack described as
elegant by some. It's anything but.

Peter
PS Maybe the question that should be asked is "Are semicolons required
at all in those places that they are optional i.e. definition and
statement terminators?". There are no examples in the specification
(that I could find) that use semicolons for those purposes and I think
that the only thing you lose is the ability to put more than one
definition or statement on a line. If you care about that, you could
regain this ability by a slight change to the specification to make the
semicolon and newline statement/definition separator alternatives.

ceving

Apr 28, 2010, 4:55:40 AM
to golang-nuts
On 27 Apr., 19:29, Russ Cox <r...@golang.org> wrote:
> at the time, but it's definitely made coding in Go feel even more
> lightweight to me.
>

As long as you like to toe the Go way of life. But for others
(Kernighan, Stroustrup, Torvalds, Stallman) there are good reasons to
write opening and closing braces in the same column. And this is not
possible because the semicolon injector is too stupid to handle this
and, most annoying of all, it seems to me that it is not possible to
disable it.

Mue

Apr 28, 2010, 5:33:41 AM
to golang-nuts
On 28 Apr., 10:55, ceving <cev...@googlemail.com> wrote:

> As long as you like to toe the Go way of life. But for others
> (Kernighan, Stroustrup, Torvalds, Stallman) there are good reasons to
> write opening and closing braces in the same column. And this is not
> possible because the semicolon injector is too stupid to handle this
> and, most annoying of all, it seems to me that it is not possible to
> disable it.

As a Go developer one should like the Go way of life. Each language has
its own style: C, Lisp, Smalltalk, Erlang, Python. And it makes no
sense to want one language to behave like another one. I totally agree
with Russ. Go is pretty lightweight and productive, and it's simple
to produce code where gofmt has not much work to do. *smile*

So, just give it a chance.

mue

Noah Evans

Apr 28, 2010, 7:08:00 AM
to ceving, golang-nuts
Could you give an example?

roger peppe

Apr 28, 2010, 7:42:07 AM
to Peter Williams, chris dollin, Corey Thomasson, peterGo, golang-nuts
On 28 April 2010 08:04, Peter Williams <pwil...@gmail.com> wrote:
> If the production terminator is defined as a semicolon or a newline then it
> should be possible.  And even easier if you replace the word terminator with
> separator in the specification.

if you try changing the grammar to allow an arbitrary run of semicolons in
all the places that you can put newlines now, i think you'll quickly
find that this is quite hard.

ceving

Apr 28, 2010, 8:08:52 AM
to golang-nuts
On 28 Apr., 13:08, Noah Evans <noah.ev...@gmail.com> wrote:
> Could you give an example?
>

$ cat -n brace.go
     1  // -*- tab-width: 4; indent-tabs-mode: nil -*-
     2
     3  package main
     4  import "fmt"
     5  import "flag"
     6
     7  var quiet = flag.Bool ("q", false, "quiet")
     8
     9  func main ()
    10  {
    11      flag.Parse ()
    12      if !*quiet
    13      {
    14          fmt.Println ("Hello World!");
    15      }
    16  }
$ 6g brace.go
brace.go:10: syntax error: unexpected semicolon or newline before {
$ cat -n brace.go
     1  // -*- tab-width: 4; indent-tabs-mode: nil -*-
     2
     3  package main
     4  import "fmt"
     5  import "flag"
     6
     7  var quiet = flag.Bool ("q", false, "quiet")
     8
     9  func main () {
    10      flag.Parse ()
    11      if !*quiet
    12      {
    13          fmt.Println ("Hello World!");
    14      }
    15  }
$ 6g brace.go
brace.go:11: !*quiet not used
$ cat -n brace.go
     1  // -*- tab-width: 4; indent-tabs-mode: nil -*-
     2
     3  package main
     4  import "fmt"
     5  import "flag"
     6
     7  var quiet = flag.Bool ("q", false, "quiet")
     8
     9  func main () {
    10      flag.Parse ()
    11      if !*quiet {
    12          fmt.Println ("Hello World!");
    13      }
    14  }
$ 6g brace.go
$

befelemepeseveze

Apr 28, 2010, 8:12:07 AM
to golang-nuts
On 28 dub, 10:55, ceving <cev...@googlemail.com> wrote:
> As long as you like to toe the Go way of life. But for others
> (Kernighan, Stroustrup, Torvalds, Stallman) there are good reasons to
> write opening and closing braces in the same column. And this is not
> possible because the semicolon injector is too stupid to handle this
> and, most annoying of all, it seems to me that it is not possible to
> disable it.

I'm trying to guess how many lines of e.g. sed/awk/... would be
required to move opening braces that sit alone on a line to the proper
Go style (at the end of the previous line) before throwing the source at
the compiler. Probably I'm wrong and it can't be done that simply.

Peter Bourgon

Apr 28, 2010, 8:14:54 AM
to golang-nuts
On Wed, Apr 28, 2010 at 2:08 PM, ceving <cev...@googlemail.com> wrote:
> $ cat -n brace.go

Yes, Go enforces K&R style bracing. It is one of several strict style
requirements that, as a whole, contribute to making Go what it is. You
may take it or leave it but I don't think it's up for debate. In any
case, the list has already hashed out the pro- and con-arguments quite
extensively.

yy

Apr 28, 2010, 8:31:29 AM
to befelemepeseveze, golang-nuts
2010/4/28 befelemepeseveze <befeleme...@gmail.com>:
> I'm trying to guess how many lines of e.g. sed/awk/... would be
> required to switch the on the line alone opening braces to the proper
> Go style (at end of the previous line) before throwing the source on
> the compiler. Probably I'm wrong and it can't be done that simply.

If it is a matter of lines, you can do it in one, sort of:

awk '/^[ \t]*\{[ \t]*$/ {n = n " {"; next} NR > 1 {print n} {n = $0} END {if (NR) print n}'

I don't know whether, with that style, you tend to add comments or code
on the line with the opening brace; if so:

awk '/^[ \t]*\{/ {sub(/^[ \t]*/, " "); n = n $0; next} NR > 1 {print n} {n = $0} END {if (NR) print n}'

I have not really tested these commands; my point is that it is certainly
much shorter than this thread.

--
- yiyus || JGL . 4l77.com

ceving

Apr 28, 2010, 8:40:46 AM
to golang-nuts
On 28 Apr., 14:14, Peter Bourgon <peterbour...@gmail.com> wrote:
>
> Yes, Go enforces K&R style bracing.

K&R style is different for funcs. Maybe you should take a look at
the C book.

chris dollin

Apr 28, 2010, 9:18:40 AM
to Peter Williams, Corey Thomasson, peterGo, golang-nuts
On 28 April 2010 08:04, Peter Williams <pwil...@gmail.com> wrote:

> If the production terminator is defined as a semicolon or a newline then it should be possible.

Thus making it pretty much illegal to split a statement over a newline?

I don't think that's a desirable side-effect.

--
Chris "allusive" Dollin

Ian Lance Taylor

Apr 28, 2010, 9:55:29 AM
to Peter Williams, chris dollin, Corey Thomasson, peterGo, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> On 28/04/10 15:15, chris dollin wrote:
>>
>> See my earlier post for an illustration of why that's not so;
>> if the ends of statements can mix into the beginnings of
>> statements, LR(1) isn't enough. In fact, it may be the case
>> that /no/ parser will be enough -- the grammar might become
>> ambiguous.
>
> If the production terminator is defined as a semicolon or a newline
> then it should be possible. And even easier if you replace the word
> terminator with separator in the specification.

I think it's more complex than that. We want to retain the ability to
break statements across lines. And we don't want the lexer to have to
read ahead to decide what to do--I think JavaScript shows why that can
be confusing in practice.

> More importantly, the "insert semicolon" hack causes error messages
> that are confusing unless you know that the optional semicolon feature
> is implemented using this hack.

Error messages can be fixed. I would encourage you to file issues
about confusing cases.

> If you want a language to be format sensitive you should state this
> upfront and the syntax description should make it very clear what the
> acceptable format is. It shouldn't be the side effect of the
> implementation (which I notice has now crept into the specification)
> that pops up in the unlikeliest of places (from the POV of the
> person reading the resultant error messages).

It's not a side effect of the implementation. As you note, the
precise rules are in the language spec, and both current
implementations behave the same way.

Ian

Russ Cox

Apr 28, 2010, 1:54:49 PM
to ceving, golang-nuts
> $ cat -n brace.go
>     1  // -*- tab-width: 4; indent-tabs-mode: nil -*-
>     2
>     3  package main
>     4  import "fmt"
>     5  import "flag"
>     6
>     7  var quiet = flag.Bool ("q", false, "quiet")
>     8
>     9  func main () {
>    10      flag.Parse ()
>    11      if !*quiet
>    12      {
>    13          fmt.Println ("Hello World!");
>    14      }
>    15  }
> $ 6g brace.go
> brace.go:11: !*quiet not used

This is a correct error: the program you've written is equivalent to

if !*quiet; { ... }

which, like all the other conditions, can be omitted and defaults
to true:

if !*quiet; true { ... }

so the compiler is telling you that it's odd you've evaluated an
expression where a statement (with a side-effect) was expected.

We could disallow the missing condition for if, but that wouldn't
help at all for

switch !*quiet
{

We can't disallow the empty condition there since it's such a common
idiom to write:

switch x := complicated; {
case x < 0: ...
case x > 0: ...
default: ...
}
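
Filled in as a runnable sketch, with a made-up helper `sign` and a plain integer standing in for `complicated`, the idiom looks like this:

```go
package main

import "fmt"

// sign uses a switch with an init statement, then a semicolon, then no
// tag expression, so each case is evaluated as a boolean condition.
func sign(n int) string {
	switch x := n; {
	case x < 0:
		return "negative"
	case x > 0:
		return "positive"
	default:
		return "zero"
	}
}

func main() {
	for _, n := range []int{-3, 0, 7} {
		fmt.Println(n, sign(n))
	}
}
```

Because the part after the semicolon may legitimately be empty here, a newline before the opening brace cannot simply be rejected as a missing condition.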

Go is not C. It takes a little while to figure that out, and then once
you do, you're all set.

Russ

Peter Williams

Apr 28, 2010, 8:27:56 PM
to roger peppe, chris dollin, Corey Thomasson, peterGo, golang-nuts
I think that you only need to make them separators for Statements,
MethodSpecs (inside Interface declarations) and FieldDecls in structure
declarations. I.e. in those places where they are currently specified
as terminators.

Peter

Peter Williams

Apr 28, 2010, 8:49:10 PM
to Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
On 28/04/10 23:55, Ian Lance Taylor wrote:
> Peter Williams<pwil...@gmail.com> writes:
>
>> On 28/04/10 15:15, chris dollin wrote:
>>>
>>> See my earlier post for an illustration of why that's not so;
>>> if the ends of statements can mix into the beginnings of
>>> statements, LR(1) isn't enough. In fact, it may be the case
>>> that /no/ parser will be enough -- the grammar might become
>>> ambiguous.
>>
>> If the production terminator is defined as a semicolon or a newline
>> then it should be possible. And even easier if you replace the word
>> terminator with separator in the specification.
>
> I think it's more complex than that. We want to retain the ability to
> break statements across lines.

I don't see how it would be more difficult than it is now. Whether a
new line is a prospective "end of statement" largely depends on what's
gone before not what comes after. If you place (sensible) restrictions
on whereabouts new lines can occur in a statement (i.e. those places
where it can't possibly be the end of a statement) it would totally
depend on what's gone before.

Changing the semicolon from an optional terminator to a separator
doesn't change this. Only the removal of the option would make it simpler.

The current arrangement makes the newline a de facto terminator in the
productions in question so the same argument applies to it.

> And we don't want the lexer to have to
> read ahead to decide what to do--I think JavaScript shows why that can
> be confusing in practice.

Why is the lexer making syntax decisions?

>
>> More importantly, the "insert semicolon" hack causes error messages
>> that are confusing unless you know that the optional semicolon feature
>> is implemented using this hack.
>
> Error messages can be fixed. I would encourage you to file issues
> about confusing cases.
>
>> If you want a language to be format sensitive you should state this
>> upfront and the syntax description should make it very clear what the
>> acceptable format is. It shouldn't be the side effect of the
>> implementation (which I notice has now crept into the specification)
>> that pops up in the unlikeliest of places (from the POV of the
>> person reading the resultant error messages).
>
> It's not a side effect of the implementation. As you note, the
> precise rules are in the language spec, and both current
> implementations behave the same way.

I see it as an implementation artefact that's crept into the
documentation. :-)

But the fact that it affects the way the source can be formatted in
positions that are far removed from anywhere where a semicolon would be
required if they weren't optional is the real problem. This could be
fixed if the mechanism for injecting the semicolons was smarter but I
suspect that ends up being as complex as fixing the parser.

Peter

Ian Lance Taylor

Apr 28, 2010, 10:31:21 PM
to Peter Williams, chris dollin, Corey Thomasson, peterGo, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> I don't see how it would be more difficult than it is now. Whether a
> new line is a prospective "end of statement" largely depends on what's
> gone before not what comes after. If you place (sensible)
> restrictions on whereabouts new lines can occur in a statement
> (i.e. those places where it can't possibly be the end of a statement)
> it would totally depend on what's gone before.

I'm not sure I understand what you are suggesting. It sounds like you
are saying that things should stay as they are now, except that the
lexer does not introduce semicolons, and newline acts as semicolon
does now. What's the substantive effect of such a change?

Also, how would it be described? Describing the behaviour in two
levels makes it clear exactly where lines may be broken.


> I see it as an implementation artefact that's crept into the
> documentation. :-)

I see it more as a way of making the behaviour precise without
complicating the grammar significantly.

Ian

Peter Williams

Apr 29, 2010, 3:14:52 AM
to Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
On 29/04/10 12:31, Ian Lance Taylor wrote:
> Peter Williams<pwil...@gmail.com> writes:
>
>> I don't see how it would be more difficult than it is now. Whether a
>> new line is a prospective "end of statement" largely depends on what's
>> gone before not what comes after. If you place (sensible)
>> restrictions on whereabouts new lines can occur in a statement
>> (i.e. those places where it can't possibly be the end of a statement)
>> it would totally depend on what's gone before.
>
> I'm not sure I understand what you are suggesting. It sounds like you
> are saying that things should stay as they are now, except that the
> lexer does not introduce semicolons, and newline acts as semicolon
> does now. What's the substantive effect of such a change?

Roughly that. I'm not advocating making semicolons compulsory just
trying to ameliorate the effect making them optional has on the syntax's
source format sensitivity.

>
> Also, how would it be described? Describing the behaviour in two
> levels makes it clear exactly where lines may be broken.
>
>
>> I see it as an implementation artefact that's crept into the
>> documentation. :-)
>
> I see it more as a way of making the behaviour precise without
> complicating the grammar significantly.

This needs a big warning telling users that it places (what may be
unexpected) restrictions on the source format that the compiler will
accept, with some examples.

Or modify the specification productions to make this clear.

E.g. the following production for a function declaration:

FunctionDecl = "func" identifier Signature [ Body ] .
Body = Block.

Doesn't make it clear that you can't have a new line after identifier or
between Signature and Body. IMHO, semicolon injection makes that
production invalid and it should look something like:

FunctionDecl = "func" identifier [\t ]* Signature [\t ]* [ Body ] .
Body = Block.

Peter

ceving

Apr 29, 2010, 4:09:08 AM
to golang-nuts
On 28 Apr., 19:54, Russ Cox <r...@golang.org> wrote:
> This is a correct error: the program you've written is equivalent to
>

Maybe but no user (programmer) wants to care about that.

The fact is: simply putting a new line at the wrong place produces
strange error messages. And I do not think that it gets better by
telling everybody that this is an elegant feature to get complaints
about semicolons although the source does not contain any at all.

Of course there are some secret reasons, which can explain the error
message. And from an internal point of view the error messages make
sense. But it is a bad idea to design a language that requires the
user to know essentially hidden internal secrets.

Andrew Gerrand

Apr 29, 2010, 4:29:51 AM
to ceving, golang-nuts
It's not a "hidden internal secret". It's right there in the spec,
which is quite short:

http://golang.org/doc/go_spec.html#Semicolons

Andrew

befelemepeseveze

Apr 29, 2010, 4:38:14 AM
to golang-nuts
On 29 dub, 10:09, ceving <cev...@googlemail.com> wrote:
> The fact is: simply putting a new line at the wrong place produces
> strange error messages.

De Morgan says: Not putting a new line in the wrong place (as per
language specs) gets rid of the error message.

Rules for the "right" and "wrong" places are quite simple.

> And I do not think that it gets better by
> telling everybody that this is an elegant feature to get complaints
> about semicolons although the source does not contain any at all.

No valid Go source causes the compiler to complain about semicolons
(compiler bugs excluded).

> Of course there are some secret reasons, which can explain the error
> message. And from an internal point of view the error messages make
> sense. But it is a bad idea to design a language that requires the
> user to know essentially hidden internal secrets.

The language specification doesn't require the user to know
essentially hidden internal secrets. Anyone just writing code
according to the specs doesn't need to know them.

Chris Wedgwood

Apr 29, 2010, 4:40:36 AM
to Andrew Gerrand, ceving, golang-nuts
On Thu, Apr 29, 2010 at 06:29:51PM +1000, Andrew Gerrand wrote:

> It's not a "hidden internal secret". It's right there in the spec,
> which is quite short:
>
> http://golang.org/doc/go_spec.html#Semicolons

putting something in the specification doesn't make it intuitive,
whilst i'm quite happy with the semi's change that was made, it has on
this list and irc caused a lot of confusion

i think the change made was a good change, but i don't think the issue
is entirely solved yet

arguably though it's not a high-priority

Beoran

Apr 29, 2010, 5:33:57 AM
to golang-nuts
Dear Ceving,

Ruby, much like Go, has optional semicolons, and it is a bit more
flexible with regards to style. Still, Ruby has some problems at
times, and it also has a very complicated parser (10k lines of yacc),
and the parsing is slow. Go opted for simpler syntax rules that keep
the parser light and quick, at the expense of some programmer
convenience.

You can have a slow parsing language with a complex syntax, a faster
parsing language with a simple syntax, but not a fast parsing language
with a complex syntax. I think it's a nice trade-off. Go is a bit less
convenient, but it's light and compiles down to machine code. If you
don't like it, program in Ruby instead. :)

Kind Regards,

B.

roger peppe

Apr 29, 2010, 5:40:50 AM
to Peter Williams, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
On 29 April 2010 08:14, Peter Williams <pwil...@gmail.com> wrote:
> Roughly that.  I'm not advocating making semicolons compulsory just trying
> to ameliorate the effect making them optional has on the syntax's source
> format sensitivity.

you can't have it both ways.

x := y
<-z

if semicolons are optional and newlines are treated
as white space, there are two equally valid
interpretations of the above:
a) assign y to x then receive on z
b) non-blocking send z on y, then assign the boolean
result to x

if all newlines are turned into semicolons, then
there are still cases where you'd need to put
the brace at the start of the line, otherwise the following
two lines might be a single assignment (creating a new
instance of y) or an assignment followed by a statement
block.

x := y
{
foo()
}

i'm sure there are other good examples too.
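
For comparison, here is how the rule that was actually adopted settles the first example. This is a minimal sketch using the standard go/scanner package; the helper name tokens and the file name example.go are made up for illustration. The scanner reports a semicolon it inserted itself with the literal "\n", so the sketch labels those as ";(auto)".

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokens scans src with the standard go/scanner and returns one string per
// token. Semicolons inserted by the scanner are rendered as ";(auto)".
func tokens(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		switch {
		case tok == token.SEMICOLON && lit == "\n":
			// Inserted because of a newline, not written in the source.
			out = append(out, ";(auto)")
		case lit != "":
			out = append(out, lit)
		default:
			out = append(out, tok.String())
		}
	}
	return out
}

func main() {
	fmt.Println(tokens("x := y\n<-z\n"))
}
```

The token after y comes back as an inserted semicolon, so the two lines are read as interpretation (a): an assignment, then a separate receive statement.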

chris dollin

unread,
Apr 29, 2010, 5:48:36 AM4/29/10
to roger peppe, Peter Williams, Ian Lance Taylor, Corey Thomasson, peterGo, golang-nuts
On 29 April 2010 10:40, roger peppe <rogp...@gmail.com> wrote:
> On 29 April 2010 08:14, Peter Williams <pwil...@gmail.com> wrote:
>> Roughly that.  I'm not advocating making semicolons compulsory just trying
>> to ameliorate the effect making them optional has on the syntax's source
>> format sensitivity.
>
> you can't have it both ways.
>
> x := y
> <-z

Rather than worry about semicolons, we could introduce two keywords,
`set` and `call`, so that every statement started with a keyword and
we didn't need semicolons or newlines as terminators or separators.

(Mentioned to show that a solution that eliminates the grammar difficulties
may be socially unacceptable ...)

Really, the concrete grammar one uses for a programming language should
be up to the developer; programs like gofmt can render code into whatever
the publication style is.

--
Chris "not completely joking, not utterly serious" Dollin

ceving

unread,
Apr 29, 2010, 5:49:48 AM4/29/10
to golang-nuts
On 29 Apr., 11:33, Beoran <beo...@gmail.com> wrote:
>
> If you don't like it, program in Ruby in stead. :)
>

The world does not move for resignation. ;-)

Peter Williams

unread,
Apr 29, 2010, 6:16:24 AM4/29/10
to roger peppe, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
So you're arguing for non optional semicolons?

Peter

chris dollin

unread,
Apr 29, 2010, 6:23:46 AM4/29/10
to Peter Williams, roger peppe, Ian Lance Taylor, Corey Thomasson, peterGo, golang-nuts

Looks like he's arguing that the suggested alternatives to the
current solution (mandatory semicolons + one form of semicolon
insertion) have not been thought through.

--
Chris "allusive" Dollin

roger peppe

unread,
Apr 29, 2010, 6:29:13 AM4/29/10
to chris dollin, Peter Williams, Ian Lance Taylor, Corey Thomasson, peterGo, golang-nuts
On 29 April 2010 11:23, chris dollin <ehog....@googlemail.com> wrote:
>> So you're arguing for non optional semicolons?
>
> Looks like he's arguing that the suggested alternatives to the
> current solution (mandatory semicolons + one form of semicolon
> insertion) have not been thought through.

indeed i am.

personally, i think the current rules work very well.

James Fisher

unread,
Apr 29, 2010, 11:02:21 AM4/29/10
to golang-nuts
I feel like I've opened a can of worms here.

I'm not going to get into much of the discussion here, but suggest two possible (half-assed) solutions:

1. Change the terminology from "semicolon" to something semantic like "statement terminator".  If someone is told about a mysterious "unexpected semicolon or newline", they will look for trivial typos in their source code.  If they are told about an "unexpected statement termination", it is a hint that the error is higher up the ladder of parsing abstraction than just "unexpected characters."
2. When semicolons (read: statement terminators) are auto-inserted, could the fact that they were auto-inserted, and the reason why, be kept around with them?  That way the error messages could be made much more explicit.

So, going with the example that I started this thread with:

package main
func main() {
  hw := []string{
    "hello",
    "world"
    }
  for i := 0; i < len(hw); i++ {
    print(hw[i])
    }
  }

The error that this currently generates is:

test.go:5: syntax error: unexpected semicolon or newline, expecting }
test.go:7: syntax error: unexpected for

If what I suggest was implemented, this could be displayed as:

test.go:5: syntax error: unexpected statement termination (inferred between string literal "world" and newline character in test.go:5), expecting }
test.go:7: syntax error: unexpected for


Thoughts?


James


p.s. why does it only say "expecting }" when the comma is also a valid next token?  (Is the comma not a token?)
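
For anyone hitting the same error, the two forms can be checked mechanically. A minimal sketch, assuming the standard go/parser package; the constant and function names are only for illustration. The exact error text differs between compilers and versions, so the sketch only reports whether each form parses.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// withoutComma is the list from the start of the thread: the last element
// is followed directly by a newline, so a semicolon is inserted after it.
const withoutComma = `package main

func main() {
	hw := []string{
		"hello",
		"world"
	}
	_ = hw
}
`

// withComma adds a trailing comma, so the line no longer ends in a token
// that triggers semicolon insertion.
const withComma = `package main

func main() {
	hw := []string{
		"hello",
		"world",
	}
	_ = hw
}
`

// parses reports whether src is syntactically valid according to go/parser.
func parses(src string) bool {
	_, err := parser.ParseFile(token.NewFileSet(), "test.go", src, 0)
	return err == nil
}

func main() {
	fmt.Println("without trailing comma:", parses(withoutComma))
	fmt.Println("with trailing comma:   ", parses(withComma))
}
```

Only the form with the trailing comma is accepted, which matches the trial-and-error fix described at the top of the thread.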

Ian Lance Taylor

unread,
Apr 29, 2010, 2:07:04 PM4/29/10
to Peter Williams, chris dollin, Corey Thomasson, peterGo, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> On 29/04/10 12:31, Ian Lance Taylor wrote:
>>
>> I'm not sure I understand what you are suggesting. It sounds like you
>> are saying that things should stay as they are now, except that the
>> lexer does not introduce semicolons, and newline acts as semicolon
>> does now. What's the substantive effect of such a change?
>
> Roughly that. I'm not advocating making semicolons compulsory just
> trying to ameliorate the effect making them optional has on the
> syntax's source format sensitivity.

If you aren't suggesting a change to the language, then I'm not sure I
see the point to a change to the spec.

> FunctionDecl = "func" identifier [\t ]* Signature [\t ]* [ Body ] .

That does not seem to me to be easier to understand. The current
semicolon rule does have consequences, but the language spec is not
the place to give all the details.

I agree that the compiler error messages should be improved. I'm not
sure what other substantive issues there are.

Ian

Peter Williams

unread,
Apr 29, 2010, 8:12:53 PM4/29/10
to Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
The substantive issue is that the injection of semicolons creates two de
facto classes of white space:

1. one without embedded new lines (that can be used anywhere), and
2. one with embedded new lines that can only be used in a few places.

I think that's a problem and that the correct way to address it is to
make it go away. The next best solution is to state it formally in all
of the affected grammar productions.

Of course, this next best solution offers a hint for a mechanism to make
the problem go away. That mechanism would be to extend the productions
used in the actual parser to include white space rules so that handling
line feeds as an alternate terminator can be handled in a way that
doesn't expose two classes of white space.

Internally, there would be the need for two types of white space (with
and without embedded new lines) and the lexer would need to report these
to the parser. You then add a production that states that a terminator
is either a semicolon or white space with embedded new lines and you
have optional semicolons without any side effects (and syntax decisions
are being made in the parser where they belong and not in the lexer).
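
One possible reading of this mechanism, as a toy sketch only. Nothing here is Go's real grammar or parser: lex, parse and the NL token are hypothetical names, and the grammar is a single made-up statement form. The lexer passes line breaks through as tokens, and the parser decides whether each one is a terminator or plain white space.

```go
package main

import (
	"fmt"
	"strings"
)

// lex splits src into tokens. Each line break becomes an explicit "NL"
// token instead of being dropped or rewritten as ";".
func lex(src string) []string {
	var toks []string
	for _, line := range strings.Split(src, "\n") {
		toks = append(toks, strings.Fields(line)...)
		toks = append(toks, "NL")
	}
	return toks
}

// parse groups tokens into statements of the toy grammar
//
//	Stmt       = operand { "+" operand } Terminator .
//	Terminator = ";" | NL .
//
// An NL seen while an operand is still required is treated as white space.
func parse(toks []string) []string {
	var stmts, cur []string
	needOperand := true // true at the start of a statement and after "+"
	for _, t := range toks {
		switch {
		case t == "NL" && needOperand:
			// The statement cannot end here, so the newline is white space.
		case t == "NL" || t == ";":
			if len(cur) > 0 {
				stmts = append(stmts, strings.Join(cur, " "))
			}
			cur, needOperand = nil, true
		case t == "+":
			cur = append(cur, t)
			needOperand = true
		default:
			cur = append(cur, t)
			needOperand = false
		}
	}
	return stmts
}

func main() {
	for _, s := range parse(lex("a +\nb\nc")) {
		fmt.Println(s)
	}
}
```

In this sketch a line break after + is absorbed because an operand is still required, while a line break after an operand ends the statement. A line that starts with + still begins a new statement; accepting that form would mean looking past the newline before deciding.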

Cheers,
Peter

Russ Cox

unread,
Apr 29, 2010, 8:17:47 PM4/29/10
to Peter Williams, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
How many lines of Go code have you written?

Russ

Ian Lance Taylor

unread,
Apr 29, 2010, 11:28:47 PM4/29/10
to Peter Williams, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> The substantive issue is that the injection of semicolons creates two
> de facto classes of white space:
>
> 1. one without embedded new lines (that can be used anywhere), and
> 2. one with embedded new lines that can only be used in a few places.
>
> I think that's a problem and that the correct way to address it is to
> make it go away. The next best solution is to state it formally in
> all of the effected grammar productions.

I don't see why it is a problem.

Even if we were to agree that it is a problem, it seems to me that
your suggestions for addressing it are actually more complex than the
problem itself.

Ian

Peter Williams

unread,
Apr 30, 2010, 12:02:02 AM4/30/10
to Russ Cox, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
On 30/04/10 10:17, Russ Cox wrote:
> How many lines of Go code have you written?

About a thousand. Why? Is it relevant?

Peter


Peter Williams

unread,
Apr 30, 2010, 12:15:20 AM4/30/10
to Ian Lance Taylor, golang-nuts
On 30/04/10 13:28, Ian Lance Taylor wrote:
> Peter Williams<pwil...@gmail.com> writes:
>
>> The substantive issue is that the injection of semicolons creates two
>> de facto classes of white space:
>>
>> 1. one without embedded new lines (that can be used anywhere), and
>> 2. one with embedded new lines that can only be used in a few places.
>>
>> I think that's a problem and that the correct way to address it is to
>> make it go away. The next best solution is to state it formally in
>> all of the effected grammar productions.
>
> I don't see why it is a problem.

It's a problem because it breaks a large number of the productions in
your specification. That you don't see that is a problem is also a
problem :-).

>
> Even if we were to agree that it is a problem, it seems to me that
> your suggestions for addressing it actually more complex than the
> problem itself.
>

Things are rarely free. Knowing where it's OK to allow a new line to
act as a terminator is a SYNTAX issue and needs to be solved in the
parser not the lexer. Unfortunately, this makes the parser slightly
more complex but that's the price that needs to be paid in order to make
optional semicolons tidy.

Otherwise, the specification document should be modified to point out
all the places where new lines can't be used in white space as a side
effect of the semicolon injection kludge. At the moment, it makes Go
look very messy.

In other words, if you want Go to be source code format sensitive then
say so and modify the specification to make it very clear what the
format restrictions are. Don't leave it to the user to discover as a
result of trial and error chasing down compiler complaints about
misplaced semicolons in code that contains no semicolons. It's not a
good look.

Peter

Russ Cox

unread,
Apr 30, 2010, 12:26:29 AM4/30/10
to Peter Williams, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
It seems to be. My experience has been that the fewer lines
of code people have actually written in Go the louder they
argue about details like semicolon or brace placement.
I was curious whether you were arguing from experience
or hypothetically.

Russ

Ian Lance Taylor

unread,
Apr 30, 2010, 12:28:56 AM4/30/10
to Peter Williams, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> On 30/04/10 13:28, Ian Lance Taylor wrote:
>> Peter Williams<pwil...@gmail.com> writes:
>>
>>> The substantive issue is that the injection of semicolons creates two
>>> de facto classes of white space:
>>>
>>> 1. one without embedded new lines (that can be used anywhere), and
>>> 2. one with embedded new lines that can only be used in a few places.
>>>
>>> I think that's a problem and that the correct way to address it is to
>>> make it go away. The next best solution is to state it formally in
>>> all of the effected grammar productions.
>>
>> I don't see why it is a problem.
>
> It's a problem because it breaks a large number of the productions in
> your specification. That you don't see that is a problem is also a
> problem :-).

You're quite right, I don't see that it is a problem. The
specification is clear and the productions are not broken. There are
(at least) three different Go parsers written independently by
different people, and they all work.


> Knowing where it's OK to allow a new line to
> act as a terminator is a SYNTAX issue and needs to be solved in the
> parser not the lexer.

Why? Who are you trying to help? What actual problem are you trying
to solve--not what problem in the spec, but what actual problem?


> In other words, if you want Go to be source code format sensitive then
> say so and modify the specification to make it very clear what the
> format restrictions are.

We do say so.

http://golang.org/doc/go_spec.html#Semicolons
http://golang.org/doc/go_lang_faq.html#semicolons
http://golang.org/doc/go_tutorial.html#tmp_35

Ian

Peter Williams

unread,
Apr 30, 2010, 12:53:40 AM4/30/10
to Ian Lance Taylor, golang-nuts
On 30/04/10 14:28, Ian Lance Taylor wrote:
> Peter Williams<pwil...@gmail.com> writes:
>
>> On 30/04/10 13:28, Ian Lance Taylor wrote:
>>> Peter Williams<pwil...@gmail.com> writes:
>>>
>>>> The substantive issue is that the injection of semicolons creates two
>>>> de facto classes of white space:
>>>>
>>>> 1. one without embedded new lines (that can be used anywhere), and
>>>> 2. one with embedded new lines that can only be used in a few places.
>>>>
>>>> I think that's a problem and that the correct way to address it is to
>>>> make it go away. The next best solution is to state it formally in
>>>> all of the effected grammar productions.
>>>
>>> I don't see why it is a problem.
>>
>> It's a problem because it breaks a large number of the productions in
>> your specification. That you don't see that is a problem is also a
>> problem :-).
>
> You're quite right, I don't see that it is a problem. The
> specification is clear and the productions are not broken. There are
> (at least) three different Go parsers written independently by
> different people, and they all work.
>
>
>> Knowing where it's OK to allow a new line to
>> act as a terminator is a SYNTAX issue and needs to be solved in the
>> parser not the lexer.
>
> Why? Who are you trying to help?

I'm trying to help make Go as good as it can be.

> What actual problem are you trying
> to solve--not what problem in the spec, but what actual problem?

Go is no longer as good as it could be. It has unnecessary (poorly
documented) restrictions on the way that code is formatted. That is the
problem.

To make matters worse the problem is fixable.

>
>
>> In other words, if you want Go to be source code format sensitive then
>> say so and modify the specification to make it very clear what the
>> format restrictions are.
>
> We do say so.
>
> http://golang.org/doc/go_spec.html#Semicolons

This says nothing about the grammar being format sensitive. It just
explains that semicolons are optional and how that was implemented.

> http://golang.org/doc/go_lang_faq.html#semicolons
> http://golang.org/doc/go_tutorial.html#tmp_35

These do. But they aren't the spec. And they basically say "By the
way, the way we implemented optional semicolons broke the grammar but we
don't care so just suck it up.".

It looks like someone decided on a solution and then changed the
specification to match it, rather than the other way around. It makes
Go look amateurish. I think that this is a pity as it could be better.

Sorry if I sound tetchy but I'm old and grumpy,
Peter

Peter Williams

unread,
Apr 30, 2010, 1:15:26 AM4/30/10
to r...@golang.org, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
That would be relevant if I were arguing for making semicolons
compulsory. I'm not. I'm just arguing that optional semicolons can be
and should be implemented in a way that doesn't break the formal grammar.

I'd been quiet on this issue in the past as I'd assumed that the
semicolon injection (by the lexer) was just a temporary implementation
and that a proper implementation would follow in due course. I'm very
disappointed that that isn't the case and that Go is giving up on the
finer points of its design so early in its life.

Peter

Steven

unread,
Apr 30, 2010, 1:25:04 AM4/30/10
to Peter Williams, Ian Lance Taylor, golang-nuts
...and fervently clinging to arguments had in other groups about other languages.

I like the way the language works now. Semicolon injection, while a useful implementation, is not very useful in understanding the language. Defining an end-of-statement in the formal grammar separate from ";", then specifying where ends-of-statements occur might be a bit clearer. Otherwise, though, I don't think there's any particular problem to huff and stomp about.

Russ Cox

unread,
Apr 30, 2010, 1:26:28 AM4/30/10
to Peter Williams, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
> I'd been quiet on this issue in the past as I'd assumed that the semicolon
> injection (by the lexer) was just a temporary implementation and that a
> proper implementation would follow in due course.  I'm very disappointed
> that that isn't the case and that Go is giving up on the finer points of its
> design so early in its life.

I think the semicolon design is fantastic. It is working really
well for me as a programmer. It is simple enough that it is
easy to keep in my head and has no dark corners created by
carving out exceptions to the rule here and there.

I went back and re-read your earlier messages on this thread
and I still don't understand what it is you are proposing or even
complaining about. I'd like to see a concrete example of
what you are proposing or a concrete example of something
that is wrong with the current situation.

Earlier you wrote:

> I don't see how it would be more difficult than it is now.
> Whether a new line is a prospective "end of statement"
> largely depends on what's gone before not what comes after.
> If you place (sensible) restrictions on whereabouts new lines
> can occur in a statement (i.e. those places where it can't
> possibly be the end of a statement) it would totally depend
> on what's gone before.

It already completely depends on what has gone before.
In fact it completely depends on the previous token on the line.
So I don't see how what you are proposing would improve
the matter, because we already enjoy the benefit you cited.

Any proposal that makes semicolons truly optional rather
than implied by most newlines would require unambiguous
statement endings or statement beginnings. All the solutions
I can think of along those lines introduce more noise than we
had with the semicolons.

Russ

Peter Williams

unread,
Apr 30, 2010, 1:31:36 AM4/30/10
to Steven, Ian Lance Taylor, golang-nuts
The reason I'm huffing and stomping is that the problem is fixable.
I.e. it's possible to have optional semicolons without the side effects.
Everyone's a winner.

Peter

befelemepeseveze

unread,
Apr 30, 2010, 1:36:28 AM4/30/10
to golang-nuts
On 30 Apr, 06:53, Peter Williams <pwil3...@gmail.com> wrote:

> Go is no longer as good as it could be. It has unnecessary (poorly
> documented) ...

For me, the newlines and semicolon rules are documented well (it's
simple, it's short, it's easy to understand and learn). I don't know
why the same is "poor" for someone else.

> ... restrictions on the way that code is formatted. That is the
> problem.

I'm not a Python programmer, but I've heard some rumors of it being
even restrictive on *white space* use. If Go has got a "problem" with
the semicolons and newlines, then Python should be declared "illegal"
perhaps? :)

> >http://golang.org/doc/go_spec.html#Semicolons
>
> This says nothing about the grammar being format sensitive. It just
> explains that semicolons are optional and how that was implemented.

And from that it immediately follows that Go is format
sensitive, and how, doesn't it?

> >http://golang.org/doc/go_lang_faq.html#semicolons
> >http://golang.org/doc/go_tutorial.html#tmp_35
>
> These do. But they aren't the spec. And they basically say "By the
> way, the way we implemented optional semicolons broke the grammar but we
> don't care so just suck it up.".

Does 'broken grammar' mean a bug in the specification? If so, please
report it.

befelemepeseveze

unread,
Apr 30, 2010, 1:39:42 AM4/30/10
to golang-nuts
On 30 Apr, 07:31, Peter Williams <pwil3...@gmail.com> wrote:

> The reason I'm huffing and stomping is that the problem is fixable.
> I.e. it's possible to have optional semicolons without the side effects.

I'm really only guessing now, but the guess is that such a parser would
have to be O(N^2) at best.

>   Everyone's a winner.

If the guess is right, then not.

Ian Lance Taylor

unread,
Apr 30, 2010, 1:41:16 AM4/30/10
to Peter Williams, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> Go is no longer as good as it could be. It has unnecessary (poorly
> documented) restrictions on the way that code is formatted. That is
> the problem.

This sounds like an argument for better documentation, not an argument
for changing the spec.


>> http://golang.org/doc/go_spec.html#Semicolons
>
> This says nothing about the grammar being format sensitive. It just
> explains that semicolons are optional and how that was implemented.

The spec says precisely what happens, which is the job of the spec.
The spec is not there to teach you the language, it's there to be
precise.

Your earlier proposal for changing the spec made it more complicated,
not less. That increase in complexity, which means more work for
language implementors, requires a corresponding benefit elsewhere, and
I really don't see where it is coming from.

Ian

Peter Williams

unread,
Apr 30, 2010, 1:48:07 AM4/30/10
to r...@golang.org, Ian Lance Taylor, chris dollin, Corey Thomasson, peterGo, golang-nuts
On 30/04/10 15:26, Russ Cox wrote:
>> I'd been quiet on this issue in the past as I'd assumed that the semicolon
>> injection (by the lexer) was just a temporary implementation and that a
>> proper implementation would follow in due course. I'm very disappointed
>> that that isn't the case and that Go is giving up on the finer points of its
>> design so early in its life.
>
> I think the semicolon design is fantastic. It is working really
> well for me as a programmer. It is simple enough that it is
> easy to keep in my head and has no dark corners created by
> carving out exceptions to the rule here and there.

There are exceptions. They just show up elsewhere as restrictions on
how one can format the code. What I propose would make them genuinely
exception free.

>
> I went back and re-read your earlier messages on this thread
> and I still don't understand what is is you are proposing or even
> complaining about. I'd like to see a concrete example of
> what you are proposing or a concrete example of something
> that is wrong with the current situation.
>
> Earlier you wrote:
>
>> I don't see how it would be more difficult than it is now.
>> Whether a new line is a prospective "end of statement"
>> largely depends on what's gone before not what comes after.
>> If you place (sensible) restrictions on whereabouts new lines
>> can occur in a statement (i.e. those places where it can't
>> possibly be the end of a statement) it would totally depend
>> on what's gone before.
>
> It already completely depends on what has gone before.

Yes, but the lexer doesn't know the syntactic context and has no real
idea of whether it's a good idea to inject a semicolon or not. That
this is true is demonstrated by the need to format one's code in
specific ways so as not to trigger inappropriate semicolon injections.

The parser would know whether it's a good idea to inject a semicolon or
not as it has knowledge of the syntactic context.

> In fact it completely depends on the previous token on the line.
> So I don't see how what you are proposing would improve
> the matter, because we already enjoy the benefit you cited.

Yes, but at the unnecessary cost of undesirable side effects.

>
> Any proposal that makes semicolons truly optional rather
> than implied by most newlines would require unambiguous
> statement endings or statement beginnings. All the solutions
> I can think of along those lines introduce more noise than we
> had with the semicolons.

I disagree. I would opine that the nub of the issue is that optional
semicolons make white space syntactically significant and therefore the
correct place to handle them is in the parser. There would be no more
ambiguity than there is at present.

Normally, white space is just a token separator and therefore (to all
intents and purposes) syntactically insignificant and the normal
practice of having the lexer just silently ignore it works well. With
optional semicolons this is no longer the case.

Just to reiterate my position: it's possible to have optional
semicolons without the side effects if the semicolon injection is moved
from the lexer to the parser. Sure this makes the parser more complex
but it will be worth it.

Peter


chris dollin

unread,
Apr 30, 2010, 1:53:20 AM4/30/10
to Peter Williams, r...@golang.org, Ian Lance Taylor, Corey Thomasson, peterGo, golang-nuts
On 30 April 2010 06:48, Peter Williams <pwil...@gmail.com> wrote:


>> Any proposal that makes semicolons truly optional rather
>> than implied by most newlines would require unambiguous
>> statement endings or statement beginnings.  All the solutions
>> I can think of along those lines introduce more noise than we
>> had with the semicolons.
>
> I disagree.  I would opine that the nub of the issue is that optional semicolons make white space syntactically significant and therefore the correct place to handle them is in the parser.  There would be no more ambiguity than there is at present.

Optional semicolons don't "make white space" [newlines] "syntactically
significant". They make the grammar ambiguous, or at least non-LR(1),
as previous examples show.

> Just to reiterate my position:  it's possible to have optional semicolons without the side effects if the semicolon injection is moved from the lexer to the parser.  Sure this makes the parser more complex but it will be worth it.

It makes the /grammar/ much messier. People need to understand
the grammar. Even if the parser for the messy grammar works, and
is fast enough, both of which are currently unsupported assumptions.

Chris

--
Chris "allusive" Dollin

Peter Williams

unread,
Apr 30, 2010, 1:54:01 AM4/30/10
to befelemepeseveze, golang-nuts
On 30/04/10 15:36, befelemepeseveze wrote:
> On 30 Apr, 06:53, Peter Williams<pwil3...@gmail.com> wrote:
>
>> Go is no longer as good as it could be. It has unnecessary (poorly
>> documented) ...
>
> For me, the newlines and semicolon rules are documented well (it's
> simple, it's short, it's easy to understand and learn). I don't know
> why the same is "poor" for someone else.
>
>> ... restrictions on the way that code is formatted. That is the
>> problem.
>
> I'm not a Python programmer, but I've heard some rumors of it being
> even restrictive on *white space* use. If Go has got a "problem" with
> the semicolons and newlines, then Python should be declared "illegal"
> perhaps? :)

No. Python's restrictions on white space use are part of the formal grammar and are
therefore perfectly acceptable.

>
>>> http://golang.org/doc/go_spec.html#Semicolons
>>
>> This says nothing about the grammar being format sensitive. It just
>> explains that semicolons are optional and how that was implemented.
>
> And from that immediately follows the facts that Go is format
> sensitive and how, doesn't it?

Not really.

>
>>> http://golang.org/doc/go_lang_faq.html#semicolons
>>> http://golang.org/doc/go_tutorial.html#tmp_35
>>
>> These do. But they aren't the spec. And they basically say "By the
>> way, the way we implemented optional semicolons broke the grammar but we
>> don't care so just suck it up.".
>
> Does 'broken grammar' means a bug in the specifications? If so please
> report it.

That documentation admits that the formal grammar has been compromised
(i.e. broken).

Why is everybody against getting rid of the side effects while retaining
the usefulness of optional semicolons? You lose nothing and gain
something. What's not to like about that?

Peter

Russ Cox

unread,
Apr 30, 2010, 2:02:30 AM4/30/10
to Peter Williams, befelemepeseveze, golang-nuts
> Why is everybody against getting rid of the side effects while retaining the
> usefulness of optional semicolons.  You lose nothing and gain something.
> What's not to like about that?

I'm not against anything but I can't tell what you are for.
Make a concrete proposal, or at least a concrete complaint.
You keep saying vague things like "getting rid of the side
effects while retaining the usefulness of optional semicolons".
What does that even mean? Give us details.

Russ

Peter Williams

unread,
Apr 30, 2010, 2:07:52 AM4/30/10
to befelemepeseveze, golang-nuts
On 30/04/10 15:39, befelemepeseveze wrote:
> On 30 Apr, 07:31, Peter Williams<pwil3...@gmail.com> wrote:
>
>> The reason I'm huffing and stomping is that the problem is fixable.
>> I.e. it's possible to have optional semicolons without the side effects.
>
> I'm really only guessing now, but the guess is - such parser have to
> be in O(N^2) at best.

You would be wrong. The O() complexity would be the same as it is now.

It boils down to about 3 extra productions and the addition of extra
terms to the right hand side of productions wherever white space is
legal (in the implemented grammar).

>
>> Everyone's a winner.
>
> If the guess is right, then not.

But the guess is wrong.

Peter

Andrew Gerrand

unread,
Apr 30, 2010, 2:08:19 AM4/30/10
to Peter Williams, befelemepeseveze, golang-nuts
On 30 April 2010 15:54, Peter Williams <pwil...@gmail.com> wrote:
>
> Why is everybody against getting rid of the side effects while retaining the
> usefulness of optional semicolons.  You lose nothing and gain something.
>  What's not to like about that?

We are not averse to being proven wrong. After all, we want Go to be a
useful (and well-used) language. If there are problems, they should be
fixed.

In this thread, however, you have done little more than say "It's
inconsistent and hard to understand!" to people who find it consistent and
easy to understand. You say "it could be so much better!" without
providing any context for those claims.

What we have demonstrably works well. In all the Go code I've written
since the semicolon change I've not encountered a single issue caused
by it. It's hard to accept your criticism when it is so vague, and
flies in the face of my personal experience.

Sincerely,
Andrew

Peter Williams

unread,
Apr 30, 2010, 2:15:16 AM4/30/10
to Ian Lance Taylor, golang-nuts
On 30/04/10 15:41, Ian Lance Taylor wrote:
> Peter Williams<pwil...@gmail.com> writes:
>
>> Go is no longer as good as it could be. It has unnecessary (poorly
>> documented) restrictions on the way that code is formatted. That is
>> the problem.
>
> This sounds like an argument for better documentation, not an argument
> for changing the spec.
>
>
>>> http://golang.org/doc/go_spec.html#Semicolons
>>
>> This says nothing about the grammar being format sensitive. It just
>> explains that semicolons are optional and how that was implemented.
>
> The spec says precisely what happens, which is the job of the spec.
> The spec is not there to teach you the language, it's there to be
> precise.

The spec should be saying what should happen and the implementation
should make that happen. I know that in reality things often happen the
other way but the final result should make it look like it happened in
the theoretically correct way. This part of the specification fails this
test.

>
> Your earlier proposal for changing the spec made it more complicated,
> not less.

That was intended as an illustration of how the implementation breaks
the formal grammar. In practice, I'd leave the formal grammar as it is
described in the specification and change the bit about semicolons to
say that they can be omitted where the syntactic context makes them
unnecessary (with a description of what meets this criterion). I
wouldn't mention semicolon injection as it is an implementation detail
that has no part in the specification.

I would then implement the parser with a slightly modified set of
productions that include the syntactic consideration of white space
necessary to implement optional semicolons in a side effect free way.

> That increase in complexity, which means more work for
> language implementors, requires a corresponding benefit elsewhere, and
> I really don't see where it is coming from.

Yes, it means more work for the implementers. The benefit is that you
have a cleaner language without unnecessary exceptions.

Peter

Peter Williams

unread,
Apr 30, 2010, 2:16:56 AM4/30/10
to chris dollin, r...@golang.org, Ian Lance Taylor, Corey Thomasson, peterGo, golang-nuts
On 30/04/10 15:53, chris dollin wrote:
> On 30 April 2010 06:48, Peter Williams <pwil...@gmail.com
> <mailto:pwil...@gmail.com>> wrote:
>
>
>
> Any proposal that makes semicolons truly optional rather
> than implied by most newlines would require unambiguous
> statement endings or statement beginnings. All the solutions
> I can think of along those lines introduce more noise than we
> had with the semicolons.
>
>
> I disagree. I would opine that the nub of the issue is that
> optional semicolons make white space syntactically significant and
> therefore the correct place to handle them is in the parser. There
> would be no more ambiguity than there is at present.
>
>
> Optional semicolons don't "make white space" [newlines] "syntactically
> significant". They make the grammar ambiguous, or at least non-LR(1),
> as previous examples show.

That's rubbish and if it weren't it would be true for the current
implementation as well.

Peter

Peter Williams

Apr 30, 2010, 2:18:33 AM4/30/10
to Andrew Gerrand, befelemepeseveze, golang-nuts
I've explained how it can be fixed.

Peter

chris dollin

Apr 30, 2010, 2:19:33 AM4/30/10
to Peter Williams, Ian Lance Taylor, golang-nuts
On 30 April 2010 07:15, Peter Williams <pwil...@gmail.com> wrote:
> That was intended as an illustration of how the implementation breaks
> the formal grammar. In practice, I'd leave the formal grammar as it is
> described in the specification and change the bit about semicolons to
> say that they can be omitted where the syntactic context makes them
> unnecessary (with a description of what meets this criteria).

I think you'll make the grammar ambiguous. If it's not ambiguous,
it won't be LR(1). What parsing technology do you propose?

(Counter-examples welcome -- I can use them elsewhere.)
 

chris dollin

Apr 30, 2010, 2:22:27 AM4/30/10
to Peter Williams, r...@golang.org, Ian Lance Taylor, Corey Thomasson, peterGo, golang-nuts

The current /grammar/ doesn't have optional semicolons. The
/implementation/ has the effect of "optional someplace semicolons"
with a lexical hack -- the one you don't like.

--
Chris "allusive" Dollin

befelemepeseveze

Apr 30, 2010, 2:29:55 AM4/30/10
to golang-nuts
On 30 Apr, 08:07, Peter Williams <pwil3...@gmail.com> wrote:
> You would be wrong.  The O() complexity would be the same as it is now.
>
> It boils down to about 3 extra productions and the addition of extra
> terms to the right hand side of productions wherever white space is
> legal (in the implemented grammar).

Please show an example grammar for say Go/0 language with optional
statement separators/terminators. It will help me understand where I'm
wrong.

Thanks in advance.

Peter Williams

Apr 30, 2010, 2:32:01 AM4/30/10
to r...@golang.org, befelemepeseveze, golang-nuts
At the risk of repeating myself, take the specification for function
declarations (minus the example):

----
Function declarations

A function declaration binds an identifier to a function (§Function types).

FunctionDecl = "func" identifier Signature [ Body ] .
Body = Block.

A function declaration may omit the body. Such a declaration provides
the signature for a function implemented outside Go, such as an assembly
routine.
----

This specification does not correctly reflect what has been implemented
because the way semicolons are injected (with gay abandon instead of
with care) makes the (implied) optional white space between identifier
and Signature and between Signature and Body special in that it cannot
contain new lines. Elsewhere (but not everywhere) in the spec implied
white space can contain new lines. This limitation on the white space
in this documentation is not mentioned in the spec so this production is
broken.

If the injection of the semicolons is done in the parser (instead of the
lexer) this limitation on the use of new lines in those pieces of white
space can be avoided.

Do you see what I'm getting on about now?
Peter

Peter Williams

Apr 30, 2010, 2:34:05 AM4/30/10
to chris dollin, Ian Lance Taylor, golang-nuts
On 30/04/10 16:19, chris dollin wrote:
> On 30 April 2010 07:15, Peter Williams <pwil...@gmail.com
> <mailto:pwil...@gmail.com>> wrote:
>
> That was intended as an illustration of how the implementation
> breaks the formal grammar. In practice, I'd leave the formal
> grammar as it is described in the specification and change the bit
> about semicolons to say that they can be omitted where the syntactic
> context makes them unnecessary (with a description of what meets
> this criteria).
>
>
> I think you'll make the grammar ambiguous. If it's not ambiguous,
> it won't be LR(1). What parsing technology do you propose?
>
> (Counter-examples welcome -- I can use them elsewhere.)
> Chris

As I said elsewhere it won't make the language any more ambiguous than
it already is. If the current implementation of optional semicolons
doesn't make Go ambiguous then neither will an alternative implementation.

Peter

Russ Cox

Apr 30, 2010, 2:44:14 AM4/30/10
to Peter Williams, befelemepeseveze, golang-nuts
> Do you see what I'm getting on about now?

I think I've narrowed it down to two possibilities.

1. Perhaps you don't believe the lexing process is part of the
spec and simply want a clearer reflection of the semicolon
rules in the grammar.

We could rewrite every grammar rule in the spec to say
where newlines can and cannot serve as ordinary space,
but that would be more text and more error prone than the
current, simple rule: newlines, unless they follow a small set
of continuation tokens, turn into semicolons.

And suppose someone did go through and edit the spec to
annotate all the places where newlines can and cannot be.
What then? It's still exactly the same language, just with
a more error prone spec.

2. Perhaps you want to change the language, making the
rules about which newlines turn into semicolons more
sophisticated.

There's already a language that does this: JavaScript.
This is one of the frequently-cited problems with JavaScript,
because users cannot easily look at a program and tell
whether a particular newline is a semicolon or just white space.
In Go we intentionally made the rule as simple as possible:
the answer depends only on the token (often the single character)
at the end of the line.

Russ
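
The rule Russ describes can be written down as a single predicate on the last token of a line. What follows is only a minimal sketch, assuming the token list quoted from the spec elsewhere in this thread; it uses today's go/scanner package merely to find the last token, and newlineEndsStatement is a name made up for the illustration, not the compiler's own code.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// newlineEndsStatement reports whether a newline after line would be
// turned into a semicolon, judging only by the line's final token.
// (Hypothetical helper; the token list follows the spec text quoted
// in this thread.)
func newlineEndsStatement(line string) bool {
	fset := token.NewFileSet()
	file := fset.AddFile("line.go", fset.Base(), len(line))
	var s scanner.Scanner
	s.Init(file, []byte(line), nil, 0)

	last := token.ILLEGAL
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		// Skip semicolons the scanner inserts itself; only tokens
		// that are really written on the line count.
		if tok == token.SEMICOLON && lit == "\n" {
			continue
		}
		last = tok
	}

	switch last {
	case token.IDENT,
		token.INT, token.FLOAT, token.CHAR, token.STRING,
		token.BREAK, token.CONTINUE, token.FALLTHROUGH, token.RETURN,
		token.INC, token.DEC, token.RPAREN, token.RBRACK, token.RBRACE:
		return true
	}
	return false
}

func main() {
	lines := []string{
		`"world"`,     // last element of a list, no trailing comma
		`"hello",`,    // trailing comma
		`s := "I" +`,  // operator at the end of the line
		`hw := []string{`,
	}
	for _, l := range lines {
		fmt.Printf("%-18q newline becomes semicolon: %v\n", l, newlineEndsStatement(l))
	}
}
```

The predicate never looks at the following line, which is the property being argued about: a reader can decide the question from the end of the line alone.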

befelemepeseveze

Apr 30, 2010, 2:46:30 AM4/30/10
to golang-nuts
On 30 Apr, 08:32, Peter Williams <pwil3...@gmail.com> wrote:
> At the risk of repeating myself, take the specification for function
> declarations (minus the example):
>
> ----
> Function declarations
>
> A function declaration binds an identifier to a function (§Function types).
>
> FunctionDecl = "func" identifier Signature [ Body ] .
> Body         = Block.
>
> A function declaration may omit the body. Such a declaration provides
> the signature for a function implemented outside Go, such as an assembly
> routine.
> ----
>
> This specification does not correctly reflect what has been implemented
> because the way semicolons are injected (with gay abandon instead of
> with care) makes the (implied) optional white space between identifier
> and Signature and between Signature and Body special in that it cannot
> contain new lines.  Elsewhere (but not everywhere) in the spec implied
> white space can contain new lines.  This limitation on the white space
> in this documentation is not mentioned in the spec so this production is
> broken.

Actually that limitation *is* in the specs and the production is thus
*not* broken:
http://golang.org/doc/go_spec.html#Semicolons

Reading the specs reveals that a newline between, say, 'identifier' and
'Signature' would be seen by the parser as []token{kwd_func, identifier,
semicolon, ...}, and that doesn't fit the spec's
'FunctionDecl = "func" identifier Signature [ Body ] .'

> Do you see what I'm getting on about now?

No. Not at all.
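
The token stream the parser receives can be inspected directly. The sketch below assumes today's go/scanner package behaves like the rule in the Semicolons section (it may differ in detail from the 2010 compilers), and tokens is a hypothetical helper written for this illustration. Going by the quoted rule, "func" is not in the list of line-ending tokens, so a newline right after the keyword should insert nothing, while a newline after the identifier should.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokens returns the token stream go/scanner produces for src,
// including any semicolons it inserts at newlines.
// (Hypothetical helper for illustration.)
func tokens(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	var out []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			return out
		}
		out = append(out, tok.String())
	}
}

func main() {
	fmt.Println(tokens("func\nfoo()"))  // newline after the keyword
	fmt.Println(tokens("func foo\n()")) // newline after the identifier
}
```

In the second stream a ";" should appear between the identifier and the "(", which is what stops the text from matching the FunctionDecl production.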

Peter Williams

Apr 30, 2010, 3:02:11 AM4/30/10
to befelemepeseveze, golang-nuts
OK. Take the following formal productions from the spec:

----
FunctionDecl = "func" identifier Signature [ Body ] .
Body = Block.
StructType = "struct" "{" { FieldDecl ";" } "}" .
----

They would become:

----
FunctionDecl = "func" identifier Signature [ Body ] .
Body = Block.
StructType = "struct" "{" { FieldDecl StatementTerminator } "}" .
StatementTerminator = ( ";" | TerminatingWhiteSpace )
TerminatingWhiteSpace = [\t ]*\n[\t\n ]*
----

The implemented grammar would be slightly messier with addition of a
generalized white space production:

----
OptionalWhiteSpace = ( TerminatingWhiteSpace | [\t\n ]* )
----

And the others would be expanded to become:

----
FunctionDecl = "func" identifier OptionalWhiteSpace Signature [ Body ] .
Body = ( StatementTerminator | OptionalWhiteSpace Block ).
StructType = "struct" "{" { FieldDecl StatementTerminator } "}" .
StatementTerminator = ( ";" | TerminatingWhiteSpace )
TerminatingWhiteSpace = [\t ]*\n[\t\n ]*
----

There's no need to say anything about white space between "func" and
identifier as that's how the lexer determines "func" to be a token.

The effect of the above would be that the equivalent of semicolon
injection would occur as per now inside the field declaration component
of a StructType but not after the identifier or Signature in a FunctionDecl.

I hope that this simplified example helps to clarify my proposal?

Peter

Peter Williams

Apr 30, 2010, 3:18:27 AM4/30/10
to r...@golang.org, befelemepeseveze, golang-nuts
On 30/04/10 16:44, Russ Cox wrote:
>> Do you see what I'm getting on about now?
>
> I think I've narrowed it down to two possibilities.
>
> 1. Perhaps you don't believe the lexing process is part of the
> spec and simply want a clearer reflection of the semicolon
> rules in the grammar.
>
> We could rewrite every grammar rule in the spec to say
> where newlines can and cannot serve as ordinary space,
> but that would be more text and more error prone than the
> current, simple rule: newlines, unless they follow a small set
> of continuation tokens, turn into semicolons.
>
> And suppose someone did go through and edit the spec to
> annotate all the places where newlines can and cannot be.
> What then? It's still exactly the same language, just with
> a more error prone spec.

I would argue that it would be less error prone.

>
> 2. Perhaps you want to change the language, making the
> rules about which newlines turn into semicolons more
> sophisticated.

I guess I could be accused of that.

>
> There's already a language that does this: JavaScript.
> This is one of the frequently-cited problems with JavaScript,
> because users cannot easily look at a program and tell
> whether a particular newline is a semicolon or just white space.
> In Go we intentionally made the rule as simple as possible:

I'd argue that you've actually made it more complex than it needs to be.

> the answer depends only on the token (often the single character)
> at the end of the line.

This assumption is what causes the current implementation to be less
than ideal.

----
Not really either of the above but something like that.

I want to change the implementation. I'd leave the formal specification
much the same as it is except for changes to indicate where semicolons
can be omitted (just a few spots really). White space would remain
implied just as it is now.

For implementation, I would have a slightly expanded grammar
specification to handle the fact that white space is now syntactically
significant, modify the lexer to pass on white space data to the parser
where appropriate and we're singing and dancing.

Peter
PS I think that you would be surprised at how few productions in the
grammar as it is now defined have semicolons that are truly optional.

Russ Cox

Apr 30, 2010, 3:29:21 AM4/30/10
to Peter Williams, befelemepeseveze, golang-nuts
> PS I think that you would be surprised at how few productions in the grammar
> as it is now defined have semicolons that are truly optional.

I don't believe any of them are optional, except
before } or ). That's the point: it's very regular.
And it happens that you can type a semicolon by
hitting the "enter/return" key on your keyboard.

Russ

befelemepeseveze

Apr 30, 2010, 3:53:44 AM4/30/10
to golang-nuts
On 30 Apr, 09:02, Peter Williams <pwil3...@gmail.com> wrote:
> OK. Take the following formal productions from the spec:
>
> ----
> FunctionDecl = "func" identifier Signature [ Body ] .
> Body = Block.
> StructType = "struct" "{" { FieldDecl ";" } "}" .
> ----
>
> They would become:
>
> ----
> FunctionDecl = "func" identifier Signature [ Body ] .
> Body = Block.
> StructType = "struct" "{" { FieldDecl StatementTerminator } "}" .
> StatementTerminator = ( ";" | TerminatingWhiteSpace )
> TerminatingWhiteSpace = [\t ]*\n[\t\n ]*
> ----
>
> The implemented grammar would be slightly messier with addition of a
> generalized white space production:
>
> ----
> OptionalWhiteSpace = ( TerminatingWhiteSpace | [\t\n ]* )
> ----

Rewriting to disambiguate '[' and ']' meaning (optional production
part vs regexp charset):

TerminatingWhiteSpace = { '\t' | ' ' } '\n' { '\t' | '\n' | ' ' } .
OptionalWhiteSpace = TerminatingWhiteSpace | { '\t' | '\n' | ' ' } .

Plugging TerminatingWhiteSpace into OptionalWhiteSpace, we get:

OptionalWhiteSpace = { '\t' | ' ' } '\n' { '\t' | '\n' | ' ' } |
{ '\t' | '\n' | ' ' } .

Factored it is:

OptionalWhiteSpace = [ { '\t' | ' ' } '\n' ] { '\t' | '\n' | ' ' } .

which at a glance gives me little idea of what exactly it says, in
contrast with the current simple rules, which are understood at first
look. On a second look it seems to me to be equivalent to:

OptionalWhiteSpace = { '\t' | '\n' | ' ' } .

i.e. [\t\n ]* in the proposal's notation.

And this is surprising: why is it the same as the current lexer's/
tokenizer's "production" for optional white space (sans \r)?
If it really is the same:
- why does it get promoted from the tokenizer spec to the (parser)
grammar?
- what's the purpose of OptionalWhiteSpace in 'FunctionDecl = "func"
identifier OptionalWhiteSpace Signature [ Body ] .' if:

> There's no need to say anything between white space between "func" and
> identifier as that's how the lexer determines "func" to be a token.

In any case, thanks for formalizing your proposal. I'll be looking
into it a bit more.

Regards,

-bflm

roger peppe

Apr 30, 2010, 4:50:25 AM4/30/10
to Peter Williams, befelemepeseveze, golang-nuts
On 30 April 2010 08:02, Peter Williams <pwil...@gmail.com> wrote:
> OptionalWhiteSpace = ( TerminatingWhiteSpace | [\t\n ]* )

i presume you mean
OptionalWhiteSpace= { TerminatingWhiteSpace }

> FunctionDecl = "func" identifier OptionalWhiteSpace Signature [ Body ] .
> Body         = ( StatementTerminator | OptionalWhiteSpace Block ).
> StructType     = "struct" "{" { FieldDecl StatementTerminator } "}" .
> StatementTerminator = ( ";" | TerminatingWhiteSpace )
> TerminatingWhiteSpace = [\t ]*\n[\t\n ]*

> There's no need to say anything between white space between "func" and
> identifier as that's how the lexer determines "func" to be a token.

i *think* what you're saying there is that the lexer knows about some tokens
and does not return a TerminatingWhiteSpace token after those.
presumably those include all tokens that are currently *not* dealt
with by the semicolon injection rules.

so by going this route you're still presuming the same level
of lexer hackery (which i think is what you don't like) while
making the grammar considerably more complex.

all for the sake of being able to do:

for i := 0; i < n; i++
{
foo
}

is it really worth it?

especially as you'd *still* have to put the braces on the end
of a line in some cases, e.g. function declarations:

func foo() int
{
}

the lexer must return a TerminatingWhiteSpace after the int,
which makes the first line into a function declaration without
a body, but then the brace becomes a syntax error.

> I hope that this simplified example helps to clarify my proposal?

clarified, but not justified, i'm afraid.

if you really want free formatting in go code, hack your version of the
compiler to turn off the semicolon injection - then you can format
your code however you like... but you'll have to write all those
pesky semicolons again :-)

rog.
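
The two layouts rog. discusses can be checked against a parser. This is a small sketch using today's go/parser, on the assumption that it still implements the insertion rule quoted in this thread; parses is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parses reports whether src is a syntactically valid Go file.
// (Hypothetical helper for illustration.)
func parses(src string) bool {
	_, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	return err == nil
}

func main() {
	// Function declaration with the brace on the same line and on the next.
	funcSame := "package p\nfunc foo() int {\n\treturn 0\n}\n"
	funcNext := "package p\nfunc foo() int\n{\n\treturn 0\n}\n"

	// For loop with the brace on the same line and on the next.
	forSame := "package p\nfunc f(n int) {\n\tfor i := 0; i < n; i++ {\n\t}\n}\n"
	forNext := "package p\nfunc f(n int) {\n\tfor i := 0; i < n; i++\n\t{\n\t}\n}\n"

	fmt.Println("func, brace on same line:", parses(funcSame))
	fmt.Println("func, brace on next line:", parses(funcNext))
	fmt.Println("for,  brace on same line:", parses(forSame))
	fmt.Println("for,  brace on next line:", parses(forNext))
}
```

In the function case the first line on its own is a complete declaration without a body, so the brace that follows has nothing to attach to.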

Peter Williams

Apr 30, 2010, 5:26:49 AM4/30/10
to Ian Lance Taylor, golang-nuts
On 30/04/10 15:41, Ian Lance Taylor wrote:
How about I provide some patches? That should take the pain out of the
implementation costs.

Whereabouts in https://go.googlecode.com/hg/ do I find the lexer and the
parser (yacc source or equivalent preferably)?

Peter


befelemepeseveze

Apr 30, 2010, 5:41:06 AM4/30/10
to golang-nuts
On 30 Apr, 11:26, Peter Williams <pwil3...@gmail.com> wrote:
> How about I provide some patches?  That should take the pain out of the
> implementation costs.
>
> Whereabouts in https://go.googlecode.com/hg/ do I find the lexer and the
> parser (yacc source or equivalent preferably)?

Hope this helps:
http://www.google.com/codesearch?q=semicolon+package%3Ahttp%3A%2F%2Fgo\.googlecode\.com&origq=semicolon&btnG=Search+Trunk

Peter Williams

Apr 30, 2010, 6:01:06 AM4/30/10
to roger peppe, befelemepeseveze, golang-nuts
On 30/04/10 18:50, roger peppe wrote:
> On 30 April 2010 08:02, Peter Williams<pwil...@gmail.com> wrote:
>> OptionalWhiteSpace = ( TerminatingWhiteSpace | [\t\n ]* )
>
> i presume you mean
> OptionalWhiteSpace= { TerminatingWhiteSpace }

No. TerminatingWhiteSpace has to contain at least one newline.

>
>> FunctionDecl = "func" identifier OptionalWhiteSpace Signature [ Body ] .
>> Body = ( StatementTerminator | OptionalWhiteSpace Block ).
>> StructType = "struct" "{" { FieldDecl StatementTerminator } "}" .
>> StatementTerminator = ( ";" | TerminatingWhiteSpace )
>> TerminatingWhiteSpace = [\t ]*\n[\t\n ]*
>
>> There's no need to say anything between white space between "func" and
>> identifier as that's how the lexer determines "func" to be a token.
>
> i *think* what you're saying there is that the lexer knows about some tokens
> and does not return a TerminatingWhiteSpace token after those.
> presumably those include all tokens that are currently *not* dealt
> with by the semicolon injection rules.

Yes. I've been thinking about this some more while watching the
football and I think I made it too complex. I'll put a revised version
at the end.

>
> so by going this route you're still presuming the same level
> of lexer hackery (which i think is what you don't like) while
> making the grammar considerably more complex.

Yes. Except that under my revised model (after thinking during
football) I would change the lexer to send "\n" at those points where it
now injects a ";" (emphasis on the injects).

>
> all for the sake of being able to do:
>
> for i := 0; i < n; i++
> {
> foo
> }

Depending on where you put TerminatingWhiteSpace in the grammar:

for i := 0
i < n
i++
{
  foo
}

So this mod would give the option of disallowing that formatting if you
wished. At the moment, as I interpret the Semicolon section in the
spec, you'd get away with:

for i := 0
i < n
i++ {
  foo
}

if you wanted to. Personally, I wouldn't. But I find it strange that
you could do that but not:

for i := 0; i < n; i++
{
  foo
}

> is it really worth it?

Not necessarily for those specific examples but in general, yes.

>
> especially as you'd *still* have to put the braces on the end
> of a line in some cases, e.g. function declarations:
>
> func foo() int
> {
> }
>
> the lexer must return a TerminatingWhiteSpace after the int,
> which makes the first line into a function declaration without
> a body, but then the brace becomes a syntax error.

Maybe. I'd have to look at it further. This was only an example and
not meant to be perfect. I was probably being over officious in my
specification. (I'll work on that.)

>
>> I hope that this simplified example helps to clarify my proposal?
>
> clarified, but not justified, i'm afraid.

I was only trying to clarify :-)

>
> if you really want free formatting in go code, hack your version of the
> compiler to turn off the semicolon injection - then you can format
> your code however you like... but you'll have to write all those
> pesky semicolons again :-)

I've decided to have a crack at implementing my idea in full. Keep your
eye out for some patches.

Revised and simplified example. Take the following formal productions
from the spec:

----
FunctionDecl = "func" identifier Signature [ Body ] .
Body = Block.
StructType = "struct" "{" { FieldDecl ";" } "}" .
----

They would become:

----
FunctionDecl = "func" identifier Signature [ Body ] .
Body = Block.
StructType = "struct" "{" { FieldDecl ( ";" | "\n" ) } "}" .
----

And the implementation version would be expanded to become:

----
FunctionDecl = "func" identifier [ "\n" ] Signature [ "\n" ] [ Body ] .
Body = Block.
StructType = "struct" "{" { FieldDecl ( ";" | "\n" ) } "}" .
----

I think (but may be wrong) this also fixes your point above where I
responded "maybe".

As mentioned above, the lexer is then modified to return "\n" where it
would have injected ";" and sends ";" where they actually occur.

Peter
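
The distinction Peter's revised lexer relies on, a written ";" versus a newline that stands in for one, can be observed with today's go/scanner package, which reports an inserted semicolon with the literal "\n" and a written one with ";". The sketch below only illustrates that distinction and is not Peter's patch; semicolons is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// semicolons scans src and counts the semicolons written in the source
// separately from the ones the scanner inserted at a newline or at EOF.
// (Hypothetical helper for illustration.)
func semicolons(src string) (explicit, inserted int) {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			return explicit, inserted
		}
		if tok == token.SEMICOLON {
			if lit == "\n" {
				inserted++ // produced by the insertion rule
			} else {
				explicit++ // typed by the programmer
			}
		}
	}
}

func main() {
	src := "struct {\n\ta int;\n\tb int\n}\n"
	explicit, inserted := semicolons(src)
	fmt.Println("explicit:", explicit, "inserted:", inserted)
}
```

A parser fed this stream could, in principle, treat the two kinds differently, which is the extra information the proposal wants to pass along.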

Peter Williams

Apr 30, 2010, 6:04:57 AM4/30/10
to befelemepeseveze, golang-nuts
It does.

Thanks
Peter

roger peppe

Apr 30, 2010, 7:09:07 AM4/30/10
to Peter Williams, befelemepeseveze, golang-nuts
On 30 April 2010 11:01, Peter Williams <pwil...@gmail.com> wrote:
> On 30/04/10 18:50, roger peppe wrote:
>>
>> On 30 April 2010 08:02, Peter Williams<pwil...@gmail.com>  wrote:
>>>
>>> OptionalWhiteSpace = ( TerminatingWhiteSpace | [\t\n ]* )
>>
>> i presume you mean
>> OptionalWhiteSpace= { TerminatingWhiteSpace }
>
> No. TerminatingWhiteSpace has to contain at least one newline.

indeed. that's why i presumed you wanted OptionalWhiteSpace
to represent zero or more occurrences of new line or white space.

> I've decided to have a crack at implementing my idea in full.  Keep your eye
> out for some patches.

good luck.

> FunctionDecl = "func" identifier [ "\n" ] Signature [ "\n" ] [ Body ] .

i think you'll find that's ambiguous.
for instance, here's a little yacc grammar to demonstrate.

%term IDENT FUNC
%%
decl: fndecl term
fndecl: FUNC IDENT optnl signature optnl optbody
optbody:
| '{' '}'
optnl:
| optnl '\n'
term:
| term ';'
| term '\n'
signature: '(' ')'

if you run yacc on that, you'll find it generates a shift/reduce
conflict, and that's because when it gets to the optnl
after the signature, it doesn't know whether it's a statement
terminator or a spacer between the function signature
and its body.

there is a fundamental ambiguity there that can't be resolved
with a LR(1) parser, i believe, unless you hack the lexer
even more, and that's the much disliked javascript approach.

Marko Macek

Apr 30, 2010, 10:07:33 AM4/30/10
to golang-nuts
On Apr 29, 11:48 am, chris dollin <ehog.he...@googlemail.com> wrote:

> `set` and `call`, so that every statement started with a keyword and
> we didn't need semicolons or newlines as terminators or separators.

I'd actually prefer something like this (only 'call' is probably needed)
to newline sensitivity, if golang is deviating from C style syntax. Maybe
as an operator, \func or the like.


Ian Lance Taylor

Apr 30, 2010, 10:18:46 AM4/30/10
to Peter Williams, golang-nuts
Peter Williams <pwil...@gmail.com> writes:

> On 30/04/10 15:41, Ian Lance Taylor wrote:
>>
>> The spec says precisely what happens, which is the job of the spec.
>> The spec is not there to teach you the language, it's there to be
>> precise.
>
> The spec should be saying what should happen and the implementation
> should make that happen. I know that in reality things often happen
> the other way but the final result should make it look like it
> happened in the theoretically correct way. This part of the
> specification fails this test.

I can not find any way of reading this paragraph that makes the last
sentence correct.


>> Your earlier proposal for changing the spec made it more complicated,
>> not less.
>
> That was intended as an illustration of how the implementation breaks
> the formal grammar. In practice, I'd leave the formal grammar as it
> is described in the specification and change the bit about semicolons
> to say that they can be omitted where the syntactic context makes them
> unnecessary (with a description of what meets this criteria). I
> wouldn't mention semicolon injection as it is an implementation detail
> that has no part in the specification.

Semicolon injection is in no way an implementation detail. It is part
of the language. Removing semicolon injection without changing the
formal grammar would give you a different language.


>> That increase in complexity, which means more work for
>> language implementors, requires a corresponding benefit elsewhere, and
>> I really don't see where it is coming from.
>
> Yes, it means more work for the implementers. The benefit is that you
> have a cleaner language without unnecessary exceptions.

For whom is that a benefit?

When looking at this kind of thing, don't forget cases like
a := T
{}
The parser does not know whether T is a type. Is this an assignment
with a composite literal on the right hand side, or is it an
assignment of a single named value followed by a block? There is no
ambiguity in the current language spec.

Ian
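
Ian's example can be made concrete by asking a parser how many statements it sees in each layout. A minimal sketch, assuming today's go/parser (which parses without knowing whether T is a type); stmtKinds is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// stmtKinds parses body as the body of a function and returns the
// dynamic type of each top-level statement in it, or nil on a
// syntax error. (Hypothetical helper for illustration.)
func stmtKinds(body string) []string {
	src := "package p\nfunc f() {\n" + body + "\n}\n"
	file, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	if err != nil {
		return nil
	}
	fn := file.Decls[0].(*ast.FuncDecl)

	var kinds []string
	for _, s := range fn.Body.List {
		kinds = append(kinds, fmt.Sprintf("%T", s))
	}
	return kinds
}

func main() {
	fmt.Println(stmtKinds("a := T{}"))   // brace on the same line
	fmt.Println(stmtKinds("a := T\n{}")) // brace on the next line
}
```

With the brace on the same line there should be one assignment whose right hand side is a composite literal; with the brace on the next line the newline ends the assignment and the braces form a separate, empty block.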

James Fisher

Apr 30, 2010, 3:29:47 PM4/30/10
to golan...@googlegroups.com
(Apologies to Ian to whom I emailed this earlier rather than to the
mailing list.)

As I suggested earlier, using terminology like "statement terminator"
would make things much clearer than terminology like "semicolon
injection". Your debate over specification/implementation could be
solved here. I think what Peter is saying is that "the Go
implementation injects semicolons as part of its implementation of the
Go specification on how to interpret linebreaks".

In one sense, Peter's right: a Go lexer/parser/compiler suite could be
written that makes the exact same transformation between source code
and executable binary, without ever specifically doing something like
"injecting semicolons" in some intermediate stage.

In another, Ian's right: the Go specification
(http://golang.org/doc/go_spec.html#Semicolons), when formalizing the
exact meaning of a newline in its context, does so by describing one
possible approach: semicolon injection.

Here is the full wording of the specification:

---
The formal grammar uses semicolons ";" as terminators in a number of
productions. Go programs may omit most of these semicolons using the
following two rules:

1. When the input is broken into tokens, a semicolon is
automatically inserted into the token stream at the end of a non-blank
line if the line's final token is
* an identifier
* an integer, floating-point, character, or string literal
* one of the keywords break, continue, fallthrough, or return
* one of the operators and delimiters ++, --, ), ], or }
2. To allow complex statements to occupy a single line, a semicolon
may be omitted before a closing ")" or "}".

To reflect idiomatic use, code examples in this document elide
semicolons using these rules.
---

Now, I suggest the specification could be changed to something like
the following, while remaining exactly the same semantically (i.e. the
specification as a whole still specifies the exact same one-way
transformation between source code and executable binary).

---
The semicolon is Go's explicit non-context-sensitive statement terminator.

Go also includes a second statement terminator, the newline character.
The newline is context-sensitive, and is interpreted as a terminator
if the line that it ends is non-blank and its final token is one of:
* an identifier
* an integer, floating-point, character, or string literal
* one of the keywords break, continue, fallthrough, or return
* one of the operators and delimiters ++, --, ), ], or }

To reflect idiomatic use, code examples in this document elide
semicolons using these rules.
---

This does not yet include rule two ('To allow complex
statements to occupy a single line, a semicolon may be omitted before
a closing ")" or "}"'). This doesn't seem to have been discussed at
all in this thread. Would I be wrong in saying that the ")" and "}"
are in fact treated the same as the newline? i.e., if any of the
above listed tokens precede the closing parenthesis/brace, it has a
statement termination injected before it?
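
One way to probe this question is to look at the token stream just before a closing brace. The sketch below assumes today's go/scanner and go/parser, which need not match the 2010 compilers in every detail; semicolonBeforeBrace and parses are hypothetical helper names.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/scanner"
	"go/token"
)

// semicolonBeforeBrace reports whether the token stream for src has a
// semicolon (written or inserted) directly before its first "}".
// (Hypothetical helper for illustration.)
func semicolonBeforeBrace(src string) bool {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	prev := token.ILLEGAL
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			return false
		}
		if tok == token.RBRACE {
			return prev == token.SEMICOLON
		}
		prev = tok
	}
}

// parses reports whether src is a syntactically valid Go file.
func parses(src string) bool {
	_, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	return err == nil
}

func main() {
	oneLine := "package p\nfunc f() int { return 1 }\n"
	multiLine := "package p\nfunc f() int {\n\treturn 1\n}\n"

	fmt.Println("one line:   semicolon before }:", semicolonBeforeBrace(oneLine), "parses:", parses(oneLine))
	fmt.Println("multi line: semicolon before }:", semicolonBeforeBrace(multiLine), "parses:", parses(multiLine))
}
```

In these packages nothing is injected before the "}" in the one-line form; the parser simply accepts the terminator being absent there, which reads more like rule two being a relaxation in the grammar than a second injection rule.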

befelemepeseveze

Apr 30, 2010, 5:06:06 PM4/30/10
to golang-nuts
On 30 Apr, 21:29, James Fisher <jameshfis...@gmail.com> wrote:
> Now, I suggest the specification could be changed to something like
> the following, while remaining exactly the same semantically (i.e. the
> specification as a whole still specifies the exact same one-way
> transformation between source code and executable binary).
>
> ---
> The semicolon is Go's explicit non-context-sensitive statement terminator.
>
> Go also includes a second statement terminator, the newline character.
>  The newline is context-sensitive, and is interpreted as a terminator
> if the line that it ends is non-blank and its final token is one of:
>  * an identifier
>  * an integer, floating-point, character, or string literal
>  * one of the keywords break, continue, fallthrough, or return
>  * one of the operators and delimiters ++, --, ), ], or }
>
> To reflect idiomatic use, code examples in this document elide
> semicolons using these rules.
> ---

'Statement' is not the correct word in this proposal. That would
change the current semantics as the semicolon injection applies not
only to statements.

http://golang.org/doc/go_spec.html#Statements

E.g. that will miss function declaration which is not a statement.

Additionally, I can understand the original definition immediately and
in full. I can't say the same about the proposed one (leaving aside that
it's not even correct).

IMO this proposal would make things worse than they are ATM and
provides nothing in return.

Peter Williams

Apr 30, 2010, 6:24:40 PM4/30/10
to roger peppe, befelemepeseveze, golang-nuts
After more thinking, more simplification and revision.

On 30/04/10 20:01, Peter Williams wrote:
> Revised and simplified example. Take the following formal productions
> from the spec:
>
> ----
> FunctionDecl = "func" identifier Signature [ Body ] .
> Body = Block.
> StructType = "struct" "{" { FieldDecl ";" } "}" .
> ----
>
> They would become:

They would stay the same. I.e. no change to the specification document
except for rewording the Semicolon section to add "where appropriate"
riders to the description of ";" injection.

>
> ----
> FunctionDecl = "func" identifier Signature [ Body ] .
> Body = Block.
> StructType = "struct" "{" { FieldDecl ( ";" | "\n" ) } "}" .
> ----

This is no longer done and this avoids the unfortunate fact that it
would have formally disallowed ";" at the end of a line. I say formally
because in practice the implementation I described would have allowed it.

>
> And the implementation version would be expanded to become:
>
> ----
> FunctionDecl = "func" identifier [ "\n" ] Signature [ "\n" ] [ Body ] .
> Body = Block.
> StructType = "struct" "{" { FieldDecl ( ";" | "\n" ) } "}" .
> ----
>
> I think (but may be wrong) this also fixes your point above where I
> responded "maybe".
>
> As mentioned above, the lexer is then modified to return "\n" where it
> would have injected ";" and sends ";" where they actually occur.

This implementation part stays mostly the same and the whole thing is
now just smarter ";" injection rather than a syntax overhaul.

Peter

Russ Cox

Apr 30, 2010, 6:28:47 PM4/30/10
to Peter Williams, roger peppe, befelemepeseveze, golang-nuts
Are you still proposing changes to the implementation now,
or just changes to the spec? (Sorry, but it is hard for me to
follow the chain of diffs.)

Russ

Peter Williams

Apr 30, 2010, 7:52:42 PM4/30/10
to roger peppe, befelemepeseveze, golang-nuts
On 30/04/10 21:09, roger peppe wrote:
> On 30 April 2010 11:01, Peter Williams<pwil...@gmail.com> wrote:
>> On 30/04/10 18:50, roger peppe wrote:
>>>
>>> On 30 April 2010 08:02, Peter Williams<pwil...@gmail.com> wrote:
>>>>
>>>> OptionalWhiteSpace = ( TerminatingWhiteSpace | [\t\n ]* )
>>>
>>> i presume you mean
>>> OptionalWhiteSpace= { TerminatingWhiteSpace }
>>
>> No. TerminatingWhiteSpace has to contain at least one newline.
>
> indeed. that's why i presumed you wanted OptionalWhiteSpace
> to represent zero or more occurrences of new line or white space.
>
>> I've decided to have a crack at implementing my idea in full. Keep your eye
>> out for some patches.
>
> good luck.
>
>> FunctionDecl = "func" identifier [ "\n" ] Signature [ "\n" ] [ Body ] .
>
> i think you'll find that's ambiguous.
> for instance, here's a little yacc grammar to demonstrate.
>
> %term IDENT FUNC
> %%
> decl: fndecl term
> fndecl: FUNC IDENT optnl signature optnl optbody
> optbody:
> | '{' '}'
> optnl:
> | optnl '\n'

This should be:

optnl: | '\n'

> term:
> | term ';'
> | term '\n'

This needs to be only:

term: ';' | '\n'

> signature: '(' ')'
>
> if you run yacc on that, you'll find it generates a shift/reduce
> conflict, and that's because when it gets to the optnl
> after the signature, it doesn't know whether it's a statement
> terminator or a spacer between the function signature
> and its body.
>
> there is a fundamental ambiguity there that can't be resolved
> with a LR(1) parser, i believe, unless you have the lexer
> do even more, and that's the much disliked javascript approach.

I think my change above fixes this problem provided that the lexer only
sends a ';' or a '\n' and not both (which would be the case if the only
change to the current lexer is to inject '\n' instead of ';'). The
parser won't see every new line in the source but only the ones where
the lexer thinks a ';' should be injected.

We'll have to wait and see if it works when I get down and dirty with
the implementation.

Here's hoping
Peter

Peter Williams

Apr 30, 2010, 7:55:38 PM4/30/10
to r...@golang.org, roger peppe, befelemepeseveze, golang-nuts
Minor change to the spec:

diff --git a/doc/go_spec.html b/doc/go_spec.html
--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -181,9 +181,8 @@ using the following two rules:
<ol>
<li>
<p>
-When the input is broken into tokens, a semicolon is automatically inserted
-into the token stream at the end of a non-blank line if the line's final
-token is
+A semicolon may be omitted if it would occur at the end of a line whose
+final token is
</p>
<ul>
<li>an identifier

And then modify the lexer and parser to do smarter ';' injection based
on my earlier proposal.

Peter


befelemepeseveze

May 1, 2010, 1:18:41 AM5/1/10
to golang-nuts
On 1 May, 01:55, Peter Williams <pwil3...@gmail.com> wrote:

> -When the input is broken into tokens, a semicolon is automatically inserted
> -into the token stream at the end of a non-blank line if the line's final
> -token is

This communicates: Except for a few constructs, e.g. 'for', you don't
have to worry or think about semicolons anymore, as long as you don't
format your code in an unsupported way.

> +A semicolon may be omitted if it would occur at the end of a line whose
> +final token is

And this: Semicolons are everywhere as in e.g. C. Sometimes you don't
have to write them if you learn some additional rules.

I prefer the former.

Peter Williams

May 1, 2010, 3:05:10 AM5/1/10
to befelemepeseveze, golang-nuts
On 01/05/10 15:18, befelemepeseveze wrote:
> On 1 May, 01:55, Peter Williams<pwil3...@gmail.com> wrote:
>
>> -When the input is broken into tokens, a semicolon is automatically inserted
>> -into the token stream at the end of a non-blank line if the line's final
>> -token is
>
> This communicates: Except for few constructs e.g. 'for', you don't
> have to worry/think about semicolons anymore if you don't format your
> code in an unsupported way.

You don't have to worry about them for "for" either.

for i := 0
i < 10
i++ {
}

should compile OK with the current implementation as all of the criteria
for ';' insertion are met. I'll check next time I do a build without my
patches.
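
A minimal sketch of the multi-line "for" clause above, assuming a present-day Go toolchain rather than the 2010 compiler under discussion. The first two lines of the clause end in an integer literal and an identifier, so a semicolon is inserted after each, and the third ends in "{", so none is inserted there. `sumTo` is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// sumTo (hypothetical) adds the integers 0..n-1 using a for clause that is
// split over three lines and contains no explicit semicolons.
func sumTo(n int) int {
	total := 0
	for i := 0
	i < n
	i++ {
		total += i
	}
	return total
}

func main() {
	fmt.Println(sumTo(10))
}
```

Running gofmt on this would join the clause back onto one line with explicit semicolons.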

>
>> +A semicolon may be omitted if it would occur at the end of a line whose
>> +final token is
>
> And this: Semicolons are everywhere as in e.g. C. Sometimes you don't
> have to write them if you learn some additional rules.
>
> I prefer the former.

The former (I assume you mean the bit that I replaced) implies that
semicolons can also be inserted where they are inappropriate. How about:

"When the input is broken into tokens, a semicolon is automatically
inserted into the token stream at the end of a non-blank line if it is
appropriate and the line's final token is"

Or maybe not.

The point of the change is to indicate that semicolons are no longer
inserted into the input stream after those tokens without restriction,
and that insertion only happens when it makes syntactic sense. I'm not
wedded to my change as long as the new version conveys that information.

BTW the modification is coming along nicely and I've already picked most
of the low-hanging fruit, e.g. if/for/select/switch. Unfortunately, func
and else are proving more challenging because they are fairly messy.
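
To show what the unmodified rules do with these constructs, here is a small sketch using the present-day `go/parser` package (an assumption; it is not the yacc grammar in src/cmd/gc/go.y that the patch touches). It checks whether a source string parses when the opening brace is moved to the next line. `parses` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parses (hypothetical helper) reports whether src is syntactically valid Go
// according to the standard library parser.
func parses(src string) bool {
	fset := token.NewFileSet()
	_, err := parser.ParseFile(fset, "example.go", src, 0)
	return err == nil
}

func main() {
	// Brace on the same line as the "if": accepted.
	sameLine := "package p\nfunc f() {\n\tif true {\n\t}\n}\n"
	// Brace on the line after "if true": a semicolon is inserted after "true".
	ifNextLine := "package p\nfunc f() {\n\tif true\n\t{\n\t}\n}\n"
	// Brace on the line after the signature: a semicolon is inserted after ")".
	funcNextLine := "package p\nfunc f()\n{\n}\n"

	fmt.Println(parses(sameLine), parses(ifNextLine), parses(funcNextLine))
}
```

Only the first form is accepted under the existing insertion rule; the other two are the layouts the proposal aims to allow.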

The amount of change required so far is fairly light:

doc/go_spec.html | 5 +--
src/cmd/gc/go.y | 75 ++++++++++++++++++++++++++++-----------------------
src/cmd/gc/lex.c | 7 ++--
test/syntax/semi1.go | 6 ++--
test/syntax/semi2.go | 4 +-
test/syntax/semi3.go | 5 ++-
test/syntax/semi4.go | 7 ++--
test/syntax/semi6.go | 2 -
8 files changed, 61 insertions(+), 50 deletions(-)

And, as you can see, a fairly large chunk of it was modifying the tests
that were there to ensure that ';' insertion broke the context free
properties of Go.

It's a fairly pleasant experience modifying the parser and lexer because
you can rebuild and run a complete set of tests very quickly.
Especially after you realize that the second part of:

./bash.all clean; ./bash.all

is unnecessary :-).

Peter

befelemepeseveze

May 1, 2010, 3:31:55 AM5/1/10
to golang-nuts
On 1 May, 09:05, Peter Williams <pwil3...@gmail.com> wrote:
> On 01/05/10 15:18, befelemepeseveze wrote:
> You don't have to worry about them for "for" either.
>
> for i := 0
> i < 10
> i++ {
>
> }
>
> should compile OK with the current implementation as all of the criteria
> for ';' insertion are met.

True. I meant the semicolons in a for clause written on one line, like
'for i := 0; i < 10; i++ {...'.

> >> +A semicolon may be omitted if it would occur at the end of a line whose
> >> +final token is
>
> > And this: Semicolons are everywhere as in e.g. C. Sometimes you don't
> > have to write them if you learn some additional rules.
>
> > I prefer the former.
>
> The former (I assume you mean the bit that I replaced) implies that
> semicolons can also be inserted where they are inappropriate.

True, but only when not adhering to standard Go formatting (as
mentioned).

> How about:
> "When the input is broken into tokens, a semicolon is automatically
> inserted into the token stream at the end of a non-blank line if it is
> appropriate and the line's final token is"

Yes, that's IMO better.

> It's a fairly pleasant experience modifying the parser and lexer because
> you can rebuild and run a complete set of tests very quickly.

And it's a valuable experience for sure. I'm afraid that it will stay
just that. Do you expect your effort to change the spec/language to be
accepted? I do not.

Peter Williams

May 1, 2010, 11:42:27 PM5/1/10
to befelemepeseveze, golang-nuts
On 01/05/10 17:31, befelemepeseveze wrote:
>
> And it's a valuable experience for sure. I'm afraid, that it will stay
> that. Do you expect your effort to change the specs/language to be
> accepted? I do not.

After finding tests in the test harness that are there specifically to
test that the (to me undesirable) side effects of ';' injection actually
occur, I'm starting to think you're probably right.

Still I'm having fun so I'll persevere and see what happens if I'm
successful.

Peter

Peter Williams

May 1, 2010, 11:45:08 PM5/1/10
to roger peppe, befelemepeseveze, golang-nuts
This can be beaten by using techniques similar to those used to tell
'=' from '==' in the lexer. I have a working implementation.

But the pragmatic approach would be to say that disambiguation is one of
the things that the ';' terminator is there for. If you change bits of
the above to something like:

fndecl: FUNC IDENT optnl signature optbody
optbody:
| optnl '{' '}'

it would have the effect of making the ';' non-optional when optbody is
empty. That would be a sensible solution to the problem but, judging by
the vehemence of those who like the status quo, I doubt that it would
gain acceptance.

But even with this problem solved there are more difficulties in store,
as function declarations have quite a few more optional components which
fall foul of ';' injection and need to be addressed if you want Go
to be fully free format. Solvable, but it will get very messy.

But thinking about ways to make it less messy caused me to have the idea
that the solution might be to have the parser tell the lexer when to try
';' injection rather than just turning it on after the listed tokens.
Early attempts at making this work indicate that I need to get a deeper
understanding of the lexer's code, so I've got to do some study.

Peter

peterGo

May 2, 2010, 2:37:56 AM5/2/10
to golang-nuts
Peter,

The problem of identifying the end of statements across multiple lines
has been around since the punched card for Assemblers, FORTRAN, COBOL,
etc. Visual Basic, as a child of BASIC, has been trying to come up
with a solution for years. In its latest incarnation, Visual Basic
2010, the developers confess that a heuristic solution for implicit
line continuation is the best they have been able to come up with.
There's even a video. Once you've fixed Go, you can hop over to
Microsoft and fix their problems too. :-)

Continuing a Statement over Multiple Lines
http://msdn.microsoft.com/en-us/library/865x40k4.aspx
Implicit Line Continuation in Visual Basic 2010 | kmcgrath | Channel 9
http://channel9.msdn.com/posts/kmcgrath/Implicit-Line-Continuation-in-Visual-Basic-2010/

Peter

chris dollin

May 2, 2010, 3:18:17 AM5/2/10
to Peter Williams, roger peppe, befelemepeseveze, golang-nuts
On 2 May 2010 04:45, Peter Williams <pwil...@gmail.com> wrote:

> But thinking about ways to make it less messy caused me to have the idea
> that the solution might be to have parser tell the lexer when to try ';'
> injection rather than just turning it on after the listed tokens. Early
> attempts at making this work indicate that I need to get a deeper
> understanding of the lexer's code so I've got to do some study.

Having the parser feed back to the lexer is asking for trouble.
It makes parser-independent tests for lexing difficult (perhaps
impossible ...), and makes it harder to run the lexer (any lexer; this
is a general observation, see also typedef ...) as a separate
goroutine/process/thread/yourchoiceofmultiprocessingnoun.

Chris

--
Chris "run away! run away!" Dollin