Message:
Hello rsc, r, iant (cc: golan...@googlegroups.com),
I'd like you to review this change to
https://go.googlecode.com/hg/
Description:
spec: narrow syntax for expression statements
This is not a language change, it simply expresses the
accepted cases explicitly in the respective production.
Also: check requirement in go/parser and add test cases.
Please review this at http://codereview.appspot.com/4428057/
Affected files:
M doc/go_spec.html
M src/pkg/go/parser/parser.go
M src/pkg/go/parser/parser_test.go
Index: doc/go_spec.html
===================================================================
--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -1,5 +1,5 @@
<!-- title The Go Programming Language Specification -->
-<!-- subtitle Version of Apr 19, 2011 -->
+<!-- subtitle Version of Apr 20, 2011 -->
<!--
TODO
@@ -3524,9 +3524,8 @@
can appear in statement context.
</p>
-
<pre class="ebnf">
-ExpressionStmt = Expression .
+ExpressionStmt = PrimaryExpr Call | "<-" UnaryExpr .
</pre>
<pre>
Index: src/pkg/go/parser/parser.go
===================================================================
--- a/src/pkg/go/parser/parser.go
+++ b/src/pkg/go/parser/parser.go
@@ -1497,6 +1497,9 @@
}
// expression
+ // (Only calls and receive operations can be expression statements,
+ // but we cannot verify this here because parseSimpleStmt is also
+ // called for conditions/tags of structured control flow operations.)
return &ast.ExprStmt{x[0]}
}
@@ -1579,6 +1582,24 @@
}
+func (p *parser) checkExprStmt(s ast.Stmt) ast.Stmt {
+ // expression statements can only be calls and receive operations
+ if x, ok := s.(*ast.ExprStmt); ok {
+ switch t := x.X.(type) {
+ case *ast.CallExpr:
+ return s
+ case *ast.UnaryExpr:
+ if t.Op == token.ARROW {
+ return s
+ }
+ }
+ p.error(s.Pos(), "expression statement must be call or receive
operation")
+ s = &ast.BadStmt{s.Pos(), s.End()}
+ }
+ return s
+}
+
+
func (p *parser) makeExpr(s ast.Stmt) ast.Expr {
if s == nil {
return nil
@@ -1612,6 +1633,7 @@
s = p.parseSimpleStmt(false)
if p.tok == token.SEMICOLON {
p.next()
+ s = p.checkExprStmt(s)
x = p.parseRhs()
} else {
x = p.makeExpr(s)
@@ -1708,7 +1730,7 @@
}
if p.tok == token.SEMICOLON {
p.next()
- s1 = s2
+ s1 = p.checkExprStmt(s2)
s2 = nil
if p.tok != token.LBRACE {
s2 = p.parseSimpleStmt(false)
@@ -1839,14 +1861,14 @@
}
if p.tok == token.SEMICOLON {
p.next()
- s1 = s2
+ s1 = p.checkExprStmt(s2)
s2 = nil
if p.tok != token.SEMICOLON {
s2 = p.parseSimpleStmt(false)
}
p.expectSemi()
if p.tok != token.LBRACE {
- s3 = p.parseSimpleStmt(false)
+ s3 = p.checkExprStmt(p.parseSimpleStmt(false))
}
}
p.exprLev = prevLev
@@ -1908,8 +1930,9 @@
// because of the required look-ahead, labeled statements are
// parsed by parseSimpleStmt - don't expect a semicolon after
// them
- if _, isLabeledStmt := s.(*ast.LabeledStmt); !isLabeledStmt {
+ if _, ok := s.(*ast.LabeledStmt); !ok {
p.expectSemi()
+ s = p.checkExprStmt(s)
}
case token.GO:
s = p.parseGoStmt()
Index: src/pkg/go/parser/parser_test.go
===================================================================
--- a/src/pkg/go/parser/parser_test.go
+++ b/src/pkg/go/parser/parser_test.go
@@ -22,6 +22,11 @@
`package p; func f() { if ; /* should have condition */ {} };`,
`package p; func f() { if f(); /* should have condition */ {} };`,
`package p; const c; /* should have constant value */`,
+ `package p; func f() { 0 /* not an expression statement */ };`,
+ `package p; func f() { if 0 /* not an expression statement */; true {}
};`,
+ `package p; func f() { for 0 /* not an expression statement */; false; {}
};`,
+ `package p; func f() { for ; false; 0 /* not an expression statement */
{} };`,
+ `package p; func f() { switch 0 /* not an expression statement */; {} };`,
}
@@ -53,6 +58,8 @@
`package p; func f() { select { case <- c: case c <- d: case c <- <- d:
case <-c <- d: } };`,
`package p; func f() { if ; true {} };`,
`package p; func f() { switch ; {} };`,
+ `package p; func f() { a.f() };`,
+ `package p; func f() { <-ch };`,
}
But only if go/parser does not implement this rule.
I think we should be able to gofmt programs that
have top-level expressions. Also I don't think we should
have different error messages for the unused top-level
expressions:
len(x)
x+1
but that's almost unavoidable if the parser handles
one but the type checker handles the other (and it must).
On the same note, I am a little worried that having this
in the grammar will encourage other parser implementers
to implement the rule, which will limit the generality of
their parsers. But it does look correct.
Russ
What is your rationale for it? If the grammar has this restriction,
the parser should enforce it, I would think.
Maybe the grammar should not have this restriction, but I think it is
easily syntactically enforced.
> have different error messages for the unused top-level
> expressions:
>
> len(x)
> x+1
>
> but that's almost unavoidable if the parser handles
> one but the type checker handles the other (and it must).
There is a difference between legal grammar (assuming we want to put
above rule in) and the unused rule in my mind. The reason why I
started making this change is because I want to get a handle on the
unused rule so that it can be written down in the spec. Fixing the
grammar reduces the problem a little bit. As it is, 6g and gccgo have
different ideas about what is used and what is not. We put the unused
rule to a large part so that we would see potential errors with :=
redeclarations. It seems that 6g goes a step further and applies it to
built-in function calls w/ no side-effect. I'm not sure that's
necessary, especially if we can come up with a simpler rule.
>
> On the same note, I am a little worried that having this
> in the grammar will encourage other parser implementers
> to implement the rule, which will limit the generality of
> their parsers. But it does look correct.
Are you worried about the exported ParseStmtList not being useful
enough anymore (say for an interpreter), or is it some other concern?
- gri
>
> Russ
>
Yes, it is easily syntactically enforced. But the higher up you
give the error, the more useful you can make it, and the more
uniform you can make the errors. This is a semantic rule
(can't compute something without side effects and not use it),
and even though it can be partially expressed syntactically
that doesn't mean it's the right place to implement the restriction.
Delaying the rejection means you can identify more problems
in a single pass through the compiler.
Consider this program:
package main
func g(int)
func f() (int, int)
func main() {
x = f() // oops, forgot that f returns two arguments
println(x)
g-(x) // oops, meant g(-x)
}
I think everyone can look at this program and see what the
Go AST would be for it. If the parser accepts it, then the
type checker can tell the programmer about both errors.
If the parser rejects it early, you only find out about one.
To take another example,
x, y, z = m[k]
is definitely an invalid program, because a map index assignment
cannot have 3 expressions on the lhs. But it's not a syntax error
(I think the parser did reject it once but I asked for a change),
for much the same reasons as above. We all still know what
the AST looks like, and if we let it go through, a tool can
point out more problems (or process) the program instead of
failing. Also, in this case, because the parser can't possibly reject
x, y, z = f() // oops, f only returns two values
and you'd want to have a compiler or other tool print the same
error message in both cases, it makes sense to leave the
check to the higher levels.
Back to the current case, especially since I believe the same
error message is going to apply to a statement-level len(x)
as applies to +x (and the parser can't reasonably reject len(x)),
I think the parser should let both through and let higher level
processing take care of them.
> Are you worried about the exported ParseStmtList not being useful
> enough anymore (say for an interpreter), or is it some other concern?
I'm worried about generality in general more than specifics.
I don't see any benefit to the parser excluding side effect-free
expression statements since it feels like a semantic concern
and since higher level code is going to have to deal with some
other cases anyway.
That said, the case you brought up (for exp/eval) is a good
specific reason not to restrict the parser.
Russ
sure, I am fully aware of that
At some point it's a judgement call where to best put an error.
There's other error messages in the parser that could be delivered
better and later. The rule I have tried to follow is simple: if it's
in the syntax, the parser should complain. That's the only reason for
the checks in the parser.
> uniform you can make the errors. This is a semantic rule
> (can't compute something without side effects and not use it),
> and even though it can be partially expressed syntactically
> that doesn't mean it's the right place to implement the restriction.
Originally it was not a semantic rule. It was a syntactic rule with a
semantic implication. The "is used" rule came later. But that's not
the point, really.
> Delaying the rejection means you can identify more problems
> in a single pass through the compiler.
Sure. But that was not the primary design goal of all these packages.
The goal for the parser was to reject anything that is not
syntactically correct.
I am not disputing this in any way.
>
> Back to the current case, especially since I believe the same
> error message is going to apply to a statement-level len(x)
> as applies to +x (and the parser can't reasonably reject len(x)),
> I think the parser should let both through and let higher level
> processing take care of them.
We don't have a spec rule about what "is used" means yet. gccgo and gc
have different answers. But we have a trivial syntactic rule.
>
>> Are you worried about the exported ParseStmtList not being useful
>> enough anymore (say for an interpreter), or is it some other concern?
>
> I'm worried about generality in general more than specifics.
> I don't see any benefit to the parser excluding side effect-free
> expression statements since it feels like a semantic concern
> and since higher level code is going to have to deal with some
> other cases anyway.
Ok. Perhaps the syntax should not restrict the kinds of expressions
permitted in a statement (like before), and the explanation should not
even mention calls and receive operations but instead refer to a more
general is-used rule. I would be fine with that.
I am going to let this rest until we have an is-used rule that we
believe and that can be implemented.
- Robert