Switching to new parser in compiler


Matthew Dempsky

Oct 25, 2016, 12:29:48 PM
to golang-dev
Overnight I ran "go build -a -p=1 -toolexec=toolbench std cmd", where toolbench is this simple program that interleaves running the compiler 30 times with the old and new parsers: https://gist.github.com/mdempsky/bc3dc8bebce0a545c1d515afc3f45014
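For readers unfamiliar with -toolexec: go build invokes the given program with the real tool's path followed by that tool's arguments, so a wrapper can rerun the compiler as often as it likes. The following is only a rough sketch of that idea, not the contents of the linked gist. The names schedule, isCompiler, run and parserFlag are made up for illustration, and the "-newparser" flag spelling is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// parserFlag maps a variant name to the compiler flag that selects it.
// Assumption: the "-newparser" spelling is illustrative and may not match
// the flag the real toolbench program used.
var parserFlag = map[string][]string{
	"old": {"-newparser=0"},
	"new": {"-newparser=1"},
}

// schedule returns the run order for n rounds, alternating "old" and "new"
// so that slow drift in machine load affects both variants about equally.
func schedule(n int) []string {
	order := make([]string, 0, 2*n)
	for i := 0; i < n; i++ {
		order = append(order, "old", "new")
	}
	return order
}

// isCompiler reports whether the tool path names the Go compiler binary.
func isCompiler(tool string) bool {
	return strings.TrimSuffix(filepath.Base(tool), ".exe") == "compile"
}

// run executes tool with the extra flags placed before the original args.
func run(tool string, extra, args []string) error {
	cmd := exec.Command(tool, append(append([]string{}, extra...), args...)...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: toolbench tool [args...]")
		return
	}
	tool, args := os.Args[1], os.Args[2:]
	variants := []string{""} // tools other than the compiler run once, unmodified
	if isCompiler(tool) {
		variants = schedule(30)
	}
	for _, v := range variants {
		if err := run(tool, parserFlag[v], args); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}
```

Every compiler run writes the same output file, so the last run leaves the build in the state go build expects.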

I also augmented cmd/compile to run runtime.GC() immediately before taking timestamps as a partial fix to golang.org/issue/17434 ("cmd/compile: -bench should correct for GC").
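As a minimal sketch of that idea (not the actual cmd/compile change), forcing a collection right before each clock read means garbage from the work just finished is collected before the timestamp instead of during whatever runs next. The helpers stamp and timePhase and the workload are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// stamp forces a garbage collection and then reads the clock, so garbage
// left behind by the preceding work is collected before the timestamp.
func stamp() time.Time {
	runtime.GC()
	return time.Now()
}

// timePhase measures one phase between two GC-corrected timestamps.
func timePhase(phase func()) time.Duration {
	start := stamp()
	phase()
	return stamp().Sub(start)
}

func main() {
	d := timePhase(func() {
		// Stand-in workload that allocates; not a real compiler phase.
		var keep [][]byte
		for i := 0; i < 1000; i++ {
			keep = append(keep, make([]byte, 1<<10))
		}
		_ = keep
	})
	fmt.Println("phase took", d)
}
```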

The benchstat -geomean output from comparing just the "total" times is here, showing a -0.01% geo mean difference between old and new parser: https://gist.github.com/mdempsky/86378ecef7367dea689dad5900646c99
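For context, a geometric-mean difference can be computed roughly as below. This is a sketch of the arithmetic only; benchstat's own implementation is not reproduced here, the function name geomeanDelta is hypothetical, and the numbers in main are invented for illustration rather than taken from the gist.

```go
package main

import (
	"fmt"
	"math"
)

// geomeanDelta returns the geometric mean of the per-benchmark ratios
// newT[i]/oldT[i], expressed as a percent change. It returns 0 when the
// inputs are empty or differ in length.
func geomeanDelta(oldT, newT []float64) float64 {
	if len(oldT) == 0 || len(oldT) != len(newT) {
		return 0
	}
	sumLog := 0.0
	for i := range oldT {
		sumLog += math.Log(newT[i] / oldT[i])
	}
	return (math.Exp(sumLog/float64(len(oldT))) - 1) * 100
}

func main() {
	// Invented timings in milliseconds, one entry per package.
	oldT := []float64{120, 45, 300}
	newT := []float64{118, 46, 301}
	fmt.Printf("geomean delta: %+.2f%%\n", geomeanDelta(oldT, newT))
}
```

Averaging the logarithms of the ratios keeps one large package from dominating the result, which is why a geometric mean suits a summary across many packages of very different sizes.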

More detailed output including fe:parse and [fb]e:subtotal times here: https://gist.github.com/mdempsky/1d312fbc33c7372e00933a864d935e8e

(Note that I omitted the "Compile:main" entries, since those are confounded by representing all of the different main packages in cmd/*.)

I think Robert, Russ, and I agree that the performance is acceptable for switching to the new parser so we can move forward.  Unless I hear any objections, I'll submit the CL to enable the new parser by default this afternoon.

Cheers

Brad Fitzpatrick

Oct 25, 2016, 12:40:59 PM
to Matthew Dempsky, golang-dev
Are we also going to delete the old parser for Go 1.8?

Even if -0.01% isn't exciting, deleting code at least is.


--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matthew Dempsky

Oct 25, 2016, 12:54:20 PM
to Brad Fitzpatrick, golang-dev
I would like to also delete the old parser.

John C.

Oct 25, 2016, 9:20:38 PM
to golang-dev
Can you remind me for the record what the benefits of the new parser are?  I think it's been stated but I can't recall and couldn't find it.


Robert Griesemer

Oct 25, 2016, 11:45:04 PM
to John C., golang-dev
It probably hasn't been widely communicated.

This parser is part of a new internal package, syntax, that's now separate from the compiler. It uses a very fast lexer and creates a new syntax tree (similar to go/ast but simpler). Currently this syntax tree is translated back to the existing compiler's node structure, but the long-term plan is to move to the more modern and more space-efficient tree structure and slowly eliminate the vestiges of the compiler's current node structure, which is hard to maintain.

It's a testament to the speed of this new parser that the compiler is (on average) the same speed as before, despite the fact that we create a separate new syntax tree which is then translated back to the current node structure (for now).

The next step is to make parsing of individual package files concurrent. The new parser can process more than two million lines per second when running concurrently over the std library (outside of the compiler). Of course we won't see that performance across an entire compilation, but parsing and syntax tree creation could become almost insignificant in the overall time budget of a compilation.

Another benefit besides speed is that this isolates a significant chunk of the compiler's code in a separate package which makes it easier to maintain and test.

Down the road (say a couple of years from now), when we are confident that this package doesn't change much, we may decide to make it a non-internal package for general consumption (as a replacement for the existing go/* libs which are showing their age). That would have the benefit that compiler and other programs (gofmt) would use the exact same syntax tree. 

- gri


roger peppe

Oct 26, 2016, 4:11:17 AM
to Robert Griesemer, John C., golang-dev
How do the error messages compare between the two parsers?

Robert Griesemer

Oct 26, 2016, 1:29:02 PM
to roger peppe, John C., golang-dev
The new parser (which I wrote) is mostly based on the existing one (which I wrote before and which was a 1:1 translation of the yacc-based implementation) and thus has essentially the same structure except that it's cleaner and simpler. I like to think that it contains the best of both worlds, the gc parser and the go/parser. I believe the error messages and behavior are either identical or at least very close.

We know of a few cases where we are worse than before, especially when compared to yacc (issues are filed), and some cases where they have improved. At any rate this is an area which we are continuously trying to make better - good error messages really improve the user experience.

- gri



roger peppe

Oct 26, 2016, 2:43:18 PM
to Robert Griesemer, John C., golang-dev
On 26 October 2016 at 18:28, Robert Griesemer <g...@golang.org> wrote:
> The new parser (which I wrote) is mostly based on the existing one (which I
> wrote before and which was a 1:1 translation of the yacc-based
> implementation) and thus has essentially the same structure except that it's
> cleaner and simpler. I like to think that it contains the best of both
> worlds, the gc parser and the go/parser. I believe the error messages and
> behavior are either identical or at least very close.
>
> We know of a few cases where we are worse than before, especially when
> compared to yacc (issues are filed), and some cases where they have
> improved. At any rate this is an area which we are continuously trying to
> make better - good error messages really improve the user experience.

Yes. This would be my major reason for deciding on one or the other,
given that the speed is the same.

Brad Fitzpatrick

Oct 26, 2016, 2:48:39 PM
to roger peppe, Robert Griesemer, John C., golang-dev
The speed is the same *at the moment*.

Note what Robert said: "It uses a very fast lexer and creates a new syntax tree (similar to go/ast but simpler). Currently this syntax tree is translated back to the existing compiler's node structure, but the long-term plan is to move to the more modern and more space-efficient tree structure and slowly eliminate the vestiges of the compiler's current node structure, which is hard to maintain."

roger peppe

Oct 26, 2016, 3:26:51 PM
to Brad Fitzpatrick, Robert Griesemer, John C., golang-dev
For me if the error messages were significantly worse, that would trump
the speed advantage even if the new parser was quite a bit faster.
That's just my point of view though, and it doesn't sound like that is
the case.

Robert Griesemer

Oct 26, 2016, 4:35:52 PM
to roger peppe, Brad Fitzpatrick, John C., golang-dev
Error messages are essentially the same. For one, the tests we have ensure the same error messages as before for the cases we test. It would be a showstopper for us if error message quality had deteriorated in any significant way.

Or in other words: Any speed gains are not at the cost of error recovery or error messages.
- gri