Need help/tips on golang compiler source code reading


messi...@gmail.com

Feb 25, 2021, 1:24:10 AM
to golang-nuts
Hi,

I'm reading the Go compiler source code and have reached the middle of syntax analysis: I just finished parser.fileOrNil, and the next step is noder.node().

So far everything is fine; both lexing and syntax tree parsing are easy to understand. After that part, though, the code gets more and more difficult. To make my learning process smoother, I'd like to ask the community for some help and tips.

Currently I'm doing it this way:
1. Figure out the main phases and the role of each one.
2. For each phase, first figure out the related data structures, such as interfaces and structs.
3. Read the source code of the phase and figure out the main logic.
4. Make guesses and verify them with unit tests.
5. Use git log to see the author's original thinking.
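
One way to apply the guess-and-verify step to the lexing phase is a small test program. The sketch below is only an illustration: it uses the public go/scanner package as a stand-in for the compiler's internal scanner (the two are separate implementations, so behavior may differ in details), and the helper name tokens and the sample input are made up. The guess being checked is that a newline after a literal produces a semicolon token.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokens scans source with the public go/scanner package and returns
// the string form of each token up to, but not including, EOF.
// The function name is hypothetical and exists only for this sketch.
func tokens(source string) []string {
	fset := token.NewFileSet()
	// The registered file size must match the length of the source.
	file := fset.AddFile("demo.go", fset.Base(), len(source))

	var s scanner.Scanner
	s.Init(file, []byte(source), nil, 0)

	var out []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		out = append(out, tok.String())
	}
	return out
}

func main() {
	// Guess: the newline after the literal 1 becomes a ";" token.
	for _, t := range tokens("x := 1\n") {
		fmt.Println(t)
	}
}
```

For the input above, the scan yields four tokens, and the last one is the automatically inserted semicolon, which confirms the guess for go/scanner.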

Problems I'm facing:
1. I can't find enough docs, especially official design docs.
2. The comments in the code repo are not enough. I believe they are enough for compiler developers, but not for beginners like me.

This makes it very difficult to understand the design thinking. In some cases you have to read the source code line by line to work out its purpose, and in the end you still don't know why it is implemented that way.

So my questions are:
1. Is there a place where I can find design docs for the Go compiler?
2. What's the most productive way to learn the source code, especially from the perspective of Go compiler developers?

Thanks for any help, suggestions, or tips :)

Ian Lance Taylor

Feb 25, 2021, 9:56:01 AM
to messi...@gmail.com, golang-nuts
Unfortunately there are no design docs.

The most productive approach is the one you are already doing. Feel
free to ask questions on this mailing list about why things are
written the way they are.

That said, the history of the compiler is that it was originally
written in C and based on the Inferno C compiler. Ken Thompson
modified that compiler to compile Go code, but the compiler was itself
still written in C. Several years later Russ Cox wrote a tool to
translate the C code into Go code. That Go code was naturally not
very idiomatic. A lot of it has been rewritten, but some still looks
like C code. A couple of years after that Keith Randall rewrote the
entire backend to use an SSA representation. At some point Robert
Griesemer rewrote the entire parser. Matthew Dempsky and Russ Cox
rewrote a lot of the frontend. Right now Robert Griesemer and Rob
Findley are rewriting the type checker. Many other people have
written significant components of the compiler, replacing earlier
components.

My point in providing this partial history is that many questions
about "why does the compiler work this way" have the answer "because
of the long and complicated history of the code base." It is not the
case that a group of people sat down and designed a clean and elegant
Go compiler. As far as I know nobody has ever written a Go compiler
from scratch.

Ian

David Skinner

Feb 25, 2021, 7:43:07 PM
to golang-nuts
Ian, I very much appreciate your taking the time to provide the history here.
I am almost inclined to suggest that the OP create a design document much the way an engineer creates an as-built design before creating a new design.
For my part, I would be willing to assist the OP on an occasional basis in trying to figure some things out just because I enjoy puzzles and I think that these efforts to learn will pay great dividends in the future.

messi...@gmail.com

Feb 26, 2021, 3:28:46 AM
to golang-nuts
Thanks a lot for your time, Ian. The history info is very helpful.

I'm reading the source code of version 1.15.6, but I found that the code base structure on the master branch is very different. Some changes are refactors, as you mentioned above (the type checker), and some are for new features (e.g. generics). The contribution guide says the GitHub issue tracker is where contributors pick up tasks, but most of the issues are about bugs, so what about "features" and "plans"? Take the type checker rewrite you mentioned above as an example: I searched a lot for the "kickoff" place and the original discussion but failed. (Most of the issues assigned to Griesemer are specific bugs or improvements.)

Is there any place or information I missed? Or is reading the source code the only way to fully catch up on the whole process?

Thanks again. 

messi...@gmail.com

Feb 26, 2021, 4:08:55 AM
to golang-nuts
Thanks for your suggestion, that's a good idea. Any specific ideas about how to do it? After all, the code/design is evolving all the time. 

Ian Lance Taylor

Feb 26, 2021, 10:40:29 AM
to messi...@gmail.com, golang-nuts
On Fri, Feb 26, 2021 at 12:29 AM messi...@gmail.com
<messi...@gmail.com> wrote:
>
> I'm reading the source code of version 1.15.6, but I found that the code base structure on the master branch is very different. Some changes are refactors, as you mentioned above (the type checker), and some are for new features (e.g. generics). The contribution guide says the GitHub issue tracker is where contributors pick up tasks, but most of the issues are about bugs, so what about "features" and "plans"? Take the type checker rewrite you mentioned above as an example: I searched a lot for the "kickoff" place and the original discussion but failed. (Most of the issues assigned to Griesemer are specific bugs or improvements.)
>
> Is there any place or information I missed? Or is reading the source code the only way to fully catch up on the whole process?

The contribution guide is aimed more for smaller, focused changes than
for large scale changes.

The type checker work is being done as part of the generics work. The
generics work is not currently being tracked on the issue tracker.
It's too big for that to be useful.

There is mention of the refactoring work at
https://groups.google.com/g/golang-dev/c/U7eW9i0cqmo/m/ffs0tyIYBAAJ .
Jeremy Faller has recently started posting notes about runtime team
discussions at https://golang.org/issue/43930.

In general, though, I don't think you've missed anything. We don't
have a formal process for tracking large scale changes to the tools.
There is generally a proposal for the idea, but once the proposal has
been accepted there is no formal tracking of the work. The register
ABI proposal is https://golang.org/issue/40724 and the generics
proposal is https://golang.org/issue/43651. Those proposals have
associated design docs, but those are focused on the changes rather
than on the specific tasks required to implement those changes.

Hope this helps.

Ian

zhz shi

Feb 28, 2021, 10:05:07 AM
to Ian Lance Taylor, golang-nuts
That's very helpful, thank you Ian.

According to Russ's plan (replacing cmd/compile/internal/types with types2) and the discussion topic Jeremy Faller posted, can I understand it to mean that check2 will be the entry point of the new, default type checker in the future (with the release of 1.18 a year from now)? And will the current type-checking code be removed at the same time? If so, I think I'd better use the master branch for learning.

And one more question, please. I found that there is a copy of the lexer/parser and AST declarations under https://github.com/golang/go/tree/master/src/go as public API. Why doesn't the compiler reuse this code instead of keeping a separate copy?

Thanks and best regards.
--
BR, Zhenzhong

Ian Lance Taylor

Feb 28, 2021, 1:40:25 PM
to zhz shi, golang-nuts
On Sun, Feb 28, 2021 at 7:04 AM zhz shi <messi...@gmail.com> wrote:
>
> According to Russ's plan (replacing cmd/compile/internal/types with types2) and the discussion topic Jeremy Faller posted, can I understand it to mean that check2 will be the entry point of the new, default type checker in the future (with the release of 1.18 a year from now)? And will the current type-checking code be removed at the same time? If so, I think I'd better use the master branch for learning.

That is likely but not certain.


> And one more question, please. I found that there is a copy of the lexer/parser and AST declarations under https://github.com/golang/go/tree/master/src/go as public API. Why doesn't the compiler reuse this code instead of keeping a separate copy?

The go/parser and related packages fall under the Go 1 compatibility
guarantee, which makes it hard to update them as we learn more about
what the compiler needs. Although there is duplicate code, it seems
simpler overall for the compiler to have its own parser.

Ian
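
For readers who want to see the public side of that duplication, here is a minimal sketch of the go/parser and go/ast packages in use. It does not touch the compiler's internal parser at all; the sample source, the file name demo.go, and the helper declNames are made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a tiny made-up program used only for this illustration.
const src = `package demo

func add(a, b int) int { return a + b }

var answer = add(40, 2)
`

// declNames parses source with the public go/parser package and returns
// a short description of each top-level function and variable.
// The helper name is hypothetical.
func declNames(source string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}

	var names []string
	for _, decl := range file.Decls {
		switch d := decl.(type) {
		case *ast.FuncDecl:
			names = append(names, "func "+d.Name.Name)
		case *ast.GenDecl:
			// GenDecl covers import, const, type, and var declarations;
			// only value specs (const/var) are reported here.
			for _, spec := range d.Specs {
				if vs, ok := spec.(*ast.ValueSpec); ok {
					for _, n := range vs.Names {
						names = append(names, d.Tok.String()+" "+n.Name)
					}
				}
			}
		}
	}
	return names, nil
}

func main() {
	names, err := declNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

Because these packages are public API, programs like this keep working across Go 1 releases, which is the compatibility constraint described above.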

messi...@gmail.com

Apr 22, 2021, 10:47:27 AM
to golang-nuts
Hi,

I've now finished reading package types2 for the type-checking logic, and I'm continuing on to the next step: IR generation (directory: cmd/compile/internal/ir/).

It seems the ir package is a new abstraction layer created during the types2 refactoring, and its main purpose is to translate the AST into an IR tree. Before jumping into the code details, I would like to ask the community for help with the following questions:

1. In the current type checker (cmd/compile/internal/typecheck), the ir.Node tree is built during type checking, so the two processes are mixed together. Why was a new layer named ir created, and what is the design thinking behind it?

2. Currently package ir involves a lot of code from package types (cmd/compile/internal/types/) and package noder (cmd/compile/internal/noder/), which is used by the current type checker (cmd/compile/internal/typecheck). If the Go team will replace types with types2 and abandon the current type checker in the future, can I understand this "mixed-up style" code as a transitional stage that connects types2 to the current type checker for now? Will there be big changes to this part? If so, what's the plan?

3. If I want to write an article about IR generation, what's the best summary of topics for this phase? What I have so far:
    1). Translating the AST to an IR tree;
    2). Analyzing stack/heap usage for variables;
    3). Handling generic instantiations;

     Are there other important topics or purposes I'm missing?
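
To make the stack/heap topic concrete, here is a minimal sketch of the kind of code where that analysis matters. The type and function names are made up, and whether a given variable actually ends up on the heap is the compiler's decision and may vary between versions. Building with `go build -gcflags=-m` asks the compiler to print its optimization decisions, including which values escape.

```go
package main

import "fmt"

// point is a made-up type used only for this illustration.
type point struct{ x, y int }

// sumLocal only uses p inside the function, so the compiler can
// normally keep it on the stack.
func sumLocal() int {
	p := point{1, 2}
	return p.x + p.y
}

// newPoint returns the address of p, so p must outlive the call and is
// expected to be allocated on the heap.
func newPoint() *point {
	p := point{3, 4}
	return &p
}

func main() {
	fmt.Println(sumLocal(), newPoint().x)
}
```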
     
If I have misunderstood anything, please point it out; and if there's anything important that you think I'm missing or that would be helpful to me, please tell me as well.

Thanks for any help and tips.