The GitHub Vet Project

414 views
Skip to first unread message

K. Alex Mills

unread,
Dec 20, 2020, 2:38:49 PM12/20/20
to golan...@googlegroups.com
Hello Gophers!

During a Q&A session at this year's GopherCon, some members of the Go language team indicated a willingness to modify the behavior of range loops if we can be reasonably certain that the change will not cause incorrect behavior in current programs. To make that determination, a large collection of real-world go programs would need to be vetted. If we can find that, in every case, modifying the compiler behavior does not lead to an undesirable outcome, we will have strong evidence that the change may only have a small impact.

I've been working on a project to gather and crowdsource data for that sort of analysis. It has reached a point where I think that it's time to share it with the Go community.

The GitHub Vet Project performs static analysis on public Go repositories hosted on GitHub in order to better understand the impact of this proposed language change. It consists of two bots. VetBot crawls GitHub to snippets of code it thinks could be interesting, it reports it as an issue in a separate repository, for humans to analyze via crowdsourcing. TrackBot manages the crowdsourcing workflow.

At this point, all of the features I think are necessary for the project's success have been implemented. But I am only one Gopher, and so I would like to a) make the community aware of the project and b) ask the community to help review it in three ways:

1) Test out TrackBot and the project instructions by actually crowdsourcing through the issues found so far.
2) Review the output and raise concern if anything that I claim ought to be excluded is somehow making it through.
3) Static analysis experts: review the analyzers and ruthlessly question anything that could lead to a false negative. I don't think anything exists, but one pair of eyes is often wrong, and there are many folks on this list with (much) more expertise than me.

One final thing. The crowd-sourcing workflow I've designed relies on the opinion of experts to help estimate the reliability of the community assessment. In order for that to work, I need the help of some Go experts who are willing to commit some time to reviewing the findings. If you'd like to participate in this way, first, read the TrackBot documentation to understand what being an expert entails. If you're still interested, email me with the subject line 'GitHub Vet Expert' and include your GitHub username and a brief outline of your experience with Go.

It's my hope that this project can provide some data that can help to move Go forward. To that end, I'm also interested in any and all feedback and suggestions. Contributions are also welcome.

Thanks for reading,

K. Alex Mills, Ph.D

Jan Mercl

unread,
Dec 20, 2020, 3:02:39 PM12/20/20
to K. Alex Mills, golang-nuts
On Sun, Dec 20, 2020 at 8:38 PM K. Alex Mills <k.alex...@gmail.com> wrote:

> During a Q&A session at this year's GopherCon, some members of the Go language team indicated a willingness to modify the behavior of range loops if we can be reasonably certain that the change will not cause incorrect behavior in current programs.

I expressed my concerns at the issue tracker in 2017 when I still had
a Github account. That's no more the case, so I'll take the
opportunity to make a comment here. I still believe that the intent to
change the behavior the way issue #20733 proposes is [plain] wrong. It
can turn loops in existing code that do not allocate to loops that
allocate. That can be avoided by certain compiler optimizations that
may be already present in [some or all] known Go compilers. But such
particular optimization is not part of the language specification so
the runtime performance of some code may differ dramatically between
[future] compilers.

Go is not, and I hope should not become, a language defined by [any
particular] implementation.

K. Alex Mills

unread,
Dec 21, 2020, 2:31:06 PM12/21/20
to Jan Mercl, golang-nuts
On Sun, Dec 20, 2020, 2:01 PM Jan Mercl <0xj...@gmail.com> wrote:

I expressed my concerns at the issue tracker in 2017 when I still had
a Github account. That's no more the case, so I'll take the
opportunity to make a comment here. I still believe that the intent to
change the behavior the way issue #20733 proposes is [plain] wrong. It
can turn loops in existing code that do not allocate to loops that
allocate. That can be avoided by certain compiler optimizations that
may be already present in [some or all] known Go compilers. But such
particular optimization is not part of the language specification so
the runtime performance of some code may differ dramatically between
[future] compilers.

Go is not, and I hope should not become, a language defined by [any
particular] implementation.

For the purposes of this project, I am planning to remain agnostic as to the specific proposal which gets implemented to resolve the range loop issue. That sort of discussion relies heavily on the design principles used to write Go, and I am not yet informed enough about them to have an opinion.

That said...

AFAIK, the proposal I linked to provides a good overview of the problem, and, despite a robust discussion, remains a viable implementation candidate. 

I think that it would be best to keep our discussion of the proposed change part of the relevant GitHub tracker to ensure the conversation can happen in context of prior discussions. For my part, after having read through the proposal and the subsequent discussion, I wouldn't have anything to add other than my opinion.

It is my hope that the GitHub Vet Project has been designed in a way that the data collected can be useful and interesting regardless of whatever change is eventually implemented. Unfortunately, the project only has a chance of success if enough Gophers participate in the crowdsourcing effort. I hope that the community won't be discouraged from participation based on the perception that we don't yet have a viable fix for this problem since, as far as I am aware, that is not the case.

Jan Mercl

unread,
Dec 21, 2020, 2:40:22 PM12/21/20
to K. Alex Mills, golang-nuts
On Mon, Dec 21, 2020 at 8:30 PM K. Alex Mills <k.alex...@gmail.com> wrote:

> I think that it would be best to keep our discussion of the proposed change part of the relevant GitHub tracker to ensure the conversation can happen in context of prior discussions.

Unfortunately, that's not going to happen, and I am truly sorry about that.

Go is a Google initiated, funded and directed project. Please explain
why a Microsoft controlled account is required to continue the
discussion of this Go issue.

K. Alex Mills

unread,
Dec 29, 2020, 1:21:04 PM12/29/20
to golang-nuts
I'd like to solicit some input from the community regarding a decision that could reduce the effort required in crowdsourcing.

After thinking carefully about the loopclosure vet check, it seems to me that any time it reports an issue, there actually is an issue. That has also been the case for everything I've seen GitHub Vet report from loopclosure thus far.

What loopclosure does is find instances where a range loop variable is used inside a goroutine or defer statement. My understanding is that, in every case, this raises the potential for a race-condition between updating the range loop variable and reading from it within the body of the goroutine. So unless I'm missing something, it seems to me that there is not any need for a human to double-check loopclosure results.

So my plan -- unless someone comes up with something I have missed -- is to omit loopclosure findings from the crowdsourcing effort. Once this change is made, loopclosure findings will be kept on disk, but not uploaded to GitHub for crowdsourcing.

It's worth mentioning that there are issues which loopclosure misses which VetBot is able to find, and the community's effort will still be needed to help sift through false-positives once the project has stabilized.

jake...@gmail.com

unread,
Dec 30, 2020, 11:02:24 AM12/30/20
to golang-nuts
On Tuesday, December 29, 2020 at 1:21:04 PM UTC-5 k.alex...@gmail.com wrote:
I'd like to solicit some input from the community regarding a decision that could reduce the effort required in crowdsourcing.

After thinking carefully about the loopclosure vet check, it seems to me that any time it reports an issue, there actually is an issue. That has also been the case for everything I've seen GitHub Vet report from loopclosure thus far.

What loopclosure does is find instances where a range loop variable is used inside a goroutine or defer statement. My understanding is that, in every case, this raises the potential for a race-condition between updating the range loop variable and reading from it within the body of the goroutine. So unless I'm missing something, it seems to me that there is not any need for a human to double-check loopclosure results.

Technically, I believe a range loop variable can be used in a goroutine safely. A very contrived example: https://play.golang.org/p/jgZbGg_XP6S . In practice I can not see much use for this kind of pattern, but it does exist.

K. Alex Mills

unread,
Dec 30, 2020, 5:09:36 PM12/30/20
to jake...@gmail.com, golang-nuts
Good point, and thanks for the response.

This example avoids the race condition by breaking out of the loop immediately afterward. IIRC, loopclosure also reports in case the range loop variable is passed as an argument using a name that shadows the loop variable. 

I think any loopclosure findings which work properly like this do have to do so because they mitigate the race condition like these examples do. I don't think examples like this can be the kind we are (really) looking for: an instance where the proposed change would break code that behaves correctly today.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/72281283-6824-4396-abd2-0dcb2b08ad62n%40googlegroups.com.

Tim King

unread,
Dec 30, 2020, 7:44:54 PM12/30/20
to jake...@gmail.com, golang-nuts
This is a very interesting and ambitious experiment! If successful, this is getting at the type of evidence that would be useful when considering to change the language.

When evaluating the change, we will need to find cases where the semantics of the program might differ after the change. Semantic changes include race conditions being eliminated (like in the Bugs section for the project). It can also include changes to the memory layout after the loop (one aliased memory allocation vs a memory allocation per iteration being stored in memory), changes in branching behavior (pointer comparisons), changes in variable lifetime when an escape does happen (which can impact when finalizers get run), and order of calls to external functions. Looking at closures of loop variables and escaping references is definitely a good start. I am not yet sure exactly what would suffice (but happy to chat over email).

To make a decision, we probably want to know for each location:
* Is it "semantically equivalent" before/after the change? This can be done by a combination of automation and manual effort.
* If it differs, was the behavior before clearly "incorrect"/"has "bugs"?
* If it differs, was the behavior after clearly "incorrect"/"has "bugs"?
For the above, a location might not have a clear answer after expert review. It might be worth documenting this. A few of these map decently onto the existing categories. Knowing when the number of differing locations where someone is clearly helped vs differing locations where someone is clearly hurt is probably the most important in making a decision.

Cases with bugs may want to include a notion of confidence. Examples like those in the Bugs section are extremely likely to be bugs. Others may not be as clear. ("Is there actually a race condition here?" can be hard to assess in someone else's code.) The current phrasing "could lead to undesirable behavior" leaves the confidence ambiguous. It might be helpful to clarify what this means to reviewers.

The current wording also currently suggests we can also stop looking once we have identified "a fix to a bug" without considering downstream interactions after the loop. For example: `for _, x := range l { defer func() { x.someMethod() }() }`. That is almost certainly a bug. However, on the list `{nil, <something valid>}` the original version avoids a nil pointer dereference panic by only using the last element. Basically having two bugs pull in the opposite direction mitigating the other. I have only really seen this where an iffy unit test was broken by creating a new copy for each loop. The larger point though is it is not always clear how much context to look at before labeling an example "clearly buggy before and would be fixed". Stopping early could be a source of false negatives. This may be more of a theoretical concern than a real one.

Minor technical points:

For folks developing/commenting on the static analyzers, it would also help to formalize the notion of "semantic equivalence" that is trying to be reached. It is likely a refinement relation like one would see in the program transformation literature. The change will always be safe when the loop after the change is a refinement of the original loop. (If we also care about performance, we will also want to know how often this transformation can be undone to reuse the memory, e.g. the original refines the modified. But declaring this out of scope is reasonable.)

Technically, I believe a range loop variable can be used in a goroutine safely. A very contrived example: https://play.golang.org/p/jgZbGg_XP6S . In practice I can not see much use for this kind of pattern, but it does exist.  

It is correct that the range loop variable can be used in a goroutine safely. However, the existing loopclosure vet check will not report on this instance. Today it only reports when the go statement is the last statement in the for loop. I have seen non-contrived examples where range loop variables are used by multiple goroutines. What comes to mind is doing multiple RPCs, with each RPC in a goroutine on the same range variable and waiting for all of these to finish before starting the next iteration of the loop. Essentially `for _, x := range l { someComplexOperationWithGoRoutines(&x) }`.

So my plan -- unless someone comes up with something I have missed -- is to omit loopclosure findings from the crowdsourcing effort. Once this change is made, loopclosure findings will be kept on disk, but not uploaded to GitHub for crowdsourcing.

Omitting loopclosure reports from crowdsourcing makes sense to me. It would also be valuable to track and make available how many cases of these there are. These would essentially be locations that do differ and automation can confidently deduce that there is a bug before. There is also probably not a bug after (harder to guarantee this direction). FWIW the key approximation loopclosure makes is that the final statement is reachable by more than the last element. When this is not true, it can give false positives. This does not sound like it is worth a human's time. (This also applies to diagnostic reports on defer statements.)

Tyler Compton

unread,
Dec 30, 2020, 11:35:22 PM12/30/20
to K. Alex Mills, golang-nuts
This is a very interesting project! I went ahead and labeled a few of the issues. However, much as I hate to pour cold water on the idea, I have some doubts about how effective this approach can be. The githubvet bot has reported over ten thousand issues so far, which is quite a lot. My gut feeling is that if code does exist that correctly relies on the current behavior, it would be extremely subtle. If we're going to get through all of these issues, we won't be able to give each example a complete deep dive which I think means we'll likely miss very subtle cases. Not that I have any better ideas :)

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

K. Alex Mills

unread,
Dec 30, 2020, 11:58:23 PM12/30/20
to Tyler Compton, golang-nuts
On Wed, Dec 30, 2020, 10:34 PM Tyler Compton <xav...@gmail.com> wrote:
This is a very interesting project! I went ahead and labeled a few of the issues. However, much as I hate to pour cold water on the idea, I have some doubts about how effective this approach can be. The githubvet bot has reported over ten thousand issues so far, which is quite a lot.

This is a valid concern but since the tool remains under active development, I would caution against making an assessment based on the number of issues currently reported in the tracker.

Part of the problem is that the current tracker contains issues from multiple runs, since I've rescanned during testing, as I find and fixing issues.

There are still some things that can be done to bring down the false-positive rate, but before addressing that I need to spend time on some of the false-negatives that I am currently aware of.

My gut feeling is that if code does exist that correctly relies on the current behavior, it would be extremely subtle. If we're going to get through all of these issues, we won't be able to give each example a complete deep dive which I think means we'll likely miss very subtle cases. Not that I have any better ideas :)

Part of the hope has been that a large number of community participants would help parallelize the effort and (perhaps) even get multiple pairs of eyes on each issue. Whether that will work or not remains to be seen.

I will also say that many of the issues I have looked at have seemed to be classifiable rather quickly. OTOH, that could also be an indication that the false-positive rate remains a little high.

All that said, so far, the issues reported only focus on race-conditions or incorrectly storing a reference to a range-loop pointer. Tim King has just pointed out a few other concerns like pointer comparison and finalizers for which analyzers have yet to be written. My perception is that these two features aren't very widely used, so I wouldn't expect to find a large number of instances like these.

Tyler Compton

unread,
Dec 30, 2020, 11:59:59 PM12/30/20
to K. Alex Mills, golang-nuts
This is a valid concern but since the tool remains under active development, I would caution against making an assessment based on the number of issues currently reported in the tracker.

Fair point! I'll reserve judgement in that case then. 

K. Alex Mills

unread,
Jan 1, 2021, 4:59:30 PM1/1/21
to Tim King, jake...@gmail.com, golang-nuts
Technically, I believe a range loop variable can be used in a goroutine safely. A very contrived example: https://play.golang.org/p/jgZbGg_XP6S . In practice I can not see much use for this kind of pattern, but it does exist.  

It is correct that the range loop variable can be used in a goroutine safely. However, the existing loopclosure vet check will not report on this instance.
 
VetBot uses a variant of loopclosure which I modified to be more sensitive (to avoid false-negatives). The variant in VetBot does report code like this example even though it is a false-positive, due to the break.
 
When evaluating the change, we will need to find cases where the semantics of the program might differ after the change. Semantic changes include race conditions being eliminated (like in the Bugs section for the project). It can also include changes to the memory layout after the loop (one aliased memory allocation vs a memory allocation per iteration being stored in memory),changes in branching behavior (pointer comparisons), changes in variable lifetime when an escape does happen (which can impact when finalizers get run), and order of calls to external functions. Looking at closures of loop variables and escaping references is definitely a good start. I am not yet sure exactly what would suffice (but happy to chat over email).  

I like how this list starts by laying out all the semantic changes in the proposal rather than starting from asking 'what can go wrong' -- which is where I started.

Now that we have this list -- I'm looking at each of the entries in this list and asking myself "what could go wrong, and how can we detect it?" There are a few cases where I'm not able to come up with a satisfactory answer on my own. I'm hoping that by laying out my understanding below in a transparent way, someone might be willing to help fill in the gaps.

race conditions being eliminated (like in the Bugs section for the project)  
 
It's almost certainly the case that any current code which relies on the existence of a race condition for its correct operation is flaky. We can probably safely classify these as 'beneficial' semantic changes, but that can't be a unilateral decision. If we're willing to accept that any code that passes a range-loop pointer to a goroutine is going to have its behavior improved by the change, we can avoid many of the issues being reported by the prototype.

changes to the memory layout after the loop (one aliased memory allocation vs a memory allocation per iteration being stored in memory)  

It seems to me that this can only break existing code which relies on the very specific way in which variables are laid out in memory. Based on the little I know, I would expect this to require the code to do something with the unsafe package or use CGO. Are there other ways to make code rely on memory layout in Go (other than things like pointer comparisons and finalizers mentioned below)?

changes in branching behavior (pointer comparisons)
This makes sense and it seems straightforward to check for by finding instances where a range-loop reference is used in a boolean expression.

changes in variable lifetime when an escape does happen (which can impact when finalizers get run) 

I'm still processing this one; but after a glance, it looks like we might only be interested in when a range-loop reference is passed as the first argument to `runtime.SetFinalizer`. If that's the case, writing an analyzer should be straightforward.

order of calls to external functions

I must be missing something here, since I am not sure how the proposed language change can impact the order in which calls to external functions are made. I don't think that I'm grasping your intended meaning.
Reply all
Reply to author
Forward
0 new messages