Hard-to-explain race detector report

198 views
Skip to first unread message

Caleb Spare

unread,
Jun 6, 2023, 7:31:23 PM6/6/23
to golang-nuts
Can someone explain why the following test shows a race between the
indicated lines?

https://github.com/cespare/misc/blob/b2e201dfbe36504c88e521e02bc5d8fbb04a4532/httprace/httprace_test.go#L12-L43

The race seems to be triggered by the very last line of the test:

get(client1)

If I comment that out, then the race detector doesn't complain. But
then it seems that a read of a variable which happens before an HTTP
request which causes a write of the variable ultimately races with the
original read, which doesn't make sense.

It seems like a false positive (perhaps the race detector doesn't
understand causality across a TCP connection?), but so far I've never
seen the race detector have a false positive, so I think I must be
missing something.

I wrote a slightly simpler test (see TestHTTPRace2 right below in the
same file) which tries to make the race happen using a regular HTTP
handler and a single client and the race detector doesn't complain.

Caleb

burak serdar

unread,
Jun 7, 2023, 3:52:35 AM6/7/23
to Caleb Spare, golang-nuts
On Tue, Jun 6, 2023 at 5:31 PM Caleb Spare <ces...@gmail.com> wrote:
Can  someone explain why the following test shows a race between the
indicated lines?

https://github.com/cespare/misc/blob/b2e201dfbe36504c88e521e02bc5d8fbb04a4532/httprace/httprace_test.go#L12-L43

The race seems to be triggered by the very last line of the test:

get(client1)

If I comment that out, then the race detector doesn't complain. But
then it seems that a read of a variable which happens before an HTTP
request which causes a write of the variable ultimately races with the
original read, which doesn't make sense.

I ran that test. As you said, the race detector complains about the concurrent write to a variable that was read before in another goroutine.

It is technically what the race detector calls a race because it is unsynchronized concurrent access to a shared variable. It does look like a false-positive though.
 

It seems like a false positive (perhaps the race detector doesn't
understand causality across a TCP connection?), but so far I've never
seen the race detector have a false positive, so I think I must be
missing something.

I wrote a slightly simpler test (see TestHTTPRace2 right below in the
same file) which tries to make the race happen using a regular HTTP
handler and a single client and the race detector doesn't complain.

Caleb

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAGeFq%2B%3DZGE5agaLYDgsdYvykbaWwHgjtKJf9q%2B1YJhR26%3DY45Q%40mail.gmail.com.

Sven Anderson

unread,
Jun 7, 2023, 5:34:03 AM6/7/23
to Caleb Spare, golang-nuts
That’s not only a read/write race, it’s also a write/write race. Every request to the server creates a new Go routine that might increment newConns in parallel, so it may get corrupted. Same for lines 39/40.

You might claim, that for infrastructural reasons, there can be no concurrent requests to your server, but that would just mean that the race is not triggered, it’s there nevertheless.

burak serdar

unread,
Jun 7, 2023, 7:12:13 AM6/7/23
to Sven Anderson, Caleb Spare, golang-nuts


On Wed, Jun 7, 2023, 12:33 PM Sven Anderson <sv...@redhat.com> wrote:
That’s not only a read/write race, it’s also a write/write race. Every request to the server creates a new Go routine that might increment newConns in parallel, so it may get corrupted. Same for lines 39/40.

The write/write race will happen only if the increment is done from with the new goroutine. If the increment is done after accepting a connection but before starting a new goroutine there is no write/write race, right?

But that is not the problem here. Race detector is complaining about a concurrent write after read

Ian Lance Taylor

unread,
Jun 7, 2023, 10:05:36 AM6/7/23
to Caleb Spare, golang-nuts
I haven't looked at your case in full detail. But it's true that the
race detector does not know anything about TCP connections. In
general it only knows about the cases described in the Go memory model
(https://go.dev/ref/mem). It does seem that your code is only
avoiding a race because the TCP read serializes the server goroutine
and the main client goroutine. The race detector doesn't know
anything about that.

Ian

Caleb Spare

unread,
Jun 7, 2023, 1:21:36 PM6/7/23
to Ian Lance Taylor, golang-nuts
Makes sense, but then I don't understand why the second test doesn't
show a race as well.

Caleb

Caleb Spare

unread,
Jun 7, 2023, 1:23:04 PM6/7/23
to Sven Anderson, golang-nuts
On Wed, Jun 7, 2023 at 2:33 AM Sven Anderson <sv...@redhat.com> wrote:
>
> That’s not only a read/write race, it’s also a write/write race. Every request to the server creates a new Go routine that might increment newConns in parallel, so it may get corrupted. Same for lines 39/40.
>
> You might claim, that for infrastructural reasons, there can be no concurrent requests to your server, but that would just mean that the race is not triggered, it’s there nevertheless.

The race detector reports on races that actually happened, not races
that could happen.

Sven Anderson

unread,
Jun 7, 2023, 4:19:44 PM6/7/23
to Caleb Spare, golang-nuts


Caleb Spare <ces...@gmail.com> schrieb am Mi. 7. Juni 2023 um 19:22:
On Wed, Jun 7, 2023 at 2:33 AM Sven Anderson <sv...@redhat.com> wrote:
>
> That’s not only a read/write race, it’s also a write/write race. Every request to the server creates a new Go routine that might increment newConns in parallel, so it may get corrupted. Same for lines 39/40.
>
> You might claim, that for infrastructural reasons, there can be no concurrent requests to your server, but that would just mean that the race is not triggered, it’s there nevertheless.

The race detector reports on races that actually happened, not races
that could happen.

A race condition is a condition, not an event, so I think „happened“ is not correct, but maybe someone who knows the heuristics of the race detector better can clear that up.

And why would it matter then if it understands the causalities of TCP? One must be incorrect: Either your understanding of the TCP causalities or of how the race detector is working, because it does indeed report something, right? ;-)

burak serdar

unread,
Jun 8, 2023, 3:24:56 AM6/8/23
to Sven Anderson, Caleb Spare, golang-nuts
On Wed, Jun 7, 2023 at 2:19 PM Sven Anderson <sv...@redhat.com> wrote:


Caleb Spare <ces...@gmail.com> schrieb am Mi. 7. Juni 2023 um 19:22:
On Wed, Jun 7, 2023 at 2:33 AM Sven Anderson <sv...@redhat.com> wrote:
>
> That’s not only a read/write race, it’s also a write/write race. Every request to the server creates a new Go routine that might increment newConns in parallel, so it may get corrupted. Same for lines 39/40.
>
> You might claim, that for infrastructural reasons, there can be no concurrent requests to your server, but that would just mean that the race is not triggered, it’s there nevertheless.

The race detector reports on races that actually happened, not races
that could happen.

A race condition is a condition, not an event, so I think „happened“ is not correct, but maybe someone who knows the heuristics of the race detector better can clear that up.

The race detector reports when shared memory is accessed concurrently from multiple goroutines. So it detects when a memory race happens. In this case however, what is reported is a concurrent write-after-read. Is that really a memory race?
 

Brian Candler

unread,
Jun 8, 2023, 5:13:39 AM6/8/23
to golang-nuts
> In this case however, what is reported is a concurrent write-after-read. Is that really a memory race?

In general, it would be: if these accesses are not synchronized, there's a risk that the goroutines could slip relative to each other so that the write and read take place at the same time, and the read sees an inconsistent value.

In this particular case: if there's some external synchronization which means that the read is always strictly before the next write, then the problem may not occur in practice.

I would be inclined to change newconns to atomic.Int32 so it's safe either way, and the race detector should be silenced.

Reply all
Reply to author
Forward
0 new messages