Parsing stack traces


Brendan Tracey

May 20, 2013, 4:10:11 PM
to golan...@googlegroups.com
I just had the following stack trace output, and I'm trying to get better at using stack traces to debug:

brendan:~/Documents/mygo$ go test nnet
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x78 pc=0x35420]

goroutine 18 [running]:
nnet.(*Net).seqLossAndDerivative(0x0, 0x2103897c0, 0x5, 0x3c, 0x2103d83c0, ...)
/Users/brendan/Documents/mygo/src/nnet/nnet.go:502 +0xb0
nnet.(*trainStruct).Compute(0x210312c60)
/Users/brendan/Documents/mygo/src/nnet/par.go:43 +0xc4
nnet.func·001()
/Users/brendan/Documents/mygo/src/nnet/par.go:112 +0x4c
created by nnet.(*Net).launchTrainWorkers
/Users/brendan/Documents/mygo/src/nnet/par.go:113 +0x564

goroutine 1 [chan receive]:
testing.RunTests(0x18a9f8, 0x223fe0, 0xb, 0xb, 0x1, ...)
/usr/local/go/src/pkg/testing/testing.go:434 +0x88e
testing.Main(0x18a9f8, 0x223fe0, 0xb, 0xb, 0x227580, ...)
/usr/local/go/src/pkg/testing/testing.go:365 +0x8a
main.main()
nnet/_test/_testmain.go:63 +0x9a

goroutine 12 [chan receive]:
nnet.(*Net).parLossAndDerivative(0x2102e4600, 0x21030b850, 0x2103d4a60, 0x4, 0x4, ...)
/Users/brendan/Documents/mygo/src/nnet/par.go:134 +0xb2
nnet.TestParLossAndDeriv(0x210312750)
/Users/brendan/Documents/mygo/src/nnet/nnet_test.go:625 +0x344
testing.tRunner(0x210312750, 0x2240b8)
/usr/local/go/src/pkg/testing/testing.go:353 +0x8a
created by testing.RunTests
/usr/local/go/src/pkg/testing/testing.go:433 +0x86b

goroutine 13 [select]:
nnet.(*trainStruct).Compute(0x210312c60)
/Users/brendan/Documents/mygo/src/nnet/par.go:41 +0x172
nnet.func·001()
/Users/brendan/Documents/mygo/src/nnet/par.go:112 +0x4c
created by nnet.(*Net).launchTrainWorkers
/Users/brendan/Documents/mygo/src/nnet/par.go:113 +0x564

goroutine 14 [select]:
nnet.(*trainStruct).Compute(0x210312c60)
/Users/brendan/Documents/mygo/src/nnet/par.go:41 +0x172
nnet.func·001()
/Users/brendan/Documents/mygo/src/nnet/par.go:112 +0x4c
created by nnet.(*Net).launchTrainWorkers
/Users/brendan/Documents/mygo/src/nnet/par.go:113 +0x564

... bunch more goroutines that look like 13 and 14



Questions:

1) What does the second line mean: [signal 0xb code=0x1 addr=0x78 pc=0x35420] ?
2) What does nnet.func·001() mean? The line it reports is close to (but not identical to) the launching of the goroutine. Is that just an identifier?
3) The signature of seqLossAndDerivative is 
func (net *Net) seqLossAndDerivative(inputs, trueOutputs [][]float64, totalDeriv []float64, tmpMemory []float64) (totalLoss float64)

Looking at the pointers in the call to seqLossAndDerivative for goroutine 18, I see 
nnet.(*Net).seqLossAndDerivative(0x0, 0x2103897c0, 0x5, 0x3c, 0x2103d83c0, ...)
This says that the inputs [][]float64 is nil, correct?

Thanks 

Ian Lance Taylor

May 20, 2013, 4:26:49 PM
to Brendan Tracey, golan...@googlegroups.com
On Mon, May 20, 2013 at 1:10 PM, Brendan Tracey
<tracey....@gmail.com> wrote:
> I just had the following stack trace output, and I'm trying to be better
> about using it to debug
>
> brendan:~/Documents/mygo$ go test nnet
> panic: runtime error: invalid memory address or nil pointer dereference
> [signal 0xb code=0x1 addr=0x78 pc=0x35420]
>
> goroutine 18 [running]:
> nnet.(*Net).seqLossAndDerivative(0x0, 0x2103897c0, 0x5, 0x3c, 0x2103d83c0,
> ...)

...

> Questions:
>
> 1) What does the second line mean: [signal 0xb code=0x1 addr=0x78
> pc=0x35420] ?

Your program received signal 0xb == 11 == SIGSEGV. The code value is
OS dependent; if you are running on GNU/Linux I believe it means that
the address is not mapped in memory. The addr field is the address
being accessed. The PC field is, well, the PC. If you run gdb on the
executable and enter "x/i 0x35420" you should see the exact
instruction that made the invalid access.
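
A minimal sketch of pulling the four fields out of that bracketed line, assuming it has exactly the format shown in this thread (other Go releases may print it differently). The names `faultInfo` and `parseFaultLine` are hypothetical helpers, not part of any library.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// faultInfo holds the fields of the bracketed line printed after the panic message.
type faultInfo struct {
	Signal, Code, Addr, PC uint64
}

// faultLine matches lines shaped like "[signal 0xb code=0x1 addr=0x78 pc=0x35420]".
var faultLine = regexp.MustCompile(`^\[signal (\S+) code=(\S+) addr=(\S+) pc=(\S+)\]$`)

// parseFaultLine converts the four hexadecimal fields into integers.
func parseFaultLine(line string) (faultInfo, error) {
	m := faultLine.FindStringSubmatch(line)
	if m == nil {
		return faultInfo{}, fmt.Errorf("not a signal line: %q", line)
	}
	var vals [4]uint64
	for i, s := range m[1:] {
		v, err := strconv.ParseUint(s, 0, 64) // base 0 accepts the 0x prefix
		if err != nil {
			return faultInfo{}, err
		}
		vals[i] = v
	}
	return faultInfo{Signal: vals[0], Code: vals[1], Addr: vals[2], PC: vals[3]}, nil
}

func main() {
	info, err := parseFaultLine("[signal 0xb code=0x1 addr=0x78 pc=0x35420]")
	if err != nil {
		panic(err)
	}
	// Signal 11 is SIGSEGV; Addr is the faulting address; PC is the faulting instruction.
	fmt.Printf("signal=%d code=%d addr=%#x pc=%#x\n", info.Signal, info.Code, info.Addr, info.PC)
}
```

The decimal signal value makes the 0xb == 11 correspondence explicit, and the pc value is what you would hand to gdb.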

> 2) What does nnet.func·001() mean? The line it reports is close to (but not
> identical to) the launching of the goroutine. Is that just an identifier?

That is a name generated by the compiler for a function literal. If
your code is something like
go func() { }()
then that is the name of the code generated for "func() {}".
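
A small sketch for seeing the generated name of a function literal from inside a program. The helper `funcName` is hypothetical; the exact name printed depends on the compiler and its version (the `func·001` form in the trace above comes from the toolchain of that era), so no particular output is promised here.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// funcName returns the symbol name the toolchain assigned to f.
func funcName(f func()) string {
	return runtime.FuncForPC(reflect.ValueOf(f).Pointer()).Name()
}

func main() {
	literal := func() {} // a function literal has no name in the source
	fmt.Println(funcName(literal))
}
```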

> 3) The signature of seqLossAndDerivative is
> func (net *Net) seqLossAndDerivative(inputs, trueOutputs [][]float64,
> totalDeriv []float64, tmpMemory []float64) (totalLoss float64)
>
> Looking at the pointers in the call to seqLossAndDerivative for goroutine
> 18, I see
> nnet.(*Net).seqLossAndDerivative(0x0, 0x2103897c0, 0x5, 0x3c, 0x2103d83c0,
> ...)
> This says that the inputs [][]float64 is nil, correct?

No, because the first parameter is the receiver, net. A slice takes
three words when passed as an argument. This says that inputs is a
slice with len 0x5 and capacity 0x3c. The 0x2103897c0 is the address
of the underlying array for the slice. The 0x2103d83c0 is the address
of the underlying array for trueOutputs.
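
To illustrate the three-word layout, here is a minimal sketch assuming the standard gc toolchain; `sliceWords` is a hypothetical helper, and the slice is built with the same len and cap that appear in the trace.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceWords returns how many machine words a slice value occupies.
func sliceWords() uintptr {
	var s []float64
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	// len 0x5 and cap 0x3c, matching the values read off the trace.
	inputs := make([][]float64, 5, 60)
	fmt.Println("words per slice:", sliceWords())
	// %p on a slice prints the address of its underlying array.
	fmt.Printf("ptr=%p len=%#x cap=%#x\n", inputs, len(inputs), cap(inputs))
}
```

Counting words this way lines the trace up with the signature: one word for the receiver, then three for inputs, then the first word of trueOutputs before the listing is cut off with "...".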

You can see from this that the method is being called on a nil
receiver. If that is unexpected, that is most likely your bug.
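
A minimal sketch of why the call itself succeeds and the failure only shows up inside the method. The `Net` type here is a hypothetical stand-in, not the poster's actual type.

```go
package main

import "fmt"

// Net is a hypothetical stand-in for the type in the trace.
type Net struct {
	layers int
}

// describe never touches the receiver's fields, so a nil receiver is harmless.
func (n *Net) describe() string { return "a Net method" }

// size reads a field, which dereferences the receiver.
func (n *Net) size() int { return n.layers }

// callOnNil calls both methods on a nil *Net and returns the recovered panic value.
func callOnNil() (recovered interface{}) {
	defer func() { recovered = recover() }()
	var n *Net
	_ = n.describe() // fine: the method is entered with n == nil
	_ = n.size()     // panics: reading n.layers dereferences nil
	return nil
}

func main() {
	fmt.Println(callOnNil())
}
```

This is why the trace shows the method frame with 0x0 as its first argument: the panic is raised inside the method, at the first field access.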

Ian

Brendan Tracey

May 20, 2013, 4:30:20 PM
to golan...@googlegroups.com, Brendan Tracey


On Monday, May 20, 2013 1:26:49 PM UTC-7, Ian Lance Taylor wrote:
> On Mon, May 20, 2013 at 1:10 PM, Brendan Tracey
> <tracey....@gmail.com> wrote:
> > I just had the following stack trace output, and I'm trying to be better
> > about using it to debug
> >
> > brendan:~/Documents/mygo$ go test nnet
> > panic: runtime error: invalid memory address or nil pointer dereference
> > [signal 0xb code=0x1 addr=0x78 pc=0x35420]
> >
> > goroutine 18 [running]:
> > nnet.(*Net).seqLossAndDerivative(0x0, 0x2103897c0, 0x5, 0x3c, 0x2103d83c0,
> > ...)
>
> ...
>
> > Questions:
> >
> > 1) What does the second line mean: [signal 0xb code=0x1 addr=0x78
> > pc=0x35420] ?
>
> Your program received signal 0xb == 11 == SIGSEGV. The code value is
> OS dependent; if you are running on GNU/Linux I believe it means that
> the address is not mapped in memory. The addr field is the address
> being accessed. The PC field is, well, the PC. If you run gdb on the
> executable and enter "x/i 0x35420" you should see the exact
> instruction that made the invalid access.

"The address field being accessed". Which address field? The goroutine? The function?

> > 2) What does nnet.func·001() mean? The line it reports is close to (but not
> > identical to) the launching of the goroutine. Is that just an identifier?
>
> That is a name generated by the compiler for a function literal. If
> your code is something like
>     go func() { }()
> then that is the name of the code generated for "func() {}".
>
> > 3) The signature of seqLossAndDerivative is
> > func (net *Net) seqLossAndDerivative(inputs, trueOutputs [][]float64,
> > totalDeriv []float64, tmpMemory []float64) (totalLoss float64)
> >
> > Looking at the pointers in the call to seqLossAndDerivative for goroutine
> > 18, I see
> > nnet.(*Net).seqLossAndDerivative(0x0, 0x2103897c0, 0x5, 0x3c, 0x2103d83c0,
> > ...)
> > This says that the inputs [][]float64 is nil, correct?
>
> No, because the first parameter is the receiver, net. A slice takes
> three words when passed as an argument. This says that inputs is a
> slice with len 0x5 and capacity 0x3c. The 0x2103897c0 is the address
> of the underlying array for the slice. The 0x2103d83c0 is the address
> of the underlying array for trueOutputs.
>
> You can see from this that the method is being called on a nil
> receiver. If that is unexpected, that is most likely your bug.

That is indeed true. I was literally in the middle of writing a post saying I had figured that one out when I saw your reply.

Thanks again for all your help.

Ian Lance Taylor

May 20, 2013, 4:39:40 PM
to Brendan Tracey, golan...@googlegroups.com
On Mon, May 20, 2013 at 1:30 PM, Brendan Tracey wrote:
> "The address field being accessed". Which address field? The goroutine? The function?

The hardware memory address being accessed by the compiled code. In
this case addr is 0x78. That means that the program tried to access
address 0x78, and failed. In this case it is most likely because
the code was accessing a field at offset 0x78 from net, which, when
net is nil, means accessing address 0x78.
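
A minimal sketch of the offset arithmetic, using a hypothetical `Net` whose padding is sized so that a field lands at offset 0x78. The real struct's layout is not shown in this thread, so the field names are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Net is hypothetical: pad is sized so that weight starts at byte offset 0x78.
type Net struct {
	pad    [0x78]byte
	weight float64
}

// weightOffset reports where the weight field sits relative to the start of a Net.
func weightOffset() uintptr {
	var n Net
	return unsafe.Offsetof(n.weight)
}

func main() {
	// With a nil *Net the base address is 0, so reading weight touches address 0 + 0x78.
	fmt.Printf("offset of weight: %#x\n", weightOffset())
}
```

Comparing the addr value from a crash against the field offsets of the receiver type is a quick way to guess which field access faulted.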

Ian

Brendan Tracey

May 20, 2013, 4:41:07 PM
to Ian Lance Taylor, golan...@googlegroups.com
Ahh, okay, I understand now. Thanks.

Brendan Tracey

May 21, 2013, 2:00:21 AM
to golan...@googlegroups.com, Ian Lance Taylor
Ahhh, it turns out the problem (and the solution) was exactly what we were talking about in the other thread. However, I'm not sure why it comes across as a nil pointer dereference.

I think my code has the structure here 

though my actual structure is more complicated (it has a quit channel, for one), so maybe I'm missing the critical bit (it's hard to test when the playground doesn't have parallelism). This code is clearly wrong, for the reason mentioned above, but I'm not sure what is nil that gets dereferenced. i changes inside the goroutines, but all of the structures are created, so there should be some channel that exists.

Jesse McNelis

May 21, 2013, 2:16:39 AM
to Brendan Tracey, golang-nuts, Ian Lance Taylor
You have a race between the loop and the goroutines it spawns.
 
--
=====================
http://jessta.id.au

Brendan Tracey

May 21, 2013, 2:25:39 AM
to Jesse McNelis, golang-nuts, Ian Lance Taylor
Yes, I understand that I'm creating a data race; I don't understand why it would trigger a null pointer dereference. The example in the FAQ does not.

Jesse McNelis

May 21, 2013, 2:34:11 AM
to Brendan Tracey, golang-nuts, Ian Lance Taylor
On Tue, May 21, 2013 at 4:25 PM, Brendan Tracey <tracey....@gmail.com> wrote:
> Yes, I understand that I'm creating a data race; I don't understand why it would trigger a null pointer dereference. The example in the FAQ does not.

A goroutine can see a different 'list' to the 'i' it sees.
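
The code under discussion is not shown in this thread, so the following is only a sketch of the usual fix for this kind of race, with hypothetical names (`runWorkers`, `list`). It assumes the goroutines index a shared slice with the loop variable. Before Go 1.22 a closure that captured `i` directly shared one variable with the loop, so a goroutine could read an index the loop had already advanced and, for example, pick up an element that was still nil. Passing `i` as an argument gives each goroutine its own copy.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts one goroutine per element and hands each its own index.
func runWorkers(list []*int) []int {
	out := make([]int, len(list))
	var wg sync.WaitGroup
	for i := range list {
		wg.Add(1)
		go func(i int) { // i is copied at launch, so later loop iterations cannot change it
			defer wg.Done()
			out[i] = *list[i] // each goroutine writes only its own element
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	a, b, c := 1, 2, 3
	fmt.Println(runWorkers([]*int{&a, &b, &c}))
}
```

Running the tests with `go test -race` is a reasonable way to confirm whether a loop like this still has a race.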
 


--
=====================
http://jessta.id.au
