> The hex of the odd characters is 78 ee 22 21 08. At the time of the crash there were 30,427 goroutines running. This is running against 40ba4d4e4672.
Does FreeBSD limit the per process threads to 256? Of the running goroutines, can you summarise the top of the call stack for each as I suspect almost all will be executing syscalls.
On 15/11/2012, at 9:49, John Graham-Cumming <j...@cloudflare.com> wrote:
On Wednesday, November 14, 2012 11:10:26 PM UTC, Dave Cheney wrote:
> Does FreeBSD limit the per process threads to 256? Of the running
> goroutines, can you summarise the top of the call stack for each as I
> suspect almost all will be executing syscalls.
On Wednesday, November 14, 2012 11:18:31 PM UTC, John Graham-Cumming wrote:
> On Wednesday, November 14, 2012 11:10:26 PM UTC, Dave Cheney wrote:
>> Does FreeBSD limit the per process threads to 256? Of the running
>> goroutines, can you summarise the top of the call stack for each as I
>> suspect almost all will be executing syscalls.
In the latter issue, you'll find a link to a CL which should reduce
the number of goroutines in syscall state, however, from memory the CL
only covers linux, not freebsd, but adapting it should not be tricky.
On Thu, Nov 15, 2012 at 10:10 AM, Dave Cheney <d...@cheney.net> wrote:
> Does FreeBSD limit the per process threads to 256? Of the running
> goroutines, can you summarise the top of the call stack for each as I
> suspect almost all will be executing syscalls.
> On 15/11/2012, at 9:49, John Graham-Cumming <j...@cloudflare.com> wrote:
> On Wednesday, November 14, 2012 10:44:06 PM UTC, Dave Cheney wrote:
>> Can you share some more details of the platform you found this error on?
> 64-bit FreeBSD
>> Can you share some details on the condition of the 30,000 odd goroutines
>> at the time of the panic? How many of them were in syscall.Syscall?
> Here are the counts and status of the goroutines.
Thanks. It's entirely possible that this is simply a 'you can't have any more threads' situation, but I'd like to understand the weird error message to make sure that it's not something else.
But it's peculiar that these lines do not print "error: ". I don't
know where that is coming from in your output. And, of course,
strerror should not return a garbage string. This code is compiled by
gcc and invokes libc functions in the usual way. strerror should not
return a garbage pointer.
Hmmm, wait. This file does not #include <string.h>. It's possible
that strerror was never declared and that GCC is implicitly declaring
it to return int. On amd64 int is 32 bits and char* is 64 bits, so it
is possible that when the return value is moved from %rax to %rdx only
the low order 32 bits are moved. This might then be an invalid
pointer, causing printf to spit out garbage. A series of guesses, to
be sure, but a possible explanation for what you are seeing, except
for the "error: " string. But to be safe let's have all those files
#include <string.h>. Unfortunately, if these guesses are correct,
there is no way to determine what error pthread_create actually
returned.
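
The guess above can be illustrated with a minimal sketch in C. It is not the runtime/cgo code: the helper names `low32_of_pointer` and `format_thread_error` are hypothetical, and the sketch only models the suspected truncation arithmetically rather than reproducing the compiler's behaviour. It assumes a platform where `strerror` is declared in `<string.h>`.

```c
#include <assert.h>
#include <errno.h>    /* EAGAIN: returned by pthread_create when no thread can be created */
#include <stdint.h>
#include <stdio.h>
#include <string.h>   /* declares char *strerror(int); pre-C99 rules otherwise assume int */

/* Hypothetical helper: models the suspected bug by keeping only the low
 * 32 bits of a 64-bit pointer value, as would happen if the result passed
 * through an implicitly declared int return type. */
static uint64_t low32_of_pointer(uint64_t addr)
{
    return addr & 0xffffffffULL;
}

/* Hypothetical helper: formats a pthread_create-style error code.
 * pthread_create returns the error number directly instead of setting errno,
 * so the caller passes that return value in as err. */
static int format_thread_error(char *buf, size_t len, int err)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "pthread_create failed: %s", strerror(err));
}
```

With the declaration in scope, `format_thread_error(buf, sizeof buf, EAGAIN)` writes the platform's own message for EAGAIN into `buf`. By contrast, `low32_of_pointer(0x00007f8812345678)` yields `0x12345678`, which is unlikely to be a valid string address and would explain garbage output from `%s`.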
On Thursday, November 15, 2012 1:03:40 AM UTC, Ian Lance Taylor wrote:
> Hmmm, wait. This file does not #include <string.h>. It's possible
> that strerror was never declared and that GCC is implicitly declaring
> it to return int. On amd64 int is 32 bits and char* is 64 bits, so it
> is possible that when the return value is moved from %rax to %rdx only
> the low order 32 bits are moved. This might then be an invalid
> pointer, causing printf to spit out garbage. A series of guesses, to
> be sure, but a possible explanation for what you are seeing, except
> for the "error: " string. But to be safe let's have all those files
> #include <string.h>. Unfortunately, if these guesses are correct,
> there is no way to determine what error pthread_create actually
> returned.
That seems like a likely explanation. The Linux version of the file does include <string.h> and there's a comment about it being for strerror(), but the FreeBSD version does not. Note that on FreeBSD strerror() requires <stdio.h> not <string.h>.
It's likely that the problem I am seeing is actually just a thread limit on the machine being hit. If it turns out that the error message is garbled because of this issue then I'm happy because it's not something more serious.
But... shouldn't gcc be giving a warning, and shouldn't -Werror be used?
# runtime/cgo
./gcc_linux_amd64.c: In function ‘libcgo_sys_thread_start’:
./gcc_linux_amd64.c:45:3: error: format ‘%s’ expects argument of type
‘char *’, but argument 3 has type ‘int’ [-Werror=format]
cc1: all warnings being treated as errors
On Thu, Nov 15, 2012 at 5:29 PM, Ian Lance Taylor <i...@google.com> wrote:
> # runtime/cgo
> ./gcc_linux_amd64.c: In function ‘libcgo_sys_thread_start’:
> ./gcc_linux_amd64.c:45:3: error: format ‘%s’ expects argument of type
> ‘char *’, but argument 3 has type ‘int’ [-Werror=format]
> cc1: all warnings being treated as errors
Sounds great. In fact I think we should add all the options from
src/cmd/dist/build.c:
>> # runtime/cgo
>> ./gcc_linux_amd64.c: In function ‘libcgo_sys_thread_start’:
>> ./gcc_linux_amd64.c:45:3: error: format ‘%s’ expects argument of type
>> ‘char *’, but argument 3 has type ‘int’ [-Werror=format]
>> cc1: all warnings being treated as errors
> Sounds great. In fact I think we should add all the options from
> src/cmd/dist/build.c:
On Thu, Nov 15, 2012 at 6:01 PM, Dave Cheney <d...@cheney.net> wrote:
> OK, I'll prepare a CL, it might take a bit of testing.
> On Thu, Nov 15, 2012 at 5:52 PM, Ian Lance Taylor <i...@google.com> wrote:
>> On Wed, Nov 14, 2012 at 10:45 PM, Dave Cheney <d...@cheney.net> wrote:
>>> What about something like this
>>> diff -r ceaa16504f36 src/pkg/runtime/cgo/cgo.go
>>> --- a/src/pkg/runtime/cgo/cgo.go Thu Nov 15 13:59:46 2012 +1100
>>> +++ b/src/pkg/runtime/cgo/cgo.go Thu Nov 15 17:44:53 2012 +1100
>>> @@ -14,6 +14,7 @@
>>> #cgo darwin LDFLAGS: -lpthread
>>> #cgo freebsd LDFLAGS: -lpthread
>>> #cgo linux LDFLAGS: -lpthread
>>> +#cgo CFLAGS: -Werror
>>> #cgo netbsd LDFLAGS: -lpthread
>>> #cgo openbsd LDFLAGS: -lpthread
>>> #cgo windows LDFLAGS: -lm -mthreads
>>> diff -r ceaa16504f36 src/pkg/runtime/cgo/gcc_linux_amd64.c
>>> --- a/src/pkg/runtime/cgo/gcc_linux_amd64.c Thu Nov 15 13:59:46 2012 +1100
>>> +++ b/src/pkg/runtime/cgo/gcc_linux_amd64.c Thu Nov 15 17:44:53 2012 +1100
>>> @@ -3,7 +3,6 @@
>>> // license that can be found in the LICENSE file.
>>> # runtime/cgo
>>> ./gcc_linux_amd64.c: In function ‘libcgo_sys_thread_start’:
>>> ./gcc_linux_amd64.c:45:3: error: format ‘%s’ expects argument of type
>>> ‘char *’, but argument 3 has type ‘int’ [-Werror=format]
>>> cc1: all warnings being treated as errors
>> Sounds great. In fact I think we should add all the options from
>> src/cmd/dist/build.c: