On Wed, Jan 28, 2015 at 10:18 AM, <ralph....@sendgrid.com> wrote:
>
> We recently started seeing a crash in production related to our long running
> Go daemon crashing with: pthread_create failed: Resource temporarily
> unavailable.

I believe this can happen on GNU/Linux if your program uses cgo and if
thread A is in the Go runtime starting up a new thread B while thread
C is execing a program. The underlying cause is that while one thread
is calling exec, the Linux kernel fails attempts by other threads to
call clone by returning EAGAIN, which pthread_create reports as
"Resource temporarily unavailable". (Look for uses of the in_exec
field in the kernel sources.)

You said your program uses cgo, so if it also calls exec (that is,
uses the os/exec package or calls os.StartProcess), then this may be
the problem.

A simple fix might be to edit runtime/cgo/gcc_linux_amd64.c to retry
in a loop when pthread_create returns EAGAIN. If that seems to help,
please open a bug report and we can address this for Go 1.5 one way
or another.
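
For illustration, a retry loop along those lines might look roughly
like the sketch below. The helper name try_pthread_create, the limit
of 20 attempts, and the millisecond backoff are all assumptions made
up for this example; none of it is taken from the runtime sources,
and the real file would need the call wired in at the place where it
currently calls pthread_create directly.

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical helper (not part of the Go runtime): call pthread_create
 * and retry a bounded number of times while it returns EAGAIN.
 * pthread_create returns the error number directly; it does not set errno.
 */
static int
try_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start)(void *), void *arg)
{
	struct timespec ts;
	int attempt, err;

	/* The attempt count and delays are arbitrary choices for this sketch. */
	for (attempt = 0; attempt < 20; attempt++) {
		err = pthread_create(thread, attr, start, arg);
		if (err != EAGAIN)
			return err; /* 0 on success, or a non-transient error */

		/* Sleep a little longer on each retry: 1ms, 2ms, ..., 20ms. */
		ts.tv_sec = 0;
		ts.tv_nsec = (long)(attempt + 1) * 1000 * 1000;
		nanosleep(&ts, NULL);
	}
	return EAGAIN; /* still failing after all retries */
}

/* Trivial thread body used only to exercise the helper. */
static void *
worker(void *arg)
{
	return arg;
}
```

A caller would use try_pthread_create(&t, NULL, worker, arg) in place
of pthread_create and treat a nonzero result as fatal, as before. The
bounded loop means a genuine resource limit (for example, hitting
RLIMIT_NPROC) still surfaces as an error instead of spinning forever.
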
Ian