At Wed, 16 Dec 2015 12:15:50 -0800 (PST), Kyle Hale wrote:
> I can give more context if necessary, but briefly, the issue I'm having is
> a strange one with memory corruption in the GC's memory management code,
> and I suspect the cause might be coming from an interplay with signals.
> When racket is running in the specialized OS kernel, we forward any page
> fault that occurs over to a user-space thread on a Linux core, which
> recreates the memory reference, forcing Linux to handle it for us.
My guess is that you're seeing an issue with thread-local variables, as
opposed to the stack. Does disabling places and futures (with
`configure --disable-places --disable-futures`) change anything?
> The issue with this is that if racket is *expecting* references to invalid
> pages, e.g. for some kind of lazy allocation, the signal handler (for
> SIGSEGV) is going to run in a different thread, on a different core, and
> with a different stack. [...]
That should be fine.
> So my question is this: is racket expecting to catch SIGSEGV to fix up
> memory regions or something similar, or is it going to be doing any clever
> magic with something like setjmp() and longjmp()?
No. In fact, we use sigaltstack() on Linux, and the write barrier works
at the level of Mach messages on Mac OS X, so it's common for signals
to be handled on a different stack than the one for the faulting
thread.
In the case of Mac OS X, the garbage collector must specifically
arrange for the message-handling thread to see the thread-local
variables of the main thread. My guess is that you'll need to do
something similar, but thread-local variables are used only when places
or futures are enabled, so that's why I asked about disabling them.
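To make the thread-local concern concrete, here is a minimal sketch,
assuming pthreads and the GCC/Clang `__thread` extension. A handler
thread that reads a thread-local variable directly gets its own copy;
it sees the faulting thread's value only if that thread's address is
handed over explicitly. The names `gc_state`, `published_state`, and
`compare_tls_views` are hypothetical, and publishing a pointer is just
one possible arrangement, not necessarily what Racket does on Mac OS X.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for per-thread GC state. */
static __thread int gc_state;

/* Address of the "faulting" thread's copy, handed to the handler thread. */
static int *published_state;
static int seen_directly, seen_via_pointer;

static void *handler_thread(void *arg) {
    (void)arg;
    seen_directly = gc_state;             /* handler thread's own copy */
    seen_via_pointer = *published_state;  /* faulting thread's copy */
    return NULL;
}

/* Fills in what a separate handler thread observes each way.
   Returns 0 on success, -1 if the thread could not be run. */
int compare_tls_views(int *direct, int *via_pointer) {
    pthread_t t;

    gc_state = 42;                /* set in the calling thread only */
    published_state = &gc_state;  /* explicit hand-off of its address */

    if (pthread_create(&t, NULL, handler_thread, NULL) != 0) return -1;
    if (pthread_join(t, NULL) != 0) return -1;

    *direct = seen_directly;
    *via_pointer = seen_via_pointer;
    return 0;
}
```

Built with `-pthread`, the direct read in the handler thread yields the
zero-initialized copy, while the read through the published pointer
yields the value set by the calling thread. A fault-forwarding thread
would need some comparable hand-off when places or futures are enabled.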