When a crashing thread requests a dump from crashpad_handler, it includes its own thread ID (as seen in its own PID namespace) and a pointer to its stack. Renderer processes are in a PID namespace, so "thread ID 1" is likely what the thread reported using gettid(). Crashpad knows about PID namespacing, though, and should use the stack pointer to find which thread has that stack, allowing it to convert to a thread ID in crashpad_handler's namespace. It looks like that isn't happening here. The stack address isn't being passed through properly when using a broker, so adding it as a parameter to HandleExceptionWithBroker() might fix this. It is unexpected that Chromium would be using the broker on Linux (unless you've modified it to use StartHandlerAtCrash()?), though it should generally be okay.
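To make the idea of translating a thread ID through its stack pointer concrete, here is a minimal standalone sketch. It is not Crashpad's code: `ThreadStack` and `FindTidForStackAddress` are hypothetical names, and the stack ranges are assumed to have been collected already by the handler from its own view of the target process.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical record: one thread of the crashed process, as seen from the
// handler's PID namespace.
struct ThreadStack {
  int tid;              // thread ID in the handler's namespace
  uint64_t stack_base;  // lowest address of the thread's stack mapping
  uint64_t stack_size;  // size of that mapping in bytes
};

// Returns the handler-namespace tid of the thread whose stack contains
// stack_address, or std::nullopt if no stack range contains it.
std::optional<int> FindTidForStackAddress(
    const std::vector<ThreadStack>& threads, uint64_t stack_address) {
  for (const ThreadStack& t : threads) {
    // Written as a subtraction so that base + size cannot overflow.
    if (stack_address >= t.stack_base &&
        stack_address - t.stack_base < t.stack_size) {
      return t.tid;
    }
  }
  return std::nullopt;
}
```

The point of the sketch is that the thread ID reported by the crashing thread is never trusted directly; only the stack address, which means the same thing in both namespaces, is used for the lookup.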
On Tue, Mar 15, 2022 at 7:36 PM Joshua Peraza <jpe...@google.com> wrote:
> [...] The stack address isn't being passed through properly when using a broker so adding it as a parameter to HandleExceptionWithBroker() might fix this [...]

Thanks for the detailed explanation! I'm currently using CrashpadClient::StartHandler via crashpad::HandlerMain and copying the Linux-specific bits (like calling GetHandlerSocket and passing kCrashpadHandlerPid) from your headless Linux commit. This HandlerMain approach matches what we're currently using with Crashpad on Windows and macOS, but maybe it's not the correct approach for Linux?

As an additional data point, crash reporting from the renderer works if I run with "--no-sandbox". I'll try to implement the PID namespacing change that you mention and see if that helps.
On Thu, Mar 17, 2022 at 11:29 AM Marshall Greenblatt <magree...@gmail.com> wrote:
> On Thu, Mar 17, 2022 at 2:14 PM Marshall Greenblatt <magree...@gmail.com> wrote:
>> [...] I'll try to implement the PID namespacing change that you mention and see if that helps.
>
> Looks like it takes the kDirectPtrace code path. The code is calling FindThreadWithStackAddress and |local_requesting_thread_id| is -1 both with sandbox enabled and disabled.
That sounds like either Crashpad isn't finding this thread in /proc/<pid>/task, or it isn't finding the right bounds for the stack. In FindThreadWithStackAddress(), is the count of threads correct for your process? Are any of the stacks close to the stack_address being searched for?
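For checking the thread count independently of Crashpad, the sketch below lists the numeric entries of a task directory using only the C++17 standard library. `ListThreadIds` is a hypothetical helper, not a Crashpad function. It assumes that the directory passed in is something like /proc/<pid>/task, and that /proc is mounted for the PID namespace of the process doing the reading.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// Returns the numeric entry names found in task_dir (for example
// "/proc/self/task"). Returns an empty vector if the directory cannot be read.
std::vector<int> ListThreadIds(const std::filesystem::path& task_dir) {
  std::vector<int> tids;
  std::error_code ec;
  // Use the error_code overloads so an unreadable directory does not throw.
  for (std::filesystem::directory_iterator it(task_dir, ec), end;
       !ec && it != end; it.increment(ec)) {
    const std::string name = it->path().filename().string();
    const bool all_digits =
        !name.empty() &&
        std::all_of(name.begin(), name.end(),
                    [](unsigned char c) { return std::isdigit(c) != 0; });
    if (all_digits) {
      tids.push_back(std::stoi(name));
    }
  }
  return tids;
}
```

Calling `ListThreadIds("/proc/self/task")` from a test program and comparing its size with the number of threads you expect is a quick sanity check; the IDs returned are those of the reader's namespace, not the sandboxed process's.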
On Thu, Mar 17, 2022 at 3:01 PM Joshua Peraza <jpe...@google.com> wrote:
> [...] In FindThreadWithStackAddress() is the count of threads correct for your process? Are any of the stacks close to the stack_address being searched for?

Below is the result of adding some printf statements in FindThreadWithStackAddress. The thread count appears to be correct (checked immediately before crashing). I ran it multiple times both with and without sandbox, and the results are pretty consistent each time (closest always off by about 63MB).
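The following is not the instrumentation used for the results mentioned above. It is only a sketch of how a "closest stack" figure like the 63MB one could be computed. `StackRange`, `Closest`, `DistanceToRange`, and `FindClosestStack` are hypothetical names, and the stack ranges are assumed to be known already.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical description of one thread's stack mapping.
struct StackRange {
  int tid;
  uint64_t base;  // lowest address of the mapping
  uint64_t size;  // size in bytes
};

// Result of the search: the nearest thread and how far away its stack is.
struct Closest {
  int tid;            // -1 if there were no ranges at all
  uint64_t distance;  // 0 means the address lies inside the range
};

// Distance in bytes from address to the nearest byte of the range.
uint64_t DistanceToRange(const StackRange& r, uint64_t address) {
  if (address < r.base) return r.base - address;
  const uint64_t end = r.base + r.size;  // one past the last byte
  if (address >= end) return address - end + 1;
  return 0;
}

// Scans all ranges and returns the one nearest to address.
Closest FindClosestStack(const std::vector<StackRange>& ranges,
                         uint64_t address) {
  Closest best{-1, std::numeric_limits<uint64_t>::max()};
  for (const StackRange& r : ranges) {
    const uint64_t d = DistanceToRange(r, address);
    if (d < best.distance) best = {r.tid, d};
  }
  return best;
}
```

A consistently large nonzero distance suggests that the address being searched for is not on any of the thread stacks the handler knows about, rather than being a small bounds error.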