Attention is currently required from: Joshua Peraza.
Patch set 17:Commit-Queue +1
1 comment:
Patchset:
Joshua - ptal - this will be used in chrome (wip CL https://chromium-review.googlesource.com/c/chromium/src/+/3688951)
Most of this CL is a fake helper (crashpad_wer.dll) for tests within crashpad.
The meat of the CL is in crashpad_client_win.cc and crashpad_wer.cc.
There is some deps fiddling to avoid pulling //base etc. into the final dll in chrome.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Alex Gough.
2 comments:
File client/win_wer_handler/BUILD.gn:
This wer module seems more appropriately placed in handler/win. client/ is for code loaded in the client process. ios_handler/ is here because it *is* loaded in the client process.
File handler/win/fastfail_test_program.cc:
HRESULT res = WerRegisterRuntimeExceptionModule(mod_path.value().c_str(),
client->GetWerContext());
if (res != S_OK) {
LOG(ERROR) << "WerRegisterRuntimeExceptionModule";
return EXIT_FAILURE;
}
Why not make this call in a method on CrashpadClient instead of having GetWerContext() return an opaque pointer only meant to be used with this call?
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Alex Gough.
4 comments:
File client/win_wer_handler/crashpad_wer.cc:
Patch Set #17, Line 70: reinterpret_cast
crashpad::VMAddress is 64-bit. If this code runs as 32-bit, it should use FromPointerCast instead to properly handle sign extension (or just cast to uintptr_t first if you want to avoid the dep): https://source.chromium.org/chromium/chromium/src/+/main:third_party/crashpad/crashpad/util/misc/from_pointer_cast.h;l=32?q=frompointercast
Patch Set #17, Line 92: reinterpret_cast<PVOID>(target_context)
pContext
Patch Set #17, Line 107: reinterpret_cast
Is this necessary, casting from void*?
Patch Set #17, Line 142: const
constexpr
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Joshua Peraza.
7 comments:
Patchset:
thanks - ptal
File client/win_wer_handler/BUILD.gn:
This wer module seems more appropriately placed in handler/win. […]
right that makes sense - I've moved the bulk of the code to ./handler/
File client/win_wer_handler/crashpad_wer.cc:
Patch Set #17, Line 70: reinterpret_cast
crashpad::VMAddress is 64-bit. […]
used a cast via uintptr_t - frompointercast brings in base and I'm trying to keep the resulting DLL small - also this is now only used for calculating another VMAddress field later (in the exception pointers structure) so the var is gone.
Patch Set #17, Line 92: reinterpret_cast<PVOID>(target_context)
pContext
Done
Patch Set #17, Line 107: reinterpret_cast
Is this necessary, casting from void*?
No, removed the cast.
Patch Set #17, Line 142: const
constexpr
Done
File handler/win/fastfail_test_program.cc:
HRESULT res = WerRegisterRuntimeExceptionModule(mod_path.value().c_str(),
client->GetWerContext());
if (res != S_OK) {
LOG(ERROR) << "WerRegisterRuntimeExceptionModule";
return EXIT_FAILURE;
}
Why not make this call in a method on CrashpadClient instead of having GetWerContext() return an opa […]
good idea.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Alex Gough.
12 comments:
File client/BUILD.gn:
Patch Set #21, Line 89: deps += [ "../handler/win/wer:crashpad_wer_handler" ]
This should be a data_dep in the client_test target down below.
File client/crashpad_client.h:
Patch Set #21, Line 672: context
module?
//! This method should only be called after a successful call to
//! SetHandlerIPCPipe() or StartHandler(). The context is valid for the
//! lifetime of this object. The context is opaque its contents should not be
//! used outside of crashpad.
These lines don't look right anymore.
Patch Set #21, Line 684: bool RegisterWerModule(const std::wstring& full_path);
static
File client/crashpad_client_win.cc:
Patch Set #21, Line 764: g_wer_registration
If instead of copying these globals, you removed the standalone copies and updated other references to point to the ones in g_wer_registration, there'd be no requirement to call this after StartHandler() or SetHandlerIPCPipe().
At the least, it seems convenient to worry less about properly ordered calls. If we can call RegisterRefModule() significantly sooner than StartHandler() or SetHandlerIPCPipe(), maybe that could also give us useful information about early startup crashes?
Patch Set #21, Line 769: g_non_crash_exception_information
DumpWithoutCrashing() has g_non_crash_dump_lock to prevent multiple simultaneous calls to DumpWithoutCrashing() from interfering with each other. Does WerFault.exe provide any guarantees about the state of the crashing client when our WER helper gets to it?
File handler/win/wer/crashpad_wer.h:
`handled_exceptions` is a list of
//! exception codes that the helper should pass on to crashpad handler (if
//! possible). `pContext` `pExceptionInformation` are the context and exception
//! information provided by WER to the helper DLL. Provide an empty list to
//! pass every exception on to the crashpad handler.
parameter documentation should go on \param lines.
File handler/win/wer/crashpad_wer.cc:
Patch Set #21, Line 37: CloseHandle
Should we CHECK() this succeeded?
// Only deal with exceptions that crashpad would not have handled.
bool handled_exception = false;
if (handled_exceptions.empty()) {
handled_exception = true;
} else {
for (auto status : handled_exceptions) {
if (e_info->exceptionRecord.ExceptionCode == status) {
handled_exception = true;
break;
}
}
}
if (!handled_exception)
return false;
How about:
if (std::find(handled_exceptions.begin(),
handled_exceptions.end(),
e_info->exceptionRecord.ExceptionCode) != handled_exceptions.end()) {
return false;
}
Patch Set #21, Line 118: EXCEPTION_RECORD
sizeof(e_info->exceptionRecord)
better to say sizeof() the thing you are copying rather than the type you expect the thing to be.
Patch Set #21, Line 126: sizeof(CONTEXT),
here too
File handler/win/wer/crashpad_wer_internal.h:
Patch Set #21, Line 25: struct WerRegistration {
I think this would fit in util/win/registration_protocol.h which similarly describes client<->handler communication and moving it to util/ avoids the client depending on this library.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Joshua Peraza.
4 comments:
Patchset:
thanks - some questions before I get to the other stuff
File client/crashpad_client_win.cc:
Patch Set #21, Line 764: g_wer_registration
If instead of copying these globals, you removed the standalone copies and updated other references […]
yep I see. I don't think we can register before the handler is started and handles are exchanged, as otherwise the handles won't be valid to signal the handler to collect the crash.
Patch Set #21, Line 769: g_non_crash_exception_information
DumpWithoutCrashing() has g_non_crash_dump_lock to prevent multiple simultaneous calls to DumpWithou […]
The crashed process will be suspended and won't have a chance to start again. It's possible it could crash while calling dumpwithoutcrashing - we could add a bool or something to this registration structure that is true if the target has the dumpwithoutcrashing lock and skip taking a dump if so.
File handler/win/wer/crashpad_wer.cc:
Patch Set #21, Line 37: CloseHandle
Should we CHECK() this succeeded?
that pulls in a lot of deps, and we won't see the werfault crashes anyway.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
File client/crashpad_client_win.cc:
Patch Set #21, Line 764: g_wer_registration
yep I see. […]
The WER handler still needs to gracefully handle this case because the client could always corrupt itself.
Patch Set #21, Line 769: g_non_crash_exception_information
The crashed process will be suspended and won't have a chance to start again. […]
The regular crash dump event is already only expected to be used once. If we were using the regular crash dump event, I think this is the bool we'd want to get at:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/crashpad/crashpad/client/crashpad_client_win.cc;drc=0a949b5a72ddeec8d6292947c5c93aa4551cfb38;l=143
If it worked, it could serve the dual purpose of helping us avoid duplicate crash dumps, rather than relying on screening exit codes.
File handler/win/wer/crashpad_wer.cc:
Patch Set #21, Line 37: CloseHandle
that pulls in a lot of deps, and we won't see the werfault crashes anyway.
I was more worried about leaking handles. My understanding is that chromium normally CHECKs handle and file descriptor close()s to prevent those handles accidentally leaking into unintended contexts (e.g. new processes). It could be an assert() instead.
Returning `true` causes the exception to not continue to be reported to WER? Do we need to worry about any consumers of or the consistency of the WER data, particularly while this feature is new?
Related to questions about synchronization with DumpWithoutCrash():
If this used the crash dump event handle instead of the non-crash dump event handle, crashpad_handler would terminate the client with the right exit code, right?:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/crashpad/crashpad/util/win/exception_handler_server.cc;drc=0a949b5a72ddeec8d6292947c5c93aa4551cfb38;l=549
So then this is still using the non-crash dump path in order to get the dump_done message back and know whether to return `true`?
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Joshua Peraza.
4 comments:
File client/crashpad_client_win.cc:
Patch Set #21, Line 764: g_wer_registration
The WER handler still needs to gracefully handle this case because the client could always corrupt i […]
I think it does - if the handles fail to duplicate it gives up, if it cannot signal the handles it will give up. In the worst case a corrupt client can arrange to send over a valid different handle to be signalled but the client could already have signalled those handles itself.
Utlimately the wer handler (in werfault.exe's process) has to signal a crashpad handler process, and the only way to know what to signal has to come from the crashed process, so we can't register the wer module until the handler process is registered with the client process.
(I do intend to tidy this up and add a bool but should resolve if using the two-event dumpwithoutcrashing approach is the right one first)
Patch Set #21, Line 769: g_non_crash_exception_information
The regular crash dump event is already only expected to be used once. […]
One aspect about using the dumpwithoutcrashing route is that the crashed process likely won't be in the UnhandledExceptionHandler - it almost feels cleaner to reserve the regular case for that case.
I also like that we can pass the crash to wer to report through its channels if say the handler itself has crashed - we won't be able to do that if we signal the single event that the regular crash uses as we cannot tell if it worked or not.
File handler/win/wer/crashpad_wer.cc:
Patch Set #21, Line 37: CloseHandle
I was more worried about leaking handles. […]
Sure. This code only runs in the short-lived WerFault.exe process so leaks aren't a big deal. Ultimately I'm not sure there's much we can do if we fail the close these handles.
I can see if assert() makes the dll much larger.
Returning `true` causes the exception to not continue to be reported to WER? Do we need to worry abo […]
Yep, we use the second event to know what to tell wer. Mainly it covers the case of the crashpad handler process being gone in which case we let WER's fallback machinery take the crash instead.
Ultimately we don't look at the WER Health Dashboard very often and it's probably better for people's bandwidth if only one of chrome and windows sends the crash somewhere.
We certainly don't have any metrics or similar things that will be messed up by us taking the crash rather than the wer->windows health dashboard machinery getting it.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Alex Gough.
2 comments:
File client/crashpad_client_win.cc:
Patch Set #21, Line 769: g_non_crash_exception_information
One aspect about using the dumpwithoutcrashing route is that the crashed process likely won't be in […]
With either event, we need to deal with the possibility the crash has occurred in that code path, mid-dump.
If we use the non-crash dump event, that allows DumpWithoutCrash() to mask real crashes, potentially a crash in DumpWithoutCrash(). This seems a worse compromise than skipping dumps if the crash dump event has already been set because it alters our perceived severity of the crash. Reports from DumpWithoutCrash() might be deprioritized because they are non-fatal when they are really masking some real source of instability.
File handler/win/wer/crashpad_wer.cc:
Yep, we use the second event to know what to tell wer. […]
The in-process handler doesn't make the same decision about passing apparently unsuccessful crash dumps on to WER (maybe it should?): https://source.chromium.org/chromium/chromium/src/+/main:third_party/crashpad/crashpad/client/crashpad_client_win.cc;drc=0a949b5a72ddeec8d6292947c5c93aa4551cfb38;l=181
The in-process handler terminates itself after a timeout if crashpad_handler doesn't terminate it and I'd expect that to not propagate to WER.
I think a symmetrical design for this out-of-process trigger would be to pass the exception on to WER if the (non)crash-dump handles grabbed from the client are invalid.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Joshua Peraza.
14 comments:
Patchset:
ptal - sorry it's been a while I was OOO for various reasons.
File client/BUILD.gn:
Patch Set #21, Line 89: deps += [ "../handler/win/wer:crashpad_wer_handler" ]
This should be a data_dep in the client_test target down below.
Done
File client/crashpad_client.h:
Patch Set #21, Line 672: context
module?
went with DLL
//! This method should only be called after a successful call to
//! SetHandlerIPCPipe() or StartHandler(). The context is valid for the
//! lifetime of this object. The context is opaque its contents should not be
//! used outside of crashpad.
These lines don't look right anymore.
good spot - tidied this up and added a note of caution now that this function does the registration.
Patch Set #21, Line 684: bool RegisterWerModule(const std::wstring& full_path);
static
I think this is more analogous to WaitForHandler start so should not be static - it relies on the state of the singleton to some extent.
File client/crashpad_client_win.cc:
Patch Set #21, Line 764: g_wer_registration
I think it does - if the handles fail to duplicate it gives up, if it cannot signal the handles it w […]
I've added a bool which should allow us to know if the client is using the dumpwithoutcrashing structures - if so we wait on the finished event to see if we can have a go ourselves - if the event doesn't signal we give up as likely the handler process is muddled.
Patch Set #21, Line 769: g_non_crash_exception_information
With either event, we need to deal with the possibility the crash has occurred in that code path, mi […]
I think that the dumps that are collected are the same (https://source.chromium.org/chromium/chromium/src/+/main:third_party/crashpad/crashpad/util/win/exception_handler_server.cc;drc=0a949b5a72ddeec8d6292947c5c93aa4551cfb38;l=559 & https://source.chromium.org/chromium/chromium/src/+/main:third_party/crashpad/crashpad/util/win/exception_handler_server.cc;drc=20aad063bc0434f1d89ccfb023efe427afd3a96c;l=544 both call the same dumping function) - the current DumpWithoutCrashing dumps are later classified in crash/ because the target process has DumpWithoutCrashing on its stack - these dumps won't have that so there shouldn't be any confusion about these being real crash dumps.
I've added a bool which is set to indicate that the target process is using the dumpwithoutcrashing structures - if that is set we give the crashpad handler a short period to finish the first dump before triggering another, or giving up if the in-progress crash never finishes.
File handler/win/wer/crashpad_wer.h:
`handled_exceptions` is a list of
//! exception codes that the helper should pass on to crashpad handler (if
//! possible). `pContext` `pExceptionInformation` are the context and exception
//! information provided by WER to the helper DLL. Provide an empty list to
//! pass every exception on to the crashpad handler.
parameter documentation should go on \param lines.
Done
File handler/win/wer/crashpad_wer.cc:
Patch Set #21, Line 37: CloseHandle
Sure. This code only runs in the short-lived WerFault.exe process so leaks aren't a big deal. […]
I think leaking this is fine - we shouldn't get here with an unclosable handle unless something very weird is happening.
// Only deal with exceptions that crashpad would not have handled.
bool handled_exception = false;
if (handled_exceptions.empty()) {
handled_exception = true;
} else {
for (auto status : handled_exceptions) {
if (e_info->exceptionRecord.ExceptionCode == status) {
handled_exception = true;
break;
}
}
}
if (!handled_exception)
return false;
How about: […]
yep that's a bit shorter.
Patch Set #21, Line 118: EXCEPTION_RECORD
sizeof(e_info->exceptionRecord) […]
Done
Patch Set #21, Line 126: sizeof(CONTEXT),
here too
Done
The in-process handler doesn't make the same decision about passing apparently unsuccessful crash du […]
yep - I've added a validation for the handles.
File handler/win/wer/crashpad_wer_internal.h:
Patch Set #21, Line 25: struct WerRegistration {
I think this would fit in util/win/registration_protocol. […]
yep moved it there.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Alex Gough.
3 comments:
File client/crashpad_client_win.cc:
Patch Set #21, Line 769: g_non_crash_exception_information
I think that the dumps that are collected are the same (https://source.chromium. […]
I was talking about the failure mode where we don't receive these WER triggered dumps because we crashed while in DumpWithoutCrash(). In this case, the dumps we receive (if any) will be from the call to DumpWithoutCrash() which will have DumpWithoutCrash() on the stack as well as an artificial exception code. The latest patch-set mitigates this but is still racy. A fast-fail crash that occurs in DumpWithoutCrash() might or might not be reported, depending on how responsive crashpad_handler is in creating a dump.
We would face the same race condition if the crash dump event is used. The WER triggered dump might be skipped if the in-process handler crashed while servicing a crash, but in this case, the first dump (assuming it succeeded) would be for a real crash instead of a DumpWithoutCrash().
File util/win/registration_protocol_win.h:
Patch Set #23, Line 43: WerRegistration
Could use a version field to detect mismatched DLL installs.
Patch Set #23, Line 55: void*
This (and exception/context structs below?) will have varying size based on bitness/architecture. Crashpad code is generally capable of tracing cross-bitness so that we might have a single crashpad_handler handling crashes for both 32 and 64-bit code. Is that a possible configuration for the WER handler?
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Joshua Peraza.
2 comments:
File util/win/registration_protocol_win.h:
Patch Set #23, Line 43: WerRegistration
Could use a version field to detect mismatched DLL installs.
Good idea - added a field and validate it in the helper DLL.
Patch Set #23, Line 55: void*
This (and exception/context structs below?) will have varying size based on bitness/architecture. […]
WerFault runs with the same bitness as the crashed process so shouldn't be an issue here. I could use a WinVMAddress but it will never be cross-bitness so stuck with void*. I'll expand the comment.
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Alex Gough.
Patch set 26:Code-Review +1
Patch set 26:Commit-Queue +2
1 comment:
Patchset:
Thanks Joshua!
To view, visit change 3677284. To unsubscribe, or for help writing mail filters, visit settings.
Crashpad LUCI CQ submitted this change.
Add WER runtime exception helper module for Windows
This adds a runtime exception helper (& test module) for Windows and
plumbing to allow the module to be registered by the crashpad client,
and to trigger the crashpad handler. Embedders can build their own
module to control which exceptions are passed to the handler.
See: go/chrome-windows-runtime-exception-helper for motivation.
When registered (which is the responsibility of the embedding
application), the helper is loaded by WerFault.exe when Windows
Error Reporting receives crashes that are not caught by crashpad's
normal handlers - for instance a control-flow violation when a
module is compiled with /guard:cf.
Registration:
The embedder must arrange for the full path to the helper to
be added in the appropriate Windows Error Reporting\
RuntimeExceptionHelperModules registry key.
Once an embedder's crashpad client is connected to a crashpad
handler (e.g. through SetIpcPipeName()) the embedder calls
RegisterWerModule. Internally, this registration includes handles
used to trigger the crashpad handler, an area reserved to hold an
exception and context, and structures needed by the crashpad handler.
Following a crash:
WerFault.exe handles the crash then validates and loads the helper
module. WER hands the helper module a handle to the crashing target
process and copies of the exception and context for the faulting thread.
The helper then copies out the client's registration data and
duplicates handles to the crashpad handler, then fills back the various structures in the paused client that the crashpad handler will need.
The helper then signals the crashpad handler, which collects a dump then
notifies the helper that it is done.
Support:
WerRegisterExceptionHelperModule has been availble since at least
Windows 7 but WerFault would not pass on the exceptions that crashpad
could not already handle. This changed in Windows 10 20H1 (19041),
which supports HKCU and HKLM registrations, and passes in more types of
crashes. It is harmless to register the module for earlier versions
of Windows as it simply won't be loaded by WerFault.exe.
Tests:
snapshot/win/end_to_end_test.py has been refactored slightly to
group crash generation and output validation in main() by breaking
up RunTests into smaller functions.
As the module works by being loaded in WerFault.exe it is tested
in end_to_end_test.py.
Bug: crashpad:133, 866033, 865632
Change-Id: Id668bd15a510a24c79753e1bb03e9456f41a9780
Reviewed-on: https://chromium-review.googlesource.com/c/crashpad/crashpad/+/3677284
Reviewed-by: Joshua Peraza <jpe...@chromium.org>
Commit-Queue: Alex Gough <aj...@chromium.org>
---
M client/BUILD.gn
M client/crashpad_client.h
M client/crashpad_client_win.cc
M handler/BUILD.gn
A handler/win/fastfail_test_program.cc
A handler/win/wer/BUILD.gn
A handler/win/wer/crashpad_wer.cc
A handler/win/wer/crashpad_wer.def
A handler/win/wer/crashpad_wer.h
A handler/win/wer/crashpad_wer.ver
A handler/win/wer/crashpad_wer_main.cc
A handler/win/wer/crashpad_wer_module_unittest.cc
M snapshot/win/end_to_end_test.py
M util/BUILD.gn
M util/win/registration_protocol_win.h
15 files changed, 917 insertions(+), 48 deletions(-)