Attention is currently required from: Ben Hamilton, Justin Cohen.
2 comments:
File snapshot/ios/module_snapshot_ios_intermediate_dump.cc:
GetDataValueFromMap(image_data,
Key::kSourceVersion,
&source_version_,
LogMissingDataValueFromMap::kDontLogIfMissing);
This is fine: you’re not in control of every module, so you’re not in control of whether this load command will be present in every module.
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
Is this missing for threads that nobody set a name for, or where OS support doesn’t allow thread names to be set? Then this is fine.
Is it missing for threads that we’re expecting to have a name set for (and where OS support allows names to be set)? Then this isn’t fine and we should investigate further.
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Justin Cohen, Mark Mentovai.
1 comment:
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
Is this missing for threads that nobody set a name for, or where OS support doesn’t allow thread nam […]
Neither is quite the case.
This logic is only used on iOS. I confirmed thread names are supported all the way back to iOS 9 aka xnu-3248.50 (might go back earlier, I didn't check any farther than that, since I assume Chromium no longer supports iOS 9):
https://github.com/apple/darwin-xnu/blob/xnu-3248.50.21/osfmk/kern/thread.c#L1540
The logic unconditionally sets this key as long as `thread_info(thread, THREAD_EXTENDED_INFO, …)` succeeds. It will set this key to a zero-length string if the name is not set:
`THREAD_EXTENDED_INFO` contains all sorts of data (not just the thread name), so I would not expect it to fail when nobody set a name for the thread.
What we're seeing is that `thread_info(thread, THREAD_EXTENDED_INFO, …)` is failing many orders of magnitude more often than `thread_info(thread, THREAD_BASIC_INFO, …)`.
Unless Apple has changed `THREAD_EXTENDED_INFO_COUNT` in a recent version of xnu (or Chromium's compiler settings are padding `struct thread_extended_info` differently than the compiler used to build xnu), I would not expect this to fail that often.
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Ben Hamilton, Justin Cohen.
1 comment:
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
Neither is quite the case.
This logic is only used on iOS. I confirmed thread names are supported all the way back to iOS 9 aka xnu-3248.50 (might go back earlier, I didn't check any farther than that, since I assume Chromium no longer supports iOS 9):
https://github.com/apple/darwin-xnu/blob/xnu-3248.50.21/osfmk/kern/thread.c#L1540
The logic unconditionally sets this key as long as `thread_info(thread, THREAD_EXTENDED_INFO, …)` succeeds. It will set this key to a zero-length string if the name is not set:
`THREAD_EXTENDED_INFO` contains all sorts of data (not just the thread name), so I would not expect it to fail when nobody set a name for the thread.
What we're seeing is that `thread_info(thread, THREAD_EXTENDED_INFO, …)` is failing many orders of magnitude more often than `thread_info(thread, THREAD_BASIC_INFO, …)`.
*How* is it failing? The whole point of logging the `kern_return_t` is to help us understand what’s gone wrong. https://chromium.googlesource.com/crashpad/crashpad/+/c7d9c710f2b2d8c9879b9250600e646719e76a9d/client/ios_handler/in_process_intermediate_dump_handler.cc#772
What about `THREAD_IDENTIFIER_INFO`?
Unless Apple has changed `THREAD_EXTENDED_INFO_COUNT` in a recent version of xnu (or Chromium's compiler settings are padding `struct thread_extended_info` differently than the compiler used to build xnu), I would not expect this to fail that often.
- https://github.com/apple/darwin-xnu/blob/8f02f2a044b9bb1ad951987ef5bab20ec9486310/osfmk/mach/thread_info.h#L167
- https://github.com/apple/darwin-xnu/blob/a1babec6b135d1f35b2590a1990af3c5c5393479/osfmk/kern/thread.c#L1961
- https://github.com/apple/darwin-xnu/blob/a1babec6b135d1f35b2590a1990af3c5c5393479/osfmk/kern/thread.c#L1984
Please don’t reference that outdated train wreck. The official and maintained dump from Apple is https://github.com/apple-oss-distributions/xnu.
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Justin Cohen, Mark Mentovai.
1 comment:
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
How is it failing? The whole point of logging the kern_return_t is to help us understand what’s gone wrong.
Good question! I don't think I have access to that data — Justin, can you help? I know you provided the raw counts here:
https://chromium-review.googlesource.com/c/crashpad/crashpad/+/4283479/comment/58b7da9a_19998338/
but I don't know if there's a breakdown by error value.
What about THREAD_IDENTIFIER_INFO?
From Justin's graphs in the above comment thread, it looks like fetching `THREAD_IDENTIFIER_INFO` (for the key `kThreadID`) has failed 67K times since 2023-02-08, and `THREAD_EXTENDED_INFO` (for the key `kThreadName`, aka `6014` in Justin's graphs) has failed 634K times since 2023-02-08.
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Ben Hamilton, Mark Mentovai.
1 comment:
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
> How is it failing? The whole point of logging the kern_return_t is to help us understand what’s go […]
We don't store the kr value in the intermediate dump so we have no uma logs of the failure. We do log it to stderr -- but I don't see any evidence of this failing in our unit tess in google3 or chromium. I'm still digging.
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Ben Hamilton, Justin Cohen.
1 comment:
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataValueFromMap(thread_data, Key::kSuspendCount, &suspend_count_);
GetDataValueFromMap(thread_data, Key::kPriority, &priority_);
GetDataValueFromMap(thread_data, Key::kThreadID, &thread_id_);
GetDataValueFromMap(
thread_data, Key::kThreadDataAddress, &thread_specific_data_address_);
// This key is often missing.
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
What's strange is kThreadName fails 8x more than kThreadId. kThreadDataAddress, kPriority, kSuspendCount all seems to fail at a similar rate, although slightly less (70%) of kThreadID. I'd expect kThread stuff to fail all at the same rate.
see: https://screenshot.googleplex.com/AsjFyNmYgk2wGga
and without outliers: https://screenshot.googleplex.com/3oSSg2zGsyNXUGV
None of this means that the problem is definitely in any of the calls to `thread_info`. Those UMAs just mean that the intermediate dump didn’t have the data, which _could_ indicate a problem with a `thread_info` call, but could also point to some other failure to write that data to the intermediate dump: some other failure within `crashpad::internal::(unnamed)::WritePropertyBytes` → `crashpad::internal::IOSIntermediateDumpWriter::AddPropertyBytes`, for example. It could even have something to do with the buffering scheme.
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Ben Hamilton, Mark Mentovai.
1 comment:
File snapshot/ios/thread_snapshot_ios_intermediate_dump.cc:
GetDataValueFromMap(thread_data, Key::kSuspendCount, &suspend_count_);
GetDataValueFromMap(thread_data, Key::kPriority, &priority_);
GetDataValueFromMap(thread_data, Key::kThreadID, &thread_id_);
GetDataValueFromMap(
thread_data, Key::kThreadDataAddress, &thread_specific_data_address_);
// This key is often missing.
GetDataStringFromMap(thread_data,
Key::kThreadName,
&thread_name_,
LogMissingDataValueFromMap::kDontLogIfMissing);
> What's strange is kThreadName fails 8x more than kThreadId. […]
Here's the number of UMA entries for all the missing keys:
https://paste.googleplex.com/6740451389341696?raw
Around 30 keys have zero entries. Another 30 have just a few hundred. The top keys missing over a thousand are all related to thread data and module data:
kSourceVersion (17M entries) (Module related)
kThreadName (1M entries) (Thread related)
kThreadID (297K entries) (Thread related)
kThreadDataAddress (215K entries) (Thread related)
kPriority (215K entries) (Thread related)
kSuspendCount (214K entries) (Thread related)
kThreadContextMemoryRegions (185K entries) (Thread related)
kThreadState (121K entries) (Thread related)
kModules (74K entries) (Module related)
kFileType (19K entries) (Module related)
kUUID (17K entries) (Module related)
kName (14K entries) (Module related)
kSize (13K entries) (Module related)
kAddress (7K entries) (Module related)
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.
Attention is currently required from: Ben Hamilton, Mark Mentovai.
2 comments:
Patchset:
FYI extract the 'kSourceVersion' portion of this CL to https://chromium-review.googlesource.com/c/crashpad/crashpad/+/4294861.
File snapshot/ios/module_snapshot_ios_intermediate_dump.cc:
GetDataValueFromMap(image_data,
Key::kSourceVersion,
&source_version_,
LogMissingDataValueFromMap::kDontLogIfMissing);
This is fine: you’re not in control of every module, so you’re not in control of whether this load c […]
Done
To view, visit change 4283479. To unsubscribe, or for help writing mail filters, visit settings.