please file a chromium issue and reference it in this CL, describing the defect in full.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
mark might be better suited to review this as it's Linux
const auto path = base::StringPrintf("/proc/%d/root", pid) + mapping.name; statbuf.st_dev == mapping.device && statbuf.st_ino == mapping.inode) {does this check work on all filesystems? e.g. I think on Btrfs and OverlayFS this is not always true. e.g. synthetic inode? would it be better to inspect /proc/[pid]/map_files to make sure it's actually the same file?
// Resolves any symlinked components of |path|. The kernel reports canonicalnit: chromium uses backtick (\`) instead of `|` for quoting arguments to functions now.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
please file a chromium issue and reference it in this CL, describing the defect in full.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks for the review. Addressed in the latest patchset:
const auto path = base::StringPrintf("/proc/%d/root", pid) + mapping.name;are you sure all `mapping.name` will start with a slash?
With the new patchset and the map_files approach, it no longer reconstructs a path from the name.
statbuf.st_dev == mapping.device && statbuf.st_ino == mapping.inode) {does this check work on all filesystems? e.g. I think on Btrfs and OverlayFS this is not always true. e.g. synthetic inode? would it be better to inspect /proc/[pid]/map_files to make sure it's actually the same file?
You're right!
As proposed, switched to /proc/[pid]/map_files
// Resolves any symlinked components of |path|. The kernel reports canonicalnit: chromium uses backtick (\`) instead of `|` for quoting arguments to functions now.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::StringPrintf("/proc/%d/map_files/%" PRIx64 "-%" PRIx64,might not work before kernel 4.4 but I am pretty sure we don't support kernels that far back?
stat(entry.c_str(), &statbuf) == 0 && statbuf.st_nlink > 0) {`stat` can fail, and I think if it does then the value returned is wrong, it should return the original name?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
static constexpr std::string_view kDeletedSuffix = " (deleted)";Why a `string_view` and not a `char[]`?
const std::string entry =Don’t use a `std::string` for this. You can size a `char[]` properly and use `snprintf`.
if (struct stat statbuf{};We don’t usually do this.
1. Move the declaration outside of the `if`.
2. No need to initialize `statbuf`.
3. Use underscores to separate words: `stat_buf`.
stat(entry.c_str(), &statbuf) == 0 && statbuf.st_nlink > 0) {Explain the `statbuf.st_nlink` check.
//! \param module_soname The SONAME for the module. If empty, the module is
//! built without a DT_SONAME.This change (and the one in the .cc file) seems unrelated. Can you break it into its own commit with its own description?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
static constexpr std::string_view kDeletedSuffix = " (deleted)";Why a `string_view` and not a `char[]`?
Done; switched to a stack char entry[64] with snprintf.
On sizing it properly: the longest string the format can produce is 62 bytes:
Total 62, so char[64] is provably sufficient
Don’t use a `std::string` for this. You can size a `char[]` properly and use `snprintf`.
Done
base::StringPrintf("/proc/%d/map_files/%" PRIx64 "-%" PRIx64,might not work before kernel 4.4 but I am pretty sure we don't support kernels that far back?
FYI: `/proc/[pid]/map_files/` has existed since Linux 3.3 (2012).
The one real caveat is that map_files/ is gated on the kernel config CONFIG_CHECKPOINT_RESTORE; without it the directory is absent regardless of version. That's now handled by the stat-failure path (see the change for your other comment): if stat fails for any reason, we keep the name exactly as the kernel reported it rather than stripping.
We don’t usually do this.
1. Move the declaration outside of the `if`.
2. No need to initialize `statbuf`.
3. Use underscores to separate words: `stat_buf`.
Done
stat(entry.c_str(), &statbuf) == 0 && statbuf.st_nlink > 0) {`stat` can fail, and I think if it does then the value returned is wrong, it should return the original name?
Done
stat(entry.c_str(), &statbuf) == 0 && statbuf.st_nlink > 0) {aj coteExplain the `statbuf.st_nlink` check.
I sharpened the existing comment and moved it directly above the check.
Detail:
`/proc/[pid]/map_files/<start>-<end>` is a symlink to the inode backing the mapping. stat() follows it, so `stat_buf.st_nlink` is the inode's hard-link count and independent of path resolution or filesystem quirks.
//! \param module_soname The SONAME for the module. If empty, the module is
//! built without a DT_SONAME.This change (and the one in the .cc file) seems unrelated. Can you break it into its own commit with its own description?
DetermineMappingName (the suffix-stripping this CL adds) is only reached on the no-SONAME branch; a module with a SONAME is named by the SONAME and never sees the suffix. So a module built without a DT_SONAME is the only way to exercise the fix, which is exactly what the test_modules change enables. I'd prefer to keep it with the fix it supports (or having it cancelled along it).
tl;dr: I'm validating the lib path too, not only the executable one.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
//! \param module_soname The SONAME for the module. If empty, the module is
//! built without a DT_SONAME.aj coteThis change (and the one in the .cc file) seems unrelated. Can you break it into its own commit with its own description?
DetermineMappingName (the suffix-stripping this CL adds) is only reached on the no-SONAME branch; a module with a SONAME is named by the SONAME and never sees the suffix. So a module built without a DT_SONAME is the only way to exercise the fix, which is exactly what the test_modules change enables. I'd prefer to keep it with the fix it supports (or having it cancelled along it).
tl;dr: I'm validating the lib path too, not only the executable one.
DetermineMappingName (the suffix-stripping this CL adds) is only reached on the no-SONAME branch; a module with a SONAME is named by the SONAME and never sees the suffix. So a module built without a DT_SONAME is the only way to exercise the fix, which is exactly what the test_modules change enables. I'd prefer to keep it with the fix it supports (or having it cancelled along it).
tl;dr: I'm validating the lib path too, not only the executable one.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
hi, please address mark's comments then come back for further review.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
hi, please address mark's comments then come back for further review.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
std::string DetermineMappingName(pid_t pid,
const MemoryMap::Mapping& mapping) {It actually seems like this would be a better fit as a method on `MemoryMap::Mapping`. In particular, if the `Mapping` is created with knowledge of the pid, it becomes a really straightforward interface to use.
That, in turn, makes it easy to answer the next question: is this deleted-name handling something that `MemoryMap::FindMappingWithName` should do too? (Maybe, maybe not—it only has one very specific Android-only caller.)
const MemoryMap::Mapping& mapping) {There are only two callers, and both of them are holding a `MemoryMap::Mapping const *`. Rather than forcing them to do a goofy `*dereference`, this should just accept the pointer.
char entry[64];More of a path than an entry, really.
snprintf(entry, sizeof(entry), "/proc/%d/map_files/%" PRIx64 "-%" PRIx64,
pid, mapping.range.Base(), mapping.range.End());Do we have reasonable assurance based on how the caller prepared its `MemoryMap::Mapping` that the start and end addresses will match a node in /proc/${pid}/map_files?
// map_files symlink to the mapped inode). Zero means the file was unlinked, so
// the suffix is the kernel's annotation and is stripped below; a nonzero count
// means the file still exists and is genuinely named with the suffix, so theRemain within 80 characters.
return mapping.name.substr(0,
mapping.name.size() - (sizeof(kDeletedSuffix) - 1));Here too. Run `git cl format`.
mapping.name.size() - (sizeof(kDeletedSuffix) - 1));Using `sizeof(x) - 1` to size a string literal is somewhat common, but more error-prone. For example, here, it’s dependent on getting the parentheses placed correctly, and given the other parentheses nearby, isn’t the easiest thing to see or verify immediately.
But the `sizeof` thing is actually unnecessary. If you just write `strlen`, the compiler will intelligently recognize that you’re applying it to a literal, and substitute the correct string length without any runtime calls into libc. You get the same exact object code with `strlen(x)` as with `sizeof(x) - 1`.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
std::string DetermineMappingName(pid_t pid,
const MemoryMap::Mapping& mapping) {It actually seems like this would be a better fit as a method on `MemoryMap::Mapping`. In particular, if the `Mapping` is created with knowledge of the pid, it becomes a really straightforward interface to use.
That, in turn, makes it easy to answer the next question: is this deleted-name handling something that `MemoryMap::FindMappingWithName` should do too? (Maybe, maybe not—it only has one very specific Android-only caller.)
Having worked through this, I think the honest position is that there are a few defensible shapes here, and the choice between them is an ownership decision I would rather you make than have me impose. The behavior and the tests come out identical in every case; the only thing that varies is where the name resolution lives and how the pid reaches it. For the two "method" forms it reduces to a single question: whether a `/proc/map_files` lookup belongs on the `Mapping` value or on the `MemoryMap` that owns the procfs connection. I do not have a confident basis to settle that, and since you own the long-term shape of this code, I would rather lay out the options and let you pick.
The three viable shapes:
(I also considered storing the pid on every `Mapping`, and a back-reference from `Mapping` to its owner, and dropped both: the first duplicates process-scoped state on every region, the second inverts the composition and adds lifetime coupling.)
My own preference is the free function, for two reasons. First, the two inputs are genuinely orthogonal: the pid is process-scoped (one per map) and the range is region-scoped (one per mapping), so passing both explicitly mirrors where the data actually lives instead of making one object carry the other. Second, I cannot currently justify whether this operation belongs to `MemoryMap` or to `Mapping`, and the free function avoids committing to an ownership I am unsure of. If you would rather it be a method, the choice between the two method forms is exactly that ownership question, and I am happy to follow your call.
There are only two callers, and both of them are holding a `MemoryMap::Mapping const *`. Rather than forcing them to do a goofy `*dereference`, this should just accept the pointer.
I prefer the reference, as done with FindFilePossibleMmapStarts(const Mapping&) and ReverseIteratorFrom(const Mapping&), but it is your call.
More of a path than an entry, really.
Done
snprintf(entry, sizeof(entry), "/proc/%d/map_files/%" PRIx64 "-%" PRIx64,
pid, mapping.range.Base(), mapping.range.End());Do we have reasonable assurance based on how the caller prepared its `MemoryMap::Mapping` that the start and end addresses will match a node in /proc/${pid}/map_files?
Yes. A Mapping is a verbatim `/proc/[pid]/maps` line: `ParseMapsLine` sets range to the line's own start/end, and both callers get a const Mapping* out of mappings_, never a synthesized one. The kernel keys each map_files/<start>-<end> node by that same pair, so they match by construction. We only look it up when the name ends in " (deleted)", which is set only on file-backed mappings,and those always have a map_files entry.
Any miss (e.g. no CONFIG_CHECKPOINT_RESTORE, or stat fails) just keeps the kernel's name, so never a wrong one.
// map_files symlink to the mapped inode). Zero means the file was unlinked, so
// the suffix is the kernel's annotation and is stripped below; a nonzero count
// means the file still exists and is genuinely named with the suffix, so theaj coteRemain within 80 characters.
Done
return mapping.name.substr(0,
mapping.name.size() - (sizeof(kDeletedSuffix) - 1));Here too. Run `git cl format`.
Done
Using `sizeof(x) - 1` to size a string literal is somewhat common, but more error-prone. For example, here, it’s dependent on getting the parentheses placed correctly, and given the other parentheses nearby, isn’t the easiest thing to see or verify immediately.
But the `sizeof` thing is actually unnecessary. If you just write `strlen`, the compiler will intelligently recognize that you’re applying it to a literal, and substitute the correct string length without any runtime calls into libc. You get the same exact object code with `strlen(x)` as with `sizeof(x) - 1`.
Done. I was not aware compilers fold strlen on a constant string into a compile-time constant.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
std::string DetermineMappingName(pid_t pid,
const MemoryMap::Mapping& mapping) {aj coteIt actually seems like this would be a better fit as a method on `MemoryMap::Mapping`. In particular, if the `Mapping` is created with knowledge of the pid, it becomes a really straightforward interface to use.
That, in turn, makes it easy to answer the next question: is this deleted-name handling something that `MemoryMap::FindMappingWithName` should do too? (Maybe, maybe not—it only has one very specific Android-only caller.)
Having worked through this, I think the honest position is that there are a few defensible shapes here, and the choice between them is an ownership decision I would rather you make than have me impose. The behavior and the tests come out identical in every case; the only thing that varies is where the name resolution lives and how the pid reaches it. For the two "method" forms it reduces to a single question: whether a `/proc/map_files` lookup belongs on the `Mapping` value or on the `MemoryMap` that owns the procfs connection. I do not have a confident basis to settle that, and since you own the long-term shape of this code, I would rather lay out the options and let you pick.
The three viable shapes:
- The free function `DetermineMappingName(pid, mapping)`.
- A method on the region: `Mapping::GetName(pid)`, with the pid passed in.
- A method on the owner: `MemoryMap::GetMappingName(mapping)`, with the pid read from the connection.
(I also considered storing the pid on every `Mapping`, and a back-reference from `Mapping` to its owner, and dropped both: the first duplicates process-scoped state on every region, the second inverts the composition and adds lifetime coupling.)
My own preference is the free function, for two reasons. First, the two inputs are genuinely orthogonal: the pid is process-scoped (one per map) and the range is region-scoped (one per mapping), so passing both explicitly mirrors where the data actually lives instead of making one object carry the other. Second, I cannot currently justify whether this operation belongs to `MemoryMap` or to `Mapping`, and the free function avoids committing to an ownership I am unsure of. If you would rather it be a method, the choice between the two method forms is exactly that ownership question, and I am happy to follow your call.
Summary of where this can live, for you to pick. The lookup needs two inputs from different scopes: the region's range (on the `Mapping`) and the target pid (one per map, on the connection). All three options below leave `Mapping::name` as the raw kernel string, since other readers rely on it unchanged, and all keep a best-effort, fail-safe direct `stat()` for now.
One cross-cutting finding for the decision: this `stat()` is the only direct `/proc` access in this layer that bypasses `connection_` (every other target read goes through the connection, and the broker for sandboxed targets). Option 3 places the lookup next to `connection_`, so if the sandboxed case ever needs stripping too, switching to a brokered stat would be an internal change with no caller churn. That is a follow-up, not this patch; I note it only because it bears on where the function belongs.
Happy to go with whichever you prefer.
Thanks for the detailed analysis.
I still believe that `Mapping::GetName` _without_ supplying a `pid` argument would be cleanest from an interface perspective, but recognize that realizing that goal without introducing new layering quirks would require a medium-sized reimagining of how `MemoryMap` and `Mapping` relate to one another. We don‘t need to answer that here.
Putting the interfaces to the kernel close together is what originally motivated me to raise the issue, and option 3 is cleanest from that perspective, so let’s give that a try and see how it works out.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
//! running program's binary is replaced on disk. A file may also be genuinelyremoved or replaced
// The kernel exports no constant for the suffix. Refer to /proc/[pid]/maps
// in proc(5) and the PROCMAP_QUERY ioctl in linux/fs.h.Kernel source is more authoritative than anything else, so if you’re giving any references at all, “linux fs/d_path.c d_path” should be among them.
TEST(MemoryMap, MappingNameKeepsRealDeletedSuffix) {Good tests. Thanks for including these.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
//! running program's binary is replaced on disk. A file may also be genuinelyaj coteremoved or replaced
Done
// The kernel exports no constant for the suffix. Refer to /proc/[pid]/maps
// in proc(5) and the PROCMAP_QUERY ioctl in linux/fs.h.Kernel source is more authoritative than anything else, so if you’re giving any references at all, “linux fs/d_path.c d_path” should be among them.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |