Christopher StaiteI'm a little stuck with the DEPS. I've had another go at it, this time only depending on the child_process_id.h which is imported by the mojom, so that should be supported. However, I wonder if I'm working around the checks... Should I be abstracting this in some way in //device or is this ok?
Alexander CooperIf this doesn't work, then perhaps we can wait for https://crrev.com/c/7498491 and use `net::OriginatingProcess` in `//device` and then do the translation back and forth. That should solve the issue I believe.
Alexander CooperTypically I think I had been avoiding any form of content dependency in //device/vr due to circular dependency issues and the like and that is honestly one of the reasons for the presence of the //components/webxr directory.
Christopher StaiteFWIW, content/services appears to be the only dependency in device/vr for anyhting under content at present: https://source.chromium.org/search?q=-f:%5Eout%20-f:%5Egen%20f:device%2Fvr%20content%2F&ss=chromium%2Fchromium%2Fsrc
Moved to using the `network::OriginatingProcess` instead.
Could I please have a CQ+1 to check my porting of the Android code to `network::OriginatingProcess`. I am unable to run this build locally and therefore it's likely to have issues.
*out = content::ChildProcessId(data.process_id());Christopher StaiteWith other similar types we fail the deserialization (so basically return false here) if the data is invalid (here, it would be the case if process_id is nonpositive). For example we do this for UnguessableToken in order to avoid potential bugs from uninitialized UnguessableTokens being passed around.
I think the same reasoning would apply here (would you ever want to pass through mojo an invalid ChildProcessId or would that be a sign of a bug in the sending code?).
Christopher StaiteWell, kBrowserRendererId (0) is technically invalid, but also passed over the interface to indicate the source was the browser process. I'm not sure if there are instances where -1 is passed, I certainly couldn't say that with any authority.
This would suggest that a value of -1 would fail a DCHECK build, so certainly that should be invalid:
https://source.chromium.org/chromium/chromium/src/+/main:services/network/url_loader_factory.cc;l=106;drc=89424e3fec685efa12c7a6f598dbf3e9ebfbcf4d;bpv=0;bpt=1
Antonio SartoriOn the basis of the above, which I believe to be the only source of data, I've added in the check for -1.
Antonio SartoriThat is somehow unfortunate, since at https://source.chromium.org/chromium/chromium/src/+/main:content/public/browser/child_process_id.h;l=17;drc=89424e3fec685efa12c7a6f598dbf3e9ebfbcf4d we say that 0 is also invalid. I haven't looked at the other CL in details but using it as an implicit value for the browser process seems bad. Also, if this is called "ChildProcessId" it is kind of weird that it can also contain the browser process id.
I'm fine approving this CL if you change this to also fail on `0`, and you can find out in the follow-up CL how to best handle the case of the browser process (which might also end up being allowing 0 again and renaming this to something else, but I'd say most likely there is a better alternative, tbd with the mojo reviewer).
Antonio Sartori(actually, all nonpositive values should be invalid)
Christopher Staite(and if you're up to it, a unit test would be great. Something as trivial as mojo/public/cpp/base/unguessable_token_unittest.cc, possibly also checking invalid values.)
Antonio SartoriGreat idea, that showed up that I forgot to export the bindings! I've added a separate boolean to determine if it's the browser process performing the request. Hopefully this is satisfactory.
Christopher StaiteThanks for including the tests, and sorry to be pedantic on this. However, what we need is not really a different way of serializing the data in mojo. We really don't care about how that is done (it's somehow an internal implementation detail, and it would be totally fine to pass the browser process id as zero). What we do want is to avoid callers passing around a `content::ChildProcessId` which is not valid, since the receiver might, at some point in the future, assume that that is not the case. That means that what we want to change is not really the mojo serialization, but rather the exposed //content type (i.e. create some sort of content::BrowserOrChildProcessId). However I am not a //content owner and I did not do any research into how the kInvalidProcessId is used around the codebase to identify the browser process, so I would let this decision to someone that does. Maybe the base issue is that content::ChildProcessId is not properly named.
I still believe that the best path to unblock you here is to initially restrict to fail on kBrowserProcessInvalidId. And figure out how to handle this best in a follow up.
Does this make sense?
Sorry for insisting on iterating here.
Avi DrissmanTotally fair, but this does appear to be core functionality here. Perhaps @a...@chromium.org could provide some context here about how to move forward with this strangeness of a second invalid value to communicate the browser process.
Christopher StaitePlease bring a question of this magnitude to content...@chromium.org. (This isn’t really my area of expertise and I would prefer to not opine.)
Avi DrissmanThank you for the steer!
Will HarrisPlease directly compose an email to the group explaining the situation.
Antonio Sartoridrive-by: if negative values are invalid can the type in mojom be unsigned?
if you want to represent an unknown value you can use an optional?
Avi DrissmanUsing an optional in the follow-up CL so that nullopt indicates the browser process could make sense.
I would not use unsigned in order to avoid additional casts. Note that even if you used unsigned you would have to check that the cast is safe (i.e. that the uint32 can be cast to int32) which is worse I think than just checking the values are positive.
(Also, I personally try to stick to https://google.github.io/styleguide/cppguide.html#Integer_Types as much as possible and just avoid unsigned).
Will HarrisFYI, that was an excellent writeup of the issue for the discussion list; thank you!
Christopher StaiteThe mojo guidance to never transmit invalid values supercedes the google style guide and we use the safe numerics to perform the cast (and crash if invalid) rather than checking if the values are positive. Using a signed value for `process_id` where negative is invalid would not pass security review.
Antonio SartoriSo, I updated to change only the content uses and I will create a new definition in network for that one. There we can have a union for the browser process and renderer process to resolve this. Since there were only two usages I decided to bring them along for the ride. I've done a local build, but I'm not 100% sure I touched all the code, so a CQ+1 would be appreciated.
Will Harriswfh@ I'm not sure I follow. We want to create a mojo type which typemaps to ChildProcessId, which is an `int32_t` (with the additional properties that all negative values are invalid, and that 0 is kind of a weird value that is also invalid but also means the browser process). So the valid values of this field are `2^31 - 1` so there is no mojo type which allows us to serialize all and only valid values.
Now I see two options: (1) either we use `int32` and we fail (de)serialization on non-positive values, as this CL is currently doing, or (2) we use `uint32`, we add casts when serializing and deserializing and we fail (de)serialization if the values are invalid (fail the cast).
Option (2) seems much worse to me than option (1): this CL shows how easy it is to check validity using the builtin functionality of `IdType`, while this would get kind of nasty if we started adding (checked) casts as required from option (2). I fail to see any advantage of doing (2) instead of (1).
Antonio Sartorithere is a mojo type that would allow this, you use unsigned int to represent the real ids then you union or an enum that can represent the magic values. You then use checked cast to cast between the values (except for the magic sentinel values which you can convert directly into the union/enum in the mojo type).
Will HarrisSorry, I think I still don't understand what you are proposing here.
Antonio Sartoriwe want to avoid sending invalid values on the wire, because rule
right now since `int32 process_id` is in the mojo, it's possible to send -2 down the wire. This is invalid, and thus violates the rules.
I'm not sure I can grok from this discussion whether `-1` is a valid value to send? If not, then simply change the type to `uint32` and we're done here. You will need to add a set of `base::checked_cast` in the mojo type traits to verify nobody tries to send a negative value through mojo.
If `-1` **is** a valid value to send (but `-2` is definitely not) then what I am suggesting is that, on the wire, in the mojo you define a type e.g. something like
```
union ChildProcessId {
uint32 process_id;
bool is_browser_process;
};
```then you can marshall `-1` (if needed, see comment above) as the `is_browser_process` `bool` but all other positive values as `process_id`. it will come out the other end as `-1`.
This ensures that mojo type never carries invalid info. This is assuming `-1` is valid (which I am not sure it is?).
Antonio SartoriThanks for clarifying. If the purpose is not to send any invalid values to the wire, I think an easier solution is to CHECK when serializing that we have a valid value. I still believe there is no point in casting to unsigned.
I wonder why we don't do the same at other places (for example, in mojo/public/cpp/base/unguessable_token_mojom_traits.h we allow serializing an invalid UnguessableToken but we fail the deserialization, or in url/mojom/url_gurl_mojom_traits.cc we do the same for GURL). Our styleguide seems to follow the same pattern to fail deserialization but not serialization (https://chromium.googlesource.com/chromium/src/+/HEAD/docs/security/mojo.md#use-structtraits).
Antonio SartoriThinking more about this, I think the point is really that we want to check on deserialization and not on serialization. If a process tries to serialize an invalid value, then there (maybe) was a problem already somewhere in the call stack, where that invalid value was passed around. What we really care about is that on the receiving side no invalid value is deserialized again.
Think about a mojo message renderer -> browser where a URL is passed. The URL might come from a javascript function call, which might accept or not an invalid URL depending on how it is specified. This URL is then passed around in the renderer until it is included in the mojo message. Now killing the renderer when serializing that URL might not be the best decision always. If that URL should have never been invalid, then this is an invariant that the calling code should have already ensured, somewhere else, for example also with CHECKs. In any case, though, what needs to be checked is, at the receiving side, that no invalid value is deserialized (and even if we checked that no invalid value is serialized, that would not be enough).
So I really don't think we should do anything else here. I think we could add checks to the code that uses the new mojo type in this CL to ensure that the process_id we are sending is valid, but I don't think this should be done in the mojo serialization code and I don't even think that we should always do it everywhere as a rule - we should only do it when it makes sense.
Christopher StaiteBut again thinking more about this (and sorry for spamming with messages), in this case we probably really never want to send invalid process ids in the first place, so maybe it's a good idea to add a CHECK and make debugging problems easier.
Will HarrisDone
Christopher StaiteYes checking the id is valid on both ends would suffice. I still don't quite understand your objection to just using typed on-wire values but agree the same goal is achieved if it's impossible to transmit invalid values as enforced by the traits.
And yes, if a renderer sends an invalid value we do want to kill that renderer. We don't want to allow a renderer sending an invalid value to crash the browser though, if possible.
I think typically when there is a continued lack of consensus it is typical not to close a comment thread.
Will HarrisSorry, I didn't realise that we hadn't reached consensus here. I thought that the conclusion was to perform a check while serialising and validate the value on deserialisation which is how I've now implemented it. I didn't realise there was still contention on the wire value.
Christopher Staitesorry I couldn't see the check here. Can you point it out in the code? Can you also clarify what you consider an invalid value as I wasn't sure above whether `-1` was valid or not?. Is `-2` ever valid... I presume not? What are the ranges of valid values here - just so we are on the same page.
Will HarrisInvalid values are -1 and 0. This uses a base::IdType which increments by one each time starting at 1, therefore, in theory -2 is a valid value, just before it overflows.
Christopher Staitei'm super confused, how is `-2` possibly a valid value here?
Will HarrisThe `content::ChildProcessId` (https://source.chromium.org/chromium/chromium/src/+/main:content/public/common/child_process_id.h) is simply a `base::IdType` where the invalid values are defined as `0` and `-1`. `base::IdType::Generator` (https://source.chromium.org/chromium/chromium/src/+/main:base/types/id_type.h) generates values by simply performing `next_id_++`, which increments from the starting value of `1` with a base type of `int32_t`. Therefore, if `2147483647` render processes were generated then there would be a perfectly valid `content::ChildProcessId` of `-2147483648` as it overflowed. If this continued then eventually a perfectly valid value of `-2` could be generated. Now, that's a lot of processes, but I suppose it's entirely possible. Adding a check here would mean that currently functional code that can handle `4294967294` processes, is now restricted to the space of `2147483647` processes for no good reason.
Christopher Staite4294967294 is not a valid value for an int32_t.
int32_t values are -2147483648 to 2147483647.
but this particular type only generates positive values, and uses 0 and -1 as 'invalid values'.
so -2 is not valid here.
Perhaps we are getting confused because we are using different definitions of the phase 'invalid'?
0 and -1 are valid for this type and represent an invalid value, but for the value to be 0 and -1 is valid, it just represents an invalid sentinel value.
-2 is a totally invalid value, and should never be in this type.
Will Harris`4294967294` is not a valid value for `int32_t` but `content::ChildProcessId` has `4294967294` unique and valid values currently. Making a change to this has a much wider impact. Currently there is nothing that handles this overflow in `content` and therefore having this enforced here is unusual.
Christopher Staiteokay understood, I don't agree with you on this, since you're saying that allowing the int32 to overflow to negative is valid, and I don't think it is - it's an integer overflow.
but I'll look in some content owners to adjudicate since we're not making any progress on this thread.
Alex MoshchukThanks, for clarity, I'm not saying you're wrong, I'm simply saying this is a change to the status quo which I believe is outside of the scope of this bug. It will be interesting to hear what content owners think. I know that @rak...@chromium.org has strong subject matter expertise here, perhaps they can chime in.
Christopher StaiteChiming in here as one of the content owners, after nasko@ pointed me to this thread (which also came up in some CSA discussions).
I don't think we should (continue to) support negative process IDs. If -2 and other negative values aren't considered invalid today, they should be, whether in mojo, in browser process code, or in any other process's code. AFAIK content has never relied on integer overflow to double the process ID space, and this is not the behavior we should support going forward. Of course in practice, we'd likely hit OS limits (for FDs, PIDs, etc) long before we'd reach ID 2147483647, not to mention that even without this, and even if you created 1000 processes per second, it'd take 4 years to get to that point, far beyond Chrome's expected uptime. But also, allowing integer overflow to happen might mean that new processes could eventually be created with -1, 0, and existing process IDs, leading to logic bugs. I do think this is in scope for the ChildProcessId refactor and for this CL - we should not allow transport of negative IDs over mojo only because we happen to not treat them as invalid in content today, since one of the points of introducing ChildProcessId is to catch and fix bugs and unexpected behavior like this. :) So I think it'd be fine if we implement a policy that's stricter than status quo.
I also agree that it's not great to use 0 to represent the browser process in the network service, and it's better to instead use a union {ChildProcessId, browser_process} (as suggested by wfh@ and also dtapuska@ on [this thread](https://groups.google.com/a/chromium.org/g/content-owners/c/iU0xYC7jgf8/m/rK4nDN59BwAJ?utm_medium=email&utm_source=footer)), an optional, a variant, or something similar. Using 0 as a magic number feels risky (as we've already seen bugs other contexts that actually incorrectly treat 0 as a valid child process ID - https://crbug.com/473591414), and using a ChildProcessId to represent the browser process in the network service also feels wrong because the browser process is not a child process. If the network service cares about identifying both browser and renderer processes, perhaps it should either use a different union-based type, or we should generalize ChildProcessId to a ProcessId that's also capable of representing the browser process.
I don't have a strong opinion on whether to transport the ID as an unsigned vs signed int in mojo (unsigned seems cleaner to me - but deferring this to mojo experts). I do think it'd also be good to update ChildProcessId to prevent integer overflow from being possible in existing code. (E.g., CHECK fail in the ID generator if we ever wrap around.)
Alex MoshchukThanks for the very useful comments. I have good news, https://crrev.com/c/7498491 implements just that, a union type for the network service. I've moved this to WIP and currently considering removing the possibility of transmitting a ChildProcessId outside of content entirely and only ever allowing a `network::RendererProcess` or `network::OriginatingProcess`. I'll update the definition of `network::RendererProcess` in that to comply with this discussion and disallow all negative values.
Christopher StaiteThank you, that sounds reasonable to me!
Moved to using the `network::OriginatingProcess` instead.
CHECK(id);Will Harris@w...@chromium.org here is the check.
Christopher StaiteExcuse my ignorance but are you sure that a value of `-2` will check here? I looked at the definition of `ChildProcessId` and it doesn't define this as invalid, but might be misunderstanding of the `IdType`. We can simply resolve this with a test! Can you add a unit test with a death test to verify that -2 (or other invalid values) causes a crash when performing mojo serialization?
Christopher Staite`-2` will not check here, and has no reason to.
Moved to using the `network::OriginatingProcess` instead.
#include "content/public/common/child_process_id_traits.h"Christopher Staite(nit) add include for content/public/common/child_process_id.h and for content/public/common/child_process_id.mojom-shared.h
Moved to using the `network::OriginatingProcess` instead.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
ToRendererProcess(render_frame_host_->GetProcess()->GetID()),This call seems odd like it's basically doing `ToProcess(Process()->Id());`. Should `GetProcess` not be enough? If not IMO either `ToRendererProcess` should be renamed or a comment should be added here articulating what is different about the two processes.
network::RendererProcess render_process_id;Throughout: It's maybe more churn, but it seems like this is not an ID anymore?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
ToRendererProcess(render_frame_host_->GetProcess()->GetID()),This call seems odd like it's basically doing `ToProcess(Process()->Id());`. Should `GetProcess` not be enough? If not IMO either `ToRendererProcess` should be renamed or a comment should be added here articulating what is different about the two processes.
They are the same thing, but we've had to create a new type in `network` so it can be referenced without depending on `content` as that makes it circular. So it's simply a type cast. Not sure if you think it's necessary to add a comment for a type cast?
network::RendererProcess render_process_id;Throughout: It's maybe more churn, but it seems like this is not an ID anymore?
I'm having regrets about my naming choices here, this should have been called `RendererProcessId` I think. The refactor for that would be pretty substantial now though. Perhaps it should be done. What do you think?
Personally, IMO the name chosen from Network is inaccurate and leads to some potential readability concerns. If you can convince the content and/or IPC reviewers it's fine or can point to other network objects that are named/behave similarly, I won't push/block though.
ToRendererProcess(render_frame_host_->GetProcess()->GetID()),Christopher StaiteThis call seems odd like it's basically doing `ToProcess(Process()->Id());`. Should `GetProcess` not be enough? If not IMO either `ToRendererProcess` should be renamed or a comment should be added here articulating what is different about the two processes.
They are the same thing, but we've had to create a new type in `network` so it can be referenced without depending on `content` as that makes it circular. So it's simply a type cast. Not sure if you think it's necessary to add a comment for a type cast?
My confusion is that "ToRendererProcess" doesn't look like a type cast, when I read the method it seems to me like you're converting to the same type that I can get from render_frame_host_->GetProcess(); so I think it's an unfortunate naming thing. I don't love `ToNetworkRendererProcessId`, but I think that's at least clearer? Or maybe just `ToTransferrableRenderProcessId`. How widespread is the usage?
network::RendererProcess render_process_id;Christopher StaiteThroughout: It's maybe more churn, but it seems like this is not an ID anymore?
I'm having regrets about my naming choices here, this should have been called `RendererProcessId` I think. The refactor for that would be pretty substantial now though. Perhaps it should be done. What do you think?
IMO, calling it `network::RendererProcess` when it is still just an ID (and doesn't technically contain everything) makes this seem like the type is incorrectly named and hurts readability. Further there's the issues like the method that apparently is doing a cast.
How widespread is the usage? It should mostly be a find/replace at least?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
ToRendererProcess(render_frame_host_->GetProcess()->GetID()),Christopher StaiteThis call seems odd like it's basically doing `ToProcess(Process()->Id());`. Should `GetProcess` not be enough? If not IMO either `ToRendererProcess` should be renamed or a comment should be added here articulating what is different about the two processes.
Alexander CooperThey are the same thing, but we've had to create a new type in `network` so it can be referenced without depending on `content` as that makes it circular. So it's simply a type cast. Not sure if you think it's necessary to add a comment for a type cast?
My confusion is that "ToRendererProcess" doesn't look like a type cast, when I read the method it seems to me like you're converting to the same type that I can get from render_frame_host_->GetProcess(); so I think it's an unfortunate naming thing. I don't love `ToNetworkRendererProcessId`, but I think that's at least clearer? Or maybe just `ToTransferrableRenderProcessId`. How widespread is the usage?
Done
network::RendererProcess render_process_id;Christopher StaiteThroughout: It's maybe more churn, but it seems like this is not an ID anymore?
Alexander CooperI'm having regrets about my naming choices here, this should have been called `RendererProcessId` I think. The refactor for that would be pretty substantial now though. Perhaps it should be done. What do you think?
IMO, calling it `network::RendererProcess` when it is still just an ID (and doesn't technically contain everything) makes this seem like the type is incorrectly named and hurts readability. Further there's the issues like the method that apparently is doing a cast.
How widespread is the usage? It should mostly be a find/replace at least?
| Code-Review | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
mojo LGTM
Create the Mojom mapping and traits required to pass `ChildProcessId`
across Mojo, replacing raw integers with strong types. Apply thisI think this refers now to another CL?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Create the Mojom mapping and traits required to pass `ChildProcessId`
across Mojo, replacing raw integers with strong types. Apply thisI think this refers now to another CL?
Quite right, that was moved.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |