On Tue, Mar 25, 2025 at 11:02 AM Xianzhu Wang <wangx...@chromium.org> wrote:
+pkasting who has been recently working on PaintOpReader.
I haven't, actually -- I think I checked in a cleanup change to it in passing and the fuzzer has been furiously assigning bugs to me as a result, which I've been ignoring :(
On Thu, Feb 27, 2025 at 2:18 PM Xianzhu Wang <wangx...@chromium.org> wrote:We have seen a lot of clusterfuzz bugs about crashes in PaintOpReader. Some of them never happen in a valid environment because our serializer can never produce such data. We have fixed some of such bugs, while I closed some as WAI because I think crashing is a better choice (but clusterfuzz will sometimes create another test case and file another bug for the same crash).
If we have CHECKs that say that certain preconditions are expected on the caller side, we should update fuzzers to not pass inputs that violate those CHECKs.If there are no CHECKs, then IMO code should either handle those inputs, or else add such CHECKs and prevent them caller-side (and then not fuzz such input ranges). Doing neither (e.g. just crashing on shouldn't-happen inputs to public APIs) is unclear to readers.
Now PaintOpReader handles invalid inputs in the following ways:1. Normal control flow using the valid_ flag.2. CHECK/NOTREACHED.3. Crash in c++ library, Skia etc.According to the CHECK guide, because PaintOpReader handles "data across security boundaries" in a high privilege process, we should use normal control flow (#1). However, I'm wondering if the effort is worth it. IMHO crashing is better because it entirely avoids the impact of an invalid input. If we can make the GPU process an exception to the rule in the CHECK guide,PaintOpReader should use ImmediateCrash to check for invalid inputs.
Why not CHECK (option 2)?
PK
+pkasting who has been recently working on PaintOpReader.
On Thu, Feb 27, 2025 at 2:18 PM Xianzhu Wang <wangx...@chromium.org> wrote:We have seen a lot of clusterfuzz bugs about crashes in PaintOpReader. Some of them never happen in a valid environment because our serializer can never produce such data. We have fixed some of such bugs, while I closed some as WAI because I think crashing is a better choice (but clusterfuzz will sometimes create another test case and file another bug for the same crash).
Now PaintOpReader handles invalid inputs in the following ways:1. Normal control flow using the valid_ flag.2. CHECK/NOTREACHED.3. Crash in c++ library, Skia etc.According to the CHECK guide, because PaintOpReader handles "data across security boundaries" in a high privilege process, we should use normal control flow (#1). However, I'm wondering if the effort is worth it. IMHO crashing is better because it entirely avoids the impact of an invalid input. If we can make the GPU process an exception to the rule in the CHECK guide,PaintOpReader should use ImmediateCrash to check for invalid inputs.
base::ImmediateCrash()
or CHECK()
etc. as part of normal control flow.PK
The rule in the guide that "CHECK in PaintOpReader" doesn't strictly follow is (under this section):PaintOpReader resides in the GPU process which is a high privilege process. It processes input produced by renderer processes.
- Data across security boundaries: A compromised renderer should not be able to bring down the browser process (higher privilege). Bad IPC messages should be safely rejected by Chromium without the use of
base::ImmediateCrash()
orCHECK()
etc. as part of normal control flow.
In the cases of the closed WAI bugs, the preconditions are enforced from the renderer process (can we treat them as the caller?), except for the fuzzer. It sounds complex to let the fuzzer meet the preconditions, so I'm wondering if there is a way to let clusterfuzz know that a crash is intentional and ignore it.
On Mon, Mar 31, 2025 at 3:12 PM Xianzhu Wang <wangx...@chromium.org> wrote:The rule in the guide that "CHECK in PaintOpReader" doesn't strictly follow is (under this section):PaintOpReader resides in the GPU process which is a high privilege process. It processes input produced by renderer processes.
- Data across security boundaries: A compromised renderer should not be able to bring down the browser process (higher privilege). Bad IPC messages should be safely rejected by Chromium without the use of
base::ImmediateCrash()
orCHECK()
etc. as part of normal control flow.I don't think a compromised renderer bringing down the GPU process violates that rule, but the security people would know definitively. Especially if the GPU process is auto-restarted on crashing, the consequences of that are much less user-visible than a browser crash.
In the cases of the closed WAI bugs, the preconditions are enforced from the renderer process (can we treat them as the caller?), except for the fuzzer. It sounds complex to let the fuzzer meet the preconditions, so I'm wondering if there is a way to let clusterfuzz know that a crash is intentional and ignore it.Complex because the conditions are complicated to encode (or unclear), or complex because you aren't sure how the fuzzer works and where to encode such things?
PK
On Mon, Mar 31, 2025 at 5:49 PM Xianzhu Wang <wangx...@chromium.org> wrote:On Mon, Mar 31, 2025 at 4:11 PM Peter Kasting <pkas...@chromium.org> wrote:On Mon, Mar 31, 2025 at 3:12 PM Xianzhu Wang <wangx...@chromium.org> wrote:The rule in the guide that "CHECK in PaintOpReader" doesn't strictly follow is (under this section):PaintOpReader resides in the GPU process which is a high privilege process. It processes input produced by renderer processes.
- Data across security boundaries: A compromised renderer should not be able to bring down the browser process (higher privilege). Bad IPC messages should be safely rejected by Chromium without the use of
base::ImmediateCrash()
orCHECK()
etc. as part of normal control flow.I don't think a compromised renderer bringing down the GPU process violates that rule, but the security people would know definitively. Especially if the GPU process is auto-restarted on crashing, the consequences of that are much less user-visible than a browser crash.Agreed.
It would be better if the GPU process did not crash in this situation, and instead killed the misbehaving renderer.
+1 to handling invalid renderer input safely and failing to de-serialize. If it's easy to bubble up the error to a point which can terminate the mojo connection, that's even better.I assumed that's how it already works, PaintOpReader sets a bit to indicate invalid input and fails safely instead of asserting. If there's a case where we're CHECK-ing, that would be a bug we should fix. Are you seeing such cases when we deserialize Skia data structures?
Even absent any security risk, the user experience of a GPU crash is a lot worse than the user experience of a renderer crash, and we should avoid it when feasible.
On Wed, Apr 2, 2025 at 2:46 PM Elly <elly...@chromium.org> wrote:Even absent any security risk, the user experience of a GPU crash is a lot worse than the user experience of a renderer crash, and we should avoid it when feasible.IIUC we're only talking about crashing when there's a compromised renderer. Does a slightly degraded user experience really matter in this case?
- Vasiliy