WebNN: Schedule `OnLost` task immediately to avoid race conditions [chromium/src : main]

1 view
Skip to first unread message

Xu, Mingming1 (Gerrit)

unread,
Jan 5, 2026, 8:37:51 PM (6 days ago) Jan 5
to Hu, Ningxin, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
Attention needed from Hu, Ningxin

Xu, Mingming1 added 1 comment

Patchset-level comments
File-level comment, Patchset 4 (Latest):
Xu, Mingming1 . resolved

PTAL, thanks!

Open in Gerrit

Related details

Attention is currently required from:
  • Hu, Ningxin
Submit Requirements:
  • requirement satisfiedCode-Coverage
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
Gerrit-Change-Number: 7364029
Gerrit-PatchSet: 4
Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
Gerrit-CC: Jiewei Qian <q...@chromium.org>
Gerrit-Attention: Hu, Ningxin <ningx...@intel.com>
Gerrit-Comment-Date: Tue, 06 Jan 2026 01:37:38 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
satisfied_requirement
unsatisfied_requirement
open
diffy

Hu, Ningxin (Gerrit)

unread,
Jan 6, 2026, 8:40:43 AM (5 days ago) Jan 6
to Xu, Mingming1, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
Attention needed from Bernhart, Bryan and Xu, Mingming1

Hu, Ningxin added 5 comments

Commit Message
Line 9, Patchset 4 (Latest):Currently, dispatching and reading tensor are scheduled to a
Hu, Ningxin . unresolved

Nit: remove this "a"

Line 20, Patchset 4 (Latest):1. If handling context lost is scheduled after reading output
tensor, renderer process can still complete reading the tensor
with an invaild result before the
`MLTensor::OnConnectionError`[3] is invoked.
Hu, Ningxin . unresolved

Can we mark a context "is being lost"? A tensor reading can check that and handle it correctly?

Line 25, Patchset 4 (Latest):2. If handling context lost is scheduled before reading output
tensor, renderer process will throw an error: "Tensor has been
destroyed or context is lost." in `MLTensor::OnConnectionError`.
Hu, Ningxin . unresolved

This is the expected behavior, correct?

Line 36, Patchset 4 (Latest):broadcasted across all contexts. The broadcast is unnecessary
and may cause a security issue where contexts created on
different web pages could share the context lost reason.
Hu, Ningxin . unresolved

Are the context lost reasons non-recoverable system-wide error, e.g. device-removal? Why does broadcasting them cause security issue?

Line 39, Patchset 4 (Latest):Even before this change, the broadcast was not guaranteed
due to a race condition between killing the GPU process[4]
on the main thread and invoking OnLost on each context's
owning thread.
Hu, Ningxin . unresolved

Does gpu scheduler destruction ensure all tasks are scheduled? @bryan.b...@intel.com

Open in Gerrit

Related details

Attention is currently required from:
  • Bernhart, Bryan
  • Xu, Mingming1
Submit Requirements:
    • requirement satisfiedCode-Coverage
    • requirement is not satisfiedCode-Owners
    • requirement is not satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement is not satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: chromium/src
    Gerrit-Branch: main
    Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
    Gerrit-Change-Number: 7364029
    Gerrit-PatchSet: 4
    Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
    Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-CC: Bernhart, Bryan <bryan.b...@intel.com>
    Gerrit-CC: Jiewei Qian <q...@chromium.org>
    Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
    Gerrit-Comment-Date: Tue, 06 Jan 2026 13:40:32 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: No
    satisfied_requirement
    unsatisfied_requirement
    open
    diffy

    Bernhart, Bryan (Gerrit)

    unread,
    Jan 6, 2026, 5:39:35 PM (5 days ago) Jan 6
    to Xu, Mingming1, Hu, Ningxin, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
    Attention needed from Xu, Mingming1

    Bernhart, Bryan added 2 comments

    Commit Message
    Line 36, Patchset 4 (Latest):broadcasted across all contexts. The broadcast is unnecessary
    and may cause a security issue where contexts created on
    different web pages could share the context lost reason.
    Hu, Ningxin . unresolved

    Are the context lost reasons non-recoverable system-wide error, e.g. device-removal? Why does broadcasting them cause security issue?

    Bernhart, Bryan

    It depends on whether contexts share the same device; only contexts on different devices (GPU vs NPU) see independent reasons.

    Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

    We could store the reason in the provider and have each context pull it locally, possibly using `RunOrPostTaskAndWaitOnSequence` for sequence safety. Otherwise, keeping the reason only for the context that detects the loss is probably fine for now.

    [1] https://issues.chromium.org/issues/467033530

    Line 39, Patchset 4 (Latest):Even before this change, the broadcast was not guaranteed
    due to a race condition between killing the GPU process[4]
    on the main thread and invoking OnLost on each context's
    owning thread.
    Hu, Ningxin . unresolved

    Does gpu scheduler destruction ensure all tasks are scheduled? @bryan.b...@intel.com

    Bernhart, Bryan

    No. Destroying the GPU scheduler / killing the GPU process does not ensure any pending tasks have been scheduled or executed.

    Open in Gerrit

    Related details

    Attention is currently required from:
    • Xu, Mingming1
    Submit Requirements:
    • requirement satisfiedCode-Coverage
    • requirement is not satisfiedCode-Owners
    • requirement is not satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement is not satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: chromium/src
    Gerrit-Branch: main
    Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
    Gerrit-Change-Number: 7364029
    Gerrit-PatchSet: 4
    Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
    Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-CC: Bernhart, Bryan <bryan.b...@intel.com>
    Gerrit-CC: Jiewei Qian <q...@chromium.org>
    Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-Comment-Date: Tue, 06 Jan 2026 22:39:23 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: No
    Comment-In-Reply-To: Hu, Ningxin <ningx...@intel.com>
    satisfied_requirement
    unsatisfied_requirement
    open
    diffy

    Xu, Mingming1 (Gerrit)

    unread,
    Jan 7, 2026, 3:27:31 AM (5 days ago) Jan 7
    to Bernhart, Bryan, Hu, Ningxin, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
    Attention needed from Bernhart, Bryan and Hu, Ningxin

    Xu, Mingming1 added 4 comments

    Commit Message
    Line 9, Patchset 4:Currently, dispatching and reading tensor are scheduled to a
    Hu, Ningxin . resolved

    Nit: remove this "a"

    Xu, Mingming1

    Done

    Line 20, Patchset 4:1. If handling context lost is scheduled after reading output

    tensor, renderer process can still complete reading the tensor
    with an invaild result before the
    `MLTensor::OnConnectionError`[3] is invoked.
    Hu, Ningxin . unresolved

    Can we mark a context "is being lost"? A tensor reading can check that and handle it correctly?

    Xu, Mingming1

    Currently, the GPU process is possible to be killed by ourselves on main thread before `OnLost` task is executed on WebNN's owning thread. The callback of reading tensor will still have no chance to be executed. But that's fine, because the renderer process will also throw an error: "Tensor has been


    destroyed or context is lost." in `MLTensor::OnConnectionError`

    Line 25, Patchset 4:2. If handling context lost is scheduled before reading output

    tensor, renderer process will throw an error: "Tensor has been
    destroyed or context is lost." in `MLTensor::OnConnectionError`.
    Hu, Ningxin . resolved

    This is the expected behavior, correct?

    Xu, Mingming1

    Correct!

    Line 36, Patchset 4:broadcasted across all contexts. The broadcast is unnecessary

    and may cause a security issue where contexts created on
    different web pages could share the context lost reason.
    Hu, Ningxin . unresolved

    Are the context lost reasons non-recoverable system-wide error, e.g. device-removal? Why does broadcasting them cause security issue?

    Bernhart, Bryan

    It depends on whether contexts share the same device; only contexts on different devices (GPU vs NPU) see independent reasons.

    Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

    We could store the reason in the provider and have each context pull it locally, possibly using `RunOrPostTaskAndWaitOnSequence` for sequence safety. Otherwise, keeping the reason only for the context that detects the loss is probably fine for now.

    [1] https://issues.chromium.org/issues/467033530

    Xu, Mingming1

    For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

    Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

    Agree, now all contexts even on different web pages share one provider, so I think sharing the lost reason across contexts is not safe.

    Open in Gerrit

    Related details

    Attention is currently required from:
    • Bernhart, Bryan
    • Hu, Ningxin
    Submit Requirements:
    • requirement satisfiedCode-Coverage
    • requirement is not satisfiedCode-Owners
    • requirement is not satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement is not satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: chromium/src
    Gerrit-Branch: main
    Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
    Gerrit-Change-Number: 7364029
    Gerrit-PatchSet: 4
    Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
    Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
    Gerrit-CC: Bernhart, Bryan <bryan.b...@intel.com>
    Gerrit-CC: Jiewei Qian <q...@chromium.org>
    Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
    Gerrit-Attention: Hu, Ningxin <ningx...@intel.com>
    Gerrit-Comment-Date: Wed, 07 Jan 2026 08:27:18 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: No
    Comment-In-Reply-To: Bernhart, Bryan <bryan.b...@intel.com>
    Comment-In-Reply-To: Hu, Ningxin <ningx...@intel.com>
    satisfied_requirement
    unsatisfied_requirement
    open
    diffy

    Bernhart, Bryan (Gerrit)

    unread,
    Jan 7, 2026, 7:00:19 PM (4 days ago) Jan 7
    to Xu, Mingming1, Hu, Ningxin, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
    Attention needed from Hu, Ningxin and Xu, Mingming1

    Bernhart, Bryan voted and added 1 comment

    Votes added by Bernhart, Bryan

    Code-Review+1

    1 comment

    Patchset-level comments
    File-level comment, Patchset 5 (Latest):
    Bernhart, Bryan . resolved

    Looks reasonable to me, thanks @mingmi...@intel.com.

    Open in Gerrit

    Related details

    Attention is currently required from:
    • Hu, Ningxin
    • Xu, Mingming1
    Submit Requirements:
      • requirement satisfiedCode-Coverage
      • requirement is not satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement is not satisfiedNo-Unresolved-Comments
      • requirement is not satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: chromium/src
      Gerrit-Branch: main
      Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
      Gerrit-Change-Number: 7364029
      Gerrit-PatchSet: 5
      Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
      Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
      Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-CC: Jiewei Qian <q...@chromium.org>
      Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Attention: Hu, Ningxin <ningx...@intel.com>
      Gerrit-Comment-Date: Thu, 08 Jan 2026 00:00:07 +0000
      Gerrit-HasComments: Yes
      Gerrit-Has-Labels: Yes
      satisfied_requirement
      unsatisfied_requirement
      open
      diffy

      Hu, Ningxin (Gerrit)

      unread,
      Jan 7, 2026, 7:27:35 PM (4 days ago) Jan 7
      to Xu, Mingming1, Rafael Cintron, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
      Attention needed from Rafael Cintron and Xu, Mingming1

      Hu, Ningxin added 5 comments

      Patchset-level comments
      Hu, Ningxin . resolved

      Add @rafael....@microsoft.com as this CL introduces a behavior change across DML and ORT backends.

      Commit Message
      Line 15, Patchset 5 (Latest):DestroyAllContextsAndKillGpuProcess`[2], a race condition can
      Hu, Ningxin . unresolved

      Nit: can this part be appended into previous line?

      Line 20, Patchset 4:1. If handling context lost is scheduled after reading output
      tensor, renderer process can still complete reading the tensor
      with an invaild result before the
      `MLTensor::OnConnectionError`[3] is invoked.
      Hu, Ningxin . resolved

      Can we mark a context "is being lost"? A tensor reading can check that and handle it correctly?

      Xu, Mingming1

      Currently, the GPU process is possible to be killed by ourselves on main thread before `OnLost` task is executed on WebNN's owning thread. The callback of reading tensor will still have no chance to be executed. But that's fine, because the renderer process will also throw an error: "Tensor has been
      destroyed or context is lost." in `MLTensor::OnConnectionError`

      Hu, Ningxin

      Acknowledged

      Line 36, Patchset 4:broadcasted across all contexts. The broadcast is unnecessary
      and may cause a security issue where contexts created on
      different web pages could share the context lost reason.
      Hu, Ningxin . unresolved

      Are the context lost reasons non-recoverable system-wide error, e.g. device-removal? Why does broadcasting them cause security issue?

      Bernhart, Bryan

      It depends on whether contexts share the same device; only contexts on different devices (GPU vs NPU) see independent reasons.

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      We could store the reason in the provider and have each context pull it locally, possibly using `RunOrPostTaskAndWaitOnSequence` for sequence safety. Otherwise, keeping the reason only for the context that detects the loss is probably fine for now.

      [1] https://issues.chromium.org/issues/467033530

      Xu, Mingming1

      For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      Agree, now all contexts even on different web pages share one provider, so I think sharing the lost reason across contexts is not safe.

      Hu, Ningxin

      For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

      Agreed broadcasting errors across origins are inappropriate. But after this change, we are not able to notify all contexts in one origin the lost reason, correct?

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      This sounds like the intended behavior, could we add a TODO when removing the broadcasting code?

      Line 39, Patchset 4:Even before this change, the broadcast was not guaranteed

      due to a race condition between killing the GPU process[4]
      on the main thread and invoking OnLost on each context's
      owning thread.
      Hu, Ningxin . resolved

      Does gpu scheduler destruction ensure all tasks are scheduled? @bryan.b...@intel.com

      Bernhart, Bryan

      No. Destroying the GPU scheduler / killing the GPU process does not ensure any pending tasks have been scheduled or executed.

      Hu, Ningxin

      Acknowledged

      Open in Gerrit

      Related details

      Attention is currently required from:
      • Rafael Cintron
      • Xu, Mingming1
      Submit Requirements:
      • requirement satisfiedCode-Coverage
      • requirement is not satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement is not satisfiedNo-Unresolved-Comments
      • requirement is not satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: chromium/src
      Gerrit-Branch: main
      Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
      Gerrit-Change-Number: 7364029
      Gerrit-PatchSet: 5
      Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
      Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
      Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
      Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-CC: Jiewei Qian <q...@chromium.org>
      Gerrit-Attention: Rafael Cintron <rafael....@microsoft.com>
      Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Comment-Date: Thu, 08 Jan 2026 00:27:24 +0000
      Gerrit-HasComments: Yes
      Gerrit-Has-Labels: No
      Comment-In-Reply-To: Xu, Mingming1 <mingmi...@intel.com>
      satisfied_requirement
      unsatisfied_requirement
      open
      diffy

      Xu, Mingming1 (Gerrit)

      unread,
      Jan 8, 2026, 3:06:52 AM (4 days ago) Jan 8
      to Rafael Cintron, Bernhart, Bryan, Hu, Ningxin, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
      Attention needed from Bernhart, Bryan, Hu, Ningxin and Rafael Cintron

      Xu, Mingming1 added 2 comments

      Commit Message
      Line 15, Patchset 5:DestroyAllContextsAndKillGpuProcess`[2], a race condition can
      Hu, Ningxin . resolved

      Nit: can this part be appended into previous line?

      Xu, Mingming1

      Done

      Line 36, Patchset 4:broadcasted across all contexts. The broadcast is unnecessary
      and may cause a security issue where contexts created on
      different web pages could share the context lost reason.
      Hu, Ningxin . unresolved

      Are the context lost reasons non-recoverable system-wide error, e.g. device-removal? Why does broadcasting them cause security issue?

      Bernhart, Bryan

      It depends on whether contexts share the same device; only contexts on different devices (GPU vs NPU) see independent reasons.

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      We could store the reason in the provider and have each context pull it locally, possibly using `RunOrPostTaskAndWaitOnSequence` for sequence safety. Otherwise, keeping the reason only for the context that detects the loss is probably fine for now.

      [1] https://issues.chromium.org/issues/467033530

      Xu, Mingming1

      For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      Agree, now all contexts even on different web pages share one provider, so I think sharing the lost reason across contexts is not safe.

      Hu, Ningxin

      For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

      Agreed broadcasting errors across origins are inappropriate. But after this change, we are not able to notify all contexts in one origin the lost reason, correct?

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      This sounds like the intended behavior, could we add a TODO when removing the broadcasting code?

      Xu, Mingming1

      Agreed broadcasting errors across origins are inappropriate. But after this change, we are not able to notify all contexts in one origin the lost reason, correct?

      Correct. While even before this change, the broadcast was not guaranteed due to a race condition between [killing the GPU process](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/webnn_context_provider_impl.cc;l=177) on the main thread and invoking `OnLost` on each context's owning thread.

      TODO is added.

      Open in Gerrit

      Related details

      Attention is currently required from:
      • Bernhart, Bryan
      • Hu, Ningxin
      • Rafael Cintron
      Submit Requirements:
      • requirement satisfiedCode-Coverage
      • requirement is not satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement is not satisfiedNo-Unresolved-Comments
      • requirement is not satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: chromium/src
      Gerrit-Branch: main
      Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
      Gerrit-Change-Number: 7364029
      Gerrit-PatchSet: 7
      Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
      Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
      Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
      Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-CC: Jiewei Qian <q...@chromium.org>
      Gerrit-Attention: Rafael Cintron <rafael....@microsoft.com>
      Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
      Gerrit-Attention: Hu, Ningxin <ningx...@intel.com>
      Gerrit-Comment-Date: Thu, 08 Jan 2026 08:06:35 +0000
      satisfied_requirement
      unsatisfied_requirement
      open
      diffy

      Hu, Ningxin (Gerrit)

      unread,
      Jan 8, 2026, 9:07:26 AM (3 days ago) Jan 8
      to Xu, Mingming1, Rafael Cintron, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
      Attention needed from Bernhart, Bryan, Rafael Cintron and Xu, Mingming1

      Hu, Ningxin voted and added 2 comments

      Votes added by Hu, Ningxin

      Code-Review+1

      2 comments

      Patchset-level comments
      Commit Message
      Line 36, Patchset 4:broadcasted across all contexts. The broadcast is unnecessary
      and may cause a security issue where contexts created on
      different web pages could share the context lost reason.
      Hu, Ningxin . resolved

      Are the context lost reasons non-recoverable system-wide error, e.g. device-removal? Why does broadcasting them cause security issue?

      Bernhart, Bryan

      It depends on whether contexts share the same device; only contexts on different devices (GPU vs NPU) see independent reasons.

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      We could store the reason in the provider and have each context pull it locally, possibly using `RunOrPostTaskAndWaitOnSequence` for sequence safety. Otherwise, keeping the reason only for the context that detects the loss is probably fine for now.

      [1] https://issues.chromium.org/issues/467033530

      Xu, Mingming1

      For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      Agree, now all contexts even on different web pages share one provider, so I think sharing the lost reason across contexts is not safe.

      Hu, Ningxin

      For some non-system-wide errors and independent errors from different devices, the lost reasons should not be broadcasted, right?

      Agreed broadcasting errors across origins are inappropriate. But after this change, we are not able to notify all contexts in one origin the lost reason, correct?

      Currently, all contexts share a provider, but this will change to per-renderer providers [1], making broadcasting a lost reason safe.

      This sounds like the intended behavior, could we add a TODO when removing the broadcasting code?

      Xu, Mingming1

      Agreed broadcasting errors across origins are inappropriate. But after this change, we are not able to notify all contexts in one origin the lost reason, correct?

      Correct. While even before this change, the broadcast was not guaranteed due to a race condition between [killing the GPU process](https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/webnn_context_provider_impl.cc;l=177) on the main thread and invoking `OnLost` on each context's owning thread.

      TODO is added.

      Hu, Ningxin

      Acknowledged

      Open in Gerrit

      Related details

      Attention is currently required from:
      • Bernhart, Bryan
      • Rafael Cintron
      • Xu, Mingming1
      Submit Requirements:
      • requirement satisfiedCode-Coverage
      • requirement satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: chromium/src
      Gerrit-Branch: main
      Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
      Gerrit-Change-Number: 7364029
      Gerrit-PatchSet: 7
      Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
      Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
      Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
      Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-CC: Jiewei Qian <q...@chromium.org>
      Gerrit-Attention: Rafael Cintron <rafael....@microsoft.com>
      Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
      Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
      Gerrit-Comment-Date: Thu, 08 Jan 2026 14:07:16 +0000
      Gerrit-HasComments: Yes
      Gerrit-Has-Labels: Yes
      satisfied_requirement
      open
      diffy

      Rafael Cintron (Gerrit)

      unread,
      Jan 8, 2026, 5:25:15 PM (3 days ago) Jan 8
      to Xu, Mingming1, Hu, Ningxin, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
      Attention needed from Bernhart, Bryan and Xu, Mingming1

      Rafael Cintron added 1 comment

      File services/webnn/ort/context_impl_ort.cc
      Line 378, Patchset 7 (Latest): OnLost(error_message);
      DestroyAllContextsAndKillGpuProcess();
      Rafael Cintron . unresolved

      I see that both places which call DestroyAllContextsAndKillGpuProcess also now call OnLost. Would it ever make sense to call one without the other? If not, consider combining these into one, perhaps better named, helper function.

      Open in Gerrit

      Related details

      Attention is currently required from:
      • Bernhart, Bryan
      • Xu, Mingming1
      Submit Requirements:
        • requirement satisfiedCode-Coverage
        • requirement satisfiedCode-Owners
        • requirement satisfiedCode-Review
        • requirement is not satisfiedNo-Unresolved-Comments
        • requirement satisfiedReview-Enforcement
        Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
        Gerrit-MessageType: comment
        Gerrit-Project: chromium/src
        Gerrit-Branch: main
        Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
        Gerrit-Change-Number: 7364029
        Gerrit-PatchSet: 7
        Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
        Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
        Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
        Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
        Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
        Gerrit-CC: Jiewei Qian <q...@chromium.org>
        Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
        Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
        Gerrit-Comment-Date: Thu, 08 Jan 2026 22:24:52 +0000
        Gerrit-HasComments: Yes
        Gerrit-Has-Labels: No
        satisfied_requirement
        unsatisfied_requirement
        open
        diffy

        Xu, Mingming1 (Gerrit)

        unread,
        Jan 8, 2026, 8:14:58 PM (3 days ago) Jan 8
        to Hu, Ningxin, Rafael Cintron, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
        Attention needed from Bernhart, Bryan and Rafael Cintron

        Xu, Mingming1 added 1 comment

        File services/webnn/ort/context_impl_ort.cc
        Line 378, Patchset 7 (Latest): OnLost(error_message);
        DestroyAllContextsAndKillGpuProcess();
        Rafael Cintron . unresolved

        I see that both places which call DestroyAllContextsAndKillGpuProcess also now call OnLost. Would it ever make sense to call one without the other? If not, consider combining these into one, perhaps better named, helper function.

        Xu, Mingming1

        I am thinking it's possible to only call `OnLost` for some errors which can be recovered by only re-creating contexts instead of killing GPU process.

        See https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/dml/context_impl_dml.cc;l=999, here we only call `OnLost` for non-device-removal errors without killing GPU process.

        Open in Gerrit

        Related details

        Attention is currently required from:
        • Bernhart, Bryan
        • Rafael Cintron
        Submit Requirements:
        • requirement satisfiedCode-Coverage
        • requirement satisfiedCode-Owners
        • requirement satisfiedCode-Review
        • requirement is not satisfiedNo-Unresolved-Comments
        • requirement satisfiedReview-Enforcement
        Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
        Gerrit-MessageType: comment
        Gerrit-Project: chromium/src
        Gerrit-Branch: main
        Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
        Gerrit-Change-Number: 7364029
        Gerrit-PatchSet: 7
        Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
        Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
        Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
        Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
        Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
        Gerrit-CC: Jiewei Qian <q...@chromium.org>
        Gerrit-Attention: Rafael Cintron <rafael....@microsoft.com>
        Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
        Gerrit-Comment-Date: Fri, 09 Jan 2026 01:14:40 +0000
        Gerrit-HasComments: Yes
        Gerrit-Has-Labels: No
        Comment-In-Reply-To: Rafael Cintron <rafael....@microsoft.com>
        satisfied_requirement
        unsatisfied_requirement
        open
        diffy

        Rafael Cintron (Gerrit)

        unread,
        Jan 9, 2026, 6:38:58 PM (2 days ago) Jan 9
        to Xu, Mingming1, Hu, Ningxin, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
        Attention needed from Bernhart, Bryan and Xu, Mingming1

        Rafael Cintron voted and added 1 comment

        Votes added by Rafael Cintron

        Code-Review+1

        1 comment

        File services/webnn/ort/context_impl_ort.cc
        Line 378, Patchset 7 (Latest): OnLost(error_message);
        DestroyAllContextsAndKillGpuProcess();
        Rafael Cintron . resolved

        I see that both places which call DestroyAllContextsAndKillGpuProcess also now call OnLost. Would it ever make sense to call one without the other? If not, consider combining these into one, perhaps better named, helper function.

        Xu, Mingming1

        I am thinking it's possible to only call `OnLost` for some errors which can be recovered by only re-creating contexts instead of killing GPU process.

        See https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/dml/context_impl_dml.cc;l=999, here we only call `OnLost` for non-device-removal errors without killing GPU process.

        Rafael Cintron

        Acknowledged

        Open in Gerrit

        Related details

        Attention is currently required from:
        • Bernhart, Bryan
        • Xu, Mingming1
        Submit Requirements:
          • requirement satisfiedCode-Coverage
          • requirement satisfiedCode-Owners
          • requirement satisfiedCode-Review
          • requirement satisfiedReview-Enforcement
          Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
          Gerrit-MessageType: comment
          Gerrit-Project: chromium/src
          Gerrit-Branch: main
          Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
          Gerrit-Change-Number: 7364029
          Gerrit-PatchSet: 7
          Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
          Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
          Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
          Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
          Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
          Gerrit-CC: Jiewei Qian <q...@chromium.org>
          Gerrit-Attention: Xu, Mingming1 <mingmi...@intel.com>
          Gerrit-Attention: Bernhart, Bryan <bryan.b...@intel.com>
          Gerrit-Comment-Date: Fri, 09 Jan 2026 23:38:47 +0000
          Gerrit-HasComments: Yes
          Gerrit-Has-Labels: Yes
          Comment-In-Reply-To: Rafael Cintron <rafael....@microsoft.com>
          Comment-In-Reply-To: Xu, Mingming1 <mingmi...@intel.com>
          satisfied_requirement
          open
          diffy

          Hu, Ningxin (Gerrit)

          unread,
          Jan 10, 2026, 7:22:34 PM (22 hours ago) Jan 10
          to Xu, Mingming1, Rafael Cintron, Bernhart, Bryan, Chromium LUCI CQ, AyeAye, chromium...@chromium.org, Jiewei Qian
          Attention needed from Bernhart, Bryan and Xu, Mingming1

          Hu, Ningxin voted Commit-Queue+2

          Commit-Queue+2
          Gerrit-Comment-Date: Sun, 11 Jan 2026 00:22:24 +0000
          Gerrit-HasComments: No
          Gerrit-Has-Labels: Yes
          satisfied_requirement
          open
          diffy

          Chromium LUCI CQ (Gerrit)

          unread,
          Jan 10, 2026, 8:22:15 PM (21 hours ago) Jan 10
          to Xu, Mingming1, Hu, Ningxin, Rafael Cintron, Bernhart, Bryan, AyeAye, chromium...@chromium.org, Jiewei Qian

          Chromium LUCI CQ submitted the change

          Change information

          Commit message:
          WebNN: Schedule `OnLost` task immediately to avoid race conditions

          Currently, dispatching and reading tensor are scheduled to each
          context's owning thread in sequence. `WebNNContextImpl::OnLost`[1] task
          to handle the context lost during dispatching is also scheduled to the
          context's owning thread, but since the scheduling is called from the
          GPU main thread by
          `WebNNContextProviderImpl::DestroyAllContextsAndKillGpuProcess`[2], a
          race condition can happen in the following scenario:

          dispatch -> handle context lost ->read output tensor


          1. If handling context lost is scheduled after reading output tensor,
          renderer process can still complete reading the tensor with an invalid

          result before the `MLTensor::OnConnectionError`[3] is invoked.

          2. If handling context lost is scheduled before reading output tensor,
          renderer process will throw an error: "Tensor has been destroyed or
          context is lost." in `MLTensor::OnConnectionError`. This is the
          expected behavior.

          WebNN conformance tests can give flaky results. To address this issue,
          this CL decouples `OnLost` from `DestroyAllContextsAndKillGpuProcess`
          to schedule `OnLost` task immediately, ensuring the `OnLost` is invoked
          before reading an output tensor.

          After this change, the context lost reason will not be broadcasted

          across all contexts. The broadcast is unnecessary and may cause a
          security issue where contexts created on different web pages could
          share the context lost reason. Even before this change, the broadcast

          was not guaranteed due to a race condition between killing the GPU
          Bug: 472359963
          Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
          Reviewed-by: Hu, Ningxin <ningx...@intel.com>
          Reviewed-by: Bernhart, Bryan <bryan.b...@intel.com>
          Reviewed-by: Rafael Cintron <rafael....@microsoft.com>
          Commit-Queue: Hu, Ningxin <ningx...@intel.com>
          Cr-Commit-Position: refs/heads/main@{#1567460}
          Files:
          • M services/webnn/dml/context_impl_dml.cc
          • M services/webnn/ort/context_impl_ort.cc
          • M services/webnn/webnn_context_impl.cc
          • M services/webnn/webnn_context_impl.h
          • M services/webnn/webnn_context_provider_impl.cc
          • M services/webnn/webnn_context_provider_impl.h
          Change size: S
          Delta: 6 files changed, 18 insertions(+), 18 deletions(-)
          Branch: refs/heads/main
          Submit Requirements:
          • requirement satisfiedCode-Review: +1 by Hu, Ningxin, +1 by Rafael Cintron, +1 by Bernhart, Bryan
          Open in Gerrit
          Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
          Gerrit-MessageType: merged
          Gerrit-Project: chromium/src
          Gerrit-Branch: main
          Gerrit-Change-Id: If94312dc10178d81b34b12beffa1694f18a0c17a
          Gerrit-Change-Number: 7364029
          Gerrit-PatchSet: 8
          Gerrit-Owner: Xu, Mingming1 <mingmi...@intel.com>
          Gerrit-Reviewer: Bernhart, Bryan <bryan.b...@intel.com>
          Gerrit-Reviewer: Chromium LUCI CQ <chromiu...@luci-project-accounts.iam.gserviceaccount.com>
          Gerrit-Reviewer: Hu, Ningxin <ningx...@intel.com>
          Gerrit-Reviewer: Rafael Cintron <rafael....@microsoft.com>
          Gerrit-Reviewer: Xu, Mingming1 <mingmi...@intel.com>
          open
          diffy
          satisfied_requirement
          Reply all
          Reply to author
          Forward
          0 new messages