| Code-Review | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
When a tool returns a value the rendererNit: Seems like it was truncated.
| Commit-Queue | +1 |
Nit: Seems like it was truncated.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Initial comments - I am not deeply familiar with frame tokens so hopefully the content reviewers can help check the work here 😊
blink::mojom::ScriptToolPtr cloned_tool = t.Clone();I didn't quite follow this; it looks like you are traversing the frame tree to get a token for the frame that registered the tool? Can this be recorded at tool registration time (versus tool retrieval time)?
// The token of the frame that registered this tool.Should this be passed from the renderer or should the browser determine these values from its view of the frame tree? It seems more secure to do it the latter way.
ExecuteScriptTool(blink.mojom.FrameToken tool_owner_frame_token,```suggestion
ExecuteRemoteScriptTool(blink.mojom.FrameToken tool_owner_frame_token,
```
Otherwise it will require extra context to know both the direction and purpose of an `ExecuteScriptTool` IPC in the code
string input_arguments) => (string? result, bool success);How are JS exceptions / Promise rejections represented here?
=> (string? result, bool success);Also apropos my comment above, recommend returning a struct in the callback here so we can extend the calling protocol in the future with less code churn
result->setOrigin(t->origin->ToString());Is it part of the registration contract that tool owner's origin will be shared with any documents in the exposedTo origin list? I suppose a potential caller needs to know what origin they will calling into (like postMessage)
if (success) {Do we need to check if the abort handler was signaled before resolving here?
[CallWith=ScriptState] Promise<DOMString?> executeTool(RegisteredTool tool, DOMString input_arguments, optional ExecuteToolOptions options = {});Nit: can you line wrap or is that not a thing in IDL
blink::mojom::ScriptToolPtr cloned_tool = t.Clone();I didn't quite follow this; it looks like you are traversing the frame tree to get a token for the frame that registered the tool? Can this be recorded at tool registration time (versus tool retrieval time)?
It can't because the token we need to retrieve is caller-specific. Imagine A.com embeds a B.com iframe and a C.com iframe (B and C are siblings). There is only ONE B.com, yet at the same time, there has to be 3 of them:
- One that actually lives in B.com's process, and corresponds with its RFHI
- A "stand-in" of B in A.com's process, i.e., its RemoteFrame/RenderFrameProxyHost
- The exact same thing above, but for C.com
Now imagine A.com calls getTools(). This yields a tool with a `Window` property that points to B.com's RemoteFrame in A.com's process. Same thing with C.com: its view of B.com's tool has a `Window` that points to B.com's RemoteFrame *in C.com's process*. These RemoteFrames are all backed by RenderFrameProxyHosts, and each has a different frame token. So depending on the caller, we need to find the right token that represents B.com's RemoteFrame *in that process*, so that process can resolve the right RemoteFrame/DOMWindow pair.
it looks like you are traversing the frame tree to get a token for the frame that registered the tool?
So in short, we're traversing the frame tree to get a token for the frame that registered the tool, *as the caller sees that frame*, which means it has to be computed at call-time, and can change arbitrarily has frames are added/removed.
// The token of the frame that registered this tool.Should this be passed from the renderer or should the browser determine these values from its view of the frame tree? It seems more secure to do it the latter way.
See the comment/TODO in `model_context_user_data.cc` and https://crbug.com/509568047 which is linked from there. I suppose I could've put all of that here instead, but it felt fitting for the purposes of //content to throw that documentation there, where it's used.
Is it part of the registration contract that tool owner's origin will be shared with any documents in the exposedTo origin list? I suppose a potential caller needs to know what origin they will calling into (like postMessage)
Yeah, that was my plan and what was originally mentioned in my design doc. It helps the caller make better security decisions: ie., you have a chance of knowing you're calling bank.com's "make_transaction()" tool instead of ad-iframe-on-bank.com's "make_transaction()" tool.
Plus just like with postMessage(), when ever someone shares information with you (posts a message to you, or shares a tool with you), you should know who sent the message with you, so you can know what to do with that message (or tool).
When we spec this, we can re-ligitgate it in the WG we'd like, but so far Johann, I, and MSFT people seemed to think it was reasonable on the slides/meeting notes. I'll resolve this for now!
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
blink::mojom::ScriptToolPtr cloned_tool = t.Clone();Dominic FarolinoI didn't quite follow this; it looks like you are traversing the frame tree to get a token for the frame that registered the tool? Can this be recorded at tool registration time (versus tool retrieval time)?
It can't because the token we need to retrieve is caller-specific. Imagine A.com embeds a B.com iframe and a C.com iframe (B and C are siblings). There is only ONE B.com, yet at the same time, there has to be 3 of them:
- One that actually lives in B.com's process, and corresponds with its RFHI
- A "stand-in" of B in A.com's process, i.e., its RemoteFrame/RenderFrameProxyHost
- The exact same thing above, but for C.comNow imagine A.com calls getTools(). This yields a tool with a `Window` property that points to B.com's RemoteFrame in A.com's process. Same thing with C.com: its view of B.com's tool has a `Window` that points to B.com's RemoteFrame *in C.com's process*. These RemoteFrames are all backed by RenderFrameProxyHosts, and each has a different frame token. So depending on the caller, we need to find the right token that represents B.com's RemoteFrame *in that process*, so that process can resolve the right RemoteFrame/DOMWindow pair.
it looks like you are traversing the frame tree to get a token for the frame that registered the tool?
So in short, we're traversing the frame tree to get a token for the frame that registered the tool, *as the caller sees that frame*, which means it has to be computed at call-time, and can change arbitrarily has frames are added/removed.
| Commit-Queue | +1 |
if (tool_owner_frame_token.Is<blink::LocalFrameToken>()) {Dominic FarolinoThis is effectively the logic in `LookupRenderFrameHostOrProxy()` without refactoring that static fn out into something more broad/shared. Happy to do that if needed. Or maybe there's a simpler way to do this altogether...
Done
base::UnguessableToken invocation_id = base::UnguessableToken::Create();Dominic FarolinoRight now these tokens are generated for DevTools
... and I'll need to plumb them to the renderer process, through `ExecuteTool()` so that the renderer can [pass them to DevTools here](https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/script_tools/model_context.cc;l=401;drc=dc85ebf9cd885dca75e2bce71438bd13262bb945). But I'm avoiding doing that for now, since I don't want to add more inspector-specific tests on top of this already-large CL. Will handle in a follow-up.
ExecuteScriptTool(blink.mojom.FrameToken tool_owner_frame_token,```suggestion
ExecuteRemoteScriptTool(blink.mojom.FrameToken tool_owner_frame_token,
```Otherwise it will require extra context to know both the direction and purpose of an `ExecuteScriptTool` IPC in the code
Good call, done.
string input_arguments) => (string? result, bool success);How are JS exceptions / Promise rejections represented here?
I added some documentation here that should answer the question. PTAL! Resolving now but let me know if you'd like more changes.
Do we need to check if the abort handler was signaled before resolving here?
We do not. If the abort handler was signaled, the Promise was already rejected, and further calls to resolve/reject it will be ignored.
[CallWith=ScriptState] Promise<DOMString?> executeTool(RegisteredTool tool, DOMString input_arguments, optional ExecuteToolOptions options = {});Nit: can you line wrap or is that not a thing in IDL
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks, some comments for content/ below.
- Introduces the `executeTool()` method on `ModelContext`Might be helpful to include a snippet of how it's used, or at least point out that it works on window references.
// Associated with the Page, managing pending tool executions.Maybe also mention why this needs page-level tracking, and how this is different from ModelContextUserData which is declared further below?
if (origin.scheme() != url::kHttpsScheme) {Side note that just occurred to me: would this get in the way of testing with http://localhost? Just mentioning in case it's worth addressing at some point (not in this CL).
// TOOD(https://crbug.com/509568047): Stop passing in a frame token and origin
// during tool registration. These values are obvious from context, and the
// renderer shouldn't need to pass them in and have the browser verify them.+1! Thanks for filing the bug; I'm ok with doing this in a followup. (I'm assuming there's a reason why we don't want to do it right away in this CL, since it seems simpler on first glance?)
And I suppose we shouldn't get races here because ModelContextUserData is always scoped per-document, even if we were to somehow bypass RenderDocument and race the renderer IPC with a commit of a different document into the same RFH (e.g., via a corner case like [this one](https://source.chromium.org/chromium/chromium/src/+/main:android_webview/browser/aw_content_browser_client.cc;l=949-951;drc=687984083fbf694a1a50ccb2892cac27a0dbc4ea)).
if (!token) {Alex MoshchukI'm honestly not totally sure when this could happen. At first I thought it could be the result of a race: Document A sends `GetScriptTools()` right as Document B navigates away. But as long as we're iterating through all RFHs under the caller's main frame, all frames should be able to reference each other with a token. So I'm tempted to `CHECK(token)`, but want to check with Alex first.
Correct - I agree that within the same frame tree all frames should be able to find and reference each other, so I think `token` should always exist. Changing this to `CHECK(token)` SGTM.
One corner case question here: is it ok for a frame to execute a tool that it declares itself? And does that case get handled properly? (Seems that it should?)
static_cast<SiteInstanceImpl*>(
render_frame_host().GetSiteInstance())
->group());Consider saving the caller SIG outside the ForEachRenderFrameHostWithAction for clarity and to save a few lookups.
if (tool_owner_frame_token.Is<blink::LocalFrameToken>()) {This is effectively the logic in `LookupRenderFrameHostOrProxy()` without refactoring that static fn out into something more broad/shared. Happy to do that if needed. Or maybe there's a simpler way to do this altogether...
I see the same pattern used in [other places](https://source.chromium.org/chromium/chromium/src/+/main:components/optimization_guide/content/browser/page_content_proto_provider.cc;l=295-302;drc=4f9e0984c01ffe4c405ab9c22add5d101df25729) as well. It's probably ok for the purposes of this CL. We could consider introducing a new helper and doing a separate cleanup for all such cases in a followup.
if (!target_rfh || main_frame != target_rfh->GetMainFrame()) {I agree `!target_rfh` can happen legitimately (e.g., if a frame is detached), but for this part, wouldn't trying to execute a tool in a different FrameTree be something that only a compromised renderer could attempt to do, since it normally wouldn't have a window reference to it? Hmm, then again, I guess it's possible to get a cross-frame-tree window reference in some of these cases, e.g. when postMessaging between embedder and guest in a <webview> tag.
render_frame_host().GetLastCommittedOrigin();nit: move down closer to its first use?
A couple of edge case questions:
base::UnguessableToken invocation_id = base::UnguessableToken::Create();Right now these tokens are generated for DevTools
Acknowledged
execution.target_token = target_rfh->GetDocumentToken();For `caller_token` and `target_token`, it looks like these are used only in the browser process to track the corresponding RFHs. Would it be better to use GlobalRenderFrameHostId (or GlobalRenderFrameHostToken) instead of DocumentToken, just to avoid any potential confusion down the road with what their process is? (Then again, I do also see some advantages of keeping everything done via DocumentTokens - but we need to guarantee that we won't ever resolve them without a process ID whenever they come from the renderer)
const std::optional<std::string>& result, bool success) {In the future, will you need additional handling for cases where the renderer execuring the script tool crashes or takes too long to respond, to avoid blocking the caller forever?
// responds with the result of the tool the asspcoated pending execution istypo
std::move(it->second.callback).Run(std::nullopt, false);Double-checking: if the caller is gone, does it still make sense to invoke the callback for it?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (origin.scheme() != url::kHttpsScheme) {Side note that just occurred to me: would this get in the way of testing with http://localhost? Just mentioning in case it's worth addressing at some point (not in this CL).
Yep, I thought about this too. I think we'll want to use something like `IsOriginPotentiallyTrustworthy()`. I've filed https://crbug.com/509983801 to track this. Thanks.
// TOOD(https://crbug.com/509568047): Stop passing in a frame token and origin
// during tool registration. These values are obvious from context, and the
// renderer shouldn't need to pass them in and have the browser verify them.+1! Thanks for filing the bug; I'm ok with doing this in a followup. (I'm assuming there's a reason why we don't want to do it right away in this CL, since it seems simpler on first glance?)
And I suppose we shouldn't get races here because ModelContextUserData is always scoped per-document, even if we were to somehow bypass RenderDocument and race the renderer IPC with a commit of a different document into the same RFH (e.g., via a corner case like [this one](https://source.chromium.org/chromium/chromium/src/+/main:android_webview/browser/aw_content_browser_client.cc;l=949-951;drc=687984083fbf694a1a50ccb2892cac27a0dbc4ea)).
The only reason I'm not just yet is because we store a local copy of the ScriptTool in the renderer, and my gut tells me I'll have to think through the conversion between the ordinary mojom ScriptTool that Blink and the actor framework use + the "reduced" browser side representation here. I think it's mostly mechanical, but stands to bloat this CL a little more by introducing a new mojo type and refactoring unrelated things around here, as well as I need to think through the implications—if any—on the actor stack's usage of ScriptTool (probably trivial, but TBD).
Thanks!
if (!token) {Alex MoshchukI'm honestly not totally sure when this could happen. At first I thought it could be the result of a race: Document A sends `GetScriptTools()` right as Document B navigates away. But as long as we're iterating through all RFHs under the caller's main frame, all frames should be able to reference each other with a token. So I'm tempted to `CHECK(token)`, but want to check with Alex first.
Correct - I agree that within the same frame tree all frames should be able to find and reference each other, so I think `token` should always exist. Changing this to `CHECK(token)` SGTM.
One corner case question here: is it ok for a frame to execute a tool that it declares itself? And does that case get handled properly? (Seems that it should?)
Instead of `CHECK(token)`, I've just dereferenced the optional above, with a brief comment.
s it ok for a frame to execute a tool that it declares itself?
Yes. If a frame declares a tool and executes it, this is allowed and nothing special really. I'm pretty sure one of the WPTs cover it, but I'll check in the morning. Like `postMessage()`, my vision is to enable direct tool execution for LocalFrame=>LocalFrame, bypassing the browser entirely. But for now, everything gets mediated here, which is actually "simpler" in a sense. So I'll check for the WPT and if one doesn't exist, I'll write one.
static_cast<SiteInstanceImpl*>(
render_frame_host().GetSiteInstance())
->group());Consider saving the caller SIG outside the ForEachRenderFrameHostWithAction for clarity and to save a few lookups.
Good call, done!
if (!target_rfh || main_frame != target_rfh->GetMainFrame()) {I agree `!target_rfh` can happen legitimately (e.g., if a frame is detached), but for this part, wouldn't trying to execute a tool in a different FrameTree be something that only a compromised renderer could attempt to do, since it normally wouldn't have a window reference to it? Hmm, then again, I guess it's possible to get a cross-frame-tree window reference in some of these cases, e.g. when postMessaging between embedder and guest in a <webview> tag.
I think the simplest way would be to grab a `Window` reference from your `window.opener`'s frame tree, right? That's what `third_party/blink/web_tests/external/wpt/webmcp/imperative/executeTool-across-trees.https.html` does, and exercises exactly this condition.
render_frame_host().GetLastCommittedOrigin();nit: move down closer to its first use?
I just removed this and inlined it below, with an argument comment.
// responds with the result of the tool the asspcoated pending execution is| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
void ModelContext::ExecuteScriptTool(const String& name,Applying https://chromium-review.googlesource.com/c/chromium/src/+/7817087 and this CL, the following code rejects with "Tool execution cancelled, since tool definition was updated" error in https://googlechromelabs.github.io/webmcp-tools/demos/french-bistro/ while it should hang until user clicks the submit button instead.
```
const [tool] = await navigator.modelContext.getTools();
await navigator.modelContext.executeTool(tool, '{}');
```
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
=> (string? result, bool success);Mark FoltzAlso apropos my comment above, recommend returning a struct in the callback here so we can extend the calling protocol in the future with less code churn
Elaborating a bit, we may want to pass the rejection value (or some serialized version of it) to the caller in the future; right now if a tool call rejects, there's just a generic error code passed to the caller. A similar change would need to be made to `ExecuteRemoteScriptTool` so having that done once would be a little easier. But this change can be made later as need arises.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
void ModelContext::ExecuteScriptTool(const String& name,Applying https://chromium-review.googlesource.com/c/chromium/src/+/7817087 and this CL, the following code rejects with "Tool execution cancelled, since tool definition was updated" error in https://googlechromelabs.github.io/webmcp-tools/demos/french-bistro/ while it should hang until user clicks the submit button instead.
```
const [tool] = await navigator.modelContext.getTools();
await navigator.modelContext.executeTool(tool, '{}');
```
Everything works correctly with the latest PS in that CL you linked to, so I'll resolve this now. Thanks for reporting!
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
- Introduces the `executeTool()` method on `ModelContext`Might be helpful to include a snippet of how it's used, or at least point out that it works on window references.
Done. For now, the `window` kind of serves two purposes. It gives the consumer of a tool some insight into who in their frame tree actually owns this tool, but practically speaking from the browser's perspective, it exists so that w can get a FrameToken from it, and locate the tool for execution. I've commented on the latter here.
// Associated with the Page, managing pending tool executions.Maybe also mention why this needs page-level tracking, and how this is different from ModelContextUserData which is declared further below?
Done. I realize it is not strictly necessary to have this class, but I think I've explained the conceptual convenience it provides.
if (!target_rfh || main_frame != target_rfh->GetMainFrame()) {Dominic FarolinoI agree `!target_rfh` can happen legitimately (e.g., if a frame is detached), but for this part, wouldn't trying to execute a tool in a different FrameTree be something that only a compromised renderer could attempt to do, since it normally wouldn't have a window reference to it? Hmm, then again, I guess it's possible to get a cross-frame-tree window reference in some of these cases, e.g. when postMessaging between embedder and guest in a <webview> tag.
I think the simplest way would be to grab a `Window` reference from your `window.opener`'s frame tree, right? That's what `third_party/blink/web_tests/external/wpt/webmcp/imperative/executeTool-across-trees.https.html` does, and exercises exactly this condition.
(Will resolve this now, since I think it's not actionable, and is full tested).
A couple of edge case questions:
- should there be a check that `target_rfh` is active and not pending deletion? We probably wouldn't want to run a tool in a RFH that's running unload handlers and about to be destroyed, even if such a RFH might still have its DocumentUserData.
- what would happen if the whole page enters bfcache just before we receive this IPC? Should there be a `IsInactiveAndDisallowActivation` check?
- what happens if the page enters bfcache during the tool execution?
- If `target_rfh` is an OOPIF that crashed, I think it will still [keep its DocumentUserData](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_host_impl.cc;l=18644-18645;drc=b4528fe64d8704dc89748f6e371c7f97b13da16f) until it's recreated. Should we avoid trying to send a request to `target_rfh` if it's not live (checking with `IsRenderFrameLive()`)?
Thanks for pulling on these. Answers below, but also more questions.
- Pending deletion: I'm not sure how to evaluate this. It seems sensible to not want to run tools then, but [from what I can tell](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_proxy_host.cc;l=499-510;drc=54af12075326fb3b52088fcf8c1a34d6fb264f65), `postMessage()` has no special checks for this? What if we did run tools (or send message events), could it delay the document's deletion? I assume not, because there is a timeout that would ultimately win and kill the document, right? I guess my inclination is to leave this be unless it's very common to gate document-bound things on this check, but let me know what you think.
- Bf-cache: I also don't see an `IsInactiveAndDisallowActivation()` check for `postMessage()`, but maybe I looked in the wrong place. I have to plead ignorance on this. I assumed that since iframes don't individually go in the bf-cache, we'd be avoiding all special handling of bf-cache across frame trees, since the entire tree would be cached at once. The renderer handles bf-cache in that it does fully active checks before sending IPCs to the browser process, but I'm not sure what would happen if the race is as you propose, and the browser process is about to try and invokes a tool while in the bf-cache. Would this IPC make it through, or be queued?
- There's no special handling for entering the bf-cache during tool execution, so my hope is that things just get suspended, and the IPC to resolve the caller's Promise will just make it through once we're out of the bf-cache. Man, now I'm really wondering if there is a good resource that has the answer to questions like "what happens with script that is running when the page enters the bf-cache? can Promises resolve? Can IPCs be sent? Both ways? What gets queued/frozen and what doesn't?
- Regarding OOPIF crash: My understanding is that the is-live check matters more for when some state in the browser depends on the live-ness of the render frame, and less (perhaps not at all?) when it's just used to sanity check "is this IPC actually going to go anywhere?. This is especially true since the remote is still bound even if it's "disconnected", making it safe to always call this. So in general, and especially here since this is document-scoped and there's no risk of this ever being sent to the wrong document, and us having to catch the dead period between documents, I prefer leaving the consumer blind to the "connection" status of this remote and the document on the other side. Let me know if you feel otherwise.
/*caller_origin=*/render_frame_host().GetLastCommittedOrigin())) {
std::move(callback).Run(std::nullopt, false);Please fix this WARNING reported by ClangTidy: check: bugprone-argument-comment
argument name 'caller_origin' in comment does ...
check: bugprone-argument-comment
argument name 'caller_origin' in comment does not match parameter name 'target_origin' (https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html)
(Note: You can add `Skip-Clang-Tidy-Checks: bugprone-argument-comment` footer to the CL description to skip the check)
(Lint observed on `linux-clang-tidy-rel`, but not on `android-clang-tidy-rel`)
const std::optional<std::string>& result, bool success) {In the future, will you need additional handling for cases where the renderer execuring the script tool crashes or takes too long to respond, to avoid blocking the caller forever?
By "additional handling", do you mean cases *besides* where the `ModelContextUserData` is destroyed, and we manually "cancel" the invoker's execution callback? You might be driving at the fact that `DocumentUserData` is not destroyed during OOPIF crashes (which I learned in your comment above), and therefore callers are not correctly notified (i.e., their Promises rejected), right? I think I have two responses:
- From a "surfacing the right reason" perspective, https://crbug.com/509555636 tracks codifying a more list of these reasons and exposing them to the caller with more structure.
- From a "did we catch all the cases" perspective, it seems like we do need special handling for process crashes that the DocumentUserData destructor is not privy to. I'd like to do that in a follow-up if possible, since it's not crucial for our OT. If that's cool with you, I'll file a bug. Otherwise, I don't think we'll implement a custom timeout. Any other cases we can think of are probably just gaps in the current implementation.
std::move(it->second.callback).Run(std::nullopt, false);Double-checking: if the caller is gone, does it still make sense to invoke the callback for it?
If the caller is gone, then this does nothing. Checking specially for that case and dropping the callback without running it should be safe—it shouldn't DCHECK in mojo since the callback's receiver is disconnected. So that'd also do nothing, but it's more work. So I slightly prefer this.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks! Responded on the thread about edge cases.
// Associated with the Page, managing pending tool executions.Dominic FarolinoMaybe also mention why this needs page-level tracking, and how this is different from ModelContextUserData which is declared further below?
Done. I realize it is not strictly necessary to have this class, but I think I've explained the conceptual convenience it provides.
Thanks - that's a great explanation.
// TOOD(https://crbug.com/509568047): Stop passing in a frame token and origin
// during tool registration. These values are obvious from context, and the
// renderer shouldn't need to pass them in and have the browser verify them.Dominic Farolino+1! Thanks for filing the bug; I'm ok with doing this in a followup. (I'm assuming there's a reason why we don't want to do it right away in this CL, since it seems simpler on first glance?)
And I suppose we shouldn't get races here because ModelContextUserData is always scoped per-document, even if we were to somehow bypass RenderDocument and race the renderer IPC with a commit of a different document into the same RFH (e.g., via a corner case like [this one](https://source.chromium.org/chromium/chromium/src/+/main:android_webview/browser/aw_content_browser_client.cc;l=949-951;drc=687984083fbf694a1a50ccb2892cac27a0dbc4ea)).
The only reason I'm not just yet is because we store a local copy of the ScriptTool in the renderer, and my gut tells me I'll have to think through the conversion between the ordinary mojom ScriptTool that Blink and the actor framework use + the "reduced" browser side representation here. I think it's mostly mechanical, but stands to bloat this CL a little more by introducing a new mojo type and refactoring unrelated things around here, as well as I need to think through the implications—if any—on the actor stack's usage of ScriptTool (probably trivial, but TBD).
Thanks!
Ack
SiteInstanceGroup* site_instance_group =I think we can go further and move this outside the iteration on line 190 as well?
if (!target_rfh || main_frame != target_rfh->GetMainFrame()) {Dominic FarolinoI agree `!target_rfh` can happen legitimately (e.g., if a frame is detached), but for this part, wouldn't trying to execute a tool in a different FrameTree be something that only a compromised renderer could attempt to do, since it normally wouldn't have a window reference to it? Hmm, then again, I guess it's possible to get a cross-frame-tree window reference in some of these cases, e.g. when postMessaging between embedder and guest in a <webview> tag.
Dominic FarolinoI think the simplest way would be to grab a `Window` reference from your `window.opener`'s frame tree, right? That's what `third_party/blink/web_tests/external/wpt/webmcp/imperative/executeTool-across-trees.https.html` does, and exercises exactly this condition.
(Will resolve this now, since I think it's not actionable, and is full tested).
Doh, you're right about this obviously being the case with openers, and thanks for having a test for it. :) It might be helpful to mention that as an example in the comment to help motivate this check.
It looks like opener support is something you're [considering](https://docs.google.com/document/d/1ycdzuXA-VE8lRDFSArh0Um3PChHV0Hq6Om1MSMG8qPE/edit?tab=t.0#bookmark=id.p895sonheo0) post-OT? I agree it seems a bit weird if a frame can use a particular tool but its same-origin opener can't, though adding this support would have implications on things like tracking tool invocations with PageUserData vs something scoped to BCGs.
Dominic FarolinoA couple of edge case questions:
- should there be a check that `target_rfh` is active and not pending deletion? We probably wouldn't want to run a tool in a RFH that's running unload handlers and about to be destroyed, even if such a RFH might still have its DocumentUserData.
- what would happen if the whole page enters bfcache just before we receive this IPC? Should there be a `IsInactiveAndDisallowActivation` check?
- what happens if the page enters bfcache during the tool execution?
- If `target_rfh` is an OOPIF that crashed, I think it will still [keep its DocumentUserData](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_host_impl.cc;l=18644-18645;drc=b4528fe64d8704dc89748f6e371c7f97b13da16f) until it's recreated. Should we avoid trying to send a request to `target_rfh` if it's not live (checking with `IsRenderFrameLive()`)?
Thanks for pulling on these. Answers below, but also more questions.
- Pending deletion: I'm not sure how to evaluate this. It seems sensible to not want to run tools then, but [from what I can tell](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_proxy_host.cc;l=499-510;drc=54af12075326fb3b52088fcf8c1a34d6fb264f65), `postMessage()` has no special checks for this? What if we did run tools (or send message events), could it delay the document's deletion? I assume not, because there is a timeout that would ultimately win and kill the document, right? I guess my inclination is to leave this be unless it's very common to gate document-bound things on this check, but let me know what you think.
- Bf-cache: I also don't see an `IsInactiveAndDisallowActivation()` check for `postMessage()`, but maybe I looked in the wrong place. I have to plead ignorance on this. I assumed that since iframes don't individually go in the bf-cache, we'd be avoiding all special handling of bf-cache across frame trees, since the entire tree would be cached at once. The renderer handles bf-cache in that it does fully active checks before sending IPCs to the browser process, but I'm not sure what would happen if the race is as you propose, and the browser process is about to try and invokes a tool while in the bf-cache. Would this IPC make it through, or be queued?
- There's no special handling for entering the bf-cache during tool execution, so my hope is that things just get suspended, and the IPC to resolve the caller's Promise will just make it through once we're out of the bf-cache. Man, now I'm really wondering if there is a good resource that has the answer to questions like "what happens with script that is running when the page enters the bf-cache? can Promises resolve? Can IPCs be sent? Both ways? What gets queued/frozen and what doesn't?
- Regarding OOPIF crash: My understanding is that the is-live check matters more for when some state in the browser depends on the live-ness of the render frame, and less (perhaps not at all?) when it's just used to sanity check "is this IPC actually going to go anywhere?. This is especially true since the remote is still bound even if it's "disconnected", making it safe to always call this. So in general, and especially here since this is document-scoped and there's no risk of this ever being sent to the wrong document, and us having to catch the dead period between documents, I prefer leaving the consumer blind to the "connection" status of this remote and the document on the other side. Let me know if you feel otherwise.
Thanks for pulling on these. Answers below, but also more questions.
- Pending deletion: I'm not sure how to evaluate this. It seems sensible to not want to run tools then, but [from what I can tell](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_proxy_host.cc;l=499-510;drc=54af12075326fb3b52088fcf8c1a34d6fb264f65), `postMessage()` has no special checks for this? What if we did run tools (or send message events), could it delay the document's deletion? I assume not, because there is a timeout that would ultimately win and kill the document, right? I guess my inclination is to leave this be unless it's very common to gate document-bound things on this check, but let me know what you think.
I suppose it's fair to compare to postMessage, which intentionally supports pending deletion frames. If the target_rfh is pending deletion, the ExecuteScriptTool will never execute for it, since the document will get destroyed as soon as the unload handler finishes executing, which will also clean up the RFH and DocumentUserData and invoke CancelPendingScriptToolExecutionsForDocument - so I think this still works/fails correctly, and checking for it here would mostly be an optimization to avoid an unnecessary IPC to the renderer and waiting for the unload ack.
If the caller is pending deletion (e.g., sends an executeTool from its unload handler), it won't ever receive the callback, but we probably still want to run the tool, so that part seems fine as well.
The biggest danger with pending deletion is acting on an incorrect RFH (current RFH in the FTN instead of a pending deletion frame). The only place where this might happen here is with the RenderFrameHost::FromPlaceholderToken() path which uses current_frame_host(). The tool/origin validation on target_rfh helps catch most bad things, though I do wonder about the following scenario:
Is that something we should prevent? Compared to postMessage, there isn't a targetOrigin param or check [equivalent to this](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_proxy_host.cc;l=522;drc=54af12075326fb3b52088fcf8c1a34d6fb264f65) to not deliver the message to an unrelated document.
> - Bf-cache: I also don't see an `IsInactiveAndDisallowActivation()` check for `postMessage()`, but maybe I looked in the wrong place. I have to plead ignorance on this. I assumed that since iframes don't individually go in the bf-cache, we'd be avoiding all special handling of bf-cache across frame trees, since the entire tree would be cached at once. The renderer handles bf-cache in that it does fully active checks before sending IPCs to the browser process, but I'm not sure what would happen if the race is as you propose, and the browser process is about to try and invokes a tool while in the bf-cache. Would this IPC make it through, or be queued?
I think you're right and it should work as is - the page and all frames should get frozen; so I'd expect that if we forward the IPC to the renderer, it would get queued there until the page is restored. This seems to work with postMessage and should work similarly for this, though in a followup it would be good to add a test to make sure.
- There's no special handling for entering the bf-cache during tool execution, so my hope is that things just get suspended, and the IPC to resolve the caller's Promise will just make it through once we're out of the bf-cache. Man, now I'm really wondering if there is a good resource that has the answer to questions like "what happens with script that is running when the page enters the bf-cache? can Promises resolve? Can IPCs be sent? Both ways? What gets queued/frozen and what doesn't?
This is probably going to be similar to the case above. It does seem like these these IPCs don't warrant throwing a page out of BFCache (there are no incompatible state changes, and the callback can wait as long as needed), but might be worth double-checking with rakina@ or fergal@ at some point to make sure.
- Regarding OOPIF crash: My understanding is that the is-live check matters more for when some state in the browser depends on the live-ness of the render frame, and less (perhaps not at all?) when it's just used to sanity check "is this IPC actually going to go anywhere?. This is especially true since the remote is still bound even if it's "disconnected", making it safe to always call this. So in general, and especially here since this is document-scoped and there's no risk of this ever being sent to the wrong document, and us having to catch the dead period between documents, I prefer leaving the consumer blind to the "connection" status of this remote and the document on the other side. Let me know if you feel otherwise.
Makes sense - from my perspective, I mostly want to double-check that we don't get stuck never calling the callback if the target_rfh is non-live. Would `ExecuteScriptTool`'s callback get called if it's invoked on a remote that's disconnected? FWIW, postMessage does have a corresponding check [here](https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/render_frame_proxy_host.cc;l=508;drc=54af12075326fb3b52088fcf8c1a34d6fb264f65), and there's also things like [this](https://source.chromium.org/chromium/chromium/src/+/main:mojo/public/cpp/bindings/callback_helpers.h;l=15;drc=54af12075326fb3b52088fcf8c1a34d6fb264f65) which make it seem like the callback might not run?
const std::optional<std::string>& result, bool success) {Dominic FarolinoIn the future, will you need additional handling for cases where the renderer execuring the script tool crashes or takes too long to respond, to avoid blocking the caller forever?
By "additional handling", do you mean cases *besides* where the `ModelContextUserData` is destroyed, and we manually "cancel" the invoker's execution callback? You might be driving at the fact that `DocumentUserData` is not destroyed during OOPIF crashes (which I learned in your comment above), and therefore callers are not correctly notified (i.e., their Promises rejected), right? I think I have two responses:
- From a "surfacing the right reason" perspective, https://crbug.com/509555636 tracks codifying a more list of these reasons and exposing them to the caller with more structure.
- From a "did we catch all the cases" perspective, it seems like we do need special handling for process crashes that the DocumentUserData destructor is not privy to. I'd like to do that in a follow-up if possible, since it's not crucial for our OT. If that's cool with you, I'll file a bug. Otherwise, I don't think we'll implement a custom timeout. Any other cases we can think of are probably just gaps in the current implementation.
Yes, I was thinking of whether we'd need to handle RenderProcessGone for OOPIF crashes (since DUD indeed stays around in that case) or add a timer similar to the beforeunload/unload timers to put a limit on how long tool invocations can take. All of this is fine to explore later. :)
std::move(it->second.callback).Run(std::nullopt, false);Dominic FarolinoDouble-checking: if the caller is gone, does it still make sense to invoke the callback for it?
If the caller is gone, then this does nothing. Checking specially for that case and dropping the callback without running it should be safe—it shouldn't DCHECK in mojo since the callback's receiver is disconnected. So that'd also do nothing, but it's more work. So I slightly prefer this.
I think we can go further and move this outside the iteration on line 190 as well?
Oh duh, that's definitely what I meant to do /facepalm. Done.
if (!target_rfh || main_frame != target_rfh->GetMainFrame()) {Dominic FarolinoI agree `!target_rfh` can happen legitimately (e.g., if a frame is detached), but for this part, wouldn't trying to execute a tool in a different FrameTree be something that only a compromised renderer could attempt to do, since it normally wouldn't have a window reference to it? Hmm, then again, I guess it's possible to get a cross-frame-tree window reference in some of these cases, e.g. when postMessaging between embedder and guest in a <webview> tag.
Dominic FarolinoI think the simplest way would be to grab a `Window` reference from your `window.opener`'s frame tree, right? That's what `third_party/blink/web_tests/external/wpt/webmcp/imperative/executeTool-across-trees.https.html` does, and exercises exactly this condition.
Alex Moshchuk(Will resolve this now, since I think it's not actionable, and is full tested).
Doh, you're right about this obviously being the case with openers, and thanks for having a test for it. :) It might be helpful to mention that as an example in the comment to help motivate this check.
It looks like opener support is something you're [considering](https://docs.google.com/document/d/1ycdzuXA-VE8lRDFSArh0Um3PChHV0Hq6Om1MSMG8qPE/edit?tab=t.0#bookmark=id.p895sonheo0) post-OT? I agree it seems a bit weird if a frame can use a particular tool but its same-origin opener can't, though adding this support would have implications on things like tracking tool invocations with PageUserData vs something scoped to BCGs.
though adding this support would have implications on things like tracking tool invocations with PageUserData vs something scoped to BCGs.
Precisely why I've deferred so far, heh.
----
Added some documentation above the main-frame check! Thanks for recommending.
Is that something we should prevent? Compared to postMessage, there isn't a targetOrigin param or check equivalent to this to not deliver the message to an unrelated document.
Ah, so the RFPH resolves correctly, but points to the wrong RFHI. I see, very interesting. What would be the way to mitigate this? It wouldn't be a pending deletion check, since `current_frame_host()` has already moved on and we don't have access to A's RFHI that *is* in the pending deletion state. I guess we just need the renderer to supply the origin that it believes is hosting the tool, in this IPC, so that we can perform the targetOrigin equivalent. That's easy enough, will do it in the morning!
but might be worth double-checking with rakina@ or fergal@ at some point to make sure.
Regarding bf-cache, sounds good. I've filed https://crbug.com/510487685 to remind me to write some bf-cache tests, and I'll be sure to involve rakina@ and/or fergal@.
FWIW, postMessage does have a corresponding check here, and there's also things like this which make it seem like the callback might not run?
I'll check this out. I was under the impression that a detached document would have its DocumentUserData destroyed, so execution would fail either immediately because the target document wasn't found by the time we wound up here, or shortly after, when the target DUD *does* get destroyed. I'll write a test for this to confirm and report back.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
Alright, I've addressed the pending deletion edge case by passing the `RegisteredTool`'s origin over the `ExecuteRemoteScriptTool()` IPC, and running a target-origin check against the resolved RFHI. Also added some documentation, and a test for where `origin` is not passed in `RegisteredTool` in `executeTool()`. Note that this does break the `external/wpt/webmcp/imperative/opaque-origin-tools.https.html` test that I added; it basically makes it impossible to execute your own tools if you're in an opaque origin, since you'd always be passing `"null"` to `executeTool()` which won't match anything in the browser process. I think this is OK for now since we can solve this in the future by just exposing a proper `Tool` interface that internally keeps the real origin of the tool, even if it's opaque, that it doesn't have to get serialized to a string. Then we'll retain opaque origin support.
I haven't added a test for the browser check, since it seems super subtle/racy, but if you have a good one I can mimic, I can try and add one.
Quick question though: if the RFHI that a RFPH represents is pending deletion, is this represented on the RFPH somewhere, and we could just check some state on the RFPH itself without even having to grab its FTN's current host, only to realize that the two are sort of out-of-sync? I'm fine with the current check, especially since it matches postMessage(). I'm just asking for my own knowledge.
----
Regarding detached documents: I've added two tests in `executeTool-target-detachment.https.html` which detach the target frame just before tool execution, and *during* execution. Both reflect what I suspected, which is that the execute callback is invoked and the caller's Promise rejects properly.
----
Alright, I think everything here has been addressed or filed for a follow-up, so I'll resolve this! Lmk if you think I missed anything.
execution.target_token = target_rfh->GetDocumentToken();For `caller_token` and `target_token`, it looks like these are used only in the browser process to track the corresponding RFHs. Would it be better to use GlobalRenderFrameHostId (or GlobalRenderFrameHostToken) instead of DocumentToken, just to avoid any potential confusion down the road with what their process is? (Then again, I do also see some advantages of keeping everything done via DocumentTokens - but we need to guarantee that we won't ever resolve them without a process ID whenever they come from the renderer)
just to avoid any potential confusion down the road with what their process is?
Hmm, I'm not sure I understand the trade-offs. How would we benefit from using global tokens instead of DocumentTokens? Is it just to make resolving these into actual RFHIs easier, so that we don't have to grab the child process ID from somewhere to do so? I suppose that makes sense, but I'd probably lean towards holding off until the need to resolve these deeper arises, if it ever does. Does that sound OK?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Thanks, content/ LGTM with some last notes!
Alright, I've addressed the pending deletion edge case by passing the `RegisteredTool`'s origin over the `ExecuteRemoteScriptTool()` IPC, and running a target-origin check against the resolved RFHI. Also added some documentation, and a test for where `origin` is not passed in `RegisteredTool` in `executeTool()`. Note that this does break the `external/wpt/webmcp/imperative/opaque-origin-tools.https.html` test that I added; it basically makes it impossible to execute your own tools if you're in an opaque origin, since you'd always be passing `"null"` to `executeTool()` which won't match anything in the browser process. I think this is OK for now since we can solve this in the future by just exposing a proper `Tool` interface that internally keeps the real origin of the tool, even if it's opaque, that it doesn't have to get serialized to a string. Then we'll retain opaque origin support.
I haven't added a test for the browser check, since it seems super subtle/racy, but if you have a good one I can mimic, I can try and add one.
Thanks! It's good to add a test eventually, and it might be easier with a browsertest where it's easier to get the right timing. We typically use things like RFH::DoNotDeleteForTesting() to keep the old RFH around in pending deletion state indefinitely, and then I think you can get the right timing if you use executeTool() that targets A in an unload handler of B in a A(B) setup, and then navigating A to C. Fine to not do it in this CL.
Quick question though: if the RFHI that a RFPH represents is pending deletion, is this represented on the RFPH somewhere, and we could just check some state on the RFPH itself without even having to grab its FTN's current host, only to realize that the two are sort of out-of-sync? I'm fine with the current check, especially since it matches postMessage(). I'm just asking for my own knowledge.
Unfortunately, the RFPH doesn't track what document it was originally created for - e.g., for A(B), the subframe's RemoteFrame in A's process stays the same as B navigates to C and then to D. So we can't really tell purely from a RemoteFrameToken when the current document has changed and that the old one is pending deletion; we'd need to separately track the subframe's current RFH.
----Regarding detached documents: I've added two tests in `executeTool-target-detachment.https.html` which detach the target frame just before tool execution, and *during* execution. Both reflect what I suspected, which is that the execute callback is invoked and the caller's Promise rejects properly.
Ah, those are for detached cases, rather than crashed cases, though. I was worried more about an OOPIF crashing rather than getting detached, which probably requires a browsertest to check. It's a minor issue, so feel free to leave it out of this CL.
FWIW, here is a Gemini-provided test (will need work to not be a security exploit test, do more of this in JS, etc) which I ran locally and it seems to show that the callback will hang unless we add a `!target_rfh->IsRenderFrameLive()` check to fail explicitly:
```
// Test that executing a tool on a target frame that has crashed does not hang
// the caller's callback (and by extension, the caller's JS Promise).
class SecurityExploitBrowserTestWebMCPEnabledHttps
: public SecurityExploitBrowserTestWebMCPEnabled {
public:
void SetUpOnMainThread() override {
host_resolver()->AddRule("*", "127.0.0.1");
https_server()->StartAcceptingConnections();
}
void SetUpCommandLine(base::CommandLine* command_line) override {
https_server()->AddDefaultHandlers(GetTestDataFilePath());
https_server()->ServeFilesFromSourceDirectory(GetTestDataFilePath());
https_server()->SetSSLConfig(net::EmbeddedTestServer::CERT_TEST_NAMES);
SetupCrossSiteRedirector(https_server());// Initialize and listen before calling base class, as we want HTTPS
// to handle the traffic.
ASSERT_TRUE(https_server()->InitializeAndListen());
}
net::EmbeddedTestServer* https_server() { return &https_server_; }private:
net::EmbeddedTestServer https_server_{net::EmbeddedTestServer::TYPE_HTTPS};
};
IN_PROC_BROWSER_TEST_F(SecurityExploitBrowserTestWebMCPEnabledHttps,
ExecuteToolOnCrashedTargetDoesNotHang) {
// 1. Navigate main frame to a.com.
GURL main_url(https_server()->GetURL("a.test", "/title1.html"));
EXPECT_TRUE(NavigateToURL(shell(), main_url));
RenderFrameHostImpl* main_rfh = static_cast<RenderFrameHostImpl*>(
shell()->web_contents()->GetPrimaryMainFrame());
// 2. Create an OOPIF on b.com.
GURL child_url(https_server()->GetURL("b.test", "/title2.html"));
TestNavigationObserver nav_observer(shell()->web_contents());
EXPECT_TRUE(ExecJs(main_rfh,
JsReplace("let iframe = document.createElement('iframe');"
"iframe.allow = 'tools *';"
"iframe.src = $1;"
"document.body.appendChild(iframe);",
child_url)));
nav_observer.Wait();
RenderFrameHostImpl* child_rfh = main_rfh->child_at(0)->current_frame_host();
EXPECT_NE(main_rfh->GetSiteInstance(), child_rfh->GetSiteInstance());
// 3. Register a tool in the OOPIF, exposed to the main frame's origin.
EXPECT_TRUE(ExecJs(
child_rfh, JsReplace("navigator.modelContext.registerTool({"
" name: 'test-tool',"
" description: 'desc',"
" execute: async () => 'result'"
"}, { exposedTo: [$1] });",
main_rfh->GetLastCommittedOrigin().Serialize())));
// Force a round trip to ensure the registration IPC was processed by the
// browser before we crash the renderer.
EXPECT_TRUE(ExecJs(child_rfh, "navigator.modelContext.getTools()"));
// 4. Simulate a crash in the OOPIF renderer.
RenderProcessHostWatcher crash_observer(
child_rfh->GetProcess(),
RenderProcessHostWatcher::WATCH_FOR_PROCESS_EXIT);
child_rfh->GetProcess()->Shutdown(0);
crash_observer.Wait();
EXPECT_FALSE(child_rfh->IsRenderFrameLive());
// 5. Connect to the ModelContextHost on the main frame.
mojo::Remote<blink::mojom::ModelContextHost> script_tool_host;
main_rfh->BindModelContextHost(script_tool_host.BindNewPipeAndPassReceiver());
base::RunLoop run_loop;
bool callback_called = false;
bool execution_success = true;
// 6. Try to execute a tool targeting the crashed OOPIF.
RenderFrameProxyHost* proxy =
child_rfh->browsing_context_state()->GetRenderFrameProxyHost(
main_rfh->GetSiteInstance()->group());
ASSERT_TRUE(proxy);
script_tool_host->ExecuteRemoteScriptTool(
proxy->GetFrameToken(), child_rfh->GetLastCommittedOrigin(), "test-tool",
"[]",
base::BindLambdaForTesting(
[&](const std::optional<std::string>& result, bool success) {
callback_called = true;
execution_success = success;
run_loop.Quit();
}));
// Add a fallback timeout. If the bug exists, Mojo destroys the callback
// without running it, so `run_loop.Quit()` is never called, and this
// timeout will force the loop to exit.
base::SingleThreadTaskRunner::GetCurrentDefault()->PostDelayedTask(
FROM_HERE, run_loop.QuitClosure(), base::Seconds(2));
run_loop.Run();
EXPECT_TRUE(callback_called)
<< "The callback was dropped without being invoked! "
<< "This means the JS Promise hangs forever.";
EXPECT_FALSE(execution_success);
}
```
----Alright, I think everything here has been addressed or filed for a follow-up, so I'll resolve this! Lmk if you think I missed anything.
execution.target_token = target_rfh->GetDocumentToken();Dominic FarolinoFor `caller_token` and `target_token`, it looks like these are used only in the browser process to track the corresponding RFHs. Would it be better to use GlobalRenderFrameHostId (or GlobalRenderFrameHostToken) instead of DocumentToken, just to avoid any potential confusion down the road with what their process is? (Then again, I do also see some advantages of keeping everything done via DocumentTokens - but we need to guarantee that we won't ever resolve them without a process ID whenever they come from the renderer)
just to avoid any potential confusion down the road with what their process is?
Hmm, I'm not sure I understand the trade-offs. How would we benefit from using global tokens instead of DocumentTokens? Is it just to make resolving these into actual RFHIs easier, so that we don't have to grab the child process ID from somewhere to do so? I suppose that makes sense, but I'd probably lean towards holding off until the need to resolve these deeper arises, if it ever does. Does that sound OK?
Yes, it's fine to keep this as is. :)
This is more about defense-in-depth to protect against inadvertently grabbing an unverified DocumentToken from a renderer process (if that were to happen in the future) and resolving it to a RFH without validating the process (allowing a renderer to target a RFH it shouldn't have access to). In terms of typical usage, I think it's generally more common to track RFHs via those global IDs, whereas DocumentTokens are typically used to identify documents in IPCs to/from the renderer process. Given that you aren't resolving `caller_token` and `target_token` back to RFHs at the moment, this is fine, but let's revisit if that ever changes.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Exportable changes to web-platform-tests were detected in this CL and a pull request in the upstream repo has been made: https://github.com/web-platform-tests/wpt/pull/59740.
When this CL lands, the bot will automatically merge the PR on GitHub if the required GitHub checks pass; otherwise, ecosystem-infra@ team will triage the failures and may contact you.
WPT Export docs:
https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md#Automatic-export-process
| Code-Review | +1 |
Nice CL :)
It occurs to me that we will need to reconcile executeTool() with the cross-document tool calling flow in actor, to allow execution of tools that navigate. Happy to chat offline about it.
| Commit-Queue | +2 |
Nice CL :)
It occurs to me that we will need to reconcile executeTool() with the cross-document tool calling flow in actor, to allow execution of tools that navigate. Happy to chat offline about it.
Yep, I'm already starting to think about this...
WebMCP: 3/3 Cross-origin iframe tool execution and safety hardening
This CL implements cross-origin iframe tool execution via the
`executeTool()` method on the `ModelContext` interface.
Major changes introduced in this CL:
- Introduces the `executeTool()` method on `ModelContext`. To use it,
first you must obtain a `RegisteredTool` dictionary from
`navigator.modelContext.getTools()`. Then, you can pass a tool
directly as the first argument of `executeTool()`:
```
const [tool] = await navigator.modelContext.getTools();
navigator.modelContext.executeTool(tool, args);
```
The contents of `RegisteredTool` are mostly to provide identifiable
information about the tool to invokers; but from the browser's
perspective, the tool's `name` and `window` (which the browser
gets a FrameToken from) are used to uniquely identify the tool.
- Modifies the `RegisteredTool` dictionary definition to expose a tool
host's `Window` and `origin`. The Window/name combination uniquely
identify the tool in the frame tree.
- Introduces a `RegisteredToolDeprecated` reduced dictionary for the
ModelContextTesting API to use, so that legacy interface's surface
area remains unchanged after this CL. This keeps ModelContextTesting
operating narrowly on its own Document, and unconcerned with
cross-document tools.
- Performs tool execution with two new IPCs:
- `ExecuteScriptTool()` on ModelContextHost, for the renderer to
request that the browser invokes a tool in another document. It
locates the tool by the frame token that the caller knows the tool
host by and requests the hosting document to invoke the tool.
- `ExecuteScriptTool()` on ModelContext, for the browser to tell a
renderer that owns a tool to invoke it.
- Both IPCs have callbacks: the latter's callback lets the tool host
gives the tool response to the browser, and the former's callback
lets the browser forward it back to the caller.
- Tracks pending tool executions with `ModelContextPageUserData`
- When a tool is executed, the invoker's callback is stored in this
Page-scoped data structure. Then, when a document hosting a
still-running tool gets destroyed, this data structure is
consulted to reject the Promise in the invoker's document.
For a sense of remaining work that needs to be done for the iframe use
case, see:
- https://crbug.com/509555636
- https://crbug.com/509568047
- https://crbug.com/508306795
- https://crbug.com/508285989
- http://b/481899636
R=mfoltz
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
The WPT PR for this CL has been merged upstream! https://github.com/web-platform-tests/wpt/pull/59740
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |