[PromptAPI] Tool use IDL changes (v2 - no execution of tool by blink) [chromium/src : main]

Frank Li (Gerrit)

Oct 29, 2025, 3:48:45 AM
to AyeAye, Chromium LUCI CQ, Chromium Metrics Reviews, chromium...@chromium.org, Kentaro Hara, Raphael Kubo da Costa, asvitkine...@chromium.org, blink-re...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, chrome-intell...@chromium.org, chrome-intelligence-te...@google.com, ipc-securi...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org

Frank Li added 1 comment

Commit Message
Line 10, Patchset 1:an agentic loop involving the JS App to recieve ToolCalls, execute
Frank Li . resolved

Please fix this WARNING reported by Spellchecker: "recieve" is a possible misspelling of "receive".

To bypass Spellchecker, add a footer with DISABLE_SPELLCHECKER

Attention set is empty
Submit Requirements:
  • Code-Coverage: satisfied
  • Code-Owners: not satisfied
  • Code-Review: not satisfied
  • Review-Enforcement: not satisfied
Gerrit-MessageType: comment
Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I066a4e5515d127f26bce0a276eec60a3dc809a6a
Gerrit-Change-Number: 7092943
Gerrit-PatchSet: 7
Gerrit-Owner: Frank Li <fra...@microsoft.com>
Gerrit-Reviewer: Frank Li <fra...@microsoft.com>
Gerrit-CC: Chromium Metrics Reviews <chromium-met...@google.com>
Gerrit-CC: Kentaro Hara <har...@chromium.org>
Gerrit-CC: Raphael Kubo da Costa <ku...@igalia.com>
Gerrit-Comment-Date: Wed, 29 Oct 2025 07:48:33 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No

Frank Li (Gerrit)

Oct 29, 2025, 2:02:13 PM
Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

Frank Li added 1 comment

Patchset-level comments
File-level comment, Patchset 9 (Latest):
Frank Li . resolved

This IDL change ('Tool use' v2) is ready for review.

Thanks!

Gerrit-MessageType: comment
Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I066a4e5515d127f26bce0a276eec60a3dc809a6a
Gerrit-Change-Number: 7092943
Gerrit-PatchSet: 9
Gerrit-Owner: Frank Li <fra...@microsoft.com>
Gerrit-Reviewer: Frank Li <fra...@microsoft.com>
Gerrit-Reviewer: Jingyun Liu <jin...@google.com>
Gerrit-Reviewer: Mike Wasserman <m...@google.com>
Gerrit-Reviewer: Sushanth Rajasankar <Sush...@microsoft.com>
Gerrit-CC: Chromium Metrics Reviews <chromium-met...@google.com>
Gerrit-CC: Etienne Noël <etien...@chromium.org>
Gerrit-CC: Kenji Baheux <kenji...@google.com>
Gerrit-CC: Kentaro Hara <har...@chromium.org>
Gerrit-CC: Raphael Kubo da Costa <ku...@igalia.com>
Gerrit-Attention: Mike Wasserman <m...@google.com>
Gerrit-Attention: Sushanth Rajasankar <Sush...@microsoft.com>
Gerrit-Attention: Jingyun Liu <jin...@google.com>
Gerrit-Comment-Date: Wed, 29 Oct 2025 18:02:06 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No

Frank Li (Gerrit)

Oct 29, 2025, 2:54:34 PM
Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

Frank Li added 1 comment

File third_party/blink/renderer/modules/ai/language_model.idl
Line 54, Patchset 9 (Latest): // When this object was created with LanguageModelToolDeclaration, this now
Frank Li . unresolved

and expectedOutputs has {"tool-call"}.

Same for the comment in promptStreaming().

    Jingyun Liu (Gerrit)

    Oct 29, 2025, 6:54:34 PM
    Attention needed from Frank Li, Mike Wasserman and Sushanth Rajasankar

    Jingyun Liu added 3 comments

    Patchset-level comments
    Jingyun Liu . resolved

    Thanks for the quick turnaround!

    File third_party/blink/renderer/modules/ai/language_model.idl
    Line 56, Patchset 9 (Latest): Promise<LanguageModelPromptResult> prompt(
    Jingyun Liu . unresolved

    When we change the return type from DOMString to PromptResult, will we break backward compatibility for existing users?

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Line 75, Patchset 9 (Latest):typedef (LanguageModelToolSuccess or LanguageModelToolError) LanguageModelToolResponse;
    Jingyun Liu . unresolved

    Are we also expecting to let the model handle tool errors (vs. the application handling them)?

    Also, in the current structure, LanguageModelMessageContent contains a
    LanguageModelMessageValue, which can be a LanguageModelToolResponse, which can contain LanguageModelMessageContent again...

    I'm wondering whether we can just use a single struct {string callID, string name, object result} as the LanguageModelToolResponse; if we want the model to handle an error, the developer can put the error info in the "result" field.

    Jingyun Liu (Gerrit)

    Oct 29, 2025, 6:56:14 PM
    Attention needed from Frank Li, Mike Wasserman and Sushanth Rajasankar

    Jingyun Liu added 1 comment

    File third_party/blink/renderer/modules/ai/language_model.idl
    Line 56, Patchset 9 (Latest): Promise<LanguageModelPromptResult> prompt(
    Jingyun Liu . resolved

    When we change the return type from DOMString to PromptResult, will we break backward compatibility for existing users?

    Jingyun Liu

    Oops, never mind, I just found the definition of LanguageModelPromptResult in the other IDL.

    Frank Li (Gerrit)

    Oct 29, 2025, 7:50:01 PM
    Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

    Frank Li added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Line 75, Patchset 9 (Latest):typedef (LanguageModelToolSuccess or LanguageModelToolError) LanguageModelToolResponse;
    Jingyun Liu . unresolved

    Are we also expecting to let the model handle tool errors (vs. the application handling them)?

    Also, in the current structure, LanguageModelMessageContent contains a
    LanguageModelMessageValue, which can be a LanguageModelToolResponse, which can contain LanguageModelMessageContent again...

    I'm wondering whether we can just use a single struct {string callID, string name, object result} as the LanguageModelToolResponse; if we want the model to handle an error, the developer can put the error info in the "result" field.

    Frank Li

    Thanks for your comments and discussions.

    (1) Tool Error Handling
    Yes, we're expecting the application to handle tool errors and communicate them back to the model. The flow is:

    // Model requests tool → App executes → App reports result (success or error) → Model reacts

    ~~~
    const response = await session.prompt([
      { role: "user", content: "What's the weather in SF?" },
      { role: "assistant", content: [
        { type: "tool-call", value: { callID: "1", name: "getWeather", arguments: { city: "SF" } } }
      ]},
      { role: "user", content: [
        { type: "tool-response", value: {
          callID: "1",
          name: "getWeather",
          errorMessage: "API rate limit exceeded" // App reports error.
        }}
      ]}
    ]);
    // Model can now see the error and react (e.g., ask user to try later)
    ~~~

    Both the app and model have knowledge of the error - the app decides how to report it, and the model can react accordingly (e.g., suggest alternatives, explain to user, etc.).
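
For illustration only (not part of the CL): a minimal sketch of the app-side step in this flow, assuming the non-recursive response shapes proposed further down in this message. `runTool` and the `tools` map are hypothetical names, and the tools are synchronous only to keep the sketch short; a real app would likely await them.

```javascript
// Hypothetical helper: run one tool call requested by the model and build the
// "tool-response" content entry the app would send back in the next prompt().
// Field names (callID, name, result, errorMessage) follow the shapes proposed
// in this thread and may change before the IDL lands.
function runTool(toolCall, tools) {
  const { callID, name, arguments: args } = toolCall;
  try {
    const tool = tools[name];
    if (typeof tool !== "function") {
      throw new Error(`Unknown tool: ${name}`);
    }
    // Success: wrap the tool output as data-only text content.
    const text = String(tool(args));
    return {
      type: "tool-response",
      value: { callID, name, result: { type: "text", value: text } },
    };
  } catch (e) {
    // Failure: the app decides how to report the error to the model.
    return {
      type: "tool-response",
      value: { callID, name, errorMessage: String(e.message ?? e) },
    };
  }
}

// Example tool (an assumption for this sketch only).
const tools = {
  getWeather: ({ city }) => {
    if (city === "SF") throw new Error("API rate limit exceeded");
    return `Sunny in ${city}`;
  },
};

const ok = runTool({ callID: "1", name: "getWeather", arguments: { city: "Paris" } }, tools);
const failed = runTool({ callID: "2", name: "getWeather", arguments: { city: "SF" } }, tools);
```

Either return value can be placed in the `content` array of the next `{ role: "user", ... }` message, as in the snippet above.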

    (2) LanguageModelMessageContent Recursion

    You're right - we should create a non-recursive type for tool results.
    'object result' is something I would like to avoid because it is not
    structured.

    I would like to change the IDL as follows:

    ~~~
    // Data-only value (no tools)
    typedef (
      ImageBitmapSource
      or AudioBuffer
      or HTMLAudioElement
      or BufferSource
      or DOMString
    ) LanguageModelDataValue;

    // Data-only type enum
    enum LanguageModelDataType { "text", "image", "audio" };

    // Data-only content (no tool-call, no tool-response)
    dictionary LanguageModelDataContent {
      required LanguageModelDataType type;
      required LanguageModelDataValue value;
    };

    // Successful tool execution result.
    dictionary LanguageModelToolSuccess {
      required DOMString callID;
      required DOMString name;
      required LanguageModelDataContent result;
    };

    // Failed tool execution result.
    dictionary LanguageModelToolError {
      required DOMString callID;
      required DOMString name;
      required DOMString errorMessage;
    };

    // The response from executing a tool call - either success or error.
    typedef (LanguageModelToolSuccess or LanguageModelToolError) LanguageModelToolResponse;
    ~~~

    JS app usage example for a single tool call:

    ~~~
    const response = await session.prompt([
      { role: "user", content: "What's the weather in Paris?" },

      // Model makes tool call
      { role: "assistant", content: [
        { type: "tool-call", value: {
          callID: "call_1",
          name: "getWeather",
          arguments: { city: "Paris" }
        }}
      ]},

      // App returns tool result
      { role: "user", content: [
        { type: "tool-response", value: {
          callID: "call_1",
          name: "getWeather",
          result: {
            type: "text",
            value: "Sunny, 22°C"
          }
        }}
      ]}
    ]);
    ~~~

    Jingyun Liu (Gerrit)

    Oct 29, 2025, 7:59:29 PM
    Attention needed from Frank Li, Mike Wasserman and Sushanth Rajasankar

    Jingyun Liu added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Jingyun Liu

    I actually feel "object" is like a base::Value in C++: it can have arbitrary structure and is still JSON.

    With LanguageModelData[Content|Type], how would we represent a map/dict, e.g. {"SFO": "22C", "LAX": false}? Is it always serialized to text? Also, our formatter works with a structured object (base::Value) below the chrome_ml_api layer.

    Frank Li (Gerrit)

    Oct 29, 2025, 9:09:36 PM
    Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

    Frank Li added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Frank Li

    The JS app will have to stringify it, as in the Tool use v1 spec.
    I don't think a C++ base::Value can be used for "audio" or "image", the multimodal tool results we would like to address.
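
As a sketch of what stringifying means for an app (assuming the `LanguageModelDataContent` shape proposed earlier in this thread; `toTextContent` is a hypothetical helper, not part of the API), the map from the question would be passed as JSON text:

```javascript
// Hypothetical helper: represent a JSON-compatible tool output as data-only
// "text" content by stringifying it in the app. Strings pass through as-is.
function toTextContent(output) {
  const value = typeof output === "string" ? output : JSON.stringify(output);
  return { type: "text", value };
}

// The map/dict from the question becomes a JSON string inside text content.
const content = toTextContent({ SFO: "22C", LAX: false });
```

Here `content.value` is a string, so any structure has to be recovered by parsing on the other side.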

    Jingyun Liu (Gerrit)

    Oct 30, 2025, 12:11:07 PM
    Attention needed from Frank Li, Mike Wasserman and Sushanth Rajasankar

    Jingyun Liu added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Jingyun Liu

    So when the client passes a text string as a tool result, we still require it to be parseable as a JSON object, and we throw an error if it isn't, correct?

    Jingyun Liu (Gerrit)

    Oct 30, 2025, 12:16:14 PM
    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Jingyun Liu

    And if the client passes a multimodal tool result, I suppose we will check that the session is created with the correct "expectedInput" as well?

    Do you have any use case for multimodal tool results? I don't think server side APIs support multimodal tool results either.

    Frank Li (Gerrit)

    Oct 30, 2025, 1:30:33 PM
    Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

    Frank Li added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Frank Li

    It could be just plain text or a serialized JSON string. We do not need to enforce it.

    >...Do you have any use case for multimodal tool results?...

    It is somewhat similar to [multimodal input sample](https://github.com/webmachinelearning/prompt-api?tab=readme-ov-file#multimodal-inputs).

    [Add multimodal tool outputs #149](https://github.com/webmachinelearning/prompt-api/pull/149) includes discussion in its commits about getting a video frame with a ToolCall.

    >...don't think server side APIs support multimodal tool results ...

    Yes, plain text is expected at the moment.
    [Tool calling: return types? #138](https://github.com/webmachinelearning/prompt-api/issues/138) discusses multimodal tool results.

    >...the correct "expectedInput" as well...
    I think so.

    Jingyun Liu (Gerrit)

    Oct 30, 2025, 5:33:25 PM
    Attention needed from Frank Li, Mike Wasserman and Sushanth Rajasankar

    Jingyun Liu added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Jingyun Liu

    Thank you for the pointers on those issues! Now I see why we need the {type - value} pair for multimodal tool results.

    Then my only remaining question is about the "DOMString" type for text tool results: our formatter inside chrome_ml_api expects a structured object like base::Value to do various formatting and escaping. If we go with a string in the signature, shall we try to deserialize it into JSON (base::Value) in the renderer?
    An alternative would be to also include "object" in LanguageModelDataValue, and then not do any special processing of the string; it would be treated by the formatter as a base::Value of type string.
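
To make the two options concrete, here is a rough JavaScript sketch of the behaviour being compared. It is illustrative only: the real logic would live in C++ in the renderer, and both function names are hypothetical.

```javascript
// Option A: the text result stays a string in the API, and the implementation
// tries to parse it as JSON, falling back to a plain string value.
function parseTextResult(text) {
  try {
    return JSON.parse(text);
  } catch {
    return text; // Not valid JSON: treat it as a plain string.
  }
}

// Option B: "object" is allowed as a value, so nothing is parsed; a string is
// passed through unchanged as a string value.
function passThroughResult(value) {
  return value;
}
```

One visible difference: under option A the text `"22"` would become the number 22, while under option B it stays the string `"22"`.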

    Mike Wasserman (Gerrit)

    Oct 30, 2025, 5:45:51 PM
    Attention needed from Frank Li and Sushanth Rajasankar

    Mike Wasserman added 2 comments

    File chrome/browser/ai/ai_manager.cc
    Line 252, Patchset 9 (Latest): // TODO(crbugs.com/422803232): Implement tool capabilities.
    Mike Wasserman . unresolved
    ```suggestion
    // TODO(crbug.com/422803232): Implement tool capabilities.
    ```
    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Mike Wasserman

    I have similar questions:
    1) Should we add an `object` LanguageModelDataType and let tools yield serializable JS objects (like `inputSchema` itself), as a model-agnostic structured format, instead of forcing tools to serialize those objects into JSON strings themselves?
    2) Should tools be able to yield a sequence of LanguageModelDataContent, so they can return mixed modalities?
    3) Is there a real advantage of coercing errors into a named `errorMessage` string field on a dedicated error type, versus a shared LanguageModelToolResponse being constructed with `result: 'Error: API rate limit exceeded'`, or possibly also passing structured object errors if we permit objects in (1)? Does the API itself need to provide this distinction to the model, or for session history?

    i.e. What if we had:
    ```
    // Data-only value (no tools)
    typedef (
      ImageBitmapSource
      or AudioBuffer
      or HTMLAudioElement
      or BufferSource
      or DOMString
      or object
    ) LanguageModelDataValue;

    // Data-only type enum
    enum LanguageModelDataType { "text", "image", "audio", "object" };

    // Data-only content (no tool-call, no tool-response)
    dictionary LanguageModelDataContent {
      required LanguageModelDataType type;
      required LanguageModelDataValue value;
    };

    // Tool execution response; expresses a result or an error to the model.
    dictionary LanguageModelToolResponse {
      required DOMString callID;
      required DOMString name;
      required sequence<LanguageModelDataContent> resultOrError;
    };
    ```
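
For illustration, values in this single-dictionary shape might look as follows from JS. This is a sketch under the assumptions of the IDL above: `resultOrError` and the `"object"` type are proposals from this comment, not a shipped API, and the sample data is made up.

```javascript
// A successful tool response mixing a structured object and a text part.
const toolResponse = {
  callID: "call_1",
  name: "getWeather",
  resultOrError: [
    { type: "object", value: { SFO: "22C", LAX: false } },
    { type: "text", value: "Readings taken at noon." },
  ],
};

// An error uses the same dictionary, with the failure described in content.
const toolError = {
  callID: "call_2",
  name: "getWeather",
  resultOrError: [{ type: "text", value: "Error: API rate limit exceeded" }],
};
```

Note that nothing in the shape itself marks `toolError` as a failure; that is the trade-off raised in question (3).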

    Frank Li (Gerrit)

    Oct 30, 2025, 6:38:39 PM
    Attention needed from Sushanth Rajasankar

    Frank Li added 2 comments

    File third_party/blink/renderer/modules/ai/language_model_prompt_builder.cc
    Line 453, Patchset 9 (Latest): // TODO(crbugs.com/422803232): Implement tool call handling.
    Frank Li . unresolved
    Line 468, Patchset 9 (Latest): // TODO(crbugs.com/422803232): Implement tool response handling.
    Frank Li . unresolved

    Frank Li (Gerrit)

    Oct 31, 2025, 3:04:30 PM
    Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

    Frank Li added 1 comment

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Frank Li

    Thanks, all, for the thoughtful discussion here.

    Hi Jingyun,
    > ... include "object" in LanguageModelDataValue...

    Yes, we can use object.

    > ...Our formatter inside chrome_ml_api expects a structured object like base::Value...

    I investigated the current architecture. In chrome_ml_types.h, we have:
    ```
    using InputPiece =
        std::variant<Token, std::string, SkBitmap, AudioBuffer, bool>;
    ```
    I am not sure about the `base::Value` input to the chrome_ml_api implementation.
    Anyway, I plan to add a new ToolResponse variant to InputPiece. This will preserve the tool response structure all the way to the ML API implementation layer, where each implementation (Chrome ML API, Edge ML API, etc.) can format it appropriately for their underlying models.


    Hi Mike,
    >...1) Should we add an object LanguageModelDataType...

    Yes, will do.

    >...2) Should tools be able to yield a sequence of LanguageModelDataContent, ...

    Yes, agreed. This provides flexibility for multi-modal tool results in the
    future.

    >...3) Is there a real advantage of coercing errors into a named errorMessage string field on a dedicated error type, versus a shared LanguageModelToolResponse...


    Thank you for the thoughtful question! I believe separate ToolSuccess/ToolError types provide meaningful advantages:

    (a) Type Safety & Developer Experience: Both approaches work for JS apps, but separate types enable automatic type discrimination ('errorMessage' in response) without requiring developers to inspect content or parse for status fields.

    (b) Debugging & Transparency: Explicit LanguageModelToolError makes the error condition immediately visible in logs, DevTools, and debuggers. With merged types, we'd need heuristics to determine if content represents an error.

    (c) Web Platform Consistency: This pattern aligns with established Web APIs (e.g., FileReader: onload vs. onerror, Promise: .then() vs. .catch(), Fetch: Response vs. network errors) and C++ conventions.

    (d) ML API Implementation Flexibility: The discriminated type allows ML API implementations to apply different prompt formatting strategies (e.g., "Tool succeeded: {...}" vs. "Tool FAILED: ..."), which may improve model understanding of tool execution status.

    I'm happy to proceed with separate types unless there are other considerations I haven't addressed. Let me know your thoughts!
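
A small sketch of points (a) and (d), assuming the separate `LanguageModelToolSuccess` / `LanguageModelToolError` dictionaries discussed in this thread. Both helper names and the formatted wording are hypothetical.

```javascript
// Distinguish the two proposed dictionaries by the presence of `errorMessage`.
function isToolError(response) {
  return "errorMessage" in response;
}

// Hypothetical formatter showing how an implementation could phrase the two
// cases differently for the model. The exact wording is an assumption.
function formatToolResponse(response) {
  if (isToolError(response)) {
    return `Tool ${response.name} FAILED: ${response.errorMessage}`;
  }
  return `Tool ${response.name} succeeded: ${JSON.stringify(response.result)}`;
}
```

With a merged type, `isToolError` would instead have to inspect the content itself to guess whether it describes a failure.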

    Frank Li (Gerrit)

    Oct 31, 2025, 4:29:19 PM
    Attention needed from Jingyun Liu, Mike Wasserman and Sushanth Rajasankar

    Frank Li added 5 comments

    File chrome/browser/ai/ai_manager.cc
    Line 252, Patchset 9: // TODO(crbugs.com/422803232): Implement tool capabilities.
    Mike Wasserman . resolved
    ```suggestion
    // TODO(crbug.com/422803232): Implement tool capabilities.
    ```
    Frank Li

    Done

    File third_party/blink/renderer/modules/ai/language_model.idl
    Line 54, Patchset 9: // When this object was created with LanguageModelToolDeclaration, this now
    Frank Li . resolved

    and expectedOutputs has {"tool-call"}.

    Same for the comment in promptStreaming().

    Frank Li

    Done

    File third_party/blink/renderer/modules/ai/language_model_create_options.idl
    Line 75, Patchset 9:typedef (LanguageModelToolSuccess or LanguageModelToolError) LanguageModelToolResponse;
    Jingyun Liu . resolved
    Frank Li

    PTAL with PS 10. Thanks!

    File third_party/blink/renderer/modules/ai/language_model_prompt_builder.cc
    Line 453, Patchset 9: // TODO(crbugs.com/422803232): Implement tool call handling.
    Frank Li . resolved

    crbugs.com

    Frank Li

    Done

    Line 468, Patchset 9: // TODO(crbugs.com/422803232): Implement tool response handling.
    Frank Li . resolved

    crbugs.com

    Frank Li

    Done
