Intent to Prototype: Summarizer API

Domenic Denicola

Aug 9, 2024, 2:20:22 AM
To: blink-dev, Fergal Daly, Kenji Baheux, Daisuke Enomoto

Contact emails

dom...@chromium.org, fer...@chromium.org, kenji...@chromium.org

Explainer

https://github.com/explainers-by-googlers/writing-assistance-apis/blob/main/README.md

Specification

None

Summary

A JavaScript API for producing summaries of input text, backed by an AI language model.


Blink component

Blink>AI>Summarization

Motivation

Browsers and operating systems are increasingly expected to gain access to a language model. By exposing this built-in model, we avoid every website needing to download its own multi-gigabyte language model or send input text to third-party APIs. The summarizer API in particular exposes a high-level API for interfacing with a language model in order to summarize inputs for a variety of use cases [1], in a way that does not depend on the specific language model in question.

[1]: https://github.com/explainers-by-googlers/writing-assistance-apis/blob/main/README.md#summarizer-api


Initial public proposal

https://github.com/WICG/proposals/issues/163

TAG review

None. (We intend to submit for TAG review after getting enough support to move to WICG.)

TAG review status

Pending

Risks



Interoperability and Compatibility

This feature has definite interoperability and compatibility risks, due to the likelihood that different implementations will use different language models, prompts, and fine-tunings. Even within a single implementation such as Chrome, these pieces will likely change over time. Additionally, not all browsers and operating systems will have a built-in language model to expose, and not all devices will be able to run one.

We are taking a variety of steps to attempt to mitigate these risks. For example, the specification is designed to allow the API to be backed by a cloud-based language model, which could help extend it to more users. The high-level nature of the API, which hides the details of the specific language model, prompts, etc., makes it harder for developers to depend on specific outputs: they are just getting a summary, and not e.g. structured data. Finally, the API surface is designed with many clear points of failure, which encourage the developer to probe for capabilities ahead of time and fall back to other techniques if a capability is not available.

Nevertheless, interoperability and compatibility risk remains high for these sorts of APIs, and we'll be closely monitoring it during the prototyping period.
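
The probe-then-fall-back pattern described above might look roughly like the following TypeScript sketch. The `SummarizerProvider` and `SummarizerSession` interfaces, their method names, and the `firstSentences` fallback are hypothetical names chosen for illustration; they are not taken from the explainer, and the real API shape may differ.

```typescript
// Hypothetical shapes, assumed for illustration only; the explainer defines the real API.
interface SummarizerSession {
  summarize(input: string): Promise<string>;
}

interface SummarizerProvider {
  availability(): Promise<"available" | "unavailable">;
  create(): Promise<SummarizerSession>;
}

// Naive non-AI fallback: keep the first `count` sentence-like chunks of the input.
function firstSentences(text: string, count: number): string {
  const sentences = text.match(/[^.!?]+[.!?]*/g) ?? [text];
  return sentences
    .slice(0, count)
    .map((s) => s.trim())
    .join(" ");
}

// Probe for the capability first; use the fallback if the provider is
// missing, reports itself unavailable, or fails at any step.
async function summarizeWithFallback(
  provider: SummarizerProvider | undefined,
  text: string,
): Promise<{ summary: string; source: "model" | "fallback" }> {
  if (provider !== undefined) {
    try {
      if ((await provider.availability()) === "available") {
        const session = await provider.create();
        return { summary: await session.summarize(text), source: "model" };
      }
    } catch {
      // Probing, creation, or summarization failed; fall through to the fallback.
    }
  }
  return { summary: firstSentences(text, 2), source: "fallback" };
}
```

Returning the `source` alongside the summary lets calling code tell the two paths apart, for instance to label a fallback result differently in the UI.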


(We intend to ask for other-browser signals after gathering enough support to move to WICG.)

Gecko: No signal

WebKit: No signal

Web developers: No signals

Other signals:

Activation

This feature would definitely benefit from having polyfills, backed by any of: cloud services, lazily-loaded on-device models using WebGPU, or the web developer's own server. We anticipate seeing an ecosystem of such polyfills grow as more developers experiment with this API.
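
As a rough illustration of how such a polyfill could be wired up, here is a minimal TypeScript sketch that installs a summarizer backed by an arbitrary function (a cloud call, an on-device model, or the developer's own server) only when nothing is already present. The `summarizer` property, the `PolyfilledSummarizer` shape, and `installSummarizerPolyfill` are hypothetical names for illustration, not the API defined in the explainer.

```typescript
// Any async function that turns input text into a summary, e.g. a call to
// a cloud service or the developer's own server (assumed, not specified here).
type SummarizeBackend = (input: string) => Promise<string>;

// Hypothetical surface for the polyfilled object.
interface PolyfilledSummarizer {
  create(): Promise<{ summarize(input: string): Promise<string> }>;
}

// Installs the polyfill on `target` only if no implementation exists yet.
// Returns true when the polyfill was installed, false when one was already present.
function installSummarizerPolyfill(
  target: { summarizer?: PolyfilledSummarizer },
  backend: SummarizeBackend,
): boolean {
  if (target.summarizer !== undefined) {
    return false;
  }
  target.summarizer = {
    create: async () => ({
      summarize: (input: string) => backend(input),
    }),
  };
  return true;
}
```

Checking for an existing implementation before installing keeps a native implementation in control wherever one exists.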



WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?

None



Debuggability

It is possible that giving DevTools more insight into the nondeterministic states of the model, e.g. random seeds, could help with debugging. See related discussion at https://github.com/explainers-by-googlers/prompt-api/issues/9.



Is this feature fully tested by web-platform-tests?

No

We hope to work on web platform tests for this feature, but how much we can guarantee as testable beyond the surface API is unclear, given the nondeterministic nature of the output.
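
To make the distinction concrete, the sketch below checks only surface-level properties that hold regardless of what text the model produces. It is a plain TypeScript illustration, not a web-platform-test, and the `SummarizeCapable` interface and `checkSurface` helper are hypothetical names.

```typescript
// Hypothetical minimal surface under test.
interface SummarizeCapable {
  summarize(input: string): Promise<string>;
}

// Returns the list of violated surface-level expectations (empty means all passed).
// Nothing here depends on the content of the summary itself.
async function checkSurface(
  session: SummarizeCapable,
  input: string,
): Promise<string[]> {
  const failures: string[] = [];
  const pending: unknown = session.summarize(input);
  if (!(pending instanceof Promise)) {
    failures.push("summarize() did not return a Promise");
  }
  const result: unknown = await pending;
  if (typeof result !== "string") {
    failures.push("result is not a string");
  }
  return failures;
}
```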



Flag name on chrome://flags

summarization-api-for-gemini-nano

Finch feature name

EnableAISummarizationAPI

Requires code in //chrome?

False

Tracking bug

https://issues.chromium.org/issues/351744634

Estimated milestones

No milestones specified



Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5193953788559360?gate=5110509217775616

This intent message was generated by Chrome Platform Status.