A JavaScript API for transforming and rephrasing input text in the requested ways, backed by an AI language model.
This feature has definite interoperability and compatibility risks. Different implementations will likely use different language models, prompts, and fine-tunings, and even within a single implementation such as Chrome, these pieces will likely change over time. Additionally, not all browsers and operating systems will have a built-in language model to expose, and not all devices will be powerful enough to run one effectively. We are taking a variety of steps to attempt to mitigate these risks. First, the specification is designed to allow the API to be backed by a cloud-based language model, which could extend the functionality to a wider range of devices and users. Second, the API is designed to abstract away the specifics of the underlying language model, including prompts and fine-tuning. This discourages developers from relying on specific outputs: they receive rewritten text rather than structured data that might vary across implementations. Finally, the API surface is designed with many clear points of failure that encourage the developer to probe for capabilities ahead of time and fall back to other techniques if a capability is not available. Nevertheless, interoperability and compatibility risk remains high for these sorts of APIs, and we'll be closely monitoring it during the experiment period.
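To make the "probe ahead of time, then fall back" pattern concrete, here is a minimal sketch. It assumes the `Rewriter.availability()`, `Rewriter.create()`, and `rewrite()` names and the availability strings from the current explainer, all of which may change during the experiment. `chooseStrategy`, `rewriteWithFallback`, the `scope` parameter, and the developer-supplied `fallback` function are hypothetical names used only for illustration.

```javascript
// Sketch only: the Rewriter names follow the explainer and may change.
// `fallback` is a hypothetical developer-supplied async function,
// for example a call to the developer's own server.

// Map an availability string to a strategy. Anything unrecognized
// is treated as "not usable" so that unknown states fail safe.
function chooseStrategy(availability) {
  switch (availability) {
    case "available":
    case "downloadable": // create() may start a model download
    case "downloading":
      return "builtin";
    default:
      return "fallback";
  }
}

// `scope` defaults to the global object; it is a parameter only so
// that the function can be exercised with a stand-in object.
async function rewriteWithFallback(text, fallback, scope = globalThis) {
  // Failure point 1: the API is not exposed at all.
  if (!("Rewriter" in scope)) return fallback(text);
  try {
    // Failure point 2: the API exists but this device cannot use it.
    const availability = await scope.Rewriter.availability();
    if (chooseStrategy(availability) === "fallback") return fallback(text);
    // Failure point 3: creation or rewriting rejects at runtime.
    const rewriter = await scope.Rewriter.create();
    return await rewriter.rewrite(text);
  } catch (err) {
    return fallback(text);
  }
}
```

A page would call something like `rewriteWithFallback(draft, sendToMyServer)`, where `sendToMyServer` is the site's own non-AI or server-side path. Treating every failure point the same way keeps the calling code independent of which implementation, if any, is present.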
This feature would definitely benefit from having polyfills, backed by any of: cloud services, lazily-loaded client-side models using WebGPU, or the web developer's own server. We anticipate seeing an ecosystem of such polyfills grow as more developers experiment with this API.
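As a rough illustration of the server-backed variant, a polyfill could look something like the sketch below. Everything specific here is an assumption made for the example: the `installRewriterPolyfill` helper, the `endpoint` URL supplied by the caller, the JSON request body, and the `{ rewritten }` response shape are hypothetical, and only the small slice of the API surface named in the explainer (`availability()`, `create()`, `rewrite()`) is imitated.

```javascript
// Hypothetical polyfill sketch, not a complete implementation.
// The endpoint and its request/response shapes are assumptions.
function installRewriterPolyfill(
  scope,
  { endpoint, fetchImpl = (...args) => scope.fetch(...args) }
) {
  // Never shadow a built-in implementation.
  if ("Rewriter" in scope) return false;

  class ServerBackedRewriter {
    constructor(options) {
      this.options = options;
    }

    // The server is assumed to be reachable, so report "available".
    static async availability() {
      return "available";
    }

    static async create(options = {}) {
      return new ServerBackedRewriter(options);
    }

    async rewrite(text) {
      const response = await fetchImpl(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ text, options: this.options }),
      });
      if (!response.ok) {
        throw new Error(`Rewrite request failed: ${response.status}`);
      }
      const { rewritten } = await response.json();
      return rewritten;
    }
  }

  scope.Rewriter = ServerBackedRewriter;
  return true;
}
```

Because the polyfill only installs itself when `Rewriter` is absent, code written against the built-in API should keep preferring the browser's implementation wherever one exists.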
Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?
None
None
It is possible that giving DevTools more insight into the nondeterministic states of the model, e.g. random seeds, could help with debugging. See related discussion at https://github.com/explainers-by-googlers/prompt-api/issues/9.
Not all platforms will come with a language model. In particular, in the initial stages we are focusing on Windows, Mac, and Linux.
We plan to expand the web platform test coverage of the API surface over the course of the origin trial. The core algorithm might be difficult to test, given the nondeterministic nature of the output. The explainer discusses this in https://github.com/WICG/writing-assistance-apis/blob/main/README.md#specifications-and-tests.
Does the feature depend on any code or APIs outside the Chromium open source repository and its open-source dependencies to function?
Yes: this feature depends on a language model, which is bridged to the open-source parts of the implementation via the interfaces in //services/on_device_model.
Origin trial desktop first | 137 |
Origin trial desktop last | 142 |
DevTrial on desktop | 129 |
Open questions about a feature may be a source of future web compat or interop issues. Please list open issues (e.g. links to known github issues in the project for the feature specification) whose resolution may introduce web compat/interop risk (e.g., changing to naming or structure of the API in a non-backward-compatible way).
At this point all known proposed changes have been incorporated into the specification and implementation.

LGTM
/Daniel