Questions About Prompt API vs. Summarizer API (Context Window, Model Behavior, and Performance)


Kevin Weitgenant

Oct 24, 2025, 10:08:22 AM
to Chrome Built-in AI Early Preview Program Discussions

Hi everyone,

I’m looking for some technical clarifications regarding the differences between the Prompt API and the Summarizer API for Gemini Nano. Specifically, I have three questions:


1. Context Window Size

In the thread “Maximum Token Limits for Gemini Nano APIs”, it was mentioned that:

“For the Prompt API … session can retain the last 4096 tokens … For the Summarizer API: The context window is currently limited to 1024 tokens but we use about 26 of those under the hood. Thanks to your feedback … we are exploring how to expand this feature to 4096 tokens.”
(Source)

Has the Summarizer API’s input/context window been expanded to match the Prompt API’s ~4096-token capacity, or does it remain limited to around 1024 tokens for summarize() operations?


2. Model Architecture & Download Behavior

From the documentation, it seems that both APIs rely on Gemini Nano, which is downloaded to the device upon first use. I’ve also seen references to fine-tuning or adapter layers (e.g., LoRA) used in task-specific APIs.

If a developer uses both the Prompt API and the Summarizer API in the same environment, does this result in:

  • a single shared instance of Gemini Nano being downloaded and used by both APIs, or

  • separate model assets (for example, a base Gemini Nano plus a summarization-specific fine-tuned adapter) being downloaded and maintained for the Summarizer API?


3. Performance & Speed Differences

Are there any measurable differences in latency or throughput between the Summarizer API and the Prompt API when performing summarization tasks?

In other words, does the Summarizer API provide any performance benefits beyond the built-in prompt optimization, or is it mainly a simplified interface with a smaller context window?


Thanks in advance for any clarifications you can provide. At the moment, I’m leaning toward using only the Prompt API.



Thomas Steiner

Oct 24, 2025, 10:23:05 AM
to Kevin Weitgenant, Chrome Built-in AI Early Preview Program Discussions
Hi Kevin,

On Fri, Oct 24, 2025 at 4:08 PM Kevin Weitgenant <kevin.we...@gmail.com> wrote:

Hi everyone,

I’m looking for some technical clarifications regarding the differences between the Prompt API and the Summarizer API for Gemini Nano. Specifically, I have three questions:


1. Context Window Size

In the thread “Maximum Token Limits for Gemini Nano APIs”, it was mentioned that:

“For the Prompt API … session can retain the last 4096 tokens … For the Summarizer API: The context window is currently limited to 1024 tokens but we use about 26 of those under the hood. Thanks to your feedback … we are exploring how to expand this feature to 4096 tokens.”
(Source)

Has the Summarizer API’s input/context window been expanded to match the Prompt API’s ~4096-token capacity, or does it remain limited to around 1024 tokens for summarize() operations?


You can find these values out dynamically at runtime and should never hard-code previously communicated limits:

(await Summarizer.create()).inputQuota
// 6000

(await LanguageModel.create()).inputQuota
// 9216
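Building on the snippet above: per the current Prompt API and Writing Assistance spec drafts, both session types also expose a measureInputUsage() method alongside inputQuota, so you can check whether a given input fits before sending it. A minimal sketch (the helper name fitsInQuota is mine, not part of the API):

```javascript
// Sketch: check whether `text` fits within a session's context window
// before sending it. Works for any session object that exposes
// inputQuota and measureInputUsage(), which per the current spec drafts
// includes both Summarizer and LanguageModel sessions.
async function fitsInQuota(session, text) {
  const usage = await session.measureInputUsage(text);
  return usage <= session.inputQuota;
}
```

For example, `await fitsInQuota(await Summarizer.create(), longArticle)` tells you whether you need to chunk the input first.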
 

2. Model Architecture & Download Behavior

From the documentation, it seems that both APIs rely on Gemini Nano, which is downloaded to the device upon first use. I’ve also seen references to fine-tuning or adapter layers (e.g., LoRA) used in task-specific APIs.

If a developer uses both the Prompt API and the Summarizer API in the same environment, does this result in:

  • a single shared instance of Gemini Nano being downloaded and used by both APIs, or

Yes, the Summarizer and LanguageModel (Prompt API) use the same base model. The Summarizer then applies a tailored system prompt on top. You can inspect that prompt by debugging Gemini Nano (the approach works for the Summarizer as well).
 
  • separate model assets (for example, a base Gemini Nano plus a summarization-specific fine-tuned adapter) being downloaded and maintained for the Summarizer API?
In the past, the Summarizer used LoRA to improve its response behavior, but it no longer does. The only Gemini Nano-based API that still uses LoRA is the Proofreader.

Also see my recent article for details. 


3. Performance & Speed Differences

Are there any measurable differences in latency or throughput between the Summarizer API and the Prompt API when performing summarization tasks?

In other words, does the Summarizer API provide any performance benefits beyond the built-in prompt optimization, or is it mainly a simplified interface with a smaller context window?

The Summarizer helps in that it has specific summarization types already built in (such as tldr, headline, etc.), along with predefined lengths and formats. With the Prompt API you would have to build this yourself.
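A sketch of that difference, with the Chrome globals passed in as parameters so the helpers are easy to mock (the function names are mine; the option values — type, format, length — are the Summarizer's documented presets):

```javascript
// The Summarizer has summarization presets built in…
async function tldrViaSummarizer(text, SummarizerApi) {
  const summarizer = await SummarizerApi.create({
    type: 'tldr',         // other presets: 'key-points', 'teaser', 'headline'
    format: 'plain-text', // or 'markdown'
    length: 'short',      // or 'medium', 'long'
  });
  return summarizer.summarize(text);
}

// …while with the Prompt API, the preset becomes a hand-written system prompt.
async function tldrViaPrompt(text, LanguageModelApi) {
  const session = await LanguageModelApi.create({
    initialPrompts: [{
      role: 'system',
      content: 'Summarize the user\'s text as one short plain-text TL;DR sentence.',
    }],
  });
  return session.prompt(text);
}
```

In the browser you would call these as `tldrViaSummarizer(text, Summarizer)` and `tldrViaPrompt(text, LanguageModel)`.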
 

Thanks in advance for any clarifications you can provide. At the moment, I’m leaning toward using only the Prompt API.

If there's a specific task API, as in this case, I'd recommend using it, for two reasons: (i) the Summarizer has already shipped, so it's a stable API you can rely on, and (ii) if at any point a better summarization model is released, browser vendors like Chrome will simply update the underlying implementation, and all users of the Summarizer API will benefit. That's not guaranteed for the Prompt API: new, generally better model versions may likewise be released, but they won't necessarily be better at summarizing (they may well be, but it's not a given).
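That recommendation can be sketched as feature detection: prefer the task API and fall back to the Prompt API only when the Summarizer isn't available. The APIs are taken from an injectable object (defaulting to globalThis) for testability; the availability() return values are those of the current API surface:

```javascript
// Sketch: use the task-specific Summarizer when present, otherwise
// reproduce the summarization task by hand with the Prompt API.
async function summarizeText(text, { Summarizer, LanguageModel } = globalThis) {
  // availability() returns 'unavailable', 'downloadable', 'downloading', or 'available'.
  if (Summarizer && (await Summarizer.availability()) !== 'unavailable') {
    const summarizer = await Summarizer.create({ type: 'tldr' });
    return summarizer.summarize(text);
  }
  const session = await LanguageModel.create();
  return session.prompt(`Summarize the following text briefly:\n\n${text}`);
}
```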

Cheers,
Tom

--
Thomas Steiner, PhD, Developer Relations Engineer (blog.tomayac.com, toot.cafe/@tomayac)

Google Spain, S.L.U.
Torre Picasso, Pl. Pablo Ruiz Picasso, 1, Tetuán, 28020 Madrid, Spain

CIF: B63272603
Inscrita en el Registro Mercantil de Madrid, sección 8, Hoja M-435397, Tomo 24227, Folio 25

----- BEGIN PGP SIGNATURE -----
Version: GnuPG v2.4.8 (GNU/Linux)

iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck
0fjumBl3DCharaCTersAttH3b0ttom.xKcd.cOm/1181.
----- END PGP SIGNATURE -----