More details on the Gemma 197M model behind the Summarizer API?

Younes Peters Touati

Jun 3, 2026, 3:35:17 PM

to Chrome Built-in AI Early Preview Program Discussions

Hi all,

Thanks for all the work on the built-in AI APIs. It's been great to follow along.

I tried the summarizer preference benchmark demo (https://googlechrome.github.io/samples/summarizer-preference-benchmark/) and was really encouraged by the results. Model size and speed have been the main things keeping me from adopting the Summarizer API, so seeing this kind of quality at a smaller footprint is exactly what I was hoping for.

It's not quite usable for my own project yet, though. My use case is headline generation, which the speed version of the API doesn't support yet, and my target language (Danish) isn't supported either.
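For readers who want to check the same two blockers in their own setup, here is a minimal sketch of a support check for headline summaries in a given language. It assumes the documented `Summarizer.availability()` call and its `type` / `outputLanguage` options; the helper names `headlineOptions` and `checkHeadlineSupport` are hypothetical, and the option that selects the speed version of the API is not shown because this thread does not name it.

```typescript
// Structural subset of the Summarizer API used by this sketch.
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

interface SummarizerOptions {
  type?: "tldr" | "teaser" | "key-points" | "headline";
  format?: "plain-text" | "markdown";
  length?: "short" | "medium" | "long";
  expectedInputLanguages?: string[];
  outputLanguage?: string;
}

interface SummarizerLike {
  availability(options?: SummarizerOptions): Promise<Availability>;
}

// Hypothetical helper: builds the options for a short headline
// written in, and produced in, the given BCP 47 language tag.
function headlineOptions(language: string): SummarizerOptions {
  return {
    type: "headline",
    format: "plain-text",
    length: "short",
    expectedInputLanguages: [language],
    outputLanguage: language,
  };
}

// Hypothetical helper: returns "no-api" when the Summarizer global is
// missing, otherwise whatever availability() reports for the options.
async function checkHeadlineSupport(
  api: SummarizerLike | undefined,
  language: string,
): Promise<Availability | "no-api"> {
  if (!api) return "no-api";
  return api.availability(headlineOptions(language));
}
```

In a browser that ships the API, this could be called as `checkHeadlineSupport((globalThis as any).Summarizer, "da")`. Passing the API object in as a parameter, rather than reading the global inside the helper, keeps the check testable outside Chrome.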

So what I'd really like is more detail on how the new, faster model works. The I/O 2026 recap (https://developer.chrome.com/blog/chrome-at-io26) mentions a Gemma 197M "expert model" that can transparently power task-specific APIs like the summarizer, but I couldn't find anything more on it, and as far as I can tell it isn't public.

A few questions:

1. Are there plans to publish more about the Gemma 197M model?

2. How is it fine-tuned for the summarization task? Anything on the training data, distillation approach, or setup would help a lot.

3. Mainly: is there enough out there (or planned) for developers to reproduce this performance for an unsupported language like Danish? I'd be glad to do the fine-tuning myself if the recipe were documented.

I realize some of this might not be shareable yet, but even a pointer in the right direction would be appreciated.

Kenji Baheux

Jun 3, 2026, 11:01:13 PM

to Younes Peters Touati, Chrome Built-in AI Early Preview Program Discussions

Hi Younes,

Thanks for the great feedback! Running the preference benchmark and isolating your blockers (task support and language) is exactly the right approach.

For others on the list, we highly encourage you to follow Younes's lead: use the current English model, with its current capabilities, as a performance proxy. Run the benchmarks to compare it against the full model, and show us whether and how this specific speed/size delta actually unlocks your use case (e.g., the latency delta and details about the use case). Feel free to reach out directly if sharing all the details would be easier.

At the same time, use the full model in your target language to establish your quality baseline. The full model generally represents the quality ceiling; if it can't achieve the accuracy your feature requires today, a smaller expert model will likely struggle too.
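As one way to produce the latency numbers mentioned above, here is a minimal sketch of a timing harness. It is not the preference benchmark demo's code. The names `timeSummarizer`, `toStats`, and `latencyDelta` are hypothetical, and each summarizer is passed in as a plain async function so that the same harness can wrap the full model and the expert model.

```typescript
// Any async text-to-summary function, e.g. (t) => summarizer.summarize(t).
type SummarizeFn = (text: string) => Promise<string>;

interface LatencyStats {
  runs: number;
  medianMs: number;
  p95Ms: number;
}

// Nearest-rank percentile over an ascending-sorted sample list.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Reduces raw per-call latencies (in ms) to a median and a p95.
function toStats(samplesMs: number[]): LatencyStats {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return {
    runs: sorted.length,
    medianMs: percentile(sorted, 50),
    p95Ms: percentile(sorted, 95),
  };
}

// Times one summarize call per input, sequentially, so that calls do
// not compete with each other for the device. Date.now() is used for
// portability; performance.now() gives finer resolution in a browser.
async function timeSummarizer(
  summarize: SummarizeFn,
  inputs: string[],
): Promise<LatencyStats> {
  const samples: number[] = [];
  for (const text of inputs) {
    const start = Date.now();
    await summarize(text);
    samples.push(Date.now() - start);
  }
  return toStats(samples);
}

// Compares two runs: absolute median saving and relative speedup.
function latencyDelta(full: LatencyStats, expert: LatencyStats) {
  return {
    medianSavedMs: full.medianMs - expert.medianMs,
    speedup: full.medianMs / expert.medianMs,
  };
}
```

Running `timeSummarizer` once per model over the same inputs and passing both results to `latencyDelta` gives a pair of figures that can be shared alongside a description of the use case. Model download and session creation are deliberately left outside the timed region here; whether to include them depends on what the feature's users would actually experience.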

See inline for your other questions:

On Thu, Jun 4, 2026 at 4:35 AM Younes Peters Touati <younes...@gmail.com> wrote:

A few questions:

1. Are there plans to publish more about the Gemma 197M model?

2. How is it fine-tuned for the summarization task? Anything on the training data, distillation approach, or setup would help a lot.

I'll bring these up with the engineering team. Our goal right now is to gather concrete data showing that these highly optimized expert models solve real performance bottlenecks for developers and thereby unlock compelling use cases.

3. Mainly: is there enough out there (or planned) for developers to reproduce this performance for an unsupported language like Danish? I'd be glad to do the fine-tuning myself if the recipe were documented.

I love the willingness to roll up your sleeves and fine-tune!

While I can't promise that you can reproduce a model with similar characteristics, if you want to explore tuning the open Gemma models yourself for external workflows, please check out the Gemma Tuning Guide.

Thanks again for your feedback and questions!

--

Kenji BAHEUX

Product Manager - Chrome

Google Japan

Thomas Steiner

Jun 4, 2026, 3:33:03 AM

to Kenji Baheux, Younes Peters Touati, Chrome Built-in AI Early Preview Program Discussions

While I can't promise that you can reproduce a model with similar characteristics, if you want to explore tuning the open Gemma models yourself for external workflows, please check out the Gemma Tuning Guide.

A very hands-on tutorial for Gemma 3 270M is https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/.

--

Thomas Steiner, PhD—Developer Relations Engineer (blog.tomayac.com, toot.cafe/@tomayac)
