AQA mode?


Robert Oschler

Dec 2, 2024, 3:45:29 PM
to Chrome Built-in AI Early Preview Program Discussions
Previously, most of my LLM work was done with the Gemini 1.5 Pro AQA (Attributed Question Answering) model. That model is tuned towards answering questions from a set of grounding attributions provided with the user input, as part of a full RAG pipeline.

How can I configure the LLM behind the Prompt API to behave the same, since that is the nature of my Chrome extension?  Or should I be using a different API?

Thomas Steiner

Dec 2, 2024, 8:23:24 PM
to Robert Oschler, Chrome Built-in AI Early Preview Program Discussions
Hi Robert,

The Prompt API is powered by Gemini Nano, which is a lot (a loooot) smaller than Gemini 1.5 Pro, so no matter how much prompt engineering effort you put in, Nano will never behave the same as (or even roughly similar to) 1.5 Pro.

Cheers,
Tom



--
Thomas Steiner, PhD, Developer Relations Engineer (blog.tomayac.com, toot.cafe/@tomayac)

Google Germany GmbH, ABC-Str. 19, 20354 Hamburg, Germany
Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891

----- BEGIN PGP SIGNATURE -----
Version: GnuPG v2.4.3 (GNU/Linux)

iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck
0fjumBl3DCharaCTersAttH3b0ttom.xKcd.cOm/1181.
----- END PGP SIGNATURE -----

Robert Oschler

Dec 2, 2024, 11:09:15 PM
to Thomas Steiner, Chrome Built-in AI Early Preview Program Discussions
Funnily enough, things started working pretty well when I trimmed the prompt down to just this:

"Below is a list of web pages that I have bookmarked related to:

${userQuery}

Please tell me what my bookmarks say about this.

Here are the bookmarks:

${documentText}"
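For anyone wanting to try the same thing, here is a minimal sketch of how that trimmed prompt could be assembled and sent to the built-in model. The `LanguageModel` entry point is an assumption based on the Early Preview docs (the API surface has shifted between Chrome versions), so treat the session code as illustrative, not definitive.

```javascript
// Build the exact prompt template quoted above from a user query and
// the concatenated bookmark text.
function buildPrompt(userQuery, documentText) {
  return [
    "Below is a list of web pages that I have bookmarked related to:",
    "",
    userQuery,
    "",
    "Please tell me what my bookmarks say about this.",
    "",
    "Here are the bookmarks:",
    "",
    documentText,
  ].join("\n");
}

// Hypothetical usage against the Prompt API (names may differ in your
// Chrome build); only buildPrompt() is exercised outside the browser.
async function askAboutBookmarks(userQuery, documentText) {
  const session = await LanguageModel.create(); // assumed entry point
  return session.prompt(buildPrompt(userQuery, documentText));
}
```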

If we were going to speculate (wildly), it seems like Gemini Nano is an extraordinary beast. They managed to find a sweet spot of maximum power, as long as you use language that maps to a known, common cognition pattern (like a user asking about their bookmarks or about a document). So rather than trying to micro-manage the LLM's "thinking" with complex, highly specific prompt instructions, simpler is better. I saw in your video how you used "logic boundaries" to corral the LLM, like when you told it specifically not to put text outside the synonym list and that an answer should appear only once, but that was a structured-data context. So perhaps in more purely semantic contexts like mine, "less is better" with Gemini Nano.

You can see how well Gemini Nano performed in the RAG search my Chrome extension runs. The search uses a subset of my Canary browser bookmarks as "grounding attributions", culled during a cosine similarity search with the help of a locally loaded Jina embeddings model managed by transformers.js. See the screenshot: the top text area is the LLM answer from the Prompt API, and the bottom text area contains the matching bookmarks, with summaries that were created by the Summarization API when each bookmark was added.
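The ranking step of that pipeline can be sketched without the model itself. In the extension the vectors would come from an embeddings model loaded through transformers.js; here they are plain arrays, and the bookmark shape (`title`, `embedding`) is hypothetical.

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cull the top-k bookmarks whose embedding best matches the query
// embedding; the survivors become the grounding attributions.
function topMatches(queryVec, bookmarks, k = 3) {
  return bookmarks
    .map((bm) => ({ ...bm, score: cosineSimilarity(queryVec, bm.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The selected bookmarks (with their stored summaries) would then be joined into the `${documentText}` slot of the prompt above.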

[Screenshot: bookmark-1.png]

Thomas Steiner

Dec 2, 2024, 11:19:19 PM
to Robert Oschler, Chrome Built-in AI Early Preview Program Discussions, Thomas Steiner
Hi Robert,

I agree. In the beginning I was fussing a lot with lengthy prompts, n-shot prompting, and being overly specific, but it does seem like simpler means better results, especially in more free-form cases where you expect the model to be a little creative.

Cheers,
Tom