Auto-injecting WebMCP tools on existing sites — am I solving the wrong problem?

Benjamin Amos Wagner

Apr 13, 2026, 5:16:07 AM
to Chrome Built-in AI Early Preview Program Discussions

Hi everyone,

I've been lurking here for a while, learning a ton from threads like 정성우's TaskFlow integration, Alex McManus's complex SaaS feedback, and the Angular team's early findings. Really impressive work happening in this group.

A bit of background on me: I'm not a super experienced developer, but I've been kind of obsessed with this space for over a year now. It started when I wanted to build an insurance premium calculator that AI agents could actually use — like, an agent could go to a site, input the parameters, and get a quote back. I ended up building PrimAI (an OpenAPI-based approach), and that works pretty well. But it got me thinking: most businesses aren't going to build custom APIs for AI agents. There are millions of websites with forms, calculators, booking widgets — and agents can't reliably use any of them.

That obsession led me to WebMCP, and I built OpenHermit (https://github.com/openhermit/openhermit-js) — a drop-in script that auto-detects forms and interactive elements on a page, tries to figure out what they do, and injects WebMCP attributes so agents can discover and use them. One script tag, zero config.
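
To make the idea concrete, here's a minimal sketch of what I mean by auto-detection — simplified and with hypothetical names, not OpenHermit's actual code (the attribute names and helper are illustrative):

```javascript
// Hypothetical sketch of auto-detection — not OpenHermit's actual implementation.
// Derives a tool name/description from visible text only (labels, placeholders).
function deriveToolInfo(fields) {
  // fields: [{ label, placeholder, type }] — visible text per form control
  const parts = fields
    .map(f => f.label || f.placeholder || f.type)
    .filter(Boolean);
  const slug = (parts[0] || 'unknown').toLowerCase().replace(/[^a-z0-9]+/g, '_');
  return {
    toolname: 'form_' + slug,
    tooldescription: 'Form with fields: ' + parts.join(', '),
  };
}

// Browser-only: scan forms and inject attributes. Guarded so the pure logic
// above can also be exercised outside a browser.
if (typeof document !== 'undefined') {
  for (const form of document.querySelectorAll('form')) {
    const fields = [...form.querySelectorAll('input, select, textarea')].map(el => ({
      label: el.labels && el.labels[0] ? el.labels[0].textContent.trim() : '',
      placeholder: el.getAttribute('placeholder') || '',
      type: el.type || el.tagName.toLowerCase(),
    }));
    const info = deriveToolInfo(fields);
    form.setAttribute('toolname', info.toolname);           // attribute names assumed
    form.setAttribute('tooldescription', info.tooldescription);
  }
}
```

The real script does more inference than this, but the shape is the same: visible text in, generic description out — which is exactly what I'm second-guessing below.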

But honestly, I'm not sure the core idea is right. I'd really appreciate honest feedback — even if it's "this is the wrong approach."

Here's what's making me doubt it:

  1. Auto-generated tool descriptions might be worse than nothing. Alex McManus built a hand-crafted "toolmap" meta-tool for his complex SaaS that gives agents a JSON overview of all available tools. 정성우 exposed clean domain-level intents (create_task, move_task) through TaskFlow's runtime. Both approaches produce high-quality, semantic tool declarations. My auto-injection can infer "this looks like a contact form" but the descriptions are generic. Does a mediocre auto-generated tooldescription actually help agents, or does it create false confidence?
  2. The timing problems are brutal. Alex found he needed 1000ms setTimeout delays because Chrome + Gemini couldn't pick up dynamically registered tools mid-prompt. Auto-injection on SPAs is even harder — I'm fighting MutationObserver timing, React synthetic events, route transitions. Maybe auto-injection only works on static sites, which are the easiest sites to annotate manually anyway.
  3. Security is genuinely unsolved. If any third-party script can inject toolname and tooldescription on any form, that's a prompt injection vector. I only generate descriptions from visible text (labels, placeholders), but I'm not confident that's enough.
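
On point 2, the approach I'm currently experimenting with is a debounced re-scan so that a storm of DOM mutations (route transition, React re-render) collapses into one re-injection pass — a simplified sketch, with Alex's 1000ms figure as the default delay (scanAndInjectTools is a placeholder name, not a real API):

```javascript
// Collapse bursts of DOM mutations into a single delayed re-scan.
// Simplified sketch — not the actual OpenHermit code.
function makeDebouncedRescan(rescan, delayMs = 1000) {
  let timer = null;
  return function schedule() {
    if (timer) clearTimeout(timer);  // reset the countdown on every new mutation
    timer = setTimeout(() => {
      timer = null;
      rescan();                      // re-inject tools once the DOM has settled
    }, delayMs);
  };
}

// Browser-only wiring:
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  const schedule = makeDebouncedRescan(() => scanAndInjectTools(), 1000);
  new MutationObserver(schedule).observe(document.body, {
    childList: true,
    subtree: true,
  });
}
```

Even with this, I'm still not sure the agent reliably sees tools that get (re)registered mid-prompt, which is Alex's original timing problem.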

What I'm actually trying to figure out:

Coming from the insurance/business side, what I really care about is: how do we make it so an AI agent can go to any business website and actually do things — get a quote, book an appointment, fill out a form — without that business needing to hire a developer to build a custom integration?

Maybe auto-injection is the wrong answer. Maybe it's more about a structured sitemap for agent capabilities, or a config file that business owners can fill out. Maybe it's about the analytics side — helping businesses understand which agents visit and what they try to do. I'm genuinely not sure.

Questions I'd love your honest take on:

  1. Is auto-generated tooldescription useful to agents, or is it noise that makes things worse?
  2. Should auto-injection be a dev/debugging tool rather than a production solution?
  3. For those testing with Gemini or Claude via MCP-B — do agents actually perform better with auto-injected WebMCP vs. just DOM parsing?
  4. Is the real opportunity on the analytics/visibility side rather than the declaration side?
  5. For anyone building in this space — would love to collaborate, especially on figuring out the right patterns for non-technical site owners.

Thanks for reading this far. Happy to be told I'm solving the wrong problem — that's more useful than encouragement at this point.

Benjamin b...@expat-savvy.ch | github.com/openhermit/openhermit-js

Alex McManus

Apr 13, 2026, 12:28:35 PM
to Chrome Built-in AI Early Preview Program Discussions, Benjamin Amos Wagner
Hi Benjamin,

My initial thought is that anything your analysis script does, the agent can already do itself by analysing the DOM. My (limited) experience with MCP-B is that it does a pretty good job with the DOM alone. That said, WebMCP is much faster, and with well-crafted tools, more accurate.

It sounds like the bigger gain from your tool would be performance and lower token usage, given the agent would no longer have to parse the DOM. MCP-B quite often resorts to screenshots too, which take some time to analyse. Whether the owner of a business website cares about that, I couldn't say - given the costs are borne by the user and the agent may well be autonomous.

Presumably the agent would still need to parse the DOM or take screenshots to understand the outcome of submitting the form?

From my perspective, it would add more value if you could:
- get an LLM to build tool descriptions for the forms
- allow the business owner to improve them and correct any errors
- auto-inject the corrected descriptions for site visitors.
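
Concretely, the correction step could be as simple as an overrides file the owner edits, merged over the LLM drafts before injection — a hypothetical shape, nothing to do with any real OpenHermit or WebMCP format:

```javascript
// Hypothetical: merge LLM-drafted tool descriptions with owner corrections.
// drafts:    [{ toolname, tooldescription }]
// overrides: { [toolname]: { tooldescription?: string, ... } }
function applyOwnerOverrides(drafts, overrides) {
  // Owner-supplied fields win; untouched drafts pass through unchanged.
  return drafts.map(d => ({ ...d, ...(overrides[d.toolname] || {}) }));
}
```

The merged result is what the script would inject for site visitors, so the owner only ever edits a small config rather than touching the page.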

Cheers, Alex.