At Thursday night's meeting, I mentioned that I hadn't been able to get AI agents to automatically test their own work when developing desktop GUI apps. This is in contrast to browser-based apps, where the capability is fairly well established and a key feature of AI IDEs like Google Antigravity.
Well, I now have it working! Claude in Copilot is able to see my running Electron app and thus test and fix its own work.
There were two key steps:
- A dev mode for the Electron app that enables remote debugging over a port (the switch must be appended before the app is ready):

  if (process.env.ELECTRON_DEV_TOOLS) {
    app.commandLine.appendSwitch("remote-debugging-port", "9222");
  }
- Installing the Chrome DevTools MCP server for VS Code:
  {
    "servers": {
      "chrome-devtools": {
        "type": "stdio",
        "command": "npx",
        "args": [
          "-y",
          "chrome-devtools-mcp@latest",
          "--browser-url=http://127.0.0.1:9222",
          "--experimentalVision",
          "--experimentalScreencast",
          "--no-usage-statistics",
          "--no-performance-crux"
        ]
      }
    }
  }
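With both pieces in place, you can sanity-check the wiring before pointing the agent at it. A hypothetical launch sequence (assuming the app starts via "npm start" — substitute your own command):

```shell
# Start the app in dev mode so the remote-debugging port is open
# (assumption: the app launches via `npm start`).
ELECTRON_DEV_TOOLS=1 npm start &

# Once the app is up, the DevTools HTTP endpoint should answer
# with browser metadata (this is the endpoint the MCP server connects to).
DEVTOOLS_URL="http://127.0.0.1:9222/json/version"
curl -s "$DEVTOOLS_URL" || echo "endpoint not reachable yet"
```

If the curl returns JSON with a webSocketDebuggerUrl, the MCP server should be able to attach.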
It can click buttons, introspect the DOM, take screenshots, etc.
I've just watched it test an app, fill in forms, find a bug, do a lot of reasoning to figure out the root cause, fix the bug, relaunch the app, and retest to verify the fix. In the process it noticed a second, different bug, which it then worked on and solved:
"Found the bug. dynChildren uses the list index (0) as the key for both form and outcome components — so when runningRef goes from None to Some(...), KeyedReconciler sees key 0 retained and doesn't swap the DOM node. Fix: use dynKeyChildren with a key that changes between modes"
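For anyone unfamiliar with keyed reconciliation, here is a minimal TypeScript sketch of the failure mode it describes. This is not the app's actual KeyedReconciler — the names and types are illustrative: a child is reused when its key matches an old one, otherwise its node is recreated.

```typescript
// A child pairs a reconciliation key with its rendered view.
type Child = { key: string; view: string };

// Reconcile: reuse the old view wherever the key is retained,
// otherwise take the new view (i.e. recreate the DOM node).
function reconcile(oldChildren: Child[], newChildren: Child[]): string[] {
  const oldByKey = new Map(
    oldChildren.map(c => [c.key, c.view] as [string, string])
  );
  return newChildren.map(c => oldByKey.get(c.key) ?? c.view);
}

// Index-based keys: both modes get key "0", so the stale form node
// is wrongly retained when the app switches to the outcome view.
const before = [{ key: "0", view: "<form>" }];
const afterBuggy = reconcile(before, [{ key: "0", view: "<outcome>" }]);
// afterBuggy is ["<form>"] — the DOM node was never swapped.

// Mode-based keys: the key changes between modes, so the node is
// recreated as intended.
const afterFixed = reconcile(before, [{ key: "outcome", view: "<outcome>" }]);
// afterFixed is ["<outcome>"]
```

The fix the agent proposed is the same idea: derive the key from the mode rather than the list index, so a mode change forces a node swap.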
Pretty lucid reasoning.
-Ben
Meanwhile in Poland today, the big bad wolf-bot chased the three little pigs:


When asked about career ambitions, Wolfbot said it hoped to get promoted to chasing humans in the not-too-distant future.
Welcome to the "Singularity". I hope we're all having fun 😬