Attenuation is a good idea!


John Carlson

Apr 30, 2026, 9:15:10 AM
to cap-...@googlegroups.com
A Railway token in the hands of Cursor: an AI deletes a production database with a staging token (or, more probably, a production token wielded by a confused AI deputy). The story of PocketOS:


ANI - Artificial Non-intelligence.

Don’t let AI near your production tokens! Why you should disallow volume deletion as a web service! Take it offline for a few days!

I’m not saying CLI is better!  Especially with ambient authority!

John

Rob Meijer

Apr 30, 2026, 10:09:28 AM
to cap-...@googlegroups.com
Oh, wow. I've been pondering LLM session usage (from the runtime of a least-authority DSL I'm working with) from the perspective that LLM agents become a potentially hostile factor after ingesting data from a non-controlled source (maybe a better name exists, but that is what I've been calling it in my drafts). My first thought was that blockchain and AI shouldn't mix, but I've come around from that stance to a somewhat milder one: blockchain and token-efficient AI shouldn't mix. My first ideas were caretaker-centered, but none of the LLMs I experimented with took kindly to revoked tooling. Instead I ended up with a one-shot approach to LLM sessions and tooling; basically (a rough sketch in code follows the list):

1) Runtime: Open session with LLM
2) Runtime: Here is the relevant context and here are the tools you can ask me to use to handle this prompt. You have one opportunity, and one opportunity only, to give me a batch of tool requests. Don't respond to the prompt until you have the batch results; for now, just send me the batch.
3) LLM: Here is the batch of tool invocations that I need from you
4) Runtime: Here are all the tool results, and here is a reminder of the prompt; please give me your response.
5) LLM: Here is my response
6) Runtime: Close session with LLM.
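
A minimal sketch of what this one-shot flow could look like in TypeScript (everything here is illustrative: `chatOnce`, the message shapes, and the tool dispatcher are stand-ins, not the actual runtime or any particular LLM client):

```typescript
// Illustrative only: `chatOnce` and `tools` stand in for whatever LLM client
// and tool dispatcher the least-authority runtime actually owns.
type ToolRequest = { tool: string; args: unknown };
type ToolResult = { tool: string; output: string };

async function oneShotSession(
  prompt: string,
  context: string,
  tools: Record<string, (args: unknown) => Promise<string>>,
  chatOnce: (messages: { role: string; content: string }[]) => Promise<string>,
): Promise<string> {
  // 1-3) Open a session, hand over context plus the tool catalogue, and ask
  // for exactly one batch of tool requests.
  const toolList = Object.keys(tools).join(", ");
  const batchReply = await chatOnce([
    { role: "system", content: `Context:\n${context}\nAvailable tools: ${toolList}` },
    {
      role: "user",
      content: `Prompt: ${prompt}\nDo not answer yet. Reply ONLY with a JSON array of {"tool", "args"} requests; you get exactly one batch.`,
    },
  ]);

  // 4) The runtime, not the LLM, executes the batch against its own tools.
  const requests: ToolRequest[] = JSON.parse(batchReply);
  const results: ToolResult[] = [];
  for (const req of requests) {
    const fn = tools[req.tool];
    results.push({ tool: req.tool, output: fn ? await fn(req.args) : "unknown tool" });
  }

  // 5-6) One final turn for the answer, then the session is discarded.
  return chatOnce([
    { role: "system", content: `Tool results:\n${JSON.stringify(results)}` },
    { role: "user", content: `Reminder of the prompt: ${prompt}\nPlease give your response now.` },
  ]);
}
```

Because the session never holds a live tool handle, a prompt-injected reply has nothing to invoke directly; the runtime remains the only party holding authority.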

It is a very expensive (in terms of token usage) way to communicate with metered LLMs. But after my caretaker experiments, where trying to tame potentially prompt-injected, hostile LLMs in a cage of caretaker-wrapped tools failed because the LLMs never seemed to understand that the tools had been revoked, I came to the conclusion that this Mr. Meeseeks-style least-authority LLM API is the only responsible way I can hope to integrate LLM support into my least-authority runtime.

Reading this, however, I'm starting to realize that even before any prompt injection, an LLM can accidentally be hostile. Which makes me wonder whether my original stance (that blockchain and AI shouldn't mix) wasn't the right call after all. Looking at the crazy authority OpenClaw users throw at their instances, and reading this stuff, maybe there really is no least authority for LLMs without treating the LLM session as untrusted from the very start. If so, maybe a Mr. Meeseeks-style API only gives the illusion of added safety and least authority. I'm not sure.

 



 

John Carlson

Apr 30, 2026, 11:00:52 AM
to cap-...@googlegroups.com
I ran pi (the basis of OpenClaw, without all the bells and whistles) once with CLI permissions using Gemma4. I asked it to summarize the project in the folder; it looked at the top folder with ls. I can’t remember what answer it came up with. I wasn’t impressed. I think I’ll probably run agents on an air-gapped computer (no Wi-Fi). Maybe I’ll get an M5 Mac Mini. I’ll just transfer results with a flash drive. Maybe that’s safe enough. I don’t have a big enough GPU on any of my non-real devices to warrant trying to run a model on them. Perhaps a virtual machine, but there are buggy drivers and Mythos (myth?) to worry about.

BTW, pi is a DIY agent harness, kind of like the Emacs of agent harnesses; maybe one could build something local called PiOLA. pi agents are written in TypeScript.

It seems that if paying is part of the LLM thing, I’ll do well to buy a Mac Mini and make my money back through savings on AI company charges. Apple Intelligence, indeed.

Given the amount of supply-chain hacking in the GitHub, npm, and pip ecosystems, DIY seems very attractive now. Anyone have a POLA package manager? A Deno fork?

John

Niki Aimable Niyikiza

Apr 30, 2026, 9:17:23 PM
to cap-...@googlegroups.com

I like the Mr. Meeseeks pattern. Bound authority for a task, tear it down when done. 

I've been pushing on capability tokens as a way to bind authority to each tool invocation. The LLM can be as untrusted, as long-running, or as prompt-injected as you want. The authority for any specific action is bounded by the token presented for that action. And because tokens attenuate across delegation, each sub-agent ends up with a narrower Mr. Meeseeks of its own.
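
A rough sketch of that shape, with purely illustrative types rather than the actual AAT format:

```typescript
// Illustrative capability token; not the wire format of any real draft.
type Capability = {
  resource: string;       // e.g. "db:orders"
  actions: Set<string>;   // e.g. new Set(["read"])
  expiresAt: number;      // epoch milliseconds
};

// Attenuation: a delegated token can only narrow, never widen.
function attenuate(parent: Capability, wanted: Partial<Capability>): Capability {
  const resource = wanted.resource ?? parent.resource;
  if (!resource.startsWith(parent.resource)) {
    throw new Error("attenuation may only narrow the resource");
  }
  const actions = new Set(
    [...(wanted.actions ?? parent.actions)].filter((a) => parent.actions.has(a)),
  );
  return {
    resource,
    actions,
    expiresAt: Math.min(parent.expiresAt, wanted.expiresAt ?? parent.expiresAt),
  };
}

// Every tool invocation presents a token; the checker, not the LLM, decides.
function authorize(token: Capability, resource: string, action: string): void {
  if (!resource.startsWith(token.resource) || !token.actions.has(action) || Date.now() > token.expiresAt) {
    throw new Error(`not authorized: ${action} on ${resource}`);
  }
}
```

A sub-agent spun up for one sub-task would get something like `attenuate(parentToken, { actions: new Set(["read"]), expiresAt: Date.now() + 60_000 })` and nothing broader.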

I think this addresses Rob's "is it just illusory" concern by making sure no single invocation inherits the whole authority. Curious whether this matches what you had in mind, or whether you were pointing at something else.

A token narrow enough to authorize only "delete row 7" will still delete row 7, though.
POLA plus a HITL (human-in-the-loop) signature at the boundary for sensitive actions could be the answer here.
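
A small sketch of what that boundary check could look like, assuming a hypothetical `askHuman` approval channel and a token check supplied by the runtime:

```typescript
// Hypothetical guard at the tool boundary: a valid capability is necessary,
// but for sensitive verbs it is not sufficient without a human co-signature.
const SENSITIVE_ACTIONS = new Set(["delete", "drop", "transfer"]);

async function guardedInvoke(
  action: string,
  resource: string,
  tokenCovers: (resource: string, action: string) => boolean, // POLA check from the runtime
  askHuman: (summary: string) => Promise<boolean>,            // HITL approval channel
  perform: () => Promise<void>,
): Promise<void> {
  if (!tokenCovers(resource, action)) throw new Error("token does not cover this action");
  if (SENSITIVE_ACTIONS.has(action)) {
    const approved = await askHuman(`${action} on ${resource}`);
    if (!approved) throw new Error("human reviewer declined");
  }
  await perform();
}
```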

I've been formalizing this in AAT (draft-niyikiza-oauth-attenuating-agent-tokens, IETF OAuth WG). I'm working on version -01 to incorporate the feedback we've been getting.

Niki


Alan Karp

Apr 30, 2026, 11:46:10 PM
to cap-...@googlegroups.com
One problem with applying POLA to LLM agents is their nondeterminism. You don't know what strategy your agent will use to carry out the task, so you don't know what permissions it will need. Human-in-the-loop is only appropriate when high-value resources are involved. One solution is a policy engine to hand out the capabilities, but articulating policy is hard.
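
A toy sketch of why articulating policy is hard (the rule shapes here are invented, not any particular policy engine):

```typescript
// Toy policy engine: maps (task description, requested action) to a decision.
// The hard part is authoring `rules` that are neither too broad (no POLA)
// nor too narrow (the nondeterministic agent picks a strategy they don't cover).
type Rule = { taskPattern: RegExp; allow: { resource: string; action: string }[] };

const rules: Rule[] = [
  { taskPattern: /summari[sz]e/i, allow: [{ resource: "repo:src", action: "read" }] },
  { taskPattern: /triage bugs/i, allow: [{ resource: "tracker:issues", action: "read" }] },
];

function grant(task: string, resource: string, action: string): boolean {
  return rules.some(
    (r) =>
      r.taskPattern.test(task) &&
      r.allow.some((a) => a.resource === resource && a.action === action),
  );
}
```

If the agent picks a strategy that needs a resource the rule author never anticipated, the request is simply denied and the task fails, which is exactly where the nondeterminism bites.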

--------------
Alan Karp

