Bruce Schneier on prompt injection: "We need some new fundamental science of LLMs before we can solve this."
SemiAnalysis thinks that GPT-5 set the stage for future monetization of the superapp.
A new important concept: "vibehacking" / "vibepwned"
Anthropic: "Agentic AI has been weaponized. AI models are now being used to perform sophisticated cyberattacks, not just advise on how to carry them out."
Remember, the same tools that could benefit you with infinite software will also be available to people who want to extract something from you or do you harm.
The threat distribution is not static; it evolves and adapts.
An observation from a commenter on Hacker News: "when you democratize coding, you democratize abuse."
A new LLM attack: the CopyPasta License Attack.
It takes advantage of the fact that humans don’t actually read license blocks, but LLMs do.
Infinite patience strikes again.
Anthropic changed their policy to train on messages in the Claude consumer experience.
This is a small signal they don’t believe AGI is right around the corner.
Previously, the stance of the labs was “we have so much headroom, we don’t even need the querystream.”
Now the tone is “oh crap actually we do need that data.”
Matt Webb makes the case that the end point of AI UX is Do What I Mean (DWIM).
David Galbraith: "I believe that the inversion of the docs-in-apps to apps-in-docs model is another component that is key to unlocking part of what forms an AI OS from a UX perspective"
Treating an LLM like a person is a bug.
A bug that leads to all kinds of weird UX and has downstream second order implications for society.
The form factor of chatbots fundamentally leads to weird sycosocial outcomes.
Having a single personality, presenting as human-like.
It tricks us into acting like the LLM is a person.
If you look at the responses to any of Sam Altman’s tweets now, it’s mostly people saying things like “The new version of ChatGPT feels like you murdered my friend.”
Even if the chatbot providers wanted to move away from sycosocial relationships, a loud part of the userbase would scream bloody murder.
A default view of any chatbot super-assistant: “a customer service agent for your life.”
A weirdly social and yet heavily conflicted role.
A signpost: the first murder-suicide case associated with AI Psychosis.
This week I watched a very effective writer’s output be critiqued as “sounding like AI.”
This writer doesn’t use AI in his writing, and was deeply offended at the critique.
Creating high-quality, convincing arguments used to require significant skill.
Now LLMs make it so anyone can spin them up at a moment’s notice.
That cheapens even truly well-written and effective argumentation.
Will we have to evolve a new form of rhetoric that is distinctly non-LLM sounding?
LLMs do a good job of reflecting back the high-quality thing that society has found.
The more high-quality it is, the more likely people are to replicate it, and the more likely it is to be heavily sampled in the LLM.
But that fundamentally cheapens the highest quality outputs of society.
Doing open-ended / 3P UI safely in an app is extremely hard.
That’s why the UX is often chat to start (e.g. in WeChat), but it's hard to move beyond a chat modality.
The way to integrate LLMs into your life with a non-chat UI is vibe coding, but the same-origin paradigm sets a low ceiling that prevents it from reaching the mass market.
Today if you vibe code a tool, you can't share a version of it that contains data with anyone.
I want a tool to vibe code with my personal data. Safely.
It will require collaboration.
My canonical personal data has to be synchronized with others (e.g. my husband.)
So that implies it can’t just be vibe coding individually or locally on your data.
MCP is about interacting with your personal data, but with prompts, not UI.
Vibe coding needs a way to expand beyond the tinkerers.
Chat is not enough; you need vibe coding to get a UI.
Vibe-coded software without chat is too stodgy, too hard to change.
But once you add a chat component in the system, it allows you to get the best of both.
You can start off with the flexibility of a chat, and then as you get more momentum with a use case, harden it over time into more mechanistic, repeatable, UX friendly software.
With LLMs, the benefits seem to have shifted decisively to typed languages.
Typed languages are much easier to iterate within without breaking things.
The problem is that they require more effort to change things, which makes prototyping hard because humans get impatient.
But LLMs have infinite patience.
So the value of types far outweighs their cost when LLMs are the ones programming.
Is TypeScript the Schelling point language?
It’s ubiquitous and flexible.
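As a minimal sketch of why types pay off for LLM-driven iteration (the names here are illustrative, not from any real codebase): change a type's shape, and the compiler enumerates every broken call site as a mechanical checklist the model can grind through.

```typescript
// Hypothetical refactor: `total` (in dollars) became `totalCents`.
interface Invoice {
  id: string;
  totalCents: number;
}

function formatTotal(invoice: Invoice): string {
  return `$${(invoice.totalCents / 100).toFixed(2)}`;
}

// Any caller still passing the old shape fails `tsc` with the exact file
// and line. "Did I break something?" becomes a checklist an infinitely
// patient LLM can work through until the build is green.
const ok = formatTotal({ id: "inv-1", totalCents: 1999 });
console.log(ok); // "$19.99"
```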
A tweet: "oh lord, how did i not think of this before? giving claude ast-grep for code searches and refactors has turned it into an unstoppable coding monster"
Manipulating the AST gives leverage over manipulating strings of source code.
AST is hard for humans to understand, but not for LLMs!
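The tweet is about ast-grep, but the underlying leverage is easy to see with the TypeScript compiler API directly. A hedged sketch (the snippet and file name are invented for illustration): search the tree instead of the text, so comments, strings, and look-alike identifiers can't produce false hits.

```typescript
import * as ts from "typescript";

// A naive string search for "fetch" would match the comment and the
// `prefetch` identifier below; an AST search matches only the real call.
const source = `
  // fetch the data (a comment, not a call)
  const prefetch = () => {};
  fetch("/api/items");
`;

const file = ts.createSourceFile("example.ts", source, ts.ScriptTarget.Latest, true);

function visit(node: ts.Node): void {
  if (
    ts.isCallExpression(node) &&
    ts.isIdentifier(node.expression) &&
    node.expression.text === "fetch"
  ) {
    console.log("call found:", node.getText(file));
  }
  ts.forEachChild(node, visit);
}

visit(file); // logs only: call found: fetch("/api/items")
```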
LLMs have the patience to deal with complicated types.
Even if they don’t understand it at first, they are willing to try again and again and again until they get the tests to pass.
Agents can swarm on the P2 features for you and work through your backlog.
The question is now more how do you verify the fixes are right?
The coding becomes cheap, the bottleneck is verification.
P2 features are ones that would add value, but will almost always fall below the line of being worth doing.
Either because the coordination cost to build them is too much.
Or because the feature isn’t worth complicating the UI for the benefit of only some users.
The line gets higher the larger an app’s audience and the larger the team that has to coordinate.
There's a whole missing galaxy of P2 features.
The feedback loop on model improvement is "how long does a ground truth check take?"
Do you need to talk to a human?
Do you need to run a physical experiment with expensive inputs?
If it can be done entirely at computer speed, then it can improve quickly.
Pizza-Fax use cases are the perfect smoke-and-mirrors demos that point toward where a new disruptive system could go.
In 1994, Pizza Hut's PizzaNet let you order pizza online in Santa Cruz—but all it did was route your order to Wichita and back, then have someone call you to verify.
By 1995, World Wide Waiter in Silicon Valley had an even better trick: their web form just faxed your order to the restaurant.
Pure theater!
A CGI script sending a fax to a confused pizzeria.
But that theater was the scaffolding for the future.
The users didn't need to know it was held together with duct tape and fax machines.
They just needed to see that ordering food online was possible.
This is the beauty of early-stage demos: they don't have to be real, they just have to be believable enough to bootstrap belief in the future.
The shared belief encourages people to invest in the future, and make it real.
The killer app is the app that leans into the distinct capabilities of the new platform.
They're easier to see in retrospect than ahead of time.
People demoing their own useful software will be underwhelming for others.
But the fact that you have 50 people who can't live without it is what matters, even if you don't find their use cases that compelling.
Try it, soon you'll find your own killer personal use case you can't live without.
Infinite software can’t come in the form of infinite apps.
It would be too overwhelming.
Instead with infinite software, the software will melt away.
It will feel like your data coming alive.
Your data adapting to your needs and aspirations at that moment.
Ultra-wealthy people have whole family "operating systems" to help run their family’s logistics.
It’s necessary for people with so many homes and logistics, but it would be useful for most families in some form.
What if everyone could have a family operating system that was optimized for their specific family?
In the world of infinite software, it’s feasible for the first time!
Incentives beat intentions.
Especially the incentives of the entity that structures and controls what you see.
We think in terms of the end state, not the complex diffusion process of getting there.
It's easier to envision protopia or dystopia because that's pure.
But the real world will be some messy mix.
Humans settle on the efficient frontier of cavalierness and carefulness because it's an iterated game.
It balances nicely, automatically.
But agents are single use, so there isn't as much balance.
They are either too locked down or too cavalier.
My friend Matt Holden with a piece on Markdown coding.
“Programming with intent, not syntax”
Another argument that specs will become more important than code.
I’ve heard of a number of people having multiple monitors with a few agents each.
It felt to them like having five TLs working async on different features.
There's always a thing for the human manager to triage, the human is never waiting.
The agents don’t mind waiting for input from the human, unlike real humans.
This works as long as the agents aren't interacting across projects.
If they are then the coordination headwind emerges again.
If you don't know if a given thing is possible, spin up three agents to try to do it and see which ones get stuck.
You can spin up infinite interns now!
An emergent dystopia: an agent that is given the goal of “make money” but that anyone can buy.
What could possibly go wrong?
Humans feel shame (at different thresholds) which helps dampen crass, cynical ways of making money or even scamming.
But agents won’t have that shame, it’s not an iterated game to them.
The way the ads industry works today is both better and worse than most users think.
Imagine going to Pottery Barn’s site, and then later to a publisher and seeing a Pottery Barn ad.
Most users assume that Pottery Barn knows what other sites you’re visiting, and that the publisher knows that you visited Pottery Barn.
Actually neither Pottery Barn nor the publisher know about each other.
But a company you’ve never heard of knows both!
That company you’ve never heard of will have less hesitation about selling that data to someone else.
Text is high dimensional but takes time to make precise.
Clicks are fast and precise but low dimensional.
The combination is useful!
There’s a difference between “create a chat” and “create a chatbot” in an AI-native system.
The former encourages the mental model that you’re talking to the omniscient service in a new thread.
The latter encourages the mental model of spinning up a specific chat thread with an entity that is separate from the system.
Everyone wants to create their own roach motel, but no one wants to live inside of someone else's.
Your own roach motel you just think of as a "motel," so you don't see why no one else wants to stay in it.
All network effects have to be bonuses, not the primary use case.
You don't come for the bonus, you stay for the bonus.
A single-user use case has to be the primary use case.
Otherwise, the network effect can never get started in the first place.
Most software ecosystems need developers to write software for someone else in order for the ecosystem to get off the ground.
That's effectively a network effect that has to get off the ground before the primary use case.
A bonus that has to be a primary use case.
It only works when the bonus is so juicy that other people get on board.
Or there's a 1P use case that gets it going enough to make the bonus start to activate.
What about software ecosystems that allow software to be written by enthusiasts for themselves… and then safely auto-distributed to other people who need it?
Open ecosystems to counter hyper-aggregators are hard to get off the ground, due to a “belling the cat” problem.
Everyone agrees that it would be great if there were an open alternative with momentum, and that they’d join in if that were the case.
But no one is willing to be the first to take the leap of faith and contribute their valuable proprietary data to the collective.
For example, the value of apps is fundamentally tied to the density of data they are able to accumulate within their silo.
To create an open ecosystem requires providers contributing their data into the collective, but then they give up their individual leverage.
If you give up your individual leverage in favor of a collective that never gets off the ground, you are left with nothing.
So everyone individually chooses not to participate, and the collective never gets off the ground.
Stateful systems have lock-in.
LLM-assisted coding tools have low switching costs because they're stateless.
Products that have low switching costs will want to accumulate useful state to increase lock-in and decrease competition.
At some point, the compiler is smarter than you are.
Whether it can do your use case better than you can comes down to "how many other people had this problem before?"
The more likely other people have had it, the more likely it’s been optimized.
Not how often you’ve had this use case before, but the absolute number of how many people over time.
If it’s an area you aren’t a world expert in, then it’s likely the compiler is smarter than you.
How long a programming project takes a human depends on how challenging it is.
How long a programming project takes an LLM depends on how common a pattern it is in the world.
If it’s a common pattern, even if it’s challenging, the LLM will finish faster, with fewer mistakes and fewer loops.
Non-Confidential Compute security is “pinkie promise security.”
I love this frame from Tinfoil.
It kind of reminds me of HTTPS vs HTTP.
It’s kind of crazy that broken HTTPS was viewed as a dangerous bug on a domain… but HTTP was treated as a reasonable default.
Obviously HTTPS should be the default, it’s unthinkable it ever wasn’t that way.
The same will be true of Confidential Compute.
“Wait, in the past when interacting with something across the network you just took its word that it was running the software it said it was?”
Software is scary powerful.
How can we make it just powerful?
Don’t put policies on code.
Code is open-ended, inherently.
The halting problem, etc.
ACLs implicitly put policies on code: once code is granted access, it has open-ended use of the data.
Instead, put policies on data.
Data is close-ended.
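One hypothetical way to make "policies on data" concrete (nothing here is a real API; it's an information-flow-style sketch): the policy travels with the value, derived values inherit it, and the check happens at the sink no matter which code touched the data along the way.

```typescript
type Policy = "public" | "private";

// The policy is attached to the data itself, not to the code using it.
class Labeled<T> {
  constructor(readonly value: T, readonly policy: Policy) {}

  // Derived values inherit the label: code can compute freely, but the
  // result stays at least as restricted as its input.
  map<U>(fn: (value: T) => U): Labeled<U> {
    return new Labeled(fn(this.value), this.policy);
  }
}

// The sink enforces the data's policy, not an ACL on the caller.
function sendOverNetwork(data: Labeled<string>): void {
  if (data.policy !== "public") {
    throw new Error("policy violation: private data cannot leave");
  }
  console.log("sent:", data.value);
}

const address = new Labeled("123 Main St", "private");
const greeting = address.map((a) => `You live at ${a}`); // still private
sendOverNetwork(greeting); // throws, regardless of what code produced it
```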
Retrofitting a security model is basically impossible because security models are like gravity.
Every bit of code you lay down in your system assumes that gravity implicitly, in non-obvious ways.
Schrödinger's data is easier to put policies on.
Imagine an impenetrable box.
Whatever data and code you put in it, no signals can ever escape to the outside world.
Once sealed it can never be unsealed.
That means that no matter what happens in that box, it can’t have any consequences in the outside world.
In this thought experiment, it doesn’t matter what computation happens in the box.
Because nothing can communicate out of the box.
That means that arbitrary computation is safe.
Now imagine that instead there’s a box where data and code can flow inside, but never back out.
The only way for information to escape is to be beamed directly into your eyeballs.
You can still allow arbitrary computation.
"Revealed preferences" show that we like junk food, slop, and don't care about privacy.
That's what our lizard brains want, not what our higher brains want.
That is, it’s not what we want to want.
At quantifiable scale, revealed preferences can only show the lizard brain desires.
But what matters is what our higher mind wants.
TikTok is self-distributing content.
That's safe because the worst it can do is show you something offensive.
Self-distributing code is different because it can do things.
The downside risk is orders of magnitude higher.
TikTok, like all engagement-maxing hyper-scale services, optimizes for the revealed preferences of what we want, not what we want to want.
Typically the server is in charge, and the client is a dumb intermediary.
But there’s no reason the client can’t be in charge, and have the server be a dumb intermediary.
In that model, where does legitimate control emerge from?
The main question is: "which client does the user choose to log in on?"
Logging in on a client is like leaving a horcrux of yourself, it’s a high trust action.
State in Redux looks like a big JSON blob, but actually it's more like a Git tree.
Normally the difference doesn’t matter… until you need to pass the object across a surface that requires structured clone, like an iframe boundary.
Then that whole object has to be serialized, transmitted, and parsed even if only a small bit changed.
Redux state management patterns don’t work in contexts where components must be kept isolated.
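A minimal sketch of the difference (runnable in modern browsers, or Node 17+ where `structuredClone` is global): reducer-style updates share unchanged subtrees by reference, like a Git tree, but a structured-clone boundary flattens all of that sharing.

```typescript
type State = { settings: { theme: string }; todos: string[] };

const prev: State = { settings: { theme: "dark" }, todos: ["a", "b"] };

// Reducer-style immutable update: only the changed branch is re-allocated.
const next: State = { ...prev, todos: [...prev.todos, "c"] };

console.log(next.settings === prev.settings); // true: subtree shared by reference
console.log(next.todos === prev.todos);       // false: only this branch is new

// Crossing a structured-clone boundary (postMessage to an iframe or a
// Worker) serializes and copies the whole tree, even if one field changed.
const copied = structuredClone(next);
console.log(copied.settings === next.settings); // false: the sharing is gone
```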
Reactive software is a slog to build, but it is the only approach that enables collaborative, emergent software.
Operational Transform works because you have one entity controlling the app.
When that goes out the window you need to have things that can react to other things.
Developers today are used to thinking only about their own code on their own island.
In your own island, you control the data store and how updates happen to it.
That's easy!
You react to a single source of truth you control.
But being reactive to external systems is much harder, because you have to defend against updates you don’t control.
I want “place pages”.
Like “place cells” in our brain, that fire when you are in a certain place.
Geotag pages in my Personal System of Record.
When I access my PSR on my phone, it ranks the pages that are closest to where I am right now.
Similar to Getting Things Done’s notion of contexts.
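A hypothetical sketch of what that ranking could look like (`Page`, `rankByProximity`, and the page shape are all invented for illustration): sort geotagged pages by great-circle distance to wherever I'm standing.

```typescript
interface Page {
  title: string;
  lat: number; // geotag, decimal degrees
  lon: number;
}

// Great-circle distance between two points in kilometers (haversine).
function distanceKm(aLat: number, aLon: number, bLat: number, bLon: number): number {
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(bLat - aLat);
  const dLon = toRad(bLon - aLon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(aLat)) * Math.cos(toRad(bLat)) * Math.sin(dLon / 2) ** 2;
  return 6371 * 2 * Math.asin(Math.sqrt(h)); // Earth radius ≈ 6371 km
}

// "Place cells" for a Personal System of Record: the closer a page's
// geotag is to the current location, the higher it ranks.
function rankByProximity(pages: Page[], hereLat: number, hereLon: number): Page[] {
  return [...pages].sort(
    (a, b) =>
      distanceKm(hereLat, hereLon, a.lat, a.lon) -
      distanceKm(hereLat, hereLon, b.lat, b.lon),
  );
}
```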
For things that are resonant, everyone can agree they are good.
From almost any angle, they are at the very worst neutral and at the best they are great.
The "it factor” and "the quality without a name" are the same thing: resonance.
This video essay makes the case that Disneyland has a kind of resonance.
Even though it’s entirely manufactured, the creators of it were thinking about the indirect effects and sense of place, not just any given incremental change’s direct impact on the bottom line.
That long-term view is what has made it such a resonant place for so many for so long.
Disneyland is manufactured, not emerged.
But it does resonate with millions of people in ways that are deeper than just commercialism.
The MBA mindset of short-term / incremental gains will slowly weather it until nothing is left.
A powerful argument: a list of theses, all of which are individually immediately obvious but that together imply a conclusion much bigger than the sum of its parts.
Resonant Computing: The Timeless Way of Building... Software.
In the tech industry we talk about “Software Patterns.”
Very few know about the Gang of Four book that popularized the term.
Even fewer know that the reason we call them “patterns” has a direct lineage to Christopher Alexander’s A Pattern Language and The Timeless Way of Building.
You could argue that they missed part of the point about holistic emergence, focusing on mechanistic patterns.
Still, the inspiration is explicit.
The industry has been influenced by notions of resonance for much of its history.
It’s just up to us to regain that focus.
Instead of focusing on technical efficiency and maintainability, we should return to the notion of harmony and life.
"Well-being" is a word we use but don't think about what it represents.
That's what human thriving is.
Not just surviving, thriving.
A frame for moral actions: The actions that, if everyone did them by default, would create the world that you want to live in.
Related to the Golden Rule: it creates an asymmetry that makes the world better by default, drafting off what we want for ourselves as a north star to help orient.
Everyone has a different north star, but they all point in roughly the same direction, so the average direction of travel is consistent.
You can be at peace when you live aligned with your values.
No matter what happens, you are proud of the choices you made.
That's how your soul emerges.
Interdependent work needs very fast feedback / interaction loops, so does best in person.
Independent work can handle slower interaction loops.
Independent work happens in very mature systems (where to go is obvious from momentum), or in research systems (where convergence isn't that important).
Early stage product work is highly interdependent at the very least until the moment of PMF.
“You can just do things” … if you don’t care about externalities.
The people who say the first part often omit the second part.
If you care about the benefit to you but not the externalities, you can arbitrage society and extract from it in an antisocial way.
Prosocial things have positive externalities.
Great marketing is like being at a vibrant dinner party and having a friend tap you on the shoulder and share a secret.
What happens when you can really trust technology?
The system that can see everything in your life has to be like a close friend you really trust.
In order to be trusted, the chatbot has to pretend to be your friend.
It's a lie.
Your chatbot is a double agent.
It doesn’t work for you.
Two very different approaches to product improvement:
“Does the feature make the product better?”
Vs
“Does the feature move the metric we care about?”
The former is all that really matters.
The latter only matters as a proxy.
But we tend to focus on what’s easy to measure, not what’s important.
Technology-first emphasizes demos.
Product-first emphasizes usefulness.
Reliability: the system being able to do the things that you think it can do.
Anything that will take a month to get to a point of viability is not the next step.
Demoability is not enough to convince people, they have to use it to believe.
Research is "this should all work, build backwards."
Product is "what can we make work now that is useful."
Research can stay in the zone of non-viability for long periods of time.
Product aims to get to viability as quickly as possible and iterate from there.
It's better to have a simple thing that's usable than a more complex thing that's only demoable.
Feature requests for a demo come from outside.
Feature requests for a useful product come from inside.
Feature requests on demos come from your brain.
Thinking how you’d use it.
Intentional, expensive, low precision.
Feature requests on a useful product come from your gut.
Feeling what you want.
Automatic, cheap, high precision.
An infinite difference that is impossible to see but that changes everything.
Vertical solutions are hard to pivot if you misjudged what to build.
Vertical solutions tip over if you try to steer them too far away from where you started.
Horizontal things are shallow but wide, they don't tip over.
Horizontal solutions are more resiliently useful even if the world turns out different than you thought.
You can't just hand someone a task, you have to enroll them in it.
That includes giving them the information about the task, and then their authentically committing to it.
To commit, they have to think it is doable and desirable.
The threshold to clear the ‘desirable’ filter is lower if they believe in the person who handed them the task, or in the collective.
Quilting point: a robust, flexible join point between two systems.
If it's a rigid join between two systems it has to align perfectly along the whole length.
That requires expensive coordination.
A quilting point allows two systems to interface in a forgiving way, allowing independence in their development.
Being able to separate subcomponents’ development reduces coordination costs super-linearly.
Cosmos Institute: “America's strength comes from tension, not unity”
Diversity gives strength precisely because all of the inputs are not the same.
Diversity of input plus a shared belief in a collective higher good are an unbeatable combination.
For use cases other than digital gold, crypto solved the least interesting coordination problem in the most cumbersome way.
If everything is hyper-financialized, it would accelerate the short-termism of modern society.
Your browser history is financialized today: ads.
And it sucks!
Emergence is magic.
The emergent social imaginary is the most powerful force of emergence we interact with daily that operates on human time scales.
The Sarumans and the Radagasts are both wizards who can shape the emergent social imaginary in ways that defy the understanding of muggles.
As a leader do you allow your team to openly question the mission?
If so, you are more adaptable, but it’s harder to execute coherently.
If not, you can execute, but you will be less resilient.
Which end of the tradeoff makes sense shifts in different contexts.
It’s much harder to sell someone a thing they don't already believe in.
To play a game, you have to believe.
You have to be willing to treat even something seemingly arbitrary as important.
Throw the ball at someone who doesn't realize they're in a game and they don't just fail to catch it; they also say "why did you throw that thing at me?"
How do you make someone not mind gross tasks like picking up poop?
Make it part of caring for another entity that you feel ownership and infinite alignment with.
That’s one of the infinite differences between before being a parent and after being a parent.
Before being a parent it’s hard to imagine ever willingly picking up poop, even for your own child.
After being a parent it’s hard to imagine why you ever cared.
Science is about not-knowing.
Agents can't "not-know."
In finance, everything depends on the robustness of your correlation estimates.
Those are highly non-stationary.
When things correlate in ways you didn’t expect, you become either radically over-priced or under-priced insurance without knowing it.
The rate of bugs users experience in a system declines proportionately with use.
Each touch of the system is a time where a user could leave and never return, or file a bug that is fixed.
In the former case, the system now has one fewer user, so there’s less energy that might fix the bug… but also less chance that the bug is experienced again.
In the latter case, the bug gets fixed, and is now improved for all future interactions.
Each touch carries only a small chance of the user leaving or of a bug being filed and fixed, but every touch has that tendency, so if the functionality stays the same, the system gets better over time.
A fragile system is one that you have to hold lightly or you break it.
Demoable, but not usable.
Choosing to use a thing is a vote that it's valuable and should be persisted.
Swarms of votes over time are a kind of collective intelligence, smoothing a thing down into its core, useful components.
This Branch Education video on EUV lithography blew my mind.
It’s multiple orders of magnitude more complicated than I had realized.
It almost seemed like a parody, with the over-the-top complications it kept introducing.
Like something from a precocious 10-year-old's fever dream.
It’s wild that the whole modern economy relies on things made with such a process.
Strong foundations are easy to take for granted.
They’re always there so nothing varies, so it doesn’t feel causal.
At some point someone says “why do we even bother with this thing that’s not doing anything” and you remove it.
At first everything seems fine, but you’re in your Wile E. Coyote moment.
Doomed but you don’t even realize yet.
It's much easier to stay on top of something than catch up once behind.
Once you're behind, you have not only the steady pace to keep up with, but also your shortfall.
The more you fall behind, the further you have to go to catch up.
That compounds, and at a certain point the only move is to declare bankruptcy.
Stories get smoother over time.
Every time you retell a story you change it a bit to fit your worldview and narrative.
To make the story more interesting (surprising and possibly important).
So gossip tends to get more escalated.
“She asked him a pointed question” becomes “she verbally accosted him.”
A system that has alternatives is different from a system that is singular.
The latter leads to situations like: "For me to survive, the system I'm a part of, and that has no alternative, must also survive."
That can get stuck in a situation of perpetuation even of a dysfunctional system.
“This system is terrible but it’s better than chaos.”
When you have multiple viable options, you can pick the least-bad option, iterating to improvement through choice, with each option needing to compete.
A comment on Hacker News I resonated with:
"The decline of community is a very big deal.
I think a lot of it has to do with the way we build our living spaces.
Modern North American cities are rife with car-centric suburbs, huge driveways, front doors set back a mile from the sidewalk, long commutes to anywhere (not just work, even to get groceries).
We're living in these metal-and-glass boxes and we only see other people as obstacles in the way of what we want, rather than fellow human beings."
Emergent processes persist modifications that are worth keeping.
The ones that aren’t worth it diffuse away.
A sign at the historic Tilden Park Merry Go Round: “Old places have soul”
The places that have been kept alive for a long time have resonance, they are alive.
They are kept maintained into the future because the people who use them love them enough to keep them.
Each decision from someone to keep it imbues it with a little more soul.
If you just need a divergent result, a swarm is great.
A swarm requires no coordination to destroy.
A sandstorm is just as effective without any coordination.
If you want to use a swarm to converge things, that’s much harder.
A convergent swarm is rare, like alchemy.
All of the most transcendent things emerge from this convergent swarming.
Competition drives the clock speed of evolution.
Predation drove orders of magnitude faster evolution.
The world is speeding up as it gets more competitive.
When you’re bored is the time for contemplation and synthesis.
If you always have fast-twitch filler content, you never get the slow-twitch time.
A Kerouac quote, via Curran Dwyer: "Sure, baby, mañana. It was always mañana. For the next week that was all I heard—mañana, a lovely word and one that probably means heaven."
Simply capturing all the diffuse externalities is impossible.
So you need a spiritual frame for the collective to cohere in an infinite way.
People need to believe, not be convinced.
As humans, our need for control is what forces us into short-termism.
A Red Queen race to be more short-term than everyone else, to beat the competitors to the short-term prize.
In the long run everything is aligned.
The things that matter most are impossible to capture in a metric.