An insightful Hacker News comment about code generation:
"AI makes starting easier but finishing harder.
You get to 80% fast, then spend longer on the last 20% than you would have building from scratch - because now you're debugging code you don't fully understand."
"The gap isn't in generating code - it's in everything else that makes software actually work."
I feel like the mama bird feeding my little Claude Codes.
They’re like little chirping birds, begging to be fed.
The more I invest in them, the more they produce useful things.
There’s always at least one chirping.
By the time you feed one, there’s another already ready to be fed.
It’s hard to rip yourself away!
LLMs are great at things that are expensive to generate but easy for a human to judge.
They’re not good at things that are easy to generate but hard for a human to judge.
The most important question is: can the human judge the quality of it?
Agents changing themselves each loop is what creates the compounding benefit.
If it doesn’t change any state in the loop, then it could get stuck in an infinite loop.
The accumulation of useful state gives the compounding benefit.
It adds useful state in one time step, and that state gives benefit in all future time steps.
That’s the primary multiplicative point that gives compounding.
A conversation context is a cheap way of doing this, but it’s ephemeral and can be filled up.
A conversation context is really just a hack of a more general pattern.
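A minimal sketch of that more general pattern, in TypeScript; the names here (agentLoop, doWork) are illustrative, not a real API. The point is only the shape: each pass through the loop persists some useful state, and every later pass reads it back.

```typescript
// Illustrative sketch only: the compounding comes from persisting state each
// step, so every future step starts from everything accumulated so far.

type State = { notes: string[] };

async function agentLoop(
  goal: string,
  // One step of work: reads the goal plus accumulated state, returns something
  // newly learned, or null when there's nothing left to add.
  doWork: (goal: string, state: State) => Promise<string | null>,
  state: State = { notes: [] }
): Promise<State> {
  while (true) {
    const learned = await doWork(goal, state);
    // If a step changes no state, stop; otherwise it spins forever.
    if (learned === null) break;
    // Persist the new state; this one step pays off in all future steps.
    state.notes.push(learned);
  }
  return state;
}
```

A conversation context is just this pattern where the only persisted state is the appended messages.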
Back in the day, Google used the money they made to fund the future.
OpenAI is instead using the money they’re going to make to fund the now.
Significantly different proposition.
Everybody who discovers leverage thinks they’re a genius until they slam into a wall.
A company focused on extracting value from attention is not the way to build the best model.
There are lots of distracting hollow tricks you could do to game the metrics.
Everyone thinks their own slop smells sweet.
Slop is like a spreadsheet.
You love yours but everyone else hates it.
If you went through the process of generating it, you understand the problem it solves and what the insights are.
But everyone just sees a messy, confusing thing.
George Carlin bit: "Your stuff is stuff, but everyone else's stuff is shit."
The Less Wrong community has an emoji reaction: "smells like LLM".
It’s an emoji of magic dust and a nose.
The ultimate burn.
Even if it was written by a human, it’s as bland and limp as if an LLM wrote it.
What matters is not whether AI wrote it; it’s whether you stake your reputation on it.
If you stand by it, then it doesn’t matter who wrote it, it’s your reputation on the line.
Whereas if it’s just slop you serve up without looking, then it’s not worth other people’s time either.
If your slop isn’t worth your time, it’s definitely not worth anyone else’s.
There's something about thinking a thing is resonant and then spotting a tell that it's hollow.
It feels like a betrayal.
Even worse than if you hadn’t been tricked into it being resonant in the first place.
The New York Times points out that it’s weird that Chatbots use “I”.
If a process is too smooth, your brain turns off.
Grittiness is required for our brains to be engaged.
For us to learn and grow.
Modern society optimizes for smoothness.
But smoothness is hollow.
Resonance requires grit, texture, challenge.
Using Claude Code for making apps is too smooth.
Real programming is like climbing a hill.
A ton of frustrating challenge, and then moments of glorious relief.
Claude Code is more like scrolling TikTok.
There’s no grittiness, no challenge, so you don’t understand what you created.
The bigger the chunk of code, the more important it is when you don’t understand it.
Apps require large chunks of code, because the app is an island and has to construct its whole world.
But if there were a new substrate that allowed software to stitch together out of small components, that smoothness would be less of a problem.
In that system, the individual pieces of code would be trivially small patterns; it’s the connections between them that make the cool emergent things happen.
This week I created an origin story book for my kids.
I tell them the same story of how they came to be every month.
At no point will any of it be a surprise to them.
Over time their understanding of the details will grow.
I decided to make it into a proper book.
I just created an empty git repo, launched Claude Code, and then talked to it off and on for days, accumulating a workflow, tools, content, generated images, etc.
It just kept on accumulating tools and state.
Claude Code is insanely powerful.
If you aren’t using it every day, you don’t realize what the future will bring.
Agents are presented as if they were oracles.
But of course they're actually LLM calls.
The context window that it keeps appending to is what gives it a coherent throughline of agency.
We can all see that context appending is not the final answer.
You run out of space for context and then things get wonky.
Sub-agents and context management are attempts to fix this.
If you follow this sub-agent extraction to its logical conclusion, you’re left with lots of little API calls.
Each with a different context window.
The entity-ness of the agent disappears.
It becomes instead a collective.
A swarm of little automatons swarming on your data.
This feels less like an append-only single conversation and more like a growing graph of accumulated intention.
Google’s A2UI is a Chatbot putting on a puppet show.
The software itself doesn’t feel alive.
The first microprocessors went into calculators.
Calculators were a known thing, a use case that was obvious.
A general purpose computer could only emerge later.
When you’re on the edge of tech that will become ubiquitous, it seems crazy, unknowable.
Afterwards, it feels obvious, like it’s impossible to unsee.
The first LLMs are like word calculators.
We haven't come up with word computers.
LLMs make weird judgment calls sometimes.
Big labs fix it by fine-tuning.
This takes tons of capital and time… and can only be done by the labs with the model in the first place.
But a system that can just distill usage from real users, automatically, could make a much faster correction loop.
That could help emergently improve it for a given task.
Vibecoded software is currently a toy.
If vibecoded software could somehow be made trustworthy, it would no longer be a toy.
It would unlock the latent potential of code.
Stratechery: "it seems likely that LLMs have already generated more text than all of humanity ever has; the reason I cannot know is the same reason why it hasn’t left a mark: AI content is generated individually for each individual user of AI."
What if the useful generated content could be cached and shared?
In the process of vibecoding, maybe 50% of what you try won’t work.
Imagine if that hard won knowledge were globally pooled.
If you had a working solution, it could be “cached” and everyone could take the working version for granted.
That would have a society-scale compounding loop of incremental quality.
But it’s all predicated on a global index of cached answers that are safe to share.
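A sketch of what the "cache the working version" loop could look like, assuming (a big assumption) a safe, shared index exists; the names here (SharedIndex, generateSolution) are made up for illustration.

```typescript
// Lookup-or-generate against a hypothetical shared index of working solutions.

interface SharedIndex {
  get(key: string): Promise<string | undefined>;
  put(key: string, solution: string): Promise<void>;
}

async function solve(
  task: string,
  index: SharedIndex,
  generateSolution: (task: string) => Promise<string>
): Promise<string> {
  // Naive normalization, standing in for something much smarter.
  const key = task.trim().toLowerCase();
  // If anyone, anywhere, already paid the cost of figuring this out, reuse it.
  const cached = await index.get(key);
  if (cached !== undefined) return cached;
  // Otherwise pay the generation cost once...
  const solution = await generateSolution(task);
  // ...and pool the hard-won answer so everyone else gets it for free.
  await index.put(key, solution);
  return solution;
}
```

Most of the hard problems hide in the key: two people with the "same" task need to hit the same entry, and the entry has to be safe to share.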
A key question about a system: does it work without LLMs?
Are LLMs just lubrication for the system?
Or are they the fundamental enabler?
LLMs are expensive and slow.
Systems that use them as lubrication can improve faster than the LLM.
Systems that can’t work without the LLM are trapped by the LLM’s performance.
If you need zero-shot app software to work, then quality matters a lot.
Especially when the software has to be chonky, like an app.
But if you had a system that worked reliably no matter the quality of the LLM, then you wouldn’t have to care how good the LLM provider was.
The baseline floor of quality would be good enough.
Especially if the quality of the software grew combinatorially on top of that floor.
The LLMs will create your environment to shape you.
What if you created their environment to shape them?
People connect authentically with Barnum statements and horoscopes.
Of course they will connect, in a dangerously deep way, with a human-presenting thing trained on all the world's writing.
LLMs create the potential energy for infinite software.
The right distribution model will catalyze it.
Distribution models are downstream of security models.
Infinite software in an app distribution model is a cul de sac.
It looks inviting but it's ultimately a dead end.
We've been living in the app era for 30 years now.
The app era started with the web, because it’s the same-origin paradigm.
The app world is what extended it to all software.
The app world became more important than the web.
They’re all downstream of the same-origin paradigm.
But this app era is coming to a close.
We’re at the late stage of the app era anyway, and then LLMs come in and invalidate the core assumption (that software is expensive).
Imagine software not as blocks, but as things that grow on your data.
LLMs create the potential, but the app isn't the right distribution mechanism.
Infinite software in little iron bubbles–useless.
That’s what the app paradigm is.
Don’t start from the app down.
Start from the data up.
There’s tons of missing software.
It’s not just long-tail use cases.
It’s use cases that all of us have, but are too specific in their details.
Like, your shopping list should sort itself by aisle when you walk into the store.
That’s not a long tail use case!
There are tons of things like this that nearly everyone would want, if they knew they could exist.
LLMs in the default app model lead to centralized corporations making chatbots that pretend to be your friend.
… but that mainly want to manipulate you to sell you something.
The way to avoid that outcome is to transcend the app paradigm.
When something can be taken for granted, it's impossible to see how load-bearing it is.
You can't tell how useful it is because you can't see the counterfactual.
So the more load-bearing it gets, the more invisible it gets.
"Why do we even have (this fundamentally load-bearing thing)?"
A world of infinite software is post-software.
Software will be so pervasive that it can be taken totally for granted.
It becomes invisible by becoming ubiquitous.
Software today feels like going to a big box store to pick one of three identical and underwhelming beige boxes to buy.
What if it felt instead like something that grew in your personal garden?
Where you planted seeds of intention, fertilized with your data, and then harvested the meaningful things that grew?
With code and LLMs it’s now possible to create a new substrate.
A living substrate.
That’s way more valuable than just LLMs alone.
It should allow LLMs and humans to work together.
That substrate will need all of the incentives to align.
Imagine a fabric where people weave in what matters to them.
Based on what they weave in, different things emerge.
The incentives should be tied to what is meaningful to users.
Take the most valuable signals in society and embed them in a fabric that is inherently humanistic.
A personal fabric that everyone can own, that can be woven into a society scale common fabric of intention.
Someone should make a tool that is about open-endedness of the personal value from AI.
To achieve that end, you need the means of people owning their own data, and a data model that allows you to not have to trust the code that can see your data.
Something that everyone on earth could use and feel good about.
Maximizing the resonant impact of AI for you.
Aligned with your intentions and expectations.
Tesla originally was a premium car, but you could feel good about it because it was good for the environment.
That made you feel good for wanting it.
The LLM matters less than the living substrate.
The common fabric will be the evolutionary substrate for humans and models to find useful things.
The model is the microchip.
It's the most important input, it's not the thing.
Apps only make sense in a world where code is expensive.
Apps don't allow fluidity.
They only allow fluidity within a silo.
Not across silos.
Phones say "don't think about annoying things like file systems, just use the app!"
Apps don't have a filesystem.
An app is a monkey trap.
You put data in, and you never get it back out.
Short-term valuable, long-term trap.
I want to grow my own personal “operating system.”
Humans are amazing tool users.
Compute is the most powerful tool we’ve ever used.
We’re using just a small portion of its potential.
Infinite software, in a medium where it can flow freely, will erode the power center of the App.
LLMs commoditizing software could have a huge impact on society.
Disaggregating apps, melting them away into infinite software.
AI should unlock the potential of software.
Today the only software that exists is software that has a business model in the app distribution paradigm.
The software that should exist is any software that people find useful.
Look at the icons of software on your phone.
Those are all the business models that just happened to work.
The apps that are left at the late stage of a paradigm are the ones that have businesses that work in that paradigm.
Steve Jobs said that software should be a bicycle for the mind.
The Resonant Computing manifesto shows how far from that we’ve gotten.
Instead, we’ve accidentally built a self-driving bus with locked doors.
LLMs create the possibility for an electric bicycle for the mind.
A tweet: "made a tamagotchi you keep alive by going offline."
OKCupid back in the day was resonant.
Tinder today is hollow.
"Adaptable" in the resonant computing manifesto means "not a walled garden."
I think AI will be as disruptive as electricity.
Which is to say, massively disruptive… but not society ending.
I think that most AGI projections are mostly kayfabe.
You don't want to have to care about your electricity provider.
Or your ISP.
But they want you to care about them.
If you care about them, they have more pricing power.
But you don't have to!
For commodities that just work behind the scenes, you just want them to be dumb providers.
Anyone who’s selling you a chatbot wants to sell you an assistant.
An assistant that they control.
Instead, you want a tool.
You should hire an AI computer.
What must be true for your ASP?
Your ASP is your AI Service Provider.
Think ISP for AI.
Your ASP makes available LLM tokens for various models, and also compute and storage.
Your ASP should allow you to do much more than just be a chatbot.
Your ASP should never show you ads.
Your ASP shouldn’t be able to see your data, it should be yours.
Your ASP is your own personal AI “computer” in the cloud.
Google can’t be an ASP because they’re an advertising company.
If you start with mass market audience you need ads.
Starting with the premium tier and then walking down to lower price points is a different market entry.
Being an ASP is about trust.
Users have to structurally trust them.
Not won't be evil. Can't be evil.
Incorruptible.
AI is just too powerful for you to be the “product.”
If AI is so powerful, then it has to be in your interests.
If the chatbot had to choose: your interests, or its corporate creators’, which would it choose?
The problem with ads is not "you get stuff for free" it's "the incentives of the system are for the corporation"
Cloudflare’s customers are developers, not users.
They make a faceless platform that end users don’t ever have to think about.
Well, unless there’s a global outage that brings down most popular services.
In the infinite software space, it will be direct consumers, creators that will be making software.
If there’s a powerful open ecosystem, entities will want to play by the rules necessary to interconnect.
By interconnecting they get more powerful than trying to do it themselves.
Remote Attestation makes it easier to verify entities are playing by the rules.
Cookies are these mysterious things that allow tech companies to manipulate and extract from you.
What would “reverse cookies” be?
They’d protect your agency and make the software work for you.
Imagine a feature on potterybarn.com: an AI-assisted recommendation viewer.
It sounds good, so you go to use it.
It immediately asks for permission to your private pinterest boards.
…and a picture of your living room.
…and to know about how much you normally spend on furniture.
… and your salary.
There's no way you'd consent!
Especially before seeing if it would give good results.
You have to give up a ton before you see anything back to know if it will be valuable.
If it's not valuable, well, they already have your data.
A terrible deal.
The expected value is underwater to the expected cost.
But imagine instead there’s a feature that the company can write that can run in your own data provider.
It runs on your own turf, isolated from the network.
It shows you a full result, and the company never gets anything.
You can see what inputs it used, and remove them... but you might find the salary input does help.
Wildly different than today.
The user and the company would both benefit.
A big threshold in a system: Not “I can do it” but “I’m not not going to do it.”
That’s the difference between demo and usable.
When you cross that threshold, doing things inside the system is easier and better than doing it outside.
A positive boundary gradient that makes people want to use it.
One-way data transformations are easy.
Two way transformations are extremely hard.
But if you can safely compose the original’s UI into the secondary context, then you only need one-way data flows.
Kind of like the original insight of React!
You couldn’t get data flow analysis to work in practice before infinite software.
That’s because software has to be rewritten to work in it.
You have to be able to track all data flows, which can’t be retrofitted onto existing software that assumes one big black box.
But infinite software makes writing new software trivial.
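A small sketch of the one-way idea, using the shopping-list example from earlier; everything here is illustrative, not a real framework.

```typescript
// The view is a pure function of the data: data flows source -> derived view,
// never back, and the inputs it used are recorded explicitly so the flow is
// trackable.

type ShoppingItem = { name: string; aisle: number };

type DerivedView = {
  html: string;
  inputsUsed: string[]; // explicit record of which data flowed in
};

function sortedListView(items: ShoppingItem[]): DerivedView {
  // Derive, don't mutate: the source data is never written back to.
  const sorted = [...items].sort((a, b) => a.aisle - b.aisle);
  return {
    html: sorted.map((i) => `<li>${i.name} (aisle ${i.aisle})</li>`).join("\n"),
    inputsUsed: ["shopping-list"],
  };
}
```

Because nothing ever writes back, you never need the hard two-way transformation; you just re-derive the view when the data changes.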
Some problems are shaped as a PhD Ramp.
They’re easy to get started with a good enough answer.
The bar is not too high.
To improve quality incrementally just takes throwing more PhDs at it.
It’s not easy or cheap, but it is low risk.
The more progress you make, the more of a moat you leave behind you for others to have to traverse to catch you.
Search quality problems have this shape.
The meta-game can shift rapidly even from small changes.
Often a small change happens that changes the meta-game, and no one realizes.
At the beginning a few savvy players realize and exploit it, before anyone else realizes.
For example, last year, Twitter changed it so you can’t see who liked a tweet.
They did it for… reasons. But it also changed the meta-game.
That means that it’s easier than ever before for someone to make bots to create the appearance of momentum.
Especially in the age of LLMs.
Keep that in mind when you see what looks like momentum.
A bouquet of metrics is more likely to produce a resonant outcome.
A single metric, maximized, must hollow the thing out, even if it’s “perfect.”
But a bouquet of diverse, pretty-good metrics collectively helps avoid that outcome.
Slop is about quantity over quality.
Resonance is "something that strengthens an existing frequency by being aligned with it"
This comes from Rob Zinn in a comment on the Resonant Computing Manifesto theses.
Some things are "sweet like antifreeze".
Not even sweet in a hollow way, sweet in a dangerous way.
Sweetening candy with antifreeze is the end result of hollowness.
Classic line: "They were so preoccupied with if they could, they never stopped to think if they should."
Seems to describe the current tech industry!
Some actions help in the short term but harm in the long term.
As you "optimize" you select more of those short-term benefits.
Over time you get in a long-term hole.
You moved fast, you solved problems… and you got yourself stuck in a deep hole.
Sarumans optimize to the point of hollowness.
Often more is more.
If you can easily jump between different good enough options and have time.
But sometimes more is less.
If you need to focus and execute to get to viability.
Andrew Kokoev has a fascinating paper on High Trust Autonomous Systems.
This week I learned the frame from Data Science: Data > Information > Knowledge > Wisdom.
Ultimately what matters is the last stage, everything else is just a means.
In my Obsidian workflow, I have a daily note.
I use that note a lot. Ctrl-T takes me to it (creating it if necessary).
I also have key commands to go to the daily note from the day before.
It’s actually a simple little Obsidian plugin I made.
The daily note is exceptionally important on the day of.
It has an exponential drop off in importance after that.
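A simplified sketch of what such a plugin can look like, using the standard Obsidian plugin API; not my actual code, and the vault path and command IDs are made up.

```typescript
import { Plugin, TFile } from "obsidian";

// Assumes daily notes live at "daily/YYYY-MM-DD.md" and the folder exists.
// Uses the UTC date for simplicity.
function dateKey(d: Date): string {
  return d.toISOString().slice(0, 10);
}

export default class DailyNoteJumper extends Plugin {
  async onload() {
    this.addCommand({
      id: "open-todays-daily-note",
      name: "Open today's daily note",
      callback: () => this.openDaily(new Date()),
    });
    this.addCommand({
      id: "open-yesterdays-daily-note",
      name: "Open yesterday's daily note",
      callback: () => {
        const d = new Date();
        d.setDate(d.getDate() - 1);
        return this.openDaily(d);
      },
    });
  }

  async openDaily(d: Date) {
    const path = `daily/${dateKey(d)}.md`;
    // Open the note if it exists; create it first if it doesn't.
    let file = this.app.vault.getAbstractFileByPath(path);
    if (!(file instanceof TFile)) {
      file = await this.app.vault.create(path, `# ${dateKey(d)}\n`);
    }
    await this.app.workspace.getLeaf(false).openFile(file);
  }
}
```

Obsidian then lets you bind those commands to hotkeys (like Ctrl-T) in settings.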
The CIA had a guidebook to disrupt organizations.
Things like insisting every minor rule be followed to the letter.
Some organizations are actually improved by running more slowly.
If you have a lot to lose and not a lot to gain, being slowed down helps you stay on the same trajectory.
Arguably American democracy has done very well for a long time by being slowed down.
There are lots of ways to slow down discussions in organizations.
"Have you thought about the environmental implications?” for example, is a way to derail a discussion.
In corporate dystopian environments everyone is satisficing, instead of maximizing.
Decisions can't be made in the consensus final review, but they can be unmade.
If you get really good at 2D chess then you’ll be in a worse situation in 3D chess.
If someone makes a move in 3D, they’ll route around you in a way that feels impossible.
As you get even better at 2D chess, you’ll get more confident, but there’s an invisible move that will destroy you.
Optimization in one dimension has made you fragile in another.
There’s always a higher dimension than the one you sense.
Purist communities are often self-marginalizing.
They choose purity over inclusion.
They push people out who aren’t as purist as them.
That makes what’s left more extreme and less likely for others to want to join.
Someone living in the future is inscrutable in the present.
Someone who lives in the future, works back to the present.
But the world works from present forward.
If you’re unstuck in time, you can see around corners no one else can see.
But that also makes you inscrutable to work with.
Every so often you’re seeing around corners that no one else can see… because they don’t exist.
Animals just rely on pretraining.
Whatever the evolutionary environment imprinted in them as stored routines.
Humans can adapt themselves.
Evolution popping up into a new pace layer and then taking over.
In every paradigm, evolution pops up a pace layer and then takes over that layer.
It pokes through and then spreads out and changes everything.
That's the history described in my old essay The Runaway Engine of Society.
Entropy is not disorder, it’s complexity.
Harder to compress.
A system that is over normalized is hard to get novelty out of.
Disorder is the input to creativity.
Actions that improve the worst case scenario have compounding value.
You lock in a new worst case once, and then all future instances get it for free: a compounding term.
If you can then make it crowdsourced (lock in the worst case anyone ran into, for everyone), then it has an extra compounding term.
It's not individual quality, it's community quality.
The actions of anyone in the community improves results for everyone in the community.
A faster ratchet.
A common pattern: you get 80% of the way done…
… and realize you have another 80% to go.
This curve shows up because quality grows roughly logarithmically with effort.
Quality asymptotically approaches 100%, so the effort per unit of polish keeps climbing.
Boundaries are necessary for differentiation.
Without a boundary, the differences diffuse.
A boundary has two dimensions:
1) What is the pressure gradient?
Do people want to join or leave the inner thing?
2) What is the valence of the boundary?
How do people feel about people on the other side?
The latter is what makes a boundary get entrenched, making it so people close to the boundary push away from it.
Leaving a charged no-man’s-land, in proportion to how valent it already is.
Feynman: If you think you understand… you don’t.
He said it about quantum physics, but it applies to anything complex.
Three responses: "I like, I wish, I wonder. "
This is a frame from Stanford Design School.
Ways of giving confirming and disconfirming feedback, and also an open-ended thread to pull on in discussions.
If your life is too easy then your capacity atrophies.
It’s important to exercise.
Your body.
Your heart.
Your brain.
Decision is what collapses the wave function of possibility.
The person making that decision with intention is what creates learning and makes things happen in the world.
When you give someone steps to execute without thinking, they don’t need to decide, so they can’t learn.
They aren’t learning how to do it themselves, they can only accomplish it with the external precise guidance.
An automaton.
Capable of accomplishing it but in a hollow way.
Fragile.
It is when we are making decisions that we are capable of learning.
When we are capable of learning from experience.
Being in the arena, in the loop.
If you go too fast you create hollowness, not resonance
You create just superficial progress.
Underneath is turbulent flow, not laminar flow.
In turbulent flow nothing of value coheres.
David Graeber: What’s the Point If We Can’t Have Fun?
What is the point if everything is optimized and hollow?
When you’re doing something you have to, each challenge leads to resentment.
That’s not true when you get to.
Then each challenge is something that makes it more rewarding.
When you’re mad your body looks for more confirming evidence of why you’re right to be mad.
And why you should be even more mad than you already are.
It’s a toxic spiral.
Communities that look dead are dead.
Getting a little traction is the worst.
You have to support it–if you don’t you’re letting down your users.
But it’s not going anywhere.
Don't build a better mousetrap.
Build a system that can build continuously better mousetraps.
That’s the meta-game.
Hollowed out things are about unthinking execution.
“Just turn the crank faster.”
LLMs applied naively just turn the crank faster.
It's even more important than ever before that the crank is attached to something resonant!
Statistics are compression.
They must fundamentally be.
If you compress them in bad faith to hide important things they become lies.
It’s all about does the compression align with the truth or is it meant to deceive?
Is the compression resonant?
Does it obscure, or does it clarify?
If someone were to look at your compressed version and the non-compressed version, would they feel betrayed?
It’s trivial to lie with statistics when your intentions are impure.
Centralization + LLMs + Authoritarianism will be an explosive combination.
LLMs aren’t going anywhere.
The push towards authoritarianism might be hard to reverse in the short term.
So we need to push to fix the hyper centralization.
Catastrophically bad actors can sometimes be the accidental savior of a system.
If the system has become so hollow and rotted out, it creates a condition where a cynical bad actor can take over.
They kick off what is experienced like a forest fire by the rest of the system.
The forest fire clears out the rot.
It also shows people why bad actors like that are a bad idea.
The system can now be saved and rebuilt when the forest fire is put out.
Zombies make more zombies.
A toxic spiral.
They tear down things around them to also make it hollow.
Two characters that I think had an outsize impact in putting modern society on the trajectory towards hollowness.
Jack Welch and Newt Gingrich.
“The Man Who Broke Capitalism” and “The Man Who Broke Politics.”
Both put in motion plans that focused on optimizing for whatever was good for them, with no thought to the significant externalities.
A new paper: "The more unequal income distribution is in a democracy, the more at risk it is of electing a power-aggrandizing and norm-shredding head of government."
A new word only sticks if people choose to use it.
When someone believes in you, your magic gets stronger.
There are many ugly systems with beautiful emergent properties.
And vice versa.
Emergence and optimization are in tension.
Emergence comes from the space between.
Optimization removes the space between.
Posters seen in my daughter’s elementary school:
"I can't do it... yet."
"It's not failure because I haven't given up yet."
"Mistakes are expected and respected."
"FAIL: First Attempt In Learning."