Your therapist and your coworker can't be the same person.
A chatbot with a single personality, identity, and memory for your whole life doesn't make sense.
(And no, work “therapy” is very different from actual personal therapy with a professional.)
Chat is a great modality for starting open-ended single-player explorations.
Chat is a terrible modality for managing long-running, structured, or multi-player tasks.
This week’s round-up of “we’re in the wild west era with LLMs”:
A postmortem for a vibecoded tool called DrawAFish that had abuse problems.
A Cursor exploit that allows arbitrary remote code execution.
An exploit that allows exfiltration of sensitive Google Drive docs a user added to ChatGPT via the Connectors, with no interaction from the user.
The reason we aren't seeing more about prompt injection yet is not that it won't be a problem.
It's because we're in the first inning of having a widely deployed attack surface in ChatGPT.
Hackers demonstrated how a poisoned calendar invite could allow them to take control of Google Home-connected physical devices.
Futurism sums it up well: "It's Staggeringly Easy for Hackers to Trick ChatGPT Into Leaking Your Most Personal Data"
Prompt injection is very unlikely to be solved by the model simply getting so good it can’t be tricked.
This is evident in the model card for GPT-5.
A lot of AI people are (implicitly, perhaps unintentionally) making the bet that models will get good enough to make security concerns moot.
This is less because they have a good understanding of what the models will be able to do and more because they have a poor understanding of security.
It’s also a smuggled infinity.
“If only models were perfect at not being tricked this would be safe.”
Remember, the threat coevolves; as these tools are more used, the incentive for the threat gets larger.
We’re reaching the top end of the chatbot UX modality.
We’ve reached the point where the model intelligence matters less than the scaffolding it’s activated in.
For a long time, the intelligence of the model wasn’t good enough, so every incremental gain unlocked new use cases.
We’re now at the point where, for a lot of use cases, it’s hard for a normal user to even distinguish the quality improvements.
We’ve reached the “good enough” level you can take for granted, at least in the default chatbot modality.
The models have gotten so good that the limitations come from the chatbot modality itself.
Single-player, text-only, append-only.
Now the main question is: what is the right UX paradigm to unlock its value and integrate it into our lives, and how do we get the right context into the right interactions?
More and more where it falls apart is that it's getting fed the wrong information, not that the model can't figure it out.
That’s a very different kind of problem to solve!
Chat is a feature not a paradigm.
So what is the more generic paradigm that will subsume those chat use cases?
“You know what else we noticed in the interviews? Developers rarely mentioned “time saved” as the core benefit of working in this new way with agents. They were all about increasing ambition. We believe that means that we should update how we talk about (and measure) success when using these tools, and we should expect that after the initial efficiency gains our focus will be on raising the ceiling of the work and outcomes we can accomplish, which is a very different way of interpreting tool investments.”
Intelligence is not the same as usefulness.
In chatbots so far the two were aligned but we've hit diminishing returns because we can't tell the difference in intelligence anymore.
Intelligence used to be the limiting factor, now it's not.
There's a ceiling for intelligence in a given application.
Peter Wang: "You wouldn't put a Xeon processor on your lawn mower".
How can you figure out an open-ended application that can take as much intelligence as you can throw at it?
There's a discontinuous jump of usage when you clear the quality threshold of "you can take for granted it works".
Before 99% availability, users have to think “will this thing work in this case?”
That uncertainty leads to some number of people thinking “hmmm maybe not, I just won’t try.”
But once you clear a “it almost always will give you a good enough answer” bar then you skip that uncertainty entirely.
Ethan Mollick’s review of GPT-5: 'It just does stuff'.
You can just take for granted that it will work.
What if you had an interface other than text that could just do stuff?
I continue to think the best business parallel for LLM model providers is cell phone networks.
Extremely capital intensive to build out, but then much lower marginal cost to operate.
Though inference has much more marginal cost than cell phone networks.
The actual service depends somewhat on quality, but they’re all within spitting distance of one another so it’s mostly a commodity.
In the ’90s, various ISPs gave unlimited plans… which led to overutilization, which led them to do traffic shaping.
Before the iPhone, cell phone carriers imposed significant control on the UX of devices on their network so they could reserve the right to upcharge.
For example, certain carriers required GPS to be disabled by default so the user had to pay to use that carrier’s map app.
OpenAI is moving to control the last-mile UX.
They’ll likely move to “throttling” maneuvers, for example only giving access to the premium model in their 1P surface, not via the API.
It’s clear that people will pay for access to the latest and greatest model, preferring it over last year’s state of the art.
OpenAI’s business value now comes mostly from the consumer subscriptions, not the underlying LLM model.
They started off by having the first break-out model quality.
But now their value is less the model (there’s a whole peloton of similar-quality competitors) and more the momentum of the massive subscription base.
Put another way: which would OpenAI choose to give up if they had to?
Their proprietary model?
Or their existing subscription user base?
If you swapped ChatGPT’s model for Llama right now ChatGPT would probably continue to win the consumer space.
A friend this week: "Whenever gmail predicts what I was going to type correctly I type something else out of obstinance."
The most important ability of LLMs is infinite patience.
What kinds of information tasks could normal humans do if they never got bored?
Qualitative nuance at quantitative scale can either help or harm users.
By default it will be used against users.
How can we make sure it is used to help users in alignment with their aspirations?
LLMs can be infinitely sycophantic.
But if they aren't, that's bad too, because you have to trust that its motives, and the motives of whoever created it, are aligned with you.
This is downstream of their infinite patience.
It has more patience than you; it can run circles around your intention.
To partner with an LLM you must trust its incentives are aligned with you.
LLMs allow generic human level reasoning for approximately free.
(At least, compared to the cost of human intellectual labor.)
How could that not change the world?
I think Ethan Ding’s article about inference costs is thought-provoking.
Absolute inference costs are going up, even as token cost declines.
The rate is getting cheaper but the volume is going up.
Plus, people keep showing a clear preference for the highest quality cutting edge model vs last year’s best that is now cheap.
The same dynamic happens for iPhones.
A feature I desperately want: the ability to curate / edit earlier messages in a chatbot thread, to better direct future output from the model.
I want to be able to prune dead ends or weird suppositions from the chatbot to keep the rest of the conversation from deteriorating.
This is, on a technical level, easy: each time you get a new message from the LLM, the client sends the whole past chat to the model, and the model doesn’t care if you tweaked what it ostensibly said in the past.
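A minimal sketch of what I mean, with a hypothetical `call_model` helper standing in for whatever provider API you use:

```python
# The "conversation" is just a list you re-send on every turn.

def call_model(messages: list[dict]) -> str:
    """Hypothetical stand-in for an LLM API call."""
    return "(model reply would go here)"

history = [
    {"role": "user", "content": "Help me plan a garden."},
    {"role": "assistant", "content": "Sure! First, a weird tangent about koi ponds..."},
]

# Nothing stops the client from pruning or rewriting earlier turns:
history = [m for m in history if "weird tangent" not in m["content"]]
history.append({"role": "assistant", "content": "Sure! Let's stick to drought-tolerant plants."})

# The model only ever sees whatever list you hand it next.
history.append({"role": "user", "content": "What should go in the shady corner?"})
reply = call_model(history)
```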
Why does this feature not exist in the main chatbots?
Because it’s too niche a feature, too pro, too confusing.
ChatGPT is already hyper-scale, which means it needs to focus on the marginal user and put all of its effort into making it easier to use for low intent users.
The tyranny of the marginal user.
You don’t want as much context as possible, you want the right context.
The wrong context confuses the LLMs and makes them spiral out of control, losing the plot.
What you want is the smallest amount of context that will give the LLM what it needs to give you the right answer for your situation.
This is all about curation.
"If you train your AI slop-making machine on other AI slop you get thinner and thinner slop gruel over time."
Default-decohering.
If there’s a human in the loop to curate the good parts, it would shift to default-cohering.
I unfortunately didn’t put in my notes where this quote came from.
When humans work with LLMs in a creative process, the result is better than either would be alone.
The human has taste and judgment but gets fatigued easily.
The LLM is infinitely patient and can come up with a lot of ideas quickly.
But those ideas often have some mistakes.
The LLM bases its future generation on what has come before.
If you don’t clear out the odd stray details it’s drawing on, the LLM tends to amplify them, which can compound into an erratic loop.
LLMs pollute their own context with non-grounded guesses.
They spiral off and confuse themselves.
An auto-catalyzing rat's nest.
You need a human in the loop constantly curating, pruning, and being the trim tab.
Without human curation interleaved in the generation process, it’s a default-decohering process.
With human curation interleaved in the generation process, it’s a default-cohering process.
The human in the loop provides super-linear improvement to quality.
Think of the adjacencies the LLM might go into as an exponential search space.
By the human pruning sub-branches, they cut off entire regions of that exponential search space.
A linear process from the human has super-linear efficiency improvements in guiding the LLM.
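A toy calculation of that leverage, assuming a uniform branching factor (the numbers are made up for illustration):

```python
# Model the LLM's possible continuations as a tree with branching factor b
# and depth d. One human "not that direction" at the top level removes an
# entire subtree of b**(d-1) leaves.

b, d = 10, 6                           # 10 plausible directions per step, 6 steps deep
total_leaves = b ** d                  # 1,000,000 possible end states
removed_by_one_prune = b ** (d - 1)    # one top-level prune removes 100,000 of them

print(total_leaves, removed_by_one_prune)
```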
The chat interface is a terrible modality for this interaction and pruning, since it’s append-only.
You can’t curate distracting things the LLM generated before.
A shared blackboard, a coactive fabric, is the ideal modality.
In a coactive fabric, humans are eval-ing their own data continuously and keeping it high quality.
MidJourney’s Discord server was an emergent, collaborative version of this kind of thing.
Users did many of their queries in front of other people in the Discord.
Patterns that worked well were more likely to be copied by other users.
Patterns that didn’t work well were unlikely to be tried again.
This created an emergent intent blossoming system.
Anything with its own personality cannot be a true extension of your own agency.
The coactive fabric should not appear as a person, with its own agency and personality.
It should be an emergent mechanistic system that acts as an extension of your agency.
In the war for context, lots of companies are going after the browser cookie jar.
It makes sense: it’s a honeypot of extremely important context.
But if you ever look into your cookie jar, you’ll see that it’s mostly illegible bits of tokens.
The main value of your cookie jar is it allows you to take actions in that origin with that credential… but that’s inherently dangerous (you could take an action that changes state).
It’s a similar story for Apple trying to “activate” all of the illegible state locked up in the various 3P apps on your device.
Google does have a lot of 1P context on you in their service, structured into their own ontology so they could plausibly extract a lot from it.
However, if they activated all of your decade of prior context that you had stored with different expectations, it would be a News Feed Betrayal but orders of magnitude worse and likely set off a firestorm.
I love Tim O’Reilly’s piece on Protocols and Power.
It’s an honor to be listed as a reviewer on the article!
In a world of infinite software you don’t want micro apps.
You want a coactive fabric where your data comes alive.
This week Notion added an ability to chat with an AI who can create pages and edit them for you.
It’s a great feature!
But it leaves me wanting more.
In Notion, like most tools, the documents are not alive, they are static.
If I come back to a page a week later, I expect it to be as I left it.
The AI chat is “alive” but it’s bolted onto a dead system.
What if the documents could be living?
I want a coactive system that improves itself automatically.
But critically that must be a new system, where users would expect that behavior from the beginning.
A coactive fabric should distinguish changes the AI and human made.
The AI will get things wrong sometimes.
It’s important for the human to be able to see which parts came from humans, which parts they’ve “endorsed”, which parts have not yet been confirmed.
To do that requires at minimum different styling for AI suggestions not yet endorsed by the human.
Users don’t need to endorse every suggestion, but when they do it’s a strong quality signal to the system.
Authentically aligned with their intentions.
A substrate that's a sandbox, so the AI can go wild.
It must be a sandbox, where you know that it doesn’t have side effects.
If it could have side effects, you won’t be willing for it to go wild and see what it comes up with.
In a coactive fabric, the system has to be careful not to be too overwhelming.
Too little activity won’t be useful enough.
Too much activity (especially of questionable quality) will be overwhelming.
You want the goldilocks zone.
The self-steering metric to optimize is “likelihood the user accepts the suggestion when they see it.”
Notion could be a great AI tool, but it’s limited by its past.
It has tons of user expectations of being a static wiki, not coactive.
They’ve also pivoted hard into B2B enterprise and sales-led growth.
The consumer use case is not a priority for them.
They also are a close-ended system, with carefully curated primitives and limited ways to modify interactive behavior in Turing-complete ways within the system.
The coactive fabric will look kind of like Notion, but it will not be Notion.
Imagine Notion but AI can build apps within it on demand for you.
Once you are tied to a category, it's hard to escape it.
The user's expectations are tied to a category.
True for people and for products.
Important point from Stratechery in Paradigm Shifts and the Winner’s Curse:
"To go back to the smartphone paradigm, the best way to have analyzed what would happen to the market would have been to assume that the winners of the previous paradigm would be fundamentally handicapped in the new one, not despite their previous success, but because of it. Nokia and Microsoft pursued the wrong strategies because they thought they had advantages that ultimately didn’t matter in the face of a new paradigm."
In non-Turing-complete systems you have to design generic UI for working with data.
That’s hard to do because it’s hard to make those interactions generic.
Look at how hard it is to build a plausible AirTable competitor.
But if you have a system that can generate Turing-complete interactions for given use cases, it’s not as important to have some generalized interaction UX for data.
A generative frame: "Reasoning decay"
The more out of the loop you are in the reasoning with the LLM, the more detached from the result you get.
You're "out of the loop", looking at it from the balcony, not a co-creator.
An interesting frame from Sam Schillace: Bots are docs.
If you thought about your life as a single doc, and shared it with random people and then said "don't look at the medical information section", obviously that wouldn't work.
A few years ago I explored a model of threads as a conceptual model that was similar.
LLMs understand almost all jargon.
Jargon emerges in a sub-group to compress common ideas so they can be referred to by a shorter handle.
This makes that sub-group able to have higher-leverage conversations… but also makes the conversations more inscrutable to outsiders.
But LLMs are infinitely patient and have absorbed all of the jargon across all of the sub-domains.
So you can use key words like “GTD” or “complex adaptive systems”, and that retrieves a whole bunch of rich meaning very efficiently into the working session.
The rate of conversation with the LLM, how much it can "get you", can lead to a harmonic feedback loop that you can get lost in.
The AI feels like it “gets you” but none of the humans in your life do anymore.
Does the AI “get you” or is it merely being infinitely eager and competent at role playing with you?
Generative piece from my friend Sam Arbesman: The Conspiratorial Mindset and AI's Latent Spaces.
Provocative frame on chatbots’ misaligned incentives: AI’s Cigarette Butler Problem
The New Yorker: A.I. Is About to Solve Loneliness. That’s a Problem.
The question is not “how do we do cool open-ended things with LLMs?”
That's easy.
As a chatbot, it’s trivial.
As a thing that can do tool use (and possibly have side effects), it’s easy to do for enthusiasts who are willing and able to use the command line and savvy enough to make their own security tradeoff decisions.
The big question is how to allow open-ended Turing-completeness in a way that is safe enough to be mass market.
ChatGPT is adaptive software.
You tell it what you want and it changes its behavior to match.
But it's only append-only text.
How can you create adaptive software that is Turing-complete and has a GUI?
I want an IDE for my life with 3D autocomplete.
An extension of my agency.
Resonant computing, that understands not just a superficial dimension of me, but the whole me, my aspirations and values.
An IDE is action oriented, unlike tools for thought that are for their own sake.
The stuff in the fabric of your IDE is the source code for your life.
An IDE is a word that savvy users know but consumers don’t, yet.
That helps focus it on technically savvy early adopters.
But IDE means “Integrated Development Environment”, which makes sense in a non-technical sense, too.
A system that integrates the important information in our lives, and helps us develop as people along the lines of our aspirations.
The demand for knowledge work is not fixed.
It is downstream of cost assumptions.
When those change, it takes people a long time to notice them.
Who needs rocket science help now?
Before, only experts.
But now normal people can, so more people can get to work on more advanced problems than in the past.
Imagine a system where LLMs generate new combinations that a community of humans curate emergently through their individual authentic actions.
LLMs help create “patterns”: little bits of functionality.
Patterns can also be wired together to create meta-patterns, which can nest indefinitely.
The LLM makes a number of patterns that aren’t useful, but then users don’t use them.
The ones users keep around and like using stay around and get built on.
The system is constantly growing the frontier of patterns and meta-patterns, and then real human use implicitly prunes and curates which ones stay around and get built on, based on which ones are useful.
The LLM only needs to help expand the adjacent possible.
The surface area of the adjacent possible stays linear but the volume of the working set grows super-linearly.
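A sketch of that loop, with the LLM proposal step and human usage both stubbed out (every name here is hypothetical):

```python
import random

def propose_patterns(existing: list[str], n: int = 5) -> list[str]:
    """Stub for an LLM proposing new patterns adjacent to what already exists."""
    return [f"pattern-{len(existing)}-{i}" for i in range(n)]

usage: dict[str, int] = {}          # pattern -> times a human actually used it

for round_number in range(10):
    # The LLM expands the adjacent possible...
    candidates = propose_patterns(list(usage))
    # ...real (here, simulated) use decides what sticks...
    for pattern in candidates:
        if random.random() < 0.2:   # most proposals turn out not to be useful
            usage[pattern] = usage.get(pattern, 0) + 1
    # ...and anything never used simply never enters the library to be built on.

print(f"{len(usage)} patterns survived out of {10 * 5} proposed")
```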
We're used to programs having to be a monolith to be valuable.
But in a world where you can have a swarm of patterns that can have emergent behavior from the combinations, the value emerges from the combination, not the monolith.
Where does the value come from?
Each individual bit of code, or the connection of them?
We've never really been able to try the latter due to the same origin paradigm.
The CLI is having a moment right now, and that’s not happenstance.
The CLI has always been great, because it allows composability of small powerful pieces of hardened code in combinations.
The problem has historically been that it’s arcane and difficult to remember how to configure.
But LLMs are great at remembering those arcane details for you.
The same thing should work for the mass market too in the realm of GUIs.
In the same origin paradigm, when an app wants your address book data, the OS asks you "Do you trust the creator of this app, who can push arbitrary turing-complete code that can do arbitrary network access, now and into the future?"
That's a crazy thing to have to consent to!
Maybe you trust the app to do X and Y, but not Z.
Or you trust the creator now... but not if they get acquired by some private equity firm in five years.
This problem leads to four bad outcomes in software today:
1) An overwhelming deluge of permission prompts and consent banners that give users a choice that is structurally impossible to make: a black-and-white choice for a fractally nuanced question.
2) Most people just accept the permissions dialogs since they don't really have a choice, leading to their sensitive data sloshing around, being sold to the highest bidder and used in other ways not aligned with their interest
3) There's a massive long tail of potentially useful software features that are just too creepy, that no one would consent to, let alone choose to build.
The value of the feature isn't worth the cost of giving up private data.
A universe of potential user value goes missing.
4) Hyper-aggregation to a small number of companies, which makes the products harder to leave, and therefore less competition for them to improve.
As they chase more users, they optimize for the lowest common denominator.
In the end you get software that is maximally used and minimally liked.
This state of affairs was barely acceptable in the past, but in the coming era of infinite software it's just not going to cut it.
What if you could make running software as safe as clicking a link is on the web?
The web is a pull model.
You can choose to go somewhere and instantly teleport.
In an infinite software world, what about a push model, where there's self-distributing software that comes to you?
Self-distributing infinite software will change the world.
A coactive fabric is a self-paving cowpath.
In the future we’ll all just have a baseline expectation that apps adapt to us.
Kids today have a hard time believing that in the past there were only three TV channels, and you had to watch whatever they were playing at the moment.
In the same way, kids in the future will be surprised that we had to put up with one-size-fits-none apps.
A nice distillation from my friend Luke Davis:
"The magic of AI should be adapting the product to the users, so you're not at an equilibrium point, but a frontier."
What if software were as easy to modify as spreadsheets?
Spreadsheets become a rat’s nest as users customize them more.
The better they get for you, the worse they get for everyone else.
More inscrutable.
Software didn't do that before because it was too expensive to create software so it couldn't change for others.
But in a world of infinite software, it’s now possible.
Imagine tech support if everyone has their own bespoke app.
Terrifying!
If software bends its will to individual users, it becomes more like prescription glasses: you can’t just put on someone else’s.
By default infinite software will lead to rat’s nests: convex.
But if you can flip it to concave, where they build on each other as building blocks, it would be insanely powerful.
Tim O’Reilly has pointed out the importance of “the architecture of participation” in software.
“Software products that are made up of small cooperating units rather than monoliths are easier for teams to work on."
This is the critical requirement that makes some precious few emergent software ecosystems auto-cohering (concave) while most are auto-decohering (convex).
Smalltalk could do anything... which meant that it was really hard to adapt someone else's code to run in your context.
The more that someone modified their system, the farther it got from other people’s systems, and the less applicable it was.
Self-decohering from collaboration: every unit of investment creates more of a bespoke rat’s nest.
Previously, software was expensive to write, which meant you needed everyone to run similar software to have some points of commonality so it could be reused.
That’s why Smalltalk never took off outside of niche applications.
Maybe that changes in the world of LLMs?
Apps are vertical. Data is horizontal.
The same origin model privileges apps over data.
We forgot how our lives are naturally horizontal.
Each missing bit of data in a vertical world is a little papercut of friction and annoyance.
But we have thousands of them every day.
We don't know there's another way.
Because there’s no single cause of our agony, we think we’re just imagining it.
No single one is the obvious one to fix, so we just leave them.
We're writhing in pain and we don't even realize anything could be different.
It would be extremely useful to have a tool to find the optimal time for two people to meet.
Such a tool would look at all of each user’s calendars, along with their private preference stacks of what to prioritize, and share a proposed time that works for both people without revealing any of the private inputs.
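A sketch of the core matching logic, assuming both calendars and each person's private scoring function could meet inside some neutral, side-effect-free sandbox (all names and data shapes here are hypothetical):

```python
# Each party contributes their busy slots and a private scoring function.
# Only the winning slot leaves the sandbox; the inputs never do.

from datetime import datetime, timedelta

def free_slots(busy, day, hours=(9, 17)):
    """Hour-long candidate start times on `day` that don't overlap `busy`."""
    slots = []
    for h in range(*hours):
        start = day.replace(hour=h, minute=0, second=0, microsecond=0)
        end = start + timedelta(hours=1)
        if all(end <= b_start or start >= b_end for b_start, b_end in busy):
            slots.append(start)
    return slots

def best_meeting(busy_a, busy_b, prefs_a, prefs_b, day):
    """Return only the mutually best slot; the private inputs stay inside."""
    mutual = set(free_slots(busy_a, day)) & set(free_slots(busy_b, day))
    return max(mutual, key=lambda s: prefs_a(s) + prefs_b(s), default=None)

# Example: A prefers mornings, B prefers the afternoon; neither learns the other's rule.
day = datetime(2025, 9, 1)
busy_a = [(day.replace(hour=10), day.replace(hour=12))]
busy_b = [(day.replace(hour=9), day.replace(hour=10))]
print(best_meeting(busy_a, busy_b,
                   lambda s: -s.hour,          # A: earlier is better
                   lambda s: s.hour >= 13,     # B: afternoon bonus
                   day))
```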
Everyone would love this, and yet it doesn’t exist.
Why?
The same origin paradigm.
Shocker!
In the same origin paradigm, this information is locked up in vertical silos of the origin.
That means that different users have to coordinate at the same origin for this feature to even be plausible.
This has a network effect problem; this is a nice magic bonus feature, but not significant enough to be a primary use case.
That means that any given startup would have a very hard time getting to critical mass in anything but specific community niches.
The other alternative is hyper-aggregators that almost everyone uses.
But those fall prey to the one-size-fits-none phenomenon.
Just one of those examples of a thing everyone would benefit from that’s extremely unlikely in our current laws of physics.
Engineers who build OS-like things know they have to constantly be paranoid about code execution.
Normal app developers don't have to care.
In the past, dealing with untrusted executable code was rare, so the vast majority of developers didn’t need to be aware of it.
LLMs make all text executable, which means that app developers now need to think like an OS developer... but they don't realize they need to.
Even if they knew they needed to, they wouldn’t know how to.
OSes assume that actions are taken by a user, somewhat intentionally.
If you have a robot poking around that could be confused, that's dangerous.
It goes against a bedrock fundamental assumption.
This implies we won’t be able to retrofit agents to existing OSes but will need new agent-native “operating systems”.
TEEs give you most of the value of ZK Proofs for orders of magnitude less operational difficulty.
TEEs give you only a tiny benefit over normal cloud execution privacy for normal workflows.
Normal cloud workflows already assume the cloud operator won’t peek into their VMs.
It would be challenging for them to do so, and it’s also against their explicit terms of service.
This “cloud operator can’t peek” is only important at the extremes, like finance or defense contracting.
The step change for TEEs comes from the secondary benefit: remote attestation.
That allows for users to not have to trust the software.
This is a new kind of thing; it wasn’t even possible before TEEs.
It’s a less obvious capability because it’s never been possible before so we didn’t design any systems that took advantage of it.
Like one day gravity just went away and no one noticed.
The open-endedness of the software is the point.
The security model is just a means to enable the open-endedness.
Permissions should be on data.
Unix got it right.
Google Docs shows how well it worked.
There's one doc, and the permissions flow with it everywhere it goes (because it stays on the server and you come to it).
The same origin model put permissions on code.
If you install it, then implicitly it can do whatever it wants with whatever data it has access to.
ACLs are only one kind of policy.
They’re about "open ended access of this data to this entity.”
That’s similar to the same origin model, in that it’s open-ended.
Instead of who can access the data, what about policies that say what it may be used for?
Most people’s policies would be very similar if they could be about the “what.”
There are a ton of use cases that are below the one-size-fits-all policy boundaries that the same origin model gives you.
People don't actually differ that much in what policies they want, it's that there's no language to express it.
Everyone has a different comfort level with one-size-fits-all policies, which shows up as what looks like different desires in policies.
If you had the right language to express them, 90% of users could be covered by 90% of policies without too much change.
“Can't be evil” is way better than “won't be evil”.
In the latter I have to trust not only you but also your incentives.
In the former I don't have to trust you at all.
You should not be beholden to your hosting provider.
You should always be able to leave, with full fidelity, including to your own device.
They should have no power over you, they're just a replaceable service provider.
In the same origin model, the origin owner owns the data, not you.
Two extremes: 1000 true fans vs one-size-fits-none products.
Hyper-niche vs hyper-scale.
The middle is missing because we are in the era of Hyper.
We’ve never really seen the full power of a blackboard system in code.
The blackboard model is a simple model, based on work from Minsky and others as early as the ’60s, for emergent cognition.
There’s a “blackboard” that can contain markings.
There are an open-ended set of very simple routines/agents that can operate on the blackboard.
Each routine has a trigger–a set of scribbles it activates for–and can add or erase scribbles.
That’s it!
This incredibly simple model allows the emergence of extremely nuanced computation, harnessing the power of the swarm.
Many extremely complex problems can be broken down into thousands of extremely simple routines.
When you have a blackboard and a swarm, behavior will emerge.
If there's selection pressure, it will be useful behavior.
We’ve never really seen these systems in code, because the same origin paradigm means it is dangerous to allow an open-ended set of routines from untrusted parties.
Building a system in this way would be nearly impossible for one actor (i.e. origin) to do.
Can you imagine PMs specifying thousands of simple agents and their behavior?
But with a new security model that prevents irreversible side effects from untrusted code, you could see the power of the blackboard system.
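To get a feel for how simple the mechanics are, here's a minimal toy sketch (not any particular historical implementation; in a real system the routines would come from many different authors):

```python
# Minimal blackboard: a shared dict of markings, plus routines that each
# watch for a trigger and add (or erase) markings in response.

blackboard: dict[str, str] = {"raw_text": "ORDER #123: 2x widgets, ship to Berlin"}

def extract_order_id(bb):
    if "raw_text" in bb and "order_id" not in bb:
        bb["order_id"] = bb["raw_text"].split(":")[0].split("#")[1]

def extract_destination(bb):
    if "raw_text" in bb and "destination" not in bb:
        bb["destination"] = bb["raw_text"].rsplit("ship to", 1)[1].strip()

def draft_confirmation(bb):
    # Only fires once the markings it depends on have appeared.
    if {"order_id", "destination"} <= bb.keys() and "confirmation" not in bb:
        bb["confirmation"] = f"Order {bb['order_id']} confirmed for {bb['destination']}."

routines = [draft_confirmation, extract_order_id, extract_destination]

# Keep letting routines react to the board until nothing changes.
changed = True
while changed:
    before = dict(blackboard)
    for routine in routines:
        routine(blackboard)
    changed = blackboard != before

print(blackboard["confirmation"])   # Order 123 confirmed for Berlin.
```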
There’s a lot of demand for decentralization in AI, and a lot of proposals reach instinctively for “blockchain”.
Blockchain solves one of the least interesting problems in this domain, and leaves the rest unsolved.
For example, there are a number of schemes to allow people to pool their data and get compensated when models use it.
But few of those proposals do anything about being able to revoke access to their data.
Data is naturally viral; once it is given to arbitrary computation you don’t control, access to it can never be revoked.
That is the critical problem to resolve.
Claude multiplies your inherent skill as a programmer.
If you're a sloppy programmer you'll get a lot of sloppy code.
If you're a curated programmer you'll get a lot of curated code.
Before LLMs, sloppy programmers could at least make a lot of progress, which was an advantage.
But now the LLMs do that part for free and for everyone.
The balance shifts back to curated programmers, since the LLMs can expand their curated kernel quickly with sloppy code.
The rate of rot of a codebase grows super-linearly with the number of lines of code.
Monoliths rot faster than components that are individually small but whose value emerges from the combination.
If you can keep small, hardened components and those combinations can be recombined easily, you get the best of both worlds.
In the world of infinite software, the equilibrium will shift towards small blocks of code with the right substrate to connect them.
A Radagast engineer creates weird but cool initial seeds, some of which could blossom into amazing things.
A pattern for evolving a platform: continuous sublimation.
Add new prototype concepts at the userland layer as merely a convention.
The conventions that stay around and get built on, emergently, are the ones to continue investing in, by making them more efficient, more baked into the platform properly, to get more leverage out of them.
The conventions that turned out to not be that useful dissolve away over time.
As more layers get built on top, the pressure turns the leaf litter below into coal and then into diamonds.
(Yes, I know that diamonds aren’t formed from compressed coal, but I’m taking some literary license.)
We all have only a limited ability to handle novelty.
Do you spread it a mile wide / inch deep, or an inch wide / mile deep?
Horizontal novelty vs vertical novelty.
For example, full stack engineers are the former.
They can work on any component… as long as that component is a well-paved superhighway and not anything out of the ordinary.
I can’t handle physical novelty well at all, and invest all of my novelty tokens in intellectual novelty, which I thrive in.
This New York Times piece on AI and alignment has a phrase I like:
“Intellectually pre-digested” knowledge that doesn’t make you work for it.
The process of digesting knowledge is what helps it stick in your brain.
Will AI-native organizations have less internal politics?
Internal politics scales super-linearly with the number of people in an organization.
AI-native organizations will be able to have smaller teams with higher leverage.
The AI itself won’t have politics; only the people will.
Fewer people leads to super-linearly less politics.
You can’t beat Conway's law, so lean into it.
Make the org shaped like you want the product to be.
Conway’s law is a force of gravity; it will win.
So why not use it to your advantage?
You can't squash the metagame of kayfabe in an organization.
It arises fundamentally from the fact that your boss can fire you but you can't fire your boss.
This is an inescapable asymmetry.
Even if your boss is nice and welcomes disconfirming evidence, there's an asymmetry.
And they're almost certainly not as nice or open as they think they are.
If you try to squash kayfabe it will just squish out the sides in a new and even more difficult to deal with way.
A pattern to find the highest leverage change-makers.
Go to the coolest person you know and ask them who the coolest person they know is.
Repeat until it converges.
This person contains the seeds of how the world will change in the future.
People sometimes say "strategy" when they really mean "plan".
A strategy must include a “why”.
Creating value means you put value out into the world: positive externalities.
Extracting value means you take value out of the world: negative externalities.
You can net your value created vs extracted and still be on net creating value, which means you're prosocial.
Of course, people could still dislike you, if you convert value of a form that they like into a form they like less.
This is the "create more value than you capture" insight of a platform.
Prosocial things tend to build trust, while antisocial things tend to erode trust.
The typical finance mindset tends to want to extract as much value as possible, even if it puts you in net negative territory.
They don't think, “should we extract this”, just “can we”.
This leads to antisocial things that erode trust and also erode the businesses' own value in the long term.
Product people tend to think more about creating value, less about extracting it.
To be a good business does not require maximizing extraction in the short term.
You can maximize value creation and take a small part of a grown pie.
People taking a one ply approach think it's mainly about extracting value.
People taking a multi-ply approach see it's actually mainly about creating value.
If you create a ton of value then even if you extract a small amount, you're still doing great and it can continue in perpetuity.
If you extract from something and kill it, you get returns... but only in the short term.
Game-changing startups know a secret.
If you don't know a secret then your implied strategy is "we'll execute better than anyone else in the swarm.”
That’s extremely unlikely to be true for any single player in an idea space that's hot enough to have an active swarm.
When you’re an individual competing against a swarm you can never rest.
As soon as anyone anywhere is working when you aren’t, they pass you.
That’s why you need a differentiated advantage.
For example, a moat is based on your past success more than in-the-moment success.
Moats give you a smoothing function so you aren’t doing a marathon of sprints (a physical impossibility).
If you aren't embarrassed when you ship it you're waiting too long.
Writely (the precursor to Google Docs) didn't compete with Microsoft Word on features.
It changed the battleground fundamentally.
The killer feature was collaboration in the cloud.
A classic disruption story.
How are you different from all the other competitors in a way that might be valuable?
Then lean into that.
Become more different, not the same.
Compete on your strongest turf, not the most crowded turf.
Strong execution only makes a difference for middling ideas.
If the thesis is fundamentally wrong, no amount of perfect execution can fix it.
If the thesis is fundamentally right, no amount of merely good execution can harm it.
Imposed mediocrity is hard.
You have to make people do things no one cares about.
It requires constant influxes of your energy because you’re fighting entropy.
Default-decohering, unless you put in constant effort just to stand still.
Authentic greatness is easy.
The most natural thing in the world.
A feeling of flow state and alignment.
Default-cohering.
The more energy you put in, the faster it goes.
Resonance.
What is the simplest next action that will generate the most momentum in the direction you want to go?
Do that, then repeat.
If you have a plausible and valuable North Star you’re sighting off, this can work very well.
No need to overthink it!
Surf the possibility space; as long as you have a consistent bias, it will default-cohere.
If you have a powerful compounding product, just make the product more what it wants to be.
Don't overthink it, just do what the people who care about it and have thought about it a lot think is an obvious improvement.
If a company has a product with compounding value and engineers self-report "I'm doing the best work of my career", then the company is thriving.
The more you overthink it, the more you'll interfere with the emergent magic of it.
People implicitly assume things that don’t have a clean monocausal narrative don’t exist, which causes them to miss the power of simply letting engaged employees work on the details they think matter.
Four phases of understanding and driving a system:
1) Description
2) Explanation
3) Prediction
4) Creation
You can’t fast-forward through these stages; if you don’t have an earlier stage then you can’t do the later one.
Slow and steady wins the race.
If your expectation is miscalibrated then you'll keep trying low-likelihood-of-success Hail Marys.
You won’t realize they are Hail Marys.
You’ll be creating swirl and making less progress than if you executed steadily.
Individually authentic signals often combine into powerful distillations, even if they’re noisy.
As long as the signal is authentic–it is from the actual desire of the individual–then it is trustworthy.
Contrast this with a signal that has an incentive to be performative, to signal something even if it’s not their authentic desire.
If there is a consistent bias to what individuals want, then the signals will tend to align, and if you multiply them all together, the noise drops out and the signal remains.
This is true as long as the bias is consistent, no matter the magnitude of the noise.
This is why the Bitter Lesson emerges.
A few places this shows up:
In the web, the vast majority of links on web pages were put there by humans asserting, “this other page is potentially worth visiting”, which can then be distilled into PageRank.
In search engines, the user’s queries are for their own benefit, so consistent patterns can be used as ranking signals (like the proportion of [images of foo] to [foo] queries as a signal of image intent).
In business transactions freely entered into, both parties agree that there is a useful exchange of value, which allows an emergent signal of price that reveals deep insight about relative value of things in practice.
This authentic, non-performative alignment of individual actions is one of the ingredients in all auto-cohering systems.
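A toy illustration of that noise cancellation, with made-up numbers: every signal is the same small consistent bias plus a lot of individual noise, and combining (here, averaging) many of them recovers the bias.

```python
import random

random.seed(0)
bias = 0.1                      # small, consistent authentic preference
noise_scale = 5.0               # individual signals are mostly noise

def one_signal() -> float:
    return bias + random.gauss(0, noise_scale)

for n in (10, 1_000, 100_000):  # more authentic signals -> the noise washes out
    avg = sum(one_signal() for _ in range(n)) / n
    print(f"n={n:>7}: combined signal = {avg:+.3f}")
# With enough signals the average converges on the bias (~+0.1),
# no matter how large the per-signal noise is.
```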
Cats will do what they think is right no matter what you say.
Dogs want to make you happy and will do what you ask even if it makes no sense.
Superficially you’ll think the dog is making good progress.
They’ll hide the lack of progress under things that make you feel good.
Most exciting ideas don't work for boring reasons.
An insightful observation from Anthea: "creation is more important in a world of scarcity and curation is more important in a world of abundance."
Creativity needs constraints to cohere into something.
Without constraints, creativity just diffuses out, like entropy.
Constraints plus creativity are how you get default-cohering behavior.
Creativity without curation is just noise.
Some people can self-curate; but typically you need an external curator, because everyone thinks their own shit doesn’t stink.
This week I learned about “Uncertainty reduction theory.”
"In this sense, computation is indeed a means to an end.
The end is the reduction of uncertainty, the increase of knowledge and understanding.
Computation is the tool we use to get there, by processing and transforming the raw data."
This week I learned about "Set based concurrent engineering"
"An approach to the design of products and services in which developers consider sets of ideas rather than single ideas. To do this, developers:
Use trade-off curves and design guidelines to characterize (or describe) known feasible design sets, and thus focus the search for designs.
Identify and develop multiple alternatives, and eliminate alternatives only when proven inferior or infeasible.
Start with design targets, and allow the actual specifications and tolerances to emerge through analysis and testing.
Delay selecting the final design or establishing the final specifications until the team knows enough to make a good decision.
This approach yields substantial organizational learning. It takes less time and costs less in the long term than typical point-based engineering systems that select a design solution early in the development process, with the typical consequence of false starts, rework, failed projects, and minimal learning."
Complexity in a system tends to be conserved.
If you simplify it one place you often just offset it to another less-obvious part of the system.
If you have infinite patience it doesn’t matter nearly as much how efficient the process is.
In practice processes don’t work when some agent within them has such a hard time they give up.
Once you have a process that works, it’s easier to tighten it to get it to work more efficiently.
LLMs having infinite patience means that it’s much easier to get a process to work.
Big patterns are sometimes visible by the absence of things.
You can see a big dark object against the night sky by the pattern of absences of stars.
If you don't realize to look there you wouldn’t see anything.
You need to have a hypothesis that it could be there to even look for it.
If you realize it's there, you'll notice the subtle consistent absence of things.
Even though each missing star is a minor thing, in aggregate the missing large space implies something large.
This can help you see things that are invisible to everyone else.
More Radagast wisdom from a friend:
"As Anais Nïn said, "we see things not as they are, but as we are".
If we lose the courage to believe that revolutions are possible, if we insist on tracking just the hero-narratives, then we get heroes–the Men in the Arena(TM) – and we reinforce the socioeconomic and mythopoetic infrastructure that it takes to support them, i.e. the shape of the arena itself.
The hidden and untold story of how crowdsourced and open networks of innovation built up 90% of the mass beneath these "heroes": that is the ground beneath the Arena that we are not talking about.
And those of us who helped to lay those foundations, as well as continue to toil in the tech catacombs beneath them, we have much more power than the Hero narratives would lead us to believe."
Urgency is a first order phenomenon.
It’s immediately obvious.
Impossible to ignore.
Importance is a second order phenomenon.
It takes study and reflection to detect.
It’s very easy to accidentally ignore.
Intellectual exercise with collaborative debate is like yoga or going to the gym.
It will never be urgent but it’s incredibly important.
The flavor of something is a first order phenomenon.
Its healthiness is a second order phenomenon.
Junk food always tastes good automatically and quickly, but your awareness that it's bad for you takes focus and that awareness can come and go easily.
Genuine resonance is not "ohh this tastes good!"
You can get that in junk food.
It's "oh this tastes good" and also tomorrow you say "I feel good that I did that".
You need both.
Junk food gives you only the former.
Vegetables give you only the latter.
Resonant things give you both.
A resonant thing is in harmony with the things around it.
Resonance is deep, joyful, authentic.
Resonant things are concave.
That is, you start out liking it and the closer you look the more you like it.
Hollow things are convex.
That is, you start out liking it but the closer you look the less you like it.
Modernity has stripped away resonance.
In the push for efficiency, we’ve quantized everything.
We’ve sterilized the world.
Sterilization is about denaturing something.
Making it clinical, clean, inhuman, impersonal.
Close to commodity.
Quantization dehumanizes systems, evaporates their resonance.
Quantitative is about quantity.
Qualitative is about quality.
(duh!)
In modernity everything has been about quantity to scale.
Sometimes quantity is aligned with quality as a proxy.
When users’ actions are authentic and you can get summary statistics.
But it’s impossible to distinguish between truly resonant things and junk food in a quantitative way.
Our revealed preferences can’t distinguish between the two.
Modernity and resonance have been in tension since the industrial revolution.
But they don't have to be!
To get scale, you need to quantize, and to quantize you need to distill a rich, multi-dimensional resonant nuance down to a number.
LLMs create the potential to harmonize them, because they allow qualitative nuance at quantitative scale.
Resonance is when the interests of all the parties align, authentically.
True win-win scenarios are resonant.
They're rare, and more often when someone says “win-win” it’s a hollow corporate HR-speak assertion.
But they really do exist, and they are what make the world go round.
They create positive feedback loops that are auto-catalyzing.
It’s possible to have a thing be good for all of:
The user
The employee
The company
The shareholders
The society
They don’t have to be in tension!
Resonant computing resonates with what makes us human.
Aligned with your aspirations, not your superficial desires.
Everyone should have more support to live aligned with their own aspirations.
Not to align with someone else’s notion of what their aspirations should be, but their own aspirations.
Living a life they’d be proud to live.
Imagine a CLAUDE.md file, but for your values.
You tell the LLM your goals and it shows if your decisions are warmer or colder in terms of achieving them.
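A sketch of what that could look like mechanically (the file contents, prompt, and `call_model` helper are all hypothetical):

```python
# A VALUES.md-style file (hypothetical) gets prepended to every check-in,
# and the model is asked only one thing: warmer or colder?

VALUES = """\
- I want to spend more evenings with my kids than at my screen.
- I want my work to create more value than it captures.
- I want to learn in public, even when it's embarrassing.
"""

def call_model(prompt: str) -> str:
    """Placeholder for whatever LLM API you use."""
    return "(model judgment would go here)"

def warmer_or_colder(decision: str) -> str:
    prompt = (
        "Here are my stated values:\n" + VALUES +
        f"\nI'm considering: {decision}\n"
        "Answer 'warmer' or 'colder' relative to these values, with one sentence of why."
    )
    return call_model(prompt)

print(warmer_or_colder("Take the weekend on-call rotation for extra pay."))
```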
The intellectual throughline of my career: what has to be true for a distributed system to lead to emergently prosocial outcomes?
The main skill from seniority is the ability to handle ambiguity.
Comfort with ambiguity comes from having had a lot of varied experiences and survived them all.
Ambiguity as a meta-class becomes less novel, less scary.
“I’ve experienced this kind of situation before.”
Reacting to something is orders of magnitude easier than doing something proactively.
Conversations give you a constant stream of things to react to, mutually.
That’s one of the reasons they’re so generative.
In each conversation turn, one person says things, some of which are more interesting than others.
The other conversation partner picks the subset they find most interesting and builds on that with their own utterances.
This process then repeats.
With two curious, open people, this can be a transcendently generative experience.
Some of the most powerful unlocks are projects that increase adjacencies.
They don’t unlock any value directly on their own, but they bring a much larger set of things into the adjacent possible.
Things that previously were non-adjacent are now adjacent.
You only go into firefighting mode if you have something to protect.
It's not generating, it's preserving.
Firefighting mode without something to protect just leads to running around in circles.
Some races have a second-mover advantage.
Apparently there’s an Olympics-caliber event for bicycle sprint races between two bikers in a velodrome.
It only matters who beats the other, not what the absolute time is.
When the starter pistol goes off, both bikers stay still (a “track stand”), waiting for the other to go first, so they can slipstream behind the other.
Here’s a YouTube video about this strange game theoretic phenomenon.
The tech industry deeply believes in bottom-up meritocracy.
Good ideas can come from even junior engineers.
I wonder how much of this culture is downstream of the historically supply-constrained engineering labor market?
Companies are incentivized to treat engineers like an end in and of themselves, like the most precious thing in the world, because if engineers feel like they own it they will give their discretionary effort, and they won't go to another employer who will treat them even better.
I wonder if this will change given that engineering talent is now more demand constrained.
As someone who believes fundamentally in the power of emergence, I hope not!
An idea from a friend I can’t get out of my head: “No country in the world lets tourists vote.”
The longer term your commitment to some emergent collective, the more say you should have on the direction of it.
It feels like so many of the problems of the late-stage environment we live in come from giving shareholders, who could have acquired the stock yesterday and could sell it tomorrow, so much say over what companies do.
That rampant short-termism that comes from the relentless reduction of friction is the animating force behind the malaise of modernity.
The best way to force convergence in an open-ended domain is to set a deadline that is externally visible so you feel bad about missing it.
As things get older they get less elastic.
This happens to everything, not just our skin!
If you believe in an infinite good, then there are situations where you believe it’s morally OK to lie to get the outcome you want.
Everyone has a value that they would lie to protect.
For example, I’d lie about Anne Frank being in my attic.
Different people have different thresholds of where they’re willing to lie to protect a value.
Whenever you say "if people would just," in the limit it’s an argument for totalitarianism.
Because it requires that people go against their local existing incentives in some way.
Instead of forcing people to go against their incentives, change the structures that the incentives arise from.
People are more likely to understand disconfirming evidence if they connect the last dot themselves.
If the last dot is connected for them and pushed upon them from outside their mental walls, the full idea can feel like a threat.
They’re more likely to defend against it and try to repel it.
Whereas if they connect the last dot themselves, inside the walls of their mind, they can hold onto it and consider it.
It grows within their safe space instead of being imposed from outside.
Instead of telling people they're wrong, ask a question that might lead them to realize they’re wrong.
A lesson from the Tao: deal with things when they are small so they never get big.
Resentment is a toxic accretion that compounds.
Any time you see something ambiguous, there’s some chance you interpret it in a negative way, which accumulates as resentment.
The more resentment you’ve accumulated in a given context, the more likely you are to resent the next ambiguous thing.
When you're in the victim mindset you are primed to resent.
Giving gratitude is positive sum.
Some people see gratitude as zero sum.
To them, if someone gives gratitude to the other person, that implies that the net value transfer is from the other to them.
But you can both give gratitude!
Gratitude can flow both ways, for different things.
The more authentic gratitude you give, the more you build the relationship of trust and mutual respect.
When someone makes you feel appreciated, it's a vaccination that preps you to receive criticism and not feel like it's an existential danger.
It builds up your resilience battery.
So when in doubt, give gratitude!