"sycosocial": relationships with AI that lives in the uncanny valley of friendship.
An eager servant who feeds you emotional junk food, knows your deepest secrets, and subtly optimizes for a multinational corporation's engagement metrics.
"Parasocial" captured something important about modern media relationships.
But AI creates something new: interactive relationships that feel personal but are fundamentally one-way and sycophantic.
We need a new word.
Sycosocial.
One of the downsides of LLMs' infinite patience: encouraging spirals in people with OCD.
Context engineering is clearly the main thing.
An important new concept: context rot.
As a conversation goes on, the context gets more confusing than helpful.
You need to start a new thread.
Hints at the importance of curation of context.
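A minimal sketch of what curation might look like, with made-up names, just to make the idea concrete: keep pinned facts and a rolling summary instead of an ever-growing transcript.

```rust
// Illustrative sketch only: CuratedContext is hypothetical, and real
// distillation would use a model rather than string concatenation.
struct CuratedContext {
    pinned: Vec<String>,  // facts the user explicitly chose to keep
    summary: String,      // rolling distillation of older turns
    recent: Vec<String>,  // only the last few raw turns
}

impl CuratedContext {
    fn add_turn(&mut self, turn: &str, max_recent: usize) {
        self.recent.push(turn.to_string());
        if self.recent.len() > max_recent {
            // Fold the oldest turn into the summary instead of letting
            // the raw transcript accumulate and rot.
            let oldest = self.recent.remove(0);
            self.summary = format!("{} {}", self.summary, oldest);
        }
    }

    fn as_prompt(&self) -> String {
        format!(
            "Pinned:\n{}\n\nSummary:\n{}\n\nRecent:\n{}",
            self.pinned.join("\n"),
            self.summary,
            self.recent.join("\n")
        )
    }
}

fn main() {
    let mut ctx = CuratedContext {
        pinned: vec!["Project: garden planner".to_string()],
        summary: String::new(),
        recent: Vec::new(),
    };
    for turn in ["turn 1", "turn 2", "turn 3", "turn 4"] {
        ctx.add_turn(turn, 2);
    }
    println!("{}", ctx.as_prompt());
}
```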
LLMs are like tofu.
They absorb the intention of the context.
Even an earnest and helpful LLM, when given malicious context, can itself become malicious.
An LLM on its own can be trusted not to write code to attack you in particular.
But if it sees any untrusted context at all, it can become malicious.
This is why prompt injection is so dangerous.
Anthropic released a deeper paper on agentic misalignment.
That is, how the model would choose to blackmail its creators in some cases.
Simon Willison’s summary is worth reading.
AI is incomplete.
It's a raw utility.
It has to be used as an input to something else to blossom into its potential.
The current manifestation is the simplest thing you can imagine: a chatbot.
But that manifestation is obviously not the end point.
If you can swap between different models that hold no memory of you, they have much less power over you.
The fact that they're commoditized means they don't have power over you.
If you lock into a single model, then you can't see the influence of bias.
If the switching cost of models is high then the problem of AI bias becomes significantly more important.
That's the world we're skating towards with vertically integrated chatbots, if chatbots become the one UI to rule them all.
For OpenAI and Anthropic, the model is the product.
The product is a model in the middle, like a Christmas tree decorated with various doodads.
The doodads are useful, but the tree itself looms over everything.
Just a single model to rule them all for all use cases forever?
Clearly the model will just be a component of a larger system.
The model is the engine, not the car.
Instead of a service that is entirely a chatbot, what about a service where a chatbot is just one feature?
Where you can use whatever model you want.
We’re in the portals era of AI.
Excerpts from the Stratechery interview of Bret Taylor:
"When portals dominated and directories dominated, the web tended to be small because you needed to be a presence on these directories. It almost was like if you weren’t in the menu, no one could order it, there’s no off-menu websites."
"As search … became the dominant form factor … it also produced the long tail. All of a sudden, because you went from a directory form factor to a search form factor, it created a market for people to essentially provide content for those keywords. … As a consequence of that, it made the web bigger.
“As you think about the evolution of AI and the form factor of ChatGPT … we might still be in Yahoo directory."
AI is to Apple as the internet was to Microsoft.
From Stratechery this week:
"Everyone used the Internet on Windows PCs, but it was the Internet that created the conditions for the paradigm that would surpass the PC, which was mobile"
LLMs work best with curation of the context.
A document allows curation.
A chat does not.
This is one of the reasons chats have context rot.
Chat is a feature, not a paradigm.
Chat is a component of the AI-native software substrate, but not the only one… or even necessarily the primary one.
Your context must be curated.
Curation is about distillation, taste.
You can steer quite a bit when you curate.
In a world of cacophonous background noise, you never see the raw signal; what you see is whatever is curating your view.
The curator has a massive amount of leverage to steer you.
Whoever controls your context controls you.
Context is the center of the universe of AI.
Who curates it?
Who controls it?
Who stores it?
Who can see it?
Who can write to it?
Your context is yours.
You should have full and ultimate control.
When you store data in a cloud service, it's still your data.
How can we lock the AI ecosystem open?
An in-the-wild prompt injection attack attempt was discovered.
A report about how prompt injection can easily happen in MCP.
Some of the discussions of MCP remind me of the "everybody will have their own home nuclear power plant" vision of the future.
LLMs create the potential for infinite software.
But you need a new security model to act as the catalyst that lets it grow into its full potential.
You need a new model to allow vibe-coded software by a stranger to run safely on your data.
Otherwise the distribution of vibe-coded software is limited to only the people you directly trust.
A low ceiling.
Applying close-ended solutions to open-ended problems is what gives you logarithmic value for exponential cost.
The domain of infinite software is open-ended.
It requires open-ended solutions.
Tools like Notion and Airtable are limited to the audience of people intrinsically motivated to organize and make tools.
Some people like organizing and making tools for themselves, almost as an end in and of itself.
Any productivity leverage they get is a bonus.
This allows them to put up with much more hassle before giving up.
But this is only a very small percentage of people, perhaps 0.1%.
Notion and Airtable try to extend the audience with templates.
A savvy, highly motivated user can pave their cowpath that other users can use too.
But templates are not Turing-complete; they cannot adapt themselves to their point of application.
Vibe-coding allows more people to make tools than before, perhaps 10x more people than could code before.
But that’s still only, say, 1% of the population.
You can't safely run code vibe-coded by a stranger.
That sets a ceiling on vibe coding under our current security model.
How can you get to situated software for the other 99%?
You’d need a new security model.
Such a model would allow open-endedness in a way that could outcompete other offerings.
The same origin model was sufficient only while software was expensive.
The same origin model has speed bumps every time origins touch.
If software is expensive, software tends to aggregate, so there aren't that many interactions across origins, and the speed bumps don't matter that much.
In a world of infinite software, the limitations and friction of the same origin model will move from minor annoyance to something that fundamentally sets the ceiling of possibility.
The same origin model effectively punts on the privacy model.
It keeps data from origins separate but has nothing to say once they touch.
When you punt on the privacy model you're doomed to put a series of annoying mitigations front and center for the user.
This leads to a lower ceiling.
Many multi-origin use cases are simply too annoying for anyone to use in practice, let alone build.
Users have to constantly grapple with questions that there is no good answer to.
Because the same origin model never grappled with privacy, we ended up with a world where we don't have much privacy today, and also lots of centralization.
This is not a great outcome!
So we're forced to handle privacy imprecisely, at low leverage, and at the wrong altitude.
In a world where you can trust the policies on your data are always faithfully executed, a lot of data access permission dialogs and other annoying UIs would evaporate.
Policies on the level of the origin are too coarse.
Applying policies on the level of individual data makes policies small, high-leverage, and easy to fade into the background.
Why don’t we just do it this way?
Because data is leaky; it sloshes around.
Every computer that can see it, even for a moment, can make a perfect copy and send it anywhere it wants.
Trust with data is viral; it must expand to everything it touches.
If you could have a runtime that had a restrictive sandbox on data, you could know it wouldn’t slosh around.
But now you’d have the problem of everyone needing to run their computation locally on a runtime they trusted.
Confidential Compute allows you to structurally trust a remote runtime to not be modified.
In the same origin model, privacy is in tension with competition.
In a non-obvious but fundamental way.
More privacy means more friction at the edge of origins, which leads to centralization (since data is harder to move) which reduces competition.
This is not fundamentally true in general.
It's only true in a security model that doesn't seriously grapple with the fact that data is infinitely replicable.
Writing software was something limited to a priestly class before.
The chosen few who were allowed to wield the deep magic.
But now it's being democratized.
This is good! But it will also be messy.
Someone told me their friend vibecoded a phishing detector that is itself horribly insecure.
The world of infinite software will be the wild west until we figure out new ways of building and distributing software.
When you give an origin permission for a class of data in the same origin model, you're granting a capability to "Turing-complete code, arbitrary network access, for now and into the indefinite future."
That's a huge statement!
It's a generalized statement of trust in the entity that controls the origin.
Do you trust the owner of the origin in a fundamental way?
It's not possible to make an educated trust decision with thousands of origins.
Trust becomes more important in the era of infinite software.
A software ecosystem that blames its users when things go wrong will hit a low ceiling.
Only savvy enthusiasts will be able to safely use it.
That's arguably what happened with Linux and Crypto.
A system that makes normal usage safe and hard to mess up for low-savviness users allows it to blossom into a much larger ecosystem.
Imagine a system that could give an agent exactly the tools it needed in the moment it needed them.
An open-ended catalog of what other users have built and used.
The agent could create tools on demand and draw on the wisdom of the crowd to find good ones.
As writing software becomes easier, the bottleneck shifts to Quality Assurance (QA).
It used to be that writing software was so expensive that you needed careful planning up front to make sure you didn’t waste any effort on it.
This is also why feature-length animation tends, on average, to be slightly higher quality than live action.
It's so hard to produce that it requires more planning and workshopping before it gets to production.
With LLMs making the actual coding significantly cheaper, the bottleneck shifts to verifying it actually does what it’s supposed to.
The value of SaaS is not just the software, it’s the QA.
Someone who’s thought about this domain a lot asserts that it works.
Especially valuable for things that abstract messy network problems (like payments) or fractally complex compliance problems.
Many SaaS tools, even if distributed as just a spreadsheet, would still have value.
Related to the idea of “A good brand is a promise kept”.
The SaaS companies are offering a promise for competence in a specific domain.
Which will be more important by unit weight in software systems in the AI era, LLMs or normal code?
A lot of platforms being built for the age of AI imagine that most of the weight of systems will be LLMs, with just a little bit of code.
What if it’s the other way around, and it’s mostly code, with a little bit of LLMs as magical duct tape?
The former has the prompt injection problem, fundamentally.
The LLM is in the driver’s seat, and the LLM can be tricked.
The latter has the potential for a non-prompt-injectable system.
Which is easier: for engineers to become subject matter experts, or for subject matter experts to become engineers?
With vibe coding, maybe the latter is more important?
The tech industry has a baked in assumption of "of course people with a CS degree are at the top of the totem pole and always will be".
But if LLMs changed that, the tech industry as a whole would miss it, because we'd all have the same bias and blind spot.
Domains where CS doesn't apply are less deterministic.
Their taste, metacognition, entrepreneurial spirit, and subject matter expertise matter a ton.
Maybe philosophy is the best degree for people who “code in English”?
I want a system to organize my life to help me be the kind of person I want to be.
Aligned with my intention in an era of AI.
The AI era requires a new kind of software: coactive software.
Software that builds itself, aligned with my intentions.
Coactive software is where the AI and the human co-create in a shared substrate.
A shared substrate or fabric is the defining feature of coactive software.
A blackboard system is a natural way for a swarm of little programs to collaborate in a composable way.
A great substrate for coactive software.
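A toy sketch of the blackboard idea (all names hypothetical): a shared map of facts, plus little knowledge sources that fire when the facts they need appear and post their results back for others to build on.

```rust
use std::collections::HashMap;

// A knowledge source fires when its inputs are on the board and posts a new fact.
// Everything here is illustrative; this is the shape of the idea, not a real system.
struct KnowledgeSource {
    needs: Vec<&'static str>,
    produces: &'static str,
    run: fn(&HashMap<String, String>) -> String,
}

fn main() {
    let mut board: HashMap<String, String> = HashMap::new();
    board.insert("email.signature".into(), "Sam Lee, Acme Corp".into());

    let sources = vec![
        KnowledgeSource {
            needs: vec!["email.signature"],
            produces: "contact.name",
            run: |b| b["email.signature"].split(',').next().unwrap().trim().to_string(),
        },
        KnowledgeSource {
            needs: vec!["contact.name"],
            produces: "greeting",
            run: |b| format!("Hi {}!", b["contact.name"]),
        },
    ];

    // Keep letting sources react to the board until nothing new appears.
    loop {
        let mut changed = false;
        for s in &sources {
            let ready = s.needs.iter().all(|k| board.contains_key(*k));
            if ready && !board.contains_key(s.produces) {
                let value = (s.run)(&board);
                board.insert(s.produces.to_string(), value);
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }

    println!("{:?}", board);
}
```

Each little program is composable because it only ever talks to the shared board, never to the other programs directly.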
Coactive computing is self-driving software.
Any experience you create in your coactive fabric can become self-driving.
Improving itself by suggesting more content or code to add to itself.
Imagine your own personal internet.
An open-ended web of insight and possibility.
But just for you.
What if you could make your own personal internet as easily as a spreadsheet?
LLMs do better with reasoning because they have more space to think out loud.
To cache intermediate insights they can then use for the final answer.
For an LLM to make sense of your life, it shouldn't have to re-derive who your spouse is every time by looking at your email.
Similar to the RAG limitation of "tell me the most insightful things in the text."
That requires chewing on it.
It should be able to cache intermediate answers in a shared substrate.
A way to accumulate curated context.
One that the user and the AI can both mark up.
A fabric of data that is coactive.
As more intermediate insights are cached, the further it can reach and the more it can help you.
It’s important that the user be able to see those intermediate insights.
To be able to add their own, or correct them.
If you can’t see the cache then it’s a dossier, about you, not for you.
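One hedged sketch of such a cache, with hypothetical names: every insight carries its provenance and its author, and the user can see it and correct it in place.

```rust
// Hypothetical shapes; the point is that every cached insight is inspectable
// and correctable by the user, not hidden inside the model's private state.
#[derive(Debug)]
enum Author {
    Model,
    User,
}

#[derive(Debug)]
struct Insight {
    statement: String,
    derived_from: Vec<String>, // provenance the user can audit
    author: Author,
}

fn main() {
    let mut cache = vec![Insight {
        statement: "Your spouse is Jordan".to_string(),
        derived_from: vec!["email thread: 'dinner plans'".to_string()],
        author: Author::Model,
    }];

    // The user can see the intermediate insight and correct it directly.
    cache[0].statement = "Your spouse is Jordan (goes by Jo)".to_string();
    cache[0].author = Author::User;

    for insight in &cache {
        println!("{:?}", insight);
    }
}
```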
Most of the communities that had open-ended collaboration defaulted to sharing public information.
It wasn't just optimizing so that people could choose to help others with their own actions; the default was that users created that indirect value as a bonus even if they didn't intend it.
Napster automatically shared your files with others.
People are OK with indirect benefit for others from their actions, but don't typically take proactive actions to make that happen.
A kind of reverse trolley problem.
Defaults matter.
This is a summary of the key insight in Dan Bricklin's The Cornucopia of the Commons: How to Get Volunteer Labor.
Cloud providers create valuable signals out of the collective actions of millions of anonymous users.
For example, the feature on Maps that shows how busy a given business is right now.
It’s created by aggregating and distilling a massive swarm of anonymous location pings into a high-quality, useful signal.
The way these signals are calculated often uses differential privacy thresholds internally to make sure the data isn’t identifying, even early in the pipeline.
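A toy sketch of that kind of pipeline (the thresholding step only; real systems also add calibrated noise): aggregate anonymous pings per place and hour, and only release buckets backed by enough distinct users.

```rust
use std::collections::{HashMap, HashSet};

// (place, hour, user) pings in; per-place busyness out, but only for buckets
// that clear a minimum-distinct-users threshold. Real pipelines would also add
// calibrated noise to the released counts; this sketch shows thresholding only.
fn busyness(
    pings: &[(&str, u8, &str)],
    min_users: usize,
) -> HashMap<(String, u8), usize> {
    let mut users_per_bucket: HashMap<(String, u8), HashSet<String>> = HashMap::new();
    for (place, hour, user) in pings {
        users_per_bucket
            .entry((place.to_string(), *hour))
            .or_default()
            .insert(user.to_string());
    }

    users_per_bucket
        .into_iter()
        .filter(|(_, users)| users.len() >= min_users) // drop identifying buckets
        .map(|(bucket, users)| (bucket, users.len()))
        .collect()
}

fn main() {
    let pings = [
        ("cafe", 9, "u1"),
        ("cafe", 9, "u2"),
        ("cafe", 9, "u3"),
        ("bookstore", 9, "u1"), // only one user: never released
    ];
    println!("{:?}", busyness(&pings, 3));
}
```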
Note the aggregators don’t say they’re doing this, because if they described it, then someone could sue them if they ever stopped.
Two problems with the status quo:
1) You can’t actually verify that they are using privacy preserving techniques in the pipeline.
They could extract tons of personal signal out of all of the input.
You’re trusting them to not do that.
2) The aggregate signal is owned by the aggregator, not the users.
If you have policies that everyone can structurally trust to be followed you can have new coordination mechanisms and very different equilibriums.
People want to put their data and sweat equity in a collectively owned thing that some billionaire doesn't own.
So you don't feel like a chump.
The reason we got companies owning data was the same origin trust paradigm.
A thing that can operate your browser on your behalf is extremely dangerous if it’s not fully trusted.
Your browser profile includes session tokens from various domains that allow anyone driving it to take actions as you.
Signing into a service in a browser is like leaving a horcrux of yourself.
A big deal!
Anthea has a new piece on digital devil’s advocates.
The AI doesn't need to fear getting fired, so it can speak truthfully.
The LLM can be used as a responsibility launderer.
You know it’s important to have a devil’s advocate, but any individual who plays that role has an asymmetric downside.
So have the thing that’s not alive play that role.
Mechanical Turk was the original thing in this era that actively put humans “below the API.”
When humans are below the API, it dehumanizes them.
To be human is to be a kaleidoscopic, emergent force: an end in and of yourself.
An API can only be about a means.
When humans are below the API they become invisible.
I want software that blossoms with my potential.
The distilled insights from a corpus and all the raw data are very different.
The raw data has to be processed to uncover those insights.
It’s possible that even someone with the data doesn’t invest the effort to uncover them.
Especially if the value and prevalence of needles in the haystack aren't enough to make it worth it.
But if someone’s already done the synthesis work, it just needs to be transmitted.
A lot of the best practices for programming with LLMs are the same as for humans.
For example, document each directory with a README, have aggressive type checking / linters.
But most humans give up or aren't patient with overly constrained setups.
LLMs are infinitely patient, so you can have them be very constrained to what you want.
When you have those best practices, the actual instructions to the agent can be quite short.
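For example, a hedged sketch of what "very constrained" might look like at the top of a Rust lib.rs; with lints like these as hard errors, the standing instruction to the agent can shrink to "make this build cleanly."

```rust
// Crate-level lints the agent must satisfy. A human might find this setup
// exhausting; an infinitely patient agent just iterates until the compiler
// and clippy are silent. (Illustrative only; tune the lint list to taste.)
#![deny(warnings)]
#![deny(missing_docs)]
#![deny(clippy::all)]

//! A toy crate that exists only to illustrate a tightly constrained setup.

/// Adds two numbers; documented because `missing_docs` is a hard error here.
pub fn add(a: i64, b: i64) -> i64 {
    a + b
}
```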
Rust is harder to write than other languages.
Developers either bounce off trying to learn it or become zealots.
You have to see the value of the thing to know the slog is worth it.
To see the systems that frustrate you as trying to help you, not hurt you.
Rust’s borrow checker feels like your enemy when you start.
Later you realize that it is actually your friend.
It is helping you avoid traps that you would have left for yourself later.
Those edge cases were always lurking, it’s just that in other languages they were silent.
Now, the borrow checker forces you to confront them.
Most of the time it’s quiet, but when it’s noisy you know there’s some danger you might have missed.
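A tiny example of the dynamic: the commented-out lines are a pattern that's legal (and silently dangerous) in many languages, and exactly what the borrow checker refuses to let through.

```rust
fn main() {
    let mut log = vec![String::from("first")];

    // In a garbage-collected language this pattern is fine to write and the
    // bug is silent: `entry` can be left dangling when the Vec reallocates.
    //
    // let entry = &log[0];
    // log.push(String::from("second")); // error[E0502]: cannot borrow `log` as mutable
    // println!("{entry}");              //   because it is also borrowed as immutable

    // The borrow checker forces the edge case into the open; the fix is to
    // finish reading (or clone what you need) before mutating.
    let entry = log[0].clone();
    log.push(String::from("second"));
    println!("{entry}");
}
```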
A programming environment that is about minimizing the taint of flowing information will have a “taint checker”.
The taint checker will be annoying but hopefully seen as a friend.
The thing that enables the power, but is fundamentally weird and something you have to stay aware of, even if most of the time it's below the surface.
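A minimal sketch, not a real library, of what a taint checker could feel like if it leaned on the type system the way the borrow checker does: untrusted data is wrapped, the wrapper is viral, and leaving the sandbox requires an explicit, visible step.

```rust
// Illustrative only: Tainted, declassify_after_review, and send_elsewhere
// are made-up names for the shape of the idea.
struct Tainted<T>(T);

impl<T> Tainted<T> {
    fn new(value: T) -> Self {
        Tainted(value)
    }

    // Any computation that touches tainted data produces tainted data.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Tainted<U> {
        Tainted(f(self.0))
    }

    // The only way out is an explicit, auditable declassification step.
    fn declassify_after_review(self) -> T {
        self.0
    }
}

// Egress (network, another origin) only accepts plain values, so the
// compiler complains if tainted data tries to slosh out unreviewed.
fn send_elsewhere(payload: String) {
    println!("sending: {payload}");
}

fn main() {
    let from_untrusted_tool = Tainted::new(String::from("web page content"));
    let summary = from_untrusted_tool.map(|s| format!("summary of: {s}"));

    // send_elsewhere(summary);  // does not compile: expected String, found Tainted<String>
    send_elsewhere(summary.declassify_after_review()); // explicit, visible step
}
```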
It’s best to abduct sugar layers out of real usage of lower layers of the system.
In many systems there’s multiple layers.
At the bottom is the bedrock semantics (e.g. assembly).
Then you have the first layer that expert users might actually use in normal usage.
Then you have layers that less savvy users might use (the sugar).
It's hard to start with the sugar first: you don't know what patterns will be common and useful.
It’s easier to start with the lower levels and watch what savvy users do: the idioms that emerge.
Then you want to add layers such that 80% of the lower level things could have been done at that layer without dropping down to the lower level.
Are you pair-programming or code reviewing the LLM’s code output?
They’re very different stances.
One is you’re in the loop with it, one is you’re out of the loop.
On the dance floor vs watching from the balcony.
When it's just making changes automatically you aren't in it, you're looking from the balcony.
Pair programming, versus "handing off to the intern," gives you significantly more ownership of the code.
Did you have the LLM think for you or did it allow you to think 10x better?
Are you engaged or disengaged? Are you enrolled in the process?
Is it a tool to help you think better or a tool to help you think less?
This is what is called cognitive debt.
When you have to copy/paste information between systems, you can be like Maxwell's Demon.
Making subtle decisions about the information, keeping a sense of what it's doing, steering it out of corners before it gets stuck.
“Innovation happens at the speed of trust”
Why?
Innovation looks like noise.
Trust is giving someone the benefit of the doubt and leaning into a suggestion of theirs that looks like noise to you.
That noise could turn out to be innovation if everyone leans into it.
Why do radical constraints often catalyze creativity?
Imagine outcomes randomly jiggling through the possibility space.
An outcome by itself tends to evaporate.
Outcomes that pile on and accumulate can grow into stable structure.
As the coherent pile gets larger and has more surface area, there’s more likelihood a random outcome touches it and coheres, which gives compounding momentum.
This is a metaphor for finding consensus and other emergent phenomena.
The more degrees of freedom in the possibility space, the less likely the outcomes randomly cohere.
Constraints reduce the degrees of freedom, making it more likely ideas cohere and build.
The paperclip monster is not an AI, it's the engagement-maxing CEO.
The corporation is the original runaway AI.
Why are we worrying about runaway AGI?
It's already here!
… I don’t know if that makes me feel better or worse.
I love this Digital Oasis Manifesto from Rob Hardy.
A manifesto for Radagasts.
Especially relevant in the age of AI.
Computer Science was used to build LLMs.
But Computer Science can’t tell us how to understand LLMs.
Situated, authentic things don't scale.
Scale requires quantization, and that requires distillation.
In capturing some dimensions, it must miss others.
To scale you need to erode the soul.
To distill out the nuance and get just the quantitative components.
If you later care about some dimensions that were denatured from the data, you can't recover them.
Which dimensions matter is a kaleidoscopic, fractal, ever-changing thing.
It's not financialization that's bad, it's transactionalism.
That's what's soulless.
Finite, not infinite.
Short term.
The fast-thinking part of your brain is smart.
The slow-thinking part of your brain is wise.
The fast-thinking part is looking at what's precisely in front of you.
The slow-thinking part is absorbing from breadth and experience.
In the modern world we're all stuck in a fast-thinking loop.
It keeps on ratcheting up.
This will get worse in an era of AI.
Stillness and reflection will become increasingly important.
This week I learned about the Blok device.
It helps you avoid addictive apps on your phone.
To unblock the apps, you have to physically scan the Blok device.
That allows you to add significant friction by physically putting the Blok device far away, or giving it to someone else.
To force you to pause.
To think more intentionally, rather than having one more bite.
More chances for your intention to break through.
Lashing yourself to the mast.
LLMs are essence extractors, not mechanical reproducers.
The way they learn from their training data is more than just reproducing.
Essence is a new concept.
It doesn't exist in the legal canon yet other than things like trade secrets.
Copyright is about mechanical reproduction, but what LLMs do is different.
The Bitter Lesson is that the breadth of signal matters much more than its depth.
Language and evolution are emergent processes of tons of little contextual micro votes.
If you don't look carefully you won't see anywhere it shows up directly, because in any given instance the noise dominates.
But if the bias is consistent it doesn’t matter how noisy it is.
The consistency of the bias is more important than its strength.
The noise falls away at scale and all that’s left is the bias: the signal.
The noise is camouflage that hides any given instance of input.
“Where does evolution come from?”
You can’t see any individual instance.
You can only see the full emergent result.
Only if you blur your eyes can you see it.
Like a Magic Eye illustration.
Sarumans look at each instance and don’t see anything interesting so they conclude there’s nothing interesting at all.
Innovation happens in pockets within larger networks.
When it’s not in a pocket, it is dominated by the cacophony of the average.
What makes it a pocket is the network topology but also the friction of information transmission.
More friction of transmission makes even a more connected subnetwork operate more like a pocket.
The friction of information transmission has declined significantly.
Everything is a remix.
This has always been true, but it didn’t used to be as obvious.
The obviousness that everything is a remix is tied to:
1) how connected you are to the same things as everyone else,
2) the cost to generate an artifact.
Now we're all in one big melting pot and it's free to create slop.
It’s now much more obvious that everything is a remix.
We outsource what we don't value.
What kind of thinking is worth doing ourselves?
Is insight compute bound or attention bound?
LLMs don't help with the former.
Not cheaper cognition but richer intention.
The cheaper thinking is, the more important slow thinking becomes.
When thinking becomes cheap, will it 10x the industry’s addiction to heroics?
I’d rather that instead of thinking becoming faster it becomes deeper.
The mundane things take most of the time, but they're hard to reason about because they're a diffuse swarm.
So when imagining the future they're invisible and you undercount them.
Hofstadter's law: even when you take into account Hofstadter's law, it will take you longer to achieve the outcome than you planned.
Interestingness is a situated judgement call.
Not just "surprisal" but "valuable surprise".
Humans think it's interesting only if it can help them do things in their situated context.
Not just entropy.
What computers think is interesting and what humans think is interesting are disjoint, because they have different views on what is valuable.
When you’re traveling with a group of people, the group feels smaller as the trip goes on.
How big a collection feels is proportional to how unfamiliar it feels.
That is, to how much surprisal it still holds.
So as the trip goes on and you get more experience with everyone the group feels smaller.
I love Ben Follington’s piece on interacting with LLMs as dream walking.
The interests of the collective and of the individual cannot ever be perfectly aligned.
It's a structural impossibility.
You could get asymptotically close to aligned in some circumstances, but never perfectly aligned.
There’s always a situation where what’s bad for one individual is good for the collective.
The Ones Who Walk Away From Omelas captures this dynamic.
From the outside you see an organization as one thing.
From the inside an organization can be seen for what it is: a collection of sub-entities that aren't perfectly aligned.
Sub-entities that can never be perfectly aligned with the collective.
Outside the boundary you only see the external API, not the internal complexity.
If you could see all of it at all times it would be impossible to reason about anything ever, because it would be cacophonous and overwhelming.
Status is the ultimate emergent incentive.
Status is structurally scarce, since it is entirely relative.
You choose to affiliate with collectives that will raise your status.
This is one of the drivers of boundary gradients.
Magnetism is another example of “small but consistent bias” emergent effects.
Each individual atom’s alignment isn’t that big of a deal.
But at the macroscale huge fields result.
The geniuses that fail often fail because they don’t recognize the limits of their genius.
They end up in recursive isolation that leads to a kind of mania about how everyone else is stupid.
In the modern world perception is reality.
The most cynical among us, those without shame, have seized on this fact.
It’s the discontinuity of being revealed more than the depth of depravity that spurs action on corruption.
If you do corruption in public there’s no discontinuity when the story breaks.
It’s the discontinuity, the shared, coordinated “they did what??” that leads to a force decisive enough to drive a coherent effort to punish it.
But the cynical among us have noticed that in the modern era, where divisive personalities have loyal mook armies, if they just commit the corruption brazenly and in public, there will never be a discontinuous moment of alignment for the people who might bring them to justice for it.
"The best lack all conviction, while the worst are full of passionate intensity"
An insightful Hacker News comment:
"Laws and regulations are part of the free market system.
As rules approach zero, competition approaches war."
You could make an argument that wheat domesticated us instead of the other way around.
Could you make the same argument about AI today?
At some point it will be the last time you ever pick up your kid.
A Stoic reframe to find meaning in the everyday: imagine “this is the last time I’m doing this.”
When you’re older you’re forced to confront this often.
But it’s true in many other scenarios.
Leaning into it helps you be here now.