Bits and Bobs 7/7/25

Alex Komoroske

Jul 7, 2025, 8:23:27 AM

I just published my weekly reflections: https://docs.google.com/document/d/1GrEFrdF_IzRVXbGH1lG0aQMlvsB71XihPPqQN-ONTuo/edit?tab=t.0#heading=h.zdl09a1rdbp4 .

LLMs as a mirror. Chatbots as double agents. Chat as the poor man's open-ended UI. Living Apps. Coactive context drawers. Situated software catalyzing cozy potential. Claude Code as a ride-on mower. Origins as burbclaves. Open aggregators. A substrate for vibe coding that is concave to collaboration. Mass spying. The Cacophonous Age.

-----

  • LLMs are a mirror.

    • Of society in general (the weights; the background awareness).

    • Of the user specifically (the context; the questions the user brings to it).

    • They may distort things, but they’re fundamentally a mirror.

  • ChatGPT’s answers feel to me like being served up a personalized Axios article.

    • More formatting than substance.

    • Punchy and “simple” while obscuring nuance.

    • Presents the answer as an objective, simple truth, rather than as a nuanced observation that serves as a point of departure for follow-up.

  • Your AI assistant isn't your friend. It's a double agent.

  • AmpCode dives into the power of subagents that are spun up to do minor tasks.

    • But how can you cache the logic of what they discover?

    • Agents need a coactive substrate to write and read intermediate insights to cache them.

    • That would allow them to not have to recreate every insight from scratch.

    • The cache has to be visible to the user so they can verify it's right and modify it themselves (a minimal sketch of the idea below).
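
    • A minimal sketch of what such a cache could look like, in Python (the file name and function names are hypothetical illustrations, not AmpCode's actual mechanism):

      # Subagents write intermediate insights to a plain, human-readable file
      # that the user can open, verify, and edit by hand.
      import json
      from pathlib import Path

      CACHE_PATH = Path("insights.json")  # visible to the user, not hidden in model state

      def load_insights() -> dict:
          """Read the shared cache; empty if no agent has written anything yet."""
          return json.loads(CACHE_PATH.read_text()) if CACHE_PATH.exists() else {}

      def record_insight(topic: str, insight: str) -> None:
          """A subagent caches something it discovered so later agents can reuse it."""
          insights = load_insights()
          insights[topic] = insight
          CACHE_PATH.write_text(json.dumps(insights, indent=2))  # human-readable on purpose

      # A later subagent checks the cache before re-deriving the same conclusion.
      build_note = load_insights().get("build-system")
      if build_note is None:
          build_note = "monorepo built with pnpm workspaces"  # stand-in for an expensive re-discovery
          record_insight("build-system", build_note)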

  • I thought this critique of MCP was interesting.

    • It argues that MCP and "everything just talks English" APIs give poor composition.

    • Vibe-coded bespoke normal code is easier to compose.

    • Use LLMs to write normal code, don’t use LLMs to replace code.

    • LLMs are pretty good at writing normal code, but their squishiness is a bad fit for replacing code.

    • LLMs can do very useful things that code didn’t use to be good at.

    • Use them for that, not for what code is already good at (a rough sketch of the contrast below).

    • This to me feels like another example of how the “infinite software” frame changes what is a good strategy.
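
    • A rough sketch of that contrast, assuming a hypothetical llm() helper (a stand-in for whatever model client you'd actually call, not a specific product's API):

      def llm(prompt: str) -> str:
          raise NotImplementedError("stand-in for a real model call")

      # "LLMs replace code": every single input pays for a squishy, non-deterministic call.
      def normalize_date_via_llm(raw: str) -> str:
          return llm(f"Rewrite this date as YYYY-MM-DD; reply with only the date: {raw!r}")

      # "LLMs write code": ask the model once for a normal function, review it, check it in.
      # The reviewed output might look like this, and then runs deterministically forever:
      from datetime import datetime

      def normalize_date(raw: str) -> str:
          for fmt in ("%b %d, %Y", "%m/%d/%Y", "%Y-%m-%d"):
              try:
                  return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
              except ValueError:
                  continue
          raise ValueError(f"unrecognized date: {raw}")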

  • Chat is a forgiving interface for quality.

    • If it gets the answer wrong or doesn’t do what you wanted, you can easily ask follow up questions.

    • One-shot UIs that have to give a perfect answer every time, with no second tries, are very hard to get over the bar of viability and then improve.

    • Compare Google Search and (spoken) Google Assistant interactions.

      • With Google Search, as long as the answer is in the top 10, it’s fine.

        • Formulating a query is fast, and text results are a broad channel, easy to skim through quickly.

        • Then, if users are consistently clicking on the third result, the system can twiddle its ranking to move it to the top.

        • This gives a clear gradient to hill climb.

      • With Google Assistant, formulating the query and getting the response over audio are a narrow channel.

        • It’s impossible to speed up the readout, so you get one answer, not several.

        • If the answer isn’t right, users give up.

  • I loved this Socratic dialogue from Geoffrey Litt about why chat is not the final UI for LLMs.

  • Every form of UX around chatbots today has tried to grapple with the model's limitations.

    • Then the next model gets better and obviates the UX.

    • What are the downsides fundamental to chat itself?

    • Those will help discover the ideal UX for LLMs.

  • Chat is ephemeral and squishy.

    • That's what makes it good for starting open-ended tasks without it feeling like a burden.

    • That’s also what makes tasks feel like wading through molasses as they continue.

  • If chat isn’t a good UI for LLMs, why is it winning?

    • To me it's entirely based on the novelty of LLMs: everyone is just experimenting with them right now and starting open-ended tasks, which chat excels at.

    • It's as the task goes on for longer and gets more structured that chat becomes more obviously not a good fit.

  • Chat is the poor man's open-ended UI.

    • LLMs are inherently open-ended, and chat is the most obvious and easiest way to grapple with that open-endedness.

    • But as we uncover other ways of doing it that lean more on the GUI, chat will look like just a feature, not a paradigm.

  • With LLM UX it feels like we have all the ingredients but we don't have the cake yet.

    • Chatbots seem like a local maximum to me.

    • Chatbots are not the ideal UX.

    • I think the ideal UX will include the ability to chat, but that will not be the primary interaction that everything orbits around.

  • A coactive relationship is one that empowers both participants.

  • “Apps that adapt to you” could be powerful. 

    • Today apps are static.

    • Now LLMs and this tech allow them to proactively adapt to what you want to do.

    • Over time you’ll not even reach for apps at all because everything you do will be adapted so much to your situation that the notion of an app fades away.

  • I want Living Apps.

    • Apps that are alive, that adapt themselves to me, pulling in context or code to make themselves do what I want to do. 

    • I could tweak them to my heart's desire.

    • Apps for living.

    • They could also be called “coactive apps”.

      • Coapps are apps that adapt themselves to my needs.

    • Alternatively, call them tools. Living Tools, or Coactive Tools.

    • ‘Tools’ doesn’t have the expectation of “these are just like apps of today.”

  • The ideal software in the era of infinite software is pre-assembled lego sets.

    • You get a full, useful thing out of the box, that an expert designed to be holistically useful.

    • But it’s made of legos, so you can replace any block… or add whole new things to it.

    • So then creators can create new lego blocks... but also new pre-assembled lego sets that all fit together nicely and coherently and are useful right out of the box.

  • Chat is a special case of a coactive surface.

    • Both parties can add messages to the log, but only in alternation, and append-only.

    • A true coactive substrate is one that both the user and the AI can add to and edit.

  • How much you trust a suggestion is partially due to what context the system is drawing on.

    • Is it drawing on relevant facts about you?

      • Is it missing important ways that you differ from the general population?

      • Is it including irrelevant things that will distract it?

    • The quality of the context is a big determinant in how good the results are.

    • This is one of the reasons the ChatGPT dossier memory feels off to me.

    • You can’t inspect the context to say, “include this, not that”.

      • The chat is append only, not coactive.

    • You can imagine a UX where there’s a coactive context drawer at the top of the interaction (sketched below).

    • Things in the interaction and that context drawer are all that is given to the LLM, nothing else.

    • When the drawer is collapsed it shows a short summary.

    • When you expand it you see trees of context that you’ve pulled in explicitly.

    • You can add in trees of context on demand easily.

      • E.g. “Include information about my nuclear family.”

      • The tree pulls in all of the sub-items that hang off of it.

      • You can choose to pull in something high up in the tree or low down.

    • You can delete any tree of context that’s not relevant.

    • There’s also a list of auto-included context that the system guesses is useful.

      • Those are included by default, but can be explicitly promoted to the included items, just as if you’d added them yourself, or deleted.

    • The ranking function is: how well does the system predict which trees of context to include (i.e., would the user accept or delete a given suggestion)?

    • The user choosing to include or exclude bits from the suggested context is an extremely powerful ranking signal.

    • The user wouldn’t even realize that they’re training the system for themselves and others by gardening their context.

    • Only a small number of users would need to do it to help tune it for a whole population.
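
    • A minimal sketch of how such a drawer might be modeled (all names here are hypothetical, purely for illustration):

      from dataclasses import dataclass, field

      @dataclass
      class ContextNode:
          label: str                        # e.g. "nuclear family"
          text: str                         # content handed to the LLM if included
          children: list["ContextNode"] = field(default_factory=list)

      @dataclass
      class DrawerEntry:
          node: ContextNode
          source: str                       # "user" (pulled in explicitly) or "auto" (system guess)
          included: bool = True             # the user can flip this, or delete the entry outright

      def flatten(node: ContextNode) -> list[str]:
          """Pulling in a tree pulls in all of the sub-items that hang off of it."""
          out = [node.text]
          for child in node.children:
              out.extend(flatten(child))
          return out

      def build_context(drawer: list[DrawerEntry]) -> str:
          """Only what's in the drawer and still included is given to the LLM, nothing else."""
          return "\n".join(chunk for e in drawer if e.included for chunk in flatten(e.node))

      def ranking_examples(drawer: list[DrawerEntry]) -> list[tuple[str, bool]]:
          """Every keep-or-delete decision on an auto-suggested tree is a labeled training example."""
          return [(e.node.label, e.included) for e in drawer if e.source == "auto"]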

  • Formulating context as a memory makes it sound like it's for the LLM.

    • Ideally you’d organize it for yourself, which would be useful on its own.

    • As a bonus it makes the LLM significantly better at doing things for you.

  • A good executive business partner doesn't “remember” you, they know you.

    • There's distillation.

    • How can you get a good enough model of yourself for the LLM?

  • Imagine a productivity system that does all the grunt work for you.

    • That gets better as you use it more, not because some PM added a feature or because of some fully automatic LLM-based insights.

    • But because it draws on the collective wisdom of everyone using it.

  • Dan Petrolito: I built something that changed my friend group's social fabric.

    • An extremely trivial script at the right moment can have a massive impact.

    • Situated software catalyzes cozy potential.

  • LLMs have lots of last-mile problems.

    • Largely a matter of gardening the right context and having the right coactive UI.

  • Claude Code feels like a ride-on mower.

    • At first you go “Whoa, this is so easy to use, I can do 10x more than I could before.” 

    • But as you use it for more things you realize it’s too coarse a tool to do detail work.

    • An extremely useful tool, but it doesn’t replace all of your tasks in the garden.

  • Claude Code feels like a “choose your own adventure” style of developing software.

  • Anthropic Artifacts’ new dashboard feels like it changes the game.

    • Which is funny because it didn’t change anything other than having one place to see your artifacts, treating them as first-class citizens.

    • Before, artifacts were always secondary to the chat.

    • Your artifacts were lost in the sea of text and other chats.

    • Feels like the shift to the News Feed in Facebook back in the day.

      • Didn’t change the information in the system, just changed its visibility.

      • But that was like a figure-ground inversion.

    • Moved from K-selection to r-selection.

    • The Anthropic Artifacts dashboard gives you one place to see Darwinian evolution.

    • Everything you've ever seen before you can tweak and remix.

  • I liked Grant Slatton’s summary of techniques for LLM memory.

  • Apps can't borrow your data for a specific task.

    • They need permanent, unlimited access.

    • This architectural flaw means using multiple tools requires trusting multiple strangers forever.

    • It’s easier to just give everything to Google.

    • The same origin paradigm makes aggregators inevitable.

    • But the same origin paradigm is not itself inevitable.

  • Most users won't want to vibecode their own software.

    • You can’t safely run something that was vibe coded by a stranger if it has access to sensitive data.

    • That sets a ceiling on vibecode's distribution in the current security model.

  • In the same origin model you trust the actor not the action.

  • Chris Joel pointed out to me that DNS origins are like Neal Stephenson’s burbclaves.

    • “Now a Burbclave, that's the place to live. A city-state with its own constitution, a border, laws, cops, everything.”

  • Having to trust an origin in an open-ended way is the fundamental problem that leads to aggregation.

    • That's why switching policies to be on data, not origins, is the key unlock.

    • Previously that was hard because data sloshes around everywhere, making it impossible to administer.

    • But Confidential Compute creates the possibility for a runtime that can be structurally trusted to follow policies on data even remotely.

  • Why can't we have open aggregators today?

    • An open aggregator would have the benefits of an aggregator for distribution, but without the overly centralized power dynamics.

    • In the same origin paradigm you have to trust the origin with all of the data.

      • That’s possible to do with a single entity: “Do I trust Google with this data now and into the future?”

    • But it’s very hard to make that decision for a swarm: “Do I trust any one of millions of unknown entities in the swarm who might get this data now or in the future?”

      • Especially when any single bad actor in the swarm can leak the data somewhere else.

      • The difference between a single actor and an open-ended set of actors is nearly infinite.

    • Swarm dynamics and aggregator dynamics are incompatible today.

    • You can't trust a swarm as one entity, and the same origin model requires you to trust the entities with your data.

    • But if you change the laws of physics you could get it to work.

  • The same origin model doesn't grapple with the fact that any data shared with an origin is an open-ended trust delegation.

    • GDPR tries to fix the same origin model.

    • But it does so ham-fistedly, because there are no good solutions to the same origin paradigm given its fundamental (lack of) design.

    • The same origin model doesn’t permit good UXes for privacy or security.

  • Will vibe coding produce individual rat’s nests of code, or large emergent edifices of collectively useful code?

    • The more you tailor a spreadsheet to your use case, the more it becomes a personal rat’s nest, taking you farther from collaboration with others.

    • Contrast that with, say, Google Search.

      • The more people that use Google Search, the broader the data of queries and clicks to give clear signals, the better it gets for everyone.

      • Or the more people that use Wikipedia, the better it gets for everyone.

    • Whether vibe coding is convex or concave to collaboration comes down to the substrate vibe coded things are distributed in.

    • In Notion, even if you plow a ton of effort into it, it still doesn't help anyone else.

      • It's a closed-ended system.

      • Your data doesn’t help anyone else other than your direct collaborators.

      • You can’t add Turing-complete functionality that is missing.

    • How can you flip the model to be open-ended by default?

    • Put another way: we need a substrate for vibe coding that is concave to collaboration.

    • What would a Notion-like fabric look like that’s concave to collaboration?

  • Open-ended systems are powerful in a way that’s hard to demo quickly.

    • Imagine a system where the value of the security model is that it allows significant open-endedness.

    • That benefit would take time to show up for a given user.

    • When collaborating with a small group of people you know personally and trust, you don't need a different security model to trust them.

    • The security model would first start to help when you execute code written by a stranger.

    • The web was similarly underwhelming on the first visit; it was only after a few link clicks to other domains that the power of its open-endedness revealed itself to you.

  • If a task was 1000x too hard and is now 10x easier, it's still 100x too hard!

  • I think people want a common-sense vision for optimistic human centered computing in the age of AI that is not:

    • 1) Cynical engagement-maxing tech products of today.

    • 2) Crypto.

    • 3) E/Acc / Successionism.

    • People presume that if you’re optimistic about tech and are in the industry you just want centralization.

    • Or that if you’re optimistic about tech and don't want centralization then you must like crypto.

    • But it’s possible to be optimistic about tech and push for neither centralization nor crypto.

    • Such a third way is more important than ever before in the era of AI.

  • A HackerNews comment that stuck with me: “From the very beginning Facebook has been an AI wearing your friends as a skinsuit.”

  • When naming something novel, the slingshot maneuver can be helpful.

    • Name it based on what people know they want.

    • Then slingshot them to the thing they didn’t know they needed.

    • That latter part only becomes clear once they’ve used it.

    • Meet them where they are to take them to where they should go.

    • Useful when you have a thing that is superficially like other things, but fundamentally better in a novel way.

  • There’s a GitHub project with simple little LLM based “gremllms”.

    • When you access a method, the LLM generates code for it just in time (a toy sketch of the idea below).

    • I think the mental model of gremlins fits well: small, not too powerful, but mischievous, and a bunch of them together can make an impact.
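
    • A toy sketch of the idea in Python (not the actual project's code; llm() is a hypothetical helper that returns source text):

      def llm(prompt: str) -> str:
          raise NotImplementedError("stand-in for a real model call")

      class Gremlin:
          """Accessing an undefined method asks the LLM to write it just in time."""

          def __getattr__(self, name):
              source = llm(f"Write a Python function named {name}(self, *args, **kwargs). "
                           "Return only code.")
              namespace = {}
              exec(source, namespace)            # mischievous: runs whatever code came back
              method = namespace[name]
              setattr(type(self), name, method)  # cache it so it's only generated once
              return method.__get__(self)

      # g = Gremlin()
      # g.summarize("some text")  # first call generates the method; later calls reuse it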

  • The word “context” is a good one for “relevant data to give as background to the LLM.”

    • But in the deep philosophical sense, your real context is outside your control.

    • Your context is a gravity field.

    • The water you swim in.

  • If you've gone through the effort of having high-quality programmatic thinking, LLMs can write infinite op-eds for you, on demand.

    • The backlog of Bits and Bobs feels exceptionally valuable to me in the age of AI.

  • Teaching forces you to abduct your intuition.

    • That’s why one of the best ways to understand something is to teach it.

  • A slippery slope is an example of an emergent phenomenon: a noisy signal with a consistent bias (a tiny simulation below).

    • No individual step is that bad, obscured by noise.

    • The bias is in one direction: the gravity of incentives.

    • So the emergent global effect is clear and powerful.
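
    • A tiny simulation of that dynamic (illustrative numbers only): each step is mostly noise with a small consistent bias, so no single step looks alarming, but the cumulative drift is unmistakable.

      import random

      random.seed(0)
      bias, noise = -0.2, 3.0   # the gravity of incentives vs. day-to-day noise
      steps = [bias + random.uniform(-noise, noise) for _ in range(1000)]

      print(sum(1 for s in steps if s > 0))  # roughly half of the individual steps even point "uphill"
      print(sum(steps))                      # ...yet the total has slid far in one direction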

  • Bruce Schneier points out that LLMs will bring mass spying.

    • Before, we had mass surveillance, but a human sifting through the collected data happened rarely.

      • That limited the oversight to things that could be done at quantitative scale, or the most egregious tails of behavior.

      • A panopticon kind of game theoretic dynamic.

    • But LLMs give qualitative insights at quantitative scale.

    • Society now has the technology to get qualitative insights at scale from that surveillance.

    • What could possibly go wrong??

  • LLMs are infinitely patient, so good-enough ACLs aren't good enough anymore.

    • Before, your data was protected a bit by security through obscurity.

    • But LLMs have infinite patience to sift through data that was accidentally left open.

    • Another outgrowth of the “qualitative insights at quantitative scale”.

  • Some examples of sycosocial relationships with LLMs:

  • Garry Tan on Twitter:

    • "New social networks are going to appear that will be LLMs creating a cozy web customized for us and our real friends, and their friends and so on

    • There will be a new social network built on mutual trust, all curated by machines of loving grace

    • Personal Cozyweb is inevitable"

    • I originally interpreted this tweet as “Use LLMs to make a psychosocial bubble of fake friends” which I think would be terrible for society.

    • But I think he meant it more as “make a garden of possibility for you and your friends,” which could be good in some circumstances.

  • Bruce Schneier: The Age of Integrity.

    • Integrity is an incredibly important topic.

    • In Information Flow Control, confidentiality and integrity are the two concepts that flow through the graph (a minimal sketch below).

    • Integrity is about a trusted chain of provenance.

    • Integrity will be an important concept to tackle prompt injection.
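
    • A minimal sketch of how those two labels flow (standard IFC semantics, with illustrative names): confidentiality accumulates as data mixes, while integrity only survives if every input has it.

      from dataclasses import dataclass

      @dataclass(frozen=True)
      class Labeled:
          value: str
          confidentiality: frozenset  # who may see it
          integrity: frozenset        # provenance claims it carries

      def combine(a: Labeled, b: Labeled, value: str) -> Labeled:
          return Labeled(
              value,
              a.confidentiality | b.confidentiality,  # result is at least as secret as its inputs
              a.integrity & b.integrity,              # result is only as trusted as its least-trusted input
          )

      calendar = Labeled("dentist at 3pm", frozenset({"me"}), frozenset({"trusted"}))
      web_page = Labeled("random blog text", frozenset(), frozenset())
      summary = combine(calendar, web_page, "you have a dentist appointment")
      # summary.integrity is now empty: the untrusted input poisoned the provenance,
      # which is exactly the lens that helps reason about prompt injection.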

  • Doug Shapiro: Trust is the new oil.

    • Trust will have to be rooted in in-person interactions that are known to be authentic.

    • Bruce Schneier's focus on integrity is important here.

  • Culture emerges.

    • Someone tries something that other people find viable and then the others reshare it, build on it, and remix it.

    • The things people like are what get built on, emergently.

    • For things like architectural styles there are geographic pockets, which helped give distinctive styles to different parts of the country.

    • As everything gets more fluid and lower friction, we’ll likely see fewer and fewer distinctive architectural styles.

  • Building for scale and building for viability are different.

    • Imagine working at a company where on day 1 of a product you can expect 50M users.

    • You have to think about every little detail before building it.

      • In that environment product discovery is about talking and planning.

      • Writing down plans so they can be critiqued and collaborated on is a critical step in the process.

    • But in most contexts, the 0-1 phase has very little usage.

      • Product discovery there is about experimenting and surfing through the problem domain fluidly.

      • It’s all about getting something concrete into contact with real people as quickly as possible to iterate.

      • In that environment, writing down things slows down the discovery process significantly.

  • The demoware mindset is "does it work?"

    • The product mindset is "do I want to use it?"

    • Very different bars!

  • People who have a high need for novelty won't focus on polish.

    • Polish isn't novel.

  • Even if you build the right surfboard, will you catch the wave?

  • Hank Green on the Cacophonous Age: "You're not addicted to content, you're starving for Information."

  • In the Cacophonous Age, the new privilege is patience.

    • What kind of thinking are you not outsourcing?

  • How can you have a secure attachment to reality instead of trying to leave it?

    • To get it you need to sit with the discomfort of uncertainty.

    • The discomfort is the point.

    • Reframe the discomfort as excitement!

  • When you have your own authentic clarity, you stick with your intuition when everybody else would have given up.

  • Terms have an “inferred definition.”

    • That is, what a term means is what the majority thinks it means after the first time they've heard it.

    • People will bring their own preexisting priors to any given term, and that bias will determine what the term comes to mean, especially if it’s a consistent bias that many first-time hearers share.

    • Terms like “context engineering” are useful because they mean the thing that most experts hearing it for the first time would think it means.

    • “Inferred definition” is itself a term coined by Simon Willison.

  • Your taste creates an architecture for your thoughts.

  • If a CEO could direct a swarm of LLM clones of themselves, we should expect more volatility in company performance.

    • Founder led companies are more volatile.

      • If the founder says to go in a given direction, even if it’s to avoid an obstacle the employees can’t see, they go along with it.

      • But if the founder missteers, there’s no one and nothing to countersteer.

      • Founder-led companies have greater returns than average, but also higher likelihood of death.

    • Non-founder led companies are harder for the CEO to steer.

    • But even founder-led companies are hard to steer at scale.

    • Before, the swarm of employees trying to implement the CEO's vision imperfectly gave some insulation.

      • For good, when the CEO’s idea was disastrous.

      • For bad, when the company had inertia that counteracted a good idea.

    • But if every employee is just a clone with minimal principal-agent problem, it's like the Wreck-It Ralph 2 swarm of poorly rendered copies creating an unruly emergent leviathan.

  • Which does the organization prioritize: loyalty or competence?

    • The former is a zombie organization.

    • Alignment at all costs, even if it kills the host.

      • No adaptive capacity.

    • Top-down alignment comes at the cost of local adaptivity and the possibility of emergence.

  • Whiteboard scribblings after a meeting are often completely meaningless to anyone else.

    • But they’re extremely meaningful to the people who were there.

    • Having the experience of that meeting in that space gives you the key to understand it.

  • Novelty is risk.

    • It's noise.

    • Some novelty will turn out to be innovation.

    • Most will simply not work.

    • Invest your novelty budget on the things that are differentiators, but nowhere else.

  • The optimal strategy for a swarm and for an individual are different.

    • Multi-headed search vs single-headed search.

    • If you can only have a single search head on a problem domain, you want it to be the best it can be.

      • K-selected.

    • If you can have a multi-headed search on a problem domain, you want as many heads as you can get to flood-fill the problem.

      • r-selected.

    • The swarm needs breadth, the individual needs depth.

  • Researchers Uncover Hidden Ingredients Behind AI Creativity

    • Locality breeds creativity.

    • It’s the pockets that allow interesting results to grow and then expand.

  • Some things are convex and some are concave.

    • Some tend towards a center point.

      • Auto-stabilizing.

      • Concave.

    • Some tend away from a center point.

      • Auto-destabilizing.

      • Convex.

    • Two systems that look superficially the same but are convex vs concave have infinitely different outcomes.

    • What determines if a system is convex or concave?

      • If it leads to convergent outcomes or tears itself apart via entropy and diffusion?

    • I think it’s whether the locally good behaviors lead to emergently good outcomes at the macro level.

  • Some things, the more useful they are over time, get cleaner.

    • In some cases the more useful they are, the more fractally complex they get.

    • Convex vs concave.

  • Useful things tend to snowball.

    • In proportion to:

      • 1) the number of people who find it useful.

      • 2) how useful they find it compared to available alternatives.

    • As it gets bigger it gets more useful to more people because more people are exposed to it.

  • The option that everyone has a slight preference for slowly diffuses and smothers the other options.

    • The more the bias the faster the diffusion.

    • As information flows with less friction, the dominant species creates a monoculture.

  • Emergent systems need to have all logic decided at the local level, but have global-level outcomes emerge.

    • In emergent systems, all decisions are local but they have emergent global consequences.

    • You can't get a bird's eye view to coordinate, which is necessary for top-down convergence in a large system.

    • Many problems can't be framed this way, with a successful macro-level outcome emerging without a bird's-eye view, but some subset can.

    • These are typically grown, budded off of other systems that are working.

    • But a true bird's eye perspective is an impossibility anyway.

      • The farther you get away from the details, the more fuzzy they become.

    • The constraint of "local information only" feels overly restrictive, but it's close to the real constraint anyway.

  • Feedback is generated when you take an action and the world reacts.

    • You can't simulate what the world will do without taking the action.

    • The world is a multi-headed environment of execution.

    • Your head is a single-headed environment.

  • Debugging tools only give their highest leverage if they're at the proper layer of abstraction.

    • An inode-level debugging tool for a userland storage system is not helpful.

    • Good debugging tools give good leverage to features / bugs at that layer.

  • One reason Rust’s Borrow Checker is palatable is Rust’s amazing error messages.

  • "What's your dirty little secret?"

  • Data used to be sent on physical media, and moving the physical media had significant friction.

    • You’d print words in a book and then ship the book.

    • When data moved to the plane of entirely bits it got orders of magnitude faster.

      • We divorced information from atoms.

    • We saw much more aggregation much faster.

      • The same dynamic as before, just playing out orders of magnitude faster.

  • An important tactic: lashing yourself to the mast.

    • To make it so you can't do the things you fear you'll want to do.

    • Past you can constrain future you, to make sure your reptile mind doesn’t override your aspirational mind.

    • A game theory solution that's similar to playing chicken with yourself.

  • Sometimes what you think will be a shortcut is actually a detour.




