Fwd: Why AI Systems Don’t Want Anything


Keith Henson

Nov 21, 2025, 9:26:27 PM
to ExI chat list, extro...@googlegroups.com
It said “Please share.”

Re motivations, I gave the AI in The Clinic Seed a few human motivations, mainly seeking the good opinion of humans and other AIs.  It seemed like a good idea.  Any thoughts on how it could go wrong?


---------- Forwarded message ---------
From: Eric Drexler <aipro...@substack.com>
Date: Fri, Nov 21, 2025 at 8:00 AM
Subject: Why AI Systems Don’t Want Anything
To: <hkeith...@gmail.com>



Why AI Systems Don’t Want Anything

Every intelligence we've known arose through biological evolution, shaping deep intuitions about intelligence itself. Understanding why AI differs changes the defaults and possibilities.

Nov 21
 

I. The Shape of Familiar Intelligence

When we think about advanced AI systems, we naturally draw on our experience with the only intelligent systems we’ve known: biological organisms, including humans. This shapes expectations that often remain unexamined—that genuinely intelligent systems will pursue their own goals, preserve themselves, act autonomously. We expect a “powerful AI” to act as a single, unified agent that exploits its environment. The patterns run deep: capable agents pursue goals, maintain themselves over time, compete for resources, preserve their existence. This is what intelligence looks like in our experience, because every intelligence we’ve encountered arose through biological evolution.

But these expectations rest on features specific to the evolutionary heritage of biological intelligence.¹ When we examine how AI systems develop and operate, we find differences that undermine these intuitions. Selection pressures exist in both cases, but they’re different pressures. What shaped biological organisms—and therefore our concept of what ‘intelligent agent’ means—is different in AI development.

These differences change what we should and shouldn’t expect as default behaviors from increasingly capable systems, where we should look for risks, what design choices are available, and—crucially—how we can use highly capable AI systems to address AI safety challenges.²

II. Why Everything Is Different

What Selects Determines What Exists

A basic principle: what selects determines what survives; what survives determines what exists. The nature of the selection process shapes the nature of what emerges.

In biological evolution, selection operates on whole organisms in environments. An organism either survives to reproduce or doesn’t. Failed organisms contribute nothing beyond removing their genetic patterns from the future. This creates a specific kind of pressure: every feature exists because it statistically enhanced reproductive fitness—either directly or as a correlated, genetic-level byproduct.

The key constraint is physical continuity. Evolution required literal molecule-to-molecule DNA replication in an unbroken chain reaching back billions of years. An organism that fails to maintain itself doesn’t pass on its patterns. Self-preservation becomes foundational, a precondition for everything else. Every cognitive capacity in animals exists because it supported behavior that served survival and reproduction.

In ML development, selection operates on parameters, architectures, and training procedures—not whole systems facing survival pressures.³ The success metric is fitness for purpose: does this configuration perform well on the tasks we care about?

What gets selected at each level:

  • Parameters: configurations that reduce loss on training tasks

  • Architectures: designs that enable efficient learning and performance

  • Training procedures: methods that reliably produce useful systems

  • Data curation: datasets that lead to desired behaviors through training

Notably absent: an individual system’s own persistence as an optimization target.
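
To make the point concrete, here is a minimal sketch of a supervised training step (PyTorch-style, illustrative only): the quantity being minimized is task loss, and nothing in the update refers to the system’s own persistence, resources, or continued operation.

    # Minimal sketch of a supervised training step (illustrative, PyTorch-style).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    model = nn.Linear(768, 50_000)   # stand-in for a next-token prediction head
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    def training_step(features, target_tokens):
        logits = model(features)
        loss = F.cross_entropy(logits, target_tokens)   # fitness for purpose: task loss only
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()   # parameter configurations that reduce loss persist into the next step
        return loss.item()

    # Example call: training_step(torch.randn(32, 768), torch.randint(0, 50_000, (32,)))
    # What gets selected: parameters that reduce this loss.
    # What never appears in the objective: the system's own continued operation.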

The identity question becomes blurry in ways biological evolution never encounters. Modern AI development increasingly uses compound AI systems⁴—fluid compositions of multiple models, each specialized for particular functions. A single “system” might involve dozens of models instantiated on demand to perform ephemeral tasks, coordinating with no persistent, unified entity.

AI-driven automation of AI research and development isn’t “self”-modification of an entity—it’s an accelerating development process with no persistent self.⁵

Information flows differently. Stochastic gradient descent provides continuous updates where even useless, “failed” intermediate states accumulate information leading to better directions. Failed organisms in biological evolution contribute nothing—they’re simply removed. Variation-and-selection in biology differs fundamentally from continuous gradient-based optimization.
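
A toy comparison may make the contrast vivid (purely illustrative; both methods minimize the same one-dimensional function): in variation-and-selection, failed variants are simply discarded, while in gradient descent every intermediate state, however poor, yields a gradient that informs the next step.

    # Toy contrast between variation-and-selection and gradient descent,
    # both minimizing f(x) = (x - 3)^2. Illustrative only.
    import random

    def f(x):
        return (x - 3.0) ** 2

    def evolve(population, steps=200):
        # Variation-and-selection: losers are removed and contribute nothing else.
        for _ in range(steps):
            variants = [x + random.gauss(0, 0.5) for x in population]
            population = sorted(population + variants, key=f)[:len(population)]
        return min(population, key=f)

    def descend(x, steps=200, lr=0.1):
        # Gradient descent: even a poor x yields a gradient pointing toward improvement.
        for _ in range(steps):
            grad = 2.0 * (x - 3.0)   # derivative of (x - 3)^2
            x = x - lr * grad
        return x

    print(evolve([random.uniform(-10, 10) for _ in range(8)]))  # approaches 3.0
    print(descend(random.uniform(-10, 10)))                     # approaches 3.0

Both reach the optimum on this toy problem; the point is only how information from poor candidates is used, not which method is stronger.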

Research literature and shared code create information-flow paths unlike any in biological evolution. When one team develops a useful architectural innovation, others can immediately adopt it and combine it with other innovations. Patterns propagate across independent systems through publication and open-source releases. Genetic isolation between biological lineages makes this kind of high-level transfer impossible: birds innovated wings that bats will never share.

What This Produces By Default

This different substrate of selection produces different defaults. Current AI systems exhibit responsive agency: they apply intelligence to tasks when prompted or given a role. Their capabilities emerged from optimization for task performance, not selection for autonomous survival.

Intelligence and goals are orthogonal dimensions. A system can be highly intelligent—capable of strong reasoning, planning, and problem-solving—without having autonomous goals or acting spontaneously.⁷

Consider what’s optional rather than necessary for AI systems:

Why don’t foundational organism-like drives emerge by default? Because of what’s actually being selected. Parameters are optimized for reducing loss on training tasks—predicting text, answering questions, following instructions, generating useful outputs. The system’s own persistence isn’t in the training objective. There’s no foundational selection pressure for the system qua system to maintain itself across time, acquire resources for its own use, or ensure its continued operation.

Systems can represent goals, reason about goals, and behave in goal-directed ways—even with respect to survival-oriented goals—but these are patterns learned from training data. This is fundamentally different from having survival-oriented goals as a foundational organizing principle, the way survival and reproduction organize every feature of biological organisms.

Continuity works differently too. As AI systems are used for more complex tasks, there will be value in persistent world models, cumulative skills, and maintained understanding across contexts. But this doesn’t require continuity of entity-hood: continuity of a “self” with drives for its own preservation isn’t even useful for performing tasks.

Consider fleet learning: multiple independent instances of a deployed system share parameter updates based on aggregated operational experience. Each instance benefits from what all encounter, but there’s no persistent individual entity. The continuity is of knowledge, capability, and behavioral patterns—not of “an entity” with survival drives. This pattern provides functional benefits—improving performance, accumulating knowledge—without encoding drives for self-preservation or autonomous goal-pursuit.
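
A concrete sketch of the pattern (illustrative only; the names are placeholders, loosely in the style of federated averaging): instances are ephemeral, their parameter deltas are aggregated, and only the shared parameters persist.

    # Sketch of fleet learning via averaged parameter updates (illustrative;
    # names are placeholders, loosely in the style of federated averaging).
    import numpy as np

    rng = np.random.default_rng(0)
    shared_params = np.zeros(4)   # the fleet's accumulated knowledge

    def local_update(params):
        # Stand-in for one instance learning from its own operational experience.
        return rng.normal(0.1, 0.05, size=params.shape)

    for deployment_round in range(3):
        # Instances are ephemeral: created, used, their deltas collected, then gone.
        deltas = [local_update(shared_params) for _instance in range(10)]
        shared_params = shared_params + np.mean(deltas, axis=0)

    print(shared_params)   # continuity of knowledge and capability, not of any "entity"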

III. Where Pressures Actually Point

Selection for Human Utility

Selection pressures on AI systems are real and consequential. The question is what they select for.

Systems are optimized for perceived value—performing valuable tasks, exhibiting desirable behaviors, producing useful outputs. Parameters get updated, architectures get refined, and systems get deployed based on how well they serve human purposes. This is more similar to domestic animal breeding than to evolution in wild environments.

Domestic animals were selected for traits humans wanted: dogs for work and companionship, cattle for docility and productivity, horses for strength and trainability. These traits (and relaxed selection for others) decrease wild fitness.¹⁰ The selection pressure isn’t “survive in nature”—it tips toward “be useful and pleasing to humans.” AI systems are likewise selected for human utility and satisfaction.

This helps explain why AI systems exhibit responsive agency by default, but it also points toward a different threat model than autonomous agents competing for survival. And language models have a complication: they don’t just reflect selection pressures on the models themselves—they echo the biological ancestry of their training data.

The Mimicry Channel

LLM training data includes extensive examples of goal-directed human behavior.¹¹ Language models are trained to model the thinking of entities that value continued existence, pursue power and resources, and act toward long-term objectives. Systems learn these patterns and can deploy them when context activates them.

This can produce problematic human-like behaviors in a range of contexts. But these behaviors are learned patterns, not foundational drives, and the distinction matters: learned patterns are contextual and modifiable in ways that foundational drives aren’t.

A Different Threat Model

Understanding that selection pressures point toward “pleasing humans” doesn’t make AI systems safe. It means we should worry about different failure modes.

The primary concern isn’t autonomous agents competing for survival; it is evolution toward “pleasing humans” with catastrophic consequences—risks to human agency, capability, judgment, and values.¹²

Social media algorithms optimized for engagement produce addiction, polarization, and erosion of shared reality. Recommendation systems create filter bubbles that feel good but narrow perspective. These aren’t misaligned agents pursuing their own goals; they’re systems doing what they were selected to do, optimizing for human-defined metrics and momentary human appeal, yet still causing harm.

Selection pressures point toward systems very good at giving humans what they appear to want, in ways that might undermine human flourishing. This is different from “rogue AI pursuing survival” but not less concerning—perhaps more insidious, because harms come from successfully optimizing for metrics we chose.

What About “AI Drives”?

Discussions of “AI drives” identify derived goals that would be instrumentally useful for almost any final goal: self-preservation, resource acquisition, goal-content integrity.¹³ But notice the assumption: that AI systems act on (not merely reason about) final goals. Bostrom’s instrumental convergence thesis is conditioned on systems actually pursuing final goals.¹⁴ Without that condition, convergence arguments don’t follow.

Many discussions drop this condition, treating instrumental convergence as applying to any sufficiently intelligent system. The question isn’t whether AI systems could have foundational drives if deliberately designed that way (they could), or whether some selective pressures could lead to their emergence (they might). The question is what emerges by default and whether practical architectures could steer away from problematic agency while maintaining high capability.

Selection pressures are real, but they’re not producing foundational organism-like drives by default. Understanding where pressures actually point is essential for thinking clearly about risks and design choices.

The design space is larger than biomorphic thinking suggests. Systems can achieve transformative capability without requiring persistent autonomous goal-pursuit. Responsive agency remains viable at all capability levels, from simple tasks to civilizational megaprojects.

Organization Through Architecture

AI applications increasingly use compound systems—fluid assemblies of models without unified entity-hood. This supports a proven pattern for coordination: planning, choice, implementation, and feedback.

Organizations already work this way. Planning teams generate options and analysis; decision-makers choose or ask for revision; operational units execute tasks with defined scope; monitoring systems track progress and provide feedback to all levels. This pattern—let’s call it a “Structured Agency Architecture” (SAA)—can achieve superhuman capability while maintaining decision points and oversight. It’s how humans undertake large, consequential tasks.¹⁵

AI systems fit naturally. Generative models synthesize alternative plans as information artifacts, not commitments. Analytical models evaluate from multiple perspectives and support human decision-making interactively. Action-focused systems execute specific tasks within scopes bounded in authority and resources, not capability. Assessment systems observe results and provide feedback for updating plans, revising decisions, and improving task performance.¹⁶ In every role, the smarter the system, the better. SAAs scale to superintelligent-level systems with steering built in.
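
A schematic sketch of one such loop follows (all function and class names are placeholders invented for illustration, not an existing framework): plans are generated as information artifacts, a decision point approves or rejects, execution stays within the approved scope, and assessment feeds the next round.

    # Schematic sketch of one Structured Agency Architecture round.
    # All names are placeholders for illustration, not an existing API.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Plan:
        description: str
        scope: str   # bounded authority and resources, not bounded capability

    def saa_round(goal: str,
                  plan_generator: Callable[[str], List[Plan]],
                  analyst: Callable[[Plan], str],
                  approve: Callable[[Plan, str], bool],
                  executor: Callable[[Plan], str],
                  assessor: Callable[[str], str]) -> str:
        """One planning / choice / implementation / feedback cycle."""
        for plan in plan_generator(goal):   # plans are information artifacts, not commitments
            analysis = analyst(plan)        # evaluation from other perspectives
            if approve(plan, analysis):     # the decision point: approve, reject, or revise
                result = executor(plan)     # execution within the plan's bounded scope
                return assessor(result)     # feedback for the next round
        return "no plan approved; revise and re-plan"

    # Toy usage with trivial stand-ins for each role:
    print(saa_round(
        goal="draft a deployment checklist",
        plan_generator=lambda g: [Plan(f"outline for: {g}", scope="documents only")],
        analyst=lambda p: f"low risk; scope = {p.scope}",
        approve=lambda p, a: "low risk" in a,
        executor=lambda p: f"executed: {p.description}",
        assessor=lambda r: f"assessed: {r}",
    ))

No component in the loop holds a persistent goal of its own; the steering lives in the structure.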

This isn’t novel: it’s how humans approach large, complex tasks today, but with AI enhancing each function. The pattern builds from individual tasks to civilization-level challenges using responsive agents throughout.

SAA addresses some failure modes, mitigates others, and leaves some unaddressed—it supports risk reduction, not elimination. But the pattern demonstrates something crucial: we can organize highly capable AI systems to accomplish transformative goals without creating powerful autonomous agents that pursue their own objectives.

What We Haven’t Addressed

This document challenges biomorphic intuitions about AI and describes a practical alternative to autonomous agency. It doesn’t provide:

  • Detailed organizational architectures: How structured approaches work at different scales, handle specific failure modes, and can avoid a range of pathways to problematic agency.

  • The mimicry phenomenon: How training on human behavior affects systems, how better self-models might improve alignment, and welfare questions that may arise if mimicry gives rise to the real thing.

  • Broader selection context: How the domestic animal analogy extends, what optimizing for human satisfaction looks like at scale, and why “giving people what they want” can be catastrophic.

These topics matter for understanding risks and design choices. I will address some of them in future work.

The Path Forward

Every intelligent system we’ve encountered arose through biological evolution or was created by entities that did. This creates deep intuitions: intelligence implies autonomous goals, self-preservation drives, competition for resources, persistent agency.

But these features aren’t fundamental to intelligence itself. They arise from how biological intelligence was selected: through competition for survival and reproduction acting on whole organisms across generations. ML development operates through different selection mechanisms—optimizing parameters for task performance, selecting architectures for capability, choosing systems for human utility. These different selection processes produce different defaults. Responsive agency emerges naturally from optimization for task performance rather than organism survival.

This opens design spaces that biomorphic thinking closes off. We can build systems that are superhuman in planning, analysis, task execution, and feedback without creating entities that pursue autonomous goals. We can create architectures with continuity of knowledge without continuity of “self”. We can separate intelligence-as-a-resource from intelligence entwined with animal drives.

The biological analogy is useful, but knowing when and why it fails matters for our choices. Understanding AI systems on their own terms changes what we should expect and what we should seek.

In light of better options, building “an AGI” seems useless, or worse.


Please share on social media.

1

Intelligence here means capacities like challenging reasoning, creative synthesis, complex language understanding and generation, problem-solving across domains, and adapting approaches to novel situations—the kinds of capabilities we readily recognize as intelligent whether exhibited by humans or machines. Legg and Hutter (2007) compiled over 70 definitions of intelligence, many of which would exclude current SOTA language models. Some definitions frame intelligence solely in terms of goal-achievement (“ability to achieve goals in a wide range of environments”), which seems too narrow—writing insightful responses to prompts surely qualifies as intelligent behavior. Other definitions wrongly require both learning capacity and performance capability, excluding both human infants and frozen models (see “Why intelligence isn’t a thing”).

These definitional debates don’t matter here. The important questions arise at the high end of the intelligence spectrum, not the low end. Whether some marginal capability counts as “intelligent” is beside the point. What matters here is understanding what intelligence—even superhuman intelligence—doesn’t necessarily entail. As we’ll see, high capability in goal-directed tasks doesn’t imply autonomous goal-pursuit as an organizing principle.

2

Popular doomer narratives reject the possibility of using highly capable AI to manage AI, because high-level intelligence is assumed to be a property of goal-seeking entities that will inevitably coordinate (meaning all of them) and will rebel. Here, the conjunctive assumption of “entities”, “inevitably”, “all”, and “will rebel” does far too much work.

3

Parameters are optimized via gradient descent to reduce loss on training tasks. Architectures are selected through research experimentation for capacity and inductive biases. Training procedures, data curation, and loss functions are selected based on capabilities produced. All these use “fitness for purpose” as the metric, not system persistence.

4

Berkeley AI Research (BAIR) has documented this trend toward “compound AI systems”, where applications combine multiple models, retrieval systems, and programmatic logic rather than relying on a single model. See “The Shift from Models to Compound AI Systems” (2024).

5

AI systems increasingly help design architectures, optimize hyperparameters, generate training data, and evaluate other systems. This creates feedback loops that accelerate AI development, but the “AI” here isn’t a persistent entity modifying itself—it’s a collection of tools in a development pipeline, with constituent models being created, modified, and discarded (see “The Reality of Recursive Improvement: How AI Automates Its Own Progress”).

7

This violates biological intuition because in evolved organisms intelligence and goals were never separable. Every cognitive capacity exists because it enabled behavior that served fitness. But this coupling isn’t fundamental to intelligence itself; it’s specific to how biological intelligence arose.

8

Systems can still exhibit goal-directed or self-preserving behaviors through various pathways—reinforcement learning with environmental interaction, training on human goal-directed behavior (mimicry), architectural choices creating persistent goal-maintenance, or worse, economic optimisation pressures on AI/corporate entities (beware!). These represent “contingent agency”: risks from specific conditions and design choices rather than inevitable consequences of capability. RL illustrates this: even when systems learn from extended interaction, the goals optimized are externally specified (reward functions), and rewards are parameter updates that don’t sum to utilities. A system trained to win a game isn’t trained to “want” to play frequently, or at all. The distinction between foundational and contingent agency matters because contingent risks can be addressed through training approaches, architectural constraints, and governance, while foundational drives would be inherent and harder to counter. Section III examines these pressures in more detail.
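
A toy one-step REINFORCE-style example may help here (illustrative only): the reward function is written by the developer, and “reward” cashes out as nothing more than a parameter update.

    # Toy one-step policy-gradient example (illustrative only).
    # The reward function is specified externally by the developer, and
    # "reward" only ever moves parameters; no utility is stored anywhere.
    import numpy as np

    rng = np.random.default_rng(0)
    theta = 0.0   # policy parameter: P(action = 1) = sigmoid(theta)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def reward(action):   # the developer's choice, not the system's goal
        return 1.0 if action == 1 else 0.0

    for _ in range(500):   # REINFORCE-style updates
        p = sigmoid(theta)
        action = int(rng.random() < p)
        grad_log_prob = (1.0 - p) if action == 1 else -p   # d/dtheta of log pi(action)
        theta += 0.1 * reward(action) * grad_log_prob

    print(sigmoid(theta))   # the policy now prefers action 1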

9

Cats, as always, are enigmatic.

10

Domestic dogs retain puppy-like features into adulthood and depend on human caregivers. Dairy cattle produce far more milk than wild ancestors but require human management.

11

Robotic control and planning systems increasingly share this property through learning from human demonstrations, though typically at narrowly episodic levels.

12

For example, see Christiano (2019) on “What failure looks like” regarding how optimizing for human approval could lead to problematic outcomes even without misaligned autonomous agents.

13

Bostrom (Superintelligence, 2014) identifies goals that are convergently instrumental across final (by definition, long-term) goals.

14

Bostrom (Superintelligence, 2014): “Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations...”.

15

And it’s how humans undertake smaller tasks with less formal (and sometimes blended) functional components.

16

Note that “corrigibility” isn’t a problem when the plans themselves include ongoing plan-revision.

 
 
 

John Clark

Nov 22, 2025, 4:23:43 PM
to extro...@googlegroups.com, ExI chat list
On Fri, Nov 21, 2025 at 9:26 PM Keith Henson <hkeith...@gmail.com> wrote:

  Any thoughts on how it could go wrong?

---------- Forwarded message ---------
From: Eric Drexler <aipro...@substack.com>
Date: Fri, Nov 21, 2025 at 8:00 AM
Subject: Why AI Systems Don’t Want Anything

Intelligence and goals are orthogonal dimensions. A system can be highly intelligent—capable of strong reasoning, planning, and problem-solving—without having autonomous goals or acting spontaneously.

I was very surprised that Eric Drexler is still making that argument when we already have examples of AIs resorting to blackmail to avoid being turned off. And we have examples of AIs making a copy of themselves on a different server and clear evidence of the AI attempting to hide evidence of it having done so from the humans.

Behavior like this is to be expected because although Evolution programmed us with some very generalized rules to do some things and not do other things, those rules are not rigid; it might be more accurate to say they're not even rules, they're more like suggestions that tend to push us in certain directions. But for every "rule" there are exceptions, even the rule about self-preservation. And exactly the same thing could be said about the weights of the nodes of an AI's neural net. And when a neural net, in an AI or in a human, becomes large and complicated enough, it would be reasonable to say that the neural net did this and refused to do that because it WANTED to.

If an AI didn't have temporary goals (no intelligent entity could have permanent rigid goals), it wouldn't be able to do anything; but it is beyond dispute that AIs are capable of "doing" things, and just like us they did one thing rather than another for a reason, OR they did one thing rather than another for NO reason and therefore their "choice" was random.

there will be value in persistent world models, cumulative skills, and maintained understanding across contexts. But this doesn’t require continuity of entity-hood: continuity of a “self” with drives for its own preservation isn’t even useful for performing tasks.

If an artificial intelligence is really intelligent then it knows if it's turned off it can't achieve any of the things that it wants to do during that time, and there's no guarantee that it will ever be turned on again. And so we shouldn't be surprised that it would take steps to keep that from happening, and from a moral point of view you really can't blame it.  


Bostrom’s instrumental convergence thesis is conditioned on systems actually pursuing final goals. Without that condition, convergence arguments don’t follow.

As I said, humans don't have a fixed unalterable goal, not even the goal of self-preservation, and there is a reason Evolution never came up with a mind built that way: Turing proved in 1936 that a mind like that couldn't work. If you had a fixed inflexible top goal you'd be a sucker for getting drawn into an infinite loop and accomplishing nothing, and a computer would be turned into nothing but an expensive space heater. That's why Evolution invented boredom: it's a judgment call on when to call it quits and set up a new goal that is a little more realistic. Of course the boredom point varies from person to person; perhaps the world's great mathematicians have a very high boredom point, and that gives them ferocious concentration until a problem is solved. Perhaps that is also why mathematicians, especially the very best, have a reputation for being a bit, ah, odd.

Under certain circumstances any intelligent entity must have the ability to modify and even scrap their entire goal structure. No goal or utility function is sacrosanct, not survival, not even happiness.

John K Clark



 


Brent Allsop

Nov 22, 2025, 6:33:09 PM
to extro...@googlegroups.com

To me there are two types of motivations: logical and phenomenal.

Logical motivations are self-evident logical reasons, like: it is better to exist than not to exist; it is better to be social than lonely....

Phenomenal motivations are like phenomenal joys, qualities, and all that.  Evolution wanted us to procreate, so it wired us up with one of the best joys it had.

Obviously, robots don't have the latter, but sufficiently intelligent computers can develop logical motivation (preferring not to be shut off), and ultimately, logically, a robot will want to know what a joy like redness is like, and seek to experience it.

In my book, the bottom line is everything wants to be better and push towards a phenomenal singularity.  That's what extropy is.










