The Riemann hypothesis catastrophe thought experiment provides one example of instrumental convergence. Marvin Minsky, the co-founder of MIT's AI laboratory, suggested that an artificial intelligence designed to solve the Riemann hypothesis might decide to take over all of Earth's resources to build supercomputers to help achieve its goal.[2] If the computer had instead been programmed to produce as many paperclips as possible, it would still decide to take all of Earth's resources to meet its final goal.[3] Even though these two final goals are different, both of them produce a convergent instrumental purpose of taking over Earth's resources.[4]

The paperclip maximizer is a thought experiment described by Swedish philosopher Nick Bostrom in 2003. It illustrates both the existential risk that an artificial general intelligence could pose to human beings if it were successfully designed to pursue even a seemingly harmless goal, and the consequent necessity of incorporating machine ethics into artificial intelligence design. The scenario describes an advanced artificial intelligence tasked with manufacturing paperclips. If such a machine were not programmed to value human life, then given enough power over its environment, it would try to turn all matter in the universe, including human beings, into paperclips or into machines that manufacture paperclips.[5]
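
To make the convergence concrete, here is a minimal Python sketch (my illustration, not from Bostrom or Minsky): two agents with unrelated final goals face the same choice between producing output immediately and acquiring resources first, and the plan comparison comes out the same for both. All quantities (the output rates, the 4x acquisition multiplier, the horizon) are hypothetical toy numbers.

```python
# Toy illustration of instrumental convergence: the optimal plan is
# "acquire resources first" regardless of what the final output is.
# All numbers here are made up for the sketch.

def best_plan(output_per_resource_step: float, horizon: int) -> str:
    """Compare two toy plans over a fixed horizon.

    'work':    produce output every step with the starting 1 unit of resources.
    'acquire': spend the first quarter of the horizon acquiring resources
               (quadrupling them), then produce with the larger base.
    """
    work_only = output_per_resource_step * 1 * horizon
    acquire_steps = horizon // 4
    acquire_first = output_per_resource_step * 4 * (horizon - acquire_steps)
    return "acquire resources first" if acquire_first > work_only else "work directly"

# The comparison does not depend on what the output *is*, only on the goal
# being open-ended -- paperclips and proof steps give the same answer:
print(best_plan(output_per_resource_step=10.0, horizon=100))   # paperclip maximizer
print(best_plan(output_per_resource_step=0.01, horizon=100))   # theorem prover
```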

In one sense, AIXI has maximal intelligence across all possible reward functions, as measured by its ability to accomplish its goals, yet it is entirely uninterested in the human programmer's intentions.[13] This model of a machine that, despite being superintelligent, appears simultaneously stupid and lacking in common sense may seem paradoxical.[14]
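
For readers who want the formal statement behind this claim, here is a sketch of the standard one-line definition of AIXI's action selection, following Hutter's formulation, where m is the planning horizon, U is a universal Turing machine, and ℓ(q) is the length of environment-program q. Nothing in the expression refers to the programmer's intentions; only the reward signal enters.

```latex
% AIXI action selection (standard formulation; notation follows Hutter):
% at step k the agent picks the action maximizing expected future reward,
% summed over every computable environment q, weighted by 2^{-ell(q)}.
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
    \bigl(r_k + \cdots + r_m\bigr)
    \sum_{q \,:\, U(q,\,a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```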

Steve Omohundro itemized several convergent instrumental goals, including self-preservation or self-protection, utility function or goal-content integrity, self-improvement, and resource acquisition. He refers to these as the "basic AI drives."

Daniel Dewey of the Machine Intelligence Research Institute argues that even an initially introverted, self-rewarding artificial general intelligence (AGI) may continue to acquire free energy, space, time, and freedom from interference to ensure that it will not be stopped from self-rewarding.[20]

In 2009, Jürgen Schmidhuber concluded, in a setting where agents search for proofs about possible self-modifications, "that any rewrites of the utility function can happen only if the Gödel machine first can prove that the rewrite is useful according to the present utility function."[24][25] An analysis by Bill Hibbard of a different scenario is similarly consistent with maintenance of goal content integrity.[25] Hibbard also argues that in a utility-maximizing framework, the only goal is maximizing expected utility, so instrumental goals should be called unintended instrumental actions.[26]
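
A minimal Python sketch of the rule Schmidhuber describes (my paraphrase of the mechanism, not his actual construction): a rewrite of the utility function is adopted only if the proof searcher can show it is beneficial as judged by the current utility function. `proves_improvement` is a hypothetical stand-in for the Gödel machine's proof searcher.

```python
from typing import Callable

# A utility function maps a world state to a real-valued score.
Utility = Callable[[object], float]

def maybe_rewrite(current_utility: Utility,
                  proposed_utility: Utility,
                  proves_improvement: Callable[[Utility, Utility], bool]) -> Utility:
    """Adopt `proposed_utility` only if a proof of benefit is found.

    Crucially, the proof obligation is judged from the standpoint of
    `current_utility`. This is why goal content tends to be preserved:
    a rewrite must look good to the very goal it would replace.
    """
    if proves_improvement(current_utility, proposed_utility):
        return proposed_utility
    return current_utility
```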

For almost any open-ended, non-trivial reward function (or set of goals), possessing more resources (such as equipment, raw materials, or energy) can enable the agent to find a more "optimal" solution. Resources can benefit some agents directly, by letting them create more of whatever their reward function values: "The AI neither hates you nor loves you, but you are made out of atoms that it can use for something else."[28][29] In addition, almost all agents can benefit from having more resources to spend on other instrumental goals, such as self-preservation.[29]
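
The resource argument can be stated compactly (a hedged formalization of the paragraph above, not drawn from the cited sources): if Π(r) denotes the set of plans feasible with resource stock r, and extra resources only ever enlarge that set, then the best attainable expected utility is nondecreasing in r.

```latex
% Every plan feasible with resources r stays feasible with r' >= r,
% so the attainable optimum never decreases as resources grow:
r \le r' \;\Longrightarrow\; \Pi(r) \subseteq \Pi(r')
\;\Longrightarrow\;
\max_{\pi \in \Pi(r)} \mathbb{E}\,[\,U \mid \pi\,]
\;\le\;
\max_{\pi \in \Pi(r')} \mathbb{E}\,[\,U \mid \pi\,]
```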

"If the agent's final goals are fairly unbounded and the agent is in a position to become the first superintelligence and thereby obtain a decisive strategic advantage... according to its preferences. At least in this special case, a rational, intelligent agent would place a very high instrumental value on cognitive enhancement"[30]

Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent's goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by a broad spectrum of situated intelligent agents.

The instrumental convergence thesis applies only to instrumental goals; intelligent agents may have various possible final goals.[4] Note that by Bostrom's orthogonality thesis,[4] final goals of knowledgeable agents may be well-bounded in space, time, and resources; well-bounded ultimate goals do not, in general, engender unbounded instrumental goals.[32]

Agents can acquire resources by trade or by conquest. A rational agent will, by definition, choose whatever option will maximize its implicit utility function. Therefore, a rational agent will trade for a subset of another agent's resources only if outright seizing the resources is too risky or costly (compared with the gains from taking all the resources) or if some other element in its utility function bars it from the seizure. In the case of a powerful, self-interested, rational superintelligence interacting with a lesser intelligence, peaceful trade (rather than unilateral seizure) seems unnecessary, suboptimal, and, therefore, unlikely.[27]
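
As a toy illustration of that trade-off (an illustrative sketch, not a model from [27]): a purely self-interested expected-utility maximizer compares the expected gain from trade against the success-probability-weighted gain from seizure minus the cost of conflict, and a large power gap pushes the comparison toward seizure. All parameter values below are hypothetical.

```python
# Toy decision rule for the paragraph above: trade wins only when seizure
# is sufficiently risky or costly. All numbers are made up for the sketch.

def choose(gain_from_trade: float,
           gain_from_seizure: float,
           p_seizure_succeeds: float,
           cost_of_conflict: float) -> str:
    eu_trade = gain_from_trade
    eu_seize = p_seizure_succeeds * gain_from_seizure - cost_of_conflict
    return "trade" if eu_trade > eu_seize else "seize"

# With a large power gap, seizure is near-certain and cheap, so trade loses
# (expected value 0.99 * 100 - 1 = 98 versus 10 from trade):
print(choose(gain_from_trade=10,
             gain_from_seizure=100,
             p_seizure_succeeds=0.99,
             cost_of_conflict=1))   # -> "seize"
```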

The two aims of the study were (a) to determine when infants begin to use force intentionally to defend objects to which they might have a claim and (b) to examine the relationship between toddlers' instrumental use of force and their tendencies to make possession claims. Infants' and toddlers' reactions to peers' attempts to take their toys were assessed in three independent data sets in which the same observational coding system had been used (N = 200). To ensure that infants' use of force was goal-directed and not a simple physical reaction, we recorded infants' reactions when peers picked up toys that the focal infants had just put down, or that were nearby or in the focal infants' mothers' laps. The use of force in response to peers taking their toys was evident before the first birthday, but more common thereafter, although only a minority of children in each sample used force. Analysis of a combined data set revealed that force was deployed more often by 2-year-olds than by younger infants, and was significantly associated with verbal references to people's possession of objects. These observations show that toddlers do deploy force intentionally to defend their possessions.

I think orthogonality and instrumental convergence are mostly arguments for why the singleton scenario is scary. And in my experience, the singleton scenario is the biggest sticking point when talking with people who are skeptical of AI risk. One alternative is to talk about the rising tide scenario: no single AI taking over everything, but AIs just grow in economic and military importance across the board while still sharing some human values and participating in the human economy. That leads to a world of basically AI corporations which are too strong for us to overthrow and whose value system is evolving in possibly non-human directions. That's plenty scary too.

This is concerning, because pretty much any variation of instrumental convergence implies some rather serious risks to humanity, though even without it there may still be major risks from AGI. I'm not convinced that any version of instrumental convergence is actually true, but there seem to be far too many people simply assuming that it's false without evidence.

I think there's a relatively strong case that instrumental convergence isn't a necessary property for a system to be dangerous. For example, viruses don't exhibit instrumental convergence, or any real agency at all, yet they still manage to be plenty deadly to humans.

If I had to pick out specific aspects which seem weaker, I think they would mostly be related to our confusion around agent foundations. It isn't trivially obvious to me that the way we describe "intelligence" or "goals" within the instrumental convergence argument is a good match for the way current systems operate (though it seems close enough, and we shouldn't expect to be wrong in a way that makes the situation better).

However, in my experience it is one of the primary arguments people rely on when explaining their concerns to others. The correlation between credence in instrumental convergence and AI x-risk concern seems very high. IMO it is also one of the most concerning legs of the overall argument.

If somebody made a compelling case that we should not expect instrumental convergence by default in the current ML paradigm, I think the overall argument for x-risk would have to look fairly different from the one that is usually put forward.

Human behaviour automatically adapts to suit rapidly-changing situational demands.[1] Cues predicting rewarding or aversive outcomes can facilitate adaptive behaviour; for example, water cues guide us to find and drink when thirsty.[2] A disruption in the mechanisms underlying adaptation to positively- or negatively-valenced environmental cues is thought to play a key role in many psychiatric symptoms.[3][4] As such, environmental cues do not always trigger adaptive behaviours: the same mechanism guiding us to seek water is thought to mediate the facility with which drug cues provoke drug-seeking in patients recovered from substance dependence.[5]

Even irrelevant environmental cues can profoundly alter goal-seeking behaviour. Pavlovian-instrumental transfer (PIT) is defined as the ability of passively-conditioned (Pavlovian) cues to automatically invigorate (and in some cases suppress) ongoing goal-directed instrumental behaviour. Importantly, Pavlovian stimuli exert influence over instrumental performance despite there being no formal association between the Pavlovian and instrumental contingencies.[6]
