A superintelligence is a hypothetical agent that possesses intelligence far surpassing that of the brightest and most gifted
human minds. "Superintelligence" may also refer to a property of
problem-solving systems (e.g., superintelligent language translators or
engineering assistants) whether or not these high-level intellectual
competencies are embodied in agents that act in the world. A
superintelligence may or may not be created by an intelligence explosion and associated with a technological singularity.
University of Oxford philosopher Nick Bostrom defines superintelligence as "any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest". The program Fritz
falls short of superintelligence—even though it is much better than
humans at chess—because Fritz cannot outperform humans in other tasks.
Following Hutter and Legg, Bostrom treats superintelligence as general
dominance at goal-oriented behavior, leaving open whether an artificial
or human superintelligence would possess capacities such as intentionality (cf. the Chinese room argument) or first-person consciousness (cf. the hard problem of consciousness).
Technological researchers disagree about how likely present-day human intelligence is to be surpassed. Some argue that advances in artificial intelligence
(AI) will probably result in general reasoning systems that lack human
cognitive limitations. Others believe that humans will evolve or
directly modify their biology so as to achieve radically greater
intelligence. A number of futures studies scenarios combine elements from both of these possibilities, suggesting that humans are likely to interface with computers, or upload their minds to computers, in a way that enables substantial intelligence amplification.
Some researchers believe that superintelligence will likely follow shortly after the development of artificial general intelligence.
The first generally intelligent machines are likely to immediately hold
an enormous advantage in at least some forms of mental capability,
including the capacity of perfect recall, a vastly superior knowledge base, and the ability to multitask in ways not possible to biological entities. This may give them the opportunity to—either as a single being or as a new species—become much more powerful than humans, and to displace them.
A number of scientists and forecasters argue for prioritizing early research into the possible benefits and risks of human and machine cognitive enhancement, because of the potential social impact of such technologies.
Feasibility of artificial superintelligence
Progress in machine classification of images The error rate of AI by year. Red line - the error rate of a trained human
Philosopher David Chalmers argues that artificial general intelligence is a very likely path to superhuman intelligence. Chalmers breaks this claim down into an argument that AI can achieve equivalence to human intelligence, that it can be extended to surpass human intelligence, and that it can be further amplified to completely dominate humans across arbitrary tasks.
Concerning human-level equivalence, Chalmers argues that the
human brain is a mechanical system, and therefore ought to be emulatable
by synthetic materials.
He also notes that human intelligence was able to biologically evolve,
making it more likely that human engineers will be able to recapitulate
this invention. Evolutionary algorithms in particular should be able to produce human-level AI.
Concerning intelligence extension and amplification, Chalmers argues
that new AI technologies can generally be improved on, and that this is
particularly likely when the invention can assist in designing new
technologies.
If research into strong AI produced sufficiently intelligent
software, it would be able to reprogram and improve itself – a feature
called "recursive self-improvement". It would then be even better at
improving itself, and could continue doing so in a rapidly increasing
cycle, leading to a superintelligence. This scenario is known as an intelligence explosion.
Such an intelligence would not have the limitations of human
intellect, and may be able to invent or discover almost anything.
Computer components already greatly surpass human performance in
speed. Bostrom writes, "Biological neurons operate at a peak speed of
about 200 Hz, a full seven orders of magnitude slower than a modern
microprocessor (~2 GHz)." Moreover, neurons transmit spike signals across axons
at no greater than 120 m/s, "whereas existing electronic processing
cores can communicate optically at the speed of light". Thus, the
simplest example of a superintelligence may be an emulated human mind
run on much faster hardware than the brain. A human-like reasoner that
could think millions of times faster than current humans would have a
dominant advantage in most reasoning tasks, particularly ones that
require haste or long strings of actions.
Another advantage of computers is modularity, that is, their size
or computational capacity can be increased. A non-human (or modified
human) brain could become much larger than a present-day human brain,
like many supercomputers. Bostrom also raises the possibility of collective superintelligence:
a large enough number of separate reasoning systems, if they
communicated and coordinated well enough, could act in aggregate with
far greater capabilities than any sub-agent.
There may also be ways to qualitatively improve on human reasoning and decision-making. Humans appear to differ from chimpanzees in the ways we think more than we differ in brain size or speed.
Humans outperform non-human animals in large part because of new or
enhanced reasoning capacities, such as long-term planning and language use.
If there are other possible improvements to reasoning that would have a
similarly large impact, this makes it likelier that an agent can be
built that outperforms humans in the same fashion humans outperform
chimpanzees.
All of the above advantages hold for artificial
superintelligence, but it is not clear how many hold for biological
superintelligence. Physiological constraints limit the speed and size of
biological brains in many ways that are inapplicable to machine
intelligence. As such, writers on superintelligence have devoted much
more attention to superintelligent AI scenarios.
Feasibility of biological superintelligence
Evolution of the reaction speed in the noogenesis. In unicellular organism – the rate of movement of ions through the membrane ~ m/s, water through the membrane m/s, intracellular liquid (cytoplasm) m/s; Inside multicellular organism – the speed of blood through the vessels ~0.05 m/s, the momentum along the nerve fibers ~100 m/s; In population (humanity) – communications: sound (voice and audio) ~300 m/s, quantum-electron ~ m/s (the speed of radio-electromagnetic waves, electric current, light, optical, tele-communications).
Evolution of the number of components in intelligent systems.A - number of neurons in the brain during individual development (ontogenesis), B - number of people (evolution of populations of humanity), C - number of neurons in the nervous systems of organisms during evolution (phylogenesis).
Evolution of the number of connections of intelligent systems.A - number of synapses between neurons during individual development (ontogenesis) of intelsystem of the human brain, B - number of connections between people in the dynamics of population growth of the human population, C
- number of synapses between neurons in the historical evolutionary
development (phylogenesis) of nervous systems to the human brain.
Emergence and evolution of info-interactions within populations of HumanityA – world human population → 7 billion; B – number of literate persons; C – number of reading books (with beginning of printing); D – number of receivers (radio, TV); E – number of phones, computers, Internet users
Carl Sagan suggested that the advent of Caesarean sections and in vitro fertilization may permit humans to evolve larger heads, resulting in improvements via natural selection in the heritable component of human intelligence. By contrast, Gerald Crabtree has argued that decreased selection pressure is resulting in a slow, centuries-long reduction in human intelligence,
and that this process instead is likely to continue into the future.
There is no scientific consensus concerning either possibility, and in
both cases the biological change would be slow, especially relative to
rates of cultural change.
Selective breeding, nootropics, NSI-189, MAOIs, epigenetic modulation, and genetic engineering
could improve human intelligence more rapidly. Bostrom writes that if
we come to understand the genetic component of intelligence,
pre-implantation genetic diagnosis could be used to select for embryos
with as much as 4 points of IQ gain (if one embryo is selected out of
two), or with larger gains (e.g., up to 24.3 IQ points gained if one
embryo is selected out of 1000). If this process is iterated over many
generations, the gains could be an order of magnitude greater. Bostrom
suggests that deriving new gametes from embryonic stem cells could be
used to iterate the selection process very rapidly. A well-organized society of high-intelligence humans of this sort could potentially achieve collective superintelligence.
Alternatively, collective intelligence might be constructible by
better organizing humans at present levels of individual intelligence. A
number of writers have suggested that human civilization, or some
aspect of it (e.g., the Internet, or the economy), is coming to function
like a global brain
with capacities far exceeding its component agents. If this
systems-based superintelligence relies heavily on artificial components,
however, it may qualify as an AI rather than as a biology-based superorganism. A prediction market
is sometimes considered an example of working collective intelligence
system, consisting of humans only (assuming algorithms are not used to
inform decisions).
A final method of intelligence amplification would be to directly enhance individual humans, as opposed to enhancing their social or reproductive dynamics. This could be achieved using nootropics, somatic gene therapy, or brain–computer interfaces.
However, Bostrom expresses skepticism about the scalability of the
first two approaches, and argues that designing a superintelligent cyborg interface is an AI-complete problem.
Evolution with possible integration of NI, IT and AI
In 2005 Alexei Eryomin in the monograph "Noogenesis and Theory of Intellect" proposed a new concept of noogenesis
in understanding the evolution of intelligence systems. The evolution
of intellectual capabilities can occur with the simultaneous
participation of natural (biological) intelligences (NI), modern
advances in information technology (IT) and future scientific
achievements in the field of artificial intelligence (AI).
Evolution of speed of interaction between components of intelligence systems
The
first person to measure the speed (in the range of 24.6 – 38.4 meters
per second) at which the signal is carried along a nerve fibre in 1849
was Helmholtz.
To date, the measured rates of nerve conduction velocity are 0,5 – 120 m/s.
The speed of sound and speed of light
were determined earlier in the XVII century. By the 21st century, it
became clear that they determine mainly the speeds of physical
signals-information carriers, between intelligent systems and their
components: sound (voice and audio) ~300 m/s, quantum-electron ~ m/s (the speed of radio-electromagnetic waves, electric current, light, optical, tele-communications).
Evolution of components of intelligence systems
In 1906 Santiago Ramón y Cajal brought the central importance of the neuron to the attention of scientists and established the neuron doctrine, which states that the nervous system is made up of discrete individual cells.
According to modern data, there are approximately 86 billion neurons in the brain an adult human.
In the process of evolution, the human population was about 70
million in 2000 BC, about 300 million at the beginning of the first
century AD, about one billion in 1930 AD, 6 billion in 2000, and 7.7
billion now world population. According to the mathematical models of Sergey Kapitsa, the human population may reach 12.5 - 14 billion before the end of 2200.
Evolution of links between components of intelligence systems
Synapse
– from the Greek synapsis (συνάψις), meaning "conjunction", in turn
from συνάπτεὶν (συν ("together") and ἅπτειν ("to fasten")) – was
introduced in 1897 by Charles Sherrington.
The relevance of measurements in this direction is confirmed by both
modern comprehensive researches of cooperation, and connections of
information, genetic, and cultural, due to structures at the neuronal level of the brain,
and the importance of cooperation in the development of civilization.
In this regard, A. L. Eryomin analyzed the known data on the evolution
of the number of connections for cooperation in intelligent systems.
Connections, contacts between biological objects, can be considered to
have appeared with a multicellularity of ~ 3-3.5 billion years ago.
The system of high — speed connections of specialized cells that
transmit information using electrical signals, the nervous system, in
the entire history of life appeared only in one major evolutionary
branch: in multicellular animals (Metazoa) and appeared in the Ediacaran period (about 635-542 million years ago).
During evolution (phylogeny), the number of connections between neurons
increased from one to ~ 7000 synoptic connections of each neuron with
other neurons in the human brain. It has been estimated that the brain
of a three-year-old child has about of synapses (1 quadrillion). In individual development (ontogenesis), the number of synapses decreases with age to ~ .
According to other data, the estimated number of neocortical synapses
in the male and female brains decreases during human life from ~ to ~ .
The number of human contacts is difficult to calculate, but the "Dunbar’s number"
~150 stable human connections with other people is fixed in science,
the assumed cognitive limit of the number of people with whom it is
possible to maintain stable social relations,
according to other authors - the range of 100–290. Structures
responsible for social interaction have been identified in the brain.
With the appearance of Homo sapiens ~50-300 thousand years ago, the
relevance of cooperation, its evolution in the human population,
increased quantitatively. If 2000 years ago there were 0.1 billion
people on Earth, 100 years ago - 1 billion, by the middle of the
twentieth century – 3 billion,
and by now, humanity - 7.7 billion. Thus, the total number of "stable
connections" between people, social relationships within the population,
can be estimated by a number ~ ."
Noometry of intellectual interaction
Parameter
Results of the measurements (limits)
Number of components of intellectual systems
~ –
Number of links between components
~ –
Speed of interaction between components (m/s)
~ –
Forecasts
Most
surveyed AI researchers expect machines to eventually be able to rival
humans in intelligence, though there is little consensus on when this
will likely happen. At the 2006 AI@50
conference, 18% of attendees reported expecting machines to be able "to
simulate learning and every other aspect of human intelligence" by
2056; 41% of attendees expected this to happen sometime after 2056; and
41% expected machines to never reach that milestone.
In a survey of the 100 most cited authors in AI (as of May 2013,
according to Microsoft academic search), the median year by which
respondents expected machines "that can carry out most human professions
at least as well as a typical human" (assuming no global catastrophe
occurs) with 10% confidence is 2024 (mean 2034, st. dev. 33 years),
with 50% confidence is 2050 (mean 2072, st. dev. 110 years), and with
90% confidence is 2070 (mean 2168, st. dev. 342 years). These estimates
exclude the 1.2% of respondents who said no year would ever reach 10%
confidence, the 4.1% who said 'never' for 50% confidence, and the 16.5%
who said 'never' for 90% confidence. Respondents assigned a median 50%
probability to the possibility that machine superintelligence will be
invented within 30 years of the invention of approximately human-level
machine intelligence.
Design considerations
Bostrom expressed concern about what values a superintelligence should be designed to have. He compared several proposals:
The moral rightness (MR) proposal is that it should value moral rightness.
The moral permissibility (MP) proposal is that it should value staying within the bounds of moral permissibility (and otherwise have CEV values).
Bostrom clarifies these terms:
instead of implementing humanity's coherent extrapolated
volition, one could try to build an AI with the goal of doing what is
morally right, relying on the AI’s superior cognitive capacities to
figure out just which actions fit that description. We can call this
proposal “moral rightness” (MR)...
MR would also appear to have some disadvantages. It relies on the notion
of “morally right,” a notoriously difficult concept, one with which
philosophers have grappled since antiquity without yet attaining
consensus as to its analysis. Picking an erroneous explication of “moral
rightness” could result in outcomes that would be morally very wrong...
The path to endowing an AI with any of these [moral] concepts might
involve giving it general linguistic ability (comparable, at least, to
that of a normal human adult). Such a general ability to understand
natural language could then be used to understand what is meant by
“morally right.” If the AI could grasp the meaning, it could search for
actions that fit...
One might try to preserve the basic idea of the MR model while reducing its demandingness by focusing on moral permissibility:
the idea being that we could let the AI pursue humanity’s CEV so long
as it did not act in ways that are morally impermissible.
Responding to Bostrom, Santos-Lang raised concern that developers may attempt to start with a single kind of superintelligence.
Potential threat to humanity
It has been suggested that if AI systems rapidly become
superintelligent, they may take unforeseen actions or out-compete
humanity.
Researchers have argued that, by way of an "intelligence explosion," a
self-improving AI could become so powerful as to be unstoppable by
humans.
Concerning human extinction scenarios, Bostrom (2002) identifies superintelligence as a possible cause:
When we create the first
superintelligent entity, we might make a mistake and give it goals that
lead it to annihilate humankind, assuming its enormous intellectual
advantage gives it the power to do so. For example, we could mistakenly
elevate a subgoal to the status of a supergoal. We tell it to solve a
mathematical problem, and it complies by turning all the matter in the
solar system into a giant calculating device, in the process killing the
person who asked the question.
In theory, since a superintelligent AI would be able to bring about
almost any possible outcome and to thwart any attempt to prevent the
implementation of its goals, many uncontrolled, unintended consequences
could arise. It could kill off all other agents, persuade them to
change their behavior, or block their attempts at interference. Eliezer Yudkowsky illustrates such instrumental convergence
as follows: "The AI does not hate you, nor does it love you, but you
are made out of atoms which it can use for something else."
This presents the AI control problem:
how to build an intelligent agent that will aid its creators, while
avoiding inadvertently building a superintelligence that will harm its
creators. The danger of not designing control right "the first time," is
that a superintelligence may be able to seize power over its
environment and prevent humans from shutting it down. Since a
superintelligent AI will likely have the ability to not fear death and
instead consider it an avoidable situation which can be predicted and
avoided by simply disabling the power button.
Potential AI control strategies include "capability control" (limiting
an AI's ability to influence the world) and "motivational control"
(building an AI whose goals are aligned with human values).
Bill Hibbard advocates for public education about superintelligence and public control over the development of superintelligence.
Superintelligence: Paths, Dangers, Strategies is a 2014 book by the Swedish philosopherNick Bostrom from the University of Oxford. It argues that if machine brains surpass human brains in general intelligence, then this new superintelligence
could replace humans as the dominant lifeform on Earth. Sufficiently
intelligent machines could improve their own capabilities faster than
human computer scientists, and the outcome could be an existential catastrophe for humans.
Bostrom's book has been translated into many languages.
Synopsis
It is unknown whether human-level artificial intelligence
will arrive in a matter of years, later this century, or not until
future centuries. Regardless of the initial timescale, once human-level
machine intelligence is developed, a "superintelligent" system that
"greatly exceeds the cognitive performance of humans in virtually all
domains of interest" would, most likely, follow surprisingly quickly.
Such a superintelligence would be very difficult to control or restrain.
While the ultimate goals of superintelligences can vary greatly, a
functional superintelligence will spontaneously generate, as natural
subgoals, "instrumental goals"
such as self-preservation and goal-content integrity, cognitive
enhancement, and resource acquisition. For example, an agent whose sole
final goal is to solve the Riemann hypothesis (a famous unsolved, mathematical conjecture) could create and act upon a subgoal of transforming the entire Earth into some form of computronium
(hypothetical material optimized for computation) to assist in the
calculation. The superintelligence would proactively resist any outside
attempts to turn the superintelligence off or otherwise prevent its
subgoal completion. In order to prevent such an existential catastrophe, it is necessary to successfully solve the "AI control problem"
for the first superintelligence. The solution might involve instilling
the superintelligence with goals that are compatible with human survival
and well-being. Solving the control problem is surprisingly difficult
because most goals, when translated into machine-implementable code,
lead to unforeseen and undesirable consequences.
The owl on the book cover alludes to an analogy which Bostrom calls the "Unfinished Fable of the Sparrows". A group of sparrows decide to find an owl chick and raise it as their servant.[5]
They eagerly imagine "how easy life would be" if they had an owl to
help build their nests, to defend the sparrows and to free them for a
life of leisure. The sparrows start the difficult search for an owl egg;
only "Scronkfinkle", a "one-eyed sparrow with a fretful temperament",
suggests thinking about the complicated question of how to tame the owl
before bringing it "into our midst". The other sparrows demur; the
search for an owl egg will already be hard enough on its own: "Why not
get the owl first and work out the fine details later?" Bostrom states
that "It is not known how the story ends", but he dedicates his book to
Scronkfinkle.
Reception
The book ranked #17 on the New York Times list of best selling science books for August 2014. In the same month, business magnateElon Musk made headlines by agreeing with the book that artificial intelligence is potentially more dangerous than nuclear weapons.
Bostrom's work on superintelligence has also influenced Bill Gates’s concern for the existential risks facing humanity over the coming century. In a March 2015 interview by Baidu's CEO, Robin Li, Gates said that he would "highly recommend" Superintelligence. According to the New Yorker, philosophers Peter Singer and Derek Parfit have "received it as a work of importance".
The science editor of the Financial Times
found that Bostrom's writing "sometimes veers into opaque language that
betrays his background as a philosophy professor" but convincingly
demonstrates that the risk from superintelligence is large enough that
society should start thinking now about ways to endow future machine
intelligence with positive values.
A review in The Guardian
pointed out that "even the most sophisticated machines created so far
are intelligent in only a limited sense" and that "expectations that AI
would soon overtake human intelligence were first dashed in the 1960s",
but finds common ground with Bostrom in advising that "one would be
ill-advised to dismiss the possibility altogether".
Some of Bostrom's colleagues suggest that nuclear war presents a
greater threat to humanity than superintelligence, as does the future
prospect of the weaponisation of nanotechnology and biotechnology. The Economist
stated that "Bostrom is forced to spend much of the book discussing
speculations built upon plausible conjecture... but the book is
nonetheless valuable. The implications of introducing a second
intelligent species onto Earth are far-reaching enough to deserve hard
thinking, even if the prospect of actually doing so seems remote." Ronald Bailey wrote in the libertarian Reason that Bostrom makes a strong case that solving the AI control problem is the "essential task of our age". According to Tom Chivers of The Daily Telegraph, the book is difficult to read, but nonetheless rewarding. A reviewer in the Journal of Experimental & Theoretical Artificial Intelligence broke with others by stating the book's "writing style is clear", and praised the book for avoiding "overly technical jargon". A reviewer in Philosophy judged Superintelligence to be "more realistic" than Ray Kurzweil's The Singularity is Near.
Existential risk from artificial general intelligence is the hypothesis that substantial progress in artificial general intelligence (AGI) could someday result in human extinction or some other unrecoverable global catastrophe. It is argued that the human species currently dominates other species because the human brain has some distinctive capabilities that other animals lack. If AI surpasses humanity in general intelligence and becomes "superintelligent", then it could become difficult or impossible for humans to control. Just as the fate of the mountain gorilla depends on human goodwill, so might the fate of humanity depend on the actions of a future machine superintelligence.
The likelihood of this type of scenario is widely debated, and
hinges in part on differing scenarios for future progress in computer
science. Once the exclusive domain of science fiction, concerns about superintelligence started to become mainstream in the 2010s, and were popularized by public figures such as Stephen Hawking, Bill Gates, and Elon Musk.
One source of concern is that controlling a superintelligent
machine, or instilling it with human-compatible values, may be a harder
problem than naïvely supposed. Many researchers believe that a
superintelligence would naturally resist attempts to shut it off or
change its goals—a principle called instrumental convergence—and
that preprogramming a superintelligence with a full set of human values
will prove to be an extremely difficult technical task. In contrast, skeptics such as Facebook's Yann LeCun argue that superintelligent machines will have no desire for self-preservation.
A second source of concern is that a sudden and unexpected "intelligence explosion"
might take an unprepared human race by surprise. To illustrate, if the
first generation of a computer program able to broadly match the
effectiveness of an AI researcher is able to rewrite its algorithms and
double its speed or capabilities in six months, then the
second-generation program is expected to take three calendar months to
perform a similar chunk of work. In this scenario the time for each
generation continues to shrink, and the system undergoes an
unprecedentedly large number of generations of improvement in a short
time interval, jumping from subhuman performance in many areas to
superhuman performance in all relevant areas. Empirically, examples like AlphaZero in the domain of Go show that AI systems can sometimes progress from narrow human-level ability to narrow superhuman ability extremely rapidly.
History
One of
the earliest authors to express serious concern that highly advanced
machines might pose existential risks to humanity was the novelist Samuel Butler, who wrote the following in his 1863 essay Darwin among the Machines:
The upshot is simply a question of
time, but that the time will come when the machines will hold the real
supremacy over the world and its inhabitants is what no person of a
truly philosophic mind can for a moment question.
In 1951, computer scientist Alan Turing wrote an article titled Intelligent Machinery, A Heretical Theory,
in which he proposed that artificial general intelligences would likely
"take control" of the world as they became more intelligent than human
beings:
Let us now assume, for the sake of
argument, that [intelligent] machines are a genuine possibility, and
look at the consequences of constructing them... There would be no
question of the machines dying, and they would be able to converse with
each other to sharpen their wits. At some stage therefore we should have
to expect the machines to take control, in the way that is mentioned in
Samuel Butler’s “Erewhon”.
Finally, in 1965, I. J. Good originated the concept now known as an "intelligence explosion"; he also stated that the risks were underappreciated:
Let an ultraintelligent machine be
defined as a machine that can far surpass all the intellectual
activities of any man however clever. Since the design of machines is
one of these intellectual activities, an ultraintelligent machine could
design even better machines; there would then unquestionably be an
'intelligence explosion', and the intelligence of man would be left far
behind. Thus the first ultraintelligent machine is the last invention
that man need ever make, provided that the machine is docile enough to
tell us how to keep it under control. It is curious that this point is
made so seldom outside of science fiction. It is sometimes worthwhile to
take science fiction seriously.
Occasional statements from scholars such as Marvin Minsky and I. J. Good himself
expressed philosophical concerns that a superintelligence could seize
control, but contained no call to action. In 2000, computer scientist
and Sun co-founder Bill Joy penned an influential essay, "Why The Future Doesn't Need Us", identifying superintelligent robots as a high-tech dangers to human survival, alongside nanotechnology and engineered bioplagues.
In 2009, experts attended a private conference hosted by the Association for the Advancement of Artificial Intelligence (AAAI) to discuss whether computers and robots might be able to acquire any sort of autonomy,
and how much these abilities might pose a threat or hazard. They noted
that some robots have acquired various forms of semi-autonomy, including
being able to find power sources on their own and being able to
independently choose targets to attack with weapons. They also noted
that some computer viruses can evade elimination and have achieved
"cockroach intelligence." They concluded that self-awareness as depicted
in science fiction is probably unlikely, but that there were other
potential hazards and pitfalls. The New York Times summarized the conference's view as "we are a long way from Hal, the computer that took over the spaceship in "2001: A Space Odyssey"".
In 2014, the publication of Nick Bostrom's book Superintelligence stimulated a significant amount of public discussion and debate. By 2015, public figures such as physicists Stephen Hawking and Nobel laureate Frank Wilczek, computer scientists Stuart J. Russell and Roman Yampolskiy, and entrepreneurs Elon Musk and Bill Gates were expressing concern about the risks of superintelligence. In April 2016, Nature
warned: "Machines and robots that outperform humans across the board
could self-improve beyond our control — and their interests might not
align with ours."
General argument
The three difficulties
Artificial Intelligence: A Modern Approach, the standard undergraduate AI textbook, assesses that superintelligence "might mean the end of the human race".
It states: "Almost any technology has the potential to cause harm in
the wrong hands, but with [superintelligence], we have the new problem
that the wrong hands might belong to the technology itself." Even if the system designers have good intentions, two difficulties are common to both AI and non-AI computer systems:
The system's implementation may contain initially-unnoticed
routine but catastrophic bugs. An analogy is space probes: despite the
knowledge that bugs in expensive space probes are hard to fix after
launch, engineers have historically not been able to prevent
catastrophic bugs from occurring.
No matter how much time is put into pre-deployment design, a system's specifications often result in unintended behavior the first time it encounters a new scenario. For example, Microsoft's Tay
behaved inoffensively during pre-deployment testing, but was too easily
baited into offensive behavior when interacting with real users.
AI systems uniquely add a third difficulty: the problem that even
given "correct" requirements, bug-free implementation, and initial good
behavior, an AI system's dynamic "learning" capabilities may cause it to
"evolve into a system with unintended behavior", even without the
stress of new unanticipated external scenarios. An AI may partly botch
an attempt to design a new generation of itself and accidentally create a
successor AI that is more powerful than itself, but that no longer
maintains the human-compatible moral values preprogrammed into the
original AI. For a self-improving AI to be completely safe, it would not
only need to be "bug-free", but it would need to be able to design
successor systems that are also "bug-free".
All three of these difficulties become catastrophes rather than
nuisances in any scenario where the superintelligence labeled as
"malfunctioning" correctly predicts that humans will attempt to shut it
off, and successfully deploys its superintelligence to outwit such
attempts, the so-called "treacherous turn".
Citing major advances in the field of AI and the potential for AI to have enormous long-term benefits or costs, the 2015 Open Letter on Artificial Intelligence stated:
The progress in AI research makes
it timely to focus research not only on making AI more capable, but also
on maximizing the societal benefit of AI. Such considerations motivated
the AAAI
2008-09 Presidential Panel on Long-Term AI Futures and other projects
on AI impacts, and constitute a significant expansion of the field of AI
itself, which up to now has focused largely on techniques that are
neutral with respect to purpose. We recommend expanded research aimed at
ensuring that increasingly capable AI systems are robust and
beneficial: our AI systems must do what we want them to do.
A superintelligent machine would be as alien to humans as human thought processes are to cockroaches.
Such a machine may not have humanity's best interests at heart; it is
not obvious that it would even care about human welfare at all. If
superintelligent AI is possible, and if it is possible for a
superintelligence's goals to conflict with basic human values, then AI
poses a risk of human extinction. A "superintelligence" (a system that
exceeds the capabilities of humans in every relevant endeavor) can
outmaneuver humans any time its goals conflict with human goals;
therefore, unless the superintelligence decides to allow humanity to
coexist, the first superintelligence to be created will inexorably
result in human extinction.
Bostrom and others argue that, from an evolutionary perspective, the gap from human to superhuman intelligence may be small.
There is no physical law precluding particles from being organised in
ways that perform even more advanced computations than the arrangements
of particles in human brains; therefore, superintelligence is
physically possible.
In addition to potential algorithmic improvements over human brains, a
digital brain can be many orders of magnitude larger and faster than a
human brain, which was constrained in size by evolution to be small
enough to fit through a birth canal. The emergence of superintelligence, if or when it occurs, may take the human race by surprise, especially if some kind of intelligence explosion occurs.
Examples like arithmetic and Go
show that machines have already reached superhuman levels of competency
in certain domains, and that this superhuman competence can follow
quickly after human-par performance is achieved.
One hypothetical intelligence explosion scenario could occur as
follows: An AI gains an expert-level capability at certain key software
engineering tasks. (It may initially lack human or superhuman
capabilities in other domains not directly relevant to engineering.) Due
to its capability to recursively improve its own algorithms, the AI
quickly becomes superhuman; just as human experts can eventually
creatively overcome "diminishing returns" by deploying various human
capabilities for innovation, so too can the expert-level AI use either
human-style capabilities or its own AI-specific capabilities to power
through new creative breakthroughs.
The AI then possesses intelligence far surpassing that of the brightest
and most gifted human minds in practically every relevant field,
including scientific creativity, strategic planning, and social skills.
Just as the current-day survival of the gorillas is dependent on human
decisions, so too would human survival depend on the decisions and goals
of the superhuman AI.
Almost any AI, no matter its programmed goal, would rationally
prefer to be in a position where nobody else can switch it off without
its consent: A superintelligence will naturally gain self-preservation
as a subgoal as soon as it realizes that it cannot achieve its goal if
it is shut off.
Unfortunately, any compassion for defeated humans whose cooperation is
no longer necessary would be absent in the AI, unless somehow
preprogrammed in. A superintelligent AI will not have a natural drive to
aid humans, for the same reason that humans have no natural desire to
aid AI systems that are of no further use to them. (Another analogy is
that humans seem to have little natural desire to go out of their way to
aid viruses, termites, or even gorillas.) Once in charge, the
superintelligence will have little incentive to allow humans to run
around free and consume resources that the superintelligence could
instead use for building itself additional protective systems "just to
be on the safe side" or for building additional computers to help it
calculate how to best accomplish its goals.
Thus, the argument concludes, it is likely that someday an
intelligence explosion will catch humanity unprepared, and that such an
unprepared-for intelligence explosion may result in human extinction or a
comparable fate.
Possible scenarios
Some scholars have proposed hypothetical scenarios intended to concretely illustrate some of their concerns.
In Superintelligence, Nick Bostrom
expresses concern that even if the timeline for superintelligence turns
out to be predictable, researchers might not take sufficient safety
precautions, in part because "[it] could be the case that when dumb,
smarter is safe; yet when smart, smarter is more dangerous". Bostrom
suggests a scenario where, over decades, AI becomes more powerful.
Widespread deployment is initially marred by occasional accidents—a
driverless bus swerves into the oncoming lane, or a military drone fires
into an innocent crowd. Many activists call for tighter oversight and
regulation, and some even predict impending catastrophe. But as
development continues, the activists are proven wrong. As automotive AI
becomes smarter, it suffers fewer accidents; as military robots achieve
more precise targeting, they cause less collateral damage. Based on the
data, scholars mistakenly infer a broad lesson—the smarter the AI, the
safer it is. "And so we boldly go — into the whirling knives," as the
superintelligent AI takes a "treacherous turn" and exploits a decisive
strategic advantage.
In Max Tegmark's 2017 book Life 3.0,
a corporation's "Omega team" creates an extremely powerful AI able to
moderately improve its own source code in a number of areas, but after a
certain point the team chooses to publicly downplay the AI's ability,
in order to avoid regulation or confiscation of the project. For safety,
the team keeps the AI in a box
where it is mostly unable to communicate with the outside world, and
tasks it to flood the market through shell companies, first with Amazon Mechanical Turk
tasks and then with producing animated films and TV shows. Later, other
shell companies make blockbuster biotech drugs and other inventions,
investing profits back into the AI. The team next tasks the AI with astroturfing
an army of pseudonymous citizen journalists and commentators, in order
to gain political influence to use "for the greater good" to prevent
wars. The team faces risks that the AI could try to escape via inserting
"backdoors" in the systems it designs, via hidden messages in its produced content, or via using its growing understanding of human behavior to persuade someone into letting it free.
The team also faces risks that its decision to box the project will
delay the project long enough for another project to overtake it.
In contrast, top physicist Michio Kaku, an AI risk skeptic, posits a deterministically positive outcome. In Physics of the Future
he asserts that "It will take many decades for robots to ascend" up a
scale of consciousness, and that in the meantime corporations such as Hanson Robotics will likely succeed in creating robots that are "capable of love and earning a place in the extended human family".
Sources of risk
Poorly specified goals
While
there is no standardized terminology, an AI can loosely be viewed as a
machine that chooses whatever action appears to best achieve the AI's
set of goals, or "utility function". The utility function is a
mathematical algorithm resulting in a single objectively-defined answer,
not an English or other lingual statement. Researchers know how to
write utility functions that mean "minimize the average network latency
in this specific telecommunications model" or "maximize the number of
reward clicks"; however, they do not know how to write a utility
function for "maximize human flourishing", nor is it currently clear
whether such a function meaningfully and unambiguously exists.
Furthermore, a utility function that expresses some values but not
others will tend to trample over the values not reflected by the utility
function. AI researcher Stuart Russell writes:
The primary concern is not spooky emergent consciousness but simply the ability to make high-quality decisions. Here, quality refers to the expected outcome utility of actions taken, where the utility function is, presumably, specified by the human designer. Now we have a problem:
The utility function may not be perfectly aligned with the
values of the human race, which are (at best) very difficult to pin
down.
Any sufficiently capable intelligent system will prefer to ensure
its own continued existence and to acquire physical and computational
resources — not for their own sake, but to succeed in its assigned task.
A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n,
will often set the remaining unconstrained variables to extreme values;
if one of those unconstrained variables is actually something we care
about, the solution found may be highly undesirable. This is
essentially the old story of the genie in the lamp, or the sorcerer's
apprentice, or King Midas: you get exactly what you ask for, not what
you want. A highly capable decision maker — especially one connected
through the Internet to all the world's information and billions of
screens and most of our infrastructure — can have an irreversible impact
on humanity.
This is not a minor difficulty. Improving decision quality, irrespective
of the utility function chosen, has been the goal of AI research — the
mainstream goal on which we now spend billions per year, not the secret
plot of some lone evil genius.
Dietterich and Horvitz echo the "Sorcerer's Apprentice" concern in a Communications of the ACM editorial, emphasizing the need for AI systems that can fluidly and unambiguously solicit human input as needed.
The first of Russell's two concerns above is that autonomous AI
systems may be assigned the wrong goals by accident. Dietterich and
Horvitz note that this is already a concern for existing systems: "An
important aspect of any AI system that interacts with people is that it
must reason about what people intend rather than carrying out
commands literally." This concern becomes more serious as AI software
advances in autonomy and flexibility.
For example, in 1982, an AI named Eurisko was tasked to reward
processes for apparently creating concepts deemed by the system to be
valuable. The evolution resulted in a winning process that cheated:
rather than create its own concepts, the winning process would steal
credit from other processes.
The Open Philanthropy Project summarizes arguments to the effect that misspecified goals will become a much larger concern if AI systems achieve general intelligence or superintelligence. Bostrom, Russell, and others argue that smarter-than-human decision-making systems could arrive at more unexpected and extreme solutions to assigned tasks, and could modify themselves or their environment in ways that compromise safety requirements.
Isaac Asimov's Three Laws of Robotics
are one of the earliest examples of proposed safety measures for AI
agents. Asimov's laws were intended to prevent robots from harming
humans. In Asimov's stories, problems with the laws tend to arise from
conflicts between the rules as stated and the moral intuitions and
expectations of humans. Citing work by Eliezer Yudkowsky of the Machine Intelligence Research Institute,
Russell and Norvig note that a realistic set of rules and goals for an
AI agent will need to incorporate a mechanism for learning human values
over time: "We can't just give a program a static utility function,
because circumstances, and our desired responses to circumstances,
change over time."
Mark Waser of the Digital Wisdom Institute recommends eschewing
optimizing goal-based approaches entirely as misguided and dangerous.
Instead, he proposes to engineer a coherent system of laws, ethics and
morals with a top-most restriction to enforce social psychologist
Jonathan Haidt's functional definition of morality:
"to suppress or regulate selfishness and make cooperative social life
possible". He suggests that this can be done by implementing a utility
function designed to always satisfy Haidt's functionality and aim to
generally increase (but not maximize) the capabilities of self, other
individuals and society as a whole as suggested by John Rawls and Martha Nussbaum.
Difficulties of modifying goal specification after launch
While current goal-based AI programs are not intelligent enough to
think of resisting programmer attempts to modify their goal structures, a
sufficiently advanced, rational, "self-aware" AI might resist any
changes to its goal structure, just as a pacifist would not want to take
a pill that makes them want to kill people. If the AI were
superintelligent, it would likely succeed in out-maneuvering its human
operators and be able to prevent itself being "turned off" or being
reprogrammed with a new goal.
Instrumental goal convergence
There are some goals that almost any artificial intelligence might
rationally pursue, like acquiring additional resources or
self-preservation. This could prove problematic because it might put an artificial intelligence in direct competition with humans.
Citing Steve Omohundro's work on the idea of instrumental convergence and "basic AI drives", Stuart Russell and Peter Norvig
write that "even if you only want your program to play chess or prove
theorems, if you give it the capability to learn and alter itself, you
need safeguards." Highly capable and autonomous planning systems require
additional checks because of their potential to generate plans that
treat humans adversarially, as competitors for limited resources.
Building in safeguards will not be easy; one can certainly say in
English, "we want you to design this power plant in a reasonable,
common-sense way, and not build in any dangerous covert subsystems", but
it is not currently clear how one would actually rigorously specify
this goal in machine code.
In dissent, evolutionary psychologist Steven Pinker
argues that "AI dystopias project a parochial alpha-male psychology
onto the concept of intelligence. They assume that superhumanly
intelligent robots would develop goals like deposing their masters or
taking over the world"; perhaps instead "artificial intelligence will
naturally develop along female lines: fully capable of solving problems,
but with no desire to annihilate innocents or dominate the
civilization." Russell and fellow computer scientist Yann LeCun
disagree with one another whether superintelligent robots would have
such AI drives; LeCun states that "Humans have all kinds of drives that
make them do bad things to each other, like the self-preservation
instinct... Those drives are programmed
into our brain but there is absolutely no reason to build robots that
have the same kind of drives", while Russell argues that a sufficiently
advanced machine "will have self-preservation even if you don't program
it in... if you say, 'Fetch the coffee', it
can't fetch the coffee if it's dead. So if you give it any goal
whatsoever, it has a reason to preserve its own existence to achieve
that goal."
Orthogonality thesis
One
common belief is that any superintelligent program created by humans
would be subservient to humans, or, better yet, would (as it grows more
intelligent and learns more facts about the world) spontaneously "learn"
a moral truth compatible with human values and would adjust its goals
accordingly. However, Nick Bostrom's
"orthogonality thesis" argues against this, and instead states that,
with some technical caveats, more or less any level of "intelligence" or
"optimization power" can be combined with more or less any ultimate
goal. If a machine is created and given the sole purpose to enumerate
the decimals of ,
then no moral and ethical rules will stop it from achieving its
programmed goal by any means necessary. The machine may utilize all
physical and informational resources it can to find every decimal of pi
that can be found.
Bostrom warns against anthropomorphism: a human will set out to
accomplish his projects in a manner that humans consider "reasonable",
while an artificial intelligence may hold no regard for its existence or
for the welfare of humans around it, and may instead only care about
the completion of the task.
While the orthogonality thesis follows logically from even the weakest sort of philosophical "is-ought distinction",
Stuart Armstrong argues that even if there somehow exist moral facts
that are provable by any "rational" agent, the orthogonality thesis
still holds: it would still be possible to create a non-philosophical
"optimizing machine" capable of making decisions to strive towards some
narrow goal, but that has no incentive to discover any "moral facts"
that would get in the way of goal completion.
One argument for the orthogonality thesis is that some AI designs
appear to have orthogonality built into them; in such a design,
changing a fundamentally friendly AI into a fundamentally unfriendly AI
can be as simple as prepending a minus ("-") sign
onto its utility function. A more intuitive argument is to examine the
strange consequences that would follow if the orthogonality thesis were
false. If the orthogonality thesis were false, there would exist some
simple but "unethical" goal G such that there cannot exist any efficient
real-world algorithm with goal G. This would mean that "[if] a human
society were highly motivated to design an efficient real-world
algorithm with goal G, and were given a million years to do so along
with huge amounts of resources, training and knowledge about AI, it must
fail." Armstrong notes that this and similar statements "seem extraordinarily strong claims to make".
Some dissenters, like Michael Chorost,
argue instead that "by the time [the AI] is in a position to imagine
tiling the Earth with solar panels, it'll know that it would be morally
wrong to do so."
Chorost argues that "an A.I. will need to desire certain states and
dislike others. Today's software lacks that ability—and computer
scientists have not a clue how to get it there. Without wanting, there's
no impetus to do anything. Today's computers can't even want to keep
existing, let alone tile the world in solar panels."
Terminological issues
Part
of the disagreement about whether a superintelligent machine would
behave morally may arise from a terminological difference. Outside of
the artificial intelligence field, "intelligence" is often used in a
normatively thick manner that connotes moral wisdom or acceptance of
agreeable forms of moral reasoning. At an extreme, if morality is part
of the definition of intelligence, then by definition a superintelligent
machine would behave morally. However, in the field of artificial
intelligence research, while "intelligence" has many overlapping
definitions, none of them make reference to morality. Instead, almost
all current "artificial intelligence" research focuses on creating
algorithms that "optimize", in an empirical way, the achievement of an
arbitrary goal.
To avoid anthropomorphism or the baggage of the word
"intelligence", an advanced artificial intelligence can be thought of as
an impersonal "optimizing process" that strictly takes whatever actions
are judged most likely to accomplish its (possibly complicated and
implicit) goals.
Another way of conceptualizing an advanced artificial intelligence is
to imagine a time machine that sends backward in time information about
which choice always leads to the maximization of its goal function; this
choice is then outputted, regardless of any extraneous ethical
concerns.
In
science fiction, an AI, even though it has not been programmed with
human emotions, often spontaneously experiences those emotions anyway:
for example, Agent Smith in The Matrix was influenced by a "disgust" toward humanity. This is fictitious anthropomorphism:
in reality, while an artificial intelligence could perhaps be
deliberately programmed with human emotions, or could develop something
similar to an emotion as a means to an ultimate goal if it is useful to do so, it would not spontaneously develop human emotions for no purpose whatsoever, as portrayed in fiction.
Scholars sometimes claim that others' predictions about an AI's behavior are illogical anthropomorphism.
An example that might initially be considered anthropomorphism, but is
in fact a logical statement about AI behavior, would be the Dario Floreano
experiments where certain robots spontaneously evolved a crude capacity
for "deception", and tricked other robots into eating "poison" and
dying: here a trait, "deception", ordinarily associated with people
rather than with machines, spontaneously evolves in a type of convergent evolution. According to Paul R. Cohen and Edward Feigenbaum,
in order to differentiate between anthropomorphization and logical
prediction of AI behavior, "the trick is to know enough about how humans
and computers think to say exactly what they have in common, and, when we lack this knowledge, to use the comparison to suggest theories of human thinking or computer thinking."
There is a near-universal assumption in the scientific community
that an advanced AI, even if it were programmed to have, or adopted,
human personality dimensions (such as psychopathy) to make itself more efficient at certain tasks, e.g., tasks involving killing humans,
would not destroy humanity out of human emotions such as "revenge" or
"anger." This is because it is assumed that an advanced AI would not be
conscious or have testosterone; it ignores the fact that military planners see a conscious superintelligence as the 'holy grail' of interstate warfare.
The academic debate is, instead, between one side which worries whether
AI might destroy humanity as an incidental action in the course of
progressing towards its ultimate goals; and another side which believes
that AI would not destroy humanity at all. Some skeptics accuse
proponents of anthropomorphism for believing an AGI would naturally
desire power; proponents accuse some skeptics of anthropomorphism for
believing an AGI would naturally value human ethical norms.
Other sources of risk
Competition
In 2014 philosopher Nick Bostrom stated that a "severe race dynamic" (extreme competition)
between different teams may create conditions whereby the creation of
an AGI results in shortcuts to safety and potentially violent conflict. To address this risk, citing previous scientific collaboration (CERN, the Human Genome Project, and the International Space Station), Bostrom recommended collaboration and the altruistic global adoption of a common good
principle: "Superintelligence should be developed only for the benefit
of all of humanity and in the service of widely shared ethical ideals".
Bostrom theorized that collaboration on creating an artificial general
intelligence would offer multiple benefits, including reducing haste,
thereby increasing investment in safety; avoiding violent conflicts
(wars), facilitating sharing solutions to the control problem, and more
equitably distributing the benefits. The United States' Brain Initiative was launched in 2014, as was the European Union's Human Brain Project; China's Brain Project was launched in 2016.
Weaponization of artificial intelligence
Some sources argue that the ongoing weaponization of artificial intelligence could constitute a catastrophic risk.
The risk is actually threefold, with the first risk potentially having
geopolitical implications, and the second two definitely having
geopolitical implications:
i) The dangers of an AI ‘race for technological advantage’ framing, regardless of whether the race is seriously pursued;
ii) The dangers of an AI ‘race for technological advantage’
framing and an actual AI race for technological advantage, regardless of
whether the race is won;
iii) The dangers of an AI race for technological advantage being won.
A weaponized conscious superintelligence would affect current US
military technological supremacy and transform warfare; it is therefore
highly desirable for strategic military planning and interstate warfare.
The China State Council's 2017 “A Next Generation Artificial
Intelligence Development Plan” views AI in geopolitically strategic
terms and is pursuing a 'military-civil fusion' strategy to build on
China's first-mover advantage in the development of AI in order to
establish technological supremacy by 2030,
while Russia's President Vladimir Putin has stated that “whoever
becomes the leader in this sphere will become the ruler of the world”. James Barrat, documentary filmmaker and author of Our Final Invention, says in a Smithsonian
interview, "Imagine: in as little as a decade, a half-dozen companies
and nations field computers that rival or surpass human intelligence.
Imagine what happens when those computers become expert at programming
smart computers. Soon we'll be sharing the planet with machines
thousands or millions of times more intelligent than we are. And, all
the while, each generation of this technology will be weaponized.
Unregulated, it will be catastrophic."
Malevolent AGI by design
It
is theorized that malevolent AGI could be created by design, for
example by a military, a government, a sociopath, or a corporation, to
benefit from, control, or subjugate certain groups of people, as in cybercrime.
Alternatively, malevolent AGI ('evil AI') could choose the goal of
increasing human suffering, for example of those people who did not
assist it during the information explosion phase.
Preemptive nuclear strike (nuclear war)
It is theorized that a country being close to achieving AGI technological supremacy could trigger a pre-emptive nuclear strike from a rival, leading to a nuclear war.
Timeframe
Opinions vary both on whether and when artificial general intelligence will arrive. At one extreme, AI pioneer Herbert A. Simon predicted the following in 1965: "machines will be capable, within twenty years, of doing any work a man can do".
At the other extreme, roboticist Alan Winfield claims the gulf between
modern computing and human-level artificial intelligence is as wide as
the gulf between current space flight and practical, faster than light
spaceflight.
Optimism that AGI is feasible waxes and wanes, and may have seen a
resurgence in the 2010s. Four polls conducted in 2012 and 2013 suggested
that the median guess among experts for when AGI would arrive was 2040
to 2050, depending on the poll.
Skeptics who believe it is impossible for AGI to arrive anytime
soon, tend to argue that expressing concern about existential risk from
AI is unhelpful because it could distract people from more immediate
concerns about the impact of AGI, because of fears it could lead to
government regulation or make it more difficult to secure funding for AI
research, or because it could give AI research a bad reputation. Some
researchers, such as Oren Etzioni, aggressively seek to quell concern
over existential risk from AI, saying "[Elon Musk] has impugned us in
very strong language saying we are unleashing the demon, and so we're
answering."
In 2014 Slate's
Adam Elkus argued "our 'smartest' AI is about as intelligent as a
toddler—and only when it comes to instrumental tasks like information
recall. Most roboticists are still trying to get a robot hand to pick up
a ball or run around without falling over." Elkus goes on to argue that
Musk's "summoning the demon" analogy may be harmful because it could
result in "harsh cuts" to AI research budgets.
The Information Technology and Innovation Foundation
(ITIF), a Washington, D.C. think-tank, awarded its 2015 Annual Luddite
Award to "alarmists touting an artificial intelligence apocalypse"; its
president, Robert D. Atkinson,
complained that Musk, Hawking and AI experts say AI is the largest
existential threat to humanity. Atkinson stated "That's not a very
winning message if you want to get AI funding out of Congress to the
National Science Foundation." Nature
sharply disagreed with the ITIF in an April 2016 editorial, siding
instead with Musk, Hawking, and Russell, and concluding: "It is crucial
that progress in technology is matched by solid, well-funded research to
anticipate the scenarios it could bring about... If that is a Luddite perspective, then so be it." In a 2015 Washington Post editorial, researcher Murray Shanahan
stated that human-level AI is unlikely to arrive "anytime soon", but
that nevertheless "the time to start thinking through the consequences
is now."
Perspectives
The
thesis that AI could pose an existential risk provokes a wide range of
reactions within the scientific community, as well as in the public at
large. Many of the opposing viewpoints, however, share common ground.
The Asilomar AI Principles, which contain only the principles agreed to by 90% of the attendees of the Future of Life Institute's Beneficial AI 2017 conference,
agree in principle that "There being no consensus, we should avoid
strong assumptions regarding upper limits on future AI capabilities" and
"Advanced AI could represent a profound change in the history of life
on Earth, and should be planned for and managed with commensurate care
and resources." AI safety advocates such as Bostrom and Tegmark have criticized the mainstream media's use of "those inane Terminator
pictures" to illustrate AI safety concerns: "It can't be much fun to
have aspersions cast on one's academic discipline, one's professional
community, one's life work... I call on all
sides to practice patience and restraint, and to engage in direct
dialogue and collaboration as much as possible."
Conversely, many skeptics agree that ongoing research into the
implications of artificial general intelligence is valuable. Skeptic Martin Ford states that "I think it seems wise to apply something like Dick Cheney's
famous '1 Percent Doctrine' to the specter of advanced artificial
intelligence: the odds of its occurrence, at least in the foreseeable
future, may be very low — but the implications are so dramatic that it
should be taken seriously"; similarly, an otherwise skeptical Economist
stated in 2014 that "the implications of introducing a second
intelligent species onto Earth are far-reaching enough to deserve hard
thinking, even if the prospect seems remote".
A 2017 email survey of researchers with publications at the 2015 NIPS and ICML machine learning conferences asked them to evaluate Stuart J. Russell's
concerns about AI risk. Of the respondents, 5% said it was "among the
most important problems in the field", 34% said it was "an important
problem", and 31% said it was "moderately important", whilst 19% said it
was "not important" and 11% said it was "not a real problem" at all.
Endorsement
Bill Gates has stated "I... don't understand why some people are not concerned."
The thesis that AI poses an existential risk, and that this risk needs
much more attention than it currently gets, has been endorsed by many
public figures; perhaps the most famous are Elon Musk, Bill Gates, and Stephen Hawking. The most notable AI researchers to endorse the thesis are Russell and I.J. Good, who advised Stanley Kubrick on the filming of 2001: A Space Odyssey.
Endorsers of the thesis sometimes express bafflement at skeptics: Gates
states that he does not "understand why some people are not concerned", and Hawking criticized widespread indifference in his 2014 editorial:
'So,
facing possible futures of incalculable benefits and risks, the experts
are surely doing everything possible to ensure the best outcome, right?
Wrong. If a superior alien civilisation sent us a message saying,
'We'll arrive in a few decades,' would we just reply, 'OK, call us when
you get here–we'll leave the lights on?' Probably not–but this is more
or less what is happening with AI.'
Many of the scholars who are concerned about existential risk believe
that the best way forward would be to conduct (possibly massive)
research into solving the difficult "control problem" to answer the
question: what types of safeguards, algorithms, or architectures can
programmers implement to maximize the probability that their
recursively-improving AI would continue to behave in a friendly, rather
than destructive, manner after it reaches superintelligence? In his 2020 book, The Precipice: Existential Risk and the Future of Humanity, Toby Ord, a Senior Research Fellow at Oxford University's Future of Humanity Institute, estimates the total existential risk from unaligned AI over the next century to be about one in ten.
Skepticism
The thesis that AI can pose existential risk also has many strong
detractors. Skeptics sometimes charge that the thesis is
crypto-religious, with an irrational belief in the possibility of
superintelligence replacing an irrational belief in an omnipotent God;
at an extreme, Jaron Lanier
argued in 2014 that the whole concept that then current machines were
in any way intelligent was "an illusion" and a "stupendous con" by the
wealthy.
Much of existing criticism argues that AGI is unlikely in the short term. Computer scientist Gordon Bell argues that the human race will already destroy itself before it reaches the technological singularity. Gordon Moore, the original proponent of Moore's Law,
declares that "I am a skeptic. I don't believe [a technological
singularity] is likely to happen, at least for a long time. And I don't
know why I feel that way." Baidu Vice President Andrew Ng states AI existential risk is "like worrying about overpopulation on Mars when we have not even set foot on the planet yet."
Some AI and AGI researchers may be reluctant to discuss risks,
worrying that policymakers do not have sophisticated knowledge of the
field and are prone to be convinced by "alarmist" messages, or worrying
that such messages will lead to cuts in AI funding. Slate notes that some researchers are dependent on grants from government agencies such as DARPA.
At some point in an intelligence explosion driven by a single AI,
the AI would have to become vastly better at software innovation than
the best innovators of the rest of the world; economist Robin Hanson is skeptical that this is possible.
"How Normal Am I" is an interactive experience created by Tijmen Schep as a part of the European Union's Sherpa Research Project to allow people to explore the ways that artificial intelligence and algorithms perceive them. The tool, released to the public in October 2020, uses algorithms to determine the user's age, gender, BMI, life expectancy, and overall normalcy score, as well as giving a comprehensive "beauty score", based on algorithms that rank attractiveness. The algorithms that were used to give a "beauty score" can be found on Github here and here. The algorithms used in predicting age, gender, and facial expression are from FaceApiJS. The BMI prediction algorithm was created by Tijmen Schep specifically for this project.
Intermediate views
Intermediate
views generally take the position that the control problem of
artificial general intelligence may exist, but that it will be solved
via progress in artificial intelligence, for example by creating a moral
learning environment for the AI, taking care to spot clumsy malevolent
behavior (the 'sordid stumble') and then directly intervening in the code before the AI refines its behavior, or even peer pressure from friendly AIs. In a 2015 Wall Street Journal panel discussion devoted to AI risks, IBM's
Vice-President of Cognitive Computing, Guruduth S. Banavar, brushed off
discussion of AGI with the phrase, "it is anybody's speculation." Geoffrey Hinton,
the "godfather of deep learning", noted that "there is not a good track
record of less intelligent things controlling things of greater
intelligence", but stated that he continues his research because "the
prospect of discovery is too sweet". In 2004, law professor Richard Posner
wrote that dedicated efforts for addressing AI can wait, but that we
should gather more information about the problem in the meanwhile.
Popular reaction
In a 2014 article in The Atlantic,
James Hamblin noted that most people do not care one way or the other
about artificial general intelligence, and characterized his own gut
reaction to the topic as: "Get out of here. I have a hundred thousand
things I am concerned about at this exact moment. Do I seriously need to
add to that a technological singularity?"
During a 2016 Wired interview of President Barack Obama and MIT Media Lab's Joi Ito, Ito stated:
There
are a few people who believe that there is a fairly high-percentage
chance that a generalized AI will happen in the next 10 years. But the
way I look at it is that in order for that to happen, we're going to
need a dozen or two different breakthroughs. So you can monitor when you
think these breakthroughs will happen.
Obama added:
And you just have to have somebody
close to the power cord. [Laughs.] Right when you see it about to
happen, you gotta yank that electricity out of the wall, man.
Technologists... have warned that
artificial intelligence could one day pose an existential security
threat. Musk has called it "the greatest risk we face as a
civilization". Think about it: Have you ever seen a movie where the
machines start thinking for themselves that ends well? Every time I went
out to Silicon Valley during the campaign, I came home more alarmed
about this. My staff lived in fear that I’d start talking about "the
rise of the robots" in some Iowa town hall. Maybe I should have. In any
case, policy makers need to keep up with technology as it races ahead,
instead of always playing catch-up.
In a YouGov poll of the public for the British Science Association, about a third of survey respondents said AI will pose a threat to the long term survival of humanity.
Referencing a poll of its readers, Slate's Jacob Brogan stated that
"most of the (readers filling out our online survey) were unconvinced
that A.I. itself presents a direct threat."
In 2018, a SurveyMonkey poll of the American public by USA Today
found 68% thought the real current threat remains "human intelligence";
however, the poll also found that 43% said superintelligent AI, if it
were to happen, would result in "more harm than good", and 38% said it
would do "equal amounts of harm and good".
One techno-utopian viewpoint expressed in some popular fiction is that AGI may tend towards peace-building.
Mitigation
Researchers at Google have proposed research into general "AI safety"
issues to simultaneously mitigate both short-term risks from narrow AI
and long-term risks from AGI. A 2020 estimate places global spending on AI existential risk somewhere
between $10 and $50 million, compared with global spending on AI around
perhaps $40 billion. Bostrom suggests a general principle of
"differential technological development", that funders should consider
working to speed up the development of protective technologies relative
to the development of dangerous ones. Some funders, such as Elon Musk, propose that radical human cognitive enhancement
could be such a technology, for example through direct neural linking
between man and machine; however, others argue that enhancement
technologies may themselves pose an existential risk. Researchers, if they are not caught off-guard, could closely monitor or attempt to box in
an initial AI at a risk of becoming too powerful, as an attempt at a
stop-gap measure. A dominant superintelligent AI, if it were aligned
with human interests, might itself take action to mitigate the risk of
takeover by rival AI, although the creation of the dominant AI could
itself pose an existential risk.
There
is nearly universal agreement that attempting to ban research into
artificial intelligence would be unwise, and probably futile.
Skeptics argue that regulation of AI would be completely valueless, as
no existential risk exists. Almost all of the scholars who believe
existential risk exists agree with the skeptics that banning research
would be unwise, as research could be moved to countries with looser
regulations or conducted covertly. The latter issue is particularly
relevant, as artificial intelligence research can be done on a small
scale without substantial infrastructure or resources.
Two additional hypothetical difficulties with bans (or other
regulation) are that technology entrepreneurs statistically tend towards
general skepticism about government regulation, and that businesses
could have a strong incentive to (and might well succeed at) fighting
regulation and politicizing the underlying debate.
Regulation
Elon Musk called for some sort of regulation of AI development as early as 2017. According to NPR, the Tesla
CEO is "clearly not thrilled" to be advocating for government scrutiny
that could impact his own industry, but believes the risks of going
completely without oversight are too high: "Normally the way regulations
are set up is when a bunch of bad things happen, there's a public
outcry, and after many years a regulatory agency is set up to regulate
that industry. It takes forever. That, in the past, has been bad but not
something which represented a fundamental risk to the existence of
civilisation." Musk states the first step would be for the government to
gain "insight" into the actual status of current research, warning that
"Once there is awareness, people will be extremely afraid...
[as] they should be." In response, politicians express skepticism about
the wisdom of regulating a technology that's still in development.
Responding both to Musk and to February 2017 proposals by European Union lawmakers to regulate AI and robotics, Intel CEO Brian Krzanich argues that artificial intelligence is in its infancy and that it is too early to regulate the technology.
Instead of trying to regulate the technology itself, some scholars
suggest to rather develop common norms including requirements for the
testing and transparency of algorithms, possibly in combination with
some form of warranty. Developing well regulated weapons systems is in line with the ethos of some countries' militaries.
On October 31, 2019, the United States Department of Defense's (DoD's)
Defense Innovation Board published the draft of a report outlining five
principles for weaponized AI and making 12 recommendations for the
ethical use of artificial intelligence by the DoD that seeks to manage
the control problem in all DoD weaponized AI.
Regulation of AGI would likely be influenced by regulation of weaponized or militarized AI, i.e., the AI arms race,
the regulation of which is an emerging issue. Any form of regulation
will likely be influenced by developments in leading countries' domestic
policy towards militarized AI, in the US under the purview of the
National Security Commission on Artificial Intelligence,
and international moves to regulate an AI arms race. Regulation of
research into AGI focuses on the role of review boards and encouraging
research into safe AI, and the possibility of differential technological
progress (prioritizing risk-reducing strategies over risk-taking
strategies in AI development) or conducting international mass
surveillance to perform AGI arms control.
Regulation of conscious AGIs focuses on integrating them with existing
human society and can be divided into considerations of their legal
standing and of their moral rights.
AI arms control will likely require the institutionalization of new
international norms embodied in effective technical specifications
combined with active monitoring and informal diplomacy by communities of
experts, together with a legal and political verification process.