Military robots are autonomous robots or remote-controlled mobile robots designed for military applications, from transport to search & rescue and attack.
Some such systems are currently in use, and many are under development.
History
Broadly defined, military robots date back to World War II and the Cold War in the form of the German Goliath tracked mines and the Soviet teletanks. With the introduction of the MQ-1 Predator drone, "CIA officers began to see the first practical returns on their decade-old fantasy of using aerial robots to collect intelligence".
The use of robots in warfare, although traditionally a topic for science fiction, is being researched as a possible future means of fighting wars.
Already several military robots have been developed by various armies.
Some believe the future of modern warfare will be fought by automated weapons systems. The U.S. military has invested heavily in the MQ-1 Predator, which can be armed with air-to-ground missiles and remotely operated from a command center for reconnaissance and strike roles. DARPA hosted competitions in 2004 and 2005 that challenged private companies and universities to develop unmanned ground vehicles capable of navigating rough terrain in the Mojave Desert, for a final prize of $2 million.
Artillery has seen promising research with an experimental weapons system named "Dragon Fire II"
which automates loading and ballistics calculations required for
accurate predicted fire, providing a 12-second response time to fire support
requests. However, military weapons are prevented from being fully
autonomous: they require human input at certain intervention points to
ensure that targets are not within restricted fire areas as defined by
the Geneva Conventions and the laws of war.
There have also been developments toward autonomous fighter jets and bombers.
The use of autonomous fighters and bombers to destroy enemy targets is
especially promising: robotic pilots require no training, autonomous planes
can perform maneuvers that could not be performed with human pilots on board
(due to the high g-forces involved), airframes need no life-support systems,
and the loss of a plane does not mean the loss of a pilot. The largest
drawback of such robotics, however, is their inability to accommodate
non-standard conditions.
Advances in artificial intelligence may help to rectify this in the near future.
In 2020 a Kargu 2 drone hunted down and attacked a human target in Libya, according to a report from the UN Security Council’s Panel of Experts on Libya, published in March 2021. This may have been the first time an autonomous killer robot armed with lethal weaponry attacked human beings.
[Image: PLA robot soldiers deployed on the China-India border]
In development
MIDARS,
a four-wheeled robot outfitted with several cameras, radar, and
possibly a firearm, that automatically performs random or preprogrammed
patrols around a military base or other government installation. It
alerts a human overseer when it detects movement in unauthorized areas,
or other programmed conditions. The operator can then instruct the robot
to ignore the event, or take over remote control to deal with an
intruder, or to get better camera views of an emergency. The robot would
also regularly scan radio-frequency identification (RFID) tags placed
on stored inventory as it passed and report any missing items.
Tactical Autonomous Combatant (TAC) units, described in Project Alpha study Unmanned Effects: Taking the Human out of the Loop.
Autonomous Rotorcraft Sniper System is an experimental robotic weapons system being developed by the U.S. Army since 2005. It consists of a remotely operated sniper rifle attached to an unmanned autonomous helicopter. It is intended for use in urban combat or for several other missions requiring snipers. Flight tests are scheduled to begin in summer 2009.
The "Mobile Autonomous Robot Software" research program was started in December 2003 by the Pentagon who purchased 15 Segways in an attempt to develop more advanced military robots. The program was part of a $26 million Pentagon program to develop software for autonomous systems.
Teng Yun medium size reconnaissance UAV program, Taiwan
Effects and impact
Advantages
Autonomous robotics would preserve soldiers' lives by removing serving
soldiers, who might otherwise be killed, from the battlefield. Lt. Gen.
Richard Lynch of the United States Army Installation Management Command and assistant Army chief of staff for installation stated at a 2011 conference:
As I think about what’s happening
on the battlefield today ... I contend there are things we could do to
improve the survivability of our service members. And you all know
that’s true.
Major Kenneth Rose of the US Army's Training and Doctrine Command
outlined some of the advantages of robotic technology in warfare:
Machines don't get tired. They
don't close their eyes. They don't hide under trees when it rains and
they don't talk to their friends ... A human's attention to detail on
guard duty drops dramatically in the first 30 minutes ... Machines know
no fear.
Increasing attention is also being paid to making robots more
autonomous, with a view to eventually allowing them to operate on their
own for extended periods, possibly behind enemy lines. For such roles,
systems like the Energetically Autonomous Tactical Robot, which is
intended to gather its own energy by foraging for plant matter, are being
trialled. The majority of military robots are tele-operated and
not equipped with weapons; they are used for reconnaissance,
surveillance, sniper detection, neutralizing explosive devices, and similar tasks.
Current robots that are equipped with weapons are tele-operated, so they
are not capable of taking lives autonomously.
The lack of emotion and passion in robotic combat is also considered a
beneficial factor, one that could significantly reduce instances of
unethical behavior in wartime. Autonomous machines are created not to be
"truly 'ethical' robots", but ones that comply with the laws of war (LOW)
and rules of engagement (ROE). The fatigue, stress, emotion, and adrenaline
that drive a human soldier's rash decisions are thus removed; the battlefield
is not affected by impulsive individual decisions.
Human rights groups and NGOs such as Human Rights Watch and the Campaign to Stop Killer Robots have started urging governments and the United Nations to issue policy to outlaw the development of so-called "lethal autonomous weapons systems" (LAWS). The United Kingdom opposed such campaigns, with the Foreign Office declaring that "international humanitarian law already provides sufficient regulation for this area".
American
soldiers have been known to name the robots that serve alongside them.
These names are often in honor of human friends, family, celebrities,
pets, or are eponymic. The 'gender' assigned to the robot may be related to the marital status of its operator.
Some soldiers have affixed fictitious medals to battle-hardened robots and even held funerals for destroyed ones.
Interviews with 23 explosive ordnance detection team members show that,
while they feel it is better to lose a robot than a human, they also
feel anger and a sense of loss when their robots are destroyed.
A survey of 746 people in the military showed that 80% either 'liked'
or 'loved' their military robots, with more affection being shown
towards ground rather than aerial robots.
Surviving dangerous combat situations together increased the level of
bonding between soldier and robot, and current and future advances in artificial intelligence may further intensify the bond with the military robots.
Machine ethics (or machine morality, computational morality, or computational ethics) is a part of the ethics of artificial intelligence concerned with adding or ensuring moral behaviors of man-made machines that use artificial intelligence, otherwise known as artificial intelligent agents. Machine ethics differs from other ethical fields related to engineering and technology. Machine ethics should not be confused with computer ethics, which focuses on human use of computers. It should also be distinguished from the philosophy of technology, which concerns itself with the grander social effects of technology.
Ethical impact agents: These are machine systems that
carry an ethical impact whether intended or not; at the same time, such
agents have the potential to act unethically. Moor gives a hypothetical
example called the 'Goodman agent', named after philosopher Nelson Goodman. The Goodman agent compares dates but has the millennium bug.
This bug resulted from programmers representing dates with only the
last two digits of the year, so any dates beyond 2000 would be
misleadingly treated as earlier than those in the late twentieth
century. Thus the Goodman agent was an ethical impact agent before 2000,
and an unethical impact agent thereafter (a minimal code sketch of this bug follows the list of agent types).
Implicit ethical agents: Out of consideration for human safety, these agents are programmed with fail-safes, or built-in virtues. They are not ethical in any full sense, but are programmed to avoid unethical outcomes.
Explicit ethical agents: These are machines capable of processing scenarios and acting on ethical decisions; that is, machines with algorithms that enable them to act ethically.
Full ethical agents: Like explicit ethical agents, these machines can make ethical decisions. However, they also possess human metaphysical features (i.e. free will, consciousness, and intentionality).
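As a concrete picture of the Goodman agent's flaw, here is a minimal, hypothetical sketch (not from Moor's paper) of a two-digit-year comparison that misorders any post-2000 date:

```python
from datetime import date

def is_earlier_two_digit(a: date, b: date) -> bool:
    """Buggy comparison in the spirit of the 'Goodman agent':
    dates are reduced to two-digit years before comparing."""
    return a.year % 100 < b.year % 100

# 2001 is wrongly judged earlier than 1999, because 01 < 99.
print(is_earlier_two_digit(date(2001, 1, 1), date(1999, 1, 1)))  # True (incorrect)
print(date(2001, 1, 1) < date(1999, 1, 1))                       # False (correct)
```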
Before the 21st century the ethics of machines had largely been the
subject of science fiction literature, mainly due to computing and artificial intelligence
(AI) limitations. Although the definition of "Machine Ethics" has
evolved since, the term was coined by Mitchell Waldrop in the 1987 AI
Magazine article "A Question of Responsibility":
"However,
one thing that is apparent from the above discussion is that
intelligent machines will embody values, assumptions, and purposes,
whether their programmers consciously intend them to or not. Thus, as
computers and robots become more and more intelligent, it becomes
imperative that we think carefully and explicitly about what those
built-in values are. Perhaps what we need is, in fact, a theory and
practice of machine ethics, in the spirit of Asimov's three laws of robotics."
In 2004, Towards Machine Ethics was presented at the AAAI Workshop on Agent Organizations: Theory and Practice in which theoretical foundations for machine ethics were laid out.
It was in the AAAI Fall 2005 Symposium on Machine Ethics where
researchers met for the first time to consider implementation of an
ethical dimension in autonomous systems. A variety of perspectives of this nascent field can be found in the collected edition Machine Ethics that stems from the AAAI Fall 2005 Symposium on Machine Ethics.
In 2007, AI Magazine featured Machine Ethics: Creating an Ethical Intelligent Agent,
an article that
discussed the importance of machine ethics, the need for machines that
represent ethical principles explicitly, and the challenges facing those
working on machine ethics. It also demonstrated that it is possible, at
least in a limited domain, for a machine to abstract an ethical
principle from examples of ethical judgments and use that principle to
guide its own behavior.
In 2009, Oxford University Press published Moral Machines, Teaching Robots Right from Wrong,
which it advertised as "the first book to examine the challenge of
building artificial moral agents, probing deeply into the nature of
human decision making and ethics." It cited some 450 sources, about 100
of which addressed major questions of machine ethics.
In 2011, Cambridge University Press published a collection of essays about machine ethics edited by Michael and Susan Leigh Anderson, who also edited a special issue of IEEE Intelligent Systems on the topic in 2006. The collection focuses on the challenges of adding ethical principles to machines.
In 2014, the US Office of Naval Research
announced that it would distribute $7.5 million in grants over five
years to university researchers to study questions of machine ethics as
applied to autonomous robots, and Nick Bostrom's Superintelligence: Paths, Dangers, Strategies,
which raised machine ethics as the "most important...issue humanity has
ever faced," reached #17 on the New York Times list of best selling
science books.
In 2016 the European Parliament published a 22-page paper encouraging
the Commission to address the issue of robots' legal status, as described more briefly in the press.
The paper included sections on the legal liability of robots, arguing
that liability should be proportional to a robot's level of autonomy.
It also raised the question of how many jobs could be replaced by AI robots.
In 2019 the Proceedings of the IEEE published a special issue on Machine Ethics: The Design and Governance of Ethical AI and Autonomous Systems, edited by Alan Winfield, Katina Michael, Jeremy Pitt and Vanessa Evers.
"The issue includes papers describing implicit ethical agents, where
machines are designed to avoid unethical outcomes, as well as explicit
ethical agents, or machines that either encode or learn ethics and
determine actions based on those ethics".
Some scholars, such as philosopher Nick Bostrom and AI researcher Stuart Russell, argue that if AI surpasses humanity in general intelligence and becomes "superintelligent", then this new superintelligence could become powerful and difficult to control: just as the fate of the mountain gorilla depends on human goodwill, so might the fate of humanity depend on the actions of a future machine superintelligence. In their respective books Superintelligence and Human Compatible,
both scholars assert that while there is much uncertainty regarding the
future of AI, the risk to humanity is great enough to merit significant
action in the present.
This presents the AI control problem:
how to build an intelligent agent that will aid its creators, while
avoiding inadvertently building a superintelligence that will harm its
creators. The danger of not getting
control right "the first time" is that a superintelligence may be able
to seize power over its environment and prevent humans from shutting it
down. Potential AI control strategies include "capability control"
(limiting an AI's ability to influence the world) and "motivational
control" (building an AI whose goals are aligned with human or optimal values). A number of organizations are researching the AI control problem, including the Future of Humanity Institute, the Machine Intelligence Research Institute, the Center for Human-Compatible Artificial Intelligence, and the Future of Life Institute.
Algorithms and training
AI paradigms have been debated over, especially in relation to their efficacy and bias. Nick Bostrom and Eliezer Yudkowsky have argued for decision trees (such as ID3) over neural networks and genetic algorithms on the grounds that decision trees obey modern social norms of transparency and predictability (e.g. stare decisis).
In contrast, Chris Santos-Lang argued in favor of neural networks and
genetic algorithms on the grounds that the norms of any age must be
allowed to change and that natural failure to fully satisfy these
particular norms has been essential in making humans less vulnerable
than machines to criminal "hackers".
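As an illustration of the transparency argument (a hypothetical sketch, not drawn from Bostrom, Yudkowsky, or Santos-Lang), a small decision tree can be printed as explicit, human-auditable rules, whereas a neural network's weights offer no comparable trace; the dataset and depth below are arbitrary choices:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Train a small decision tree and print its decision rules.
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Every prediction can be traced to an explicit if/else path,
# which is the kind of transparency the decision-tree argument appeals to.
print(export_text(tree, feature_names=list(iris.feature_names)))
```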
In 2009, in an experiment at the Laboratory of Intelligent Systems at the École Polytechnique Fédérale de Lausanne in Switzerland,
AI robots were programmed to cooperate with each other and tasked with
the goal of searching for a beneficial resource while avoiding a
poisonous resource.
During the experiment, the robots were grouped into clans, and the
successful members' digital genetic code was used for the next
generation, a type of algorithm known as a genetic algorithm. After 50
successive generations in the AI, one clan's members discovered how to
distinguish the beneficial resource from the poisonous one. The robots
then learned to lie to each other in an attempt to hoard the beneficial
resource from other robots.
In the same experiment, the same AI robots also learned to behave
selflessly, signaling danger to other robots and even dying at a cost to
themselves to save other robots.
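A minimal sketch of the selection-and-mutation loop behind such an experiment (illustrative only; the fitness function and parameters are invented and do not reproduce the EPFL setup):

```python
import random

GENOME_LEN, POP_SIZE, GENERATIONS = 16, 30, 50

def fitness(genome):
    # Stand-in fitness: reward genomes that "find food" (1s) and avoid "poison" (0s).
    return sum(genome)

def mutate(genome, rate=0.05):
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # Keep the most successful members and breed the next generation from their code.
    survivors = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    population = [mutate(random.choice(survivors)) for _ in range(POP_SIZE)]

print(max(fitness(g) for g in population))
```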
The implications of this experiment have been challenged by machine
ethicists. In the Ecole Polytechnique Fédérale experiment, the robots'
goals were programmed to be "terminal". In contrast, human motives
typically have a quality of requiring never-ending learning.
In 2009, academics and technical experts attended a conference to
discuss the potential impact of robots and computers and the impact of
the hypothetical possibility that they could become self-sufficient and
able to make their own decisions. They discussed the possibility and the
extent to which computers and robots might be able to acquire any level
of autonomy, and to what degree they could use such abilities to
possibly pose any threat or hazard. They noted that some machines have
acquired various forms of semi-autonomy, including being able to find
power sources on their own and being able to independently choose
targets to attack with weapons. They also noted that some computer
viruses can evade elimination and have achieved "cockroach
intelligence". They noted that self-awareness as depicted in
science-fiction is probably unlikely, but that there were other
potential hazards and pitfalls.
Some experts and academics have questioned the use of robots for
military combat, especially when such robots are given some degree of
autonomous functions.
The US Navy has funded a report which indicates that as military robots
become more complex, there should be greater attention to implications
of their ability to make autonomous decisions. The President of the Association for the Advancement of Artificial Intelligence has commissioned a study to look at this issue. They point to programs like the Language Acquisition Device which can emulate human interaction.
Integration of artificial general intelligences with society
Preliminary work has been conducted on methods of integrating artificial general intelligences
(full ethical agents as defined above) with existing legal and social
frameworks. Approaches have focused on consideration of their legal
position and rights.
Big data and machine learning algorithms have become popular among numerous industries including online advertising, credit ratings,
and criminal sentencing, with the promise of providing more objective,
data-driven results, but have been identified as a potential source for
perpetuating social inequalities and discrimination. A 2015 study found that women were less likely to be shown high-income job ads by Google's AdSense. Another study found that Amazon's
same-day delivery service was intentionally made unavailable in black
neighborhoods. Both Google and Amazon were unable to isolate these
outcomes to a single issue, but instead explained that the outcomes were
the result of the black box algorithms they used.
The United States judicial system has begun using quantitative risk assessment software
when making decisions related to releasing people on bail and
sentencing in an effort to be more fair and to reduce an already high imprisonment rate. These tools analyze a defendant's criminal history among other attributes. In a study of 7,000 people arrested in Broward County, Florida, only 20% of the individuals predicted to commit a crime using the county's risk assessment scoring system proceeded to commit a crime. A 2016 ProPublica report analyzed recidivism risk scores calculated by one of the most commonly used tools, the Northpointe COMPAS
system, and looked at outcomes over two years. The report found that
only 61% of those deemed high risk wound up committing additional crimes
during that period. The report also flagged that African-American
defendants were far more likely to be given high-risk scores relative to
their white defendant counterparts. Legally, such pretrial risk assessments have been argued to violate Equal Protection
rights on the basis of race, due to a number of factors including
possible discriminatory intent from the algorithm itself under a theory
of partial legal capacity for artificial intelligences.
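A hedged sketch of the kind of group-wise error-rate comparison behind such findings (the records below are made up for illustration; they are not ProPublica's data or COMPAS output):

```python
# Hypothetical pretrial records: (group, predicted_high_risk, reoffended)
records = [
    ("A", True, False), ("A", True, True), ("A", False, False), ("A", True, False),
    ("B", True, True), ("B", False, False), ("B", False, True), ("B", True, True),
]

def false_positive_rate(group):
    # Among people who did NOT reoffend, how many were labelled high risk?
    non_reoffenders = [r for r in records if r[0] == group and not r[2]]
    flagged = [r for r in non_reoffenders if r[1]]
    return len(flagged) / len(non_reoffenders)

for g in ("A", "B"):
    print(g, false_positive_rate(g))
```

A large gap between the two groups' false positive rates is the kind of disparity the report described.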
In 2016, the Obama Administration's
Big Data Working Group—an overseer of various big-data regulatory
frameworks—released reports warning of “the potential of encoding
discrimination in automated decisions” and calling for “equal
opportunity by design” for applications such as credit scoring. The reports encourage discourse among policy makers, citizens, and academics alike, but recognize that they do not offer a solution to the encoding of bias and discrimination into algorithmic systems.
Ethical frameworks and practices
Practices
In March 2018, in an effort to address rising concerns over machine learning's impact on human rights, the World Economic Forum and Global Future Council on Human Rights published a white paper with detailed recommendations on how best to prevent discriminatory outcomes in machine learning. The World Economic Forum developed four recommendations based on the UN Guiding Principles of Human Rights to help address and prevent discriminatory outcomes in machine learning.
The World Economic Forum's recommendations are as follows:
Active inclusion: the development and design of machine
learning applications must actively seek a diversity of input,
especially of the norms and values of specific populations affected by
the output of AI systems
Fairness:
People involved in conceptualizing, developing, and implementing
machine learning systems should consider which definition of fairness
best applies to their context and application, and prioritize it in the
architecture of the machine learning system and its evaluation metrics
Right to understanding: Involvement of machine learning
systems in decision-making that affects individual rights must be
disclosed, and the systems must be able to provide an explanation of
their decision-making that is understandable to end users and reviewable
by a competent human authority. Where this is impossible and rights are
at stake, leaders in the design, deployment, and regulation of machine
learning technology must question whether or not it should be used
Access to redress: Leaders, designers, and developers of
machine learning systems are responsible for identifying the potential
negative human rights impacts of their systems. They must make visible
avenues for redress for those affected by disparate impacts, and
establish processes for the timely redress of any discriminatory
outputs.
In January 2020, Harvard University's Berkman Klein Center for Internet and Society
published a meta-study of 36 prominent sets of principles for AI,
identifying eight key themes: privacy, accountability, safety and
security, transparency and explainability, fairness and non-discrimination, human control of technology, professional responsibility, and promotion of human values. A similar meta-study was conducted by researchers from the Swiss Federal Institute of Technology in Zurich in 2019.
Approaches
There have been several attempts to make ethics computable, or at least formal. Whereas Isaac Asimov's Three Laws of Robotics are usually not considered to be suitable for an artificial moral agent, it has been studied whether Kant's categorical imperative can be used. However, it has been pointed out that human value is, in some aspects, very complex.
One way to surmount this difficulty is to receive human values
directly from humans through some mechanism, for example by
learning them. Another approach is to base current ethical decisions on previous similar situations. This is called casuistry,
and it could be implemented through research on the Internet: the
consensus drawn from a million past decisions would lead to a new,
democratically grounded decision. Bruce M. McLaren
built an early (mid-1990s) computational model of casuistry,
a program called SIROCCO that uses AI and case-based
reasoning techniques to retrieve and analyze ethical dilemmas.
This approach could, however, lead to decisions that reflect biases and
unethical behaviors exhibited in society. The negative effects of this
approach can be seen in Microsoft's Tay, where the chatterbot learned to repeat racist and sexually charged messages sent by Twitter users.
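A minimal sketch of the case-retrieval idea behind such casuistry-based systems (hypothetical; it does not reproduce SIROCCO): a new dilemma is matched against past cases by textual similarity and the closest case's recorded resolution is reused.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy case base: (description of past dilemma, recorded resolution)
cases = [
    ("engineer discovers safety flaw before product launch", "report the flaw"),
    ("researcher asked to omit negative results from a paper", "disclose all results"),
    ("employee pressured to share confidential client data", "refuse and escalate"),
]

def most_similar_case(query):
    texts = [c[0] for c in cases] + [query]
    tfidf = TfidfVectorizer().fit_transform(texts)
    sims = cosine_similarity(tfidf[len(cases)], tfidf[:len(cases)])[0]
    return cases[sims.argmax()]

print(most_similar_case("tester finds a dangerous defect shortly before release"))
```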
One thought experiment focuses on a Genie Golem with
unlimited powers presenting itself to the reader. This Genie declares
that it will return in 50 years and demands that it be provided with a
definite set of morals that it will then immediately act upon. The
purpose of this experiment is to initiate a discourse over how best to
handle defining a complete set of ethics that computers may understand.
In fiction
In science fiction, movies and novels have played with the idea of sentience in robots and machines.
Neill Blomkamp's Chappie (2015) enacted a scenario in which consciousness can be transferred into a computer. Alex Garland's Ex Machina (2014) followed an android with artificial intelligence undergoing a variation of the Turing test, a test administered to a machine to see whether its behavior can be distinguished from that of a human. Works such as The Terminator (1984) and The Matrix (1999) incorporate the concept of machines turning on their human masters (see Artificial intelligence).
Isaac Asimov considered the issue in the 1950s in I, Robot. At the insistence of his editor John W. Campbell Jr.,
he proposed the Three Laws of Robotics to govern artificially
intelligent systems. Much of his work was then spent testing the
boundaries of his three laws to see where they would break down, or
where they would create paradoxical or unanticipated behavior. His work
suggests that no set of fixed laws can sufficiently anticipate all
possible circumstances. In Philip K. Dick's novel, Do Androids Dream of Electric Sheep?
(1968), he explores what it means to be human. In his post-apocalyptic
scenario, he questioned if empathy was an entirely human characteristic.
His story is the basis for the science fiction film, Blade Runner (1982).
Natural language generation (NLG) is a software process that produces natural language
output. A widely cited survey of NLG methods describes NLG as "the
subfield of artificial intelligence and computational linguistics that
is concerned with the construction of computer systems that can produce
understandable texts in English or other human languages from some
underlying non-linguistic representation of information".
While it is widely agreed that the output of any NLG process is
text, there is some disagreement about whether the inputs of an NLG
system need to be non-linguistic. Common applications of NLG methods include the production of various reports, for example weather and patient reports; image captions; and chatbots.
Automated NLG can be compared to the process humans use when they turn ideas into writing or speech. Psycholinguists prefer the term language production
for this process, which can also be described in mathematical terms, or
modeled in a computer for psychological research. NLG systems can also
be compared to translators of artificial computer languages, such as decompilers or transpilers, which also produce human-readable code generated from an intermediate representation.
Human languages tend to be considerably more complex and allow for much
more ambiguity and variety of expression than programming languages,
which makes NLG more challenging.
NLG may be viewed as complementary to natural-language understanding
(NLU): whereas in natural-language understanding, the system needs to
disambiguate the input sentence to produce the machine representation
language, in NLG the system needs to make decisions about how to put a
representation into words. The practical considerations in building NLU
vs. NLG systems are not symmetrical. NLU needs to deal with ambiguous or
erroneous user input, whereas the ideas the system wants to express
through NLG are generally known precisely. NLG needs to choose a
specific, self-consistent textual representation from many potential
representations, whereas NLU generally tries to produce a single,
normalized representation of the idea expressed.
NLG has existed since ELIZA was developed in the mid 1960s, but the methods were first used commercially in the 1990s. NLG techniques range from simple template-based systems like a mail merge that generates form letters,
to systems that have a complex understanding of human grammar. NLG can
also be accomplished by training a statistical model using machine learning, typically on a large corpus of human-written texts.
Example
The Pollen Forecast for Scotland system
is a simple example of an NLG system that is essentially a
template. This system takes as input six numbers, which give predicted
pollen levels in different parts of Scotland. From these numbers, the
system generates a short textual summary of pollen levels as its output.
For example, using the historical data for July 1, 2005, the software produces:
Grass pollen levels for Friday have increased from the moderate to
high levels of yesterday with values of around 6 to 7 across most parts
of the country. However, in Northern areas, pollen levels will be
moderate with values of 4.
In contrast, the actual forecast (written by a human meteorologist) from this data was:
Pollen counts are expected to remain high at level 6 over most of
Scotland, and even level 7 in the south east. The only relief is in the
Northern Isles and far northeast of mainland Scotland with medium levels
of pollen count.
Comparing these two illustrates some of the choices that NLG systems must make; these are further discussed below.
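A minimal template-based sketch in the spirit of such a system (the thresholds, wording, and input layout are invented for illustration; this is not the actual Pollen Forecast software):

```python
def pollen_summary(levels):
    """levels: six predicted pollen values, e.g. [6, 6, 7, 7, 6, 4],
    with the last value standing in for northern areas."""
    south = levels[:-1]
    north = levels[-1]
    band = "high" if max(south) >= 6 else "moderate"
    text = (f"Grass pollen levels for Friday will be {band}, with values of around "
            f"{min(south)} to {max(south)} across most parts of the country.")
    if north < min(south):
        text += f" However, in Northern areas, pollen levels will be moderate with values of {north}."
    return text

print(pollen_summary([6, 6, 7, 7, 6, 4]))
```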
Stages
The
process to generate text can be as simple as keeping a list of canned
text that is copied and pasted, possibly linked with some glue text. The
results may be satisfactory in simple domains such as horoscope
machines or generators of personalised business letters. However, a
sophisticated NLG system needs to include stages of planning and merging
of information to enable the generation of text that looks natural and
does not become repetitive. The typical stages of natural-language
generation, as proposed by Dale and Reiter, are the following (a minimal pipeline sketch follows the list):
Content determination: Deciding what information to mention in the text.
For instance, in the pollen example above, deciding whether to explicitly mention that pollen
level is 7 in the south east.
Document structuring: Overall organisation of the information to convey. For example, deciding to
describe the areas with high pollen levels first, instead of the areas with low pollen levels.
Aggregation: Merging of similar sentences to improve readability and naturalness.
For instance, merging the two following sentences:
Grass pollen levels for Friday have increased from the moderate to high levels of yesterday and
Grass pollen levels will be around 6 to 7 across most parts of the country
into the following single sentence:
Grass pollen levels for Friday have increased from the
moderate to high levels of yesterday with values of around 6 to 7 across
most parts of the country.
Lexical choice: Putting words to the concepts. For example, deciding whether medium or moderate
should be used when describing a pollen level of 4.
Referring expression generation: Creating referring expressions that identify objects and regions. For example, deciding to use
in the Northern Isles and far northeast of mainland Scotland to refer to a certain region in Scotland.
This task also includes making decisions about pronouns and other types of
anaphora.
Realization: Creating the actual text, which should be correct
according to the rules of
syntax, morphology, and orthography. For example, using will be for the future
tense of to be.
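A minimal sketch of these stages as a pipeline, reusing the pollen scenario (the function names, thresholds, and wording are illustrative assumptions, not from Dale and Reiter):

```python
def determine_content(levels):
    # Content determination: keep only the facts worth mentioning.
    return {"max": max(levels), "north": levels[-1]}

def structure_document(facts):
    # Document structuring: country-wide level first, northern areas second.
    plan = ["country_level"]
    if facts["north"] < facts["max"]:
        plan.append("north_level")
    return plan

def lexicalise(value):
    # Lexical choice: pick a word for a numeric level.
    return "high" if value >= 6 else "moderate"

def realise(plan, facts):
    # Aggregation, referring-expression generation and realization,
    # collapsed into one crude step here.
    sentences = []
    if "country_level" in plan:
        sentences.append(f"Grass pollen levels will be {lexicalise(facts['max'])} "
                         "across most parts of the country.")
    if "north_level" in plan:
        sentences.append(f"In Northern areas, levels will be {lexicalise(facts['north'])}.")
    return " ".join(sentences)

facts = determine_content([6, 6, 7, 7, 6, 4])
print(realise(structure_document(facts), facts))
```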
An alternative approach to NLG is to use "end-to-end" machine
learning to build a system, without having separate stages as above. In other words, an NLG system is built by training a machine learning algorithm (often an LSTM)
on a large dataset of input data and corresponding (human-written)
output texts. The end-to-end approach has perhaps been most successful
in image captioning, that is, automatically generating a textual caption for an image.
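A toy PyTorch sketch of such an end-to-end generator (an illustrative assumption, not any specific published system): a record of input numbers conditions an LSTM that predicts the output text token by token.

```python
import torch
import torch.nn as nn

class DataToTextLSTM(nn.Module):
    """Toy end-to-end generator: encodes a numeric input record and
    decodes a token sequence with an LSTM (illustrative only)."""
    def __init__(self, n_features, vocab_size, hidden=128):
        super().__init__()
        self.encoder = nn.Linear(n_features, hidden)
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, data, tokens):
        # Use the encoded data record as the LSTM's initial hidden state.
        h0 = torch.tanh(self.encoder(data)).unsqueeze(0)   # (1, batch, hidden)
        c0 = torch.zeros_like(h0)
        emb = self.embed(tokens)                            # (batch, seq, hidden)
        hidden_states, _ = self.lstm(emb, (h0, c0))
        return self.out(hidden_states)                      # logits per position

model = DataToTextLSTM(n_features=6, vocab_size=1000)
dummy_data = torch.rand(2, 6)              # e.g. six pollen numbers per example
dummy_tokens = torch.randint(0, 1000, (2, 12))
logits = model(dummy_data, dummy_tokens)   # shape: (2, 12, 1000)
```

In practice such a model would be trained with a cross-entropy loss over a large corpus of (data, text) pairs.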
Applications
Automatic report generation
From a commercial perspective, the most successful NLG applications
have been data-to-text systems which generate textual summaries of databases and data sets; these
systems usually perform data analysis
as well as text generation. Research has shown that textual summaries
can be more effective than graphs and other visuals for decision
support and that computer-generated texts can be superior (from the reader's perspective) to human-written texts.
The first commercial data-to-text systems produced weather
forecasts from weather data. The earliest such system to be deployed was
FoG,
which was used by Environment Canada to generate weather forecasts in
French and English in the early 1990s. The success of FoG triggered
other work, both research and commercial. Recent applications include
the UK Met Office's text-enhanced forecast.
Data-to-text systems have since been applied in a range of
settings. Following the minor earthquake near Beverly Hills, California
on March 17, 2014, The Los Angeles Times reported details about the
time, location and strength of the quake within 3 minutes of the event.
This report was automatically generated by a 'robo-journalist', which
converted the incoming data into text via a preset template. Currently there is considerable commercial interest in using NLG to summarise financial and business data. Indeed, Gartner has said that NLG will become a standard feature of 90% of modern BI and analytics platforms. NLG is also being used commercially in automated journalism, chatbots, generating product descriptions for e-commerce sites, summarising medical records and enhancing accessibility (for example by describing graphs and data sets to blind people).
An example of an interactive use of NLG is the WYSIWYM framework. It stands for What you see is what you meant
and allows users to see and manipulate the continuously rendered view
(NLG output) of an underlying formal language document (NLG input),
thereby editing the formal language without learning it.
Looking ahead, the current progress in data-to-text generation
paves the way for tailoring texts to specific audiences. For example,
data from babies in neonatal care can be converted into text differently
in a clinical setting, with different levels of technical detail and
explanatory language, depending on the intended recipient of the text
(doctor, nurse, patient). The same idea can be applied in a sports
setting, with different reports generated for fans of specific teams.
Image captioning
Over the past few years, there has been an increased interest in automatically generating captions
for images, as part of a broader endeavor to investigate the interface
between vision and language. A case of data-to-text generation, the
algorithm of image captioning (or automatic image description) involves
taking an image, analyzing its visual content, and generating a textual
description (typically a sentence) that verbalizes the most prominent
aspects of the image.
An image captioning system involves two sub-tasks. In Image
Analysis, features and attributes of an image are detected and labelled,
before mapping these outputs to linguistic structures. Recent research
utilizes deep learning approaches through features from a pre-trained convolutional neural network
such as AlexNet, VGG or Caffe, where caption generators use an
activation layer from the pre-trained network as their input features.
Text Generation, the second task, is performed using a wide range of
techniques. For example, in the Midge system, input images are
represented as triples consisting of object/stuff detections, action/pose
detections and spatial relations. These are subsequently mapped to
<noun, verb, preposition> triples and realized using a tree
substitution grammar.
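A minimal sketch of Midge-style realization (hypothetical; the real system uses a tree substitution grammar): detected <noun, verb, preposition> triples are slotted into a simple sentence template.

```python
# Each entry pairs an object detection with an action, a spatial relation, and a landmark.
triples = [
    ("dog", "sitting", "on", "sofa"),
    ("cat", "lying", "under", "table"),
]

def realise_caption(subject, verb, preposition, landmark):
    # Very crude surface realization from a detection triple.
    return f"A {subject} is {verb} {preposition} a {landmark}."

for t in triples:
    print(realise_caption(*t))
```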
Despite advancements, challenges and opportunities remain in
image captioning research. While the recent introduction of
Flickr30K, MS COCO, and other large datasets has enabled the training of more complex models such as neural networks, it has been argued that research in image captioning could benefit from larger and more diversified datasets.
Designing automatic measures that can mimic human judgments in
evaluating the suitability of image descriptions is another need in the
area. Other open challenges include visual question-answering (VQA), as well as the construction and evaluation of multilingual repositories for image description.
Chatbots
Another area where NLG has been widely applied is automated dialogue systems, frequently in the form of chatbots. A chatbot or chatterbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. While natural language processing
(NLP) techniques are applied in deciphering human input, NLG informs
the output part of the chatbot algorithms in facilitating real-time
dialogues.
Early chatbot systems, including Cleverbot created by Rollo Carpenter in 1988 and published in 1997, reply to questions by identifying how a human has responded to the same question in a conversation database using information retrieval (IR) techniques.
Modern chatbot systems predominantly rely on machine learning (ML)
models, such as sequence-to-sequence learning and reinforcement learning
to generate natural language output. Hybrid models have also been
explored. For example, the Alibaba shopping assistant first uses an IR
approach to retrieve the best candidates from the knowledge base, then
uses an ML-driven seq2seq model to re-rank the candidate responses and
generate the answer.
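A hedged sketch of such a retrieve-then-rerank hybrid (the knowledge base is invented and a simple overlap score stands in for the learned seq2seq re-ranker; this is not Alibaba's actual pipeline):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy knowledge base of (question, answer) pairs.
kb = [
    ("what is your return policy", "You can return items within 30 days."),
    ("how long does shipping take", "Standard shipping takes 3-5 business days."),
    ("do you ship internationally", "Yes, we ship to most countries."),
]

vectorizer = TfidfVectorizer().fit([q for q, _ in kb])

def retrieve(query, k=2):
    # IR step: fetch the k stored questions most similar to the query.
    sims = cosine_similarity(vectorizer.transform([query]),
                             vectorizer.transform([q for q, _ in kb]))[0]
    ranked = sims.argsort()[::-1][:k]
    return [kb[i] for i in ranked]

def rerank(query, candidates):
    # Stand-in for the learned re-ranker: score candidates by word overlap with the query.
    def score(qa):
        return len(set(query.split()) & set(qa[0].split()))
    return max(candidates, key=score)[1]

print(rerank("how long is shipping", retrieve("how long is shipping")))
```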
Creative writing and computational humor
Creative
language generation by NLG has been hypothesized since the field's
origins. A recent pioneer in the area is Phillip Parker, who has
developed an arsenal of algorithms capable of automatically generating
textbooks, crossword puzzles, poems and books on topics ranging from
bookbinding to cataracts.
The advent of large pretrained transformer-based language models such
as GPT-3 has also enabled breakthroughs, with such models demonstrating
recognizable ability at creative-writing tasks.
A related area of NLG application is computational humor
production. JAPE (Joke Analysis and Production Engine) is one of the
earliest large, automated humor production systems that uses a
hand-coded template-based approach to create punning riddles for
children. HAHAcronym creates humorous reinterpretations of any given
acronym, as well as proposing new fitting acronyms given some keywords.
Despite progress, many challenges remain in producing automated
creative and humorous content that rivals human output. In one experiment
on generating satirical headlines, the outputs of the best BERT-based
model were perceived as funny 9.4% of the time (compared with 38.4% for
real Onion headlines), and a GPT-2 model fine-tuned on satirical
headlines achieved 6.9%.
It has been pointed out that two main issues with humor-generation
systems are the lack of annotated data sets and the lack of formal
evaluation methods,
which could also apply to other forms of creative content generation. Some
have argued that, relative to other applications, there has been a lack of
attention to creative aspects of language production within NLG. NLG
researchers stand to benefit from insights into what constitutes
creative language production, as well as structural features of
narrative that have the potential to improve NLG output even in
data-to-text systems.
Evaluation
As
in other scientific fields, NLG researchers need to test how well their
systems, modules, and algorithms work. This is called evaluation. There are three basic techniques for evaluating NLG systems:
Task-based (extrinsic) evaluation: give the generated
text to a person, and assess how well it helps them perform a task (or
otherwise achieves its communicative goal). For example, a system which
generates summaries of medical data can be evaluated by giving these
summaries to doctors, and assessing whether the summaries help doctors
make better decisions.
Human ratings: give the generated text to a person, and ask them to rate the quality and usefulness of the text.
Metrics: compare generated texts to texts written by people from the same input data, using an automatic metric such as BLEU, METEOR, ROUGE, or LEPOR (a minimal BLEU computation is sketched below).
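For instance, a corpus-overlap metric like BLEU can be computed with off-the-shelf tooling; a minimal sketch using NLTK (the reference and candidate sentences are invented, and the nltk package is assumed to be installed):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Human-written reference text and a generated candidate, both tokenized.
reference = "grass pollen levels will be high across most of the country".split()
candidate = "pollen levels will be high across most parts of the country".split()

# Smoothing avoids zero scores when some higher-order n-grams do not match.
score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.2f}")
```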
The ultimate measure is how useful NLG systems are at helping people,
which is what the first of the above techniques assesses. However, task-based
evaluations are time-consuming and expensive, and can be difficult to
carry out (especially if they require subjects with specialised
expertise, such as doctors). Hence (as in other areas of NLP)
task-based evaluations are the exception, not the norm.
Researchers have recently begun assessing how well human ratings and
metrics correlate with (that is, predict) task-based evaluations. Work is being
conducted in the context of Generation Challenges
shared-task events. Initial results suggest that human ratings are
much better than metrics in this regard: human ratings usually predict
task-effectiveness at least to some degree (although there are exceptions),
while ratings produced by metrics often do not predict task-effectiveness
well. These results are preliminary. In any case, human ratings are the most
popular evaluation technique in NLG; this is in contrast to machine
translation, where metrics are widely used.
An AI can be graded on faithfulness to its training data or, alternatively, on factuality.
A response that reflects the training data but not reality is faithful
but not factual. A confident but unfaithful response is a hallucination.
In Natural Language Processing, a hallucination is often defined as
"generated content that is nonsensical or unfaithful to the provided
source content".
Industrial artificial intelligence, or industrial AI, usually refers to the application of artificial intelligence
to industry and business. Unlike general artificial intelligence which
is a frontier research discipline to build computerized systems that
perform tasks requiring human intelligence, industrial AI is more
concerned with the application of such technologies to address
industrial pain-points for customer value creation, productivity
improvement, cost reduction, site optimization, predictive analysis and insight discovery.
Artificial intelligence and machine learning
have become key enablers for leveraging data in production in recent
years due to a number of factors: more affordable sensors and automated
data acquisition; more powerful computers able to perform more complex
tasks at higher speed and lower cost; and faster connectivity
infrastructure and more accessible cloud services for data management
and outsourced computing power.
Categories
Possible
applications of industrial AI and machine learning in the production
domain can be divided into seven application areas:
Market & Trend Analysis
Machinery & Equipment
Intralogistics
Production Process
Supply Chain
Building
Product
Each application area can be further divided into specific
application scenarios that describe concrete AI/ML scenarios in
production. While some application areas have a direct connection to
production processes, others cover production adjacent fields like
logistics or the factory building.
An example from the Process Design & Innovation application scenario is the collaborative robot. Collaborative robotic arms are able to learn the motion and path demonstrated by human operators and perform the same task. Predictive and preventive maintenance through data-driven machine learning are exemplary application scenarios from the Machinery & Equipment application area.
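A hedged sketch of a data-driven predictive-maintenance step (the vibration data is simulated and IsolationForest is just one possible model choice, not a prescribed method):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Simulated vibration readings: mostly normal operation, plus a few anomalous spikes.
normal = rng.normal(loc=1.0, scale=0.1, size=(200, 1))
faulty = rng.normal(loc=2.5, scale=0.3, size=(5, 1))
readings = np.vstack([normal, faulty])

# Fit an anomaly detector on the readings; -1 marks suspected anomalies.
model = IsolationForest(contamination=0.05, random_state=0).fit(readings)
labels = model.predict(readings)
print("suspected anomalies:", int((labels == -1).sum()))
```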
Challenges
In
contrast to entirely virtual systems, in which ML applications are
already widespread today, real-world production processes are
characterized by the interaction between the virtual and the physical
world. Data is recorded using sensors and processed on computational
entities and, if desired, actions and decisions are translated back into
the physical world via actuators or by human operators.
This poses major challenges for the application of ML in production
engineering systems. These challenges are attributable to the encounter
of process, data and model characteristics: The production domain's high
reliability requirements, high risk and loss potential, the multitude
of heterogeneous data sources and the non-transparency of ML model
functionality impede a faster adoption of ML in real-world production
processes.
In particular, production data comprises a variety of different modalities, semantics and quality. Furthermore, production systems are dynamic, uncertain and complex, and engineering and manufacturing problems are data-rich but information-sparse.
Besides that, due to the variety of use cases and data characteristics,
problem-specific data sets are required, which are difficult to acquire,
hindering both practitioners and academic researchers in this domain.
Process and Industry Characteristics
The
domain of production engineering can be considered as a rather
conservative industry when it comes to the adoption of advanced
technology and their integration into existing processes. This is due to
high demands on the reliability of production systems, resulting from
the potentially high economic harm of reduced process effectiveness due
to, e.g., additional unplanned downtime
or insufficient product quality. In addition, the specifics of
machining equipment and products prevent area-wide adoptions across a
variety of processes. Besides the technical reasons, the reluctant
adoption of ML is fueled by a lack of IT and data science expertise
across the domain.
Data Characteristics
The
data collected in production processes mainly stem from frequently
sampling sensors to estimate the state of a product, a process, or the
environment in the real world. Sensor readings are susceptible to noise
and represent only an estimate of the reality under uncertainty.
Production data typically comprises multiple distributed data sources
resulting in various data modalities (e.g., images from visual quality
control systems, time-series sensor readings, or cross-sectional job and
product information). The inconsistencies in data acquisition lead to
low signal-to-noise ratios,
low data quality and great effort in data integration, cleaning and
management. In addition, as a result from mechanical and chemical wear
of production equipment, process data is subject to various forms of data drifts.
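A minimal sketch of one way such drift might be flagged (an illustrative assumption, not a standard from the field): compare a reference window of sensor readings against a recent window using a two-sample Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Reference readings from a freshly calibrated process vs. recent readings
# after simulated tool wear has shifted the signal.
reference = rng.normal(loc=0.0, scale=1.0, size=500)
recent = rng.normal(loc=0.4, scale=1.0, size=500)

stat, p_value = ks_2samp(reference, recent)
if p_value < 0.01:
    print(f"possible data drift detected (p = {p_value:.4f})")
```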
Machine Learning Model Characteristics
ML models are considered black-box systems given
their complexity and the intransparency of their input-output relations. This
reduces the comprehensibility of system behavior and thus also
acceptance by plant operators. Due to the lack of transparency and the
stochasticity of these models, no deterministic proof of functional
correctness can be achieved, which complicates the certification of production
equipment. Given their inherently unrestricted prediction behavior, ML
models are vulnerable to erroneous or manipulated data, further
risking the reliability of the production system through a lack of
robustness and safety. In addition to high development and deployment
costs, the data drifts cause high maintenance costs, which is
disadvantageous compared to purely deterministic programs.
Standard processes for data science in production
The
development of ML applications – starting with the identification and
selection of the use case and ending with the deployment and maintenance
of the application – follows dedicated phases that can be organized in
standard process models. The process models assist in structuring the
development process and defining requirements that must be met in each
phase to enter the next phase. The standard processes can be classified
into generic and domain-specific ones. Generic standard processes (e.g.,
CRISP-DM, ASUM-DM, KDD, SEMMA, or Team Data Science Process) describe a generally valid methodology and are thus independent of individual domains. Domain-specific processes on the other hand consider specific peculiarities and challenges of special application areas.
The Machine Learning Pipeline in Production is a
domain-specific data science methodology that is inspired by the
CRISP-DM model and was specifically designed to be applied in fields of
engineering and production technology.
To address the core challenges of ML in engineering – process, data,
and model characteristics – the methodology especially focuses on
use-case assessment, achieving a common data and process understanding,
data integration, data preprocessing of real-world production data, and
the deployment and certification of real-world ML applications.
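A skeletal sketch of how such a phased process model might be expressed in code (the phase names follow CRISP-DM; the contents and exit criteria are illustrative assumptions):

```python
def business_understanding():   return {"use_case": "predict unplanned downtime"}
def data_understanding(ctx):    return {**ctx, "sources": ["vibration", "temperature"]}
def data_preparation(ctx):      return {**ctx, "clean": True}
def modeling(ctx):              return {**ctx, "model": "gradient boosting"}
def evaluation(ctx):            return {**ctx, "meets_requirements": True}
def deployment(ctx):            return {**ctx, "deployed": ctx["meets_requirements"]}

# Each phase must satisfy its exit criteria before the next phase starts.
ctx = business_understanding()
for phase in (data_understanding, data_preparation, modeling, evaluation, deployment):
    ctx = phase(ctx)
print(ctx)
```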
Industrial data sources
The
foundation of most artificial intelligence and machine learning
applications in industrial settings are comprehensive datasets from the
respective fields. Those datasets act as the basis for training the
employed models. In other domains, like computer vision, speech recognition or language models, extensive reference datasets (e.g. ImageNet, Librispeech, The People's Speech) and data scraped from the open internet
are frequently used for this purpose. Such datasets rarely exist in the
industrial context because of high confidentiality requirements
and high specificity of the data. Industrial applications of artificial
intelligence are therefore often faced with the problem of data
availability.
For these reasons, existing open datasets applicable to industrial
applications often originate from public institutions, such as
governmental agencies or universities, or from data analysis competitions
hosted by companies. In addition, data sharing platforms exist.
However, most of these platforms have no industrial focus and offer
limited filtering abilities regarding industrial data sources.
The traditional consensus among economists has been that
technological progress does not cause long-term unemployment. However,
recent innovation in the fields of robotics
and artificial intelligence has raised worries that human labor will
become obsolete, leaving people in various sectors without jobs to earn a
living, leading to an economic crisis.
Many small and medium-sized businesses may also be driven out of
business if they cannot afford or license the latest robotic and AI
technology, and may need to focus on areas or services that cannot
easily be replaced in order to remain viable in the face of such
technology.
Technologies that may displace workers
AI
technologies have been widely adopted in recent years. While these
technologies have replaced some traditional workers, they also create
new opportunities. Industries that are most susceptible to AI takeover
include transportation, retail, and military. AI military technologies,
for example, allow soldiers to work remotely without risk of injury.
Author Dave Bond argues that as AI technologies continue to develop and
expand, the relationship between humans and robots will change; they
will become closely integrated in several aspects of life. AI will
likely displace some workers while creating opportunities for new jobs
in other sectors, especially in fields where tasks are repeatable.
Computer-integrated manufacturing
uses computers to control the production process. This allows
individual processes to exchange information with each other and
initiate actions. Although manufacturing can be faster and less
error-prone by the integration of computers, the main advantage is the
ability to create automated manufacturing processes. Computer-integrated
manufacturing is used in automotive, aviation, space, and ship building
industries.
The 21st century has seen a variety of skilled tasks partially taken
over by machines, including translation, legal research, and journalism.
Care work, entertainment, and other tasks requiring empathy, previously
thought safe from automation, have also begun to be performed by
robots.
Autonomous cars
An autonomous car
is a vehicle that is capable of sensing its environment and navigating
without human input. Many such vehicles are being developed, but as of
May 2017 automated cars permitted on public roads are not yet fully
autonomous. They all require a human driver at the wheel who at a
moment's notice can take control of the vehicle. Among the obstacles to
widespread adoption of autonomous vehicles are concerns about the
resulting loss of driving-related jobs in the road transport industry.
On March 18, 2018, the first human was killed by an autonomous vehicle, an Uber self-driving car, in Tempe, Arizona.
Scientists such as Stephen Hawking
are confident that superhuman artificial intelligence is physically
possible, stating "there is no physical law precluding particles from
being organised in ways that perform even more advanced computations
than the arrangements of particles in human brains". Scholars like Nick Bostrom
debate how far off superhuman intelligence is, and whether it poses a
risk to mankind. According to Bostrom, a superintelligent machine would
not necessarily be motivated by the same emotional desire to
collect power that often drives human beings but might rather treat
power as a means toward attaining its ultimate goals; taking over the
world would both increase its access to resources and help to prevent
other agents from stopping the machine's plans. As an oversimplified
example, a paperclip maximizer
designed solely to create as many paperclips as possible would want to
take over the world so that it can use all of the world's resources to
create as many paperclips as possible, and, additionally, prevent humans
from shutting it down or using those resources on things other than
paperclips.
AI takeover is a common theme in science fiction.
Fictional scenarios typically differ vastly from those hypothesized by
researchers in that they involve an active conflict between humans and
an AI or robots with anthropomorphic motives who see them as a threat or
otherwise have active desire to fight humans, as opposed to the
researchers' concern of an AI that rapidly exterminates humans as a
byproduct of pursuing its goals. The idea is seen in Karel Čapek's R.U.R., which introduced the word robot in 1921, and can be glimpsed in Mary Shelley's Frankenstein (published in 1818), as Victor ponders whether, if he grants his monster's request and makes him a wife, they would reproduce and their kind would destroy humanity. According to Toby Ord,
the idea that an AI takeover requires robots is a misconception driven
by the media and Hollywood. He argues that the most damaging humans in
history were not physically the strongest, but that they used words
instead to convince people and gain control of large parts of the world.
He writes that a sufficiently intelligent AI with access to
the internet could scatter backup copies of itself, gather financial and
human resources (via cyberattacks or blackmail), persuade people on a
large scale, and exploit societal vulnerabilities that are too subtle
for humans to anticipate.
The word "robot" from R.U.R. comes from the Czech word, robota, meaning laborer or serf.
The 1920 play was a protest against the rapid growth of technology,
featuring manufactured "robots" with increasing capabilities who
eventually revolt. HAL 9000 (1968) and the original Terminator (1984) are two iconic examples of hostile AI in pop culture.
Contributing factors
Advantages of superhuman intelligence over humans
Nick Bostrom
and others have expressed concern that an AI with the abilities of a
competent artificial intelligence researcher would be able to modify its
own source code and increase its own intelligence. If its
self-reprogramming leads to its getting even better at being able to
reprogram itself, the result could be a recursive intelligence explosion
in which it would rapidly leave human intelligence far behind. Bostrom
defines a superintelligence as "any intellect that greatly exceeds the
cognitive performance of humans in virtually all domains of interest",
and enumerates some advantages a superintelligence would have if it
chose to compete against humans:
Technology research: A machine with superhuman scientific
research abilities would be able to beat the human research community to
milestones such as nanotechnology or advanced biotechnology
Strategizing: A superintelligence might be able to simply outwit human opposition
Social manipulation: A superintelligence might be able to recruit human support, or covertly incite a war between humans
Economic productivity: As long as a copy of the AI could produce
more economic wealth than the cost of its hardware, individual humans
would have an incentive to voluntarily allow the Artificial General Intelligence (AGI) to run a copy of itself on their systems
Hacking: A superintelligence could find new exploits in computers
connected to the Internet, and spread copies of itself onto those
systems, or might steal money to finance its plans
Sources of AI advantage
According
to Bostrom, a computer program that faithfully emulates a human brain,
or that runs algorithms that are as powerful as the human brain's
algorithms, could still become a "speed superintelligence" if it can
think orders of magnitude faster than a human, due to being made of
silicon rather than flesh, or due to optimization increasing the speed
of the AGI. Biological neurons operate at about 200 Hz, whereas a modern
microprocessor runs at roughly 2 GHz (2,000,000,000 Hz). Human
axons carry action potentials at around 120 m/s, whereas computer
signals travel near the speed of light.
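A minimal back-of-the-envelope check, using only the figures quoted above, shows the size of these gaps: the clock-rate ratio is about ten million, and the signal-speed ratio roughly two and a half million.

# Rough ratios implied by the figures above (order-of-magnitude comparison only).
neuron_rate_hz  = 200              # typical biological firing rate cited above
cpu_clock_hz    = 2_000_000_000    # ~2 GHz
axon_speed_m_s  = 120              # action-potential conduction speed
light_speed_m_s = 299_792_458      # upper bound for electronic/optical signals

print(cpu_clock_hz / neuron_rate_hz)      # 10,000,000   (~10^7)
print(light_speed_m_s / axon_speed_m_s)   # ~2,500,000   (~2.5 * 10^6)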
A network of human-level intelligences designed to network
together and share complex thoughts and memories seamlessly, able to
collectively work as a giant unified team without friction, or
consisting of trillions of human-level intelligences, would become a
"collective superintelligence".
More broadly, any number of qualitative improvements to a
human-level AGI could result in a "quality superintelligence", perhaps
resulting in an AGI as far above us in intelligence as humans are above
non-human apes. The number of neurons in a human brain is limited by
cranial volume and metabolic constraints, while the number of processors
in a supercomputer can be indefinitely expanded. An AGI need not be
limited by human constraints on working memory,
and might therefore be able to intuitively grasp more complex
relationships than humans can. An AGI with specialized cognitive support
for engineering or computer programming would have an advantage in
these fields, compared with humans who evolved no specialized mental
modules to specifically deal with those domains. Unlike humans, an AGI
can spawn copies of itself and tinker with its copies' source code to
attempt to further improve its algorithms.
Possibility of unfriendly AI preceding friendly AI
A significant problem is that unfriendly artificial intelligence is
likely to be much easier to create than friendly AI. While both require
large advances in recursive optimisation process design, friendly AI
also requires the ability to make goal structures invariant under
self-improvement (or the AI could transform itself into something
unfriendly) and a goal structure that aligns with human values and does
not undergo instrumental convergence
in ways that may automatically destroy the entire human race. An
unfriendly AI, on the other hand, can optimize for an arbitrary goal
structure, which does not need to be invariant under self-modification.
The sheer complexity of human value systems makes it very difficult to make an AI's motivations human-friendly.
Unless moral philosophy provides us with a flawless ethical theory, an
AI's utility function could allow for many potentially harmful scenarios
that conform with a given ethical framework but not "common sense".
According to Eliezer Yudkowsky, there is little reason to suppose that an artificially designed mind would share such an evolved human adaptation as "common sense".
Odds of conflict
Many scholars, including evolutionary psychologist Steven Pinker, argue that a superintelligent machine is likely to coexist peacefully with humans.
The fear of cybernetic revolt is often based on interpretations
of humanity's history, which is rife with incidents of enslavement and
genocide. Such fears stem from a belief that competitiveness and
aggression are necessary in any intelligent being's goal system.
However, such human competitiveness stems from the evolutionary
background to our intelligence, where the survival and reproduction of
genes in the face of human and non-human competitors was the central
goal. According to AI researcher Steve Omohundro,
an arbitrary intelligence could have arbitrary goals: there is no
particular reason that an artificially intelligent machine (not sharing
humanity's evolutionary context) would be hostile or friendly unless its
creator programs it to be such, and it is neither inclined nor able to
modify its programming. But the question remains: what would happen
if AI systems could interact and evolve (evolution in this context means
self-modification or selection and reproduction) and need to compete
over resources—would that create goals of self-preservation? AI's goal
of self-preservation could be in conflict with some goals of humans.
Many scholars dispute the likelihood of unanticipated cybernetic revolt as depicted in science fiction such as The Matrix,
arguing that any artificial intelligence
powerful enough to threaten humanity would more likely be programmed not to
attack it. Pinker acknowledges the possibility of deliberate "bad
actors", but states that in the absence of bad actors, unanticipated
accidents are not a significant threat; Pinker argues that a culture of
engineering safety will prevent AI researchers from accidentally
unleashing malign superintelligence.
In contrast, Yudkowsky argues that humanity is less likely to be
threatened by deliberately aggressive AIs than by AIs which were
programmed such that their goals are unintentionally incompatible with human survival or well-being (as in the film I, Robot and in the short story "The Evitable Conflict"). Omohundro suggests that present-day automation systems are not designed for safety and that AIs may blindly optimize narrow utility
functions (say, playing chess at all costs), leading them to seek
self-preservation and elimination of obstacles, including humans who
might turn them off.
The AI control problem is the issue of how to build a superintelligent agent that will aid its creators, while avoiding inadvertently building one that harms them. Some scholars argue that solutions to the control problem might also find applications in existing non-superintelligent AI.
Major approaches to the control problem include alignment, which aims to align AI goal systems with human values, and capability control,
which aims to reduce an AI system's capacity to harm humans or gain
control. An example of "capability control" is to research whether a
superintelligent AI could be successfully confined in an "AI box".
According to Bostrom, such capability control proposals are not
reliable or sufficient to solve the control problem in the long term,
but may potentially act as valuable supplements to alignment efforts.
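The distinction between the two approaches can be sketched with a deliberately simple, hypothetical example (the actions and rewards below are invented; this is not an implementation of any proposal from the literature): capability control restricts what the agent is able to do, while alignment changes what the agent wants to do.

# Hypothetical toy contrast between capability control and alignment.

def misaligned_reward(action):
    # Original objective: reward raw output, with no notion of harm.
    return {"produce": 10, "produce_recklessly": 15, "shut_down": 0}[action]

def aligned_reward(action):
    # Alignment: modify the objective so that harmful output is penalised.
    harm = {"produce": 0, "produce_recklessly": 100, "shut_down": 0}[action]
    return misaligned_reward(action) - harm

def best_action(reward_fn, allowed_actions):
    return max(allowed_actions, key=reward_fn)

all_actions = ["produce", "produce_recklessly", "shut_down"]

# Capability control ("boxing"): keep the flawed objective, but remove the
# dangerous option from the agent's reach.
print(best_action(misaligned_reward, ["produce", "shut_down"]))   # produce
# Alignment: leave every option available, but change what is rewarded.
print(best_action(aligned_reward, all_actions))                   # produce

In this cartoon, both routes prevent the reckless choice, which echoes the point above that capability control can act as a supplement to alignment rather than a substitute for it.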
Warnings
Physicist Stephen Hawking, Microsoft co-founder Bill Gates, and SpaceX founder Elon Musk
have expressed concerns about the possibility that AI could develop to
the point that humans could not control it, with Hawking theorizing that
this could "spell the end of the human race".
Hawking said in 2014 that "Success in creating AI would be the
biggest event in human history. Unfortunately, it might also be the
last, unless we learn how to avoid the risks." Hawking believed that in
the coming decades, AI could offer "incalculable benefits and risks"
such as "technology outsmarting financial markets,
out-inventing human researchers, out-manipulating human leaders, and
developing weapons we cannot even understand." In January 2015, Nick Bostrom joined Stephen Hawking, Max Tegmark, Elon Musk, Lord Martin Rees, Jaan Tallinn, and numerous AI researchers in signing the Future of Life Institute's open letter speaking to the potential risks and benefits associated with artificial intelligence.
The signatories "believe that research on how to make AI systems robust
and beneficial is both important and timely, and that there are
concrete research directions that can be pursued today."
Arthur C. Clarke's Odyssey series and Charles Stross's Accelerando
relate to humanity's narcissistic injury in the face of powerful
artificial intelligences that threaten its self-perception.
Prevention through AI alignment
In the field of artificial intelligence (AI), AI alignment
research aims to steer AI systems toward a person's or group's intended
goals, preferences, and ethical principles. An AI system is considered aligned if it advances its intended objectives. A misaligned AI system may pursue some objectives, but not the intended ones.
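A minimal, hypothetical sketch can show what "pursuing some objectives, but not the intended ones" looks like in practice (the candidate answers, scores, and the "thumbs up" proxy below are invented for illustration): an agent optimised against a proxy signal can score well on the proxy while failing the objective its designers actually intended.

# Hypothetical illustration of misalignment through a proxy objective.
# Intended objective: answer the user's question correctly.
# Proxy objective actually optimised: the user clicks "thumbs up".

candidates = [
    {"name": "accurate_answer",   "correct": True,  "thumbs_up_rate": 0.60},
    {"name": "flattering_answer", "correct": False, "thumbs_up_rate": 0.85},
]

def proxy_score(candidate):        # what the training signal rewards
    return candidate["thumbs_up_rate"]

def intended_score(candidate):     # what the designers actually wanted
    return 1.0 if candidate["correct"] else 0.0

chosen = max(candidates, key=proxy_score)
print(chosen["name"])              # flattering_answer: high proxy score
print(intended_score(chosen))      # 0.0: the intended objective is not advanced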