From Wikipedia, the free encyclopedia
OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI Incorporated (OpenAI Inc.) and its for-profit subsidiary corporation OpenAI Limited Partnership (OpenAI LP). OpenAI conducts AI research to promote and develop friendly AI in a way that benefits all humanity. The organization was founded in San Francisco in 2015 by Elon Musk, Sam Altman, Peter Thiel, Reid Hoffman, Jessica Livingston and others, who collectively pledged US$1 billion. Musk resigned from the board in 2018 but remained a donor. Microsoft
provided OpenAI LP a $1 billion investment in 2019 and a second
multi-year investment in January 2023 reported to be $10 billion.
History
In December 2015, Sam Altman, Elon Musk, Greg Brockman, Reid Hoffman, Jessica Livingston, Peter Thiel, Amazon Web Services (AWS), Infosys, and YC Research announced the formation of OpenAI and pledged over US$1 billion
to the venture. The organization stated it would "freely collaborate"
with other institutions and researchers by making its patents and
research open to the public. OpenAI is headquartered at the Pioneer Building in Mission District, San Francisco.
In April 2016, OpenAI released a public beta of "OpenAI Gym", its platform for reinforcement learning research.
In December 2016, OpenAI released "Universe", a software platform for
measuring and training an AI's general intelligence across the world's
supply of games, websites, and other applications.
In 2018, Musk resigned his board seat, citing "a potential future conflict (of interest)" with Tesla AI development for self-driving cars, but remained a donor.
In 2019, OpenAI transitioned from non-profit to "capped" for-profit, with the profit cap set to 100x any investment. The company distributed equity to its employees and partnered with Microsoft and Matthew Brown Companies,
who announced an investment package of $1 billion into the company.
OpenAI then announced its intention to commercially license its
technologies.
In 2020, OpenAI announced GPT-3, a language model trained on large datasets from the Internet. It also announced that an associated API,
named simply "the API", would form the heart of its first commercial
product. GPT-3 is aimed at natural language answering of questions, but
it can also translate between languages and coherently generate
improvised text.
In 2021, OpenAI introduced DALL-E, a deep learning model that can generate digital images from natural language descriptions.
Around December 2022, OpenAI received widespread media coverage after launching a free preview of ChatGPT, its new AI chatbot based on GPT-3.5. According to OpenAI, the preview received over a million signups within the first five days.
According to anonymous sources cited by Reuters in December 2022,
OpenAI was projecting a $200 million revenue for 2023 and $1 billion
revenue for 2024.
As of January 2023, OpenAI was in talks for funding that would
value the company at $29 billion, double the value of the company in
2021.
On January 23, 2023, Microsoft announced a new multi-year,
multi-billion dollar (reported to be $10 billion) investment in OpenAI.
Participants
Key employees:
- CEO and co-founder: Sam Altman, former president of the startup accelerator Y Combinator
- President and co-founder: Greg Brockman, former CTO, 3rd employee of Stripe
- Chief Scientist and co-founder: Ilya Sutskever, a former Google expert on machine learning
- Chief Technology Officer: Mira Murati, previously at Leap Motion and Tesla, Inc.
- Chief Operating Officer: Brad Lightcap, previously at Y Combinator and JPMorgan Chase
Other backers of the project include:
- Reid Hoffman, LinkedIn co-founder
- Peter Thiel, PayPal co-founder
- Jessica Livingston, a founding partner of Y Combinator
The group started in early January 2016 with nine researchers. According to Wired, Brockman met with Yoshua Bengio, one of the "founding fathers" of the deep learning movement, and drew up a list of the "best researchers in the field". Microsoft's Peter Lee stated that the cost of a top AI researcher exceeds the cost of a top NFL
quarterback prospect. While OpenAI pays corporate-level (rather than
nonprofit-level) salaries, it doesn't currently pay AI researchers
salaries comparable to those of Facebook or Google.
Nevertheless, Sutskever stated that he was willing to leave Google for
OpenAI "partly because of the very strong group of people and, to a very
large extent, because of its mission." Brockman stated that "the best
thing that I could imagine doing was moving humanity closer to building
real AI in a safe way." OpenAI researcher Wojciech Zaremba stated that
he turned down "borderline crazy" offers of two to three times his
market value to join OpenAI instead.
Motives
Some scientists, such as Stephen Hawking and Stuart Russell,
have articulated concerns that if advanced AI someday gains the ability
to re-design itself at an ever-increasing rate, an unstoppable "intelligence explosion" could lead to human extinction. Musk characterizes AI as humanity's "biggest existential threat."
OpenAI's founders structured it as a non-profit so that they could
focus its research on making positive long-term contributions to
humanity.
Musk and Altman have stated they are partly motivated by concerns about AI safety and the existential risk from artificial general intelligence.
OpenAI states that "it's hard to fathom how much human-level AI could
benefit society," and that it is equally difficult to comprehend "how
much it could damage society if built or used incorrectly".
Research on safety cannot safely be postponed: "because of AI's
surprising history, it's hard to predict when human-level AI might come
within reach."
OpenAI states that AI "should be an extension of individual human wills
and, in the spirit of liberty, as broadly and evenly distributed as
possible...". Co-chair Sam Altman expects the decades-long project to surpass human intelligence.
Vishal Sikka,
former CEO of Infosys, stated that an "openness" where the endeavor
would "produce results generally in the greater interest of humanity"
was a fundamental requirement for his support, and that OpenAI "aligns
very nicely with our long-held values" and their "endeavor to do
purposeful work". Cade Metz of Wired suggests that corporations such as Amazon
may be motivated by a desire to use open-source software and data to
level the playing field against corporations such as Google and Facebook
that own enormous supplies of proprietary data. Altman states that Y
Combinator companies will share their data with OpenAI.
In 2019, OpenAI became a for-profit company called OpenAI LP to
secure additional funding while staying controlled by a non-profit
called OpenAI Inc in a structure that OpenAI calls "capped-profit", having previously been a 501(c)(3) nonprofit organization.
Strategy
Musk
posed the question: "What is the best thing we can do to ensure the
future is good? We could sit on the sidelines or we can encourage
regulatory oversight, or we could participate with the right structure
with people who care deeply about developing AI in a way that is safe
and is beneficial to humanity." Musk acknowledged that "there is always
some risk that in actually trying to advance (friendly) AI we may create
the thing we are concerned about"; nonetheless, the best defense is "to
empower as many people as possible to have AI. If everyone has AI
powers, then there's not any one person or a small set of individuals
who can have AI superpower."
Musk and Altman's counterintuitive strategy of trying to reduce the risk that AI will cause overall harm by giving AI to everyone is
controversial among those who are concerned with existential risk from artificial intelligence. Philosopher Nick Bostrom
is skeptical of Musk's approach: "If you have a button that could do
bad things to the world, you don't want to give it to everyone."
During a 2016 conversation about the technological singularity, Altman
said that "we don't plan to release all of our source code" and
mentioned a plan to "allow wide swaths of the world to elect
representatives to a new governance board". Greg Brockman stated that
"Our goal right now... is to do the best thing there is to do. It's a
little vague."
Conversely, OpenAI's initial decision to withhold GPT-2, out of a wish to "err on the side of caution" given its potential for misuse, has been criticized by advocates of openness. Delip
Rao, an expert in text generation, stated "I don't think [OpenAI] spent
enough time proving [GPT-2] was actually dangerous." Other critics
argued that open publication is necessary to replicate the research and
to be able to come up with countermeasures.
In the 2017 tax year, OpenAI spent $7.9 million, or a quarter of its functional expenses, on cloud computing alone. In comparison, DeepMind's total expenses in 2017 were much larger, totaling $442 million. In Summer 2018, simply training OpenAI's Dota 2
bots required renting 128,000 CPUs and 256 GPUs from Google for
multiple weeks. According to OpenAI, the capped-profit model adopted in
March 2019 allows OpenAI LP to legally attract investment from venture
funds, and in addition, to grant employees stakes in the company, the
goal being that they can say "I'm going to OpenAI, but in the long term it's not going to be disadvantageous to us as a family." Many top researchers work for Google Brain, DeepMind, or Facebook, which offer stock options that a nonprofit would be unable to match.
In June 2019, OpenAI LP raised a billion dollars from Microsoft, a sum
which OpenAI plans to have spent "within five years, and possibly much
faster".
Altman has stated that even a billion dollars may turn out to be
insufficient, and that the lab may ultimately need "more capital than
any non-profit has ever raised" to achieve artificial general intelligence.
The transition from a nonprofit to a capped-profit company was viewed with skepticism by Oren Etzioni of the nonprofit Allen Institute for AI,
who agreed that wooing top researchers to a nonprofit is difficult, but
stated "I disagree with the notion that a nonprofit can't compete" and
pointed to successful low-budget projects by OpenAI and others. "If
bigger and better funded was always better, then IBM
would still be number one." Following the transition, public disclosure
of the compensation of top employees at OpenAI LP is no longer legally
required. The nonprofit, OpenAI Inc., is the sole controlling shareholder of OpenAI LP. OpenAI LP, despite being a for-profit company, retains a formal fiduciary responsibility to OpenAI Inc.'s nonprofit charter. A majority of OpenAI Inc.'s board is barred from having financial stakes in OpenAI LP. In addition, minority members with a stake in OpenAI LP are barred from certain votes due to conflict of interest.
Some researchers have argued that OpenAI LP's switch to for-profit
status is inconsistent with OpenAI's claims to be "democratizing" AI. A journalist at Vice News wrote that "generally, we've never been able to rely on venture capitalists to better humanity".
Products and applications
OpenAI's research tends to focus on reinforcement learning (RL). OpenAI is viewed as an important competitor to DeepMind.
Gym
Gym aims to provide an easy-to-set-up, general-intelligence benchmark with a wide variety of environments—somewhat akin to, but broader than, the ImageNet Large Scale Visual Recognition Challenge used in supervised learning research—and hopes to standardize the way environments are defined in AI research publications, so that published research becomes more easily reproducible. The project claims to provide the user with a simple interface. As of June 2017, Gym can only be used with Python. As of September 2017, the Gym documentation site was not maintained, and active work focused instead on its GitHub page.
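The interface Gym standardizes is deliberately small: an environment exposes a reset method that returns an initial observation, and a step method that returns an observation, a reward, a done flag, and diagnostic info. A minimal sketch of that contract, using a hypothetical toy environment rather than Gym itself:

```python
import random

class CoinFlipEnv:
    """Toy environment following the reset/step interface that Gym
    standardizes: reset() returns an initial observation, and
    step(action) returns (observation, reward, done, info)."""

    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        self.state = random.randint(0, 1)  # observation: 0 or 1
        return self.state

    def step(self, action):
        # reward 1 if the agent's guess matches the hidden state
        reward = 1.0 if action == self.state else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        self.state = random.randint(0, 1)
        return self.state, reward, done, {}

# a random agent interacting through the standard loop
env = CoinFlipEnv()
obs, total_reward, done = env.reset(), 0.0, False
while not done:
    obs, reward, done, info = env.step(random.choice([0, 1]))
    total_reward += reward
print(total_reward)
```

Because every environment exposes the same loop, an agent written against this interface can be benchmarked across many environments unchanged, which is what makes the platform useful for reproducibility.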
In "RoboSumo", virtual humanoid "metalearning"
robots initially lack knowledge of how to even walk, and are given the
goals of learning to move around, and pushing the opposing agent out of
the ring. Through this adversarial learning process, the agents learn
how to adapt to changing conditions; when an agent is then removed from
this virtual environment and placed in a new virtual environment with
high winds, the agent braces to remain upright, suggesting it had
learned how to balance in a generalized way.
OpenAI's Igor Mordatch argues that competition between agents can
create an intelligence "arms race" that can increase an agent's ability
to function, even outside the context of the competition.
Debate Game
In
2018, OpenAI launched the Debate Game, which teaches machines to debate
toy problems in front of a human judge. The purpose is to research
whether such an approach may assist in auditing AI decisions and in
developing explainable AI.
Dactyl
Dactyl uses machine learning to train a Shadow Hand,
a human-like robot hand, to manipulate physical objects. It learns
entirely in simulation using the same RL algorithms and training code as
OpenAI Five. OpenAI tackled the object orientation problem by using domain randomization,
a simulation approach which exposes the learner to a variety of
experiences rather than trying to fit to reality. The set-up for Dactyl,
aside from having motion tracking cameras, also has RGB cameras to
allow the robot to manipulate an arbitrary object by seeing it. In 2018,
OpenAI showed that the system was able to manipulate a cube and an
octagonal prism.
In 2019, OpenAI demonstrated that Dactyl could solve a Rubik's Cube.
The robot was able to solve the puzzle 60% of the time. Objects like
the Rubik's Cube introduce complex physics that is harder to model.
OpenAI solved this by improving the robustness of Dactyl to
perturbations; they employed a technique called Automatic Domain
Randomization (ADR), a simulation approach where progressively more
difficult environments are endlessly generated. ADR differs from manual
domain randomization by not needing there to be a human to specify
randomization ranges.
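The idea can be sketched as a training loop in which the width of a randomization range is adjusted automatically from measured performance; the function names and thresholds below are illustrative assumptions, not OpenAI's implementation:

```python
import random

def train_episode(range_width):
    """Stand-in for one training episode at the edge of the current
    randomization range; returns True on success. (Hypothetical: real
    ADR measures the actual policy's performance in simulation.)"""
    return random.random() < 1.0 / (1.0 + range_width)

def automatic_domain_randomization(steps=1000, window=50,
                                   expand_at=0.6, shrink_at=0.2,
                                   increment=0.05):
    """Minimal ADR loop: the randomization range for an environment
    parameter is widened whenever recent performance at its boundary
    is good enough, and narrowed when it is poor, so no human ever
    specifies the ranges by hand."""
    range_width = 0.1  # start with a narrow, easy range
    results = []
    for _ in range(steps):
        results.append(train_episode(range_width))
        if len(results) == window:
            success_rate = sum(results) / window
            if success_rate >= expand_at:
                range_width += increment            # environments get harder
            elif success_rate <= shrink_at:
                range_width = max(0.0, range_width - increment)
            results = []
    return range_width
```

The loop naturally settles near the widest range the policy can still handle, which is how progressively more difficult environments are "endlessly generated".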
Generative models
GPT
The original paper on generative pre-training (GPT) of a language
model was written by Alec Radford and his colleagues, and published in
preprint on OpenAI's website on June 11, 2018. It showed how a generative model
of language is able to acquire world knowledge and process long-range
dependencies by pre-training on a diverse corpus with long stretches of
contiguous text.
GPT-2
An instance of GPT-2 writing a paragraph based on a prompt from its own Wikipedia article in February 2021
Generative Pre-trained Transformer 2, commonly known by its abbreviated form GPT-2, is an unsupervised transformer language model
and the successor to GPT. GPT-2 was first announced in February 2019,
with only limited demonstrative versions initially released to the
public. The full version of GPT-2 was not immediately released out of
concern over potential misuse, including applications for writing fake news.
Some experts expressed skepticism that GPT-2 posed a significant
threat. The Allen Institute for Artificial Intelligence responded to
GPT-2 with a tool to detect "neural fake news".
Other researchers, such as Jeremy Howard, warned of "the technology to
totally fill Twitter, email, and the web up with reasonable-sounding,
context-appropriate prose, which would drown out all other speech and be
impossible to filter". In November 2019, OpenAI released the complete version of the GPT-2 language model. Several websites host interactive demonstrations of different instances of GPT-2 and other transformer models.
GPT-2's authors argue that unsupervised language models are general-purpose learners, as illustrated by GPT-2 achieving state-of-the-art accuracy and perplexity on 7 of 8 zero-shot tasks (i.e. the model was not further trained on any task-specific input-output examples). The corpus it was trained on, called WebText, contains slightly over 8 million documents for a total of 40 GB of text from URLs shared in Reddit submissions with at least 3 upvotes. It avoids certain issues of encoding vocabulary with word tokens by using byte pair encoding, which allows any string of characters to be represented by encoding both individual characters and multiple-character tokens.
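Byte pair encoding builds its vocabulary by repeatedly merging the most frequent pair of adjacent symbols into a new symbol. A minimal sketch of the merge-learning step on a toy corpus (GPT-2's actual tokenizer operates on bytes and a vastly larger corpus):

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn byte-pair-encoding merges: repeatedly replace the most
    frequent adjacent symbol pair with a single merged symbol."""
    # each word starts as a sequence of single characters
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # count every adjacent symbol pair, weighted by word frequency
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        # rewrite every word with the new merged symbol
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

print(learn_bpe_merges(["low", "lower", "lowest", "low"], 2))
```

After enough merges, frequent words become single tokens while rare strings fall back to shorter fragments or single characters, so any input remains representable.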
GPT-3
Generative Pre-trained Transformer 3, commonly known by its abbreviated form GPT-3, is an unsupervised transformer language model and the successor to GPT-2. It was first described in May 2020. OpenAI stated that the full version of GPT-3 contains 175 billion parameters, two orders of magnitude more than the 1.5 billion parameters in the full version of GPT-2 (although GPT-3 models with as few as 125 million parameters were also trained).
OpenAI stated that GPT-3 succeeds at certain "meta-learning" tasks. It can generalize the purpose of a single input-output pair.
The paper gives an example of translation and cross-linguistic transfer
learning between English and Romanian, and between English and German.
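This "meta-learning" behavior amounts to specifying a task entirely in the prompt as input-output pairs, with no gradient updates; the model infers the task from the examples and continues the pattern. A sketch of such a few-shot prompt (the English-French pairs here are illustrative):

```python
# Few-shot task specification: the task is described only by
# input-output examples placed in the prompt itself; the model is
# never fine-tuned on them. (Illustrative prompt, not from the paper.)
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
]
prompt = "Translate English to French:\n"
for english, french in examples:
    prompt += f"{english} => {french}\n"
prompt += "plush giraffe =>"  # the new input the model should complete
print(prompt)
```

Generalizing the purpose of a single input-output pair, as described above, corresponds to the one-shot case of this format.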
GPT-3 dramatically improved benchmark results over GPT-2. OpenAI
cautioned that such scaling up of language models could be approaching
or encountering the fundamental capability limitations of predictive
language models. Pre-training GPT-3 required several thousand petaflop/s-days of compute, compared to tens of petaflop/s-days for the full GPT-2 model. Like that of its predecessor,
GPT-3's fully trained model was not immediately released to the public
on the grounds of possible abuse, though OpenAI planned to allow access
through a paid cloud API after a two-month free private beta that began in June 2020.
On September 23, 2020, GPT-3 was licensed exclusively to Microsoft.
ChatGPT
ChatGPT is an artificial intelligence tool that provides a conversational interface, allowing users to ask questions in natural language. The system then responds with an answer within seconds. ChatGPT was launched in November 2022 and reached 1 million users only 5 days after its initial launch.
Music
OpenAI's MuseNet (2019) is a deep neural net trained to predict subsequent musical notes in MIDI music files. It can generate songs with ten different instruments in fifteen different styles. According to The Verge, a song generated by MuseNet tends to start reasonably but then fall into chaos the longer it plays.
OpenAI's Jukebox (2020) is an open-sourced algorithm to generate music
with vocals. After training on 1.2 million samples, the system accepts a
genre, artist, and a snippet of lyrics and outputs song samples. OpenAI
stated the songs "show local musical coherence, follow traditional
chord patterns" but acknowledged that the songs lack "familiar larger
musical structures such as choruses that repeat" and that "there is a
significant gap" between Jukebox and human-generated music. The Verge
stated "It's technologically impressive, even if the results sound like
mushy versions of songs that might feel familiar", while Business Insider stated "surprisingly, some of the resulting songs are catchy and sound legitimate".
Whisper
Whisper
is a general-purpose speech recognition model. It is trained on a large
dataset of diverse audio and is also a multi-task model that can
perform multilingual speech recognition as well as speech translation
and language identification.
API
In June 2020, OpenAI announced a multi-purpose API
which it said was "for accessing new AI models developed by OpenAI" to
let developers call on it for "any English language AI task."
DALL-E and CLIP
Images
produced by DALL-E when given the text prompt "a professional
high-quality illustration of a giraffe dragon chimera. a giraffe
imitating a dragon. a giraffe made of dragon."
DALL-E is a Transformer model that creates images from textual descriptions, revealed by OpenAI in January 2021.
CLIP does the opposite: it creates a description for a given image.
DALL-E uses a 12-billion-parameter version of GPT-3 to interpret
natural language inputs (such as "a green leather purse shaped like a
pentagon" or "an isometric view of a sad capybara") and generate
corresponding images. It can create images of realistic objects ("a
stained-glass window with an image of a blue strawberry") as well as
objects that do not exist in reality ("a cube with the texture of a
porcupine"). As of March 2021, no API or code is available.
In March 2021, OpenAI released a paper titled Multimodal Neurons in Artificial Neural Networks,
where they showed a detailed analysis of CLIP (and GPT) models and
their vulnerabilities. The new type of attacks on such models was
described in this work.
We refer to these attacks as
typographic attacks. We believe attacks such as those described above
are far from simply an academic concern. By exploiting the model's
ability to read text robustly, we find that even photographs of
hand-written text can often fool the model.
— Multimodal Neurons in Artificial Neural Networks, OpenAI
In April 2022, OpenAI announced DALL-E 2, an updated version of the model with more realistic results.
In December 2022, OpenAI published on GitHub software for Point-E, a
new rudimentary system for converting a text description into a
3-dimensional model.
Microscope
OpenAI Microscope
is a collection of visualizations of every significant layer and neuron
of eight different neural network models which are often studied in
interpretability. Microscope was created to make it easier to analyze the features that form inside these neural networks.
The models included are AlexNet, VGG 19, different versions of Inception, and different versions of CLIP Resnet.
Codex
OpenAI Codex is a descendant of GPT-3 that has additionally been trained on code from 54 million GitHub repositories. It was announced in mid-2021 as the AI powering the code autocompletion tool GitHub Copilot. In August 2021, an API was released in private beta.
According to OpenAI, the model is able to create working code in over a
dozen programming languages, most effectively in Python.
Several issues with glitches, design flaws, and security vulnerabilities have been brought up.
Video game bots and benchmarks
OpenAI Five
OpenAI Five is the name of a team of five OpenAI-curated bots that are used in the competitive five-on-five video game Dota 2,
which learn to play against human players at a high skill level entirely
through trial-and-error algorithms. Before becoming a team of five, the
first public demonstration occurred at The International 2017, the annual premiere championship tournament for the game, where Dendi, a professional Ukrainian player, lost against a bot in a live 1v1 matchup. After the match, CTO Greg Brockman explained that the bot had learned by playing against itself for two weeks of real time, and that the learning software was a step in the direction of creating software that can handle complex tasks like a surgeon. The system uses a form of reinforcement learning,
as the bots learn over time by playing against themselves hundreds of
times a day for months, and are rewarded for actions such as killing an
enemy and taking map objectives.
By June 2018, the ability of the bots expanded to play together
as a full team of five, and they were able to defeat teams of amateur
and semi-professional players. At The International 2018, OpenAI Five played in two exhibition matches against professional players, but ended up losing both games. In April 2019, OpenAI Five defeated OG, the reigning world champions of the game at the time, 2:0 in a live exhibition match in San Francisco.
The bots' final public appearance came later that month, where they
played in 42,729 total games in a four-day open online competition,
winning 99.4% of those games.
Gym Retro
Gym
Retro is a platform for RL research on video games. Gym Retro is used to
research RL algorithms and study generalization. Prior research in RL
has focused chiefly on optimizing agents to solve single tasks. Gym
Retro gives the ability to generalize between games with similar
concepts but different appearances.