Saturday, March 16, 2019

Computer chess

From Wikipedia, the free encyclopedia

1990s pressure-sensory chess computer with LCD screen

Computer chess includes both hardware (dedicated computers) and software capable of playing chess. Computer chess provides opportunities for players to practice even in the absence of human opponents, and also provides opportunities for analysis, entertainment and training. Since around 2005, chess engines have been able to defeat even the strongest human players. Nevertheless, it is considered unlikely that computers will ever solve chess due to its computational complexity.

History

The idea of creating a chess-playing machine dates back to the eighteenth century. Around 1769, the chess playing automaton called The Turk became famous before being exposed as a hoax. Before the development of digital computing, serious trials based on automata such as El Ajedrecista of 1912, were too complex and limited to be useful for playing full games of chess. The field of mechanical chess research languished until the advent of the digital computer in the 1950s. Since then, chess enthusiasts and computer engineers have built, with increasing degrees of seriousness and success, chess-playing machines and computer programs.
  • 1769 – Wolfgang von Kempelen builds the Automaton Chess-Player, in what becomes one of the greatest hoaxes of its period.
  • 1868 – Charles Hooper presents the Ajeeb automaton, which also had a human chess player hidden inside.
  • 1912 – Leonardo Torres y Quevedo builds a machine that could play King and Rook versus King endgames.
  • 1941 – Predating comparable work by at least a decade, Konrad Zuse develops computer chess algorithms in his Plankalkül programming formalism. Because of the circumstances of the Second World War, however, they were not published, and did not come to light, until the 1970s.
  • 1948 – Norbert Wiener's book Cybernetics describes how a chess program could be developed using a depth-limited minimax search with an evaluation function.
  • 1950 – Claude Shannon publishes "Programming a Computer for Playing Chess", one of the first papers on the problem of computer chess.
  • 1951 – Alan Turing is first to publish a program, developed on paper, that was capable of playing a full game of chess (dubbed Turochamp).
  • 1952 – Dietrich Prinz develops a program that solves chess problems.

[Diagram: the starting position of Los Alamos chess on a 6×6 board]
Los Alamos chess. This simplified version of chess was played in 1956 by the MANIAC I computer.
  • 1956 – Los Alamos chess is the first program to play a chess-like game, developed by Paul Stein and Mark Wells for the MANIAC I computer.
  • 1956 – John McCarthy invents the alpha-beta search algorithm.
  • 1957 – The first programs that can play a full game of chess are developed, one by Alex Bernstein and one by Russian programmers using a BESM.
  • 1958 – NSS becomes the first chess program to use the alpha-beta search algorithm.
  • 1962 – The first program to play credibly, Kotok-McCarthy, is published at MIT.
  • 1963 – Grandmaster David Bronstein defeats an M-20 running an early chess program.
  • 1966–67 – The first chess match between computer programs is played. Moscow Institute for Theoretical and Experimental Physics (ITEP) defeats Kotok-McCarthy at Stanford University by telegraph over nine months.
  • 1967 – Mac Hack Six, by Richard Greenblatt et al., introduces transposition tables and becomes the first program to defeat a person in tournament play.
  • 1968 – Scottish chess champion David Levy makes a bet with AI pioneers John McCarthy and Donald Michie that no computer program would win a chess match against him within 10 years.
  • 1970 – Monty Newborn and the Association for Computing Machinery organize the first ACM North American Computer Chess Championships in New York.
  • 1974 – David Levy, Ben Mittman and Monty Newborn organize the first World Computer Chess Championship which is won by the Russian program Kaissa.
  • 1975 – In January, Micro Instrumentation and Telemetry Systems (MITS) releases the Altair 8800, the first commercially successful microcomputer.
  • 1976 – In December, Canadian programmer Peter R. Jennings releases Microchess, the first game for microcomputers to be sold.
  • 1977 – In March, Fidelity Electronics releases Chess Challenger, coded by Ron Nelson, the first dedicated chess computer to be sold. Chess 4.6 becomes the first chess computer to be successful at a major chess tournament. Ben Mittman founds the International Computer Chess Association.
  • 1978 – David Levy wins the bet made 10 years earlier, defeating Chess 4.7 in a six-game match by a score of 4½–1½. The computer's victory in game four is the first defeat of a human master in a tournament. With Levy's help, Personal Computer World magazine organizes the first PCW Microprocessor Championship. Dan and Kathe Spracklen start selling photocopies of the Sargon source code.
  • 1979 – Frederic Friedel organizes a rematch between IM David Levy and Chess 4.7, which is broadcast on German television.
  • 1980 – The third PCW Microcomputer championship is declared the first World Microcomputer Chess Championship. Fidelity computers win the World Microcomputer Championships each year from 1980 through 1984. In Germany, Hegener & Glaser release their first Mephisto dedicated chess computer. The USCF prohibits computers from competing in human tournaments except when represented by the chess systems' creators. The Fredkin Prize is established.
  • 1981 – Cray Blitz wins the Mississippi State Championship with a perfect 5–0 score and a performance rating of 2258. In round 4 it defeats Joe Sentef (2262) to become the first computer to beat a master in tournament play and the first computer to gain a master rating. The World Microcomputer Chess Championship is split into commercial (won by SciSys) and open (won by Fidelity) divisions. In August, IBM releases its first PC, the IBM 5150 running on Intel chips with an operating system designed by Microsoft, a platform that was soon to become the main one for chess programmers.
  • 1982 – Ken Thompson's hardware chess player Belle earns a US master title. David Horne releases 1K ZX Chess, the code of which only takes up 672 bytes, for the Sinclair ZX81.
  • 1983 – Acorn Computers sponsors Garry Kasparov's Candidates match with Viktor Korchnoi. Kasparov wins an Acorn computer as part of his prize, sparking his interest in computers. Frederic Friedel founds the magazine Computerschach International, changing its name the next year to Computerschach & Spiele. Richard Lang's Psion becomes one of the first chess programs to be ported to the IBM PC. Psion would share the World Microcomputer Chess Championship title in 1984.
  • 1984 – In January, British GM John Nunn starts annotating games for Personal Computer World magazine. In June, he joins the editorial board of Computerschach & Spiele magazine. The Svenska schackdatorföreningen (SSDF, the Swedish Chess Computer Association) is founded, taking over the publication of Ply Magazine and publishing its first computer chess rating list. The German company Hegener & Glaser's Mephisto line of dedicated chess computers begins a long streak of victories (1984–1990) in the World Microcomputer Championship, using dedicated computers powered by programs by Richard Lang (ChessGenius) and Ed Schröder (Rebel) and offered for sale commercially. Soon after, companies such as Millennium 2000, SciSys/Saitek and The Advanced Software Company (TASC) start producing dedicated chess computers.
  • 1985 – Eric Hallsworth puts out the first issue of Selective Search magazine devoted to computer chess.
  • 1986 – Software Country (see Software Toolworks) releases Chessmaster 2000, based on an engine by David Kittinger, the first edition of what would become the world's best-selling line of chess programs.
  • 1987 – Frederic Friedel and physicist Matthias Wüllenweber found Chessbase, releasing the first chess database program. Friedel's friend, world champion Garry Kasparov begins using Chessbase to prep for specific opponents. Stuart Cracraft releases GNU Chess, one of the first 'chess engines' to be bundled with a separate graphical user interface (GUI), chesstool.
  • 1988 – HiTech, developed by Hans Berliner and Carl Ebeling, wins a match against grandmaster Arnold Denker 3½–½. Deep Thought shares first place with Tony Miles in the Software Toolworks Championship, ahead of former world champion Mikhail Tal and several grandmasters including Samuel Reshevsky, Walter Browne and Mikhail Gurevich. It also defeats grandmaster Bent Larsen, making it the first computer to beat a GM in a tournament. Its performance rating of 2745 (USCF scale) in this tournament was the highest yet obtained by a computer player. Interplay Entertainment releases Battle Chess, a popular program in which the animated pieces fight each other every time a piece is captured. This idea was remade many times, e.g. Empire Interactive's Combat Chess (based on Rebel), XS Games' War Chess, Battle vs. Chess (using Fritz 10), and Interplay's Battle Chess: Game of Kings.
  • 1989 – Deep Thought loses two exhibition games to Garry Kasparov, the reigning world champion. Hegener & Glaser buys out Fidelity Electronics.
  • 1990 – On April 25, former world champion Anatoly Karpov loses in a simul to Hegener & Glaser's Mephisto Portorose 68030 chess computer.
  • 1991 – The ChessMachine based on Ed Schröder's Rebel wins the World Microcomputer Chess Championship, and is offered for sale by the Dutch The Advanced Software Company (TASC). Frans Morsch, the Dutch author of the chess programs Nona and Quest, joins Chessbase, where he designs their Fritz engine, which is released in the U.S. as Knightstalker.
  • 1992 – ChessMachine wins the 7th World Computer Chess Championship, the first time a microcomputer beat mainframes. GM John Nunn releases Secrets of Rook Endings, the first book based on endgame tablebases developed by Ken Thompson. In December, Kasparov visits Frederic Friedel in his hotel room in Cologne, and plays a series of blitz games against Fritz 2 winning 24, drawing 4 and losing 9.
  • 1993 – Deep Blue loses a four-game match against Bent Larsen. Stephen J. Edwards issues the first Portable Game Notation specification, allowing people and programs to share the moves of games. In his book on the Four Knights Defence, GM John Nunn thanks TASC for providing him with a ChessMachine for use in his opening analysis. Nunn also reports receiving phone calls from Frederic Friedel explaining that the Chessbase engine Fritz 2 is busting more published endgame analysis. Chess programs running on personal computers surpass Mephisto's dedicated chess computers to win the Microcomputer Championship, marking a shift from dedicated chess hardware to software on multipurpose personal computers.
  • 1994 – In February, John Nunn writes an article for British Chess Magazine asking whether Chessbase's Fritz or ChessGenius is stronger. On May 19–20, Fritz enters a GM blitz tournament for the first time, the Munich Intel Express. Kasparov loses his first game to Fritz, but manages to tie for first place in the tournament and then win the playoff; the next day, however, he loses another blitz game to Fritz on ZDF television. In July, Viswanathan Anand plays some opening novelties checked with Fritz in his Candidates match vs. Gata Kamsky. On August 31, at the London Intel Grand Prix, a rapid event, Richard Lang's ChessGenius 2 knocks Kasparov out in the first round, another first. Shay Bushinsky, co-author of Junior, asks Tim Mann how to hook his engine to the GNU Chess graphical user interface, and Tim's reply becomes the basis for the Chess Engine Communication Protocol (a.k.a. Winboard engines). Saitek buys Hegener & Glaser, but continues producing its Mephisto dedicated chess computers.
  • 1995 – On May 20, Kasparov gets his revenge on ChessGenius, beating it 1½–½ in rapid games on Cologne TV. Fritz beats Deep Blue to win the World Computer Chess Championship in Hong Kong.
  • 1996 – Deep Blue loses a six-game match against Garry Kasparov.
  • 1997 – Deep Blue wins a six-game match against Garry Kasparov. Chess programmers move from the rec.games.chess.computer newsgroup to the Computer Chess Club message board.
  • 1999 – Stefan Meyer-Kahlen, author of Shredder, joins Chessbase, where Mathias Feist ports Shredder to the Chessbase format to sell it in the Fritz Graphical User Interface. Shredder started to win many of the world computer, software and microcomputer championships vs. other engines from this point on.
  • 2000 – Stefan Meyer-Kahlen and Rudolf Huber draft the Universal Chess Interface, a protocol for GUIs to talk to engines that would gradually become the main form new engines would take. UCI includes provisions for limiting the strength of engines through its uci_limitstrength and uci_elo parameters giving amateurs a chance to play against the top engines on even terms.
  • 2002 – Vladimir Kramnik draws an eight-game match against Deep Fritz. The International Computer Chess Association changes its name to the International Computer Games Association.
  • 2003 – Kasparov draws a six-game match against Deep Junior.
  • 2003 – Kasparov draws a four-game match against X3D Fritz.
  • 2004 – a team of computers (Hydra, Deep Junior and Fritz) wins 8½–3½ against a rather strong human team formed by Veselin Topalov, Ruslan Ponomariov and Sergey Karjakin, who had an average Elo rating of 2681. In his match with Peter Leko, Vladimir Kramnik employs an opening novelty checked by chess engines, but ends up losing the game. Fabien Letouzey releases the source code for Fruit 2.1, an engine quite competitive with the top closed-source engines of the time. This leads many authors to revise their code, incorporating the new ideas.
  • 2005 – Hydra defeats Michael Adams 5½–½.
  • 2005 – Rybka wins the IPCCC tournament and very quickly afterwards becomes the strongest engine.
  • 2006 – the world champion, Vladimir Kramnik, is defeated 4–2 by Deep Fritz.
  • 2007 – GM Larry Christiansen and IM Josh Waitzkin produce audio tutorials for Ubisoft Chessmaster Grandmaster Edition, cementing its popularity. The Computer Chess Club moves to Talkchess.com.
  • 2008 – On the Talkchess.com Forum, Zach Wegner called attention to the similarities between Rybka 1.0 and Fruit 2.1, intimating that Rybka is a Fruit clone.
  • 2009 – Pocket Fritz 4 wins Copa Mercosur 9½/10. A group of pseudonymous Russian programmers release the source code of Ippolit, an engine seemingly stronger than Rybka. This becomes the basis for the engines Robbolito and Ivanhoe, and many engine authors adopt ideas from it.
  • 2010 – Before the World Chess Championship 2010, Topalov prepares by sparring against the supercomputer Blue Gene, with 8,192 processors capable of 500 trillion (5 × 10¹⁴) floating-point operations per second. Rybka developer Vasik Rajlich accuses Ippolit of being a clone of Rybka.
  • 2011 – Engine programmers Stefan Meyer-Kahlen, Don Dailey, Shay Bushinsky (Junior) and others sign an open letter confirming that they believe Rybka is a clone of Fruit. The ICGA strips Rybka of its WCCC titles.
  • 2015 – BootChess, a computer implementation of chess only 487 bytes in size, is released.
  • 2015 – Super Micro becomes the smallest computer implementation of chess on any platform, at only 455 bytes.
  • 2017 – A computer engine finishes first in the freestyle Ultimate Challenge tournament. The highest-ranked human-plus-computer entrant comes in at third place.
  • 2017 – AlphaZero beats Stockfish 28-0, with 72 draws, in a 100-game match.

Availability

Computer chess IC bearing the name of developer Frans Morsch
 
Chess-playing computers and software came onto the market in the mid-1970s. There are many chess engines such as Stockfish, Crafty, Fruit and GNU Chess that can be downloaded from the Internet free of charge. Top programs such as Stockfish have surpassed even world champion caliber players.

Computer chess rating lists

CEGT, CSS, SSDF, and WBEC maintain rating lists allowing fans to compare the strength of engines. As of 3 February 2016, Stockfish is the top rated chess program on the IPON rating list.

CCRL (Computer Chess Rating Lists) is an organisation that tests computer chess engines' strength by playing the programs against each other. CCRL was founded in 2006 by Graham Banks, Ray Banks, Sarah Bird, Kirill Kryukov and Charles Smith, and as of June 2012 its members are Graham Banks, Ray Banks (who only participates in Chess960, or Fischer Random Chess), Shaun Brewer, Adam Hair, Aser Huerga, Kirill Kryukov, Denis Mendoza, Charles Smith and Gabor Szots.

The organisation runs three different lists: 40/40 (40 minutes for every 40 moves played), 40/4 (4 minutes for every 40 moves played), and 40/4 FRC (same time control but Chess960). Pondering (or permanent brain) is switched off and timing is adjusted to the AMD64 X2 4600+ (2.4 GHz) CPU by using Crafty 19.17 BH as a benchmark. Generic, neutral opening books are used (as opposed to the engine's own book) up to a limit of 12 moves into the game alongside 4 or 5 man tablebases.

Computers versus humans

Using "ends-and-means" heuristics a human chess player can intuitively determine optimal outcomes and how to achieve them regardless of the number of moves necessary, but a computer must be systematic in its analysis. Most players agree that looking at least five moves ahead (ten plies) when necessary is required to play well. Normal tournament rules give each player an average of three minutes per move. On average there are more than 30 legal moves per chess position, so a computer must examine a quadrillion possibilities to look ahead ten plies (five full moves); one that could examine a million positions a second would require more than 30 years.

After discovering refutation screening—the application of alpha-beta pruning to optimizing move evaluation—in 1957, a team at Carnegie Mellon University predicted that a computer would defeat the world human champion by 1967. It did not anticipate the difficulty of determining the right order to evaluate branches. Researchers worked to improve programs' ability to identify killer heuristics, unusually high-scoring moves to reexamine when evaluating other branches, but into the 1970s most top chess players believed that computers would not soon be able to play at a Master level. In 1968 International Master David Levy made a famous bet that no chess computer would be able to beat him within ten years, and in 1976 Senior Master and professor of psychology Eliot Hearst of Indiana University wrote that "the only way a current computer program could ever win a single game against a master player would be for the master, perhaps in a drunken stupor while playing 50 games simultaneously, to commit some once-in-a-year blunder".

In the late 1970s chess programs suddenly began defeating top human players. The year of Hearst's statement, Northwestern University's Chess 4.5 won the Class B section of the Paul Masson American Chess Championship, becoming the first computer to win a human tournament. Levy won his bet in 1978 by beating Chess 4.7, but the program achieved the first computer victory against a Master-class player at tournament level by winning one of the six games. By 1980 Belle was regularly defeating Masters, and by 1982 two programs played at Master level and three were slightly weaker.

The sudden improvement without a theoretical breakthrough surprised humans, who did not expect that Belle's ability to examine 100,000 positions a second—about eight plies—would be sufficient. The Spracklens, creators of the successful microcomputer program Sargon, estimated that 90% of the improvement came from faster evaluation speed and only 10% from improved evaluations. New Scientist stated in 1982 that computers "play terrible chess ... clumsy, inefficient, diffuse, and just plain ugly", but humans lost to them by making "horrible blunders, astonishing lapses, incomprehensible oversights, gross miscalculations, and the like" much more often than they realized; "in short, computers win primarily through their ability to find and exploit miscalculations in human initiatives".

By 1982, microcomputer chess programs could evaluate up to 1,500 moves a second and were as strong as mainframe chess programs of five years earlier, able to defeat almost all players. While only able to look ahead one or two plies more than at their debut in the mid-1970s, doing so improved their play more than experts expected; seemingly minor improvements "appear to have allowed the crossing of a psychological threshold, after which a rich harvest of human error becomes accessible", New Scientist wrote. While reviewing SPOC in 1984, BYTE wrote that "Computers—mainframes, minis, and micros—tend to play ugly, inelegant chess", but noted Robert Byrne's statement that "tactically they are freer from error than the average human player". The magazine described SPOC as a "state-of-the-art chess program" for the IBM PC with a "surprisingly high" level of play, and estimated its USCF rating as 1700 (Class B).

At the 1982 North American Computer Chess Championship, Monroe Newborn predicted that a chess program could become world champion within five years; tournament director and International Master Michael Valvo predicted ten years; the Spracklens predicted 15; Ken Thompson predicted more than 20; and others predicted that it would never happen. The most widely held opinion, however, was that it would occur around the year 2000. In 1989, Levy was defeated by Deep Thought in an exhibition match. Deep Thought, however, was still considerably below world championship level, as the then reigning world champion Garry Kasparov demonstrated with two strong wins against it in 1989. It was not until game 1 of his 1996 match with IBM's Deep Blue that Kasparov lost a game to a computer at tournament time controls, the first such loss by a reigning world champion. Kasparov regrouped, however, to win three and draw two of the remaining five games of the match, for a convincing victory.

In May 1997, an updated version of Deep Blue defeated Kasparov 3½–2½ in a return match. A documentary mainly about the confrontation was made in 2003, titled Game Over: Kasparov and the Machine. IBM maintains a web site about the event.


[Diagram: final position (FEN: 8/7R/5q1k/3Q2N1/3p4/PP3pPP/5n1K/4r3)]
 
With increasing processing power and improved evaluation functions, chess programs running on commercially available workstations began to rival top flight players. In 1998, Rebel 10 defeated Viswanathan Anand, who at the time was ranked second in the world, by a score of 5–3. However, most of those games were not played at normal time controls. Out of the eight games, four were blitz games (five minutes plus five seconds Fischer delay (see time control) for each move); these Rebel won 3–1. Two were semi-blitz games (fifteen minutes for each side) that Rebel also won 1½–½. Finally, two games were played as regular tournament games (forty moves in two hours, one hour sudden death); here it was Anand who won 1½–½. In fast games, computers played better than humans, but at classical time controls – at which a player's rating is determined – the advantage was not so clear.

In the early 2000s, commercially available programs such as Junior and Fritz were able to draw matches against former world champion Garry Kasparov and classical world champion Vladimir Kramnik.

In October 2002, Vladimir Kramnik and Deep Fritz competed in the eight-game Brains in Bahrain match, which ended in a draw. Kramnik won games 2 and 3 with "conventional" anti-computer tactics – playing conservatively for a long-term advantage that the computer is not able to see in its game-tree search. Fritz, however, won game 5 after a severe blunder by Kramnik. Game 6 was described by the tournament commentators as "spectacular." Kramnik, in a better position in the early middlegame, tried a piece sacrifice to achieve a strong tactical attack, a strategy known to be highly risky against computers, which are at their strongest defending against such attacks. True to form, Fritz found a watertight defense and Kramnik's attack petered out, leaving him in a bad position. Kramnik resigned the game, believing the position lost. However, post-game human and computer analysis has shown that the Fritz program was unlikely to have been able to force a win and that Kramnik effectively sacrificed a drawn position. The final two games were draws. Given the circumstances, most commentators still rate Kramnik the stronger player in the match.

In January 2003, Garry Kasparov played Junior, another chess computer program, in New York City. The match ended 3–3. 

In November 2003, Garry Kasparov played X3D Fritz. The match ended 2–2. 

In 2005, Hydra, a dedicated chess computer with custom hardware and sixty-four processors and also winner of the 14th IPCCC in 2005, defeated seventh-ranked Michael Adams 5½–½ in a six-game match (though Adams' preparation was far less thorough than Kramnik's for the 2002 series).

In November–December 2006, World Champion Vladimir Kramnik played Deep Fritz. This time the computer won; the match ended 2–4. Kramnik was able to view the computer's opening book. In the first five games Kramnik steered the game into a typical "anti-computer" positional contest. He lost one game (overlooking a mate in one), and drew the next four. In the final game, in an attempt to draw the match, Kramnik played the more aggressive Sicilian Defence and was crushed. 

There was speculation that interest in human-computer chess competition would plummet as a result of the 2006 Kramnik-Deep Fritz match. According to Newborn, for example, "the science is done".

Human-computer chess matches showed the best computer systems overtaking human chess champions in the late 1990s. For the 40 years prior to that, the trend had been that the best machines gained about 40 points per year in the Elo rating while the best humans only gained roughly 2 points per year. The highest rating obtained by a computer in human competition was Deep Thought's USCF rating of 2551 in 1988 and FIDE no longer accepts human-computer results in their rating lists. Specialized machine-only Elo pools have been created for rating machines, but such numbers, while similar in appearance, should not be directly compared. In 2016, the Swedish Chess Computer Association rated computer program Komodo at 3361. 

Chess engines continue to improve. By 2009, chess engines running on relatively slow hardware had reached the grandmaster level. A mobile phone won a category 6 tournament with a performance rating of 2898: chess engine Hiarcs 13, running inside Pocket Fritz 4 on the mobile phone HTC Touch HD, won the Copa Mercosur tournament in Buenos Aires, Argentina with 9 wins and 1 draw on August 4–14, 2009. Pocket Fritz 4 searches fewer than 20,000 positions per second. This is in contrast to supercomputers such as Deep Blue that searched 200 million positions per second.

Advanced Chess is a form of chess developed in 1998 by Kasparov in which a human plays against another human, and both have access to computers to enhance their strength. Kasparov argued that the resulting "advanced" player is stronger than a human or computer alone, and this has been borne out on numerous occasions at Freestyle Chess events. In 2017, a win by a computer engine in the freestyle Ultimate Challenge tournament was the source of a lengthy debate, in which the organisers declined to participate.

Players today are inclined to treat chess engines as analysis tools rather than opponents.

Implementation issues

The developers of a chess-playing computer system must decide on a number of fundamental implementation issues. These include:
  • Board representation – how a single position is represented in data structures;
  • Search techniques – how to identify the possible moves and select the most promising ones for further examination;
  • Leaf evaluation – how to evaluate the value of a board position, if no further search will be done from that position.
Computer chess programs usually support a number of common de facto standards. Nearly all of today's programs can read and write game moves as Portable Game Notation (PGN), and can read and write individual positions as Forsyth–Edwards Notation (FEN). Older chess programs often only understood long algebraic notation, but today users expect chess programs to understand standard algebraic chess notation.
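
To make the notation concrete, here is a minimal sketch of reading the piece-placement field of a FEN string into an 8×8 array. It handles only the first FEN field, uses no chess library, and the helper name is ours rather than part of any standard:

```python
# Minimal sketch: expand the piece-placement field of a FEN string into
# an 8x8 array of characters ('.' for empty squares). Only the first
# FEN field is handled; side to move, castling rights, en passant and
# move counters are ignored here.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def fen_to_board(fen: str):
    placement = fen.split()[0]
    board = []
    for rank in placement.split("/"):       # ranks 8 down to 1
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))   # a digit is a run of empty squares
            else:
                row.append(ch)              # a letter is a piece (upper case = White)
        board.append(row)
    return board

for row in fen_to_board(START_FEN):
    print(" ".join(row))
```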

Starting in the late 1990s, programmers began to develop the engine (a command-line program which calculates which moves are strongest in a position) separately from the graphical user interface (GUI), which provides the player with a visible chessboard and movable pieces. Engines communicate their moves to the GUI using a protocol such as the Chess Engine Communication Protocol (CECP) or the Universal Chess Interface (UCI). By dividing chess programs into these two pieces, developers can write only the user interface, or only the engine, without needing to write both parts of the program. (See also chess engines.)
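
As an illustration of that split, the sketch below plays the GUI's role and drives a UCI engine over standard input/output. The commands (uci, isready, position, go) and replies (uciok, readyok, bestmove) belong to the UCI protocol; the assumption that an engine binary called "stockfish" is available on the PATH is ours, and any UCI engine would do:

```python
# Sketch of the GUI side of a UCI conversation, assuming a UCI engine
# binary named "stockfish" is on the PATH.
import subprocess

engine = subprocess.Popen(
    ["stockfish"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    universal_newlines=True,
)

def send(command: str) -> None:
    engine.stdin.write(command + "\n")
    engine.stdin.flush()

def wait_for(prefix: str) -> str:
    while True:
        line = engine.stdout.readline().strip()
        if line.startswith(prefix):
            return line

send("uci")                                # handshake; engine lists options, then "uciok"
wait_for("uciok")
send("isready")
wait_for("readyok")
send("position startpos moves e2e4 e7e5")  # set up a position by moves from the start
send("go depth 12")                        # ask for a 12-ply search
print(wait_for("bestmove"))                # e.g. "bestmove g1f3 ponder b8c6"
send("quit")
```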

Developers have to decide whether to connect the engine to an opening book and/or endgame tablebases or leave this to the GUI.

Board representations

The data structure used to represent each chess position is key to the performance of move generation and position evaluation. Methods include pieces stored in an array ("mailbox" and "0x88"), piece positions stored in a list ("piece list"), collections of bit-sets for piece locations ("bitboards"), and Huffman-coded positions for compact long-term storage.
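
A minimal sketch of the bitboard idea, assuming the common a1 = 0 … h8 = 63 square numbering (the details vary between engines):

```python
# Sketch of the bitboard idea: one 64-bit integer per piece type and
# colour, with bit n standing for square n (a1 = 0 ... h8 = 63).
def square(file: int, rank: int) -> int:    # file and rank are 0-based
    return rank * 8 + file

white_pawns = 0
for f in range(8):                          # starting position: pawns on the second rank
    white_pawns |= 1 << square(f, 1)

FULL_BOARD = 0xFFFFFFFFFFFFFFFF
empty = ~white_pawns & FULL_BOARD           # pretend every other square is empty
single_pushes = (white_pawns << 8) & empty  # advancing one rank is a shift by 8

print(bin(single_pushes).count("1"))        # 8: each pawn has a one-square push
```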

Search techniques

The first paper on the subject was by Claude Shannon in 1950. He predicted the two main possible search strategies which would be used, which he labeled "Type A" and "Type B", before anyone had programmed a computer to play chess. 

Type A programs would use a "brute force" approach, examining every possible position for a fixed number of moves using the minimax algorithm. Shannon believed this would be impractical for two reasons. 

First, with approximately thirty moves possible in a typical real-life position, he expected that searching the approximately 10⁹ positions involved in looking three moves ahead for both sides (six plies) would take about sixteen minutes, even in the "very optimistic" case that the chess computer evaluated a million positions every second. (It took about forty years to achieve this speed.) 

Second, it ignored the problem of quiescence, trying to only evaluate a position that is at the end of an exchange of pieces or other important sequence of moves ('lines'). He expected that adapting type A to cope with this would greatly increase the number of positions needing to be looked at and slow the program down still further. 

Instead of wasting processing power examining bad or trivial moves, Shannon suggested that "type B" programs would use two improvements:
  1. Employ a quiescence search.
  2. Only look at a few good moves for each position.
This would enable them to look further ahead ('deeper') at the most significant lines in a reasonable time. The test of time has borne out the first approach; all modern programs employ a terminal quiescence search before evaluating positions. The second approach (now called forward pruning) has been dropped in favor of search extensions. 
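
The following schematic sketch shows how a terminal quiescence search fits inside a depth-limited alpha-beta (negamax) search. It is written against a hypothetical position object with evaluate(), legal_moves(), capture_moves(), make() and unmake() methods, and it omits details such as mate scores, draw detection and move ordering:

```python
# Schematic negamax search with a terminal quiescence search, in the
# spirit of the "full-width search plus quiescence" design described
# above. `pos` is a hypothetical position object assumed to provide
# evaluate(), legal_moves(), capture_moves(), make(move) and
# unmake(move).
INF = 10 ** 9

def quiescence(pos, alpha, beta):
    stand_pat = pos.evaluate()             # static score if we stop searching here
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for move in pos.capture_moves():       # only "noisy" moves are examined further
        pos.make(move)
        score = -quiescence(pos, -beta, -alpha)
        pos.unmake(move)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

def search(pos, depth, alpha=-INF, beta=INF):
    if depth == 0:
        return quiescence(pos, alpha, beta)   # never evaluate in the middle of an exchange
    for move in pos.legal_moves():
        pos.make(move)
        score = -search(pos, depth - 1, -beta, -alpha)
        pos.unmake(move)
        if score >= beta:
            return beta                       # beta cutoff: remaining moves are pruned
        alpha = max(alpha, score)
    return alpha
```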

Adriaan de Groot interviewed a number of chess players of varying strengths, and concluded that both masters and beginners look at around forty to fifty positions before deciding which move to play. What makes the former much better players is that they use pattern recognition skills built from experience. This enables them to examine some lines in much greater depth than others by simply not considering moves they can assume to be poor.

More evidence for this being the case is the way that good human players find it much easier to recall positions from genuine chess games, breaking them down into a small number of recognizable sub-positions, rather than completely random arrangements of the same pieces. In contrast, poor players have the same level of recall for both. 

The problem with type B is that it relies on the program being able to decide which moves are good enough to be worthy of consideration ('plausible') in any given position and this proved to be a much harder problem to solve than speeding up type A searches with superior hardware and search extension techniques. 

One of the few chess grandmasters to devote himself seriously to computer chess was former World Chess Champion Mikhail Botvinnik, who wrote several works on the subject. He also held a doctorate in electrical engineering. Working with relatively primitive hardware available in the Soviet Union in the early 1960s, Botvinnik had no choice but to investigate software move selection techniques; at the time only the most powerful computers could achieve much beyond a three-ply full-width search, and Botvinnik had no such machines. In 1965 Botvinnik was a consultant to the ITEP team in a US-Soviet computer chess match. 

One developmental milestone occurred when the team from Northwestern University, which was responsible for the Chess series of programs and won the first three ACM Computer Chess Championships (1970–72), abandoned type B searching in 1973. The resulting program, Chess 4.0, won that year's championship and its successors went on to come in second in both the 1974 ACM Championship and that year's inaugural World Computer Chess Championship, before winning the ACM Championship again in 1975, 1976 and 1977.

One reason they gave for the switch was that they found it less stressful during competition, because it was difficult to anticipate which moves their type B programs would play, and why. They also reported that type A was much easier to debug in the four months they had available and turned out to be just as fast: in the time it used to take to decide which moves were worthy of being searched, it was possible just to search all of them. 

In fact, Chess 4.0 set the paradigm that is still followed by essentially all modern chess programs. Chess 4.0-type programs won out for the simple reason that they played better chess. Such programs did not try to mimic human thought processes, but relied on full-width alpha-beta and negascout searches. Most such programs (including all modern programs today) also included a fairly limited selective part of the search based on quiescence searches, and usually extensions and pruning (particularly null-move pruning from the 1990s onwards) which were triggered under certain conditions in an attempt to weed out or reduce obviously bad moves (history moves) or to investigate interesting nodes (e.g. check extensions, passed pawns on the seventh rank, etc.). Extension and pruning triggers have to be used very carefully, however. Over-extend and the program wastes too much time looking at uninteresting positions; prune too much and there is a risk of cutting out interesting nodes. Chess programs differ in how and what types of pruning and extension rules are included, as well as in the evaluation function. Some programs are believed to be more selective than others (for example, Deep Blue was known to be less selective than most commercial programs because its team could afford to do more complete full-width searches), but all have a base full-width search as a foundation and all have some selective components (quiescence search, pruning/extensions). 

Though such additions meant that the program did not truly examine every node within its search depth (so it was not truly brute force in that sense), the rare mistakes caused by these selective searches were found to be worth the extra time saved, because the program could search deeper. In that way chess programs can get the best of both worlds.

Furthermore, technological advances by orders of magnitude in processing power have made the brute force approach far more incisive than was the case in the early years. The result is that a very solid, tactical AI player, aided by some limited positional knowledge built in by the evaluation function and pruning/extension rules, began to match the best players in the world. It turned out to produce excellent results, at least in the field of chess, to let computers do what they do best (calculate) rather than coax them into imitating human thought processes and knowledge. In 1997 Deep Blue defeated World Champion Garry Kasparov, marking the first time a computer had defeated a reigning world chess champion in a match at standard time controls. 

Computer chess programs consider chess moves as a game tree. In theory, they examine all moves, then all counter-moves to those moves, then all moves countering them, and so on, where each individual move by one player is called a "ply". This evaluation continues until a certain maximum search depth is reached or the program determines that a final "leaf" position (e.g. checkmate) has been reached.
A naive implementation of this approach can only search to a small depth in a practical amount of time, so various methods have been devised to greatly speed the search for good moves. 

The AlphaZero program uses a variant of Monte Carlo tree search without rollout.

Leaf evaluation

For most chess positions, computers cannot look ahead to all possible final positions. Instead, they must look ahead a few plies and compare the possible positions, known as leaves. The algorithm that evaluates leaves is termed the "evaluation function", and these algorithms are often vastly different between different chess programs. 

Evaluation functions typically evaluate positions in hundredths of a pawn (called a centipawn), and consider material value along with other factors affecting the strength of each side. When counting up the material for each side, typical values for pieces are 1 point for a pawn, 3 points for a knight or bishop, 5 points for a rook, and 9 points for a queen. (See Chess piece relative value.) The king is sometimes given an arbitrary high value such as 200 points (Shannon's paper) or 1,000,000,000 points (1961 USSR program) to ensure that a checkmate outweighs all other factors (Levy & Newborn 1991:45). By convention, a positive evaluation favors White, and a negative evaluation favors Black. 

In addition to points for pieces, most evaluation functions take many other factors into account, such as pawn structure, the fact that a pair of bishops is usually worth a little more than the sum of its parts, that centralized pieces are worth more, and so on. The protection of kings is usually considered, as well as the phase of the game (opening, middlegame or endgame).
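
A minimal evaluation sketch along those lines, combining the conventional material values quoted above with a token centralization bonus; the dictionary-based position format is an assumption made for brevity, not a standard:

```python
# Minimal evaluation sketch in centipawns. The position format (a dict
# from 0-based (file, rank) squares to piece letters, upper case for
# White) is an assumption made for brevity.
PIECE_VALUE = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}
CENTER = {(3, 3), (3, 4), (4, 3), (4, 4)}        # d4, e4, d5, e5

def evaluate(position: dict) -> int:
    """Positive scores favour White, negative favour Black."""
    score = 0
    for sq, piece in position.items():
        value = PIECE_VALUE[piece.upper()]
        if piece.upper() != "K" and sq in CENTER:
            value += 10                          # small bonus for a centralized piece
        score += value if piece.isupper() else -value
    return score

# White is up a knight for a pawn: evaluation is +200 centipawns.
print(evaluate({(0, 0): "K", (1, 1): "N", (7, 7): "k", (6, 6): "p"}))
```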

Endgame tablebases

Endgame play had long been one of the great weaknesses of chess programs, because of the depth of search needed. Some otherwise master-level programs were unable to win in positions where even intermediate human players can force a win. 

To solve this problem, computers have been used to analyze some chess endgame positions completely, starting with king and pawn against king. Such endgame tablebases are generated in advance using a form of retrograde analysis, starting with positions where the final result is known (e.g., where one side has been mated) and seeing which other positions are one move away from them, then which are one move from those, etc. Ken Thompson was a pioneer in this area.

The results of the computer analysis sometimes surprised people. In 1977 Thompson's Belle chess machine used the endgame tablebase for a king and rook against king and queen and was able to draw that theoretically lost ending against several masters. This was despite not following the usual strategy to delay defeat by keeping the defending king and rook close together for as long as possible. Asked to explain the reasons behind some of the program's moves, Thompson was unable to do so beyond saying the program's database simply returned the best moves. 

Most grandmasters declined to play against the computer in the queen versus rook endgame, but Walter Browne accepted the challenge. A queen versus rook position was set up in which the queen can win in thirty moves, with perfect play. Browne was allowed 2½ hours to play fifty moves, otherwise a draw would be claimed under the fifty-move rule. After forty-five moves, Browne agreed to a draw, being unable to force checkmate or win the rook within the next five moves. In the final position, Browne was still seventeen moves away from checkmate, but not quite that far away from winning the rook. Browne studied the endgame, and played the computer again a week later in a different position in which the queen can win in thirty moves. This time, he captured the rook on the fiftieth move, giving him a winning position. 

Other positions, long believed to be won, turned out to take more moves against perfect play to actually win than were allowed by chess's fifty-move rule. As a consequence, for some years the official FIDE rules of chess were changed to extend the number of moves allowed in these endings. After a while the rule reverted to fifty moves in all positions: more such positions were discovered, complicating the rule still further, and the extensions made no difference in human play, since humans could not play the positions perfectly anyway.

Over the years, other endgame database formats have been released including the Edward Tablebase, the De Koning Database and the Nalimov Tablebase which is used by many chess programs such as Rybka, Shredder and Fritz. Tablebases for all positions with six pieces are available. Some seven-piece endgames have been analyzed by Marc Bourzutschky and Yakov Konoval. Programmers using the Lomonosov supercomputers in Moscow have completed a chess tablebase for all endgames with seven pieces or fewer (trivial endgame positions are excluded, such as six white pieces versus a lone black king). In all of these endgame databases it is assumed that castling is no longer possible. 
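
As an example of how a program consumes such databases, the sketch below probes Syzygy tablebases (a more recent format than the Nalimov tables mentioned above) using the third-party python-chess library; the directory path is hypothetical and the tablebase files must be obtained separately:

```python
# Sketch of probing endgame tablebases from a program, assuming the
# python-chess library and a local directory of Syzygy files.
import chess
import chess.syzygy

board = chess.Board("8/8/8/8/8/2k5/2P5/2K5 w - - 0 1")   # king and pawn vs king

with chess.syzygy.open_tablebase("path/to/syzygy") as tablebase:
    wdl = tablebase.probe_wdl(board)   # 2 = win, 0 = draw, -2 = loss, from the mover's view
    dtz = tablebase.probe_dtz(board)   # distance to the next capture or pawn move
    print(wdl, dtz)
```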

Many tablebases do not consider the fifty-move rule, under which a game where fifty moves pass without a capture or pawn move can be claimed to be a draw by either player. This results in the tablebase returning results such as "forced mate in sixty-six moves" in some positions which would actually be drawn because of the fifty-move rule. One reason for this is that if the rules of chess were to be changed once more, giving more time to win such positions, it would not be necessary to regenerate all the tablebases. It is also very easy for a program using the tablebases to notice and take account of this 'feature', and in any case a program using an endgame tablebase will choose the move that leads to the quickest win (even if that move would fall foul of the fifty-move rule with perfect play). If playing an opponent not using a tablebase, such a choice will give good chances of winning within fifty moves.

The Nalimov tablebases, which use state-of-the-art compression techniques, require 7.05 GB of hard disk space for all five-piece endings. To cover all the six-piece endings requires approximately 1.2 TB. It is estimated that a seven-piece tablebase requires between 50 and 200 TB of storage space.

Endgame databases featured prominently in 1999, when Kasparov played an exhibition match on the Internet against the rest of the world. A seven-piece queen-and-pawn endgame was reached with the World Team fighting to salvage a draw. Eugene Nalimov helped by generating the six-piece ending tablebase in which both sides had two queens, which was used heavily to aid analysis by both sides.

Other optimizations

Many other optimizations can be used to make chess-playing programs stronger. For example, transposition tables are used to record positions that have been previously evaluated, to save recalculation of them. Refutation tables record key moves that "refute" what appears to be a good move; these are typically tried first in variant positions (since a move that refutes one position is likely to refute another). Opening books aid computer programs by giving common openings that are considered good play (and good ways to counter poor openings). Many chess engines use pondering to increase their strength. 
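
A minimal sketch of the transposition-table idea follows. Real engines key the table with Zobrist hashes, bound its size with replacement schemes, and record whether a stored score is exact or only a bound; here pos.key() is a hypothetical stable hash of the position:

```python
# Minimal transposition-table sketch: cache search results by position
# key so that transposed positions are not searched twice. This is a
# simplification; see the caveats in the lead-in above.
transposition_table = {}   # position key -> (depth, score)

def cached_search(pos, depth, alpha, beta, search_fn):
    entry = transposition_table.get(pos.key())
    if entry is not None and entry[0] >= depth:
        return entry[1]                  # reuse a result from an equal or deeper search
    score = search_fn(pos, depth, alpha, beta)
    transposition_table[pos.key()] = (depth, score)
    return score
```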

Of course, faster hardware and additional processors can improve chess-playing program abilities, and some systems (such as Deep Blue) use specialized chess hardware instead of only software. Another way to examine more chess positions is to distribute the analysis of positions to many computers. The ChessBrain project was a chess program that distributed the search tree computation through the Internet. In 2004 the ChessBrain played chess using 2,070 computers.

Playing strength versus computer speed

It has been estimated that doubling the computer speed gains approximately fifty to seventy Elo points in playing strength.
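
Applying that rule of thumb, k doublings of speed are worth roughly k × 50–70 Elo; the 60-point midpoint used in this small sketch is only an illustrative assumption:

```python
# Each doubling of speed is worth roughly 50-70 Elo (per the estimate
# above), so k doublings are worth about k times that. The 60-point
# midpoint is an assumption made for illustration.
import math

def estimated_elo_gain(speedup: float, per_doubling: float = 60.0) -> float:
    return math.log2(speedup) * per_doubling

print(round(estimated_elo_gain(2)))    # ~60  (one doubling)
print(round(estimated_elo_gain(16)))   # ~240 (four doublings)
```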

Chess variants

"Chess on an Infinite Plane" is an example of a variant chess game largely unaffected by chess computers or software.
 
Chess engines have been developed to play some chess variants such as Capablanca Chess, but the engines are almost never directly integrated with specific hardware. Even for the software that has been developed, most will not play chess beyond a certain board size, so games played on an unbounded chessboard (infinite chess) remain virtually untouched by both chess computers and software.

Categorizations

Dedicated hardware

These chess playing systems include custom hardware or run on supercomputers.

Commercial dedicated computers

In the 1980s and early 1990s, there was a competitive market for dedicated chess computers. This market changed in the mid-90s when computers with dedicated processors could no longer compete with the fast processors in personal computers. Nowadays, most dedicated units sold are of beginner and intermediate strength.
  • Chess Challenger, a line of chess computers sold by Fidelity Electronics from 1977 to 1992. These models won the first four World Microcomputer Chess Championships.
  • ChessMachine, an ARM-based dedicated computer, which could run two different engines.
  • Excalibur Electronics sells a line of beginner strength units.
  • Mephisto, a line of chess computers sold by Hegener & Glaser. The units won six consecutive World Microcomputer Chess Championships.
  • Novag sold a line of tactically strong computers, including the Constellation, Sapphire, and Star Diamond brands.
  • Phoenix Chess Systems makes limited edition units based around StrongARM and XScale processors running modern engines and emulating classic engines.
  • Saitek sells mid-range units of intermediate strength. They bought out Hegener & Glaser and its Mephisto brand in 1994.
Recently, some hobbyists have been using the Multi Emulator Super System to run the chess programs created for Fidelity or Hegener & Glaser's Mephisto computers on modern 64-bit operating systems such as Windows 10. The author of Rebel, Ed Schröder, has also adapted three of the Hegener & Glaser Mephisto programs he wrote to work as UCI engines.

Historical

These chess programs run on obsolete hardware.

DOS programs

These programs can be run on MS-DOS, and can be run on 64-bit Windows 10 via emulators such as DOSBox or QEMU.

Types and features of chess software

Perhaps the most common type of chess software is the program that simply plays chess. You make a move on the board, and the AI calculates and plays a response, and back and forth until one player resigns. Sometimes the chess engine, which calculates the moves, and the graphical user interface (GUI) are separate programs. A variety of engines can be imported into the GUI, so that you can play against different styles. Engines often have just a simple text command-line interface, while GUIs may offer a variety of piece sets, board styles or even 3D or animated pieces. Because recent engines are so strong, engines or GUIs may offer some way of limiting the engine's strength, so the player has a better chance of winning. Universal Chess Interface (UCI) engines such as Fritz or Rybka may have a built-in mechanism for reducing the Elo rating of the engine (via UCI's uci_limitstrength and uci_elo parameters). Some versions of Fritz have a Handicap and Fun mode for limiting the current engine, changing the percentage of mistakes it makes, or changing its style. Fritz also has a Friend Mode where during the game it tries to match the level of the player. 
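
For the UCI mechanism mentioned above, these are the kinds of raw commands a GUI sends to the engine to cap its strength (the option names are conventionally spelled UCI_LimitStrength and UCI_Elo; the 1500 target is only an example):

```python
# The raw UCI commands a GUI might write to an engine's standard input
# to cap its playing strength, as in the protocol sketch earlier.
limit_strength_commands = [
    "setoption name UCI_LimitStrength value true",
    "setoption name UCI_Elo value 1500",
    "ucinewgame",
    "position startpos",
    "go movetime 2000",                 # think for two seconds on the first move
]
for command in limit_strength_commands:
    print(command)
```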

Chess databases allow users to search through a large library of historical games, analyze them, check statistics, and draw up an opening repertoire. Chessbase (for PC) is perhaps the most common program for this amongst professional players, but there are alternatives such as Shane's Chess Information Database (Scid) for Windows, Mac or Linux, Chess Assistant for PC, Gerhard Kalab's Chess PGN Master for Android and Giordano Vicoli's Chess-Studio for iOS.

Programs such as Playchess allow you to play games against other players over the internet.

Chess training programs teach chess. Chessmaster had playthrough tutorials by IM Josh Waitzkin and GM Larry Christiansen. Stefan Meyer-Kahlen offers Shredder Chess Tutor based on the Step coursebooks of Rob Brunia and Cor Van Wijgerden. World champion Magnus Carlsen's Play Magnus company recently released a Magnus Trainer app for Android and iOS. Chessbase has Fritz and Chesster for children. Convekta provides a large number of training apps such as CT-ART and its Chess King line based on tutorials by GM Alexander Kalinin and Maxim Blokh. 

Notable theorists

Well-known computer chess theorists include Claude Shannon, Alan Turing, Mikhail Botvinnik and Ken Thompson, all of whom are discussed above.

Solving chess

The prospects of completely solving chess are generally considered to be rather remote. It is widely conjectured that there is no computationally inexpensive method to solve chess even in the very weak sense of determining with certainty the value of the initial position, and hence the idea of solving chess in the stronger sense of obtaining a practically usable description of a strategy for perfect play for either side seems unrealistic today. However, it has not been proven that no computationally cheap way of determining the best move in a chess position exists, nor even that a traditional alpha-beta searcher running on present-day computing hardware could not solve the initial position in an acceptable amount of time. The difficulty in proving the latter lies in the fact that, while the number of board positions that could happen in the course of a chess game is huge (on the order of at least 10⁴³ to 10⁴⁷), it is hard to rule out with mathematical certainty the possibility that the initial position allows either side to force a mate or a threefold repetition after relatively few moves, in which case the search tree might encompass only a very small subset of the set of possible positions. It has been mathematically proven that generalized chess (chess played with an arbitrarily large number of pieces on an arbitrarily large chessboard) is EXPTIME-complete, meaning that determining the winning side in an arbitrary position of generalized chess provably takes exponential time in the worst case; however, this theoretical result gives no lower bound on the amount of work required to solve ordinary 8×8 chess. 

Gardner's Minichess, played on a 5×5 board with approximately 10¹⁸ possible board positions, has been solved; its game-theoretic value is ½ (i.e. a draw can be forced by either side), and the forcing strategy to achieve that result has been described. 

Progress has also been made from the other side: as of 2012, all 7 and fewer piece (2 kings and up to 5 other pieces) endgames have been solved.

Chess engines

A "chess engine" is software that calculates and orders which moves are the strongest to play in a given position. Engine authors focus on improving the play of their engines, often just importing the engine into a graphical user interface (GUI) developed by someone else. Engines communicate with the GUI by following standardized protocols such as the Universal Chess Interface developed by Stefan Meyer-Kahlen and Franz Huber or the Chess Engine Communication Protocol developed by Tim Mann for GNU Chess and Winboard. Chessbase has its own proprietary protocol, and at one time Millennium 2000 had another protocol used for ChessGenius. Engines designed for one operating system and protocol may be ported to other OS's or protocols.

Chess web apps

In 1997, the Internet Chess Club released its first Java client for playing chess online against other people inside one's web browser. This was probably one of the first chess web apps. The Free Internet Chess Server followed soon after with a similar client. In 2004, the International Correspondence Chess Federation opened a web server to replace its email-based system. Chess.com started offering Live Chess in 2007. Chessbase/Playchess had long had a downloadable client, but added a web interface by 2013.

Another popular web app is tactics training. The now defunct Chess Tactics Server opened its site in 2006, followed by Chesstempo the next year, and Chess.com added its Tactics Trainer in 2008. Chessbase added a tactics trainer web app in 2015.

Chessbase took their chess game database online in 1998. Another early chess game database was Chess Lab, which started in 1999. New In Chess had initially tried to compete with Chessbase by releasing a NICBase program for Windows 3.x, but eventually decided to give up on software and instead focus on their online database, starting in 2002.

One could play against the engine Shredder online from 2006. In 2015, Chessbase added a play Fritz web app, as well as My Games for storing one's games.

Starting in 2007, Chess.com offered the content of the training program, Chess Mentor, to their customers online.  Top GMs such as Sam Shankland and Walter Browne have contributed lessons.

Watson (computer)

From Wikipedia, the free encyclopedia

Watson's avatar, inspired by the IBM "smarter planet" logo
 
Watson is a question-answering computer system capable of answering questions posed in natural language, developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first CEO, industrialist Thomas J. Watson.

The computer system was initially developed to answer questions on the quiz show Jeopardy! and, in 2011, the Watson computer system competed on Jeopardy! against legendary champions Brad Rutter and Ken Jennings, winning the first-place prize of $1 million.

In February 2013, IBM announced that Watson software system's first commercial application would be for utilization management decisions in lung cancer treatment at Memorial Sloan Kettering Cancer Center, New York City, in conjunction with health insurance company WellPoint. IBM Watson's former business chief, Manoj Saxena, says that 90% of nurses in the field who use Watson now follow its guidance.

Description

The high-level architecture of IBM's DeepQA used in Watson
 
Watson was created as a question answering (QA) computing system that IBM built to apply advanced natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning technologies to the field of open domain question answering.
The key difference between QA technology and document search is that document search takes a keyword query and returns a list of documents, ranked in order of relevance to the query (often based on popularity and page ranking), while QA technology takes a question expressed in natural language, seeks to understand it in much greater detail, and returns a precise answer to the question.
When created, IBM stated that "more than 100 different techniques are used to analyze natural language, identify sources, find and generate hypotheses, find and score evidence, and merge and rank hypotheses."

In recent years, the Watson capabilities have been extended and the way in which Watson works has been changed to take advantage of new deployment models (Watson on IBM Cloud) and evolved machine learning capabilities and optimised hardware available to developers and researchers. It is no longer purely a question answering (QA) computing system designed from Q&A pairs but can now 'see', 'hear', 'read', 'talk', 'taste', 'interpret', 'learn' and 'recommend'.

Software

Watson uses IBM's DeepQA software and the Apache UIMA (Unstructured Information Management Architecture) framework implementation. The system was written in various languages, including Java, C++, and Prolog, and runs on the SUSE Linux Enterprise Server 11 operating system using the Apache Hadoop framework to provide distributed computing.

Hardware

The system is workload-optimized, integrating massively parallel POWER7 processors and built on IBM's DeepQA technology, which it uses to generate hypotheses, gather massive evidence, and analyze data. Watson employs a cluster of ninety IBM Power 750 servers, each of which uses a 3.5 GHz POWER7 eight-core processor, with four threads per core. In total, the system has 2,880 POWER7 processor threads and 16 terabytes of RAM.
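
A quick back-of-the-envelope check of the thread count cited above (a small Python calculation, using only the figures given in the text):

    servers = 90             # IBM Power 750 servers in the cluster
    cores_per_processor = 8  # each server uses one eight-core POWER7 processor
    threads_per_core = 4     # four hardware threads per core
    print(servers * cores_per_processor * threads_per_core)  # 2880, matching the figure above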

According to John Rennie, Watson can process 500 gigabytes, the equivalent of a million books, per second. IBM's master inventor and senior consultant, Tony Pearson, estimated Watson's hardware cost at about three million dollars. Its Linpack performance stands at 80 TeraFLOPs, which is about half as fast as the cut-off line for the Top 500 Supercomputers list. According to Rennie, all content was stored in Watson's RAM for the Jeopardy game because data stored on hard drives would be too slow to be competitive with human Jeopardy champions.

Data

The sources of information for Watson include encyclopedias, dictionaries, thesauri, newswire articles and literary works. Watson also used databases, taxonomies and ontologies. Specifically, DBPedia, WordNet and Yago were used. The IBM team provided Watson with millions of documents, including dictionaries, encyclopedias and other reference material that it could use to build its knowledge.

Operation

The computer's techniques for unravelling Jeopardy! clues sounded just like mine. That machine zeroes in on keywords in a clue then combs its memory (in Watson's case, a 15-terabyte databank of human knowledge) for clusters of associations with those words. It rigorously checks the top hits against all the contextual information it can muster: the category name; the kind of answer being sought; the time, place, and gender hinted at in the clue; and so on. And when it feels "sure" enough, it decides to buzz. This is all an instant, intuitive process for a human Jeopardy! player, but I felt convinced that under the hood my brain was doing more or less the same thing.
— Ken Jennings

Watson parses questions into different keywords and sentence fragments in order to find statistically related phrases. Watson's main innovation was not in the creation of a new algorithm for this operation but rather its ability to quickly execute hundreds of proven language analysis algorithms simultaneously. The more algorithms that find the same answer independently, the more likely Watson is to be correct. Once Watson has a small number of potential solutions, it is able to check against its database to ascertain whether the solution makes sense or not.
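
The paragraph above describes favoring answers that many independent analyses agree on. The Python sketch below shows only that voting idea; the analyzers and the confidence formula are toy assumptions, not Watson's actual algorithms.

    from collections import Counter

    def combine_analyses(question, analyzers):
        """Score candidate answers by how many independent analyzers agree.

        Each analyzer maps a question to a candidate answer (or None). The
        confidence is simply the fraction of analyzers that agree, standing
        in for Watson's far richer evidence merging and ranking.
        """
        results = [analyze(question) for analyze in analyzers]
        votes = Counter(r for r in results if r is not None)
        if not votes:
            return None, 0.0
        answer, count = votes.most_common(1)[0]
        return answer, count / len(analyzers)

    # Hypothetical analyzers, each using a different (toy) strategy.
    analyzers = [
        lambda q: "the 1920s" if "flappers" in q.lower() else None,
        lambda q: "the 1920s" if "prohibition" in q.lower() else None,
        lambda q: "the 1960s" if "flappers" in q.lower() else None,
    ]
    print(combine_analyses("Flappers and Prohibition marked this decade", analyzers))
    # ('the 1920s', 0.666...) -- two of the three analyzers agree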

Comparison with human players

Ken Jennings, Watson, and Brad Rutter in their Jeopardy! exhibition match.
 
Watson's basic working principle is to parse keywords in a clue while searching for related terms as responses. This gives Watson some advantages and disadvantages compared with human Jeopardy! players. Watson has deficiencies in understanding the contexts of the clues. As a result, human players usually generate responses faster than Watson, especially to short clues. Watson's programming prevents it from using the popular tactic of buzzing before it is sure of its response. Watson has consistently better reaction time on the buzzer once it has generated a response, and is immune to human players' psychological tactics, such as jumping between categories on every clue.

In a sequence of 20 mock games of Jeopardy!, human participants were able to use the six to seven seconds, on average, that Watson needed to hear the clue and decide whether to signal for responding. During that time, Watson also has to evaluate the response and determine whether it is sufficiently confident in the result to signal. Part of the system used to win the Jeopardy! contest was the electronic circuitry that receives the "ready" signal and then examines whether Watson's confidence level is great enough to activate the buzzer. Given the speed of this circuitry compared to the speed of human reaction times, Watson's reaction time was faster than the human contestants' except when the human anticipated (instead of reacted to) the ready signal. After signaling, Watson speaks with an electronic voice and gives the responses in Jeopardy!'s question format. Watson's voice was synthesized from recordings that actor Jeff Woodman made for an IBM text-to-speech program in 2004.
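
The buzz decision described above reduces to a confidence gate triggered by the ready signal. The Python sketch below captures only that decision logic; the threshold value and function names are illustrative assumptions, not the actual circuitry or its parameters.

    BUZZ_THRESHOLD = 0.50  # illustrative cutoff; the real threshold is not given in the text

    def on_ready_signal(best_response, confidence, press_buzzer):
        """Buzz only if the best response clears the confidence threshold."""
        if confidence >= BUZZ_THRESHOLD:
            press_buzzer()       # in hardware this step was the electronic buzzer circuit
            return best_response
        return None              # otherwise stay silent and let the human players answer

    answer = on_ready_signal("What is Chicago?", 0.86, press_buzzer=lambda: print("BUZZ"))
    print(answer)  # prints BUZZ, then the response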

The Jeopardy! staff used different means to notify Watson and the human players when to buzz, which was critical in many rounds. The humans were notified by a light, which took them tenths of a second to perceive. Watson was notified by an electronic signal and could activate the buzzer within about eight milliseconds. The humans tried to compensate for the perception delay by anticipating the light, but the variation in the anticipation time was generally too great to fall within Watson's response time. Watson did not attempt to anticipate the notification signal.

History

Development

Since Deep Blue's victory over Garry Kasparov in chess in 1997, IBM had been on the hunt for a new challenge. In 2004, IBM Research manager Charles Lickel, over dinner with coworkers, noticed that the restaurant they were in had fallen silent. He soon discovered the cause of this evening hiatus: Ken Jennings, who was then in the middle of his successful 74-game run on Jeopardy!. Nearly the entire restaurant had piled toward the televisions, mid-meal, to watch the phenomenon. Intrigued by the quiz show as a possible challenge for IBM, Lickel passed the idea on, and in 2005, IBM Research executive Paul Horn backed Lickel up, pushing for someone in his department to take up the challenge of playing Jeopardy! with an IBM system. Though he initially had trouble finding any research staff willing to take on what looked to be a much more complex challenge than the wordless game of chess, eventually David Ferrucci took him up on the offer. In competitions managed by the United States government, Watson's predecessor, a system named Piquant, was usually able to respond correctly to only about 35% of clues and often required several minutes to respond. To compete successfully on Jeopardy!, Watson would need to respond in no more than a few seconds, and at that time, the problems posed by the game show were deemed to be impossible to solve.

In initial tests run during 2006 by David Ferrucci, the senior manager of IBM's Semantic Analysis and Integration department, Watson was given 500 clues from past Jeopardy! programs. While the best real-life competitors buzzed in half the time and responded correctly to as many as 95% of clues, Watson's first pass could get only about 15% correct. During 2007, the IBM team was given three to five years and a staff of 15 people to solve the problems. By 2008, the developers had advanced Watson such that it could compete with Jeopardy! champions. By February 2010, Watson could beat human Jeopardy! contestants on a regular basis.

During the game, Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage, including the full text of the 2011 edition of Wikipedia, but was not connected to the Internet. For each clue, Watson's three most probable responses were displayed on the television screen. Watson consistently outperformed its human opponents on the game's signaling device, but had trouble in a few categories, notably those having short clues containing only a few words.

Although the system is primarily an IBM effort, Watson's development involved faculty and graduate students from Rensselaer Polytechnic Institute, Carnegie Mellon University, University of Massachusetts Amherst, the University of Southern California's Information Sciences Institute, the University of Texas at Austin, the Massachusetts Institute of Technology, and the University of Trento, as well as students from New York Medical College.

Jeopardy!

Preparation

Watson demo at an IBM booth at a trade show
 
In 2008, IBM representatives communicated with Jeopardy! executive producer Harry Friedman about the possibility of having Watson compete against Ken Jennings and Brad Rutter, two of the most successful contestants on the show, and the program's producers agreed. Watson's differences with human players had generated conflicts between IBM and Jeopardy! staff during the planning of the competition. IBM repeatedly expressed concerns that the show's writers would exploit Watson's cognitive deficiencies when writing the clues, thereby turning the game into a Turing test. To address that concern, a third party randomly picked the clues from previously written shows that were never broadcast. Jeopardy! staff also expressed concerns over Watson's reaction time on the buzzer. Originally Watson signaled electronically, but show staff requested that it press a button physically, as the human contestants would. Even with a robotic "finger" pressing the buzzer, Watson remained faster than its human competitors. Ken Jennings noted, "If you're trying to win on the show, the buzzer is all", and that Watson "can knock out a microsecond-precise buzz every single time with little or no variation. Human reflexes can't compete with computer circuits in this regard." Stephen Baker, a journalist who recorded Watson's development in his book Final Jeopardy, reported that the conflict between IBM and Jeopardy! became so serious in May 2010 that the competition was almost canceled. As part of the preparation, IBM constructed a mock set in a conference room at one of its technology sites to model the one used on Jeopardy!. Human players, including former Jeopardy! contestants, also participated in mock games against Watson, with Todd Alan Crain of The Onion playing host. About 100 test matches were conducted, with Watson winning 65% of the games.

To provide a physical presence in the televised games, Watson was represented by an "avatar" of a globe, inspired by the IBM "smarter planet" symbol. Jennings described the computer's avatar as a "glowing blue ball criss-crossed by 'threads' of thought—42 threads, to be precise", and stated that the number of thought threads in the avatar was an in-joke referencing the significance of the number 42 in Douglas Adams' Hitchhiker's Guide to the Galaxy. Joshua Davis, the artist who designed the avatar for the project, explained to Stephen Baker that there are 36 triggerable states that Watson was able to use throughout the game to show its confidence in responding to a clue correctly; he had hoped to be able to find forty-two, to add another level to the Hitchhiker's Guide reference, but he was unable to pinpoint enough game states.

A practice match was recorded on January 13, 2011, and the official matches were recorded on January 14, 2011. All participants maintained secrecy about the outcome until the match was broadcast in February.

Practice match

In a practice match before the press on January 13, 2011, Watson won a 15-question round against Ken Jennings and Brad Rutter with a score of $4,400 to Jennings's $3,400 and Rutter's $1,200, though Jennings and Watson were tied before the final $1,000 question. None of the three players responded incorrectly to a clue.

First match

The first round was broadcast February 14, 2011, and the second round, on February 15, 2011. The right to choose the first category had been determined by a draw won by Rutter. Watson, represented by a computer monitor display and artificial voice, responded correctly to the second clue and then selected the fourth clue of the first category, a deliberate strategy to find the Daily Double as quickly as possible. Watson's guess at the Daily Double location was correct. At the end of the first round, Watson was tied with Rutter at $5,000; Jennings had $2,000.

Watson's performance was characterized by some quirks. In one instance, Watson repeated a reworded version of an incorrect response offered by Jennings. (Jennings said "What are the '20s?" in reference to the 1920s. Then Watson said "What is 1920s?") Because Watson could not recognize other contestants' responses, it did not know that Jennings had already given the same response. In another instance, Watson was initially given credit for a response of "What is a leg?" after Jennings incorrectly responded "What is: he only had one hand?" to a clue about George Eyser (the correct response was, "What is: he's missing a leg?"). Because Watson, unlike a human, could not have been responding to Jennings's mistake, it was decided that this response was incorrect. The broadcast version of the episode was edited to omit Trebek's original acceptance of Watson's response. Watson also demonstrated complex wagering strategies on the Daily Doubles, with one bet at $6,435 and another at $1,246. Gerald Tesauro, one of the IBM researchers who worked on Watson, explained that Watson's wagers were based on its confidence level for the category and a complex regression model called the Game State Evaluator.
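
Watson's actual wagers came from its confidence estimate and a proprietary regression model, the Game State Evaluator, whose details are not given here. The Python sketch below is only a toy heuristic showing how a wager might be computed from a confidence value and the current scores; the formula and coefficients are made up for illustration.

    def daily_double_wager(confidence, my_score, leader_score):
        """Toy wager: bet more when confident, add a catch-up term when trailing.

        Purely illustrative; this is not Watson's Game State Evaluator.
        """
        base = 0.5 * my_score * confidence                # scale the bet with confidence
        catch_up = 0.1 * max(0, leader_score - my_score)  # nudge the bet upward when behind
        return round(min(my_score, base + catch_up))

    print(daily_double_wager(confidence=0.9, my_score=23000, leader_score=23000))  # 10350
    print(daily_double_wager(confidence=0.4, my_score=4000, leader_score=10000))   # 1400

One side effect of computing a wager this way is that the amounts are naturally non-round, which is consistent with bets such as $6,435 and $1,246 rather than the round figures human players tend to choose.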

Watson took a commanding lead in Double Jeopardy!, correctly responding to both Daily Doubles. Watson responded to the second Daily Double correctly with a 32% confidence score.

Although it wagered only $947 on the clue, Watson was the only contestant to miss the Final Jeopardy! response in the category U.S. CITIES ("Its largest airport was named for a World War II hero; its second largest, for a World War II battle"). Rutter and Jennings gave the correct response of Chicago, but Watson's response was "What is Toronto?????" Ferrucci offered reasons why Watson would appear to have guessed a Canadian city: categories only weakly suggest the type of response desired, the phrase "U.S. city" did not appear in the question, there are cities named Toronto in the U.S., and Toronto in Ontario has an American League baseball team. Dr. Chris Welty, who also worked on Watson, suggested that it may not have been able to correctly parse the second part of the clue, "its second largest, for a World War II battle" (which was not a standalone clause despite it following a semicolon, and required context to understand that it was referring to a second-largest airport). Eric Nyberg, a professor at Carnegie Mellon University and a member of the development team, stated that the error occurred because Watson does not possess the comparative knowledge to discard that potential response as not viable. Although not displayed to the audience as with non-Final Jeopardy! questions, Watson's second choice was Chicago. Both Toronto and Chicago were well below Watson's confidence threshold, at 14% and 11% respectively. (This lack of confidence was the reason for the multiple question marks in Watson's response.) 

The game ended with Jennings with $4,800, Rutter with $10,400, and Watson with $35,734.

Second match

During the introduction, Trebek (a Canadian native) joked that he had learned Toronto was a U.S. city, and Watson's error in the first match prompted an IBM engineer to wear a Toronto Blue Jays jacket to the recording of the second match.

In the first round, Jennings was finally able to choose a Daily Double clue, while Watson responded to one Daily Double clue incorrectly for the first time in the Double Jeopardy! round. After the first round, Watson placed second for the first time in the competition after Rutter and Jennings were briefly able to increase their dollar values before Watson could respond. Nonetheless, the final result was a victory for Watson, with a score of $77,147, besting Jennings, who scored $24,000, and Rutter, who scored $21,600.

Final outcome

The prizes for the competition were $1 million for first place (Watson), $300,000 for second place (Jennings), and $200,000 for third place (Rutter). As promised, IBM donated 100% of Watson's winnings to charity, with 50% of those winnings going to World Vision and 50% going to World Community Grid. Similarly, Jennings and Rutter donated 50% of their winnings to their respective charities.

In acknowledgment of IBM and Watson's achievements, Jennings made an additional remark in his Final Jeopardy! response: "I for one welcome our new computer overlords", echoing a similar memetic reference to the episode "Deep Space Homer" on The Simpsons, in which TV news presenter Kent Brockman speaks of welcoming "our new insect overlords". Jennings later wrote an article for Slate, in which he stated:
IBM has bragged to the media that Watson's question-answering skills are good for more than annoying Alex Trebek. The company sees a future in which fields like medical diagnosis, business analytics, and tech support are automated by question-answering software like Watson. Just as factory jobs were eliminated in the 20th century by new assembly-line robots, Brad and I were the first knowledge-industry workers put out of work by the new generation of 'thinking' machines. 'Quiz show contestant' may be the first job made redundant by Watson, but I'm sure it won't be the last.

Philosophy

Philosopher John Searle argues that Watson—despite impressive capabilities—cannot actually think. Drawing on his Chinese room thought experiment, Searle claims that Watson, like other computational machines, is capable only of manipulating symbols, but has no ability to understand the meaning of those symbols; however, Searle's experiment has its detractors.

Match against members of the United States Congress

On February 28, 2011, Watson played an untelevised exhibition match of Jeopardy! against members of the United States House of Representatives. In the first round, Rush D. Holt, Jr. (D-NJ, a former Jeopardy! contestant), who was challenging the computer with Bill Cassidy (R-LA, later Senator from Louisiana), led with Watson in second place. However, combining the scores across all matches, the final score was $40,300 for Watson and $30,000 for the congressional players combined.

IBM's Christopher Padilla said of the match, "The technology behind Watson represents a major advancement in computing. In the data-intensive environment of government, this type of technology can help organizations make better decisions and improve how government helps its citizens."

Current and future applications

According to IBM, "The goal is to have computers start to interact in natural human terms across a range of applications and processes, understanding the questions that humans ask and providing answers that humans can understand and justify." It has been suggested by Robert C. Weber, IBM's general counsel, that Watson may be used for legal research. The company also intends to use Watson in other information-intensive fields, such as telecommunications, financial services and government.

Watson is based on commercially available IBM Power 750 servers that have been marketed since February 2010. IBM also intends to market the DeepQA software to large corporations, with a price in the millions of dollars, reflecting the $1 million needed to acquire a server that meets the minimum system requirement to operate Watson. IBM expects the price to decrease substantially within a decade as the technology improves.

Commentator Rick Merritt said that "there's another really important reason why it is strategic for IBM to be seen very broadly by the American public as a company that can tackle tough computer problems. A big slice of [IBM's profit] comes from selling to the U.S. government some of the biggest, most expensive systems in the world."

In 2013, it was reported that three companies were working with IBM to create apps embedded with Watson technology. Fluid is developing an app for the retailer The North Face, designed to provide advice to online shoppers. Welltok is developing an app designed to give people advice on ways to engage in activities to improve their health. MD Buyline is developing an app for the purpose of advising medical institutions on equipment procurement decisions.

In November 2013, IBM announced it would make Watson's API available to software application providers, enabling them to build apps and services that embed Watson's capabilities. To build out its base of partners who create applications on the Watson platform, IBM consults with a network of venture capital firms, which advise IBM on which of their portfolio companies may be a logical fit for what IBM calls the Watson Ecosystem. Thus far, roughly 800 organizations and individuals have signed up with IBM, with interest in creating applications that could use the Watson platform.

On January 30, 2013, it was announced that Rensselaer Polytechnic Institute would receive a successor version of Watson, which would be housed at the Institute's technology park and be available to researchers and students. By summer 2013, Rensselaer had become the first university to receive a Watson computer.

On February 6, 2014, it was reported that IBM plans to invest $100 million in a 10-year initiative to use Watson and other IBM technologies to help countries in Africa address development problems, beginning with healthcare and education.

On June 3, 2014, three new Watson Ecosystem partners were chosen from more than 400 business concepts submitted by teams spanning 18 industries from 43 countries. "These bright and enterprising organizations have discovered innovative ways to apply Watson that can deliver demonstrable business benefits", said Steve Gold, vice president, IBM Watson Group. The winners were Majestyk Apps with their adaptive educational platform, FANG (Friendly Anthropomorphic Networked Genome); Red Ant with their retail sales trainer; and GenieMD with their medical recommendation service.

On July 9, 2014, Genesys Telecommunications Laboratories announced plans to integrate Watson to improve its customer experience platform, citing the staggering volume of customer data to be analyzed.

Watson has been integrated with databases, including Bon Appétit magazine, to power a recipe-generating platform.

Watson is being used by Decibel, a music discovery startup, in its app MusicGeek, which uses the supercomputer to provide music recommendations to its users. Watson's artificial intelligence has also found use in the hospitality industry: GoMoment uses Watson for its Rev1 app, which gives hotel staff a way to quickly respond to questions from guests. Arria NLG has built an app that helps energy companies stay within regulatory guidelines, making it easier for managers to make sense of thousands of pages of legal and technical jargon.

OmniEarth, Inc. uses Watson computer vision services to analyze satellite and aerial imagery, along with other municipal data, to infer water usage on a property-by-property basis, helping water districts in drought-stricken California improve water conservation efforts.

In September 2016, Condé Nast started using IBM's Watson to help build and strategize social influencer campaigns for brands. Using software built by IBM and Influential, Condé Nast's clients will be able to see which influencers' demographics, personality traits, and other attributes best align with a marketer and the audience it is targeting.

In February 2017, Rare Carat, a New York City-based startup and e-commerce platform for buying diamonds and diamond rings, introduced an IBM Watson-powered artificial intelligence chatbot called "Rocky" to assist novice diamond buyers through the daunting process of purchasing a diamond. As part of the IBM Global Entrepreneur Program, Rare Carat received the assistance of IBM in the development of the Rocky chatbot. In May 2017, IBM partnered with the Pebble Beach Company to use Watson as a concierge. Watson's artificial intelligence was added to an app developed by Pebble Beach and was used to guide visitors around the resort. The mobile app was designed by IBM iX and hosted on the IBM Cloud. It uses Watson's Conversation application programming interface.

In November 2017, in Mexico City, the experience "Voices of Another Time" opened at the National Museum of Anthropology, using IBM Watson as an alternative to a traditional museum visit.

Healthcare

In healthcare, Watson's natural language, hypothesis generation, and evidence-based learning capabilities are being investigated to see how Watson may contribute to clinical decision support systems and the growing use of artificial intelligence in healthcare by medical professionals. To aid physicians in the treatment of their patients, once a physician has posed a query to the system describing symptoms and other related factors, Watson first parses the input to identify the most important pieces of information; then mines patient data to find facts relevant to the patient's medical and hereditary history; then examines available data sources to form and test hypotheses; and finally provides a list of individualized, confidence-scored recommendations. The sources of data that Watson uses for analysis can include treatment guidelines, electronic medical record data, notes from healthcare providers, research materials, clinical studies, journal articles and patient information. Despite being developed and marketed as a "diagnosis and treatment advisor", Watson has never been actually involved in the medical diagnosis process, only in assisting with identifying treatment options for patients who have already been diagnosed.
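
The four-stage flow described above (parse the query, mine patient data, form and test hypotheses, return confidence-scored recommendations) can be pictured as a simple pipeline. The Python sketch below is only a hypothetical outline of that flow; the function, its inputs, and the keyword-matching logic are illustrative assumptions, not IBM's clinical software.

    def clinical_decision_support(physician_query, patient_record, guideline_sources):
        """Hypothetical outline of the four-stage flow described in the text."""
        # 1. Parse the physician's query to extract key findings.
        findings = [w for w in physician_query.lower().split() if len(w) > 3]

        # 2. Mine the patient record for relevant history.
        history = [item for item in patient_record
                   if any(f in item.lower() for f in findings)]

        # 3. Form and test hypotheses against the available evidence sources.
        hypotheses = []
        for source in guideline_sources:
            for condition, keywords in source.items():
                support = sum(k in findings or any(k in h.lower() for h in history)
                              for k in keywords)
                if support:
                    hypotheses.append((condition, support / len(keywords)))

        # 4. Return individualized, confidence-scored recommendations.
        return sorted(hypotheses, key=lambda x: -x[1])

    guidelines = [{"asthma": ["wheezing", "cough"], "pneumonia": ["fever", "cough"]}]
    print(clinical_decision_support("persistent cough and wheezing",
                                    ["History of seasonal allergies"], guidelines))
    # [('asthma', 1.0), ('pneumonia', 0.5)]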

In February 2011, it was announced that IBM would be partnering with Nuance Communications for a research project to develop a commercial product during the next 18 to 24 months, designed to exploit Watson's clinical decision support capabilities. Physicians at Columbia University would help to identify critical issues in the practice of medicine where the system's technology may be able to contribute, and physicians at the University of Maryland would work to identify the best way that a technology like Watson could interact with medical practitioners to provide the maximum assistance.

In September 2011, IBM and WellPoint announced a partnership to utilize Watson's data crunching capability to help suggest treatment options to physicians. Then, in February 2013, IBM and WellPoint gave Watson its first commercial application, for utilization management decisions in lung cancer treatment at Memorial Sloan–Kettering Cancer Center.

IBM announced a partnership with Cleveland Clinic in October 2012. The company has sent Watson to the Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, where it will increase its health expertise and assist medical professionals in treating patients. The medical facility will utilize Watson's ability to store and process large quantities of information to help speed up and increase the accuracy of the treatment process. "Cleveland Clinic's collaboration with IBM is exciting because it offers us the opportunity to teach Watson to 'think' in ways that have the potential to make it a powerful tool in medicine", said C. Martin Harris, MD, chief information officer of Cleveland Clinic.

In 2013, IBM and MD Anderson Cancer Center began a pilot program to further the center's "mission to eradicate cancer". However, after spending $62 million, the project did not meet its goals and was stopped.

On February 8, 2013, IBM announced that oncologists at the Maine Center for Cancer Medicine and Westmed Medical Group in New York had started to test the Watson supercomputer system in an effort to recommend treatment for lung cancer.

On July 29, 2016, IBM and Manipal Hospitals, a leading hospital chain in India, announced the launch of IBM Watson for Oncology for cancer patients. This product provides information and insights to physicians and cancer patients to help them identify personalized, evidence-based cancer care options. Manipal Hospitals is the second hospital in the world to adopt this technology and the first to offer it to patients online as an expert second opinion through its website.

On January 7, 2017, IBM and Fukoku Mutual Life Insurance entered into a contract for IBM to deliver analysis of compensation payouts via its IBM Watson Explorer AI. This resulted in the loss of 34 jobs, and the company said it would speed up compensation payout analysis by analyzing claims and medical records, increasing productivity by 30%. The company also said it would save ¥140m in running costs.

IBM Watson is said to carry the knowledge base of 1,000 cancer specialists, which could bring a revolution to the field of healthcare, and it has been regarded as a disruptive innovation. However, its application to oncology is still at a nascent stage.

Several startups in the healthcare space have been effectively using seven business model archetypes to take solutions based on IBM Watson to the marketplace. These archetypes depend on the value generated for the target user (e.g. patient focus vs. healthcare provider and payer focus) and the value-capturing mechanism (e.g. providing information or connecting stakeholders).

IBM Watson Group

On January 9, 2014, IBM announced it was creating a business unit around Watson, led by senior vice president Michael Rhodin. IBM Watson Group will have headquarters in New York's Silicon Alley and will employ 2,000 people. IBM has invested $1 billion to get the division going. Watson Group will develop three new cloud-delivered services: Watson Discovery Advisor, Watson Engagement Advisor, and Watson Explorer. Watson Discovery Advisor will focus on research and development projects in the pharmaceutical, publishing, and biotechnology industries; Watson Engagement Advisor will focus on self-service applications using insights drawn from natural language questions posed by business users; and Watson Explorer will focus on helping enterprise users uncover and share data-driven insights based on federated search more easily. The company is also launching a $100 million venture fund to spur application development for "cognitive" applications.

According to IBM, the cloud-delivered, enterprise-ready Watson has seen its speed increase 24 times over (a 2,300 percent improvement in performance) and its physical size shrink by 90 percent, from the size of a master bedroom to three stacked pizza boxes. IBM CEO Virginia Rometty said she wants Watson to generate $10 billion in annual revenue within ten years.

On 20 September 2017, Anantha Chandrakasan, dean of the MIT School of Engineering, announced Antonio Torralba as the MIT director of the MIT-IBM Watson AI Lab. In March 2018, IBM CEO Ginni Rometty proposed "Watson's Law", the "use of and application of business, smart cities, consumer applications and life in general."

Chatterbot

Watson is being used, via the IBM partner program, as a chatterbot to provide conversation for children's toys.

Building codes

In 2015, the engineering firm ENGEO created an online service via the IBM partner program named GoFetchCode. GoFetchCode applies Watson's natural language processing and question-answering capabilities to the International Code Council's model building codes.

Teaching assistant

IBM Watson is being used for several projects relating to education, and has entered partnerships with Pearson Education, Blackboard, Sesame Workshop and Apple.

In its partnership with Pearson, Watson is being made available inside electronic text books to provide natural language, one-on-one tutoring to students on the reading material.

As an individual using the free Watson APIs available to the public, Ashok Goel, a professor at Georgia Tech, used Watson to create a virtual teaching assistant to assist students in his class. Initially, Goel did not reveal the nature of "Jill", which had been created with the help of a few students and IBM. Jill answered questions for which it had a 97% certainty of an accurate answer, with the remainder being answered by human assistants.
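
The routing rule described for Jill (answer automatically only above a 97% certainty, otherwise defer to a human) amounts to a simple confidence gate. The Python sketch below is a hypothetical illustration of that rule; the function and its arguments are not from Goel's actual system.

    CERTAINTY_THRESHOLD = 0.97  # the 97% figure cited above

    def route_question(question, model_answer, certainty, human_queue):
        """Answer automatically only when certainty clears the threshold."""
        if certainty >= CERTAINTY_THRESHOLD:
            return model_answer
        human_queue.append(question)   # defer to a human teaching assistant
        return None

    queue = []
    print(route_question("When is homework 3 due?", "Friday at 5 pm", 0.99, queue))  # answered
    print(route_question("Can I use a different dataset?", "Yes", 0.80, queue))      # deferred
    print(queue)  # the deferred question waits for a human assistant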

The research group of Sabri Pllana developed an assistant for learning parallel programming using IBM Watson. A survey of a number of novice parallel programmers at Linnaeus University indicated that such an assistant would be welcomed by students learning parallel programming.

Weather forecasting

In August 2016, IBM announced it would be using Watson for weather forecasting. Specifically, the company announced they would use Watson to analyze data from over 200,000 Weather Underground personal weather stations, and data from other sources, as a part of project Deep Thunder.

Fashion

IBM Watson, together with Marchesa, designed a dress whose fabric changed color depending on the mood of the audience. The dress lit up in different colors based on the sentiment of tweets about the dress. Tweets were passed through a Watson tone analyzer and then sent back to a small computer inside the waist of the dress. As social media is an integral part of their business, the Marchesa team loved how Watson could incorporate that information into the glamour of the gown.
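
As a rough picture of the pipeline described above (tweets, then tone analysis, then a color sent to the dress's onboard computer), here is a hypothetical Python mapping. The get_tone function is a stand-in for a real tone-analysis service, and the tone labels and RGB values are made up for illustration.

    # Hypothetical tone-to-color mapping for the connected dress.
    TONE_COLORS = {
        "joy":        (255, 200, 0),
        "excitement": (255, 0, 120),
        "sadness":    (0, 80, 255),
        "anger":      (255, 30, 30),
    }

    def get_tone(tweet_text):
        """Stand-in for a tone-analysis call; returns a dominant tone label."""
        lowered = tweet_text.lower()
        if "love" in lowered or "beautiful" in lowered:
            return "joy"
        if "wow" in lowered or "!" in tweet_text:
            return "excitement"
        return "sadness"

    def color_for_tweet(tweet_text):
        """Map a tweet's dominant tone to an RGB color for the dress LEDs."""
        return TONE_COLORS.get(get_tone(tweet_text), (255, 255, 255))

    print(color_for_tweet("This gown is beautiful, love it"))  # joy -> warm yellow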

Tax preparation

On February 5–6, 2017, tax preparation company H&R Block began nationwide use of a Watson-based program.

Liquefied petroleum gas

From Wikipedia, the free encyclopedia ...