Sunday, May 21, 2023

Computer

From Wikipedia, the free encyclopedia

A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These programs enable computers to perform a wide range of tasks. A computer system is a nominally complete computer that includes the hardware, operating system (main software), and peripheral equipment needed and used for full operation. This term may also refer to a group of computers that are linked and function together, such as a computer network or computer cluster.

A broad range of industrial and consumer products use computers as control systems. Simple special-purpose devices like microwave ovens and remote controls are included, as are factory devices like industrial robots and computer-aided design, as well as general-purpose devices like personal computers and mobile devices like smartphones. Computers power the Internet, which links billions of other computers and users.

Early computers were meant to be used only for calculations. Simple manual instruments like the abacus have aided people in doing calculations since ancient times. Early in the Industrial Revolution, some mechanical devices were built to automate long, tedious tasks, such as guiding patterns for looms. More sophisticated electrical machines did specialized analog calculations in the early 20th century. The first digital electronic calculating machines were developed during World War II. The first semiconductor transistors in the late 1940s were followed by the silicon-based MOSFET (MOS transistor) and monolithic integrated circuit chip technologies in the late 1950s, leading to the microprocessor and the microcomputer revolution in the 1970s. The speed, power and versatility of computers have been increasing dramatically ever since then, with transistor counts increasing at a rapid pace (as predicted by Moore's law), leading to the Digital Revolution during the late 20th to early 21st centuries.

Conventionally, a modern computer consists of at least one processing element, typically a central processing unit (CPU) in the form of a microprocessor, along with some type of computer memory, typically semiconductor memory chips. The processing element carries out arithmetic and logical operations, and a sequencing and control unit can change the order of operations in response to stored information. Peripheral devices include input devices (keyboards, mice, joysticks, etc.), output devices (monitor screens, printers, etc.), and input/output devices that perform both functions (e.g., the 2000s-era touchscreen). Peripheral devices allow information to be retrieved from an external source, and they enable the results of operations to be saved and retrieved.

Etymology

A human computer, with microscope and calculator, 1952

According to the Oxford English Dictionary, the first known use of computer was in a 1613 book called The Yong Mans Gleanings by the English writer Richard Brathwait: "I haue [sic] read the truest computer of Times, and the best Arithmetician that euer [sic] breathed, and he reduceth thy dayes into a short number." This usage of the term referred to a human computer, a person who carried out calculations or computations. The word continued with the same meaning until the middle of the 20th century. During the latter part of this period women were often hired as computers because they could be paid less than their male counterparts. By 1943, most human computers were women.

The Online Etymology Dictionary gives the first attested use of computer in the 1640s, meaning 'one who calculates'; this is an "agent noun from compute (v.)". The Online Etymology Dictionary states that the use of the term to mean "'calculating machine' (of any type) is from 1897." The Online Etymology Dictionary indicates that the "modern use" of the term, to mean 'programmable digital electronic computer' dates from "1945 under this name; [in a] theoretical [sense] from 1937, as Turing machine".

History

Pre-20th century

Devices have been used to aid computation for thousands of years, mostly using one-to-one correspondence with fingers. The earliest counting device was most likely a form of tally stick. Later record keeping aids throughout the Fertile Crescent included calculi (clay spheres, cones, etc.) which represented counts of items, likely livestock or grains, sealed in hollow unbaked clay containers. The use of counting rods is one example.

The Chinese suanpan (算盘). The number represented on this abacus is 6,302,715,408.

The abacus was initially used for arithmetic tasks. The Roman abacus was developed from devices used in Babylonia as early as 2400 BCE. Since then, many other forms of reckoning boards or tables have been invented. In a medieval European counting house, a checkered cloth would be placed on a table, and markers moved around on it according to certain rules, as an aid to calculating sums of money.

The Antikythera mechanism, dating back to ancient Greece circa 150–100 BCE, is an early analog computing device.

The Antikythera mechanism is believed to be the earliest known mechanical analog computer, according to Derek J. de Solla Price. It was designed to calculate astronomical positions. It was discovered in 1901 in the Antikythera wreck off the Greek island of Antikythera, between Kythera and Crete, and has been dated to c. 100 BCE. Devices of comparable complexity to the Antikythera mechanism would not reappear until the fourteenth century.

Many mechanical aids to calculation and measurement were constructed for astronomical and navigation use. The planisphere was a star chart invented by Abū Rayhān al-Bīrūnī in the early 11th century. The astrolabe was invented in the Hellenistic world in either the 1st or 2nd centuries BCE and is often attributed to Hipparchus. A combination of the planisphere and dioptra, the astrolabe was effectively an analog computer capable of working out several different kinds of problems in spherical astronomy. An astrolabe incorporating a mechanical calendar computer and gear-wheels was invented by Abi Bakr of Isfahan, Persia in 1235. Abū Rayhān al-Bīrūnī invented the first mechanical geared lunisolar calendar astrolabe, an early fixed-wired knowledge processing machine with a gear train and gear-wheels, c. 1000 AD.

The sector, a calculating instrument used for solving problems in proportion, trigonometry, multiplication and division, and for various functions, such as squares and cube roots, was developed in the late 16th century and found application in gunnery, surveying and navigation.

The planimeter was a manual instrument to calculate the area of a closed figure by tracing over it with a mechanical linkage.

The slide rule was invented around 1620–1630 by the English clergyman William Oughtred, shortly after the publication of the concept of the logarithm. It is a hand-operated analog computer for doing multiplication and division. As slide rule development progressed, added scales provided reciprocals, squares and square roots, cubes and cube roots, as well as transcendental functions such as logarithms and exponentials, circular and hyperbolic trigonometry and other functions. Slide rules with special scales are still used for quick performance of routine calculations, such as the E6B circular slide rule used for time and distance calculations on light aircraft.
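The principle behind the slide rule can be sketched in a few lines of Python: multiplying two numbers amounts to adding their logarithms, which is what sliding one logarithmic scale along another does physically. This is only an illustration. The function name slide_rule_multiply is hypothetical, and the default of three significant figures is an assumption standing in for the limited precision with which a physical scale can be read.

```python
import math


def slide_rule_multiply(a: float, b: float, digits: int = 3) -> float:
    """Multiply two positive numbers by adding their base-10 logarithms.

    Hypothetical helper for illustration only. `digits` models the limited
    reading precision of a physical scale (an assumption, not a specification).
    """
    if a <= 0 or b <= 0:
        raise ValueError("logarithmic scales only cover positive numbers")
    # Adding logarithms corresponds to laying two scale lengths end to end.
    total = math.log10(a) + math.log10(b)
    product = 10 ** total
    # Round the result to the requested number of significant figures.
    exponent = math.floor(math.log10(product))
    factor = 10 ** (digits - 1 - exponent)
    return round(product * factor) / factor
```

Division works the same way with subtraction in place of addition, which is why a single pair of scales serves for both operations.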

In the 1770s, Pierre Jaquet-Droz, a Swiss watchmaker, built a mechanical doll (automaton) that could write holding a quill pen. By switching the number and order of its internal wheels different letters, and hence different messages, could be produced. In effect, it could be mechanically "programmed" to read instructions. Along with two other complex machines, the doll is at the Musée d'Art et d'Histoire of Neuchâtel, Switzerland, and still operates.

In 1831–1835, mathematician and engineer Giovanni Plana devised a Perpetual Calendar machine, which, through a system of pulleys and cylinders, could predict the perpetual calendar for every year from 0 CE (that is, 1 BCE) to 4000 CE, keeping track of leap years and varying day length. The tide-predicting machine invented by the Scottish scientist Sir William Thomson in 1872 was of great utility to navigation in shallow waters. It used a system of pulleys and wires to automatically calculate predicted tide levels for a set period at a particular location.

The differential analyser, a mechanical analog computer designed to solve differential equations by integration, used wheel-and-disc mechanisms to perform the integration. In 1876, Sir William Thomson had already discussed the possible construction of such calculators, but he had been stymied by the limited output torque of the ball-and-disk integrators. In a differential analyzer, the output of one integrator drove the input of the next integrator, or a graphing output. The torque amplifier was the advance that allowed these machines to work. Starting in the 1920s, Vannevar Bush and others developed mechanical differential analyzers.

First computer

Charles Babbage, an English mechanical engineer and polymath, originated the concept of a programmable computer. Considered the "father of the computer", he conceptualized and invented the first mechanical computer in the early 19th century. After working on his revolutionary difference engine, designed to aid in navigational calculations, in 1833 he realized that a much more general design, an analytical engine, was possible. The input of programs and data was to be provided to the machine via punched cards, a method being used at the time to direct mechanical looms such as the Jacquard loom. For output, the machine would have a printer, a curve plotter and a bell. The machine would also be able to punch numbers onto cards to be read in later. The Engine incorporated an arithmetic logic unit, control flow in the form of conditional branching and loops, and integrated memory, making it the first design for a general-purpose computer that could be described in modern terms as Turing-complete.

The machine was about a century ahead of its time. All the parts for his machine had to be made by hand – this was a major problem for a device with thousands of parts. Eventually, the project was dissolved with the decision of the British Government to cease funding. Babbage's failure to complete the analytical engine can be chiefly attributed to political and financial difficulties as well as his desire to develop an increasingly sophisticated computer and to move ahead faster than anyone else could follow. Nevertheless, his son, Henry Babbage, completed a simplified version of the analytical engine's computing unit (the mill) in 1888. He gave a successful demonstration of its use in computing tables in 1906.

Analog computers

 
Sir William Thomson's third tide-predicting machine design, 1879–81

During the first half of the 20th century, many scientific computing needs were met by increasingly sophisticated analog computers, which used a direct mechanical or electrical model of the problem as a basis for computation. However, these were not programmable and generally lacked the versatility and accuracy of modern digital computers. The first modern analog computer was a tide-predicting machine, invented by Sir William Thomson (later to become Lord Kelvin) in 1872. The differential analyser, a mechanical analog computer designed to solve differential equations by integration using wheel-and-disc mechanisms, was conceptualized in 1876 by James Thomson, the elder brother of the more famous Sir William Thomson.

The art of mechanical analog computing reached its zenith with the differential analyzer, built by H. L. Hazen and Vannevar Bush at MIT starting in 1927. This built on the mechanical integrators of James Thomson and the torque amplifiers invented by H. W. Nieman. A dozen of these devices were built before their obsolescence became obvious. By the 1950s, the success of digital electronic computers had spelled the end for most analog computing machines, but analog computers remained in use during the 1950s in some specialized applications such as education (slide rule) and aircraft (control systems).

Digital computers

Electromechanical

By 1938, the United States Navy had developed an electromechanical analog computer small enough to use aboard a submarine. This was the Torpedo Data Computer, which used trigonometry to solve the problem of firing a torpedo at a moving target. During World War II similar devices were developed in other countries as well.

Replica of Konrad Zuse's Z3, the first fully automatic, digital (electromechanical) computer

Early digital computers were electromechanical; electric switches drove mechanical relays to perform the calculation. These devices had a low operating speed and were eventually superseded by much faster all-electric computers, originally using vacuum tubes. The Z2, created by German engineer Konrad Zuse in 1939 in Berlin, was one of the earliest examples of an electromechanical relay computer.

Konrad Zuse, inventor of the modern computer

In 1941, Zuse followed his earlier machine up with the Z3, the world's first working electromechanical programmable, fully automatic digital computer. The Z3 was built with 2000 relays, implementing a 22-bit word length, and operated at a clock frequency of about 5–10 Hz. Program code was supplied on punched film while data could be stored in 64 words of memory or supplied from the keyboard. It was quite similar to modern machines in some respects, pioneering numerous advances such as floating-point numbers. Using a binary system rather than the harder-to-implement decimal system (used in Charles Babbage's earlier design) meant that Zuse's machines were easier to build and potentially more reliable, given the technologies available at that time. The Z3 was not itself a universal computer but could be extended to be Turing-complete.

Zuse's next computer, the Z4, became the world's first commercial computer; after an initial delay due to the Second World War, it was completed in 1950 and delivered to the ETH Zurich. The computer was manufactured by Zuse's own company, Zuse KG, which was founded in 1941 as the first company with the sole purpose of developing computers in Berlin.

Vacuum tubes and digital electronic circuits

Purely electronic circuit elements soon replaced their mechanical and electromechanical equivalents, at the same time that digital calculation replaced analog. The engineer Tommy Flowers, working at the Post Office Research Station in London in the 1930s, began to explore the possible use of electronics for the telephone exchange. Experimental equipment that he built in 1934 went into operation five years later, converting a portion of the telephone exchange network into an electronic data processing system, using thousands of vacuum tubes. In the US, John Vincent Atanasoff and Clifford E. Berry of Iowa State University developed and tested the Atanasoff–Berry Computer (ABC) in 1942, the first "automatic electronic digital computer". This design was also all-electronic and used about 300 vacuum tubes, with capacitors fixed in a mechanically rotating drum for memory.

Colossus, the first electronic digital programmable computing device, was used to break German ciphers during World War II. It is seen here in use at Bletchley Park in 1943.

During World War II, the British code-breakers at Bletchley Park achieved a number of successes at breaking encrypted German military communications. The German encryption machine, Enigma, was first attacked with the help of the electro-mechanical bombes which were often run by women. To crack the more sophisticated German Lorenz SZ 40/42 machine, used for high-level Army communications, Max Newman and his colleagues commissioned Flowers to build the Colossus. He spent eleven months from early February 1943 designing and building the first Colossus. After a functional test in December 1943, Colossus was shipped to Bletchley Park, where it was delivered on 18 January 1944 and attacked its first message on 5 February.

Colossus was the world's first electronic digital programmable computer. It used a large number of valves (vacuum tubes). It had paper-tape input and was capable of being configured to perform a variety of boolean logical operations on its data, but it was not Turing-complete. Nine Mk II Colossi were built (the Mk I was converted to a Mk II, making ten machines in total). Colossus Mark I contained 1,500 thermionic valves (tubes), but Mark II, with 2,400 valves, was both five times faster and simpler to operate than Mark I, greatly speeding the decoding process.

ENIAC was the first electronic, Turing-complete device, and performed ballistics trajectory calculations for the United States Army.

The ENIAC (Electronic Numerical Integrator and Computer) was the first electronic programmable computer built in the U.S. Although the ENIAC was similar to the Colossus, it was much faster, more flexible, and it was Turing-complete. Like the Colossus, a "program" on the ENIAC was defined by the states of its patch cables and switches, a far cry from the stored program electronic machines that came later. Once a program was written, it had to be mechanically set into the machine with manual resetting of plugs and switches. The programmers of the ENIAC were six women, often known collectively as the "ENIAC girls".

ENIAC combined the high speed of electronics with the ability to be programmed for many complex problems. It could add or subtract 5000 times a second, a thousand times faster than any other machine. It also had modules to multiply, divide, and take square roots. High-speed memory was limited to 20 words (about 80 bytes). Built under the direction of John Mauchly and J. Presper Eckert at the University of Pennsylvania, ENIAC's development and construction lasted from 1943 to full operation at the end of 1945. The machine was huge, weighing 30 tons, using 200 kilowatts of electric power and containing over 18,000 vacuum tubes, 1,500 relays, and hundreds of thousands of resistors, capacitors, and inductors.

Modern computers

Concept of modern computer

The principle of the modern computer was proposed by Alan Turing in his seminal 1936 paper, On Computable Numbers. Turing proposed a simple device that he called "Universal Computing machine" and that is now known as a universal Turing machine. He proved that such a machine is capable of computing anything that is computable by executing instructions (program) stored on tape, allowing the machine to be programmable. The fundamental concept of Turing's design is the stored program, where all the instructions for computing are stored in memory. Von Neumann acknowledged that the central concept of the modern computer was due to this paper. Turing machines are to this day a central object of study in theory of computation. Except for the limitations imposed by their finite memory stores, modern computers are said to be Turing-complete, which is to say, they have algorithm execution capability equivalent to a universal Turing machine.
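To make the idea of a machine driven by a table of instructions concrete, here is a minimal sketch of a one-tape Turing machine simulator in Python. It is not taken from Turing's paper: the function name run_turing_machine, the rule format, the state names and the example rule table INCREMENT_RULES (which adds one to a binary number) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a one-tape Turing machine until it enters the state 'halt'.

    `rules` maps (state, symbol) to (new_symbol, move, new_state), where
    move is -1 (left), 0 (stay) or +1 (right). Illustrative format only.
    """
    # The tape is unbounded in both directions; unvisited cells are blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        new_symbol, move, state = rules[(state, cells[head])]
        cells[head] = new_symbol
        head += move
        steps += 1
    # Return the non-blank portion of the tape as a string.
    used = [i for i, s in cells.items() if s != blank]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))


# Hypothetical rule table: add 1 to a binary number written on the tape.
INCREMENT_RULES = {
    ("start", "0"): ("0", +1, "start"),   # scan right to the end of the number
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("_", -1, "carry"),   # step back onto the last digit
    ("carry", "1"): ("0", -1, "carry"),   # 1 + 1 = 0, carry continues left
    ("carry", "0"): ("1", 0, "halt"),     # absorb the carry and stop
    ("carry", "_"): ("1", 0, "halt"),     # number grew by one digit
}
```

Calling run_turing_machine(INCREMENT_RULES, "1011") yields "1100". Swapping in a different rule table changes what the same simulator computes, which is the sense in which the machine is programmable.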

Stored programs

A section of the reconstructed Manchester Baby, the first electronic stored-program computer

Early computing machines had fixed programs. Changing a machine's function required re-wiring and re-structuring it. With the proposal of the stored-program computer this changed. A stored-program computer includes by design an instruction set and can store in memory a set of instructions (a program) that details the computation. The theoretical basis for the stored-program computer was laid out by Alan Turing in his 1936 paper. In 1945, Turing joined the National Physical Laboratory and began work on developing an electronic stored-program digital computer. His 1945 report "Proposed Electronic Calculator" was the first specification for such a device. John von Neumann at the University of Pennsylvania also circulated his First Draft of a Report on the EDVAC in 1945.

The Manchester Baby was the world's first stored-program computer. It was built at the University of Manchester in England by Frederic C. Williams, Tom Kilburn and Geoff Tootill, and ran its first program on 21 June 1948. It was designed as a testbed for the Williams tube, the first random-access digital storage device. Although the computer was described as "small and primitive" by a 1998 retrospective, it was the first working machine to contain all of the elements essential to a modern electronic computer. As soon as the Baby had demonstrated the feasibility of its design, a project began at the university to develop it into a practically useful computer, the Manchester Mark 1.

The Mark 1 in turn quickly became the prototype for the Ferranti Mark 1, the world's first commercially available general-purpose computer. Built by Ferranti, it was delivered to the University of Manchester in February 1951. At least seven of these later machines were delivered between 1953 and 1957, one of them to Shell labs in Amsterdam. In October 1947 the directors of British catering company J. Lyons & Company decided to take an active role in promoting the commercial development of computers. Lyons's LEO I computer, modelled closely on the Cambridge EDSAC of 1949, became operational in April 1951 and ran the world's first routine office computer job.

Grace Hopper was the first to develop a compiler for a programming language.

Transistors

The concept of a field-effect transistor was proposed by Julius Edgar Lilienfeld in 1925. John Bardeen and Walter Brattain, while working under William Shockley at Bell Labs, built the first working transistor, the point-contact transistor, in 1947, which was followed by Shockley's bipolar junction transistor in 1948. From 1955 onwards, transistors replaced vacuum tubes in computer designs, giving rise to the "second generation" of computers. Compared to vacuum tubes, transistors have many advantages: they are smaller, and require less power than vacuum tubes, so give off less heat. Junction transistors were much more reliable than vacuum tubes and had longer, indefinite, service life. Transistorized computers could contain tens of thousands of binary logic circuits in a relatively compact space. However, early junction transistors were relatively bulky devices that were difficult to manufacture on a mass-production basis, which limited them to a number of specialised applications.

At the University of Manchester, a team under the leadership of Tom Kilburn designed and built a machine using the newly developed transistors instead of valves. Their first transistorised computer, and the first in the world, was operational by 1953, and a second version was completed there in April 1955. However, the machine did make use of valves to generate its 125 kHz clock waveforms and in the circuitry to read and write on its magnetic drum memory, so it was not the first completely transistorized computer. That distinction goes to the Harwell CADET of 1955, built by the electronics division of the Atomic Energy Research Establishment at Harwell.

MOSFET (MOS transistor), showing gate (G), body (B), source (S) and drain (D) terminals. The gate is separated from the body by an insulating layer (pink).

The metal–oxide–silicon field-effect transistor (MOSFET), also known as the MOS transistor, was invented by Mohamed M. Atalla and Dawon Kahng at Bell Labs in 1959. It was the first truly compact transistor that could be miniaturised and mass-produced for a wide range of uses. With its high scalability, and much lower power consumption and higher density than bipolar junction transistors, the MOSFET made it possible to build high-density integrated circuits. In addition to data processing, it also enabled the practical use of MOS transistors as memory cell storage elements, leading to the development of MOS semiconductor memory, which replaced earlier magnetic-core memory in computers. The MOSFET led to the microcomputer revolution, and became the driving force behind the computer revolution. The MOSFET is the most widely used transistor in computers, and is the fundamental building block of digital electronics.

Integrated circuits

Die photograph of a MOS 6502, an early 1970s microprocessor integrating 3500 transistors on a single chip
Integrated circuits are typically packaged in plastic, metal, or ceramic cases to protect the IC from damage and for ease of assembly.

The next great advance in computing power came with the advent of the integrated circuit (IC). The idea of the integrated circuit was first conceived by a radar scientist working for the Royal Radar Establishment of the Ministry of Defence, Geoffrey W.A. Dummer. Dummer presented the first public description of an integrated circuit at the Symposium on Progress in Quality Electronic Components in Washington, D.C. on 7 May 1952.

The first working ICs were invented by Jack Kilby at Texas Instruments and Robert Noyce at Fairchild Semiconductor. Kilby recorded his initial ideas concerning the integrated circuit in July 1958, successfully demonstrating the first working integrated example on 12 September 1958. In his patent application of 6 February 1959, Kilby described his new device as "a body of semiconductor material ... wherein all the components of the electronic circuit are completely integrated". However, Kilby's invention was a hybrid integrated circuit (hybrid IC), rather than a monolithic integrated circuit (IC) chip. Kilby's IC had external wire connections, which made it difficult to mass-produce.

Noyce also came up with his own idea of an integrated circuit half a year later than Kilby. Noyce's invention was the first true monolithic IC chip. His chip solved many practical problems that Kilby's had not. Produced at Fairchild Semiconductor, it was made of silicon, whereas Kilby's chip was made of germanium. Noyce's monolithic IC was fabricated using the planar process, developed by his colleague Jean Hoerni in early 1959. In turn, the planar process was based on Mohamed M. Atalla's work on semiconductor surface passivation by silicon dioxide in the late 1950s.

Modern monolithic ICs are predominantly MOS (metal–oxide–semiconductor) integrated circuits, built from MOSFETs (MOS transistors). The earliest experimental MOS IC to be fabricated was a 16-transistor chip built by Fred Heiman and Steven Hofstein at RCA in 1962. General Microelectronics later introduced the first commercial MOS IC in 1964, developed by Robert Norman. Following the development of the self-aligned gate (silicon-gate) MOS transistor by Robert Kerwin, Donald Klein and John Sarace at Bell Labs in 1967, the first silicon-gate MOS IC with self-aligned gates was developed by Federico Faggin at Fairchild Semiconductor in 1968. The MOSFET has since become the most critical device component in modern ICs.

The development of the MOS integrated circuit led to the invention of the microprocessor, and heralded an explosion in the commercial and personal use of computers. While the subject of exactly which device was the first microprocessor is contentious, partly due to lack of agreement on the exact definition of the term "microprocessor", it is largely undisputed that the first single-chip microprocessor was the Intel 4004, designed and realized by Federico Faggin with his silicon-gate MOS IC technology, along with Ted Hoff, Masatoshi Shima and Stanley Mazor at Intel. In the early 1970s, MOS IC technology enabled the integration of more than 10,000 transistors on a single chip.

Systems on a chip (SoCs) are complete computers on a microchip (or chip) the size of a coin. They may or may not have integrated RAM and flash memory. If not integrated, the RAM is usually placed directly above (known as package on package) or below (on the opposite side of the circuit board) the SoC, and the flash memory is usually placed right next to the SoC. This is done to improve data transfer speeds, as the data signals do not have to travel long distances. Since ENIAC in 1945, computers have advanced enormously: modern SoCs (such as the Snapdragon 865) are the size of a coin while being hundreds of thousands of times more powerful than ENIAC, integrating billions of transistors, and consuming only a few watts of power.

Mobile computers

The first mobile computers were heavy and ran from mains power. The 50 lb (23 kg) IBM 5100 was an early example. Later portables such as the Osborne 1 and Compaq Portable were considerably lighter but still needed to be plugged in. The first laptops, such as the Grid Compass, removed this requirement by incorporating batteries – and with the continued miniaturization of computing resources and advancements in portable battery life, portable computers grew in popularity in the 2000s. The same developments allowed manufacturers to integrate computing resources into cellular mobile phones by the early 2000s.

These smartphones and tablets run on a variety of operating systems and recently became the dominant computing device on the market. They are powered by systems on a chip (SoCs), which are complete computers on a microchip the size of a coin.

Types

Computers can be classified in a number of different ways, including:

By architecture

By size, form-factor and purpose

Hardware

The term hardware covers all of those parts of a computer that are tangible physical objects. Circuits, computer chips, graphic cards, sound cards, memory (RAM), motherboard, displays, power supplies, cables, keyboards, printers and "mice" input devices are all hardware.

History of computing hardware

First generation (mechanical/electromechanical)
    Calculators: Pascal's calculator, Arithmometer, Difference engine, Quevedo's analytical machines
    Programmable devices: Jacquard loom, Analytical engine, IBM ASCC/Harvard Mark I, Harvard Mark II, IBM SSEC, Z1, Z2, Z3

Second generation (vacuum tubes)
    Calculators: Atanasoff–Berry Computer, IBM 604, UNIVAC 60, UNIVAC 120
    Programmable devices: Colossus, ENIAC, Manchester Baby, EDSAC, Manchester Mark 1, Ferranti Pegasus, Ferranti Mercury, CSIRAC, EDVAC, UNIVAC I, IBM 701, IBM 702, IBM 650, Z22

Third generation (discrete transistors and SSI, MSI, LSI integrated circuits)
    Mainframes: IBM 7090, IBM 7080, IBM System/360, BUNCH
    Minicomputer: HP 2116A, IBM System/32, IBM System/36, LINC, PDP-8, PDP-11
    Desktop computer: HP 9100

Fourth generation (VLSI integrated circuits)
    Minicomputer: VAX, IBM AS/400
    4-bit microcomputer: Intel 4004, Intel 4040
    8-bit microcomputer: Intel 8008, Intel 8080, Motorola 6800, Motorola 6809, MOS Technology 6502, Zilog Z80
    16-bit microcomputer: Intel 8088, Zilog Z8000, WDC 65816/65802
    32-bit microcomputer: Intel 80386, Pentium, Motorola 68000, ARM
    64-bit microcomputer: Alpha, MIPS, PA-RISC, PowerPC, SPARC, x86-64, ARMv8-A
    Embedded computer: Intel 8048, Intel 8051
    Personal computer: Desktop computer, Home computer, Laptop computer, Personal digital assistant (PDA), Portable computer, Tablet PC, Wearable computer

Theoretical/experimental
    Quantum computer, IBM Q System One, Chemical computer, DNA computing, Optical computer, Spintronics-based computer, Wetware/Organic computer

Other hardware topics

Peripheral device (input/output)
    Input: Mouse, keyboard, joystick, image scanner, webcam, graphics tablet, microphone
    Output: Monitor, printer, loudspeaker
    Both: Floppy disk drive, hard disk drive, optical disc drive, teleprinter

Computer buses
    Short range: RS-232, SCSI, PCI, USB
    Long range (computer networking): Ethernet, ATM, FDDI

A general-purpose computer has four main components: the arithmetic logic unit (ALU), the control unit, the memory, and the input and output devices (collectively termed I/O). These parts are interconnected by buses, often made of groups of wires. Inside each of these parts are thousands to trillions of small electrical circuits which can be turned off or on by means of an electronic switch. Each circuit represents a bit (binary digit) of information so that when the circuit is on it represents a "1", and when off it represents a "0" (in positive logic representation). The circuits are arranged in logic gates so that one or more of the circuits may control the state of one or more of the other circuits.
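As a rough illustration of how circuits arranged in logic gates can control one another, the following Python sketch builds an adder out of a single gate type. It models bits as the integers 0 and 1 and is not a description of any particular hardware. All function names (nand, full_adder, add_bits and so on) are hypothetical, and starting from NAND is simply one common choice.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


# Every other gate below is wired up from NAND gates alone.
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits, returning (sum_bit, carry_out)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out


def add_bits(x_bits: list[int], y_bits: list[int]) -> list[int]:
    """Add two equal-length bit lists, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    result.append(carry)  # final carry becomes the most significant bit
    return result
```

With the least significant bit first, add_bits([1, 0, 1], [1, 1, 0]) adds 5 and 3 and returns [0, 0, 0, 1], which is 8.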

Input devices

When unprocessed data is sent to the computer with the help of input devices, the data is processed and sent to output devices. The input devices may be hand-operated or automated. The act of processing is mainly regulated by the CPU. Some examples of input devices are the mouse, keyboard, joystick, image scanner, webcam, graphics tablet, and microphone.

Output devices

The means through which a computer gives output are known as output devices. Some examples of output devices are the monitor, printer, and loudspeaker.

Control unit

Diagram showing how a particular MIPS architecture instruction would be decoded by the control system

The control unit (often called a control system or central controller) manages the computer's various components; it reads and interprets (decodes) the program instructions, transforming them into control signals that activate other parts of the computer. Control systems in advanced computers may change the order of execution of some instructions to improve performance.

A key component common to all CPUs is the program counter, a special memory cell (a register) that keeps track of which location in memory the next instruction is to be read from.

The control system's function is as follows (this is a simplified description, and some of these steps may be performed concurrently or in a different order depending on the type of CPU):

  1. Read the code for the next instruction from the cell indicated by the program counter.
  2. Decode the numerical code for the instruction into a set of commands or signals for each of the other systems.
  3. Increment the program counter so it points to the next instruction.
  4. Read whatever data the instruction requires from cells in memory (or perhaps from an input device). The location of this required data is typically stored within the instruction code.
  5. Provide the necessary data to an ALU or register.
  6. If the instruction requires an ALU or specialized hardware to complete, instruct the hardware to perform the requested operation.
  7. Write the result from the ALU back to a memory location or to a register or perhaps an output device.
  8. Jump back to step (1).
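The cycle above can be sketched as a toy simulator. The instruction set below is hypothetical and invented purely for illustration: each instruction is stored as the single number `opcode * 100 + address`, there is one accumulator register, and memory is a plain list of numbers. Real CPUs differ in nearly every detail.

```python
# Hypothetical opcodes for a toy one-register machine (not a real instruction set).
HALT, LOAD, ADD, STORE, JUMP, JUMP_IF_ZERO, SUB = 0, 1, 2, 3, 4, 5, 6


def run(memory: list[int], max_steps: int = 10_000) -> list[int]:
    """Execute the program stored in `memory`, starting at cell 0."""
    pc = 0   # program counter: address of the next instruction
    acc = 0  # accumulator register
    for _ in range(max_steps):
        instruction = memory[pc]                    # step 1: fetch
        opcode, address = divmod(instruction, 100)  # step 2: decode
        pc += 1                                     # step 3: increment the program counter
        if opcode == HALT:
            return memory
        elif opcode == LOAD:                        # steps 4-5: read data into a register
            acc = memory[address]
        elif opcode == ADD:                         # step 6: ALU operation
            acc += memory[address]
        elif opcode == SUB:
            acc -= memory[address]
        elif opcode == STORE:                       # step 7: write the result back
            memory[address] = acc
        elif opcode == JUMP:                        # a jump simply overwrites the program counter
            pc = address
        elif opcode == JUMP_IF_ZERO:
            if acc == 0:
                pc = address
        else:
            raise ValueError(f"unknown opcode {opcode}")
    raise RuntimeError("step limit reached")        # guard against endless loops


# Program: add the numbers 5, 4, 3, 2, 1 into cell 20.
memory = [0] * 32
memory[0:8] = [
    121,  # 0: LOAD 21          acc = counter
    509,  # 1: JUMP_IF_ZERO 9   stop when the counter reaches 0
    220,  # 2: ADD 20           acc = counter + total
    320,  # 3: STORE 20         total = acc
    121,  # 4: LOAD 21          acc = counter
    622,  # 5: SUB 22           acc = counter - 1
    321,  # 6: STORE 21         counter = acc
    400,  # 7: JUMP 0           loop back to the start
]
memory[20], memory[21], memory[22] = 0, 5, 1  # total, counter, the constant 1

run(memory)
print(memory[20])  # 15
```

Cell 9 is left as 0, which this toy encoding treats as HALT, so the conditional jump at cell 1 ends the program.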

Since the program counter is (conceptually) just another set of memory cells, it can be changed by calculations done in the ALU. Adding 100 to the program counter would cause the next instruction to be read from a place 100 locations further down the program. Instructions that modify the program counter are often known as "jumps" and allow for loops (instructions that are repeated by the computer) and often conditional instruction execution (both examples of control flow).

The sequence of operations that the control unit goes through to process an instruction is in itself like a short computer program, and indeed, in some more complex CPU designs, there is another yet smaller computer called a microsequencer, which runs a microcode program that causes all of these events to happen.

Central processing unit (CPU)

The control unit, ALU, and registers are collectively known as a central processing unit (CPU). Early CPUs were composed of many separate components. Since the 1970s, CPUs have typically been constructed on a single MOS integrated circuit chip called a microprocessor.

Arithmetic logic unit (ALU)

The ALU is capable of performing two classes of operations: arithmetic and logic. The set of arithmetic operations that a particular ALU supports may be limited to addition and subtraction, or might include multiplication, division, trigonometry functions such as sine, cosine, etc., and square roots. Some can operate only on whole numbers (integers) while others use floating point to represent real numbers, albeit with limited precision. However, any computer that is capable of performing just the simplest operations can be programmed to break down the more complex operations into simple steps that it can perform. Therefore, any computer can be programmed to perform any arithmetic operation—although it will take more time to do so if its ALU does not directly support the operation. An ALU may also compare numbers and return Boolean truth values (true or false) depending on whether one is equal to, greater than or less than the other ("is 64 greater than 65?"). Logic operations involve Boolean logic: AND, OR, XOR, and NOT. These can be useful for creating complicated conditional statements and processing Boolean logic.
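As an illustration of breaking a complex operation into simpler ones, the sketch below multiplies two non-negative integers using only addition, bit shifts, and a bit test. It assumes an ALU that lacks a multiply instruction; the function name `multiply` is an illustrative choice.

```python
def multiply(a: int, b: int) -> int:
    """Shift-and-add multiplication for non-negative integers."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    result = 0
    while b:
        if b & 1:        # lowest bit of b is set: this power of two contributes
            result += a
        a <<= 1          # a doubles each round
        b >>= 1          # move on to the next bit of b
    return result


print(multiply(6, 7))  # 42
```

The loop runs once per bit of `b`, which shows why an operation done in software takes more steps than one the ALU supports directly.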

Superscalar computers may contain multiple ALUs, allowing them to process several instructions simultaneously. Graphics processors and computers with SIMD and MIMD features often contain ALUs that can perform arithmetic on vectors and matrices.

Memory

Magnetic-core memory (using magnetic cores) was the computer memory of choice in the 1960s, until it was replaced by semiconductor memory (using MOS memory cells).

A computer's memory can be viewed as a list of cells into which numbers can be placed or read. Each cell has a numbered "address" and can store a single number. The computer can be instructed to "put the number 123 into the cell numbered 1357" or to "add the number that is in cell 1357 to the number that is in cell 2468 and put the answer into cell 1595." The information stored in memory may represent practically anything. Letters, numbers, even computer instructions can be placed into memory with equal ease. Since the CPU does not differentiate between different types of information, it is the software's responsibility to give significance to what the memory sees as nothing but a series of numbers.
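The two instructions quoted above can be mimicked with a plain list standing in for memory. This is only a sketch: the memory size and the value placed in cell 2468 are arbitrary assumptions made for the example.

```python
# Memory modelled as a list of numbered cells (size chosen arbitrarily).
memory = [0] * 4096

memory[1357] = 123                           # "put the number 123 into the cell numbered 1357"
memory[2468] = 77                            # arbitrary value for the example
memory[1595] = memory[1357] + memory[2468]   # add two cells, store the answer in a third
print(memory[1595])                          # 200

# The same stored numbers mean different things depending on how software reads them.
cells = [72, 105]
as_text = bytes(cells).decode("ascii")            # read as letters
as_number = int.from_bytes(bytes(cells), "big")   # read as one 16-bit number
print(as_text, as_number)                         # Hi 18537
```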

In almost all modern computers, each memory cell is set up to store binary numbers in groups of eight bits (called a byte). Each byte is able to represent 256 different numbers (2^8 = 256); either from 0 to 255 or −128 to +127. To store larger numbers, several consecutive bytes may be used (typically, two, four or eight). When negative numbers are required, they are usually stored in two's complement notation. Other arrangements are possible, but are usually not seen outside of specialized applications or historical contexts. A computer can store any kind of information in memory if it can be represented numerically. Modern computers have billions or even trillions of bytes of memory.
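A short sketch of how the same byte can be read as an unsigned value (0 to 255) or as a two's complement signed value (−128 to +127). The helper names `to_signed` and `to_unsigned` are illustrative; `int.to_bytes` and `int.from_bytes` are standard Python methods.

```python
def to_signed(value: int, bits: int = 8) -> int:
    """Interpret an unsigned bit pattern as a two's complement number."""
    if value >= 1 << (bits - 1):   # top bit set means the number is negative
        return value - (1 << bits)
    return value


def to_unsigned(value: int, bits: int = 8) -> int:
    """Return the bit pattern that stores `value` in two's complement."""
    return value & ((1 << bits) - 1)


print(to_signed(255))    # -1
print(to_signed(128))    # -128
print(to_unsigned(-1))   # 255

# Larger numbers span several consecutive bytes.
print((-2).to_bytes(2, "big", signed=True))   # b'\xff\xfe'
print(int.from_bytes(b"\x01\x00", "big"))     # 256
```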

The CPU contains a special set of memory cells called registers that can be read and written to much more rapidly than the main memory area. There are typically between two and one hundred registers depending on the type of CPU. Registers are used for the most frequently needed data items to avoid having to access main memory every time data is needed. As data is constantly being worked on, reducing the need to access main memory (which is often slow compared to the ALU and control units) greatly increases the computer's speed.

Computer main memory comes in two principal varieties: random-access memory (RAM) and read-only memory (ROM).

RAM can be read and written to anytime the CPU commands it, but ROM is preloaded with data and software that never changes, so the CPU can only read from it. ROM is typically used to store the computer's initial start-up instructions. In general, the contents of RAM are erased when the power to the computer is turned off, but ROM retains its data indefinitely. In a PC, the ROM contains a specialized program called the BIOS that orchestrates loading the computer's operating system from the hard disk drive into RAM whenever the computer is turned on or reset. In embedded computers, which frequently do not have disk drives, all of the required software may be stored in ROM. Software stored in ROM is often called firmware, because it is notionally more like hardware than software. Flash memory blurs the distinction between ROM and RAM, as it retains its data when turned off but is also rewritable. It is typically much slower than conventional ROM and RAM, however, so its use is restricted to applications where high speed is unnecessary.

In more sophisticated computers there may be one or more RAM cache memories, which are slower than registers but faster than main memory. Generally computers with this sort of cache are designed to move frequently needed data into the cache automatically, often without the need for any intervention on the programmer's part.
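The effect of a cache can be sketched in software. The class below is a simplified model, not a description of real hardware: it assumes a least-recently-used replacement policy and caches single cells, whereas actual CPU caches work on blocks of memory and use a variety of policies. The name `CachedMemory` is hypothetical.

```python
from collections import OrderedDict


class CachedMemory:
    """A small cache in front of a slower main memory (modelled as a list)."""

    def __init__(self, main_memory: list[int], capacity: int = 4):
        self.main_memory = main_memory
        self.capacity = capacity
        self.cache: OrderedDict[int, int] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> int:
        if address in self.cache:
            self.hits += 1
            self.cache.move_to_end(address)   # mark as most recently used
            return self.cache[address]
        self.misses += 1
        value = self.main_memory[address]     # the slow access
        self.cache[address] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used cell
        return value


mem = CachedMemory(list(range(100)), capacity=2)
for address in [5, 5, 6, 5, 7, 6]:
    mem.read(address)
print(mem.hits, mem.misses)  # 2 4
```

The calling code only ever calls `read`; the caching happens automatically, much as it does for a programmer on a machine with a hardware cache.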

Input/output (I/O)

Hard disk drives are common storage devices used with computers.

I/O is the means by which a computer exchanges information with the outside world. Devices that provide input or output to the computer are called peripherals. On a typical personal computer, peripherals include input devices like the keyboard and mouse, and output devices such as the display and printer. Hard disk drives, floppy disk drives and optical disc drives serve as both input and output devices. Computer networking is another form of I/O. I/O devices are often complex computers in their own right, with their own CPU and memory. A graphics processing unit might contain fifty or more tiny computers that perform the calculations necessary to display 3D graphics. Modern desktop computers contain many smaller computers that assist the main CPU in performing I/O. A 2016-era flat screen display contains its own computer circuitry.

Multitasking

While a computer may be viewed as running one gigantic program stored in its main memory, in some systems it is necessary to give the appearance of running several programs simultaneously. This is achieved by multitasking, i.e., having the computer switch rapidly between running each program in turn. One means by which this is done is with a special signal called an interrupt, which can periodically cause the computer to stop executing instructions where it was and do something else instead. By remembering where it was executing prior to the interrupt, the computer can return to that task later. If several programs are running "at the same time", then the interrupt generator might be causing several hundred interrupts per second, causing a program switch each time. Since modern computers typically execute instructions several orders of magnitude faster than human perception, it may appear that many programs are running at the same time even though only one is ever executing in any given instant. This method of multitasking is sometimes termed "time-sharing" since each program is allocated a "slice" of time in turn.
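Time-sharing can be sketched with Python generators. In this simplified model each `yield` marks the end of a time slice, standing in for the timer interrupt described above; a real operating system preempts programs without their cooperation. The names `program` and `round_robin` are illustrative.

```python
from collections import deque


def program(name: str, steps: int):
    """A toy program that does `steps` units of work, pausing after each one."""
    for i in range(steps):
        yield f"{name}:{i}"   # the generator remembers where it stopped


def round_robin(programs) -> list[str]:
    """Run each program for one slice in turn until all have finished."""
    queue = deque(programs)
    trace = []
    while queue:
        current = queue.popleft()
        try:
            trace.append(next(current))   # resume the program for one slice
        except StopIteration:
            continue                      # finished: do not re-queue it
        queue.append(current)             # otherwise go to the back of the line
    return trace


print(round_robin([program("A", 2), program("B", 3)]))
# ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Only one program runs at any instant, yet the interleaved trace gives the appearance of both progressing together.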

Before the era of inexpensive computers, the principal use for multitasking was to allow many people to share the same computer. Seemingly, multitasking would cause a computer that is switching between several programs to run more slowly, in direct proportion to the number of programs it is running, but most programs spend much of their time waiting for slow input/output devices to complete their tasks. If a program is waiting for the user to click on the mouse or press a key on the keyboard, then it will not take a "time slice" until the event it is waiting for has occurred. This frees up time for other programs to execute so that many programs may be run simultaneously without unacceptable speed loss.

Multiprocessing

Cray designed many supercomputers that used multiprocessing heavily.

Some computers are designed to distribute their work across several CPUs in a multiprocessing configuration, a technique once employed in only large and powerful machines such as supercomputers, mainframe computers and servers. Multiprocessor and multi-core (multiple CPUs on a single integrated circuit) personal and laptop computers are now widely available, and are being increasingly used in lower-end markets as a result.

Supercomputers in particular often have highly unique architectures that differ significantly from the basic stored-program architecture and from general-purpose computers. They often feature thousands of CPUs, customized high-speed interconnects, and specialized computing hardware. Such designs tend to be useful for only specialized tasks due to the large scale of program organization required to use most of the available resources at once. Supercomputers usually see usage in large-scale simulation, graphics rendering, and cryptography applications, as well as with other so-called "embarrassingly parallel" tasks.

Software

Software refers to parts of the computer which do not have a material form, such as programs, data, protocols, etc. Software is that part of a computer system that consists of encoded information or computer instructions, in contrast to the physical hardware from which the system is built. Computer software includes computer programs, libraries and related non-executable data, such as online documentation or digital media. It is often divided into system software and application software. Computer hardware and software require each other and neither can be realistically used on its own. When software is stored in hardware that cannot easily be modified, such as with BIOS ROM in an IBM PC compatible computer, it is sometimes called "firmware".

Operating system / System software
  Unix and BSD: UNIX System V, IBM AIX, HP-UX, Solaris (SunOS), IRIX, List of BSD operating systems
  Linux: List of Linux distributions, Comparison of Linux distributions
  Microsoft Windows: Windows 95, Windows 98, Windows NT, Windows 2000, Windows ME, Windows XP, Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows 10, Windows 11
  DOS: 86-DOS (QDOS), IBM PC DOS, MS-DOS, DR-DOS, FreeDOS
  Macintosh operating systems: Classic Mac OS, macOS (previously OS X and Mac OS X)
  Embedded and real-time: List of embedded operating systems
  Experimental: Amoeba, OberonAOS, Bluebottle, A2, Plan 9 from Bell Labs
Library
  Multimedia: DirectX, OpenGL, OpenAL, Vulkan (API)
  Programming library: C standard library, Standard Template Library
Data
  Protocol: TCP/IP, Kermit, FTP, HTTP, SMTP
  File format: HTML, XML, JPEG, MPEG, PNG
User interface
  Graphical user interface (WIMP): Microsoft Windows, GNOME, KDE, QNX Photon, CDE, GEM, Aqua
  Text-based user interface: Command-line interface, Text user interface
Application software
  Office suite: Word processing, Desktop publishing, Presentation program, Database management system, Scheduling & Time management, Spreadsheet, Accounting software
  Internet access: Browser, Email client, Web server, Mail transfer agent, Instant messaging
  Design and manufacturing: Computer-aided design, Computer-aided manufacturing, Plant management, Robotic manufacturing, Supply chain management
  Graphics: Raster graphics editor, Vector graphics editor, 3D modeler, Animation editor, 3D computer graphics, Video editing, Image processing
  Audio: Digital audio editor, Audio playback, Mixing, Audio synthesis, Computer music
  Software engineering: Compiler, Assembler, Interpreter, Debugger, Text editor, Integrated development environment, Software performance analysis, Revision control, Software configuration management
  Educational: Edutainment, Educational game, Serious game, Flight simulator
  Games: Strategy, Arcade, Puzzle, Simulation, First-person shooter, Platform, Massively multiplayer, Interactive fiction
  Misc: Artificial intelligence, Antivirus software, Malware scanner, Installer/Package management systems, File manager

Languages

There are thousands of different programming languages—some intended for general purpose, others useful for only highly specialized applications.

Programming languages
  Lists of programming languages: Timeline of programming languages, List of programming languages by category, Generational list of programming languages, List of programming languages, Non-English-based programming languages
  Commonly used assembly languages: ARM, MIPS, x86
  Commonly used high-level programming languages: Ada, BASIC, C, C++, C#, COBOL, Fortran, PL/I, REXX, Java, Lisp, Pascal, Object Pascal
  Commonly used scripting languages: Bourne script, JavaScript, Python, Ruby, PHP, Perl

Programs

The defining feature of modern computers which distinguishes them from all other machines is that they can be programmed. That is to say that some type of instructions (the program) can be given to the computer, and it will process them. Modern computers based on the von Neumann architecture often have machine code in the form of an imperative programming language. In practical terms, a computer program may be just a few instructions or extend to many millions of instructions, as do the programs for word processors and web browsers for example. A typical modern computer can execute billions of instructions per second and rarely makes a mistake over many years of operation. Large computer programs consisting of several million instructions may take teams of programmers years to write, and due to the complexity of the task almost certainly contain errors.

Stored program architecture

Replica of the Manchester Baby, the world's first electronic stored-program computer, at the Museum of Science and Industry in Manchester, England

This section applies to most common RAM machine–based computers.

In most cases, computer instructions are simple: add one number to another, move some data from one location to another, send a message to some external device, etc. These instructions are read from the computer's memory and are generally carried out (executed) in the order they were given. However, there are usually specialized instructions to tell the computer to jump ahead or backwards to some other place in the program and to carry on executing from there. These are called "jump" instructions (or branches). Furthermore, jump instructions may be made to happen conditionally so that different sequences of instructions may be used depending on the result of some previous calculation or some external event. Many computers directly support subroutines by providing a type of jump that "remembers" the location it jumped from and another instruction to return to the instruction following that jump instruction.

Program execution might be likened to reading a book. While a person will normally read each word and line in sequence, they may at times jump back to an earlier place in the text or skip sections that are not of interest. Similarly, a computer may sometimes go back and repeat the instructions in some section of the program over and over again until some internal condition is met. This is called the flow of control within the program and it is what allows the computer to perform tasks repeatedly without human intervention.

Comparatively, a person using a pocket calculator can perform a basic arithmetic operation such as adding two numbers with just a few button presses. But to add together all of the numbers from 1 to 1,000 would take thousands of button presses and a lot of time, with a near certainty of making a mistake. On the other hand, a computer may be programmed to do this with just a few simple instructions. The following example is written in the MIPS assembly language:

  begin:
  addi $8, $0, 0           # initialize sum to 0
  addi $9, $0, 1           # set first number to add = 1
  loop:
  slti $10, $9, 1001       # set $10 to 1 while the number is at most 1000
  beq $10, $0, finish      # if the number is greater than 1000, exit the loop
  add $8, $8, $9           # update sum
  addi $9, $9, 1           # get next number
  j loop                   # repeat the summing process
  finish:
  add $2, $8, $0           # put sum in output register

Once told to run this program, the computer will perform the repetitive addition task without further human intervention. It will almost never make a mistake and a modern PC can complete the task in a fraction of a second.
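For comparison, the same summing loop can be sketched in a high-level language. The version below follows the structure of the assembly program (a running sum, a counter, and a conditional exit) rather than the shortest Python idiom; the function name `sum_to` is an illustrative choice.

```python
def sum_to(n: int) -> int:
    """Add together all of the whole numbers from 1 to n."""
    total = 0    # running sum
    number = 1   # first number to add
    while number <= n:
        total += number   # update sum
        number += 1       # get next number
    return total


print(sum_to(1000))  # 500500
```

The result agrees with the closed form n(n + 1)/2, which gives 1000 × 1001 / 2 = 500500.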

Machine code

In most computers, individual instructions are stored as machine code with each instruction being given a unique number (its operation code or opcode for short). The command to add two numbers together would have one opcode; the command to multiply them would have a different opcode, and so on. The simplest computers are able to perform any of a handful of different instructions; the more complex computers have several hundred to choose from, each with a unique numerical code. Since the computer's memory is able to store numbers, it can also store the instruction codes. This leads to the important fact that entire programs (which are just lists of these instructions) can be represented as lists of numbers and can themselves be manipulated inside the computer in the same way as numeric data. The fundamental concept of storing programs in the computer's memory alongside the data they operate on is the crux of the von Neumann, or stored program, architecture.[105][106] In some cases, a computer might store some or all of its program in memory that is kept separate from the data it operates on. This is called the Harvard architecture after the Harvard Mark I computer. Modern von Neumann computers display some traits of the Harvard architecture in their designs, such as in CPU caches.
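To show how an instruction becomes a number, the sketch below packs a MIPS `addi` instruction into a 32-bit word and unpacks it again. It relies on the standard MIPS I-type layout (6-bit opcode, 5-bit source register, 5-bit target register, 16-bit immediate) with opcode 8 for `addi`; the helper names are illustrative.

```python
ADDI_OPCODE = 8  # MIPS opcode for addi (binary 001000)


def encode_addi(rt: int, rs: int, imm: int) -> int:
    """Encode `addi $rt, $rs, imm` as a 32-bit machine word."""
    return (ADDI_OPCODE << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_itype(word: int) -> dict[str, int]:
    """Split a 32-bit I-type word back into its fields."""
    return {
        "opcode": word >> 26,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


word = encode_addi(rt=9, rs=0, imm=1)   # addi $9, $0, 1
print(hex(word))                        # 0x20090001
print(decode_itype(word))               # {'opcode': 8, 'rs': 0, 'rt': 9, 'imm': 1}
```

Because the instruction is just the number 0x20090001, it can sit in a memory cell like any other value.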

While it is possible to write computer programs as long lists of numbers (machine language) and while this technique was used with many early computers,[h] it is extremely tedious and potentially error-prone to do so in practice, especially for complicated programs. Instead, each basic instruction can be given a short name that is indicative of its function and easy to remember – a mnemonic such as ADD, SUB, MULT or JUMP. These mnemonics are collectively known as a computer's assembly language. Converting programs written in assembly language into something the computer can actually understand (machine language) is usually done by a computer program called an assembler.

A 1970s punched card containing one line from a Fortran program. The card reads: "Z(1) = Y + W(1)" and is labeled "PROJ039" for identification purposes.

Programming language

Programming languages provide various ways of specifying programs for computers to run. Unlike natural languages, programming languages are designed to permit no ambiguity and to be concise. They are purely written languages and are often difficult to read aloud. They are generally either translated into machine code by a compiler or an assembler before being run, or translated directly at run time by an interpreter. Sometimes programs are executed by a hybrid method of the two techniques.

Low-level languages

Machine languages and the assembly languages that represent them (collectively termed low-level programming languages) are generally unique to the particular architecture of a computer's central processing unit (CPU). For instance, an ARM architecture CPU (such as may be found in a smartphone or a hand-held videogame) cannot understand the machine language of an x86 CPU that might be in a PC.[i] Historically a significant number of other CPU architectures were created and saw extensive use, notably including the MOS Technology 6502 and 6510 in addition to the Zilog Z80.

High-level languages

Although considerably easier than in machine language, writing long programs in assembly language is often difficult and is also error prone. Therefore, most practical programs are written in more abstract high-level programming languages that are able to express the needs of the programmer more conveniently (and thereby help reduce programmer error). High level languages are usually "compiled" into machine language (or sometimes into assembly language and then into machine language) using another computer program called a compiler.[j] High level languages are less related to the workings of the target computer than assembly language, and more related to the language and structure of the problem(s) to be solved by the final program. It is therefore often possible to use different compilers to translate the same high level language program into the machine language of many different types of computer. This is part of the means by which software like video games may be made available for different computer architectures such as personal computers and various video game consoles.

Program design

Program design of small programs is relatively simple and involves the analysis of the problem, collection of inputs, using the programming constructs within languages, devising or using established procedures and algorithms, providing data for output devices and solutions to the problem as applicable.[107] As problems become larger and more complex, features such as subprograms, modules, formal documentation, and new paradigms such as object-oriented programming are encountered.[108] Large programs involving thousands of lines of code and more require formal software methodologies.[109] The task of developing large software systems presents a significant intellectual challenge.[110] Producing software with an acceptably high reliability within a predictable schedule and budget has historically been difficult;[111] the academic and professional discipline of software engineering concentrates specifically on this challenge.[112]

Bugs

The actual first computer bug, a moth found trapped on a relay of the Harvard Mark II computer

Errors in computer programs are called "bugs". They may be benign and not affect the usefulness of the program, or have only subtle effects. However, in some cases they may cause the program or the entire system to "hang", becoming unresponsive to input such as mouse clicks or keystrokes, to completely fail, or to crash.[113] Otherwise benign bugs may sometimes be harnessed for malicious intent by an unscrupulous user writing an exploit, code designed to take advantage of a bug and disrupt a computer's proper execution. Bugs are usually not the fault of the computer. Since computers merely execute the instructions they are given, bugs are nearly always the result of programmer error or an oversight made in the program's design.[k] Admiral Grace Hopper, an American computer scientist and developer of the first compiler, is credited for having first used the term "bugs" in computing after a dead moth was found shorting a relay in the Harvard Mark II computer in September 1947.[114]

Networking and the Internet

Visualization of a portion of the routes on the Internet

Computers have been used to coordinate information between multiple locations since the 1950s. The U.S. military's SAGE system was the first large-scale example of such a system, which led to a number of special-purpose commercial systems such as Sabre.[115] In the 1970s, computer engineers at research institutions throughout the United States began to link their computers together using telecommunications technology. The effort was funded by ARPA (now DARPA), and the computer network that resulted was called the ARPANET.[116] The technologies that made the ARPANET possible spread and evolved.

In time, the network spread beyond academic and military institutions and became known as the Internet. The emergence of networking involved a redefinition of the nature and boundaries of the computer. Computer operating systems and applications were modified to include the ability to define and access the resources of other computers on the network, such as peripheral devices, stored information, and the like, as extensions of the resources of an individual computer. Initially these facilities were available primarily to people working in high-tech environments, but in the 1990s the spread of applications like e-mail and the World Wide Web, combined with the development of cheap, fast networking technologies like Ethernet and ADSL, saw computer networking become almost ubiquitous. In fact, the number of computers that are networked is growing phenomenally. A very large proportion of personal computers regularly connect to the Internet to communicate and receive information. "Wireless" networking, often utilizing mobile phone networks, has meant networking is becoming increasingly ubiquitous even in mobile computing environments.

Unconventional computers

A computer does not need to be electronic, nor even have a processor, nor RAM, nor even a hard disk. While popular usage of the word "computer" is synonymous with a personal electronic computer,[l] a typical modern definition of a computer is: "A device that computes, especially a programmable [usually] electronic machine that performs high-speed mathematical or logical operations or that assembles, stores, correlates, or otherwise processes information."[117] According to this definition, any device that processes information qualifies as a computer.

Future

There is active research to make computers out of many promising new types of technology, such as optical computers, DNA computers, neural computers, and quantum computers. Most computers are universal, and are able to calculate any computable function, and are limited only by their memory capacity and operating speed. However different designs of computers can give very different performance for particular problems; for example quantum computers can potentially break some modern encryption algorithms (by quantum factoring) very quickly.

Computer architecture paradigms

There are many types of computer architectures:

Of all these abstract machines, a quantum computer holds the most promise for revolutionizing computing.[118] Logic gates are a common abstraction which can apply to most of the above digital or analog paradigms. The ability to store and execute lists of instructions called programs makes computers extremely versatile, distinguishing them from calculators. The Church–Turing thesis is a mathematical statement of this versatility: any computer with a minimum capability (being Turing-complete) is, in principle, capable of performing the same tasks that any other computer can perform. Therefore, any type of computer (netbook, supercomputer, cellular automaton, etc.) is able to perform the same computational tasks, given enough time and storage capacity.

Artificial intelligence

A computer will solve problems in exactly the way it is programmed to, without regard to efficiency, alternative solutions, possible shortcuts, or possible errors in the code. Computer programs that learn and adapt are part of the emerging field of artificial intelligence and machine learning. Artificial intelligence based products generally fall into two major categories: rule-based systems and pattern recognition systems. Rule-based systems attempt to represent the rules used by human experts and tend to be expensive to develop. Pattern-based systems use data about a problem to generate conclusions. Examples of pattern-based systems include voice recognition, font recognition, translation and the emerging field of on-line marketing.

Explainable artificial intelligence

Explainable AI (XAI), also known as Interpretable AI, or Explainable Machine Learning (XML), is artificial intelligence (AI) in which humans can understand the reasoning behind decisions or predictions made by the AI. It contrasts with the "black box" concept in machine learning, where even the AI's designers cannot explain why it arrived at a specific decision.

XAI hopes to help users of AI-powered systems perform more effectively by improving their understanding of how those systems reason. XAI may be an implementation of the social right to explanation. Even if there is no such legal right or regulatory requirement, XAI can improve the user experience of a product or service by helping end users trust that the AI is making good decisions. XAI aims to explain what has been done, what is being done, and what will be done next, and to unveil which information these actions are based on. This makes it possible to confirm existing knowledge, challenge existing knowledge, and generate new assumptions.

Machine learning (ML) algorithms used in AI can be categorized as white-box or black-box. White-box models provide results that are understandable to experts in the domain. Black-box models, on the other hand, are extremely hard to explain and can hardly be understood even by domain experts. XAI algorithms follow the three principles of transparency, interpretability, and explainability. A model is transparent “if the processes that extract model parameters from training data and generate labels from testing data can be described and motivated by the approach designer.” Interpretability describes the possibility of comprehending the ML model and presenting the underlying basis for decision-making in a way that is understandable to humans. Explainability is a concept that is recognized as important, but a consensus definition is not available. One possibility is “the collection of features of the interpretable domain that have contributed, for a given example, to producing a decision (e.g., classification or regression)”. If algorithms fulfill these principles, they provide a basis for justifying decisions, tracking them and thereby verifying them, improving the algorithms, and exploring new facts.

Sometimes it is also possible to achieve a high-accuracy result with a white-box ML algorithm that is interpretable in itself. This is especially important in domains like medicine, defense, finance, and law, where it is crucial to understand decisions and build trust in the algorithms. Many researchers argue that, at least for supervised machine learning, the way forward is symbolic regression, where the algorithm searches the space of mathematical expressions to find the model that best fits a given dataset.
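The search that symbolic regression performs can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the candidate building blocks in `BASIS`, the model form `a*f(x) + b`, the small integer coefficient grid, and the synthetic data. Practical systems explore far larger expression spaces, typically with genetic programming or similar heuristics.

```python
import itertools
import math

# Hypothetical building blocks for candidate expressions.
BASIS = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}

def mean_squared_error(model, xs, ys):
    """Average squared difference between model predictions and targets."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def symbolic_search(xs, ys, coefficients=range(-3, 4)):
    """Exhaustively try every expression a*f(x) + b and keep the best fit."""
    best = None
    for (name, f), a, b in itertools.product(BASIS.items(), coefficients, coefficients):
        model = lambda x, f=f, a=a, b=b: a * f(x) + b
        error = mean_squared_error(model, xs, ys)
        if best is None or error < best[0]:
            best = (error, f"{a}*{name} + {b}")
    return best

# Synthetic data generated from y = 3x^2 + 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [3 * x * x + 1 for x in xs]
error, expression = symbolic_search(xs, ys)
print(expression, error)
```

The result is a readable formula rather than a set of opaque weights, which is the property that makes this family of methods attractive for interpretability.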

AI systems optimize behavior to satisfy a mathematically specified goal system chosen by the system designers, such as the command "maximize accuracy of assessing how positive film reviews are in the test dataset." The AI may learn useful general rules from the test set, such as "reviews containing the word 'horrible' are likely to be negative." However, it may also learn inappropriate rules, such as "reviews containing 'Daniel Day-Lewis' are usually positive"; such rules may be undesirable if they are likely to fail to generalize outside the training set, or if people consider the rule to be "cheating" or "unfair." A human can audit rules in an XAI to get an idea of how likely the system is to generalize to future real-world data outside the test set.

Goals

Cooperation between agents, in this case algorithms and humans, depends on trust. If humans are to accept algorithmic prescriptions, they need to trust them. Incompleteness in the formalization of trust criteria is a barrier to straightforward optimization approaches. Transparency, interpretability, and explainability are intermediate goals on the road to these more comprehensive trust criteria. This is particularly relevant in medicine, especially with clinical decision support systems (CDSS), in which medical professionals should be able to understand how and why a machine-based decision was made in order to trust the decision and augment their decision-making process.

AI systems sometimes learn undesirable tricks that do an optimal job of satisfying explicit pre-programmed goals on the training data but do not reflect the more nuanced implicit desires of the human system designers or the full complexity of the domain data. For example, a 2017 system tasked with image recognition learned to "cheat" by looking for a copyright tag that happened to be associated with horse pictures rather than learning how to tell if a horse was actually pictured. In another 2017 system, a supervised learning AI tasked with grasping items in a virtual world learned to cheat by placing its manipulator between the object and the viewer in a way such that it falsely appeared to be grasping the object.

One transparency project, the DARPA XAI program, aims to produce "glass box" models that are explainable to a "human-in-the-loop" without greatly sacrificing AI performance. Human users of such a system can understand the AI's cognition (both in real-time and after the fact) and can determine whether to trust the AI. Other applications of XAI are knowledge extraction from black-box models and model comparisons. In the context of monitoring systems for ethical and socio-legal compliance, the term "glass box" is commonly used to refer to tools that track the inputs and outputs of the system in question, and provide value-based explanations for their behavior. These tools aim to ensure that the system operates in accordance with ethical and legal standards, and that its decision-making processes are transparent and accountable. The term "glass box" is often used in contrast to "black box" systems, which lack transparency and can be more difficult to monitor and regulate. The term is also used to name a voice assistant that produces counterfactual statements as explanations.

History and methods

During the 1970s to 1990s, symbolic reasoning systems, such as MYCIN, GUIDON, SOPHIE, and PROTOS could represent, reason about, and explain their reasoning for diagnostic, instructional, or machine-learning (explanation-based learning) purposes. MYCIN, developed in the early 1970s as a research prototype for diagnosing bacteremia infections of the bloodstream, could explain which of its hand-coded rules contributed to a diagnosis in a specific case. Research in intelligent tutoring systems resulted in developing systems such as SOPHIE that could act as an "articulate expert", explaining problem-solving strategy at a level the student could understand, so they would know what action to take next. For instance, SOPHIE could explain the qualitative reasoning behind its electronics troubleshooting, even though it ultimately relied on the SPICE circuit simulator. Similarly, GUIDON added tutorial rules to supplement MYCIN's domain-level rules so it could explain strategy for medical diagnosis. Symbolic approaches to machine learning, especially those relying on explanation-based learning, such as PROTOS, explicitly relied on representations of explanations, both to explain their actions and to acquire new knowledge.

In the 1980s through early 1990s, truth maintenance systems (TMS) extended the capabilities of causal-reasoning, rule-based, and logic-based inference systems. A TMS explicitly tracks alternate lines of reasoning, justifications for conclusions, and lines of reasoning that lead to contradictions, allowing future reasoning to avoid these dead ends. To provide explanations, a TMS traces reasoning from conclusions to assumptions through rule operations or logical inferences, allowing explanations to be generated from the reasoning traces. As an example, consider a rule-based problem solver with just a few rules about Socrates that concludes he has died from poison.

By just tracing through the dependency structure, the problem solver can construct the following explanation: "Socrates died because he was mortal and drank poison, and all mortals die when they drink poison. Socrates was mortal because he was a man and all men are mortal. Socrates drank poison because he held dissident beliefs, the government was conservative, and those holding dissident beliefs under conservative governments must drink poison."
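The tracing step can be sketched in a few lines. This is a minimal illustration and not a real truth maintenance system: the `JUSTIFICATIONS` table is a hypothetical encoding of the Socrates example, with rule wording paraphrased from the explanation above, and the `explain` helper is an assumed name.

```python
# Each conclusion maps to the premises that justify it. Statements that do not
# appear as keys are treated as assumptions or general rules.
JUSTIFICATIONS = {
    "Socrates died": [
        "Socrates was mortal",
        "Socrates drank poison",
        "all mortals die when they drink poison",
    ],
    "Socrates was mortal": [
        "Socrates was a man",
        "all men are mortal",
    ],
    "Socrates drank poison": [
        "Socrates held dissident beliefs",
        "the government was conservative",
        "dissidents under conservative governments must drink poison",
    ],
}

def explain(conclusion, justifications):
    """Walk from a conclusion back to its assumptions, one sentence per step."""
    sentences = []
    pending = [conclusion]
    seen = set()
    while pending:
        current = pending.pop(0)
        if current in seen or current not in justifications:
            continue  # already explained, or an assumption with nothing to trace
        seen.add(current)
        premises = justifications[current]
        sentences.append(f"{current} because: " + "; ".join(premises) + ".")
        pending.extend(premises)
    return sentences

for sentence in explain("Socrates died", JUSTIFICATIONS):
    print(sentence)
```

Because every conclusion stores its own justification, the explanation falls out of the recorded dependencies and needs no separate explanation model.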

By the 1990s researchers began studying whether it is possible to meaningfully extract the non-hand-coded rules being generated by opaque trained neural networks. Researchers in clinical expert systems creating neural network-powered decision support for clinicians sought to develop dynamic explanations that allow these technologies to be more trusted and trustworthy in practice. In the 2010s public concerns about racial and other bias in the use of AI for criminal sentencing decisions and findings of creditworthiness may have led to increased demand for transparent artificial intelligence. As a result, many academics and organizations are developing tools to help detect bias in their systems.

Marvin Minsky et al. raised the issue that AI can function as a form of surveillance, with the biases inherent in surveillance, suggesting HI (Humanistic Intelligence) as a way to create a more fair and balanced "human-in-the-loop" AI.

Modern complex AI techniques, such as deep learning and genetic algorithms, are naturally opaque. To address this issue, methods have been developed to make new models more explainable and interpretable. This includes layerwise relevance propagation (LRP), a technique for determining which features in a particular input vector contribute most strongly to a neural network's output. Other techniques explain some particular prediction made by a (nonlinear) black-box model, a goal referred to as "local interpretability". The mere transposition of the concepts of local interpretability into a remote context (where the black-box model is executed at a third party) is currently under scrutiny.

There has been work on making glass-box models which are more transparent to inspection. This includes decision trees, Bayesian networks, sparse linear models, and more. The Association for Computing Machinery Conference on Fairness, Accountability, and Transparency (ACM FAccT) was established in 2018 to study transparency and explainability in the context of socio-technical systems, many of which include artificial intelligence.

Some techniques allow visualisations of the inputs which individual software neurons respond to most strongly. Several groups found that neurons can be aggregated into circuits that perform human-comprehensible functions, some of which reliably arise across different networks trained independently.

There are various techniques to extract compressed representations of the features of given inputs, which can then be analysed by standard clustering techniques. Alternatively, networks can be trained to output linguistic explanations of their behaviour, which are then directly human-interpretable. Model behaviour can also be explained with reference to training data—for example, by evaluating which training inputs influenced a given behaviour the most.

Regulation

As regulators, official bodies, and general users come to depend on AI-based dynamic systems, clearer accountability will be required for automated decision-making processes to ensure trust and transparency. The first global conference exclusively dedicated to this emerging discipline was the 2017 International Joint Conference on Artificial Intelligence: Workshop on Explainable Artificial Intelligence (XAI).

The European Union introduced a right to explanation in the General Data Protection Regulation (GDPR) to address potential problems stemming from the rising importance of algorithms. The implementation of the regulation began in 2018. However, the right to explanation in GDPR covers only the local aspect of interpretability. In the United States, insurance companies are required to be able to explain their rate and coverage decisions. In France the Loi pour une République numérique (Digital Republic Act) grants subjects the right to request and receive information pertaining to the implementation of algorithms that process data about them.

Limitations

Despite efforts to increase the explainability of AI models, they still have a number of limitations.

Adversarial parties

By making an AI system more explainable, we also reveal more of its inner workings. For example, the explainability method of feature importance identifies features or variables that are most important in determining the model's output, while the influential samples method identifies the training samples that are most influential in determining the output, given a particular input. Adversarial parties could take advantage of this knowledge.
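One common way to estimate feature importance is permutation importance: shuffle one feature across the dataset and measure how much the model's accuracy drops. The sketch below assumes a toy model and synthetic data, and the function names are illustrative rather than taken from any particular library.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, feature, rng):
    """Drop in accuracy after shuffling one feature column across the rows."""
    baseline = accuracy(model, rows, labels)
    column = [row[feature] for row in rows]
    rng.shuffle(column)
    shuffled = [row[:feature] + [value] + row[feature + 1:]
                for row, value in zip(rows, column)]
    return baseline - accuracy(model, shuffled, labels)

rng = random.Random(0)
# Feature 0 determines the label; feature 1 is pure noise.
rows = [[i, rng.random()] for i in range(20)]
labels = [int(row[0] >= 10) for row in rows]
model = lambda row: int(row[0] >= 10)  # toy model that ignores feature 1

importance_0 = permutation_importance(model, rows, labels, 0, rng)
importance_1 = permutation_importance(model, rows, labels, 1, rng)
print(importance_0, importance_1)
```

Shuffling the noise feature cannot change this model's predictions, so its importance is exactly zero, while shuffling the decisive feature can only lower accuracy. Scores of this kind are exactly the sort of information that reveals which inputs a system depends on.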

For example, competitor firms could replicate aspects of the original AI system in their own product, thus reducing competitive advantage. An explainable AI system is also susceptible to being “gamed”—influenced in a way that undermines its intended purpose. One study gives the example of a predictive policing system; in this case, those who could potentially “game” the system are the criminals subject to the system's decisions. In this study, developers of the system discussed the issue of criminal gangs looking to illegally obtain passports, and they expressed concerns that, if given an idea of what factors might trigger an alert in the passport application process, those gangs would be able to “send guinea pigs” to test those triggers, eventually finding a loophole that would allow them to “reliably get passports from under the noses of the authorities”.

Technical complexity

A fundamental barrier to making AI systems explainable is the technical complexity of such systems. End users often lack the coding knowledge required to understand software of any kind. Current methods used to explain AI are mainly technical ones, geared toward machine learning engineers for debugging purposes, rather than toward the end users who are ultimately affected by the system, causing “a gap between explainability in practice and the goal of transparency”. Proposed solutions to address the issue of technical complexity include either promoting the coding education of the general public so technical explanations would be more accessible to end users, or providing explanations in layperson terms.

The solution must avoid oversimplification. It is important to strike a balance between accuracy – how faithfully the explanation reflects the process of the AI system – and explainability – how well end users understand the process. This is a difficult balance to strike, since the complexity of machine learning makes it difficult for even ML engineers to fully understand, let alone non-experts.

Understanding versus trust

The goal of explainability to end users of AI systems is to increase trust in the systems, even “address concerns about lack of ‘fairness’ and discriminatory effects”. However, even with a good understanding of an AI system, end users may not necessarily trust the system. In one study, participants were presented with combinations of white-box and black-box explanations, and static and interactive explanations of AI systems. While these explanations served to increase both their self-reported and objective understanding, they had no impact on their level of trust, which remained skeptical.

This outcome was especially true for decisions that impacted the end user in a significant way, such as graduate school admissions. Participants judged algorithms to be too inflexible and unforgiving in comparison to human decision-makers; instead of rigidly adhering to a set of rules, humans are able to consider exceptional cases as well as appeals to their initial decision. For such decisions, explainability will not necessarily cause end users to accept the use of decision-making algorithms. We will need to either turn to another method to increase trust and acceptance of decision-making algorithms, or question the need to rely solely on AI for such impactful decisions in the first place.

Criticism

Scholars have suggested that explainability in AI should be considered a goal secondary to AI effectiveness, and that encouraging the exclusive development of XAI may limit the functionality of AI more broadly. Critiques of XAI rely on developed concepts of mechanistic and empiric reasoning from evidence-based medicine to suggest that AI technologies can be clinically validated even when their function cannot be understood by their operators.

Moreover, XAI systems have primarily focused on making AI systems understandable to AI practitioners rather than end users, and their results on user perceptions of these systems have been somewhat fragmented. Some researchers advocate the use of inherently interpretable machine learning models, rather than using post-hoc explanations in which a second model is created to explain the first. This is partly because post-hoc models increase the complexity in a decision pathway and partly because it is often unclear how faithfully a post-hoc explanation can mimic the computations of an entirely separate model. However, another view is that what is important is that the explanation accomplishes the given task at hand, and whether it is pre or post-hoc doesn't matter. If a post-hoc explanation method helps a doctor diagnose cancer better, it is of secondary importance whether it is a correct/incorrect explanation.

The goals of XAI amount to a form of lossy compression that will become less effective as AI models grow in their number of parameters. Along with other factors this leads to a theoretical limit for explainability.

Supercomputer

From Wikipedia, the free encyclopedia
The IBM Blue Gene/P supercomputer "Intrepid" at Argonne National Laboratory runs 164,000 processor cores using normal data center air conditioning, grouped in 40 racks/cabinets connected by a high-speed 3D torus network.
 
Computing power of the top 1 supercomputer each year, measured in FLOPS

A supercomputer is a computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instructions per second (MIPS). Since 2017, there have existed supercomputers which can perform over 10^17 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS). For comparison, a desktop computer has performance in the range of hundreds of gigaFLOPS (10^11) to tens of teraFLOPS (10^13). Since November 2017, all of the world's fastest 500 supercomputers run on Linux-based operating systems. Additional research is being conducted in the United States, the European Union, Taiwan, Japan, and China to build faster, more powerful and technologically superior exascale supercomputers.

Supercomputers play an important role in the field of computational science, and are used for a wide range of computationally intensive tasks in various fields, including quantum mechanics, weather forecasting, climate research, oil and gas exploration, molecular modeling (computing the structures and properties of chemical compounds, biological macromolecules, polymers, and crystals), and physical simulations (such as simulations of the early moments of the universe, airplane and spacecraft aerodynamics, the detonation of nuclear weapons, and nuclear fusion). They have been essential in the field of cryptanalysis.

Supercomputers were introduced in the 1960s, and for several decades the fastest were made by Seymour Cray at Control Data Corporation (CDC), Cray Research and subsequent companies bearing his name or monogram. The first such machines were highly tuned conventional designs that ran more quickly than their more general-purpose contemporaries. Through the decade, increasing amounts of parallelism were added, with one to four processors being typical. In the 1970s, vector processors operating on large arrays of data came to dominate. A notable example is the highly successful Cray-1 of 1976. Vector computers remained the dominant design into the 1990s. Since then, massively parallel supercomputers with tens of thousands of off-the-shelf processors have become the norm.

The US has long been the leader in the supercomputer field, first through Cray's almost uninterrupted dominance of the field, and later through a variety of technology companies. Japan made major strides in the field in the 1980s and 90s, with China becoming increasingly active in the field. As of May 2022, the fastest supercomputer on the TOP500 supercomputer list is Frontier, in the US, with a LINPACK benchmark score of 1.102 ExaFlop/s, followed by Fugaku. The US has five of the top 10; China has two; Japan, Finland, and France have one each. In June 2018, all combined supercomputers on the TOP500 list broke the 1 exaFLOPS mark.

History

A circuit board from the IBM 7030
 
The CDC 6600. Behind the system console are two of the "arms" of the plus-sign shaped cabinet with the covers opened. Each arm of the machine had up to four such racks. On the right is the cooling system.
 
A Cray-1 preserved at the Deutsches Museum

In 1960, UNIVAC built the Livermore Atomic Research Computer (LARC), today considered among the first supercomputers, for the US Navy Research and Development Center. It still used high-speed drum memory, rather than the newly emerging disk drive technology. Also among the first supercomputers was the IBM 7030 Stretch. The IBM 7030 was built by IBM for the Los Alamos National Laboratory, which in 1955 had requested a computer 100 times faster than any existing computer. The IBM 7030 used transistors, magnetic core memory, pipelined instructions, and data prefetched through a memory controller, and it included pioneering random access disk drives. The IBM 7030 was completed in 1961 and, despite not meeting the challenge of a hundredfold increase in performance, it was purchased by the Los Alamos National Laboratory. Customers in England and France also bought the computer, and it became the basis for the IBM 7950 Harvest, a supercomputer built for cryptanalysis.

The third pioneering supercomputer project in the early 1960s was the Atlas at the University of Manchester, built by a team led by Tom Kilburn. He designed the Atlas to have memory space for up to a million words of 48 bits, but because magnetic storage with such a capacity was unaffordable, the actual core memory of the Atlas was only 16,000 words, with a drum providing memory for a further 96,000 words. The Atlas operating system swapped data in the form of pages between the magnetic core and the drum. The Atlas operating system also introduced time-sharing to supercomputing, so that more than one program could be executed on the supercomputer at any one time. Atlas was a joint venture between Ferranti and the University of Manchester and was designed to operate at processing speeds approaching one microsecond per instruction, about one million instructions per second.

The CDC 6600, designed by Seymour Cray, was finished in 1964 and marked the transition from germanium to silicon transistors. Silicon transistors could run more quickly, and the overheating problem was solved by introducing refrigeration to the supercomputer design. Thus, the CDC 6600 became the fastest computer in the world. Given that the 6600 outperformed all the other contemporary computers by about 10 times, it was dubbed a supercomputer and defined the supercomputing market, when one hundred computers were sold at $8 million each.

Cray left CDC in 1972 to form his own company, Cray Research. Four years after leaving CDC, Cray delivered the 80 MHz Cray-1 in 1976, which became one of the most successful supercomputers in history. The Cray-2 was released in 1985. It had eight central processing units (CPUs) and was liquid cooled, with the electronics coolant Fluorinert pumped through the supercomputer architecture. It reached 1.9 gigaFLOPS, making it the first supercomputer to break the gigaflop barrier.

Massively parallel designs

A cabinet of the massively parallel Blue Gene/L, showing the stacked blades, each holding many processors

The only computer to seriously challenge the Cray-1's performance in the 1970s was the ILLIAC IV. This machine was the first realized example of a true massively parallel computer, in which many processors worked together to solve different parts of a single larger problem. In contrast with the vector systems, which were designed to run a single stream of data as quickly as possible, in this concept, the computer instead feeds separate parts of the data to entirely different processors and then recombines the results. The ILLIAC's design was finalized in 1966 with 256 processors and offered speeds of up to 1 GFLOPS, compared to the 1970s Cray-1's peak of 250 MFLOPS. However, development problems led to only 64 processors being built, and the system could never operate more quickly than about 200 MFLOPS while being much larger and more complex than the Cray. Another problem was that writing software for the system was difficult, and getting peak performance from it was a matter of serious effort.
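The idea of feeding separate parts of the data to different processors and then recombining the results can be sketched as a scatter, compute, gather pattern. This is an illustration of the decomposition only: it uses Python threads as stand-ins for processors, so it does not model the performance of a real parallel machine, and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def split(data, parts):
    """Divide data into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_sum_of_squares(chunk):
    """Work done independently by one worker on its own part of the data."""
    return sum(value * value for value in chunk)

def parallel_sum_of_squares(data, workers=4):
    """Scatter chunks to workers, then recombine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)

print(parallel_sum_of_squares(list(range(10))))
```

The difficulty noted in the text shows up even here: the programmer has to decide how to split the data and how to combine partial results, and those choices depend on the problem.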

But the partial success of the ILLIAC IV was widely seen as pointing the way to the future of supercomputing. Cray argued against this, famously quipping that "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?" But by the early 1980s, several teams were working on parallel designs with thousands of processors, notably the Connection Machine (CM) that developed from research at MIT. The CM-1 used as many as 65,536 simplified custom microprocessors connected together in a network to share data. Several updated versions followed; the CM-5 supercomputer is a massively parallel processing computer capable of many billions of arithmetic operations per second.

In 1982, Osaka University's LINKS-1 Computer Graphics System used a massively parallel processing architecture, with 514 microprocessors, including 257 Zilog Z8001 control processors and 257 iAPX 86/20 floating-point processors. It was mainly used for rendering realistic 3D computer graphics. Fujitsu's VPP500 from 1992 is unusual since, to achieve higher speeds, its processors used GaAs, a material normally reserved for microwave applications due to its toxicity. Fujitsu's Numerical Wind Tunnel supercomputer used 166 vector processors to gain the top spot in 1994 with a peak speed of 1.7 gigaFLOPS (GFLOPS) per processor. The Hitachi SR2201 obtained a peak performance of 600 GFLOPS in 1996 by using 2048 processors connected via a fast three-dimensional crossbar network. The Intel Paragon could have 1000 to 4000 Intel i860 processors in various configurations and was ranked the fastest in the world in 1993. The Paragon was a MIMD machine which connected processors via a high speed two-dimensional mesh, allowing processes to execute on separate nodes, communicating via the Message Passing Interface.

Software development remained a problem, but the CM series sparked off considerable research into this issue. Similar designs using custom hardware were made by many companies, including the Evans & Sutherland ES-1, MasPar, nCUBE, Intel iPSC and the Goodyear MPP. But by the mid-1990s, general-purpose CPU performance had improved so much that a supercomputer could be built using them as the individual processing units, instead of using custom chips. By the turn of the 21st century, designs featuring tens of thousands of commodity CPUs were the norm, with later machines adding graphics units to the mix.

The CPU share of TOP500
 
Diagram of a three-dimensional torus interconnect used by systems such as Blue Gene, Cray XT3, etc.

Systems with a massive number of processors generally take one of two paths. In the grid computing approach, the processing power of many computers, organized as distributed, diverse administrative domains, is opportunistically used whenever a computer is available. In another approach, many processors are used in proximity to each other, e.g. in a computer cluster. In such a centralized massively parallel system the speed and flexibility of the interconnect become very important, and modern supercomputers have used various approaches ranging from enhanced InfiniBand systems to three-dimensional torus interconnects. The use of multi-core processors combined with centralization is an emerging direction, e.g. as in the Cyclops64 system.
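The structure of a three-dimensional torus interconnect can be made concrete with a short sketch. Each node is linked to its two neighbors along each axis, with wraparound at the edges. The node coordinates, the 4 x 4 x 4 size, and the function names below are assumptions for illustration and do not describe any particular machine.

```python
from itertools import product

def torus_neighbors(node, dims):
    """Return the six neighbors of a node in a 3D torus with wraparound links.

    Assumes every dimension has at least 3 nodes, so the neighbors are distinct.
    """
    neighbors = []
    for axis in range(3):
        for step in (1, -1):
            coords = list(node)
            coords[axis] = (coords[axis] + step) % dims[axis]
            neighbors.append(tuple(coords))
    return neighbors

def torus_hops(a, b, dims):
    """Minimum number of link hops between two nodes, wrapping where shorter."""
    return sum(min(abs(p - q), size - abs(p - q))
               for p, q, size in zip(a, b, dims))

dims = (4, 4, 4)
print(torus_neighbors((0, 0, 0), dims))
# Largest hop count from the origin to any other node in the torus.
print(max(torus_hops((0, 0, 0), node, dims)
          for node in product(range(4), repeat=3)))
```

The wraparound links are what keep the worst-case hop count low: in this 64-node example no node is more than six hops from any other.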

As the price, performance and energy efficiency of general-purpose graphics processing units (GPGPUs) have improved, a number of petaFLOPS supercomputers such as Tianhe-I and Nebulae have started to rely on them. However, other systems such as the K computer continue to use conventional processors such as SPARC-based designs. The overall applicability of GPGPUs in general-purpose high-performance computing applications has been the subject of debate: while a GPGPU may be tuned to score well on specific benchmarks, its applicability to everyday algorithms may be limited unless significant effort is spent tuning the application to it. Nevertheless, GPUs are gaining ground, and in 2012 the Jaguar supercomputer was transformed into Titan by retrofitting CPUs with GPUs.

High-performance computers have an expected life cycle of about three years before requiring an upgrade. The Gyoukou supercomputer is unique in that it uses both a massively parallel design and liquid immersion cooling.

Special purpose supercomputers

A number of special-purpose systems have been designed, dedicated to a single problem. This allows the use of specially programmed FPGA chips or even custom ASICs, allowing better price/performance ratios by sacrificing generality. Examples of special-purpose supercomputers include Belle, Deep Blue, and Hydra for playing chess, Gravity Pipe for astrophysics, MDGRAPE-3 for protein structure prediction and molecular dynamics, and Deep Crack for breaking the DES cipher.

Energy usage and heat management

The Summit supercomputer was as of November 2018 the fastest supercomputer in the world. With a measured power efficiency of 14.668 GFlops/watt it is also the third most energy efficient in the world.

Throughout the decades, the management of heat density has remained a key issue for most centralized supercomputers. The large amount of heat generated by a system may also have other effects, e.g. reducing the lifetime of other system components. There have been diverse approaches to heat management, from pumping Fluorinert through the system, to a hybrid liquid-air cooling system or air cooling with normal air conditioning temperatures. A typical supercomputer consumes large amounts of electrical power, almost all of which is converted into heat, requiring cooling. For example, Tianhe-1A consumes 4.04 megawatts (MW) of electricity. The cost to power and cool the system can be significant, e.g. 4 MW at $0.10/kWh is $400 an hour or about $3.5 million per year.
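The cost figure above follows from simple arithmetic, reproduced here as a small sketch. The function name is illustrative, and the calculation assumes the system draws constant power around the clock for a 365-day year.

```python
HOURS_PER_YEAR = 24 * 365

def power_cost(megawatts, dollars_per_kwh):
    """Return (cost per hour, cost per year) for a constant power draw."""
    kilowatts = megawatts * 1000
    per_hour = kilowatts * dollars_per_kwh
    return per_hour, per_hour * HOURS_PER_YEAR

hourly, yearly = power_cost(4, 0.10)
print(f"${hourly:,.0f} per hour, ${yearly:,.0f} per year")
```

At 4 MW and $0.10/kWh this gives $400 per hour and roughly $3.5 million per year, matching the figures quoted in the text.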

Heat management is a major issue in complex electronic devices and affects powerful computer systems in various ways. The thermal design power and CPU power dissipation issues in supercomputing surpass those of traditional computer cooling technologies. The supercomputing awards for green computing reflect this issue.

The packing of thousands of processors together inevitably generates significant amounts of heat density that need to be dealt with. The Cray-2 was liquid cooled, and used a Fluorinert "cooling waterfall" which was forced through the modules under pressure. However, the submerged liquid cooling approach was not practical for the multi-cabinet systems based on off-the-shelf processors, and in System X a special cooling system that combined air conditioning with liquid cooling was developed in conjunction with the Liebert company.

In the Blue Gene system, IBM deliberately used low power processors to deal with heat density. The IBM Power 775, released in 2011, has closely packed elements that require water cooling. The IBM Aquasar system uses hot water cooling to achieve energy efficiency, the water being used to heat buildings as well.

The energy efficiency of computer systems is generally measured in terms of "FLOPS per watt". In 2008, Roadrunner by IBM operated at 3.76 MFLOPS/W. In November 2010, the Blue Gene/Q reached 1,684 MFLOPS/W and in June 2011 the top two spots on the Green 500 list were occupied by Blue Gene machines in New York (one achieving 2097 MFLOPS/W) with the DEGIMA cluster in Nagasaki placing third with 1375 MFLOPS/W.

Because copper wires can transfer energy into a supercomputer with much higher power densities than forced air or circulating refrigerants can remove waste heat, the ability of the cooling systems to remove waste heat is a limiting factor. As of 2015, many existing supercomputers have more infrastructure capacity than the actual peak demand of the machine – designers generally conservatively design the power and cooling infrastructure to handle more than the theoretical peak electrical power consumed by the supercomputer. Designs for future supercomputers are power-limited – the thermal design power of the supercomputer as a whole, the amount that the power and cooling infrastructure can handle, is somewhat more than the expected normal power consumption, but less than the theoretical peak power consumption of the electronic hardware.

Software and system management

Operating systems

Since the end of the 20th century, supercomputer operating systems have undergone major transformations, based on the changes in supercomputer architecture. While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been to move away from in-house operating systems to the adaptation of generic software such as Linux.

Since modern massively parallel supercomputers typically separate computations from other services by using multiple types of nodes, they usually run different operating systems on different nodes, e.g. using a small and efficient lightweight kernel such as CNK or CNL on compute nodes, but a larger system such as a Linux derivative on server and I/O nodes.

While in a traditional multi-user computer system job scheduling is, in effect, a tasking problem for processing and peripheral resources, in a massively parallel system, the job management system needs to manage the allocation of both computational and communication resources, as well as gracefully deal with inevitable hardware failures when tens of thousands of processors are present.

Although most modern supercomputers use Linux-based operating systems, each manufacturer has its own specific Linux-derivative, and no industry standard exists, partly due to the fact that the differences in hardware architectures require changes to optimize the operating system to each hardware design.

Software tools and message passing

Wide-angle view of the ALMA correlator

The parallel architectures of supercomputers often dictate the use of special programming techniques to exploit their speed. Software tools for distributed processing include standard APIs such as MPI and PVM, VTL, and open source software such as Beowulf.

In the most common scenario, environments such as PVM and MPI for loosely connected clusters and OpenMP for tightly coordinated shared memory machines are used. Significant effort is required to optimize an algorithm for the interconnect characteristics of the machine it will be run on; the aim is to prevent any of the CPUs from wasting time waiting on data from other nodes. GPGPUs have hundreds of processor cores and are programmed using programming models such as CUDA or OpenCL.
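The message-passing style described above can be illustrated with a minimal sketch. It uses only Python's standard library (threads and queues) rather than MPI or PVM, so it shows the scatter, compute, and gather pattern, not the API of either library. The names `worker` and `parallel_sum` are hypothetical, and Python threads give no real speedup for this kind of arithmetic; the point is only how workers exchange data through explicit messages.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers and send back (rank, partial sum)."""
    chunk = inbox.get()              # blocking receive from the coordinator
    outbox.put((rank, sum(chunk)))   # send the partial result back


def parallel_sum(data, num_workers=4):
    """Scatter data to workers, then gather and reduce their partial sums."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for thread in threads:
        thread.start()

    # Scatter: worker r gets every num_workers-th element starting at r.
    for rank in range(num_workers):
        inboxes[rank].put(data[rank::num_workers])

    # Gather and reduce: messages may arrive in any order.
    total = 0
    for _ in range(num_workers):
        _rank, partial = results.get()
        total += partial

    for thread in threads:
        thread.join()
    return total


print(parallel_sum(list(range(1, 101))))  # prints 5050
```

In a real cluster the coordinator would wait on the slowest worker at the gather step, which is why balancing the chunks and matching the communication pattern to the interconnect matter so much.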

Moreover, it is quite difficult to debug and test parallel programs. Special techniques need to be used for testing and debugging such applications.

Distributed supercomputing

Opportunistic approaches

Example architecture of a grid computing system connecting many personal computers over the internet

Opportunistic supercomputing is a form of networked grid computing whereby a "super virtual computer" of many loosely coupled volunteer computing machines performs very large computing tasks. Grid computing has been applied to a number of large-scale embarrassingly parallel problems that require supercomputing-scale performance. However, basic grid and cloud computing approaches that rely on volunteer computing cannot handle traditional supercomputing tasks such as fluid dynamic simulations.

The fastest grid computing system is the volunteer computing project Folding@home (F@h). As of April 2020, F@h reported 2.5 exaFLOPS of x86 processing power. Of this, over 100 PFLOPS are contributed by clients running on various GPUs, and the rest from various CPU systems.

The Berkeley Open Infrastructure for Network Computing (BOINC) platform hosts a number of volunteer computing projects. As of February 2017, BOINC recorded a processing power of over 166 petaFLOPS through over 762 thousand active computers (hosts) on the network.

As of October 2016, the Great Internet Mersenne Prime Search (GIMPS) distributed Mersenne prime search achieved about 0.313 PFLOPS through over 1.3 million computers. The Internet PrimeNet Server has supported GIMPS's grid computing approach, one of the earliest volunteer computing projects, since 1997.

Quasi-opportunistic approaches

Quasi-opportunistic supercomputing is a form of distributed computing whereby the "super virtual computer" of many networked, geographically dispersed computers performs computing tasks that demand huge processing power. Quasi-opportunistic supercomputing aims to provide a higher quality of service than opportunistic grid computing by achieving more control over the assignment of tasks to distributed resources and by using intelligence about the availability and reliability of individual systems within the supercomputing network. However, quasi-opportunistic distributed execution of demanding parallel computing software in grids should be achieved through implementation of grid-wise allocation agreements, co-allocation subsystems, communication topology-aware allocation mechanisms, fault tolerant message passing libraries and data pre-conditioning.

High-performance computing clouds

Cloud computing, with its recent and rapid expansion and development, has grabbed the attention of high-performance computing (HPC) users and developers in recent years. Cloud computing attempts to provide HPC-as-a-service exactly like other forms of services available in the cloud such as software as a service, platform as a service, and infrastructure as a service. HPC users may benefit from the cloud in several ways, such as scalability and resources that are on-demand, fast, and inexpensive. On the other hand, moving HPC applications to the cloud has its own set of challenges. Good examples of such challenges are virtualization overhead in the cloud, multi-tenancy of resources, and network latency issues. Much research is currently being done to overcome these challenges and make HPC in the cloud a more realistic possibility.

In 2016, Penguin Computing, Parallel Works, R-HPC, Amazon Web Services, Univa, Silicon Graphics International, Rescale, Sabalcore, and Gomput started to offer HPC cloud computing. The Penguin On Demand (POD) cloud is a bare-metal compute model to execute code, but each user is given a virtualized login node. POD computing nodes are connected via non-virtualized 10 Gbit/s Ethernet or QDR InfiniBand networks. User connectivity to the POD data center ranges from 50 Mbit/s to 1 Gbit/s. Citing Amazon's EC2 Elastic Compute Cloud, Penguin Computing argues that virtualization of compute nodes is not suitable for HPC. Penguin Computing has also criticized HPC clouds for potentially allocating computing nodes to customers that are far apart, causing latency that impairs performance for some HPC applications.

Performance measurement

Capability versus capacity

Supercomputers generally aim for the maximum in capability computing rather than capacity computing. Capability computing is typically thought of as using the maximum computing power to solve a single large problem in the shortest amount of time. Often a capability system is able to solve a problem of a size or complexity that no other computer can, e.g. a very complex weather simulation application.

Capacity computing, in contrast, is typically thought of as using efficient cost-effective computing power to solve a few somewhat large problems or many small problems. Architectures that lend themselves to supporting many users for routine everyday tasks may have a lot of capacity but are not typically considered supercomputers, given that they do not solve a single very complex problem.

Performance metrics

Top supercomputer speeds: logscale speed over 60 years

In general, the speed of supercomputers is measured and benchmarked in FLOPS (floating-point operations per second), and not in terms of MIPS (million instructions per second), as is the case with general-purpose computers. These measurements are commonly used with an SI prefix such as tera-, combined into the shorthand TFLOPS (10^12 FLOPS, pronounced teraflops), or peta-, combined into the shorthand PFLOPS (10^15 FLOPS, pronounced petaflops). Petascale supercomputers can process one quadrillion (10^15, or 1000 trillion) FLOPS. Exascale is computing performance in the exaFLOPS (EFLOPS) range. An EFLOPS is one quintillion (10^18) FLOPS (one million TFLOPS).
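The prefixes above are plain powers of ten, which a short helper can make concrete. This is a minimal sketch; the function name `format_flops` and the two-decimal formatting are illustrative choices, not a standard convention.

```python
# SI-prefixed FLOPS units, largest first.
UNITS = [
    ("EFLOPS", 10**18),
    ("PFLOPS", 10**15),
    ("TFLOPS", 10**12),
    ("GFLOPS", 10**9),
    ("MFLOPS", 10**6),
]


def format_flops(flops):
    """Express a raw FLOPS figure using the largest prefix that fits."""
    for name, scale in UNITS:
        if flops >= scale:
            return f"{flops / scale:.2f} {name}"
    return f"{flops:.0f} FLOPS"


print(format_flops(10**15))        # prints 1.00 PFLOPS
print(10**18 // 10**12)            # prints 1000000: one EFLOPS in TFLOPS
```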

No single number can reflect the overall performance of a computer system, yet the goal of the Linpack benchmark is to approximate how fast the computer solves numerical problems and it is widely used in the industry. The FLOPS measurement is either quoted based on the theoretical floating point performance of a processor (derived from manufacturer's processor specifications and shown as "Rpeak" in the TOP500 lists), which is generally unachievable when running real workloads, or the achievable throughput, derived from the LINPACK benchmarks and shown as "Rmax" in the TOP500 list.[102] The LINPACK benchmark typically performs LU decomposition of a large matrix. The LINPACK performance gives some indication of performance for some real-world problems, but does not necessarily match the processing requirements of many other supercomputer workloads, which for example may require more memory bandwidth, or may require better integer computing performance, or may need a high performance I/O system to achieve high levels of performance.
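To make the idea of a LINPACK-style measurement concrete, the following sketch solves a dense linear system by Gaussian elimination with partial pivoting (the computation underlying LU decomposition), times it, and divides a leading-order operation count of (2/3)n^3 by the elapsed time. It is an illustration only: the function names, the problem size, and the simplified operation count are assumptions, and pure Python will report a rate many orders of magnitude below what the real benchmark achieves on the same hardware.

```python
import random
import time


def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    a = [row[:] for row in a]   # work on copies, leave inputs untouched
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from the rows below the pivot.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def benchmark(n=60, seed=0):
    """Return (estimated FLOPS, max error) for a random n-by-n system."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    # Build b so that the exact solution is a vector of ones.
    b = [sum(row) for row in a]
    start = time.perf_counter()
    x = lu_solve(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)
    flops = (2.0 / 3.0) * n**3 / elapsed   # leading-order operation count
    max_err = max(abs(xi - 1.0) for xi in x)
    return flops, max_err


rate, err = benchmark()
print(f"about {rate:.3g} FLOPS, max error {err:.2e}")
```

The measured rate plays the role of "Rmax" here; "Rpeak" would instead come from the processor's specifications, and the ratio of the two indicates how much of the theoretical performance a workload actually achieves.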

The TOP500 list

Top 20 supercomputers in the world (June 2014)

Since 1993, the fastest supercomputers have been ranked on the TOP500 list according to their LINPACK benchmark results. The list does not claim to be unbiased or definitive, but it is a widely cited current definition of the "fastest" supercomputer available at any given time.

Applications

The stages of supercomputer application may be summarized in the following table:

Decade   Uses and computer involved
1970s    Weather forecasting, aerodynamic research (Cray-1)
1980s    Probabilistic analysis, radiation shielding modeling (CDC Cyber)
1990s    Brute force code breaking (EFF DES cracker)
2000s    3D nuclear test simulations as a substitute for live testing, in keeping with the Nuclear Non-Proliferation Treaty (ASCI Q)
2010s    Molecular dynamics simulation (Tianhe-1A)
2020s    Scientific research for outbreak prevention, electrochemical reaction research

The IBM Blue Gene/P computer has been used to simulate a number of artificial neurons equivalent to approximately one percent of a human cerebral cortex, containing 1.6 billion neurons with approximately 9 trillion connections. The same research group also succeeded in using a supercomputer to simulate a number of artificial neurons equivalent to the entirety of a rat's brain.

Modern-day weather forecasting also relies on supercomputers. The National Oceanic and Atmospheric Administration uses supercomputers to crunch hundreds of millions of observations to help make weather forecasts more accurate.

In 2011, the challenges and difficulties in pushing the envelope in supercomputing were underscored by IBM's abandonment of the Blue Waters petascale project.

The Advanced Simulation and Computing Program currently uses supercomputers to maintain and simulate the United States nuclear stockpile.

In early 2020, COVID-19 was front and center in the world. Supercomputers ran different simulations to find compounds that could potentially stop the spread. These computers run for tens of hours, using multiple CPUs running in parallel to model different processes.

Taiwania 3 is a Taiwanese supercomputer which assisted the scientific community in fighting COVID-19. It was launched in 2020 and has a capacity of about two to three petaFLOPS.

Development and trends

Distribution of TOP500 supercomputers among different countries, in November 2015

In the 2010s, China, the United States, the European Union, and others competed to be the first to create a 1 exaFLOPS (10^18 or one quintillion FLOPS) supercomputer. Erik P. DeBenedictis of Sandia National Laboratories has theorized that a zettaFLOPS (10^21 or one sextillion FLOPS) computer is required to accomplish full weather modeling, which could cover a two-week time span accurately. Such systems might be built around 2030.

Many Monte Carlo simulations apply the same algorithm to a randomly generated data set. This is particularly true of simulations of the integro-differential equations describing physical transport processes: the random paths, collisions, and energy and momentum depositions of neutrons, photons, ions, electrons, etc. The next step for microprocessors may be into the third dimension; in a chip specialized for Monte Carlo, the many layers could be identical, simplifying the design and manufacture process.
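A toy transport problem shows why such workloads suit identical processing units: every particle history runs the same few steps on different random numbers. The sketch below estimates the fraction of particles that cross a slab without interacting, under strong simplifying assumptions that are not from the source (particles enter perpendicular to the slab, any interaction absorbs the particle, and the cross-section `sigma` is constant). The function name and parameters are hypothetical.

```python
import math
import random


def transmission_fraction(thickness, sigma, n_particles, seed=0):
    """Estimate the fraction of particles crossing a purely absorbing slab."""
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_particles):
        # Distance to the first interaction is exponentially distributed
        # with mean free path 1 / sigma.
        distance = rng.expovariate(sigma)
        if distance > thickness:
            transmitted += 1
    return transmitted / n_particles


estimate = transmission_fraction(thickness=1.0, sigma=1.0, n_particles=100_000)
exact = math.exp(-1.0)  # analytic answer for this simplified case
print(f"estimate {estimate:.4f}, exact {exact:.4f}")
```

Because the histories are independent, the loop can be split across any number of processors and the counts summed at the end.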

The cost of operating high performance supercomputers has risen, mainly due to increasing power consumption. In the mid-1990s a top 10 supercomputer required about 100 kilowatts; in 2010 the top 10 supercomputers required between 1 and 2 megawatts. A 2010 study commissioned by DARPA identified power consumption as the most pervasive challenge in achieving exascale computing. At the time, one megawatt of continuous power consumption cost about 1 million dollars per year. Supercomputing facilities were constructed to efficiently remove the increasing amount of heat produced by modern multi-core central processing units. Based on the energy consumption of the Green 500 list of supercomputers between 2007 and 2011, a supercomputer with 1 exaFLOPS in 2011 would have required nearly 500 megawatts. Operating systems were developed for existing hardware to conserve energy whenever possible. CPU cores not in use during the execution of a parallelized application were put into low-power states, producing energy savings for some supercomputing applications.
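The "nearly 500 megawatts" figure can be roughly reproduced with simple arithmetic. The sketch below assumes the most efficient machine cited earlier in the article (2097 MFLOPS/W on the June 2011 Green 500 list) and the cost of about 1 million dollars per megawatt-year quoted above. Using a single best-case efficiency value is a simplification made here for illustration, not necessarily how the original estimate was derived.

```python
def required_power_watts(target_flops, flops_per_watt):
    """Power needed to sustain target_flops at a given energy efficiency."""
    return target_flops / flops_per_watt


EXAFLOPS = 10**18                # one quintillion FLOPS
BEST_2011_EFFICIENCY = 2097e6    # FLOPS per watt (2097 MFLOPS/W)
COST_PER_MW_YEAR = 1_000_000     # dollars, approximate figure from the text

power_mw = required_power_watts(EXAFLOPS, BEST_2011_EFFICIENCY) / 1e6
annual_cost = power_mw * COST_PER_MW_YEAR

print(f"power: about {power_mw:.0f} MW")
print(f"energy cost: about {annual_cost / 1e6:.0f} million dollars per year")
```

Under these assumptions the result is roughly 477 MW, consistent with the estimate in the text.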

The increasing cost of operating supercomputers has been a driving factor in a trend toward bundling of resources through a distributed supercomputer infrastructure. National supercomputing centers first emerged in the US, followed by Germany and Japan. The European Union launched the Partnership for Advanced Computing in Europe (PRACE) with the aim of creating a persistent pan-European supercomputer infrastructure with services to support scientists across the European Union in porting, scaling and optimizing supercomputing applications. Iceland built the world's first zero-emission supercomputer. Located at the Thor Data Center in Reykjavík, Iceland, this supercomputer relies on completely renewable sources for its power rather than fossil fuels. The colder climate also reduces the need for active cooling, making it one of the greenest facilities in the world of computers.

Funding supercomputer hardware also became increasingly difficult. In the mid-1990s a top 10 supercomputer cost about 10 million euros, while in 2010 the top 10 supercomputers required an investment of between 40 and 50 million euros. In the 2000s national governments put in place different strategies to fund supercomputers. In the UK the national government funded supercomputers entirely and high performance computing was put under the control of a national funding agency. Germany developed a mixed funding model, pooling local state funding and federal funding.

In fiction

Many science fiction writers have depicted supercomputers in their works, both before and after the historical construction of such computers. Much of such fiction deals with the relations of humans with the computers they build and with the possibility of conflict eventually developing between them. Examples of supercomputers in fiction include HAL 9000, Multivac, The Machine Stops, GLaDOS, The Evitable Conflict, Vulcan's Hammer, Colossus, WOPR, and Deep Thought.

Two-state solution

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Two-state_solution A peace movement po...