Search This Blog

Monday, March 29, 2021

Computer multitasking

From Wikipedia, the free encyclopedia
Modern desktop operating systems are capable of handling large numbers of different processes at the same time. This screenshot shows Linux Mint running simultaneously Xfce desktop environment, Firefox, a calculator program, the built-in calendar, Vim, GIMP, and VLC media player.
 
Multitasking capabilities of Microsoft Windows 1.01 released in 1985, here shown running the MS-DOS Executive and Calculator programs

In computing, multitasking is the concurrent execution of multiple tasks (also known as processes) over a certain period of time. New tasks can interrupt already started ones before they finish, instead of waiting for them to end. As a result, a computer executes segments of multiple tasks in an interleaved manner, while the tasks share common processing resources such as central processing units (CPUs) and main memory. Multitasking automatically interrupts the running program, saving its state (partial results, memory contents and computer register contents) and loading the saved state of another program and transferring control to it. This "context switch" may be initiated at fixed time intervals (pre-emptive multitasking), or the running program may be coded to signal to the supervisory software when it can be interrupted (cooperative multitasking).

Multitasking does not require parallel execution of multiple tasks at exactly the same time; instead, it allows more than one task to advance over a given period of time. Even on multiprocessor computers, multitasking allows many more tasks to be run than there are CPUs.

Multitasking is a common feature of computer operating systems. It allows more efficient use of the computer hardware; where a program is waiting for some external event such as a user input or an input/output transfer with a peripheral to complete, the central processor can still be used with another program. In a time-sharing system, multiple human operators use the same processor as if it was dedicated to their use, while behind the scenes the computer is serving many users by multitasking their individual programs. In multiprogramming systems, a task runs until it must wait for an external event or until the operating system's scheduler forcibly swaps the running task out of the CPU. Real-time systems such as those designed to control industrial robots, require timely processing; a single processor might be shared between calculations of machine movement, communications, and user interface.

Often multitasking operating systems include measures to change the priority of individual tasks, so that important jobs receive more processor time than those considered less significant. Depending on the operating system, a task might be as large as an entire application program, or might be made up of smaller threads that carry out portions of the overall program.

A processor intended for use with multitasking operating systems may include special hardware to securely support multiple tasks, such as memory protection, and protection rings that ensure the supervisory software cannot be damaged or subverted by user-mode program errors.

The term "multitasking" has become an international term, as the same word is used in many other languages such as German, Italian, Dutch, Danish and Norwegian.

Multiprogramming

In the early days of computing, CPU time was expensive, and peripherals were very slow. When the computer ran a program that needed access to a peripheral, the central processing unit (CPU) would have to stop executing program instructions while the peripheral processed the data. This was usually very inefficient.

The first computer using a multiprogramming system was the British Leo III owned by J. Lyons and Co. During batch processing, several different programs were loaded in the computer memory, and the first one began to run. When the first program reached an instruction waiting for a peripheral, the context of this program was stored away, and the second program in memory was given a chance to run. The process continued until all programs finished running.

The use of multiprogramming was enhanced by the arrival of virtual memory and virtual machine technology, which enabled individual programs to make use of memory and operating system resources as if other concurrently running programs were, for all practical purposes, nonexistent.

Multiprogramming gives no guarantee that a program will run in a timely manner. Indeed, the first program may very well run for hours without needing access to a peripheral. As there were no users waiting at an interactive terminal, this was no problem: users handed in a deck of punched cards to an operator, and came back a few hours later for printed results. Multiprogramming greatly reduced wait times when multiple batches were being processed.

Cooperative multitasking

Early multitasking systems used applications that voluntarily ceded time to one another. This approach, which was eventually supported by many computer operating systems, is known today as cooperative multitasking. Although it is now rarely used in larger systems except for specific applications such as CICS or the JES2 subsystem, cooperative multitasking was once the only scheduling scheme employed by Microsoft Windows and Classic Mac OS to enable multiple applications to run simultaneously. Cooperative multitasking is still used today on RISC OS systems.

As a cooperatively multitasked system relies on each process regularly giving up time to other processes on the system, one poorly designed program can consume all of the CPU time for itself, either by performing extensive calculations or by busy waiting; both would cause the whole system to hang. In a server environment, this is a hazard that makes the entire environment unacceptably fragile.

Preemptive multitasking

Preemptive multitasking allows the computer system to more reliably guarantee to each process a regular "slice" of operating time. It also allows the system to deal rapidly with important external events like incoming data, which might require the immediate attention of one or another process. Operating systems were developed to take advantage of these hardware capabilities and run multiple processes preemptively. Preemptive multitasking was implemented in the PDP-6 Monitor and MULTICS in 1964, in OS/360 MFT in 1967, and in Unix in 1969, and was available in some operating systems for computers as small as DEC's PDP-8; it is a core feature of all Unix-like operating systems, such as Linux, Solaris and BSD with its derivatives, as well as modern versions of Windows.

At any specific time, processes can be grouped into two categories: those that are waiting for input or output (called "I/O bound"), and those that are fully utilizing the CPU ("CPU bound"). In primitive systems, the software would often "poll", or "busywait" while waiting for requested input (such as disk, keyboard or network input). During this time, the system was not performing useful work. With the advent of interrupts and preemptive multitasking, I/O bound processes could be "blocked", or put on hold, pending the arrival of the necessary data, allowing other processes to utilize the CPU. As the arrival of the requested data would generate an interrupt, blocked processes could be guaranteed a timely return to execution.

The earliest preemptive multitasking OS available to home users was Sinclair QDOS on the Sinclair QL, released in 1984, but very few people bought the machine. Commodore's Amiga, released the following year, was the first commercially successful home computer to use the technology, and its multimedia abilities make it a clear ancestor of contemporary multitasking personal computers. Microsoft made preemptive multitasking a core feature of their flagship operating system in the early 1990s when developing Windows NT 3.1 and then Windows 95. It was later adopted on the Apple Macintosh by Mac OS X that, as a Unix-like operating system, uses preemptive multitasking for all native applications.

A similar model is used in Windows 9x and the Windows NT family, where native 32-bit applications are multitasked preemptively. 64-bit editions of Windows, both for the x86-64 and Itanium architectures, no longer support legacy 16-bit applications, and thus provide preemptive multitasking for all supported applications.

Real time

Another reason for multitasking was in the design of real-time computing systems, where there are a number of possibly unrelated external activities needed to be controlled by a single processor system. In such systems a hierarchical interrupt system is coupled with process prioritization to ensure that key activities were given a greater share of available process time.

Multithreading

As multitasking greatly improved the throughput of computers, programmers started to implement applications as sets of cooperating processes (e. g., one process gathering input data, one process processing input data, one process writing out results on disk). This, however, required some tools to allow processes to efficiently exchange data.

Threads were born from the idea that the most efficient way for cooperating processes to exchange data would be to share their entire memory space. Thus, threads are effectively processes that run in the same memory context and share other resources with their parent processes, such as open files. Threads are described as lightweight processes because switching between threads does not involve changing the memory context.

While threads are scheduled preemptively, some operating systems provide a variant to threads, named fibers, that are scheduled cooperatively. On operating systems that do not provide fibers, an application may implement its own fibers using repeated calls to worker functions. Fibers are even more lightweight than threads, and somewhat easier to program with, although they tend to lose some or all of the benefits of threads on machines with multiple processors.

Some systems directly support multithreading in hardware.

Memory protection

Essential to any multitasking system is to safely and effectively share access to system resources. Access to memory must be strictly managed to ensure that no process can inadvertently or deliberately read or write to memory locations outside the process's address space. This is done for the purpose of general system stability and data integrity, as well as data security.

In general, memory access management is a responsibility of the operating system kernel, in combination with hardware mechanisms that provide supporting functionalities, such as a memory management unit (MMU). If a process attempts to access a memory location outside its memory space, the MMU denies the request and signals the kernel to take appropriate actions; this usually results in forcibly terminating the offending process. Depending on the software and kernel design and the specific error in question, the user may receive an access violation error message such as "segmentation fault".

In a well designed and correctly implemented multitasking system, a given process can never directly access memory that belongs to another process. An exception to this rule is in the case of shared memory; for example, in the System V inter-process communication mechanism the kernel allocates memory to be mutually shared by multiple processes. Such features are often used by database management software such as PostgreSQL.

Inadequate memory protection mechanisms, either due to flaws in their design or poor implementations, allow for security vulnerabilities that may be potentially exploited by malicious software.

Memory swapping

Use of a swap file or swap partition is a way for the operating system to provide more memory than is physically available by keeping portions of the primary memory in secondary storage. While multitasking and memory swapping are two completely unrelated techniques, they are very often used together, as swapping memory allows more tasks to be loaded at the same time. Typically, a multitasking system allows another process to run when the running process hits a point where it has to wait for some portion of memory to be reloaded from secondary storage.

Programming

Processes that are entirely independent are not much trouble to program in a multitasking environment. Most of the complexity in multitasking systems comes from the need to share computer resources between tasks and to synchronize the operation of co-operating tasks.

Various concurrent computing techniques are used to avoid potential problems caused by multiple tasks attempting to access the same resource.

Bigger systems were sometimes built with a central processor(s) and some number of I/O processors, a kind of asymmetric multiprocessing.

Over the years, multitasking systems have been refined. Modern operating systems generally include detailed mechanisms for prioritizing processes, while symmetric multiprocessing has introduced new complexities and capabilities.

Church–Turing thesis

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis

In computability theory, the Church–Turing thesis (also known as computability thesis, the Turing–Church thesis, the Church–Turing conjecture, Church's thesis, Church's conjecture, and Turing's thesis) is a hypothesis about the nature of computable functions. It states that a function on the natural numbers can be calculated by an effective method if and only if it is computable by a Turing machine. The thesis is named after American mathematician Alonzo Church and the British mathematician Alan Turing. Before the precise definition of computable function, mathematicians often used the informal term effectively calculable to describe functions that are computable by paper-and-pencil methods. In the 1930s, several independent attempts were made to formalize the notion of computability:

  • In 1933, Kurt Gödel, with Jacques Herbrand, created a formal definition of a class called general recursive functions. The class of general recursive functions is the smallest class of functions (possibly with more than one argument) that includes all constant functions, projections, the successor function, and which is closed under function composition, recursion, and minimization.
  • In 1936, Alonzo Church created a method for defining functions called the λ-calculus. Within λ-calculus, he defined an encoding of the natural numbers called the Church numerals. A function on the natural numbers is called λ-computable if the corresponding function on the Church numerals can be represented by a term of the λ-calculus.
  • Also in 1936, before learning of Church's work, Alan Turing created a theoretical model for machines, now called Turing machines, that could carry out calculations from inputs by manipulating symbols on a tape. Given a suitable encoding of the natural numbers as sequences of symbols, a function on the natural numbers is called Turing computable if some Turing machine computes the corresponding function on encoded natural numbers.

Church and Turing proved that these three formally defined classes of computable functions coincide: a function is λ-computable if and only if it is Turing computable, and if and only if it is general recursive. This has led mathematicians and computer scientists to believe that the concept of computability is accurately characterized by these three equivalent processes. Other formal attempts to characterize computability have subsequently strengthened this belief (see below).

On the other hand, the Church–Turing thesis states that the above three formally-defined classes of computable functions coincide with the informal notion of an effectively calculable function. Since, as an informal notion, the concept of effective calculability does not have a formal definition, the thesis, although it has near-universal acceptance, cannot be formally proven.

Since its inception, variations on the original thesis have arisen, including statements about what can physically be realized by a computer in our universe (physical Church-Turing thesis) and what can be efficiently computed (Church–Turing thesis (complexity theory)). These variations are not due to Church or Turing, but arise from later work in complexity theory and digital physics. The thesis also has implications for the philosophy of mind (see below).

Statement in Church's and Turing's words

J. B. Rosser (1939) addresses the notion of "effective computability" as follows: "Clearly the existence of CC and RC (Church's and Rosser's proofs) presupposes a precise definition of 'effective'. 'Effective method' is here used in the rather special sense of a method each step of which is precisely predetermined and which is certain to produce the answer in a finite number of steps". Thus the adverb-adjective "effective" is used in a sense of "1a: producing a decided, decisive, or desired effect", and "capable of producing a result".

In the following, the words "effectively calculable" will mean "produced by any intuitively 'effective' means whatsoever" and "effectively computable" will mean "produced by a Turing-machine or equivalent mechanical device". Turing's "definitions" given in a footnote in his 1938 Ph.D. thesis Systems of Logic Based on Ordinals, supervised by Church, are virtually the same:

We shall use the expression "computable function" to mean a function calculable by a machine, and let "effectively calculable" refer to the intuitive idea without particular identification with any one of these definitions.

The thesis can be stated as: Every effectively calculable function is a computable function. Church also stated that "No computational procedure will be considered as an algorithm unless it can be represented as a Turing Machine". Turing stated it this way:

It was stated ... that "a function is effectively calculable if its values can be found by some purely mechanical process". We may take this literally, understanding that by a purely mechanical process one which could be carried out by a machine. The development ... leads to ... an identification of computability with effective calculability. [ is the footnote quoted above.]

History

One of the important problems for logicians in the 1930s was the Entscheidungsproblem of David Hilbert and Wilhelm Ackermann, which asked whether there was a mechanical procedure for separating mathematical truths from mathematical falsehoods. This quest required that the notion of "algorithm" or "effective calculability" be pinned down, at least well enough for the quest to begin. But from the very outset Alonzo Church's attempts began with a debate that continues to this day. Was the notion of "effective calculability" to be (i) an "axiom or axioms" in an axiomatic system, (ii) merely a definition that "identified" two or more propositions, (iii) an empirical hypothesis to be verified by observation of natural events, or (iv) just a proposal for the sake of argument (i.e. a "thesis").

Circa 1930–1952

In the course of studying the problem, Church and his student Stephen Kleene introduced the notion of λ-definable functions, and they were able to prove that several large classes of functions frequently encountered in number theory were λ-definable. The debate began when Church proposed to Gödel that one should define the "effectively computable" functions as the λ-definable functions. Gödel, however, was not convinced and called the proposal "thoroughly unsatisfactory". Rather, in correspondence with Church (c. 1934–35), Gödel proposed axiomatizing the notion of "effective calculability"; indeed, in a 1935 letter to Kleene, Church reported that:

His [Gödel's] only idea at the time was that it might be possible, in terms of effective calculability as an undefined notion, to state a set of axioms which would embody the generally accepted properties of this notion, and to do something on that basis.

But Gödel offered no further guidance. Eventually, he would suggest his recursion, modified by Herbrand's suggestion, that Gödel had detailed in his 1934 lectures in Princeton NJ (Kleene and Rosser transcribed the notes). But he did not think that the two ideas could be satisfactorily identified "except heuristically".

Next, it was necessary to identify and prove the equivalence of two notions of effective calculability. Equipped with the λ-calculus and "general" recursion, Stephen Kleene with help of Church and J. Barkley Rosser produced proofs (1933, 1935) to show that the two calculi are equivalent. Church subsequently modified his methods to include use of Herbrand–Gödel recursion and then proved (1936) that the Entscheidungsproblem is unsolvable: there is no algorithm that can determine whether a well formed formula has a "normal form".

Many years later in a letter to Davis (c. 1965), Gödel said that "he was, at the time of these [1934] lectures, not at all convinced that his concept of recursion comprised all possible recursions". By 1963–64 Gödel would disavow Herbrand–Gödel recursion and the λ-calculus in favor of the Turing machine as the definition of "algorithm" or "mechanical procedure" or "formal system".

A hypothesis leading to a natural law?: In late 1936 Alan Turing's paper (also proving that the Entscheidungsproblem is unsolvable) was delivered orally, but had not yet appeared in print. On the other hand, Emil Post's 1936 paper had appeared and was certified independent of Turing's work. Post strongly disagreed with Church's "identification" of effective computability with the λ-calculus and recursion, stating:

Actually the work already done by Church and others carries this identification considerably beyond the working hypothesis stage. But to mask this identification under a definition… blinds us to the need of its continual verification.

Rather, he regarded the notion of "effective calculability" as merely a "working hypothesis" that might lead by inductive reasoning to a "natural law" rather than by "a definition or an axiom". This idea was "sharply" criticized by Church.

Thus Post in his 1936 paper was also discounting Kurt Gödel's suggestion to Church in 1934–35 that the thesis might be expressed as an axiom or set of axioms.

Turing adds another definition, Rosser equates all three: Within just a short time, Turing's 1936–37 paper "On Computable Numbers, with an Application to the Entscheidungsproblem" appeared. In it he stated another notion of "effective computability" with the introduction of his a-machines (now known as the Turing machine abstract computational model). And in a proof-sketch added as an "Appendix" to his 1936–37 paper, Turing showed that the classes of functions defined by λ-calculus and Turing machines coincided. Church was quick to recognise how compelling Turing's analysis was. In his review of Turing's paper he made clear that Turing's notion made "the identification with effectiveness in the ordinary (not explicitly defined) sense evident immediately".

In a few years (1939) Turing would propose, like Church and Kleene before him, that his formal definition of mechanical computing agent was the correct one. Thus, by 1939, both Church (1934) and Turing (1939) had individually proposed that their "formal systems" should be definitions of "effective calculability"; neither framed their statements as theses.

Rosser (1939) formally identified the three notions-as-definitions:

All three definitions are equivalent, so it does not matter which one is used.

Kleene proposes Church's Thesis: This left the overt expression of a "thesis" to Kleene. In his 1943 paper Recursive Predicates and Quantifiers Kleene proposed his "THESIS I":

This heuristic fact [general recursive functions are effectively calculable] ... led Church to state the following thesis(22). The same thesis is implicit in Turing's description of computing machines(23).

THESIS I. Every effectively calculable function (effectively decidable predicate) is general recursive [Kleene's italics]

Since a precise mathematical definition of the term effectively calculable (effectively decidable) has been wanting, we can take this thesis ... as a definition of it ...

(22) references Church 1936; (23) references Turing 1936–7 Kleene goes on to note that:

the thesis has the character of an hypothesis—a point emphasized by Post and by Church(24). If we consider the thesis and its converse as definition, then the hypothesis is an hypothesis about the application of the mathematical theory developed from the definition. For the acceptance of the hypothesis, there are, as we have suggested, quite compelling grounds.

(24) references Post 1936 of Post and Church's Formal definitions in the theory of ordinal numbers, Fund. Math. vol 28 (1936) pp. 11–21.

Kleene's Church–Turing Thesis: In the years 1939-1952, Stephen Kleene would go on to overtly name, after switching from presenting his work in the mathematical terminology of the lambda calculus to his own theory of partial recursive functions or Gödel-Kleene recursiveness, both Church's thesis and Turing's thesis. Finally coining the term the "Church-Turing thesis" in a section from his graduate textbook on logic. In which he was helping to give clarifications to concepts, demanded in a critique from William Boone on Alan Turing's paper "The Word Problem in Semi-Groups with Cancellation", defend, and express the two "theses" and then "identify" them (show equivalence) by use of his Theorem XXX:

Heuristic evidence and other considerations led Church 1936 to propose the following thesis.

Thesis I. Every effectively calculable function (effectively decidable predicate) is general recursive.

Theorem XXX: The following classes of partial functions are coextensive, i.e. have the same members: (a) the partial recursive functions, (b) the computable functions ...

Turing's thesis: Turing's thesis that every function which would naturally be regarded as computable is computable under his definition, i.e. by one of his machines, is equivalent to Church's thesis by Theorem XXX.

Later developments

An attempt to understand the notion of "effective computability" better led Robin Gandy (Turing's student and friend) in 1980 to analyze machine computation (as opposed to human-computation acted out by a Turing machine). Gandy's curiosity about, and analysis of, cellular automata (including Conway's game of life), parallelism, and crystalline automata, led him to propose four "principles (or constraints) ... which it is argued, any machine must satisfy". His most-important fourth, "the principle of causality" is based on the "finite velocity of propagation of effects and signals; contemporary physics rejects the possibility of instantaneous action at a distance". From these principles and some additional constraints—(1a) a lower bound on the linear dimensions of any of the parts, (1b) an upper bound on speed of propagation (the velocity of light), (2) discrete progress of the machine, and (3) deterministic behavior—he produces a theorem that "What can be calculated by a device satisfying principles I–IV is computable."

In the late 1990s Wilfried Sieg analyzed Turing's and Gandy's notions of "effective calculability" with the intent of "sharpening the informal notion, formulating its general features axiomatically, and investigating the axiomatic framework". In his 1997 and 2002 work Sieg presents a series of constraints on the behavior of a computor—"a human computing agent who proceeds mechanically". These constraints reduce to:

  • "(B.1) (Boundedness) There is a fixed bound on the number of symbolic configurations a computor can immediately recognize.
  • "(B.2) (Boundedness) There is a fixed bound on the number of internal states a computor can be in.
  • "(L.1) (Locality) A computor can change only elements of an observed symbolic configuration.
  • "(L.2) (Locality) A computor can shift attention from one symbolic configuration to another one, but the new observed configurations must be within a bounded distance of the immediately previously observed configuration.
  • "(D) (Determinacy) The immediately recognizable (sub-)configuration determines uniquely the next computation step (and id [instantaneous description])"; stated another way: "A computor's internal state together with the observed configuration fixes uniquely the next computation step and the next internal state."

The matter remains in active discussion within the academic community.

The thesis as a definition

The thesis can be viewed as nothing but an ordinary mathematical definition. Comments by Gödel on the subject suggest this view, e.g. "the correct definition of mechanical computability was established beyond any doubt by Turing". The case for viewing the thesis as nothing more than a definition is made explicitly by Robert I. Soare, where it is also argued that Turing's definition of computability is no less likely to be correct than the epsilon-delta definition of a continuous function.

Success of the thesis

Other formalisms (besides recursion, the λ-calculus, and the Turing machine) have been proposed for describing effective calculability/computability. Stephen Kleene (1952) adds to the list the functions "reckonable in the system S1" of Kurt Gödel 1936, and Emil Post's (1943, 1946) "canonical [also called normal] systems". In the 1950s Hao Wang and Martin Davis greatly simplified the one-tape Turing-machine model (see Post–Turing machine). Marvin Minsky expanded the model to two or more tapes and greatly simplified the tapes into "up-down counters", which Melzak and Lambek further evolved into what is now known as the counter machine model. In the late 1960s and early 1970s researchers expanded the counter machine model into the register machine, a close cousin to the modern notion of the computer. Other models include combinatory logic and Markov algorithms. Gurevich adds the pointer machine model of Kolmogorov and Uspensky (1953, 1958): "... they just wanted to ... convince themselves that there is no way to extend the notion of computable function."

All these contributions involve proofs that the models are computationally equivalent to the Turing machine; such models are said to be Turing complete. Because all these different attempts at formalizing the concept of "effective calculability/computability" have yielded equivalent results, it is now generally assumed that the Church–Turing thesis is correct. In fact, Gödel (1936) proposed something stronger than this; he observed that there was something "absolute" about the concept of "reckonable in S1":

It may also be shown that a function which is computable ['reckonable'] in one of the systems Si, or even in a system of transfinite type, is already computable [reckonable] in S1. Thus the concept 'computable' ['reckonable'] is in a certain definite sense 'absolute', while practically all other familiar metamathematical concepts (e.g. provable, definable, etc.) depend quite essentially on the system to which they are defined ...

Informal usage in proofs

Proofs in computability theory often invoke the Church–Turing thesis in an informal way to establish the computability of functions while avoiding the (often very long) details which would be involved in a rigorous, formal proof. To establish that a function is computable by Turing machine, it is usually considered sufficient to give an informal English description of how the function can be effectively computed, and then conclude "by the Church–Turing thesis" that the function is Turing computable (equivalently, partial recursive).

Dirk van Dalen gives the following example for the sake of illustrating this informal use of the Church–Turing thesis:

EXAMPLE: Each infinite RE set contains an infinite recursive set.

Proof: Let A be infinite RE. We list the elements of A effectively, n0, n1, n2, n3, ...

From this list we extract an increasing sublist: put m0=n0, after finitely many steps we find an nk such that nk > m0, put m1=nk. We repeat this procedure to find m2 > m1, etc. this yields an effective listing of the subset B={m0,m1,m2,...} of A, with the property mi < mi+1.

Claim. B is decidable. For, in order to test k in B we must check if k=mi for some i. Since the sequence of mi's is increasing we have to produce at most k+1 elements of the list and compare them with k. If none of them is equal to k, then k not in B. Since this test is effective, B is decidable and, by Church's thesis, recursive.

In order to make the above example completely rigorous, one would have to carefully construct a Turing machine, or λ-function, or carefully invoke recursion axioms, or at best, cleverly invoke various theorems of computability theory. But because the computability theorist believes that Turing computability correctly captures what can be computed effectively, and because an effective procedure is spelled out in English for deciding the set B, the computability theorist accepts this as proof that the set is indeed recursive.

Variations

The success of the Church–Turing thesis prompted variations of the thesis to be proposed. For example, the physical Church–Turing thesis states: "All physically computable functions are Turing-computable."

The Church–Turing thesis says nothing about the efficiency with which one model of computation can simulate another. It has been proved for instance that a (multi-tape) universal Turing machine only suffers a logarithmic slowdown factor in simulating any Turing machine.

A variation of the Church–Turing thesis addresses whether an arbitrary but "reasonable" model of computation can be efficiently simulated. This is called the feasibility thesis, also known as the (classical) complexity-theoretic Church–Turing thesis or the extended Church–Turing thesis, which is not due to Church or Turing, but rather was realized gradually in the development of complexity theory. It states: "A probabilistic Turing machine can efficiently simulate any realistic model of computation." The word 'efficiently' here means up to polynomial-time reductions. This thesis was originally called computational complexity-theoretic Church–Turing thesis by Ethan Bernstein and Umesh Vazirani (1997). The complexity-theoretic Church–Turing thesis, then, posits that all 'reasonable' models of computation yield the same class of problems that can be computed in polynomial time. Assuming the conjecture that probabilistic polynomial time (BPP) equals deterministic polynomial time (P), the word 'probabilistic' is optional in the complexity-theoretic Church–Turing thesis. A similar thesis, called the invariance thesis, was introduced by Cees F. Slot and Peter van Emde Boas. It states: "'Reasonable' machines can simulate each other within a polynomially bounded overhead in time and a constant-factor overhead in space." The thesis originally appeared in a paper at STOC'84, which was the first paper to show that polynomial-time overhead and constant-space overhead could be simultaneously achieved for a simulation of a Random Access Machine on a Turing machine.

If BQP is shown to be a strict superset of BPP, it would invalidate the complexity-theoretic Church–Turing thesis. In other words, there would be efficient quantum algorithms that perform tasks that do not have efficient probabilistic algorithms. This would not however invalidate the original Church–Turing thesis, since a quantum computer can always be simulated by a Turing machine, but it would invalidate the classical complexity-theoretic Church–Turing thesis for efficiency reasons. Consequently, the quantum complexity-theoretic Church–Turing thesis states: "A quantum Turing machine can efficiently simulate any realistic model of computation."

Eugene Eberbach and Peter Wegner claim that the Church–Turing thesis is sometimes interpreted too broadly, stating "the broader assertion that algorithms precisely capture what can be computed is invalid". They claim that forms of computation not captured by the thesis are relevant today, terms which they call super-Turing computation.

Philosophical implications

Philosophers have interpreted the Church–Turing thesis as having implications for the philosophy of mind. B. Jack Copeland states that it is an open empirical question whether there are actual deterministic physical processes that, in the long run, elude simulation by a Turing machine; furthermore, he states that it is an open empirical question whether any such processes are involved in the working of the human brain. There are also some important open questions which cover the relationship between the Church–Turing thesis and physics, and the possibility of hypercomputation. When applied to physics, the thesis has several possible meanings:

  1. The universe is equivalent to a Turing machine; thus, computing non-recursive functions is physically impossible. This has been termed the strong Church–Turing thesis, or Church–Turing–Deutsch principle, and is a foundation of digital physics.
  2. The universe is not equivalent to a Turing machine (i.e., the laws of physics are not Turing-computable), but incomputable physical events are not "harnessable" for the construction of a hypercomputer. For example, a universe in which physics involves random real numbers, as opposed to computable reals, would fall into this category.
  3. The universe is a hypercomputer, and it is possible to build physical devices to harness this property and calculate non-recursive functions. For example, it is an open question whether all quantum mechanical events are Turing-computable, although it is known that rigorous models such as quantum Turing machines are equivalent to deterministic Turing machines. (They are not necessarily efficiently equivalent; see above.) John Lucas and Roger Penrose have suggested that the human mind might be the result of some kind of quantum-mechanically enhanced, "non-algorithmic" computation.

There are many other technical possibilities which fall outside or between these three categories, but these serve to illustrate the range of the concept.

Philosophical aspects of the thesis, regarding both physical and biological computers, are also discussed in Odifreddi's 1989 textbook on recursion theory.

Non-computable functions

One can formally define functions that are not computable. A well-known example of such a function is the Busy Beaver function. This function takes an input n and returns the largest number of symbols that a Turing machine with n states can print before halting, when run with no input. Finding an upper bound on the busy beaver function is equivalent to solving the halting problem, a problem known to be unsolvable by Turing machines. Since the busy beaver function cannot be computed by Turing machines, the Church–Turing thesis states that this function cannot be effectively computed by any method.

Several computational models allow for the computation of (Church-Turing) non-computable functions. These are known as hypercomputers. Mark Burgin argues that super-recursive algorithms such as inductive Turing machines disprove the Church–Turing thesis. His argument relies on a definition of algorithm broader than the ordinary one, so that non-computable functions obtained from some inductive Turing machines are called computable. This interpretation of the Church–Turing thesis differs from the interpretation commonly accepted in computability theory, discussed above. The argument that super-recursive algorithms are indeed algorithms in the sense of the Church–Turing thesis has not found broad acceptance within the computability research community.

Computational complexity theory

From Wikipedia, the free encyclopedia

Computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.

A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used. The theory formalizes this intuition, by introducing mathematical models of computation to study these problems and quantifying their computational complexity, i.e., the amount of resources needed to solve them, such as time and storage. Other measures of complexity are also used, such as the amount of communication (used in communication complexity), the number of gates in a circuit (used in circuit complexity) and the number of processors (used in parallel computing). One of the roles of computational complexity theory is to determine the practical limits on what computers can and cannot do. The P versus NP problem, one of the seven Millennium Prize Problems, is dedicated to the field of computational complexity.

Closely related fields in theoretical computer science are analysis of algorithms and computability theory. A key distinction between analysis of algorithms and computational complexity theory is that the former is devoted to analyzing the amount of resources needed by a particular algorithm to solve a problem, whereas the latter asks a more general question about all possible algorithms that could be used to solve the same problem. More precisely, computational complexity theory tries to classify problems that can or cannot be solved with appropriately restricted resources. In turn, imposing restrictions on the available resources is what distinguishes computational complexity from computability theory: the latter theory asks what kinds of problems can, in principle, be solved algorithmically.

Computational problems

A traveling salesman tour through 14 German cities.

Problem instances

A computational problem can be viewed as an infinite collection of instances together with a solution for every instance. The input string for a computational problem is referred to as a problem instance, and should not be confused with the problem itself. In computational complexity theory, a problem refers to the abstract question to be solved. In contrast, an instance of this problem is a rather concrete utterance, which can serve as the input for a decision problem. For example, consider the problem of primality testing. The instance is a number (e.g., 15) and the solution is "yes" if the number is prime and "no" otherwise (in this case, 15 is not prime and the answer is "no"). Stated another way, the instance is a particular input to the problem, and the solution is the output corresponding to the given input.

To further highlight the difference between a problem and an instance, consider the following instance of the decision version of the traveling salesman problem: Is there a route of at most 2000 kilometres passing through all of Germany's 15 largest cities? The quantitative answer to this particular problem instance is of little use for solving other instances of the problem, such as asking for a round trip through all sites in Milan whose total length is at most 10 km. For this reason, complexity theory addresses computational problems and not particular problem instances.

Representing problem instances

When considering computational problems, a problem instance is a string over an alphabet. Usually, the alphabet is taken to be the binary alphabet (i.e., the set {0,1}), and thus the strings are bitstrings. As in a real-world computer, mathematical objects other than bitstrings must be suitably encoded. For example, integers can be represented in binary notation, and graphs can be encoded directly via their adjacency matrices, or by encoding their adjacency lists in binary.

Even though some proofs of complexity-theoretic theorems regularly assume some concrete choice of input encoding, one tries to keep the discussion abstract enough to be independent of the choice of encoding. This can be achieved by ensuring that different representations can be transformed into each other efficiently.

Decision problems as formal languages

A decision problem has only two possible outputs, yes or no (or alternately 1 or 0) on any input.

Decision problems are one of the central objects of study in computational complexity theory. A decision problem is a special type of computational problem whose answer is either yes or no, or alternately either 1 or 0. A decision problem can be viewed as a formal language, where the members of the language are instances whose output is yes, and the non-members are those instances whose output is no. The objective is to decide, with the aid of an algorithm, whether a given input string is a member of the formal language under consideration. If the algorithm deciding this problem returns the answer yes, the algorithm is said to accept the input string, otherwise it is said to reject the input.

An example of a decision problem is the following. The input is an arbitrary graph. The problem consists in deciding whether the given graph is connected or not. The formal language associated with this decision problem is then the set of all connected graphs — to obtain a precise definition of this language, one has to decide how graphs are encoded as binary strings.

Function problems

A function problem is a computational problem where a single output (of a total function) is expected for every input, but the output is more complex than that of a decision problem—that is, the output isn't just yes or no. Notable examples include the traveling salesman problem and the integer factorization problem.

It is tempting to think that the notion of function problems is much richer than the notion of decision problems. However, this is not really the case, since function problems can be recast as decision problems. For example, the multiplication of two integers can be expressed as the set of triples (abc) such that the relation a × b = c holds. Deciding whether a given triple is a member of this set corresponds to solving the problem of multiplying two numbers.

Measuring the size of an instance

To measure the difficulty of solving a computational problem, one may wish to see how much time the best algorithm requires to solve the problem. However, the running time may, in general, depend on the instance. In particular, larger instances will require more time to solve. Thus the time required to solve a problem (or the space required, or any measure of complexity) is calculated as a function of the size of the instance. This is usually taken to be the size of the input in bits. Complexity theory is interested in how algorithms scale with an increase in the input size. For instance, in the problem of finding whether a graph is connected, how much more time does it take to solve a problem for a graph with 2n vertices compared to the time taken for a graph with n vertices?

If the input size is n, the time taken can be expressed as a function of n. Since the time taken on different inputs of the same size can be different, the worst-case time complexity T(n) is defined to be the maximum time taken over all inputs of size n. If T(n) is a polynomial in n, then the algorithm is said to be a polynomial time algorithm. Cobham's thesis argues that a problem can be solved with a feasible amount of resources if it admits a polynomial time algorithm.

Machine models and complexity measures

Turing machine

An illustration of a Turing machine

A Turing machine is a mathematical model of a general computing machine. It is a theoretical device that manipulates symbols contained on a strip of tape. Turing machines are not intended as a practical computing technology, but rather as a general model of a computing machine—anything from an advanced supercomputer to a mathematician with a pencil and paper. It is believed that if a problem can be solved by an algorithm, there exists a Turing machine that solves the problem. Indeed, this is the statement of the Church–Turing thesis. Furthermore, it is known that everything that can be computed on other models of computation known to us today, such as a RAM machine, Conway's Game of Life, cellular automata or any programming language can be computed on a Turing machine. Since Turing machines are easy to analyze mathematically, and are believed to be as powerful as any other model of computation, the Turing machine is the most commonly used model in complexity theory.

Many types of Turing machines are used to define complexity classes, such as deterministic Turing machines, probabilistic Turing machines, non-deterministic Turing machines, quantum Turing machines, symmetric Turing machines and alternating Turing machines. They are all equally powerful in principle, but when resources (such as time or space) are bounded, some of these may be more powerful than others.

A deterministic Turing machine is the most basic Turing machine, which uses a fixed set of rules to determine its future actions. A probabilistic Turing machine is a deterministic Turing machine with an extra supply of random bits. The ability to make probabilistic decisions often helps algorithms solve problems more efficiently. Algorithms that use random bits are called randomized algorithms. A non-deterministic Turing machine is a deterministic Turing machine with an added feature of non-determinism, which allows a Turing machine to have multiple possible future actions from a given state. One way to view non-determinism is that the Turing machine branches into many possible computational paths at each step, and if it solves the problem in any of these branches, it is said to have solved the problem. Clearly, this model is not meant to be a physically realizable model, it is just a theoretically interesting abstract machine that gives rise to particularly interesting complexity classes. For examples, see non-deterministic algorithm.

Other machine models

Many machine models different from the standard multi-tape Turing machines have been proposed in the literature, for example random access machines. Perhaps surprisingly, each of these models can be converted to another without providing any extra computational power. The time and memory consumption of these alternate models may vary. What all these models have in common is that the machines operate deterministically.

However, some computational problems are easier to analyze in terms of more unusual resources. For example, a non-deterministic Turing machine is a computational model that is allowed to branch out to check many different possibilities at once. The non-deterministic Turing machine has very little to do with how we physically want to compute algorithms, but its branching exactly captures many of the mathematical models we want to analyze, so that non-deterministic time is a very important resource in analyzing computational problems.

Complexity measures

For a precise definition of what it means to solve a problem using a given amount of time and space, a computational model such as the deterministic Turing machine is used. The time required by a deterministic Turing machine M on input x is the total number of state transitions, or steps, the machine makes before it halts and outputs the answer ("yes" or "no"). A Turing machine M is said to operate within time f(n) if the time required by M on each input of length n is at most f(n). A decision problem A can be solved in time f(n) if there exists a Turing machine operating in time f(n) that solves the problem. Since complexity theory is interested in classifying problems based on their difficulty, one defines sets of problems based on some criteria. For instance, the set of problems solvable within time f(n) on a deterministic Turing machine is then denoted by DTIME(f(n)).

Analogous definitions can be made for space requirements. Although time and space are the most well-known complexity resources, any complexity measure can be viewed as a computational resource. Complexity measures are very generally defined by the Blum complexity axioms. Other complexity measures used in complexity theory include communication complexity, circuit complexity, and decision tree complexity.

The complexity of an algorithm is often expressed using big O notation.

Best, worst and average case complexity

Visualization of the quicksort algorithm that has average case performance .

The best, worst and average case complexity refer to three different ways of measuring the time complexity (or any other complexity measure) of different inputs of the same size. Since some inputs of size n may be faster to solve than others, we define the following complexities:

  1. Best-case complexity: This is the complexity of solving the problem for the best input of size n.
  2. Average-case complexity: This is the complexity of solving the problem on an average. This complexity is only defined with respect to a probability distribution over the inputs. For instance, if all inputs of the same size are assumed to be equally likely to appear, the average case complexity can be defined with respect to the uniform distribution over all inputs of size n.
  3. Amortized analysis: Amortized analysis considers both the costly and less costly operations together over the whole series of operations of the algorithm.
  4. Worst-case complexity: This is the complexity of solving the problem for the worst input of size n.

The order from cheap to costly is: Best, average (of discrete uniform distribution), amortized, worst.

For example, consider the deterministic sorting algorithm quicksort. This solves the problem of sorting a list of integers that is given as the input. The worst-case is when the pivot is always the largest or smallest value in the list (so the list is never divided). In this case the algorithm takes time O(n2). If we assume that all possible permutations of the input list are equally likely, the average time taken for sorting is O(n log n). The best case occurs when each pivoting divides the list in half, also needing O(n log n) time.

Upper and lower bounds on the complexity of problems

To classify the computation time (or similar resources, such as space consumption), it is helpful to demonstrate upper and lower bounds on the maximum amount of time required by the most efficient algorithm to solve a given problem. The complexity of an algorithm is usually taken to be its worst-case complexity, unless specified otherwise. Analyzing a particular algorithm falls under the field of analysis of algorithms. To show an upper bound T(n) on the time complexity of a problem, one needs to show only that there is a particular algorithm with running time at most T(n). However, proving lower bounds is much more difficult, since lower bounds make a statement about all possible algorithms that solve a given problem. The phrase "all possible algorithms" includes not just the algorithms known today, but any algorithm that might be discovered in the future. To show a lower bound of T(n) for a problem requires showing that no algorithm can have time complexity lower than T(n).

Upper and lower bounds are usually stated using the big O notation, which hides constant factors and smaller terms. This makes the bounds independent of the specific details of the computational model used. For instance, if T(n) = 7n2 + 15n + 40, in big O notation one would write T(n) = O(n2).

Complexity classes

Defining complexity classes

A complexity class is a set of problems of related complexity. Simpler complexity classes are defined by the following factors:

  • The type of computational problem: The most commonly used problems are decision problems. However, complexity classes can be defined based on function problems, counting problems, optimization problems, promise problems, etc.
  • The model of computation: The most common model of computation is the deterministic Turing machine, but many complexity classes are based on non-deterministic Turing machines, Boolean circuits, quantum Turing machines, monotone circuits, etc.
  • The resource (or resources) that is being bounded and the bound: These two properties are usually stated together, such as "polynomial time", "logarithmic space", "constant depth", etc.

Some complexity classes have complicated definitions that do not fit into this framework. Thus, a typical complexity class has a definition like the following:

The set of decision problems solvable by a deterministic Turing machine within time f(n). (This complexity class is known as DTIME(f(n)).)

But bounding the computation time above by some concrete function f(n) often yields complexity classes that depend on the chosen machine model. For instance, the language {xx | x is any binary string} can be solved in linear time on a multi-tape Turing machine, but necessarily requires quadratic time in the model of single-tape Turing machines. If we allow polynomial variations in running time, Cobham-Edmonds thesis states that "the time complexities in any two reasonable and general models of computation are polynomially related" (Goldreich 2008, Chapter 1.2). This forms the basis for the complexity class P, which is the set of decision problems solvable by a deterministic Turing machine within polynomial time. The corresponding set of function problems is FP.

Important complexity classes

A representation of the relation among complexity classes

Many important complexity classes can be defined by bounding the time or space used by the algorithm. Some important complexity classes of decision problems defined in this manner are the following:

Complexity class Model of computation Resource constraint
Deterministic time
DTIME(f(n)) Deterministic Turing machine Time O(f(n))
     
P Deterministic Turing machine Time O(poly(n))
EXPTIME Deterministic Turing machine Time O(2poly(n))
Non-deterministic time
NTIME(f(n)) Non-deterministic Turing machine Time O(f(n))
     
NP Non-deterministic Turing machine Time O(poly(n))
NEXPTIME Non-deterministic Turing machine Time O(2poly(n))
Complexity class Model of computation Resource constraint
Deterministic space
DSPACE(f(n)) Deterministic Turing machine Space O(f(n))
L Deterministic Turing machine Space O(log n)
PSPACE Deterministic Turing machine Space O(poly(n))
EXPSPACE Deterministic Turing machine Space O(2poly(n))
Non-deterministic space
NSPACE(f(n)) Non-deterministic Turing machine Space O(f(n))
NL Non-deterministic Turing machine Space O(log n)
NPSPACE Non-deterministic Turing machine Space O(poly(n))
NEXPSPACE Non-deterministic Turing machine Space O(2poly(n))

The logarithmic-space classes (necessarily) do not take into account the space needed to represent the problem.

It turns out that PSPACE = NPSPACE and EXPSPACE = NEXPSPACE by Savitch's theorem.

Other important complexity classes include BPP, ZPP and RP, which are defined using probabilistic Turing machines; AC and NC, which are defined using Boolean circuits; and BQP and QMA, which are defined using quantum Turing machines. #P is an important complexity class of counting problems (not decision problems). Classes like IP and AM are defined using Interactive proof systems. ALL is the class of all decision problems.

Hierarchy theorems

For the complexity classes defined in this way, it is desirable to prove that relaxing the requirements on (say) computation time indeed defines a bigger set of problems. In particular, although DTIME(n) is contained in DTIME(n2), it would be interesting to know if the inclusion is strict. For time and space requirements, the answer to such questions is given by the time and space hierarchy theorems respectively. They are called hierarchy theorems because they induce a proper hierarchy on the classes defined by constraining the respective resources. Thus there are pairs of complexity classes such that one is properly included in the other. Having deduced such proper set inclusions, we can proceed to make quantitative statements about how much more additional time or space is needed in order to increase the number of problems that can be solved.

More precisely, the time hierarchy theorem states that

.

The space hierarchy theorem states that

.

The time and space hierarchy theorems form the basis for most separation results of complexity classes. For instance, the time hierarchy theorem tells us that P is strictly contained in EXPTIME, and the space hierarchy theorem tells us that L is strictly contained in PSPACE.

Reduction

Many complexity classes are defined using the concept of a reduction. A reduction is a transformation of one problem into another problem. It captures the informal notion of a problem being at most as difficult as another problem. For instance, if a problem X can be solved using an algorithm for Y, X is no more difficult than Y, and we say that X reduces to Y. There are many different types of reductions, based on the method of reduction, such as Cook reductions, Karp reductions and Levin reductions, and the bound on the complexity of reductions, such as polynomial-time reductions or log-space reductions.

The most commonly used reduction is a polynomial-time reduction. This means that the reduction process takes polynomial time. For example, the problem of squaring an integer can be reduced to the problem of multiplying two integers. This means an algorithm for multiplying two integers can be used to square an integer. Indeed, this can be done by giving the same input to both inputs of the multiplication algorithm. Thus we see that squaring is not more difficult than multiplication, since squaring can be reduced to multiplication.

This motivates the concept of a problem being hard for a complexity class. A problem X is hard for a class of problems C if every problem in C can be reduced to X. Thus no problem in C is harder than X, since an algorithm for X allows us to solve any problem in C. The notion of hard problems depends on the type of reduction being used. For complexity classes larger than P, polynomial-time reductions are commonly used. In particular, the set of problems that are hard for NP is the set of NP-hard problems.

If a problem X is in C and hard for C, then X is said to be complete for C. This means that X is the hardest problem in C. (Since many problems could be equally hard, one might say that X is one of the hardest problems in C.) Thus the class of NP-complete problems contains the most difficult problems in NP, in the sense that they are the ones most likely not to be in P. Because the problem P = NP is not solved, being able to reduce a known NP-complete problem, Π2, to another problem, Π1, would indicate that there is no known polynomial-time solution for Π1. This is because a polynomial-time solution to Π1 would yield a polynomial-time solution to Π2. Similarly, because all NP problems can be reduced to the set, finding an NP-complete problem that can be solved in polynomial time would mean that P = NP.

Important open problems

Diagram of complexity classes provided that P ≠ NP. The existence of problems in NP outside both P and NP-complete in this case was established by Ladner.

P versus NP problem

The complexity class P is often seen as a mathematical abstraction modeling those computational tasks that admit an efficient algorithm. This hypothesis is called the Cobham–Edmonds thesis. The complexity class NP, on the other hand, contains many problems that people would like to solve efficiently, but for which no efficient algorithm is known, such as the Boolean satisfiability problem, the Hamiltonian path problem and the vertex cover problem. Since deterministic Turing machines are special non-deterministic Turing machines, it is easily observed that each problem in P is also member of the class NP.

The question of whether P equals NP is one of the most important open questions in theoretical computer science because of the wide implications of a solution. If the answer is yes, many important problems can be shown to have more efficient solutions. These include various types of integer programming problems in operations research, many problems in logistics, protein structure prediction in biology, and the ability to find formal proofs of pure mathematics theorems. The P versus NP problem is one of the Millennium Prize Problems proposed by the Clay Mathematics Institute. There is a US$1,000,000 prize for resolving the problem.

Problems in NP not known to be in P or NP-complete

It was shown by Ladner that if PNP then there exist problems in NP that are neither in P nor NP-complete. Such problems are called NP-intermediate problems. The graph isomorphism problem, the discrete logarithm problem and the integer factorization problem are examples of problems believed to be NP-intermediate. They are some of the very few NP problems not known to be in P or to be NP-complete.

The graph isomorphism problem is the computational problem of determining whether two finite graphs are isomorphic. An important unsolved problem in complexity theory is whether the graph isomorphism problem is in P, NP-complete, or NP-intermediate. The answer is not known, but it is believed that the problem is at least not NP-complete. If graph isomorphism is NP-complete, the polynomial time hierarchy collapses to its second level. Since it is widely believed that the polynomial hierarchy does not collapse to any finite level, it is believed that graph isomorphism is not NP-complete. The best algorithm for this problem, due to László Babai and Eugene Luks has run time for graphs with n vertices, although some recent work by Babai offers some potentially new perspectives on this.

The integer factorization problem is the computational problem of determining the prime factorization of a given integer. Phrased as a decision problem, it is the problem of deciding whether the input has a prime factor less than k. No efficient integer factorization algorithm is known, and this fact forms the basis of several modern cryptographic systems, such as the RSA algorithm. The integer factorization problem is in NP and in co-NP (and even in UP and co-UP). If the problem is NP-complete, the polynomial time hierarchy will collapse to its first level (i.e., NP will equal co-NP). The best known algorithm for integer factorization is the general number field sieve, which takes time to factor an odd integer n. However, the best known quantum algorithm for this problem, Shor's algorithm, does run in polynomial time. Unfortunately, this fact doesn't say much about where the problem lies with respect to non-quantum complexity classes.

Separations between other complexity classes

Many known complexity classes are suspected to be unequal, but this has not been proved. For instance PNPPPPSPACE, but it is possible that P = PSPACE. If P is not equal to NP, then P is not equal to PSPACE either. Since there are many known complexity classes between P and PSPACE, such as RP, BPP, PP, BQP, MA, PH, etc., it is possible that all these complexity classes collapse to one class. Proving that any of these classes are unequal would be a major breakthrough in complexity theory.

Along the same lines, co-NP is the class containing the complement problems (i.e. problems with the yes/no answers reversed) of NP problems. It is believed that NP is not equal to co-NP; however, it has not yet been proven. It is clear that if these two complexity classes are not equal then P is not equal to NP, since P=co-P. Thus if P=NP we would have co-P=co-NP whence NP=P=co-P=co-NP.

Similarly, it is not known if L (the set of all problems that can be solved in logarithmic space) is strictly contained in P or equal to P. Again, there are many complexity classes between the two, such as NL and NC, and it is not known if they are distinct or equal classes.

It is suspected that P and BPP are equal. However, it is currently open if BPP = NEXP.

Intractability

A problem that can be solved in theory (e.g. given large but finite resources, especially time), but for which in practice any solution takes too many resources to be useful, is known as an intractable problem. Conversely, a problem that can be solved in practice is called a tractable problem, literally "a problem that can be handled". The term infeasible (literally "cannot be done") is sometimes used interchangeably with intractable, though this risks confusion with a feasible solution in mathematical optimization.

Tractable problems are frequently identified with problems that have polynomial-time solutions (P, PTIME); this is known as the Cobham–Edmonds thesis. Problems that are known to be intractable in this sense include those that are EXPTIME-hard. If NP is not the same as P, then NP-hard problems are also intractable in this sense.

However, this identification is inexact: a polynomial-time solution with large degree or large leading coefficient grows quickly, and may be impractical for practical size problems; conversely, an exponential-time solution that grows slowly may be practical on realistic input, or a solution that takes a long time in the worst case may take a short time in most cases or the average case, and thus still be practical. Saying that a problem is not in P does not imply that all large cases of the problem are hard or even that most of them are. For example, the decision problem in Presburger arithmetic has been shown not to be in P, yet algorithms have been written that solve the problem in reasonable times in most cases. Similarly, algorithms can solve the NP-complete knapsack problem over a wide range of sizes in less than quadratic time and SAT solvers routinely handle large instances of the NP-complete Boolean satisfiability problem.

To see why exponential-time algorithms are generally unusable in practice, consider a program that makes 2n operations before halting. For small n, say 100, and assuming for the sake of example that the computer does 1012 operations each second, the program would run for about 4 × 1010 years, which is the same order of magnitude as the age of the universe. Even with a much faster computer, the program would only be useful for very small instances and in that sense the intractability of a problem is somewhat independent of technological progress. However, an exponential-time algorithm that takes 1.0001n operations is practical until n gets relatively large.

Similarly, a polynomial time algorithm is not always practical. If its running time is, say, n15, it is unreasonable to consider it efficient and it is still useless except on small instances. Indeed, in practice even n3 or n2 algorithms are often impractical on realistic sizes of problems.

Continuous complexity theory

Continuous complexity theory can refer to complexity theory of problems that involve continuous functions that are approximated by discretizations, as studied in numerical analysis. One approach to complexity theory of numerical analysis is information based complexity.

Continuous complexity theory can also refer to complexity theory of the use of analog computation, which uses continuous dynamical systems and differential equations. Control theory can be considered a form of computation and differential equations are used in the modelling of continuous-time and hybrid discrete-continuous-time systems.

History

An early example of algorithm complexity analysis is the running time analysis of the Euclidean algorithm done by Gabriel Lamé in 1844.

Before the actual research explicitly devoted to the complexity of algorithmic problems started off, numerous foundations were laid out by various researchers. Most influential among these was the definition of Turing machines by Alan Turing in 1936, which turned out to be a very robust and flexible simplification of a computer.

The beginning of systematic studies in computational complexity is attributed to the seminal 1965 paper "On the Computational Complexity of Algorithms" by Juris Hartmanis and Richard E. Stearns, which laid out the definitions of time complexity and space complexity, and proved the hierarchy theorems. In addition, in 1965 Edmonds suggested to consider a "good" algorithm to be one with running time bounded by a polynomial of the input size.

Earlier papers studying problems solvable by Turing machines with specific bounded resources include John Myhill's definition of linear bounded automata (Myhill 1960), Raymond Smullyan's study of rudimentary sets (1961), as well as Hisao Yamada's paper on real-time computations (1962). Somewhat earlier, Boris Trakhtenbrot (1956), a pioneer in the field from the USSR, studied another specific complexity measure. As he remembers:

However, [my] initial interest [in automata theory] was increasingly set aside in favor of computational complexity, an exciting fusion of combinatorial methods, inherited from switching theory, with the conceptual arsenal of the theory of algorithms. These ideas had occurred to me earlier in 1955 when I coined the term "signalizing function", which is nowadays commonly known as "complexity measure".

In 1967, Manuel Blum formulated a set of axioms (now known as Blum axioms) specifying desirable properties of complexity measures on the set of computable functions and proved an important result, the so-called speed-up theorem. The field began to flourish in 1971 when the Stephen Cook and Leonid Levin proved the existence of practically relevant problems that are NP-complete. In 1972, Richard Karp took this idea a leap forward with his landmark paper, "Reducibility Among Combinatorial Problems", in which he showed that 21 diverse combinatorial and graph theoretical problems, each infamous for its computational intractability, are NP-complete.

Butane

From Wikipedia, the free encyclopedia ...