A Medley of Potpourri: Time complexity

Friday, June 28, 2024

Time complexity

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Time_complexity

In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

Since an algorithm's running time may vary among different inputs of the same size, one commonly considers the worst-case time complexity, which is the maximum amount of time required for inputs of a given size. Less common, and usually specified explicitly, is the average-case complexity, which is the average of the time taken on inputs of a given size (this makes sense because there are only a finite number of possible inputs of a given size). In both cases, the time complexity is generally expressed as a function of the size of the input. Since this function is generally difficult to compute exactly, and the running time for small inputs is usually not consequential, one commonly focuses on the behavior of the complexity when the input size increases—that is, the asymptotic behavior of the complexity. Therefore, the time complexity is commonly expressed using big O notation, typically $O (n)$ , $O (n \log n)$ , $O (n^{α})$ , $O (2^{n})$ , etc., where $n$ is the size in units of bits needed to represent the input.

Algorithmic complexities are classified according to the type of function appearing in the big O notation. For example, an algorithm with time complexity $O (n)$ is a linear time algorithm and an algorithm with time complexity $O (n^{α})$ for some constant $α > 1$ is a polynomial time algorithm.

Table of common time complexities

The following table summarizes some classes of commonly encountered time complexities. In the table, poly(x) = x^O(1), i.e., polynomial in x.

Name	Complexity class	Time complexity ( $O (n)$ )	Examples of running times	Example algorithms
constant time		$O (1)$	10	Finding the median value in a sorted array of numbers. Calculating $(-1) n$ .
inverse Ackermann time		$O (α (n))$		Amortized time per operation using a disjoint set
iterated logarithmic time		$O (\log^{*} n)$		Distributed coloring of cycles
log-logarithmic		$O (\log \log n)$		Amortized time per operation using a bounded priority queue
logarithmic time	DLOGTIME	$O (\log n)$	$\log n$ , $\log (n^{2})$	Binary search
polylogarithmic time		$poly (\log n)$	$(\log n)^{2}$
fractional power		$O (n^{c})$ where $0 < c < 1$	$n^{\frac{1}{2}}$ , $n^{\frac{2}{3}}$	Range searching in a k-d tree
linear time		$O (n)$	$n$ , $2 n + 5$	Finding the smallest or largest item in an unsorted array. Kadane's algorithm. Linear search.
"n log-star n" time		$O (n \log^{*} n)$		Seidel's polygon triangulation algorithm.
linearithmic time		$O (n \log n)$	$n \log n$ , $\log n!$	Fastest possible comparison sort. Fast Fourier transform.
quasilinear time		$n poly (\log n)$	$n \log^{2} n$	Multipoint polynomial evaluation
quadratic time		$O (n^{2})$	$n^{2}$	Bubble sort. Insertion sort. Direct convolution
cubic time		$O (n^{3})$	$n^{3}$	Naive multiplication of two $n \times n$ matrices. Calculating partial correlation.
polynomial time	P	$2^{O (\log n)} = poly (n)$	$n^{2} + n$ , $n^{10}$	Karmarkar's algorithm for linear programming. AKS primality test
quasi-polynomial time	QP	$2^{poly (\log n)}$	$n^{\log \log n}$ , $n^{\log n}$	Best-known $O (log 2 n)$ -approximation algorithm for the directed Steiner tree problem, best known parity game solver, best known graph isomorphism algorithm
sub-exponential time (first definition)	SUBEXP	$O (2^{n^{ϵ}})$ for all $ϵ > 0$		Contains BPP unless EXPTIME (see below) equals MA.
sub-exponential time (second definition)		$2^{o (n)}$	$2^{\sqrt[3]{n}}$	Best classical algorithm for integer factorization formerly-best algorithm for graph isomorphism
exponential time (with linear exponent)	E	$2^{O (n)}$	${1.1}^{n}$ , $10^{n}$	Solving the traveling salesman problem using dynamic programming
factorial time		$O (n)! = 2^{O (n \log n)}$	$n!, n^{n}, 2^{n \log n}$	Solving the traveling salesman problem via brute-force search
exponential time	EXPTIME	$2^{poly (n)}$	$2^{n}$ , $2^{n^{2}}$	Solving matrix chain multiplication via brute-force search
double exponential time	2-EXPTIME	$2^{2^{poly (n)}}$	$2^{2^{n}}$	Deciding the truth of a given statement in Presburger arithmetic

Constant time

An algorithm is said to be constant time (also written as $O (1)$ time) if the value of $T (n)$ (the complexity of the algorithm) is bounded by a value that does not depend on the size of the input. For example, accessing any single element in an array takes constant time as only one operation has to be performed to locate it. In a similar manner, finding the minimal value in an array sorted in ascending order; it is the first element. However, finding the minimal value in an unordered array is not a constant time operation as scanning over each element in the array is needed in order to determine the minimal value. Hence it is a linear time operation, taking $O (n)$ time. If the number of elements is known in advance and does not change, however, such an algorithm can still be said to run in constant time.

Despite the name "constant time", the running time does not have to be independent of the problem size, but an upper bound for the running time has to be independent of the problem size. For example, the task "exchange the values of $a$ and $b$ if necessary so that $a \leq b$ " is called constant time even though the time may depend on whether or not it is already true that $a \leq b$ . However, there is some constant $t$ such that the time required is always at most $t$ .

Logarithmic time

An algorithm is said to take logarithmic time when $T (n) = O (\log n)$ . Since $\log_{a} n$ and $\log_{b} n$ are related by a constant multiplier, and such a multiplier is irrelevant to big O classification, the standard usage for logarithmic-time algorithms is $O (\log n)$ regardless of the base of the logarithm appearing in the expression of $T$ .

Algorithms taking logarithmic time are commonly found in operations on binary trees or when using binary search.

An $O (\log n)$ algorithm is considered highly efficient, as the ratio of the number of operations to the size of the input decreases and tends to zero when $n$ increases. An algorithm that must access all elements of its input cannot take logarithmic time, as the time taken for reading an input of size $n$ is of the order of $n$ .

An example of logarithmic time is given by dictionary search. Consider a dictionary $D$ which contains $n$ entries, sorted in alphabetical order. We suppose that, for $1 \leq k \leq n$ , one may access the $k$ th entry of the dictionary in a constant time. Let $D (k)$ denote this $k$ th entry. Under these hypotheses, the test to see if a word $w$ is in the dictionary may be done in logarithmic time: consider $D (⌊ \frac{n}{2} ⌋)$ , where $⌊ ⌋$ denotes the floor function. If $w = D (⌊ \frac{n}{2} ⌋)$ --that is to say, the word $w$ is exactly in the middle of the dictionary--then we are done. Else, if $w < D (⌊ \frac{n}{2} ⌋)$ --i.e., if the word $w$ comes earlier in alphabetical order than the middle word of the whole dictionary--we continue the search in the same way in the left (i.e. earlier) half of the dictionary, and then again repeatedly until the correct word is found. Otherwise, if it comes after the middle word, continue similarly with the right half of the dictionary. This algorithm is similar to the method often used to find an entry in a paper dictionary. As a result, the search space within the dictionary decreases as the algorithm gets closer to the target word.

Polylogarithmic time

An algorithm is said to run in polylogarithmic time if its time $T (n)$ is $O ((\log n)^{k})$ for some constant $k$ . Another way to write this is $O (\log^{k} n)$ .

For example, matrix chain ordering can be solved in polylogarithmic time on a parallel random-access machine, and a graph can be determined to be planar in a fully dynamic way in $O (\log^{3} n)$ time per insert/delete operation.

Sub-linear time

An algorithm is said to run in sub-linear time (often spelled sublinear time) if $T (n) = o (n)$ . In particular this includes algorithms with the time complexities defined above.

The specific term sublinear time algorithm commonly refers to randomized algorithms that sample a small fraction of their inputs and process them efficiently to approximately infer properties of the entire instance. This type of sublinear time algorithm is closely related to property testing and statistics.

Other settings where algorithms can run in sublinear time include:

Parallel algorithms that have linear or greater total work (allowing them to read the entire input), but sub-linear depth.
Algorithms that have guaranteed assumptions on the input structure. An important example are operations on data structures, e.g. binary search in a sorted array.
Algorithms that search for local structure in the input, for example finding a local minimum in a 1-D array (can be solved in $O (\log (n))$ time using a variant of binary search). A closely related notion is that of Local Computation Algorithms (LCA) where the algorithm receives a large input and queries to local information about some valid large output.

Linear time

An algorithm is said to take linear time, or $O (n)$ time, if its time complexity is $O (n)$ . Informally, this means that the running time increases at most linearly with the size of the input. More precisely, this means that there is a constant $c$ such that the running time is at most $c n$ for every input of size $n$ . For example, a procedure that adds up all elements of a list requires time proportional to the length of the list, if the adding time is constant, or, at least, bounded by a constant.

Linear time is the best possible time complexity in situations where the algorithm has to sequentially read its entire input. Therefore, much research has been invested into discovering algorithms exhibiting linear time or, at least, nearly linear time. This research includes both software and hardware methods. There are several hardware technologies which exploit parallelism to provide this. An example is content-addressable memory. This concept of linear time is used in string matching algorithms such as the Boyer–Moore string-search algorithm and Ukkonen's algorithm.

Quasilinear time

An algorithm is said to run in quasilinear time (also referred to as log-linear time) if $T (n) = O (n \log^{k} n)$ for some positive constant $k$ ; linearithmic time is the case $k = 1$ . Using soft O notation these algorithms are $\tilde{O} (n)$ . Quasilinear time algorithms are also $O (n^{1 + ε})$ for every constant $ε > 0$ and thus run faster than any polynomial time algorithm whose time bound includes a term $n^{c}$ for any $c > 1$ .

Algorithms which run in quasilinear time include:

In-place merge sort, $O (n \log^{2} n)$
Quicksort, $O (n \log n)$ , in its randomized version, has a running time that is $O (n \log n)$ in expectation on the worst-case input. Its non-randomized version has an $O (n \log n)$ running time only when considering average case complexity.
Heapsort, $O (n \log n)$ , merge sort, introsort, binary tree sort, smoothsort, patience sorting, etc. in the worst case
Fast Fourier transforms, $O (n \log n)$
Monge array calculation, $O (n \log n)$

In many cases, the $O (n \log n)$ running time is simply the result of performing a $Θ (\log n)$ operation $n$ times (for the notation, see Big O notation § Family of Bachmann–Landau notations). For example, binary tree sort creates a binary tree by inserting each element of the $n$ -sized array one by one. Since the insert operation on a self-balancing binary search tree takes $O (\log n)$ time, the entire algorithm takes $O (n \log n)$ time.

Comparison sorts require at least $Ω (n \log n)$ comparisons in the worst case because $\log (n!) = Θ (n \log n)$ , by Stirling's approximation. They also frequently arise from the recurrence relation $T (n) = 2 T (\frac{n}{2}) + O (n)$ .

Sub-quadratic time

An algorithm is said to be subquadratic time if $T (n) = o (n^{2})$ .

For example, simple, comparison-based sorting algorithms are quadratic (e.g. insertion sort), but more advanced algorithms can be found that are subquadratic (e.g. shell sort). No general-purpose sorts run in linear time, but the change from quadratic to sub-quadratic is of great practical importance.

Polynomial time

An algorithm is said to be of polynomial time if its running time is upper bounded by a polynomial expression in the size of the input for the algorithm, that is, T(n) = O(n^k) for some positive constant k. Problems for which a deterministic polynomial-time algorithm exists belong to the complexity class P, which is central in the field of computational complexity theory. Cobham's thesis states that polynomial time is a synonym for "tractable", "feasible", "efficient", or "fast".

Some examples of polynomial-time algorithms:

The selection sort sorting algorithm on n integers performs $A n^{2}$ operations for some constant A. Thus it runs in time $O (n^{2})$ and is a polynomial-time algorithm.
All the basic arithmetic operations (addition, subtraction, multiplication, division, and comparison) can be done in polynomial time.
Maximum matchings in graphs can be found in polynomial time. In some contexts, especially in optimization, one differentiates between strongly polynomial time and weakly polynomial time algorithms.

These two concepts are only relevant if the inputs to the algorithms consist of integers.

Complexity classes

The concept of polynomial time leads to several complexity classes in computational complexity theory. Some important classes defined using polynomial time are the following.

P: The complexity class of decision problems that can be solved on a deterministic Turing machine in polynomial time
NP: The complexity class of decision problems that can be solved on a non-deterministic Turing machine in polynomial time
ZPP: The complexity class of decision problems that can be solved with zero error on a probabilistic Turing machine in polynomial time
RP: The complexity class of decision problems that can be solved with 1-sided error on a probabilistic Turing machine in polynomial time.
BPP: The complexity class of decision problems that can be solved with 2-sided error on a probabilistic Turing machine in polynomial time
BQP: The complexity class of decision problems that can be solved with 2-sided error on a quantum Turing machine in polynomial time

P is the smallest time-complexity class on a deterministic machine which is robust in terms of machine model changes. (For example, a change from a single-tape Turing machine to a multi-tape machine can lead to a quadratic speedup, but any algorithm that runs in polynomial time under one model also does so on the other.) Any given abstract machine will have a complexity class corresponding to the problems which can be solved in polynomial time on that machine.

Superpolynomial time

An algorithm is defined to take superpolynomial time if T(n) is not bounded above by any polynomial. Using little omega notation, it is ω(n^c) time for all constants c, where n is the input parameter, typically the number of bits in the input.

For example, an algorithm that runs for 2ⁿ steps on an input of size n requires superpolynomial time (more specifically, exponential time).

An algorithm that uses exponential resources is clearly superpolynomial, but some algorithms are only very weakly superpolynomial. For example, the Adleman–Pomerance–Rumely primality test runs for n^{O(log log n)} time on n-bit inputs; this grows faster than any polynomial for large enough n, but the input size must become impractically large before it cannot be dominated by a polynomial with small degree.

An algorithm that requires superpolynomial time lies outside the complexity class P. Cobham's thesis posits that these algorithms are impractical, and in many cases they are. Since the P versus NP problem is unresolved, it is unknown whether NP-complete problems require superpolynomial time.

Quasi-polynomial time

Quasi-polynomial time algorithms are algorithms whose running time exhibits quasi-polynomial growth, a type of behavior that may be slower than polynomial time but yet is significantly faster than exponential time. The worst case running time of a quasi-polynomial time algorithm is $2^{O (\log^{c} n)}$ for some fixed $c > 0$ . When $c = 1$ this gives polynomial time, and for $c < 1$ it gives sub-linear time.

There are some problems for which we know quasi-polynomial time algorithms, but no polynomial time algorithm is known. Such problems arise in approximation algorithms; a famous example is the directed Steiner tree problem, for which there is a quasi-polynomial time approximation algorithm achieving an approximation factor of $O (\log^{3} n)$ (n being the number of vertices), but showing the existence of such a polynomial time algorithm is an open problem.

Other computational problems with quasi-polynomial time solutions but no known polynomial time solution include the planted clique problem in which the goal is to find a large clique in the union of a clique and a random graph. Although quasi-polynomially solvable, it has been conjectured that the planted clique problem has no polynomial time solution; this planted clique conjecture has been used as a computational hardness assumption to prove the difficulty of several other problems in computational game theory, property testing, and machine learning.

The complexity class QP consists of all problems that have quasi-polynomial time algorithms. It can be defined in terms of DTIME as follows.

QP = ⋃_{c \in N} DTIME (2^{\log^{c} n})

Relation to NP-complete problems

In complexity theory, the unsolved P versus NP problem asks if all problems in NP have polynomial-time algorithms. All the best-known algorithms for NP-complete problems like 3SAT etc. take exponential time. Indeed, it is conjectured for many natural NP-complete problems that they do not have sub-exponential time algorithms. Here "sub-exponential time" is taken to mean the second definition presented below. (On the other hand, many graph problems represented in the natural way by adjacency matrices are solvable in subexponential time simply because the size of the input is the square of the number of vertices.) This conjecture (for the k-SAT problem) is known as the exponential time hypothesis. Since it is conjectured that NP-complete problems do not have quasi-polynomial time algorithms, some inapproximability results in the field of approximation algorithms make the assumption that NP-complete problems do not have quasi-polynomial time algorithms. For example, see the known inapproximability results for the set cover problem.

Sub-exponential time

The term sub-exponential time is used to express that the running time of some algorithm may grow faster than any polynomial but is still significantly smaller than an exponential. In this sense, problems that have sub-exponential time algorithms are somewhat more tractable than those that only have exponential algorithms. The precise definition of "sub-exponential" is not generally agreed upon, however the two most widely used are below.

First definition

A problem is said to be sub-exponential time solvable if it can be solved in running times whose logarithms grow smaller than any given polynomial. More precisely, a problem is in sub-exponential time if for every ε > 0 there exists an algorithm which solves the problem in time O(2^{n^ε}). The set of all such problems is the complexity class SUBEXP which can be defined in terms of DTIME as follows.

SUBEXP = ⋂_{ε > 0} DTIME (2^{n^{ε}})

This notion of sub-exponential is non-uniform in terms of ε in the sense that ε is not part of the input and each ε may have its own algorithm for the problem.

Second definition

Some authors define sub-exponential time as running times in $2^{o (n)}$ . This definition allows larger running times than the first definition of sub-exponential time. An example of such a sub-exponential time algorithm is the best-known classical algorithm for integer factorization, the general number field sieve, which runs in time about $2^{\tilde{O} (n^{1 / 3})}$ , where the length of the input is $n$ . Another example was the graph isomorphism problem, which the best known algorithm from 1982 to 2016 solved in $2^{O (\sqrt{n \log n})}$ . However, at STOC 2016 a quasi-polynomial time algorithm was presented.

It makes a difference whether the algorithm is allowed to be sub-exponential in the size of the instance, the number of vertices, or the number of edges. In parameterized complexity, this difference is made explicit by considering pairs $(L, k)$ of decision problems and parameters k. SUBEPT is the class of all parameterized problems that run in time sub-exponential in k and polynomial in the input size n:

SUBEPT = DTIME (2^{o (k)} \cdot poly (n)) .

More precisely, SUBEPT is the class of all parameterized problems $(L, k)$ for which there is a computable function $f : N \to N$ with $f \in o (k)$ and an algorithm that decides L in time $2^{f (k)} \cdot poly (n)$ .

Exponential time hypothesis

The exponential time hypothesis (ETH) is that 3SAT, the satisfiability problem of Boolean formulas in conjunctive normal form with at most three literals per clause and with n variables, cannot be solved in time 2^o(n). More precisely, the hypothesis is that there is some absolute constant $c > 0$ such that 3SAT cannot be decided in time 2^cn by any deterministic Turing machine. With m denoting the number of clauses, ETH is equivalent to the hypothesis that kSAT cannot be solved in time 2^o(m) for any integer $k \geq 3$ . The exponential time hypothesis implies P ≠ NP.

Exponential time

An algorithm is said to be exponential time, if T(n) is upper bounded by 2^poly(n), where poly(n) is some polynomial in n. More formally, an algorithm is exponential time if T(n) is bounded by O(2^{n^k}) for some constant k. Problems which admit exponential time algorithms on a deterministic Turing machine form the complexity class known as EXP.

EXP = ⋃_{c \in R_{+}} DTIME (2^{n^{c}})

Sometimes, exponential time is used to refer to algorithms that have T(n) = 2^O(n), where the exponent is at most a linear function of n. This gives rise to the complexity class E.

E = ⋃_{c \in N} DTIME (2^{c n})

Factorial time

An algorithm is said to be factorial time if T(n) is upper bounded by the factorial function n!. Factorial time is a subset of exponential time (EXP) because $n! \leq n^{n} = 2^{n \log n} = O (2^{n^{1 + ϵ}})$ for all $ϵ > 0$ . However, it is not a subset of E.

An example of an algorithm that runs in factorial time is bogosort, a notoriously inefficient sorting algorithm based on trial and error. Bogosort sorts a list of n items by repeatedly shuffling the list until it is found to be sorted. In the average case, each pass through the bogosort algorithm will examine one of the n! orderings of the n items. If the items are distinct, only one such ordering is sorted. Bogosort shares patrimony with the infinite monkey theorem.

Double exponential time

An algorithm is said to be double exponential time if T(n) is upper bounded by 2^{2^poly(n)}, where poly(n) is some polynomial in n. Such algorithms belong to the complexity class 2-EXPTIME.

2-EXPTIME = ⋃_{c \in N} DTIME (2^{2^{n^{c}}})

Well-known double exponential time algorithms include:

Decision procedures for Presburger arithmetic
Computing a Gröbner basis (in the worst case)
Quantifier elimination on real closed fields takes at least double exponential time, and can be done in this time.

A Medley of Potpourri

Search This Blog