Information-based complexity (IBC) studies optimal algorithms and computational complexity for the continuous problems which arise in physical science, economics, engineering, and mathematical finance. IBC has studied such continuous problems as path integration, partial differential equations, systems of ordinary differential equations, nonlinear equations, integral equations, fixed points, and very-high-dimensional integration.
All these problems involve functions (typically multivariate) of a real
or complex variable. Since one can never obtain a closed-form solution to the problems of interest, one has to settle for a numerical solution.
Since a function of a real or complex variable cannot be entered into a
digital computer, the solution of continuous problems involves partial
information. To give a simple illustration, in the numerical
approximation of an integral, only samples of the integrand at a finite
number of points are available. In the numerical solution of partial
differential equations the functions specifying the boundary conditions
and the coefficients of the differential operator can only be sampled.
Furthermore, this partial information can be expensive to obtain.
Finally the information is often contaminated by noise.
The goal of information-based complexity is to create a theory of computational complexity and optimal algorithms for problems with partial, contaminated and priced information, and to apply the results to answering questions in various disciplines. Examples of such disciplines include physics, economics, mathematical finance, computer vision, control theory, geophysics, medical imaging, weather forecasting and climate prediction, and statistics. The theory is developed over abstract spaces, typically Hilbert or Banach spaces, while the applications are usually for multivariate problems.
Since the information is partial and contaminated, only approximate solutions can be obtained. IBC studies computational complexity and optimal algorithms for approximate solutions in various settings. Since the worst case setting often leads to negative results such as unsolvability and intractability, settings with weaker assurances such as average, probabilistic and randomized are also studied. A fairly new area of IBC research is continuous quantum computing.
Overview
We illustrate some important concepts with a very simple example: the computation of ∫₀¹ f(x) dx.
For most integrands we cannot use the fundamental theorem of calculus to compute the integral analytically; we have to approximate it numerically. We compute the values of f at n points: f(t₁), ..., f(tₙ). These n numbers are the partial information about the true integrand f. We combine them by a combinatory algorithm to compute an approximation to the integral. See the monograph Complexity and Information for particulars.
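To make the role of partial information concrete, here is a minimal sketch in Python. The particular integrand, the midpoint sample points, and the equal-weight rule are illustrative assumptions, not prescribed by IBC; the point is only that the algorithm sees n samples of f and nothing else.

```python
import math

def approximate_integral(f, n):
    """Approximate the integral of f over [0, 1] from n samples of f.

    The values f(t_1), ..., f(t_n) are the partial information; the
    equal-weight sum below plays the role of the combinatory algorithm.
    """
    points = [(i + 0.5) / n for i in range(n)]   # midpoints of n subintervals
    samples = [f(t) for t in points]             # the partial information
    return sum(samples) / n                      # combine the n numbers

# Example: an integrand with no elementary antiderivative.
f = lambda x: math.exp(-x * x)
print(approximate_integral(f, 1000))  # about 0.7468, the integral of exp(-x^2) over [0, 1]
```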
Because we have only partial information we can use an adversary argument to tell us how large n has to be to compute an ε-approximation.
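A sketch of such an adversary argument, under the illustrative assumption that the class consists of 1-Lipschitz integrands on [0, 1]:

```latex
% Adversary argument (sketch) for 1-Lipschitz integrands on [0,1].
Suppose an algorithm samples $f$ at points $t_1,\dots,t_n$ and the adversary
answers $f(t_i)=0$ for every $i$. Both
\[
  f_+(x) = \operatorname{dist}\bigl(x,\{t_1,\dots,t_n\}\bigr),
  \qquad
  f_-(x) = -f_+(x)
\]
are 1-Lipschitz and consistent with those answers, yet their integrals differ
by at least $c/n$ for an absolute constant $c>0$. No algorithm can distinguish
$f_+$ from $f_-$, so its worst-case error is at least $c/(2n)$, and an
$\varepsilon$-approximation requires $n = \Omega(1/\varepsilon)$.
```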
Because of these information-based arguments we can often obtain tight
bounds on the complexity of continuous problems. For discrete problems
such as integer factorization or the travelling salesman problem
we have to settle for conjectures about the complexity hierarchy. The
reason is that the input is a number or a vector of numbers and can thus
be entered into the computer. Thus there is typically no adversary
argument at the information level and the complexity of a discrete
problem is rarely known.
The univariate integration problem was for illustration only.
Significant for many applications is multivariate integration. The
number of variables is in the hundreds or thousands. The number of
variables may even be infinite; we then speak of path integration. The
reason that integrals are important in many disciplines is that they
occur when we want to know the expected behavior of a continuous
process. See, for example, the application to mathematical finance below.
Assume we want to compute an integral in d dimensions (dimensions and variables are used interchangeably) and that we want to guarantee an error of at most ε for any integrand in some class. The computational complexity of the problem is known to be of order ε^(-d). (Here we are counting the number of function evaluations and the number of arithmetic operations, so this is the time complexity.) This would take many years for even modest values of d. The exponential dependence on d is called the curse of dimensionality. We say the problem is intractable.
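For a sense of scale (the particular values of ε and d below are arbitrary illustrations):

```latex
\[
  \varepsilon = 10^{-2}:\qquad
  \varepsilon^{-d} = 10^{20} \ \text{for } d = 10,
  \qquad
  \varepsilon^{-d} = 10^{200} \ \text{for } d = 100 .
\]
```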
We stated the curse of dimensionality for integration. But exponential dependence on d
occurs for almost every continuous problem that has been investigated.
How can we try to vanquish the curse? There are two possibilities:
- We can weaken the guarantee that the error must be less than ε (worst case setting) and settle for a stochastic assurance. For example, we might only require that the expected error be less than ε (average case setting). Another possibility is the randomized setting; a minimal Monte Carlo sketch follows this list. For some problems we can break the curse of dimensionality by weakening the assurance; for others, we cannot. There is a large IBC literature on results in various settings; see Where to Learn More below.
- We can incorporate domain knowledge. See An Example: Mathematical Finance below.
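The Monte Carlo sketch promised above, in Python: the estimator below has expected error of order n^(-1/2) regardless of the dimension d, which is the sense in which the randomized setting escapes the worst-case curse for integration. The test integrand and sample sizes are illustrative assumptions.

```python
import math
import random

def monte_carlo_integral(f, d, n, seed=0):
    """Estimate the integral of f over the d-dimensional unit cube from n
    random samples; the expected error is O(n**-0.5) for every d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]  # a uniform random point in [0,1]^d
        total += f(x)
    return total / n

# Illustrative integrand whose exact integral is 1 in any dimension.
f = lambda x: math.prod(1.0 + (xj - 0.5) / (j + 2) for j, xj in enumerate(x))
print(monte_carlo_integral(f, d=100, n=100_000))  # close to 1.0
```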
An example: mathematical finance
Very high dimensional integrals are common in finance. For example, computing expected cash flows for a collateralized mortgage obligation (CMO) requires the calculation of a number of 360-dimensional integrals, the 360 being the number of months in 30 years. Recall that if a worst case assurance is required, the time is of order ε^(-360) time units. Even if the error is not small, say ε = 10^(-2), this is 10^720 time units. People in finance have long been using the Monte Carlo method (MC), an instance of a randomized algorithm. Then in 1994 a research group at Columbia University (Papageorgiou, Paskov, Traub, Woźniakowski) discovered that the quasi-Monte Carlo (QMC) method using low discrepancy sequences
beat MC by one to three orders of magnitude. The results were reported to a number of Wall Street firms, to considerable initial skepticism.
The results were first published by Paskov and Traub, Faster Valuation of Financial Derivatives, Journal of Portfolio Management 22, 113-120. Today QMC is widely used in the financial sector to value financial derivatives.
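A sketch of a QMC estimator built on a Halton low-discrepancy sequence, in Python. This is a generic illustration, not the method or the CMO model used in the study above; the Halton construction, the choice of 10 dimensions, and the test integrand are assumptions made here for brevity.

```python
import math

def van_der_corput(i, base):
    """i-th element of the van der Corput sequence in the given base."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def halton_point(i, primes):
    """i-th point of a Halton low-discrepancy sequence, one prime base per dimension."""
    return [van_der_corput(i, p) for p in primes]

def qmc_integral(f, primes, n):
    """Quasi-Monte Carlo: average f over the first n Halton points."""
    return sum(f(halton_point(i, primes)) for i in range(1, n + 1)) / n

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]  # bases for a 10-dimensional example
f = lambda x: math.prod(1.0 + (xj - 0.5) / (j + 2) for j, xj in enumerate(x))  # exact integral 1
print(qmc_integral(f, primes, n=10_000))  # typically closer to 1 than plain MC with 10,000 samples
```

Low-discrepancy points cover the unit cube more evenly than independent random points, which is what allows deterministic QMC error bounds rather than merely stochastic ones.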
These results are empirical; where does computational complexity
come in? QMC is not a panacea for all high dimensional integrals. What
is special about financial derivatives? Here's a possible explanation.
The 360 dimensions in the CMO represent monthly future times. Due to the discounted value of money, variables representing times far in the future are less important than the variables representing nearby times. Thus
the integrals are non-isotropic. Sloan and Woźniakowski introduced the
very powerful idea of weighted spaces which is a formalization of the
above observation. They were able to show that with this additional
domain knowledge high dimensional integrals satisfying certain
conditions were tractable even in the worst case! In contrast the Monte
Carlo method gives only a stochastic assurance. See Sloan and
Woźniakowski, When are Quasi-Monte Carlo Algorithms Efficient for High Dimensional Integration?,
J. Complexity 14, 1-33, 1998. For which classes of integrals is QMC
superior to MC? This continues to be a major research problem.
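A schematic of the weighted-space idea (a paraphrase with details omitted; the exact norm and conditions are as in the Sloan-Woźniakowski paper cited above):

```latex
% Schematic only; see Sloan and Wo\'zniakowski (1998) for the precise statement.
Assign each variable $x_j$ a weight $\gamma_1 \ge \gamma_2 \ge \cdots > 0$, and
measure integrands by a norm that penalizes dependence on a group of variables
$u \subseteq \{1,\dots,d\}$ by the factor $\prod_{j\in u}\gamma_j^{-1}$, so that
a small $\gamma_j$ forces weak dependence on $x_j$. Roughly, if
\[
  \sum_{j=1}^{\infty} \gamma_j < \infty ,
\]
then the worst-case error of suitable QMC rules can be bounded independently of
$d$, and the curse of dimensionality disappears for this weighted class.
```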
Brief history
Precursors to IBC may be found in the 1950s in the work of Kiefer, Sard, and Nikolskij. In 1959 Traub
had the key insight that the optimal algorithm and the computational
complexity of solving a continuous problem depended on the available
information. He applied this insight to the solution of nonlinear equations which started the area of optimal iteration theory. This research was published in the 1964 monograph Iterative Methods for the Solution of Equations.
The general setting for information-based complexity was formulated by Traub and Woźniakowski in 1980 in A General Theory of Optimal Algorithms.