Monday, November 22, 2021

Monte Carlo method

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Monte_Carlo_method

Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches. Monte Carlo methods are mainly used in three problem classes: optimization, numerical integration, and generating draws from a probability distribution.

In physics-related problems, Monte Carlo methods are useful for simulating systems with many coupled degrees of freedom, such as fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts model, interacting particle systems, McKean–Vlasov processes, kinetic models of gases).

Other examples include modeling phenomena with significant uncertainty in inputs such as the calculation of risk in business and, in mathematics, evaluation of multidimensional definite integrals with complicated boundary conditions. In application to systems engineering problems (space, oil exploration, aircraft design, etc.), Monte Carlo–based predictions of failure, cost overruns and schedule overruns are routinely better than human intuition or alternative "soft" methods.

In principle, Monte Carlo methods can be used to solve any problem having a probabilistic interpretation. By the law of large numbers, integrals described by the expected value of some random variable can be approximated by taking the empirical mean (a.k.a. the sample mean) of independent samples of the variable. When the probability distribution of the variable is parametrized, mathematicians often use a Markov chain Monte Carlo (MCMC) sampler. The central idea is to design a judicious Markov chain model with a prescribed stationary probability distribution. That is, in the limit, the samples being generated by the MCMC method will be samples from the desired (target) distribution. By the ergodic theorem, the stationary distribution is approximated by the empirical measures of the random states of the MCMC sampler.
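The law-of-large-numbers argument can be illustrated with a short sketch. The function name `monte_carlo_mean` and the choice of test integrand (x squared on the unit interval, whose integral is 1/3) are assumptions made for this example, not part of the original article.

```python
import random

def monte_carlo_mean(g, sampler, n_samples, rng):
    """Estimate E[g(X)] by the empirical mean of independent draws of X."""
    total = 0.0
    for _ in range(n_samples):
        total += g(sampler(rng))
    return total / n_samples

rng = random.Random(0)  # fixed seed so the run is repeatable
# For X ~ Uniform(0, 1), E[X^2] equals the integral of x^2 over [0, 1], i.e. 1/3.
estimate = monte_carlo_mean(lambda x: x * x, lambda r: r.random(), 100_000, rng)
print(estimate)
```

The printed value should sit close to 1/3, and it tightens as `n_samples` grows.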

In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depend on the distributions of the current random states (see McKean–Vlasov processes, nonlinear filtering equation). In other instances we are given a flow of probability distributions with an increasing level of sampling complexity (path spaces models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain. A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and MCMC methodologies, these mean-field particle techniques rely on sequential interacting samples. The terminology mean field reflects the fact that each of the samples (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.

Despite its conceptual and algorithmic simplicity, the computational cost associated with a Monte Carlo simulation can be staggeringly high. In general the method requires many samples to get a good approximation, which may incur an arbitrarily large total runtime if the processing time of a single sample is high. Although this is a severe limitation in very complex problems, the embarrassingly parallel nature of the algorithm allows this large cost to be reduced (perhaps to a feasible level) through parallel computing strategies in local processors, clusters, cloud computing, GPU, FPGA etc.

Overview

Monte Carlo methods vary, but tend to follow a particular pattern:

  1. Define a domain of possible inputs
  2. Generate inputs randomly from a probability distribution over the domain
  3. Perform a deterministic computation on the inputs
  4. Aggregate the results

Monte Carlo method applied to approximating the value of π.

For example, consider a quadrant (circular sector) inscribed in a unit square. Given that the ratio of their areas is π/4, the value of π can be approximated using a Monte Carlo method:

  1. Draw a square, then inscribe a quadrant within it
  2. Uniformly scatter a given number of points over the square
  3. Count the number of points inside the quadrant, i.e. having a distance from the origin of less than 1
  4. The ratio of the inside-count and the total-sample-count is an estimate of the ratio of the two areas, π/4. Multiply the result by 4 to estimate π.

In this procedure the domain of inputs is the square that circumscribes the quadrant. We generate random inputs by scattering grains over the square then perform a computation on each input (test whether it falls within the quadrant). Aggregating the results yields our final result, the approximation of π.
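The four steps above can be sketched in a few lines of Python. The function name `estimate_pi` and the seed are illustrative assumptions; the logic follows the procedure as described (uniform points in the unit square, count those within distance 1 of the origin, multiply the ratio by 4).

```python
import random

def estimate_pi(n_points, rng):
    """Estimate pi from the fraction of uniform points that fall in the quadrant."""
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()   # a point scattered uniformly over the unit square
        if x * x + y * y < 1.0:             # distance from the origin is less than 1
            inside += 1
    return 4.0 * inside / n_points          # the ratio of areas is pi/4

print(estimate_pi(100_000, random.Random(42)))
```

Running it with a handful of points and then with many points shows the second consideration below in practice: on average the estimate improves as more points are placed.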

There are two important considerations:

  1. If the points are not uniformly distributed, then the approximation will be poor.
  2. There are many points. The approximation is generally poor if only a few points are randomly placed in the whole square. On average, the approximation improves as more points are placed.

Uses of Monte Carlo methods require large amounts of random numbers, and it was their use that spurred the development of pseudorandom number generators, which were far quicker to use than the tables of random numbers that had been previously used for statistical sampling.

History

Before the Monte Carlo method was developed, simulations tested a previously understood deterministic problem, and statistical sampling was used to estimate uncertainties in the simulations. Monte Carlo simulations invert this approach, solving deterministic problems using probabilistic metaheuristics (see simulated annealing).

An early variant of the Monte Carlo method was devised to solve the Buffon's needle problem, in which π can be estimated by dropping needles on a floor made of parallel equidistant strips. In the 1930s, Enrico Fermi first experimented with the Monte Carlo method while studying neutron diffusion, but he did not publish this work.

In the late 1940s, Stanislaw Ulam invented the modern version of the Markov Chain Monte Carlo method while he was working on nuclear weapons projects at the Los Alamos National Laboratory. Immediately after Ulam's breakthrough, John von Neumann understood its importance. In 1946, nuclear weapons physicists at Los Alamos were investigating neutron diffusion in the core of a nuclear weapon. Despite having most of the necessary data, such as the average distance a neutron would travel in a substance before it collided with an atomic nucleus and how much energy the neutron was likely to give off following a collision, the Los Alamos physicists were unable to solve the problem using conventional, deterministic mathematical methods. Ulam proposed using random experiments. He recounts his inspiration as follows:

The first thoughts and attempts I made to practice [the Monte Carlo Method] were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than "abstract thinking" might not be to lay it out say one hundred times and simply observe and count the number of successful plays. This was already possible to envisage with the beginning of the new era of fast computers, and I immediately thought of problems of neutron diffusion and other questions of mathematical physics, and more generally how to change processes described by certain differential equations into an equivalent form interpretable as a succession of random operations. Later [in 1946], I described the idea to John von Neumann, and we began to plan actual calculations.

Being secret, the work of von Neumann and Ulam required a code name. A colleague of von Neumann and Ulam, Nicholas Metropolis, suggested using the name Monte Carlo, which refers to the Monte Carlo Casino in Monaco where Ulam's uncle would borrow money from relatives to gamble. Monte Carlo methods were central to the simulations required for the Manhattan Project, though severely limited by the computational tools at the time. Von Neumann, Nicholas Metropolis and others programmed the ENIAC computer to perform the first fully automated Monte Carlo calculations, of a fission weapon core, in the spring of 1948. In the 1950s Monte Carlo methods were used at Los Alamos for the development of the hydrogen bomb, and became popularized in the fields of physics, physical chemistry, and operations research. The Rand Corporation and the U.S. Air Force were two of the major organizations responsible for funding and disseminating information on Monte Carlo methods during this time, and they began to find a wide application in many different fields.

The theory of more sophisticated mean-field type particle Monte Carlo methods had certainly started by the mid-1960s, with the work of Henry P. McKean Jr. on Markov interpretations of a class of nonlinear parabolic partial differential equations arising in fluid mechanics. We also quote an earlier pioneering article by Theodore E. Harris and Herman Kahn, published in 1951, using mean-field genetic-type Monte Carlo methods for estimating particle transmission energies. Mean-field genetic type Monte Carlo methodologies are also used as heuristic natural search algorithms (a.k.a. metaheuristic) in evolutionary computing. The origins of these mean-field computational techniques can be traced to 1950 and 1954 with the work of Alan Turing on genetic type mutation-selection learning machines and the articles by Nils Aall Barricelli at the Institute for Advanced Study in Princeton, New Jersey.

Quantum Monte Carlo, and more specifically diffusion Monte Carlo methods, can also be interpreted as a mean-field particle Monte Carlo approximation of Feynman–Kac path integrals. The origins of Quantum Monte Carlo methods are often attributed to Enrico Fermi and Robert Richtmyer, who developed in 1948 a mean-field particle interpretation of neutron-chain reactions, but the first heuristic-like and genetic type particle algorithm (a.k.a. Resampled or Reconfiguration Monte Carlo methods) for estimating ground state energies of quantum systems (in reduced matrix models) is due to Jack H. Hetherington in 1984. In molecular chemistry, the use of genetic heuristic-like particle methodologies (a.k.a. pruning and enrichment strategies) can be traced back to 1955 with the seminal work of Marshall N. Rosenbluth and Arianna W. Rosenbluth.

The use of Sequential Monte Carlo in advanced signal processing and Bayesian inference is more recent. In 1993, Gordon et al. published in their seminal work the first application of a Monte Carlo resampling algorithm in Bayesian statistical inference. The authors named their algorithm 'the bootstrap filter', and demonstrated that, compared to other filtering methods, their bootstrap algorithm does not require any assumption about the state-space or the noise of the system. We also quote another pioneering article in this field by Genshiro Kitagawa on a related "Monte Carlo filter", and the ones by Pierre Del Moral, and by Himilcon Carvalho, Pierre Del Moral, André Monin and Gérard Salut, on particle filters published in the mid-1990s. Particle filters were also developed in signal processing in 1989–1992 by P. Del Moral, J. C. Noyer, G. Rigal, and G. Salut in the LAAS-CNRS in a series of restricted and classified research reports with STCAN (Service Technique des Constructions et Armes Navales), the IT company DIGILOG, and the LAAS-CNRS (the Laboratory for Analysis and Architecture of Systems) on radar/sonar and GPS signal processing problems. These Sequential Monte Carlo methodologies can be interpreted as an acceptance-rejection sampler equipped with an interacting recycling mechanism.

From 1950 to 1996, all the publications on Sequential Monte Carlo methodologies, including the pruning and resample Monte Carlo methods introduced in computational physics and molecular chemistry, present natural and heuristic-like algorithms applied to different situations without a single proof of their consistency, nor a discussion on the bias of the estimates and on genealogical and ancestral tree based algorithms. The mathematical foundations and the first rigorous analysis of these particle algorithms were written by Pierre Del Moral in 1996.

Branching type particle methodologies with varying population sizes were also developed at the end of the 1990s by Dan Crisan, Jessica Gaines and Terry Lyons, and by Dan Crisan, Pierre Del Moral and Terry Lyons. Further developments in this field were described in 2000 by P. Del Moral, A. Guionnet and L. Miclo.

Definitions

There is no consensus on how Monte Carlo should be defined. For example, Ripley defines most probabilistic modeling as stochastic simulation, with Monte Carlo being reserved for Monte Carlo integration and Monte Carlo statistical tests. Sawilowsky distinguishes between a simulation, a Monte Carlo method, and a Monte Carlo simulation: a simulation is a fictitious representation of reality, a Monte Carlo method is a technique that can be used to solve a mathematical or statistical problem, and a Monte Carlo simulation uses repeated sampling to obtain the statistical properties of some phenomenon (or behavior). Examples:

  • Simulation: Drawing one pseudo-random uniform variable from the interval [0,1] can be used to simulate the tossing of a coin: If the value is less than or equal to 0.50 designate the outcome as heads, but if the value is greater than 0.50 designate the outcome as tails. This is a simulation, but not a Monte Carlo simulation.
  • Monte Carlo method: Pouring out a box of coins on a table, and then computing the ratio of coins that land heads versus tails is a Monte Carlo method of determining the behavior of repeated coin tosses, but it is not a simulation.
  • Monte Carlo simulation: Drawing a large number of pseudo-random uniform variables from the interval [0,1] at one time, or once at many different times, and assigning values less than or equal to 0.50 as heads and greater than 0.50 as tails, is a Monte Carlo simulation of the behavior of repeatedly tossing a coin.
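Sawilowsky's distinction between a single simulated toss and a Monte Carlo simulation of repeated tosses can be sketched as follows. The function names `toss` and `heads_fraction` are assumptions for this example; the 0.50 threshold is the one given in the list above.

```python
import random

def toss(rng):
    """Simulation: one pseudo-random uniform draw mapped to a coin face."""
    return "heads" if rng.random() <= 0.50 else "tails"

def heads_fraction(n_tosses, rng):
    """Monte Carlo simulation: repeat the draw many times and summarize the behavior."""
    heads = sum(1 for _ in range(n_tosses) if toss(rng) == "heads")
    return heads / n_tosses

rng = random.Random(7)
print(toss(rng))                     # a single simulated toss
print(heads_fraction(100_000, rng))  # long-run fraction of heads
```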

Kalos and Whitlock point out that such distinctions are not always easy to maintain. For example, the emission of radiation from atoms is a natural stochastic process. It can be simulated directly, or its average behavior can be described by stochastic equations that can themselves be solved using Monte Carlo methods. "Indeed, the same computer code can be viewed simultaneously as a 'natural simulation' or as a solution of the equations by natural sampling."

Monte Carlo and random numbers

The main idea behind this method is that the results are computed based on repeated random sampling and statistical analysis. A Monte Carlo simulation is, in effect, a set of random experiments, used when the results of those experiments are not known in advance. Monte Carlo simulations are typically characterized by many unknown parameters, many of which are difficult to obtain experimentally. Monte Carlo simulation methods do not always require truly random numbers to be useful (although, for some applications such as primality testing, unpredictability is vital). Many of the most useful techniques use deterministic, pseudorandom sequences, making it easy to test and re-run simulations. The only quality usually necessary to make good simulations is for the pseudo-random sequence to appear "random enough" in a certain sense.

What this means depends on the application, but typically they should pass a series of statistical tests. Testing that the numbers are uniformly distributed or follow another desired distribution when a large enough number of elements of the sequence are considered is one of the simplest and most common ones. Weak correlations between successive samples are also often desirable/necessary.

Sawilowsky lists the characteristics of a high-quality Monte Carlo simulation:

  • the (pseudo-random) number generator has certain characteristics (e.g. a long "period" before the sequence repeats)
  • the (pseudo-random) number generator produces values that pass tests for randomness
  • there are enough samples to ensure accurate results
  • the proper sampling technique is used
  • the algorithm used is valid for what is being modeled
  • it simulates the phenomenon in question.

Pseudo-random number sampling algorithms are used to transform uniformly distributed pseudo-random numbers into numbers that are distributed according to a given probability distribution.
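One common way to perform such a transformation is inverse transform sampling, shown here for the exponential distribution as a minimal sketch. The choice of technique and target distribution, and the function name `sample_exponential`, are assumptions made for illustration; the article does not single out a particular algorithm.

```python
import math
import random

def sample_exponential(rate, rng):
    """Map a uniform draw to an Exponential(rate) draw via the inverse CDF."""
    u = rng.random()                   # uniform on [0, 1)
    return -math.log(1.0 - u) / rate   # 1 - u lies in (0, 1], so the log is defined

rng = random.Random(3)
draws = [sample_exponential(2.0, rng) for _ in range(100_000)]
print(sum(draws) / len(draws))  # should be near the theoretical mean 1 / rate = 0.5
```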

Low-discrepancy sequences are often used instead of random sampling from a space as they ensure even coverage and normally have a faster order of convergence than Monte Carlo simulations using random or pseudorandom sequences. Methods based on their use are called quasi-Monte Carlo methods.
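As an illustration, the sketch below builds a two-dimensional Halton sequence from van der Corput sequences in bases 2 and 3 and uses it in place of random points for the quadrant estimate of π. The Halton construction is one well-known low-discrepancy sequence chosen for this example; the function names are assumptions.

```python
def van_der_corput(index, base=2):
    """Return element number `index` (starting at 1) of the van der Corput sequence."""
    value, denominator = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)   # peel off the lowest digit in this base
        denominator *= base
        value += digit / denominator         # mirror the digit about the radix point
    return value

def halton_pi_estimate(n_points):
    """Quasi-Monte Carlo estimate of pi using Halton points (bases 2 and 3)."""
    inside = 0
    for i in range(1, n_points + 1):
        x, y = van_der_corput(i, 2), van_der_corput(i, 3)
        if x * x + y * y < 1.0:
            inside += 1
    return 4.0 * inside / n_points

print([van_der_corput(i) for i in range(1, 4)])  # 0.5, 0.25, 0.75
print(halton_pi_estimate(10_000))
```

The sequence is fully deterministic, so repeated runs give identical estimates; the even coverage comes from the construction rather than from chance.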

In an effort to assess the impact of random number quality on Monte Carlo simulation outcomes, astrophysical researchers tested cryptographically-secure pseudorandom numbers generated via Intel's RDRAND instruction set, as compared to those derived from algorithms, like the Mersenne Twister, in Monte Carlo simulations of radio flares from brown dwarfs. RDRAND is the closest pseudorandom number generator to a true random number generator. No statistically significant difference was found between models generated with typical pseudorandom number generators and RDRAND for trials consisting of the generation of 10^7 random numbers.

Monte Carlo simulation versus "what if" scenarios

There are ways of using probabilities that are definitely not Monte Carlo simulations – for example, deterministic modeling using single-point estimates. Each uncertain variable within a model is assigned a "best guess" estimate. Scenarios (such as best, worst, or most likely case) for each input variable are chosen and the results recorded.

By contrast, Monte Carlo simulations sample from a probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring. For example, a comparison of a spreadsheet cost construction model run using traditional "what if" scenarios, and then running the comparison again with Monte Carlo simulation and triangular probability distributions shows that the Monte Carlo analysis has a narrower range than the "what if" analysis. This is because the "what if" analysis gives equal weight to all scenarios (see quantifying uncertainty in corporate finance), while the Monte Carlo method hardly samples in the very low probability regions. The samples in such regions are called "rare events".
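The contrast can be sketched with a toy cost model. The three line items and their (low, most likely, high) figures are hypothetical values invented for this example, as are the function names; only the use of triangular distributions comes from the text above.

```python
import random

# Hypothetical (low, most likely, high) cost estimates for three line items
LINE_ITEMS = [(80, 100, 150), (40, 50, 90), (20, 30, 45)]

def what_if_scenarios(items):
    """Return (best, most likely, worst) totals from single-point estimates."""
    best = sum(low for low, _, _ in items)
    likely = sum(mode for _, mode, _ in items)
    worst = sum(high for _, _, high in items)
    return best, likely, worst

def monte_carlo_totals(items, n_trials, rng):
    """Return sorted total costs, drawing each item from a triangular distribution."""
    totals = []
    for _ in range(n_trials):
        # random.triangular takes its arguments in the order (low, high, mode)
        totals.append(sum(rng.triangular(low, high, mode) for low, mode, high in items))
    return sorted(totals)

def percentile(sorted_values, q):
    """Simple percentile of an already sorted list, with q in [0, 1]."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

best, likely, worst = what_if_scenarios(LINE_ITEMS)
totals = monte_carlo_totals(LINE_ITEMS, 20_000, random.Random(0))
print("what-if range:", best, "to", worst)
print("Monte Carlo 5th to 95th percentile:", percentile(totals, 0.05), "to", percentile(totals, 0.95))
```

The all-best and all-worst totals require every item to hit its extreme at once, which the sampled totals almost never do, so the 5th to 95th percentile band comes out narrower than the what-if range.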

Applications

Monte Carlo methods are especially useful for simulating phenomena with significant uncertainty in inputs and systems with many coupled degrees of freedom. Areas of application include:

Physical sciences

Monte Carlo methods are very important in computational physics, physical chemistry, and related applied fields, and have diverse applications from complicated quantum chromodynamics calculations to designing heat shields and aerodynamic forms as well as in modeling radiation transport for radiation dosimetry calculations. In statistical physics Monte Carlo molecular modeling is an alternative to computational molecular dynamics, and Monte Carlo methods are used to compute statistical field theories of simple particle and polymer systems. Quantum Monte Carlo methods solve the many-body problem for quantum systems. In radiation materials science, the binary collision approximation for simulating ion implantation is usually based on a Monte Carlo approach to select the next colliding atom. In experimental particle physics, Monte Carlo methods are used for designing detectors, understanding their behavior and comparing experimental data to theory. In astrophysics, they are used in such diverse manners as to model both galaxy evolution and microwave radiation transmission through a rough planetary surface. Monte Carlo methods are also used in the ensemble models that form the basis of modern weather forecasting.

Engineering

Monte Carlo methods are widely used in engineering for sensitivity analysis and quantitative probabilistic analysis in process design. The need arises from the interactive, co-linear and non-linear behavior of typical process simulations. For example,

Climate change and radiative forcing

The Intergovernmental Panel on Climate Change relies on Monte Carlo methods in probability density function analysis of radiative forcing.

Probability density function (PDF) of ERF due to total GHG, aerosol forcing and total anthropogenic forcing. The GHG consists of WMGHG, ozone and stratospheric water vapour. The PDFs are generated based on uncertainties provided in Table 8.6. The combination of the individual RF agents to derive total forcing over the Industrial Era are done by Monte Carlo simulations and based on the method in Boucher and Haywood (2001). PDF of the ERF from surface albedo changes and combined contrails and contrail-induced cirrus are included in the total anthropogenic forcing, but not shown as a separate PDF. We currently do not have ERF estimates for some forcing mechanisms: ozone, land use, solar, etc.

Computational biology

Monte Carlo methods are used in various fields of computational biology, for example for Bayesian inference in phylogeny, or for studying biological systems such as genomes, proteins, or membranes. The systems can be studied in the coarse-grained or ab initio frameworks depending on the desired accuracy. Computer simulations allow us to monitor the local environment of a particular molecule to see if some chemical reaction is happening for instance. In cases where it is not feasible to conduct a physical experiment, thought experiments can be conducted (for instance: breaking bonds, introducing impurities at specific sites, changing the local/global structure, or introducing external fields).

Computer graphics

Path tracing, occasionally referred to as Monte Carlo ray tracing, renders a 3D scene by randomly tracing samples of possible light paths. Repeated sampling of any given pixel will eventually cause the average of the samples to converge on the correct solution of the rendering equation, making it one of the most physically accurate 3D graphics rendering methods in existence.

Applied statistics

The standards for Monte Carlo experiments in statistics were set by Sawilowsky. In applied statistics, Monte Carlo methods may be used for at least four purposes:

  1. To compare competing statistics for small samples under realistic data conditions. Although type I error and power properties of statistics can be calculated for data drawn from classical theoretical distributions (e.g., normal curve, Cauchy distribution) for asymptotic conditions (i.e., infinite sample size and infinitesimally small treatment effect), real data often do not have such distributions.
  2. To provide implementations of hypothesis tests that are more efficient than exact tests such as permutation tests (which are often impossible to compute) while being more accurate than critical values for asymptotic distributions.
  3. To provide a random sample from the posterior distribution in Bayesian inference. This sample then approximates and summarizes all the essential features of the posterior.
  4. To provide efficient random estimates of the Hessian matrix of the negative log-likelihood function that may be averaged to form an estimate of the Fisher information matrix.

Monte Carlo methods are also a compromise between approximate randomization and permutation tests. An approximate randomization test is based on a specified subset of all permutations (which entails potentially enormous housekeeping of which permutations have been considered). The Monte Carlo approach is based on a specified number of randomly drawn permutations (exchanging a minor loss in precision if a permutation is drawn twice—or more frequently—for the efficiency of not having to track which permutations have already been selected).
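A Monte Carlo permutation test along these lines might look like the sketch below. The test statistic (absolute difference of group means), the add-one correction in the p-value, the function names, and the sample data are all assumptions chosen for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def monte_carlo_permutation_test(group_a, group_b, n_permutations, rng):
    """Approximate a two-sided permutation p-value from randomly drawn permutations."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)   # a random relabeling; repeated permutations are not tracked
        diff = abs(mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):]))
        if diff >= observed:
            hits += 1
    # Counting the observed labeling itself keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical measurements for two groups
a = [5.1, 4.9, 5.3, 5.0, 5.2, 5.4]
b = [6.0, 6.2, 5.9, 6.1, 6.3, 5.8]
print(monte_carlo_permutation_test(a, b, 2_000, random.Random(0)))
```

No bookkeeping of already-visited permutations is needed, which is the efficiency trade-off described above.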

Artificial intelligence for games

Monte Carlo methods have been developed into a technique called Monte-Carlo tree search that is useful for searching for the best move in a game. Possible moves are organized in a search tree and many random simulations are used to estimate the long-term potential of each move. A black box simulator represents the opponent's moves.

The Monte Carlo tree search (MCTS) method has four steps:

  1. Starting at the root node of the tree, select optimal child nodes until a leaf node is reached.
  2. Expand the leaf node and choose one of its children.
  3. Play a simulated game starting with that node.
  4. Use the results of that simulated game to update the node and its ancestors.

The net effect, over the course of many simulated games, is that the value of a node representing a move will go up or down, hopefully corresponding to whether or not that node represents a good move.
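The four steps can be sketched for a toy game. Everything specific here is an assumption made for illustration: the game (a Nim variant where players alternately remove 1 to 3 stones and whoever takes the last stone wins), the UCB1 rule used to pick the "optimal" child during selection, uniformly random playouts, and all class and function names.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player about to move
        self.parent = parent
        self.move = move          # the move that led to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins credited to the player who made `move`

def ucb1(child, parent_visits, c=1.4):
    """Assumed selection rule: average win rate plus an exploration bonus."""
    return child.wins / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(stones, rng):
    """Random playout; True if the player to move at `stones` takes the last stone."""
    to_move_wins, turn = False, 0   # with no stones left, the player to move has lost
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            to_move_wins = (turn == 0)
        turn ^= 1
    return to_move_wins

def mcts_best_move(stones, iterations, rng):
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and not terminal
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one child for an untried move
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: the player who moved into `node` wins if the next player loses
        mover_wins = not rollout(node.stones, rng)
        # 4. Backpropagation: update the node and its ancestors, alternating perspective
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts_best_move(5, 5_000, random.Random(0)))
```

In this toy game, leaving the opponent a multiple of 4 stones is the winning strategy, so from 5 stones the search should settle on removing 1 as the visit counts accumulate.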

Monte Carlo Tree Search has been used successfully to play games such as Go, Tantrix, Battleship, Havannah, and Arimaa.

Design and visuals

Monte Carlo methods are also efficient in solving coupled integral differential equations of radiation fields and energy transport, and thus these methods have been used in global illumination computations that produce photo-realistic images of virtual 3D models, with applications in video games, architecture, design, computer generated films, and cinematic special effects.

Search and rescue

The US Coast Guard utilizes Monte Carlo methods within its computer modeling software SAROPS in order to calculate the probable locations of vessels during search and rescue operations. Each simulation can generate as many as ten thousand data points that are randomly distributed based upon provided variables. Search patterns are then generated based upon extrapolations of these data in order to optimize the probability of containment (POC) and the probability of detection (POD), which together will equal an overall probability of success (POS). Ultimately this serves as a practical application of probability distribution in order to provide the swiftest and most expedient method of rescue, saving both lives and resources.

Finance and business

Monte Carlo simulation is commonly used to evaluate the risk and uncertainty that would affect the outcome of different decision options. Monte Carlo simulation allows the business risk analyst to incorporate the total effects of uncertainty in variables like sales volume, commodity and labour prices, interest and exchange rates, as well as the effect of distinct risk events like the cancellation of a contract or the change of a tax law.

Monte Carlo methods in finance are often used to evaluate investments in projects at a business unit or corporate level, or other financial valuations. They can be used to model project schedules, where simulations aggregate estimates for worst-case, best-case, and most likely durations for each task to determine outcomes for the overall project. Monte Carlo methods are also used in option pricing and default risk analysis. Additionally, they can be used to estimate the financial impact of medical interventions.

Law

A Monte Carlo approach was used for evaluating the potential value of a proposed program to help female petitioners in Wisconsin be successful in their applications for harassment and domestic abuse restraining orders. It was proposed to help women succeed in their petitions by providing them with greater advocacy thereby potentially reducing the risk of rape and physical assault. However, there were many variables in play that could not be estimated perfectly, including the effectiveness of restraining orders, the success rate of petitioners both with and without advocacy, and many others. The study ran trials that varied these variables to come up with an overall estimate of the success level of the proposed program as a whole.

Use in mathematics

In general, the Monte Carlo methods are used in mathematics to solve various problems by generating suitable random numbers (see also Random number generation) and observing that fraction of the numbers that obeys some property or properties. The method is useful for obtaining numerical solutions to problems too complicated to solve analytically. The most common application of the Monte Carlo method is Monte Carlo integration.

Integration

Monte-Carlo integration works by comparing random points with the value of the function.

Errors reduce by a factor of 1/√N.

Deterministic numerical integration algorithms work well in a small number of dimensions, but encounter two problems when the functions have many variables. First, the number of function evaluations needed increases rapidly with the number of dimensions. For example, if 10 evaluations provide adequate accuracy in one dimension, then 10^100 points are needed for 100 dimensions, which is far too many to be computed. This is called the curse of dimensionality. Second, the boundary of a multidimensional region may be very complicated, so it may not be feasible to reduce the problem to an iterated integral. 100 dimensions is by no means unusual, since in many physical problems, a "dimension" is equivalent to a degree of freedom.

Monte Carlo methods provide a way out of this exponential increase in computation time. As long as the function in question is reasonably well-behaved, it can be estimated by randomly selecting points in 100-dimensional space, and taking some kind of average of the function values at these points. By the central limit theorem, this method displays 1/√N convergence, i.e., quadrupling the number of sampled points halves the error, regardless of the number of dimensions.
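The halving of the error when the sample count is quadrupled can be checked empirically. The sketch below is an illustration under assumptions: the integrand (the sum of squared coordinates over the unit hypercube, whose exact integral is dim/3), the dimension, and the function names are all chosen for this example.

```python
import math
import random

def mc_integrate(f, dim, n_points, rng):
    """Average f over uniform points in the unit hypercube (which has volume 1)."""
    total = 0.0
    for _ in range(n_points):
        total += f([rng.random() for _ in range(dim)])
    return total / n_points

def rms_error(f, exact, dim, n_points, n_trials, rng):
    """Root-mean-square error of the estimate over repeated independent runs."""
    squared = sum((mc_integrate(f, dim, n_points, rng) - exact) ** 2 for _ in range(n_trials))
    return math.sqrt(squared / n_trials)

def sum_of_squares(x):
    return sum(v * v for v in x)

rng = random.Random(0)
dim = 5
exact = dim / 3.0
error_n = rms_error(sum_of_squares, exact, dim, 250, 200, rng)
error_4n = rms_error(sum_of_squares, exact, dim, 1_000, 200, rng)
print(error_n / error_4n)  # expected to be roughly 2
```

Changing `dim` should leave the ratio near 2, which is the dimension-independence the paragraph describes.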

A refinement of this method, known as importance sampling in statistics, involves sampling the points randomly, but more frequently where the integrand is large. To do this precisely one would have to already know the integral, but one can approximate the integral by an integral of a similar function or use adaptive routines such as stratified sampling, recursive stratified sampling, adaptive umbrella sampling or the VEGAS algorithm.
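A minimal importance-sampling sketch follows. The integrand f(x) = 5x^4 on [0, 1] (exact integral 1), the proposal density p(x) = 3x^2, and the function names are assumptions picked so that the proposal roughly follows the shape of the integrand and can be sampled by inverting its CDF x^3.

```python
import random

def integrand(x):
    return 5.0 * x ** 4   # integrates to exactly 1 over [0, 1]

def plain_estimate(f, n_points, rng):
    """Ordinary Monte Carlo: uniform points, simple average."""
    return sum(f(rng.random()) for _ in range(n_points)) / n_points

def importance_estimate(f, n_points, rng):
    """Sample from p(x) = 3x^2 and average the weighted values f(x) / p(x)."""
    total = 0.0
    for _ in range(n_points):
        u = 1.0 - rng.random()        # uniform on (0, 1], avoids x == 0
        x = u ** (1.0 / 3.0)          # inverse of the CDF x^3
        total += f(x) / (3.0 * x * x)
    return total / n_points

print(plain_estimate(integrand, 50_000, random.Random(0)))
print(importance_estimate(integrand, 50_000, random.Random(0)))
```

Both estimators target the same integral; for this pair of functions the weighted values f(x)/p(x) vary much less than f(x) itself, so the importance-sampled estimate fluctuates less from run to run.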

A similar approach, the quasi-Monte Carlo method, uses low-discrepancy sequences. These sequences "fill" the area better and sample the most important points more frequently, so quasi-Monte Carlo methods can often converge on the integral more quickly.

Another class of methods for sampling points in a volume is to simulate random walks over it (Markov chain Monte Carlo). Such methods include the Metropolis–Hastings algorithm, Gibbs sampling, Wang and Landau algorithm, and interacting type MCMC methodologies such as the sequential Monte Carlo samplers.
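As an example of the random-walk idea, here is a minimal random-walk Metropolis sampler (a special case of Metropolis–Hastings with a symmetric Gaussian proposal). The target (a standard normal, given through its unnormalized log density), the step size, and the function name are assumptions for this sketch.

```python
import math
import random

def metropolis_hastings(log_density, x0, n_steps, step_size, rng):
    """Random-walk Metropolis: returns the chain of visited states."""
    samples = []
    x, log_p = x0, log_density(x0)
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)   # symmetric proposal
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)), compared in log space
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)                          # a rejected step repeats the current state
    return samples

chain = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 50_000, 2.0, random.Random(0))
kept = chain[1_000:]                               # discard an initial burn-in stretch
chain_mean = sum(kept) / len(kept)
chain_var = sum((s - chain_mean) ** 2 for s in kept) / len(kept)
print(chain_mean, chain_var)                       # should be near 0 and 1
```

Only ratios of the target density are needed, so the normalizing constant never has to be computed.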

Simulation and optimization

Another powerful and very popular application for random numbers in numerical simulation is in numerical optimization. The problem is to minimize (or maximize) functions of some vector that often has many dimensions. Many problems can be phrased in this way: for example, a computer chess program could be seen as trying to find the set of, say, 10 moves that produces the best evaluation function at the end. In the traveling salesman problem the goal is to minimize distance traveled. There are also applications to engineering design, such as multidisciplinary design optimization. The approach has also been applied with quasi-one-dimensional models to solve particle dynamics problems by efficiently exploring large configuration spaces. Comprehensive reviews of many issues related to simulation and optimization are available in the literature.

The traveling salesman problem is what is called a conventional optimization problem. That is, all the facts (distances between each destination point) needed to determine the optimal path to follow are known with certainty and the goal is to run through the possible travel choices to come up with the one with the lowest total distance. However, suppose that instead of minimizing the total distance traveled to visit each desired destination, we wanted to minimize the total time needed to reach each destination. This goes beyond conventional optimization since travel time is inherently uncertain (traffic jams, time of day, etc.). As a result, to determine our optimal path we would want to use simulation-optimization: first simulate to understand the range of potential times it could take to go from one point to another (represented by a probability distribution in this case rather than a specific distance), and then optimize our travel decisions to identify the best path to follow taking that uncertainty into account.
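The two-step idea can be sketched with a toy example. The routes, base times, jam probabilities, and delay model below are entirely hypothetical; the point is only that the route with the shortest nominal time need not be the one with the lowest simulated expected time.

```python
import random

# Hypothetical data: each leg is (base_minutes, jam_probability, mean_jam_delay).
ROUTES = {
    "highway": [(20.0, 0.5, 30.0)],
    "back roads": [(18.0, 0.05, 10.0), (12.0, 0.05, 10.0)],
}


def simulate_trip(legs, rng):
    """Draw one random total travel time for a route."""
    total = 0.0
    for base, jam_probability, mean_delay in legs:
        total += base
        if rng.random() < jam_probability:
            # Assumed delay model: exponentially distributed jam delay.
            total += rng.expovariate(1.0 / mean_delay)
    return total


def expected_time(legs, n_trips, rng):
    """Simulation step: estimate the mean travel time of a route."""
    return sum(simulate_trip(legs, rng) for _ in range(n_trips)) / n_trips


def best_route(routes, n_trips=5000, seed=0):
    """Optimization step: pick the route with the lowest simulated mean."""
    rng = random.Random(seed)
    estimates = {name: expected_time(legs, n_trips, rng)
                 for name, legs in routes.items()}
    return min(estimates, key=estimates.get), estimates


def nominal_best(routes):
    """Conventional choice that ignores uncertainty: lowest base time."""
    return min(routes, key=lambda name: sum(leg[0] for leg in routes[name]))


if __name__ == "__main__":
    print("ignoring uncertainty:", nominal_best(ROUTES))
    print("with simulation:     ", best_route(ROUTES)[0])
```

A real application would replace the mean with whatever criterion matters (for instance a high percentile of travel time) and search over many more candidate paths.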

Inverse problems

Probabilistic formulation of inverse problems leads to the definition of a probability distribution in the model space. This probability distribution combines prior information with new information obtained by measuring some observable parameters (data). As, in the general case, the theory linking data with model parameters is nonlinear, the posterior probability in the model space may not be easy to describe (it may be multimodal, some moments may not be defined, etc.).
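In symbols, the combination of prior and data information can be sketched as below. The notation is an assumption of this example rather than something fixed by the text: m denotes the model parameters, ρ(m) the prior density, L(m) the likelihood of the observed data given m, and σ(m) the resulting posterior density.

```latex
\sigma(\mathbf{m}) = k \, \rho(\mathbf{m}) \, L(\mathbf{m}),
\qquad
k^{-1} = \int \rho(\mathbf{m}) \, L(\mathbf{m}) \, d\mathbf{m}
```

When the relation between data and model parameters is nonlinear, L(m) can have several separated peaks, which is how a multimodal posterior arises.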

When analyzing an inverse problem, obtaining a maximum likelihood model is usually not sufficient, as we normally also wish to have information on the resolution power of the data. In the general case we may have many model parameters, and an inspection of the marginal probability densities of interest may be impractical, or even useless. But it is possible to pseudorandomly generate a large collection of models according to the posterior probability distribution and to analyze and display the models in such a way that information on the relative likelihoods of model properties is conveyed to the spectator. This can be accomplished by means of an efficient Monte Carlo method, even in cases where no explicit formula for the a priori distribution is available.

The best-known importance sampling method, the Metropolis algorithm, can be generalized, and this gives a method that allows analysis of (possibly highly nonlinear) inverse problems with complex a priori information and data with an arbitrary noise distribution.

Philosophy

A popular exposition of the Monte Carlo method was given by McCracken. The method's general philosophy was discussed by Elishakoff and by Grüne-Yanoff and Weirich.

 

Statistical mechanics

In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic behavior of nature from the behavior of such ensembles.

Statistical mechanics arose out of the development of classical thermodynamics, a field for which it was successful in explaining macroscopic physical properties—such as temperature, pressure, and heat capacity—in terms of microscopic parameters that fluctuate about average values and are characterized by probability distributions. This established the fields of statistical thermodynamics and statistical physics.

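The founding of the field of statistical mechanics is generally credited to three physicists:

  • Ludwig Boltzmann, who developed the fundamental interpretation of entropy in terms of a collection of microstates
  • James Clerk Maxwell, who developed models of the probability distribution of such states
  • Josiah Willard Gibbs, who coined the name of the field in 1884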

While classical thermodynamics is primarily concerned with thermodynamic equilibrium, statistical mechanics has been applied in non-equilibrium statistical mechanics to the issues of microscopically modeling the speed of irreversible processes that are driven by imbalances. Examples of such processes include chemical reactions and flows of particles and heat. The fluctuation–dissipation theorem is the basic knowledge obtained from applying non-equilibrium statistical mechanics to study the simplest non-equilibrium situation of a steady state current flow in a system of many particles.

Principles: mechanics and ensembles

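In physics, two types of mechanics are usually examined: classical mechanics and quantum mechanics. For both types of mechanics, the standard mathematical approach is to consider two concepts:

  • the complete state of the mechanical system at a given time, mathematically encoded as a phase point (classical mechanics) or a pure quantum state vector (quantum mechanics);
  • an equation of motion which carries the state forward in time: Hamilton's equations (classical mechanics) or the Schrödinger equation (quantum mechanics).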

Using these two concepts, the state at any other time, past or future, can in principle be calculated. There is however a disconnection between these laws and everyday life experiences, as we do not find it necessary (nor even theoretically possible) to know exactly at a microscopic level the simultaneous positions and velocities of each molecule while carrying out processes at the human scale (for example, when performing a chemical reaction). Statistical mechanics fills this disconnection between the laws of mechanics and the practical experience of incomplete knowledge, by adding some uncertainty about which state the system is in.

Whereas ordinary mechanics only considers the behaviour of a single state, statistical mechanics introduces the statistical ensemble, which is a large collection of virtual, independent copies of the system in various states. The statistical ensemble is a probability distribution over all possible states of the system. In classical statistical mechanics, the ensemble is a probability distribution over phase points (as opposed to a single phase point in ordinary mechanics), usually represented as a distribution in a phase space with canonical coordinates. In quantum statistical mechanics, the ensemble is a probability distribution over pure states, and can be compactly summarized as a density matrix.

As is usual for probabilities, the ensemble can be interpreted in different ways:

  • an ensemble can be taken to represent the various possible states that a single system could be in (epistemic probability, a form of knowledge), or
  • the members of the ensemble can be understood as the states of the systems in experiments repeated on independent systems which have been prepared in a similar but imperfectly controlled manner (empirical probability), in the limit of an infinite number of trials.

These two meanings are equivalent for many purposes, and will be used interchangeably in this article.

However the probability is interpreted, each state in the ensemble evolves over time according to the equation of motion. Thus, the ensemble itself (the probability distribution over states) also evolves, as the virtual systems in the ensemble continually leave one state and enter another. The ensemble evolution is given by the Liouville equation (classical mechanics) or the von Neumann equation (quantum mechanics). These equations are simply derived by the application of the mechanical equation of motion separately to each virtual system contained in the ensemble, with the probability of the virtual system being conserved over time as it evolves from state to state.
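For reference, the two evolution equations can be written in their standard forms. The symbols are assumptions of this sketch: ρ is the ensemble density in phase space (or the density matrix in the quantum case), H is the Hamiltonian, braces denote the Poisson bracket, and square brackets the commutator.

```latex
\begin{align}
  \frac{\partial \rho}{\partial t} &= -\{\rho, H\}
    && \text{(Liouville equation, classical)} \\
  i\hbar \, \frac{\partial \hat{\rho}}{\partial t} &= [\hat{H}, \hat{\rho}]
    && \text{(von Neumann equation, quantum)}
\end{align}
```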

One special class of ensemble is those ensembles that do not evolve over time. These ensembles are known as equilibrium ensembles and their condition is known as statistical equilibrium. Statistical equilibrium occurs if, for each state in the ensemble, the ensemble also contains all of its future and past states with probabilities equal to the probability of being in that state. The study of equilibrium ensembles of isolated systems is the focus of statistical thermodynamics. Non-equilibrium statistical mechanics addresses the more general case of ensembles that change over time, and/or ensembles of non-isolated systems.

Statistical thermodynamics

The primary goal of statistical thermodynamics (also known as equilibrium statistical mechanics) is to derive the classical thermodynamics of materials in terms of the properties of their constituent particles and the interactions between them. In other words, statistical thermodynamics provides a connection between the macroscopic properties of materials in thermodynamic equilibrium, and the microscopic behaviours and motions occurring inside the material.

Whereas statistical mechanics proper involves dynamics, here the attention is focussed on statistical equilibrium (steady state). Statistical equilibrium does not mean that the particles have stopped moving (mechanical equilibrium); rather, it means only that the ensemble is not evolving.

Fundamental postulate

A sufficient (but not necessary) condition for statistical equilibrium with an isolated system is that the probability distribution is a function only of conserved properties (total energy, total particle numbers, etc.). There are many different equilibrium ensembles that can be considered, and only some of them correspond to thermodynamics. Additional postulates are necessary to motivate why the ensemble for a given system should have one form or another.

A common approach found in many textbooks is to take the equal a priori probability postulate. This postulate states that

For an isolated system with an exactly known energy and exactly known composition, the system can be found with equal probability in any microstate consistent with that knowledge.

The equal a priori probability postulate therefore provides a motivation for the microcanonical ensemble described below. There are various arguments in favour of the equal a priori probability postulate:

  • Ergodic hypothesis: An ergodic system is one that evolves over time to explore "all accessible" states: all those with the same energy and composition. In an ergodic system, the microcanonical ensemble is the only possible equilibrium ensemble with fixed energy. This approach has limited applicability, since most systems are not ergodic.
  • Principle of indifference: In the absence of any further information, we can only assign equal probabilities to each compatible situation.
  • Maximum information entropy: A more elaborate version of the principle of indifference states that the correct ensemble is the ensemble that is compatible with the known information and that has the largest Gibbs entropy (information entropy).

Other fundamental postulates for statistical mechanics have also been proposed.

Three thermodynamic ensembles

There are three equilibrium ensembles with a simple form that can be defined for any isolated system bounded inside a finite volume. These are the most often discussed ensembles in statistical thermodynamics. In the macroscopic limit (defined below) they all correspond to classical thermodynamics.

Microcanonical ensemble
describes a system with a precisely given energy and fixed composition (precise number of particles). The microcanonical ensemble contains with equal probability each possible state that is consistent with that energy and composition.
Canonical ensemble
describes a system of fixed composition that is in thermal equilibrium with a heat bath of a precise temperature. The canonical ensemble contains states of varying energy but identical composition; the different states in the ensemble are accorded different probabilities depending on their total energy.
Grand canonical ensemble
describes a system with non-fixed composition (uncertain particle numbers) that is in thermal and chemical equilibrium with a thermodynamic reservoir. The reservoir has a precise temperature, and precise chemical potentials for various types of particle. The grand canonical ensemble contains states of varying energy and varying numbers of particles; the different states in the ensemble are accorded different probabilities depending on their total energy and total particle numbers.
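As a minimal sketch of how the three ensembles assign probabilities to a finite list of states, the code below uses the standard Boltzmann weights exp(-E/kT) and exp(-(E - μN)/kT). These explicit weights are not spelled out in the descriptions above and are supplied here as the usual textbook forms; the function names and the state lists are hypothetical.

```python
import math


def microcanonical_probabilities(n_states):
    """Equal probability for every state with the given energy and composition."""
    return [1.0 / n_states] * n_states


def canonical_probabilities(energies, kT):
    """Boltzmann weights exp(-E / kT) for states of fixed composition."""
    lowest = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - lowest) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def grand_canonical_probabilities(states, kT, mu):
    """Weights exp(-(E - mu * N) / kT) for states given as (energy, particles)."""
    exponents = [-(e - mu * n) / kT for e, n in states]
    top = max(exponents)  # shift exponents for numerical stability
    weights = [math.exp(x - top) for x in exponents]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(microcanonical_probabilities(4))
    print(canonical_probabilities([0.0, 1.0, 2.0], kT=1.0))
    print(grand_canonical_probabilities([(0.0, 0), (1.0, 1)], kT=1.0, mu=0.5))
```

Energies, kT, and the chemical potential μ must be expressed in the same units.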

For systems containing many particles (the thermodynamic limit), all three of the ensembles listed above tend to give identical behaviour. It is then simply a matter of mathematical convenience which ensemble is used. The Gibbs theorem about equivalence of ensembles was developed into the theory of concentration of measure phenomenon, which has applications in many areas of science, from functional analysis to methods of artificial intelligence and big data technology.

Important cases where the thermodynamic ensembles do not give identical results include:

  • Microscopic systems.
  • Large systems at a phase transition.
  • Large systems with long-range interactions.

In these cases the correct thermodynamic ensemble must be chosen as there are observable differences between these ensembles not just in the size of fluctuations, but also in average quantities such as the distribution of particles. The correct ensemble is that which corresponds to the way the system has been prepared and characterized—in other words, the ensemble that reflects the knowledge about that system.



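Thermodynamic ensembles
                     | Microcanonical | Canonical | Grand canonical
Fixed variables      | E, N, V | T, N, V | T, μ, V
Microscopic features | Number of microstates W | Canonical partition function Z = Σ_k exp(−E_k / k_B T) | Grand partition function Ξ = Σ_k exp(−(E_k − μN_k) / k_B T)
Macroscopic function | Boltzmann entropy S = k_B ln W | Helmholtz free energy F = −k_B T ln Z | Grand potential Ω = −k_B T ln Ξ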

Calculation methods

Once the characteristic state function for an ensemble has been calculated for a given system, that system is 'solved' (macroscopic observables can be extracted from the characteristic state function). Calculating the characteristic state function of a thermodynamic ensemble is not necessarily a simple task, however, since it involves considering every possible state of the system. While some hypothetical systems have been exactly solved, the most general (and realistic) case is too complex for an exact solution. Various approaches exist to approximate the true ensemble and allow calculation of average quantities.

Exact

There are some cases which allow exact solutions.

  • For very small microscopic systems, the ensembles can be directly computed by simply enumerating over all possible states of the system (using exact diagonalization in quantum mechanics, or integral over all phase space in classical mechanics).
  • Some large systems consist of many separable microscopic systems, and each of the subsystems can be analysed independently. Notably, idealized gases of non-interacting particles have this property, allowing exact derivations of Maxwell–Boltzmann statistics, Fermi–Dirac statistics, and Bose–Einstein statistics.
  • A few large systems with interaction have been solved. By the use of subtle mathematical techniques, exact solutions have been found for a few toy models. Some examples include the Bethe ansatz, square-lattice Ising model in zero field, hard hexagon model.
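The first case above, direct enumeration of all states of a very small system, can be sketched as follows. The system chosen here, a ring of eight Ising spins with nearest-neighbour coupling and no field, is an assumption made for illustration; with β = 1/kT, its partition function has the known closed form (2 cosh βJ)^N + (2 sinh βJ)^N, which gives a check on the enumeration.

```python
import itertools
import math


def ring_energy(spins, coupling=1.0):
    """Energy of an Ising ring: -J times the sum of neighbouring spin products."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def partition_function(n_spins, beta, coupling=1.0):
    """Sum the Boltzmann factor over all 2^n spin configurations."""
    return sum(math.exp(-beta * ring_energy(spins, coupling))
               for spins in itertools.product((-1, 1), repeat=n_spins))


def mean_energy(n_spins, beta, coupling=1.0):
    """Canonical average energy, computed by exact enumeration."""
    z = 0.0
    weighted = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        energy = ring_energy(spins, coupling)
        weight = math.exp(-beta * energy)
        z += weight
        weighted += energy * weight
    return weighted / z


if __name__ == "__main__":
    n, beta = 8, 0.5
    closed_form = (2 * math.cosh(beta)) ** n + (2 * math.sinh(beta)) ** n
    print(partition_function(n, beta), closed_form)
    print(mean_energy(n, beta))
```

The number of configurations doubles with every added spin, which is why this approach is limited to very small systems.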

Monte Carlo

One approximate approach that is particularly well suited to computers is the Monte Carlo method, which examines just a few of the possible states of the system, with the states chosen randomly (with a fair weight). As long as these states form a representative sample of the whole set of states of the system, the approximate characteristic function is obtained. As more and more random samples are included, the errors are reduced to an arbitrarily low level.
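A minimal sketch of this approach, assuming a toy system (a ring of Ising spins with nearest-neighbour coupling) and the single-spin-flip Metropolis rule as the way of choosing states with the appropriate weight: the sampler visits only a small, randomly chosen subset of configurations and averages the energy over them. All names and parameter values are illustrative.

```python
import math
import random


def metropolis_ising_ring(n_spins, beta, n_sweeps, rng, coupling=1.0, burn_in=1000):
    """Estimate the canonical mean energy of an Ising ring by Metropolis sampling."""
    spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
    energy = -coupling * sum(spins[i] * spins[(i + 1) % n_spins]
                             for i in range(n_spins))
    total = 0.0
    for sweep in range(burn_in + n_sweeps):
        for _ in range(n_spins):
            i = rng.randrange(n_spins)
            neighbours = spins[i - 1] + spins[(i + 1) % n_spins]
            # Energy change if spin i is flipped.
            delta = 2.0 * coupling * spins[i] * neighbours
            # Accept downhill moves always, uphill moves with Boltzmann probability.
            if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
                spins[i] = -spins[i]
                energy += delta
        if sweep >= burn_in:
            total += energy  # record one measurement per sweep
    return total / n_sweeps


if __name__ == "__main__":
    print(metropolis_ising_ring(8, 0.5, 20000, random.Random(0)))
```

For a ring this small the result can be compared with an exact sum over all configurations; for large systems no such check exists, and the statistical error is instead estimated from the scatter of the measurements.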

Other

  • For rarefied non-ideal gases, approaches such as the cluster expansion use perturbation theory to include the effect of weak interactions, leading to a virial expansion.
  • For dense fluids, another approximate approach is based on reduced distribution functions, in particular the radial distribution function.
  • Molecular dynamics computer simulations can be used to calculate microcanonical ensemble averages, in ergodic systems. With the inclusion of a connection to a stochastic heat bath, they can also model canonical and grand canonical conditions.
  • Mixed methods involving non-equilibrium statistical mechanical results (see below) may be useful.

Non-equilibrium statistical mechanics

Many physical phenomena involve quasi-thermodynamic processes out of equilibrium, for example:

All of these processes occur over time with characteristic rates. These rates are important in engineering. The field of non-equilibrium statistical mechanics is concerned with understanding these non-equilibrium processes at the microscopic level. (Statistical thermodynamics can only be used to calculate the final result, after the external imbalances have been removed and the ensemble has settled back down to equilibrium.)

In principle, non-equilibrium statistical mechanics could be mathematically exact: ensembles for an isolated system evolve over time according to deterministic equations such as Liouville's equation or its quantum equivalent, the von Neumann equation. These equations are the result of applying the mechanical equations of motion independently to each state in the ensemble. Unfortunately, these ensemble evolution equations inherit much of the complexity of the underlying mechanical motion, and so exact solutions are very difficult to obtain. Moreover, the ensemble evolution equations are fully reversible and do not destroy information (the ensemble's Gibbs entropy is preserved). In order to make headway in modelling irreversible processes, it is necessary to consider additional factors besides probability and reversible mechanics.

Non-equilibrium mechanics is therefore an active area of theoretical research as the range of validity of these additional assumptions continues to be explored. A few approaches are described in the following subsections.

Stochastic methods

One approach to non-equilibrium statistical mechanics is to incorporate stochastic (random) behaviour into the system. Stochastic behaviour destroys information contained in the ensemble. While this is technically inaccurate (aside from hypothetical situations involving black holes, a system cannot in itself cause loss of information), the randomness is added to reflect that information of interest becomes converted over time into subtle correlations within the system, or to correlations between the system and environment. These correlations appear as chaotic or pseudorandom influences on the variables of interest. By replacing these correlations with randomness proper, the calculations can be made much easier.

  • Boltzmann transport equation: An early form of stochastic mechanics appeared even before the term "statistical mechanics" had been coined, in studies of kinetic theory. James Clerk Maxwell had demonstrated that molecular collisions would lead to apparently chaotic motion inside a gas. Ludwig Boltzmann subsequently showed that, by taking this molecular chaos for granted as a complete randomization, the motions of particles in a gas would follow a simple Boltzmann transport equation that would rapidly restore a gas to an equilibrium state (see H-theorem).

    The Boltzmann transport equation and related approaches are important tools in non-equilibrium statistical mechanics due to their extreme simplicity. These approximations work well in systems where the "interesting" information is immediately (after just one collision) scrambled up into subtle correlations, which essentially restricts them to rarefied gases. The Boltzmann transport equation has been found to be very useful in simulations of electron transport in lightly doped semiconductors (in transistors), where the electrons are indeed analogous to a rarefied gas.

    A quantum technique related in theme is the random phase approximation.
  • BBGKY hierarchy: In liquids and dense gases, it is not valid to immediately discard the correlations between particles after one collision. The BBGKY hierarchy (Bogoliubov–Born–Green–Kirkwood–Yvon hierarchy) gives a method for deriving Boltzmann-type equations but also extending them beyond the dilute gas case, to include correlations after a few collisions.
  • Keldysh formalism (a.k.a. NEGF—non-equilibrium Green functions): A quantum approach to including stochastic dynamics is found in the Keldysh formalism. This approach is often used in electronic quantum transport calculations.
  • Stochastic Liouville equation.

Near-equilibrium methods

Another important class of non-equilibrium statistical mechanical models deals with systems that are only very slightly perturbed from equilibrium. With very small perturbations, the response can be analysed in linear response theory. A remarkable result, as formalized by the fluctuation–dissipation theorem, is that the response of a system when near equilibrium is precisely related to the fluctuations that occur when the system is in total equilibrium. Essentially, a system that is slightly away from equilibrium—whether put there by external forces or by fluctuations—relaxes towards equilibrium in the same way, since the system cannot tell the difference or "know" how it came to be away from equilibrium.

This provides an indirect avenue for obtaining numbers such as ohmic conductivity and thermal conductivity by extracting results from equilibrium statistical mechanics. Since equilibrium statistical mechanics is mathematically well defined and (in some cases) more amenable for calculations, the fluctuation–dissipation connection can be a convenient shortcut for calculations in near-equilibrium statistical mechanics.

A few of the theoretical tools used to make this connection include:

Hybrid methods

An advanced approach uses a combination of stochastic methods and linear response theory. As an example, one approach to compute quantum coherence effects (weak localization, conductance fluctuations) in the conductance of an electronic system is the use of the Green–Kubo relations, with the inclusion of stochastic dephasing by interactions between various electrons by use of the Keldysh method.

Applications outside thermodynamics

The ensemble formalism also can be used to analyze general mechanical systems with uncertainty in knowledge about the state of a system. Ensembles are also used in:

History

In 1738, Swiss physicist and mathematician Daniel Bernoulli published Hydrodynamica which laid the basis for the kinetic theory of gases. In this work, Bernoulli posited the argument, still used to this day, that gases consist of great numbers of molecules moving in all directions, that their impact on a surface causes the gas pressure that we feel, and that what we experience as heat is simply the kinetic energy of their motion.

In 1859, after reading a paper on the diffusion of molecules by Rudolf Clausius, Scottish physicist James Clerk Maxwell formulated the Maxwell distribution of molecular velocities, which gave the proportion of molecules having a certain velocity in a specific range. This was the first-ever statistical law in physics. Maxwell also gave the first mechanical argument that molecular collisions entail an equalization of temperatures and hence a tendency towards equilibrium. Five years later, in 1864, Ludwig Boltzmann, a young student in Vienna, came across Maxwell's paper and spent much of his life developing the subject further.

Statistical mechanics was initiated in the 1870s with the work of Boltzmann, much of which was collectively published in his 1896 Lectures on Gas Theory. Boltzmann's original papers on the statistical interpretation of thermodynamics, the H-theorem, transport theory, thermal equilibrium, the equation of state of gases, and similar subjects, occupy about 2,000 pages in the proceedings of the Vienna Academy and other societies. Boltzmann introduced the concept of an equilibrium statistical ensemble and also investigated for the first time non-equilibrium statistical mechanics, with his H-theorem.

The term "statistical mechanics" was coined by the American mathematical physicist J. Willard Gibbs in 1884. "Probabilistic mechanics" might today seem a more appropriate term, but "statistical mechanics" is firmly entrenched. Shortly before his death, Gibbs published in 1902 Elementary Principles in Statistical Mechanics, a book which formalized statistical mechanics as a fully general approach to address all mechanical systems, whether macroscopic or microscopic, gaseous or non-gaseous. Gibbs' methods were initially derived in the framework of classical mechanics; however, they were of such generality that they were found to adapt easily to the later quantum mechanics, and still form the foundation of statistical mechanics to this day.

Classical unified field theories

Since the 19th century, some physicists, notably Albert Einstein, have attempted to develop a single theoretical framework that can account for all the fundamental forces of nature – a unified field theory. Classical unified field theories are attempts to create a unified field theory based on classical physics. In particular, unification of gravitation and electromagnetism was actively pursued by several physicists and mathematicians in the years between the two World Wars. This work spurred the purely mathematical development of differential geometry.

This article describes various attempts at formulating a classical (non-quantum), relativistic unified field theory. For a survey of classical relativistic field theories of gravitation that have been motivated by theoretical concerns other than unification, see Classical theories of gravitation. For a survey of current work toward creating a quantum theory of gravitation, see quantum gravity.

Overview

The early attempts at creating a unified field theory began with the Riemannian geometry of general relativity, and attempted to incorporate electromagnetic fields into a more general geometry, since ordinary Riemannian geometry seemed incapable of expressing the properties of the electromagnetic field. Einstein was not alone in his attempts to unify electromagnetism and gravity; a large number of mathematicians and physicists, including Hermann Weyl, Arthur Eddington, and Theodor Kaluza also attempted to develop approaches that could unify these interactions. These scientists pursued several avenues of generalization, including extending the foundations of geometry and adding an extra spatial dimension.

Early work

The first attempts to provide a unified theory were by G. Mie in 1912 and Ernst Reichenbacher in 1916. However, these theories were unsatisfactory, as they did not incorporate general relativity, which had yet to be formulated. These efforts, along with those of Rudolf Förster, involved making the metric tensor (which had previously been assumed to be symmetric and real-valued) into an asymmetric and/or complex-valued tensor, and they also attempted to create a field theory for matter.

Differential geometry and field theory

From 1918 until 1923, there were three distinct approaches to field theory: the gauge theory of Weyl, Kaluza's five-dimensional theory, and Eddington's development of affine geometry. Einstein corresponded with these researchers, and collaborated with Kaluza, but was not yet fully involved in the unification effort.

Weyl's infinitesimal geometry

In order to include electromagnetism into the geometry of general relativity, Hermann Weyl worked to generalize the Riemannian geometry upon which general relativity is based. His idea was to create a more general infinitesimal geometry. He noted that in addition to a metric field there could be additional degrees of freedom along a path between two points in a manifold, and he tried to exploit this by introducing a basic method for comparison of local size measures along such a path, in terms of a gauge field. This geometry generalized Riemannian geometry in that there was a vector field Q, in addition to the metric g, which together gave rise to both the electromagnetic and gravitational fields. This theory was mathematically sound, albeit complicated, resulting in difficult and high-order field equations. The critical mathematical ingredients in this theory, the Lagrangians and curvature tensor, were worked out by Weyl and colleagues. Then Weyl carried out an extensive correspondence with Einstein and others as to its physical validity, and the theory was ultimately found to be physically unreasonable. However, Weyl's principle of gauge invariance was later applied in a modified form to quantum field theory.

Kaluza's fifth dimension

Kaluza's approach to unification was to embed space-time into a five-dimensional cylindrical world, consisting of four space dimensions and one time dimension. Unlike Weyl's approach, Riemannian geometry was maintained, and the extra dimension allowed for the incorporation of the electromagnetic field vector into the geometry. Despite the relative mathematical elegance of this approach, in collaboration with Einstein and Einstein's aide Grommer it was determined that this theory did not admit a non-singular, static, spherically symmetric solution. This theory did have some influence on Einstein's later work and was further developed later by Klein in an attempt to incorporate relativity into quantum theory, in what is now known as Kaluza–Klein theory.

Eddington's affine geometry

Sir Arthur Stanley Eddington was a noted astronomer who became an enthusiastic and influential promoter of Einstein's general theory of relativity. He was among the first to propose an extension of the gravitational theory based on the affine connection as the fundamental structure field rather than the metric tensor which was the original focus of general relativity. Affine connection is the basis for parallel transport of vectors from one space-time point to another; Eddington assumed the affine connection to be symmetric in its covariant indices, because it seemed plausible that the result of parallel-transporting one infinitesimal vector along another should produce the same result as transporting the second along the first. (Later workers revisited this assumption.)

Eddington emphasized what he considered to be epistemological considerations; for example, he thought that the cosmological constant version of the general-relativistic field equation expressed the property that the universe was "self-gauging". Since the simplest cosmological model (the de Sitter universe) that solves that equation is a spherically symmetric, stationary, closed universe (exhibiting a cosmological red shift, which is more conventionally interpreted as due to expansion), it seemed to explain the overall form of the universe.

Like many other classical unified field theorists, Eddington considered that in the Einstein field equations for general relativity the stress–energy tensor T_μν, which represents matter/energy, was merely provisional, and that in a truly unified theory the source term would automatically arise as some aspect of the free-space field equations. He also shared the hope that an improved fundamental theory would explain why the two elementary particles then known (proton and electron) have quite different masses.

The Dirac equation for the relativistic quantum electron caused Eddington to rethink his previous conviction that fundamental physical theory had to be based on tensors. He subsequently devoted his efforts to the development of a "Fundamental Theory" based largely on algebraic notions (which he called "E-frames"). Unfortunately, his descriptions of this theory were sketchy and difficult to understand, so very few physicists followed up on his work.

Einstein's geometric approaches

When the equivalent of Maxwell's equations for electromagnetism is formulated within the framework of Einstein's theory of general relativity, the electromagnetic field energy (being equivalent to mass as one would expect from Einstein's famous equation E = mc^2) contributes to the stress tensor and thus to the curvature of space-time, which is the general-relativistic representation of the gravitational field; or putting it another way, certain configurations of curved space-time incorporate effects of an electromagnetic field. This suggests that a purely geometric theory ought to treat these two fields as different aspects of the same basic phenomenon. However, ordinary Riemannian geometry is unable to describe the properties of the electromagnetic field as a purely geometric phenomenon.

Einstein tried to form a generalized theory of gravitation that would unify the gravitational and electromagnetic forces (and perhaps others), guided by a belief in a single origin for the entire set of physical laws. These attempts initially concentrated on additional geometric notions such as vierbeins and "distant parallelism", but eventually centered around treating both the metric tensor and the affine connection as fundamental fields. (Because they are not independent, the metric-affine theory was somewhat complicated.) In general relativity, these fields are symmetric (in the matrix sense), but since antisymmetry seemed essential for electromagnetism, the symmetry requirement was relaxed for one or both fields. Einstein's proposed unified-field equations (fundamental laws of physics) were generally derived from a variational principle expressed in terms of the Riemann curvature tensor for the presumed space-time manifold.

In field theories of this kind, particles appear as limited regions in space-time in which the field strength or the energy density is particularly high. Einstein and coworker Leopold Infeld managed to demonstrate that, in Einstein's final theory of the unified field, true singularities of the field did have trajectories resembling point particles. However, singularities are places where the equations break down, and Einstein believed that in an ultimate theory the laws should apply everywhere, with particles being soliton-like solutions to the (highly nonlinear) field equations. Further, the large-scale topology of the universe should impose restrictions on the solutions, such as quantization or discrete symmetries.

The degree of abstraction, combined with a relative lack of good mathematical tools for analyzing nonlinear equation systems, makes it hard to connect such theories with the physical phenomena that they might describe. For example, it has been suggested that the torsion (antisymmetric part of the affine connection) might be related to isospin rather than electromagnetism; this is related to a discrete (or "internal") symmetry known to Einstein as "displacement field duality".

Einstein became increasingly isolated in his research on a generalized theory of gravitation, and most physicists consider his attempts ultimately unsuccessful. In particular, his pursuit of a unification of the fundamental forces ignored developments in quantum physics (and vice versa), most notably the discovery of the strong nuclear force and weak nuclear force.

Schrödinger's pure-affine theory

Inspired by Einstein's approach to a unified field theory and Eddington's idea of the affine connection as the sole basis for differential geometric structure for space-time, Erwin Schrödinger from 1940 to 1951 thoroughly investigated pure-affine formulations of generalized gravitational theory. Although he initially assumed a symmetric affine connection, like Einstein he later considered the nonsymmetric field.

Schrödinger's most striking discovery during this work was that the metric tensor was induced upon the manifold via a simple construction from the Riemann curvature tensor, which was in turn formed entirely from the affine connection. Further, taking this approach with the simplest feasible basis for the variational principle resulted in a field equation having the form of Einstein's general-relativistic field equation with a cosmological term arising automatically.
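
In outline, and under the simplifying assumption of a symmetric connection, the construction can be sketched as follows. The Lagrangian density shown, the square root of the determinant of the Ricci tensor, is the choice usually attributed to Eddington and is used here as an illustrative assumption. The symbols R_{(μν)}(Γ), for the symmetric part of the Ricci tensor built from the connection Γ, and Λ are notation introduced here, not taken from the text, and the overall normalization is left unspecified.

```latex
\begin{align}
S[\Gamma] &\propto \int \sqrt{\left|\det R_{(\mu\nu)}(\Gamma)\right|}\; d^{4}x, \\
g_{\mu\nu} &\equiv \frac{1}{\Lambda}\, R_{(\mu\nu)}(\Gamma)
\quad\Longrightarrow\quad
R_{\mu\nu} = \Lambda\, g_{\mu\nu}.
\end{align}
```

The metric is thus defined from the curvature rather than postulated. In four dimensions, R_{μν} = Λ g_{μν} is equivalent to the vacuum Einstein equation with a cosmological term, G_{μν} + Λ g_{μν} = 0, as can be checked by taking the trace (R = 4Λ) and substituting back.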

Skepticism from Einstein and published criticisms from other physicists discouraged Schrödinger, and his work in this area has been largely ignored.

Later work

After the 1930s, progressively fewer scientists worked on classical unification, due to the continued development of quantum-theoretical descriptions of the non-gravitational fundamental forces of nature and the difficulties encountered in developing a quantum theory of gravity. Einstein pressed on with his attempts to theoretically unify gravity and electromagnetism, but he became increasingly isolated in this research, which he pursued until his death. Einstein's celebrity status brought much attention to his final quest, which ultimately saw limited success.

Most physicists, on the other hand, eventually abandoned classical unified theories. Current mainstream research on unified field theories focuses on the problem of creating a quantum theory of gravity and unifying with the other fundamental theories in physics, all of which are quantum field theories. (Some programs, such as string theory, attempt to solve both of these problems at once.) Of the four known fundamental forces, gravity remains the one force for which unification with the others proves problematic.

Although new "classical" unified field theories continue to be proposed from time to time, often involving non-traditional elements such as spinors or relating gravitation to an electromagnetic force, none has yet been generally accepted by physicists.

Entropy (information theory)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Entropy_(information_theory) In info...