Search This Blog

Friday, April 10, 2026

Non-linear least squares

From Wikipedia, the free encyclopedia

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. In economic theory, the non-linear least squares method is applied in (i) the probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, (v) Box–Cox transformed regressors ().

Theory

Consider a set of data points, and a curve (model function) that in addition to the variable also depends on parameters, with It is desired to find the vector of parameters such that the curve fits best the given data in the least squares sense, that is, the sum of squares is minimized, where the residuals (in-sample prediction errors) ri are given by for

The minimum value of S occurs when the gradient is zero. Since the model contains n parameters there are n gradient equations:

In a nonlinear system, the derivatives are functions of both the independent variable and the parameters, so in general these gradient equations do not have a closed solution. Instead, initial values must be chosen for the parameters. Then, the parameters are refined iteratively, that is, the values are obtained by successive approximation,

Here, k is an iteration number and the vector of increments, is known as the shift vector. At each iteration the model is linearized by approximation to a first-order Taylor polynomial expansion about The Jacobian matrix, J, is a function of constants, the independent variable and the parameters, so it changes from one iteration to the next. Thus, in terms of the linearized model, and the residuals are given by

Substituting these expressions into the gradient equations, they become which, on rearrangement, become n simultaneous linear equations, the normal equations

The normal equations are written in matrix notation as

These equations form the basis for the Gauss–Newton algorithm for a non-linear least squares problem.

Note the sign convention in the definition of the Jacobian matrix in terms of the derivatives. Formulas linear in may appear with factor of in other articles or the literature.

Extension by weights

When the observations are not equally reliable, a weighted sum of squares may be minimized,

Each element of the diagonal weight matrix W should, ideally, be equal to the reciprocal of the error variance of the measurement. The normal equations are then, more generally,

Geometrical interpretation

In linear least squares the objective function, S, is a quadratic function of the parameters. When there is only one parameter the graph of S with respect to that parameter will be a parabola. With two or more parameters the contours of S with respect to any pair of parameters will be concentric ellipses (assuming that the normal equations matrix is positive definite). The minimum parameter values are to be found at the centre of the ellipses. The geometry of the general objective function can be described as paraboloid elliptical. In NLLSQ the objective function is quadratic with respect to the parameters only in a region close to its minimum value, where the truncated Taylor series is a good approximation to the model. The more the parameter values differ from their optimal values, the more the contours deviate from elliptical shape. A consequence of this is that initial parameter estimates should be as close as practicable to their (unknown!) optimal values. It also explains how divergence can come about as the Gauss–Newton algorithm is convergent only when the objective function is approximately quadratic in the parameters.

Computation

Initial parameter estimates

Some problems of ill-conditioning and divergence can be corrected by finding initial parameter estimates that are near to the optimal values. A good way to do this is by computer simulation. Both the observed and calculated data are displayed on a screen. The parameters of the model are adjusted by hand until the agreement between observed and calculated data is reasonably good. Although this will be a subjective judgment, it is sufficient to find a good starting point for the non-linear refinement. Initial parameter estimates can be created using transformations or linearizations. Better still evolutionary algorithms such as the Stochastic Funnel Algorithm can lead to the convex basin of attraction that surrounds the optimal parameter estimates. Hybrid algorithms that use randomization and elitism, followed by Newton methods have been shown to be useful and computationally efficient.

Solution

Any method among the ones described below can be applied to find a solution.

Convergence criteria

The common sense criterion for convergence is that the sum of squares does not increase from one iteration to the next. However this criterion is often difficult to implement in practice, for various reasons. A useful convergence criterion is The value 0.0001 is somewhat arbitrary and may need to be changed. In particular it may need to be increased when experimental errors are large. An alternative criterion is

Again, the numerical value is somewhat arbitrary; 0.001 is equivalent to specifying that each parameter should be refined to 0.1% precision. This is reasonable when it is less than the largest relative standard deviation on the parameters.

Calculation of the Jacobian by numerical approximation

There are models for which it is either very difficult or even impossible to derive analytical expressions for the elements of the Jacobian. Then, the numerical approximation is obtained by calculation of for and . The increment,, size should be chosen so the numerical derivative is not subject to approximation error by being too large, or round-off error by being too small.

Parameter errors, confidence limits, residuals etc.

Some information is given in the corresponding section on the Weighted least squares page.

Multiple minima

Multiple minima can occur in a variety of circumstances some of which are:

  • A parameter is raised to a power of two or more. For example, when fitting data to a Lorentzian curve where is the height, is the position and is the half-width at half height, there are two solutions for the half-width, and which give the same optimal value for the objective function.
  • Two parameters can be interchanged without changing the value of the model. A simple example is when the model contains the product of two parameters, since will give the same value as .
  • A parameter is in a trigonometric function, such as , which has identical values at . See Levenberg–Marquardt algorithm for an example.

Not all multiple minima have equal values of the objective function. False minima, also known as local minima, occur when the objective function value is greater than its value at the so-called global minimum. To be certain that the minimum found is the global minimum, the refinement should be started with widely differing initial values of the parameters. When the same minimum is found regardless of starting point, it is likely to be the global minimum.

When multiple minima exist there is an important consequence: the objective function will have a stationary point (e.g. a maximum or a saddle point) somewhere between two minima. The normal equations matrix is not positive definite at a stationary point in the objective function, because the gradient vanishes and no unique direction of descent exists. Refinement from a point (a set of parameter values) close to a stationary point will be ill-conditioned and should be avoided as a starting point. For example, when fitting a Lorentzian the normal equations matrix is not positive definite when the half-width of the Lorentzian is zero.

Transformation to a linear model

A non-linear model can sometimes be transformed into a linear one. Such an approximation is, for instance, often applicable in the vicinity of the best estimator, and it is one of the basic assumption in most iterative minimization algorithms. When a linear approximation is valid, the model can directly be used for inference with a generalized least squares, where the equations of the Linear Template Fit apply.

Another example of a linear approximation would be when the model is a simple exponential function, which can be transformed into a linear model by taking logarithms. Graphically this corresponds to working on a semi-log plot. The sum of squares becomes This procedure should be avoided unless the errors are multiplicative and log-normally distributed because it can give misleading results. This comes from the fact that whatever the experimental errors on y might be, the errors on log y are different. Therefore, when the transformed sum of squares is minimized, different results will be obtained both for the parameter values and their calculated standard deviations. However, with multiplicative errors that are log-normally distributed, this procedure gives unbiased and consistent parameter estimates.

Another example is furnished by Michaelis–Menten kinetics, used to determine two parameters and : The Lineweaver–Burk plot of against is linear in the parameters and but very sensitive to data error and strongly biased toward fitting the data in a particular range of the independent variable .

Algorithms

Gauss–Newton method

The normal equations may be solved for by Cholesky decomposition, as described in linear least squares. The parameters are updated iteratively where k is an iteration number. While this method may be adequate for simple models, it will fail if divergence occurs. Therefore, protection against divergence is essential.

Shift-cutting

If divergence occurs, a simple expedient is to reduce the length of the shift vector, , by a fraction, f For example, the length of the shift vector may be successively halved until the new value of the objective function is less than its value at the last iteration. The fraction, f could be optimized by a line search. As each trial value of f requires the objective function to be re-calculated it is not worth optimizing its value too stringently.

When using shift-cutting, the direction of the shift vector remains unchanged. This limits the applicability of the method to situations where the direction of the shift vector is not very different from what it would be if the objective function were approximately quadratic in the parameters,

Marquardt parameter

If divergence occurs and the direction of the shift vector is so far from its "ideal" direction that shift-cutting is not very effective, that is, the fraction, f required to avoid divergence is very small, the direction must be changed. This can be achieved by using the Marquardt parameter. In this method the normal equations are modified where is the Marquardt parameter and I is an identity matrix. Increasing the value of has the effect of changing both the direction and the length of the shift vector. The shift vector is rotated towards the direction of steepest descent when is the steepest descent vector. So, when becomes very large, the shift vector becomes a small fraction of the steepest descent vector.

Various strategies have been proposed for the determination of the Marquardt parameter. As with shift-cutting, it is wasteful to optimize this parameter too stringently. Rather, once a value has been found that brings about a reduction in the value of the objective function, that value of the parameter is carried to the next iteration, reduced if possible, or increased if need be. When reducing the value of the Marquardt parameter, there is a cut-off value below which it is safe to set it to zero, that is, to continue with the unmodified Gauss–Newton method. The cut-off value may be set equal to the smallest singular value of the Jacobian. A bound for this value is given by where tr is the trace function.

QR decomposition

The minimum in the sum of squares can be found by a method that does not involve forming the normal equations. The residuals with the linearized model can be written as The Jacobian is subjected to an orthogonal decomposition; the QR decomposition will serve to illustrate the process. where Q is an orthogonal matrix and R is an matrix which is partitioned into an block, , and a zero block. is upper triangular.

The residual vector is left-multiplied by .

This has no effect on the sum of squares since because Q is orthogonal. The minimum value of S is attained when the upper block is zero. Therefore, the shift vector is found by solving

These equations are easily solved as R is upper triangular.

Singular value decomposition

A variant of the method of orthogonal decomposition involves singular value decomposition, in which R is diagonalized by further orthogonal transformations.

where is orthogonal, is a diagonal matrix of singular values and is the orthogonal matrix of the eigenvectors of or equivalently the right singular vectors of . In this case the shift vector is given by

The relative simplicity of this expression is very useful in theoretical analysis of non-linear least squares. The application of singular value decomposition is discussed in detail in Lawson and Hanson.

Gradient methods

There are many examples in the scientific literature where different methods have been used for non-linear data-fitting problems.

  • Inclusion of second derivatives in The Taylor series expansion of the model function. This is Newton's method in optimization. The matrix H is known as the Hessian matrix. Although this model has better convergence properties near to the minimum, it is much worse when the parameters are far from their optimal values. Calculation of the Hessian adds to the complexity of the algorithm. This method is not in general use.
  • Davidon–Fletcher–Powell method. This method, a form of pseudo-Newton method, is similar to the one above but calculates the Hessian by successive approximation, to avoid having to use analytical expressions for the second derivatives.
  • Steepest descent. Although a reduction in the sum of squares is guaranteed when the shift vector points in the direction of steepest descent, this method often performs poorly. When the parameter values are far from optimal the direction of the steepest descent vector, which is normal (perpendicular) to the contours of the objective function, is very different from the direction of the Gauss–Newton vector. This makes divergence much more likely, especially as the minimum along the direction of steepest descent may correspond to a small fraction of the length of the steepest descent vector. When the contours of the objective function are very eccentric, due to there being high correlation between parameters, the steepest descent iterations, with shift-cutting, follow a slow, zig-zag trajectory towards the minimum.
  • Conjugate gradient search. This is an improved steepest descent based method with good theoretical convergence properties, although it can fail on finite-precision digital computers even when used on quadratic problems.

Direct search methods

Direct search methods depend on evaluations of the objective function at a variety of parameter values and do not use derivatives at all. They offer alternatives to the use of numerical derivatives in the Gauss–Newton method and gradient methods.

  • Alternating variable search. Each parameter is varied in turn by adding a fixed or variable increment to it and retaining the value that brings about a reduction in the sum of squares. The method is simple and effective when the parameters are not highly correlated. It has very poor convergence properties, but may be useful for finding initial parameter estimates.
  • Nelder–Mead (simplex) search. A simplex in this context is a polytope of n + 1 vertices in n dimensions; a triangle on a plane, a tetrahedron in three-dimensional space and so forth. Each vertex corresponds to a value of the objective function for a particular set of parameters. The shape and size of the simplex is adjusted by varying the parameters in such a way that the value of the objective function at the highest vertex always decreases. Although the sum of squares may initially decrease rapidly, it can converge to a nonstationary point on quasiconvex problems, by an example of M. J. D. Powell.

More detailed descriptions of these, and other, methods are available, in Numerical Recipes, together with computer code in various languages.

Nonlinear programming

From Wikipedia, the free encyclopedia

In mathematics, nonlinear programming (NLP), also known as nonlinear optimization, is the process of solving an optimization problem where some of the constraints are not linear equalities or the objective function is not a linear function. An optimization problem is one of calculation of the extrema (maxima, minima or stationary points) of an objective function over a set of unknown real variables and conditional to the satisfaction of a system of equalities and inequalities, collectively termed constraints. It is the sub-field of mathematical optimization that deals with problems that are not linear.

Definition and discussion

Let n, m, and p be positive integers. Let X be a subset of Rn (usually a box-constrained one), let f, gi, and hj be real-valued functions on X for each i in {1, ..., m} and each j in {1, ..., p}, with at least one of f, gi, and hj being nonlinear.

A nonlinear programming problem is an optimization problem of the form

Depending on the constraint set, there are several possibilities:

  • feasible problem is one for which there exists at least one set of values for the choice variables satisfying all the constraints.
  • an infeasible problem is one for which no set of values for the choice variables satisfies all the constraints. That is, the constraints are mutually contradictory, and no solution exists; the feasible set is the empty set.
  • unbounded problem is a feasible problem for which the objective function can be made to be better than any given finite value. Thus there is no optimal solution, because there is always a feasible solution that gives a better objective function value than does any given proposed solution.

Most realistic applications feature feasible problems, with infeasible or unbounded problems seen as a failure of an underlying model. In some cases, infeasible problems are handled by minimizing a sum of feasibility violations.

Some special cases of nonlinear programming have specialized solution methods:

  • If the objective function is concave (maximization problem), or convex (minimization problem) and the constraint set is convex, then the program is called convex and general methods from convex optimization can be used in most cases.
  • If the objective function is quadratic and the constraints are linear, quadratic programming techniques are used.
  • If the objective function is a ratio of a concave and a convex function (in the maximization case) and the constraints are convex, then the problem can be transformed to a convex optimization problem using fractional programming techniques.

Applicability

A typical non-convex problem is that of optimizing transportation costs by selection from a set of transportation methods, one or more of which exhibit economies of scale, with various connectivities and capacity constraints. An example would be petroleum product transport given a selection or combination of pipeline, rail tanker, road tanker, river barge, or coastal tankship. Owing to economic batch size the cost functions may have discontinuities in addition to smooth changes.

In experimental science, some simple data analysis (such as fitting a spectrum with a sum of peaks of known location and shape but unknown magnitude) can be done with linear methods, but in general these problems are also nonlinear. Typically, one has a theoretical model of the system under study with variable parameters in it and a model the experiment or experiments, which may also have unknown parameters. One tries to find a best fit numerically. In this case one often wants a measure of the precision of the result, as well as the best fit itself.

Methods for solving a general nonlinear program

Analytic methods

Under differentiability and constraint qualifications, the Karush–Kuhn–Tucker (KKT) conditions provide necessary conditions for a solution to be optimal. If some of the functions are non-differentiable, subdifferential versions of Karush–Kuhn–Tucker (KKT) conditions are available.

Under convexity, the KKT conditions are sufficient for a global optimum. Without convexity, these conditions are sufficient only for a local optimum. In some cases, the number of local optima is small, and one can find all of them analytically and find the one for which the objective value is smallest.

Numeric methods

In most realistic cases, it is very hard to solve the KKT conditions analytically, and so the problems are solved using numerical methods. These methods are iterative: they start with an initial point, and then proceed to points that are supposed to be closer to the optimal point, using some update rule. There are three kinds of update rules:

  • Zero-order routines - use only the values of the objective function and constraint functions at the current point;
  • First-order routines - use also the values of the gradients of these functions;
  • Second-order routines - use also the values of the Hessians of these functions.

Third-order routines (and higher) are theoretically possible, but not used in practice, due to the higher computational load and little theoretical benefit.

Branch and bound

Another method involves the use of branch and bound techniques, where the program is divided into subclasses to be solved with convex (minimization problem) or linear approximations that form a lower bound on the overall cost within the subdivision. With subsequent divisions, at some point an actual solution will be obtained whose cost is equal to the best lower bound obtained for any of the approximate solutions. This solution is optimal, although possibly not unique. The algorithm may also be stopped early, with the assurance that the best possible solution is within a tolerance from the best point found; such points are called ε-optimal. Terminating to ε-optimal points is typically necessary to ensure finite termination. This is especially useful for large, difficult problems and problems with uncertain costs or values where the uncertainty can be estimated with an appropriate reliability estimation.

Implementations

There exist numerous nonlinear programming solvers, including open source:

  • ALGLIB (C++, C#, Java, Python API) implements several first-order and derivative-free nonlinear programming solvers
  • NLopt (C/C++ implementation, with numerous interfaces including Julia, Python, R, MATLAB/Octave), includes various nonlinear programming solvers
  • SciPy (de facto standard for scientific Python) has scipy.optimize solver, which includes several nonlinear programming algorithms (zero-order, first order and second order ones).
  • IPOPT (C++ implementation, with numerous interfaces including C, Fortran, Java, AMPL, R, Python, etc.) is an interior point method solver (zero-order, and optionally first order and second order derivatives).

Numerical Examples

2-dimensional example

The blue region is the feasible region. The tangency of the line with the feasible region represents the solution. The line is the best achievable contour line (locus with a given value of the objective function).

A simple problem (shown in the diagram) can be defined by the constraints with an objective function to be maximized where x = (x1, x2).

3-dimensional example

The tangency of the top surface with the constrained space in the center represents the solution.

Another simple problem (see diagram) can be defined by the constraints with an objective function to be maximized where x = (x1, x2, x3).

Theoretical physics

From Wikipedia, the free encyclopedia
Visual representation of a Schwarzschild wormhole. Wormholes have never been observed, but they are predicted to exist through mathematical models and scientific theory.

Theoretical physics is a branch of physics that employs mathematical models and abstractions of physical objects and systems to rationalize, explain, and predict natural phenomena. This is in contrast to experimental physics, which uses experimental tools to probe these phenomena.

The advancement of science generally depends on the interplay between experimental studies and theory. In some cases, theoretical physics adheres to standards of mathematical rigour while giving little weight to experiments and observations. For example, while developing special relativity, Albert Einstein was concerned with the Lorentz transformation which left Maxwell's equations invariant, but was apparently uninterested in the Michelson–Morley experiment on Earth's drift through a luminiferous aether. Conversely, Einstein was awarded the Nobel Prize for explaining the photoelectric effect, previously an experimental result lacking a theoretical formulation.

Overview

A physical theory is a model of physical events. It is judged by the extent to which its predictions agree with empirical observations. The quality of a physical theory is also judged on its ability to make new predictions which can be verified by new observations. A physical theory differs from a mathematical theorem in that while both are based on some form of axioms, judgment of mathematical applicability is not based on agreement with any experimental results. A physical theory similarly differs from a mathematical theory, in the sense that the word "theory" has a different meaning in mathematical terms.

The equations for an Einstein manifold, used in general relativity to describe the curvature of spacetime

A physical theory involves one or more relationships between various measurable quantities. Archimedes realized that a ship floats by displacing its mass of water, Pythagoras understood the relation between the length of a vibrating string and the musical tone it produces. Other examples include entropy as a measure of the uncertainty regarding the positions and motions of unseen particles and the quantum mechanical idea that (action and) energy are not continuously variable.[citation needed]

Theoretical physics consists of several different approaches. In this regard, theoretical particle physics forms a good example. For instance: "phenomenologists" might employ (semi-) empirical formulas and heuristics to agree with experimental results, often without deep physical understanding. "Modelers" (also called "model-builders") often appear much like phenomenologists, but try to model speculative theories that have certain desirable features (rather than on experimental data), or apply the techniques of mathematical modeling to physics problems. Some attempt to create approximate theories, called effective theories, because fully developed theories may be regarded as unsolvable or too complicated. Other theorists may try to unify, formalise, reinterpret or generalise extant theories, or create completely new ones altogether. Sometimes the vision provided by pure mathematical systems can provide clues to how a physical system might be modeled; e.g., the notion, due to Riemann and others, that space itself might be curved. Theoretical problems that need computational investigation are often the concern of computational physics.

Theoretical advances may consist in setting aside old, incorrect paradigms (e.g., aether theory of light propagation, caloric theory of heat, burning consisting of evolving phlogiston, or astronomical bodies revolving around the Earth) or may be an alternative model that provides answers that are more accurate or that can be more widely applied. In the latter case, a correspondence principle will be required to recover the previously known result. Sometimes though, advances may proceed along different paths. For example, an essentially correct theory may need some conceptual or factual revisions; atomic theory, first postulated millennia ago (by several thinkers in Greece and India) and the two-fluid theory of electricity are two cases in this point. However, an exception to all the above is the wave–particle duality, a theory combining aspects of different, opposing models via the Bohr complementarity principle.

Relationship between mathematics and physics

Physical theories become accepted if they are able to make correct predictions and no (or few) incorrect ones. The theory should have, at least as a secondary objective, a certain economy and elegance (compare to mathematical beauty), a notion sometimes called "Occam's razor" after the 13th-century English philosopher William of Occam (or Ockham), in which the simpler of two theories that describe the same matter just as adequately is preferred (but conceptual simplicity may mean mathematical complexity). They are also more likely to be accepted if they connect a wide range of phenomena. Testing the consequences of a theory is part of the scientific method.

Physical theories can be grouped into three categories: mainstream theories, proposed theories and fringe theories.

History

Theoretical physics began at least 2,300 years ago, under the pre-Socratic philosophy, and continued by Plato and Aristotle, whose views held sway for a millennium. During the rise of medieval universities, the only acknowledged intellectual disciplines were the seven liberal arts of the Trivium like grammar, logic, and rhetoric and of the Quadrivium like arithmetic, geometry, music and astronomy. During the Middle Ages and Renaissance, the concept of experimental science, the counterpoint to theory, began with scholars such as Ibn al-Haytham and Francis Bacon. As the Scientific Revolution gathered pace, the concepts of matter, energy, space, time and causality slowly began to acquire the form we know today, and other sciences spun off from the rubric of natural philosophy. Thus began the modern era of theory with the Copernican paradigm shift in astronomy, soon followed by Johannes Kepler's expressions for planetary orbits, which summarized the meticulous observations of Tycho Brahe; the works of these men (alongside Galileo's) can perhaps be considered to constitute the Scientific Revolution.

The great push toward the modern concept of explanation started with Galileo Galilei, one of the few physicists who was both a consummate theoretician and a great experimentalist. The analytic geometry and mechanics of René Descartes were incorporated into the calculus and mechanics of Isaac Newton, another theoretician/experimentalist of the highest order, writing Principia Mathematica. In it contained a grand synthesis of the work of Copernicus, Galileo and Kepler; as well as Newton's theories of mechanics and gravitation, which held sway as worldviews until the early 20th century. Simultaneously, progress was also made in optics (in particular colour theory and the ancient science of geometrical optics), courtesy of Newton, Descartes and the Dutchmen Willebrord Snell and Christiaan Huygens. In the 18th and 19th centuries Joseph-Louis Lagrange, Leonhard Euler and William Rowan Hamilton would extend the theory of classical mechanics considerably. They picked up the interactive intertwining of mathematics and physics begun two millennia earlier by Pythagoras.

Among the great conceptual achievements of the 19th and 20th centuries were the consolidation of the idea of energy (as well as its global conservation) by the inclusion of heat, electricity and magnetism, and then light. Lord Kelvin and Walther Nernst's discoveries of the laws of thermodynamics, and more importantly Rudolf Clausius's introduction of the singular concept of entropy, began to provide a macroscopic explanation for the properties of matter. Statistical mechanics (followed by statistical physics and quantum statistical mechanics) emerged as an offshoot of thermodynamics late in the 19th century. Another important event in the 19th century was James Clerk Maxwell's discovery of electromagnetic theory, unifying the previously separate phenomena of electricity, magnetism and light.

The pillars of modern physics, and perhaps the most revolutionary theories in the history of physics, have been relativity theory, devised by Albert Einstein, and quantum mechanics, founded by Werner Heisenberg, Max Born, Pascual Jordan, and Erwin Schrödinger. Newtonian mechanics was subsumed under special relativity and Newton's gravity was given a kinematic explanation by general relativity. Quantum mechanics led to an understanding of blackbody radiation (which indeed, was an original motivation for the theory) and of anomalies in the specific heats of solids — and finally to an understanding of the internal structures of atoms and molecules. Quantum mechanics soon gave way to the formulation of quantum field theory (QFT), begun in the late 1920s. In the aftermath of World War II, more progress brought much renewed interest in QFT, which had since the early efforts, stagnated. The same period also saw fresh attacks on the problems of superconductivity and phase transitions, as well as the first applications of QFT in the area of theoretical condensed matter. The 1960s and 70s saw the formulation of the Standard Model of particle physics using QFT and progress in condensed matter physics (theoretical foundations of superconductivity and critical phenomena, among others), in parallel to the applications of relativity to problems in astronomy and cosmology respectively.

All of these achievements depended on the theoretical physics as a moving force both to suggest experiments and to consolidate results — often by ingenious application of existing mathematics, or, as in the case of Descartes and Newton (with Leibniz), by inventing new mathematics. Fourier's studies of heat conduction led to a new branch of mathematics: infinite, orthogonal series.

Modern theoretical physics attempts to unify theories and explain phenomena in further attempts to understand the Universe, from the cosmological to the elementary particle scale. Where experimentation cannot be done, theoretical physics still tries to advance through the use of mathematical models.

Mainstream theories

Mainstream theories (sometimes referred to as central theories) are the body of knowledge of both factual and scientific views and possess a usual scientific quality of the tests of repeatability, consistency with existing well-established science and experimentation. There do exist mainstream theories that are generally accepted theories based solely upon their effects explaining a wide variety of data, although the detection, explanation, and possible composition are subjects of debate.

Examples

Proposed theories

The proposed theories of physics are usually relatively new theories which deal with the study of physics which include scientific approaches, means for determining the validity of models and new types of reasoning used to arrive at the theory. However, some proposed theories include theories that have been around for decades and have eluded methods of discovery and testing. Proposed theories can include fringe theories in the process of becoming established (and, sometimes, gaining wider acceptance). Proposed theories usually have not been tested. In addition to the theories like those listed below, there are also different interpretations of quantum mechanics, which may or may not be considered different theories since it is debatable whether they yield different predictions for physical experiments, even in principle. For example, AdS/CFT correspondence, Chern–Simons theory, graviton, magnetic monopole, string theory, theory of everything.

Fringe theories

Fringe theories include any new area of scientific endeavor in the process of becoming established and some proposed theories. It can include speculative sciences. This includes physics fields and physical theories presented in accordance with known evidence, and a body of associated predictions have been made according to that theory.

Some fringe theories go on to become a widely accepted part of physics. Other fringe theories end up being disproven. Some fringe theories are a form of protoscience and others are a form of pseudoscience. The falsification of the original theory sometimes leads to reformulation of the theory.

Examples

Thought experiments vs real experiments

"Thought" experiments are situations created in one's mind, asking a question akin to "suppose you are in this situation, assuming such is true, what would follow?". They are usually created to investigate phenomena that are not readily experienced in every-day situations. Famous examples of such thought experiments are Schrödinger's cat, the EPR thought experiment, simple illustrations of time dilation, and so on. These usually lead to real experiments designed to verify that the conclusion (and therefore the assumptions) of the thought experiments are correct. The EPR thought experiment led to the Bell inequalities, which were then tested to various degrees of rigor, leading to the acceptance of the current formulation of quantum mechanics and probabilism as a working hypothesis.

Emotional contagion

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Emotional_contagion   Emotiona...