Monday, May 21, 2018

Inertial frame of reference

An inertial frame of reference in classical physics and special relativity is a frame of reference in which a body with zero net force acting upon it is not accelerating; that is, such a body is at rest or it is moving at a constant speed in a straight line.[1] In analytical terms, it is a frame of reference that describes time and space homogeneously, isotropically, and in a time-independent manner.[2] Conceptually, the physics of a system in an inertial frame have no causes external to the system.[3] An inertial frame of reference may also be called an inertial reference frame, inertial frame, Galilean reference frame, or inertial space.[citation needed]

All inertial frames are in a state of constant, rectilinear motion with respect to one another; an accelerometer moving with any of them would detect zero acceleration. Measurements in one inertial frame can be converted to measurements in another by a simple transformation (the Galilean transformation in Newtonian physics and the Lorentz transformation in special relativity). In general relativity, in any region small enough for the curvature of spacetime and tidal forces[4] to be negligible, one can find a set of inertial frames that approximately describe that region.[5][6]

In a non-inertial reference frame in classical physics and special relativity, the physics of a system vary depending on the acceleration of that frame with respect to an inertial frame, and the usual physical forces must be supplemented by fictitious forces.[7][8] In contrast, systems in non-inertial frames in general relativity don't have external causes, because of the principle of geodesic motion.[9] In classical physics, for example, a ball dropped towards the ground does not go exactly straight down because the Earth is rotating, which means the frame of reference of an observer on Earth is not inertial. The physics must account for the Coriolis effect—in this case thought of as a force—to predict the horizontal motion. Another example of such a fictitious force associated with rotating reference frames is the centrifugal effect, or centrifugal force.
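To put a rough number on the dropped-ball example, the sketch below uses the standard first-order estimate for the eastward Coriolis deflection of a body dropped from rest, d ≈ (1/3) Ω g t³ cos(latitude), with fall time t = sqrt(2h/g). It assumes no air resistance and a uniform g. The function name, the constants, and the 100 m drop height are illustrative choices, not values from the text.

```python
import math

OMEGA_EARTH = 7.292e-5  # Earth's sidereal rotation rate in rad/s (approximate)
G = 9.81                # assumed uniform gravitational acceleration in m/s^2


def eastward_deflection(height_m, latitude_deg):
    """First-order eastward Coriolis deflection (in metres) of a dropped body.

    Hypothetical helper: ignores air resistance and terms of order Omega^2.
    """
    fall_time = math.sqrt(2.0 * height_m / G)
    return (OMEGA_EARTH * G * fall_time ** 3
            * math.cos(math.radians(latitude_deg)) / 3.0)


deflection_equator = eastward_deflection(100.0, 0.0)
deflection_pole = eastward_deflection(100.0, 90.0)
print(f"100 m drop at the equator: {deflection_equator * 100:.1f} cm east")
print(f"100 m drop at the pole:    {deflection_pole * 100:.1f} cm east")
```

Under these assumptions the deflection is a couple of centimetres for a 100 m drop at the equator and vanishes at the poles, where the local vertical is parallel to the rotation axis.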


The motion of a body can only be described relative to something else—other bodies, observers, or a set of space-time coordinates. These are called frames of reference. If the coordinates are chosen badly, the laws of motion may be more complex than necessary. For example, suppose a free body that has no external forces acting on it is at rest at some instant. In many coordinate systems, it would begin to move at the next instant, even though there are no forces on it. However, a frame of reference can always be chosen in which it remains stationary. Similarly, if space is not described uniformly or time independently, a coordinate system could describe the simple flight of a free body in space as a complicated zig-zag in its coordinate system. Indeed, an intuitive summary of inertial frames can be given: in an inertial reference frame, the laws of mechanics take their simplest form.[2]
In an inertial frame, Newton's first law, the law of inertia, is satisfied: Any free motion has a constant magnitude and direction.[2] Newton's second law for a particle takes the form:
\mathbf{F} = m \mathbf{a} \ ,
with F the net force (a vector), m the mass of a particle and a the acceleration of the particle (also a vector) which would be measured by an observer at rest in the frame. The force F is the vector sum of all "real" forces on the particle, such as electromagnetic, gravitational, nuclear and so forth. In contrast, Newton's second law in a rotating frame of reference, rotating at angular rate Ω about an axis, takes the form:
\mathbf{F}' = m \mathbf{a} \ ,
which looks the same as in an inertial frame, but now the force F′ is the resultant of not only F, but also additional terms (the paragraph following this equation presents the main points without detailed mathematics):
\mathbf{F}' = \mathbf{F} - 2m \mathbf{\Omega} \times \mathbf{v}_{B} - m \mathbf{\Omega} \times (\mathbf{\Omega} \times \mathbf{x}_B ) - m \frac{d \mathbf{\Omega}}{dt} \times \mathbf{x}_B \ ,
where the angular rotation of the frame is expressed by the vector Ω, which points along the axis of rotation and has magnitude equal to the angular rate of rotation Ω; the symbol × denotes the vector cross product; the vector x_B locates the body; and the vector v_B is the velocity of the body according to a rotating observer (different from the velocity seen by the inertial observer).

The extra terms in the force F′ are the "fictitious" forces for this frame, whose causes are external to the system in the frame. The first extra term is the Coriolis force, the second the centrifugal force, and the third the Euler force. These terms all have these properties: they vanish when Ω = 0; that is, they are zero for an inertial frame (which, of course, does not rotate); they take on a different magnitude and direction in every rotating frame, depending upon its particular value of Ω; they are ubiquitous in the rotating frame (affect every particle, regardless of circumstance); and they have no apparent source in identifiable physical sources, in particular, matter. Also, fictitious forces do not drop off with distance (unlike, for example, nuclear forces or electrical forces). For example, the centrifugal force that appears to emanate from the axis of rotation in a rotating frame increases with distance from the axis.
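The three extra terms can be evaluated directly from the expression for F′. The following is a minimal sketch in plain Python, with vectors as 3-tuples; the helper names (`cross`, `scale`, `fictitious_forces`) and the sample numbers are assumptions made for illustration only.

```python
def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(k, a):
    """Multiply the vector a by the scalar k."""
    return tuple(k * component for component in a)


def fictitious_forces(mass, omega, x_b, v_b, domega_dt=(0.0, 0.0, 0.0)):
    """Return (Coriolis, centrifugal, Euler) forces seen in the rotating frame.

    omega and domega_dt are the frame's angular velocity and its time
    derivative; x_b and v_b are the body's position and velocity as measured
    by the rotating observer.
    """
    coriolis = scale(-2.0 * mass, cross(omega, v_b))
    centrifugal = scale(-mass, cross(omega, cross(omega, x_b)))
    euler = scale(-mass, cross(domega_dt, x_b))
    return coriolis, centrifugal, euler


# Frame rotating at 2 rad/s about z; a 1 kg body at rest in that frame.
omega = (0.0, 0.0, 2.0)
_, centrifugal_near, _ = fictitious_forces(1.0, omega, (3.0, 0.0, 0.0), (0.0, 0.0, 0.0))
_, centrifugal_far, _ = fictitious_forces(1.0, omega, (6.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(centrifugal_near, centrifugal_far)
```

With these sample values the centrifugal force points away from the axis and doubles when the distance from the axis doubles, and all three terms vanish when `omega` and `domega_dt` are zero.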

All observers agree on the real forces, F; only non-inertial observers need fictitious forces. The laws of physics in the inertial frame are simpler because unnecessary forces are not present.

In Newton's time the fixed stars were invoked as a reference frame, supposedly at rest relative to absolute space. In reference frames that were either at rest with respect to the fixed stars or in uniform translation relative to these stars, Newton's laws of motion were supposed to hold. In contrast, in frames accelerating with respect to the fixed stars, an important case being frames rotating relative to the fixed stars, the laws of motion did not hold in their simplest form, but had to be supplemented by the addition of fictitious forces, for example, the Coriolis force and the centrifugal force. Two interesting experiments were devised by Newton to demonstrate how these forces could be discovered, thereby revealing to an observer that they were not in an inertial frame: the example of the tension in the cord linking two spheres rotating about their center of gravity, and the example of the curvature of the surface of water in a rotating bucket. In both cases, application of Newton's second law would not work for the rotating observer without invoking centrifugal and Coriolis forces to account for their observations (tension in the case of the spheres; parabolic water surface in the case of the rotating bucket).

As we now know, the fixed stars are not fixed. Those that reside in the Milky Way turn with the galaxy, exhibiting proper motions. Those that are outside our galaxy (such as nebulae once mistaken to be stars) participate in their own motion as well, partly due to expansion of the universe, and partly due to peculiar velocities.[10] The Andromeda galaxy is on collision course with the Milky Way at a speed of 117 km/s.[11] The concept of inertial frames of reference is no longer tied to either the fixed stars or to absolute space. Rather, the identification of an inertial frame is based upon the simplicity of the laws of physics in the frame. In particular, the absence of fictitious forces is their identifying property.[12]

In practice, although not a requirement, using a frame of reference based upon the fixed stars as though it were an inertial frame of reference introduces very little discrepancy. For example, the centrifugal acceleration of the Earth because of its rotation about the Sun is about thirty million times greater than that of the Sun about the galactic center.[13]
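The order of magnitude of this ratio can be checked with the circular-orbit formula a = v²/r. The orbital speeds and radii below are rounded textbook-style values assumed for this sketch (in particular, roughly 220 km/s and 8 kpc for the Sun's galactic orbit); they are not taken from the cited source.

```python
AU_M = 1.496e11                # Earth-Sun distance in metres (approximate)
EARTH_ORBITAL_SPEED = 2.978e4  # m/s (approximate)
KPC_M = 3.086e19               # one kiloparsec in metres (approximate)
SUN_GALACTIC_RADIUS = 8.0 * KPC_M  # assumed distance to the galactic center
SUN_GALACTIC_SPEED = 2.2e5         # m/s, assumed orbital speed of the Sun


def centripetal_acceleration(speed, radius):
    """Acceleration of uniform circular motion, a = v^2 / r."""
    return speed ** 2 / radius


earth_accel = centripetal_acceleration(EARTH_ORBITAL_SPEED, AU_M)
sun_accel = centripetal_acceleration(SUN_GALACTIC_SPEED, SUN_GALACTIC_RADIUS)
ratio = earth_accel / sun_accel

print(f"Earth about the Sun:       {earth_accel:.2e} m/s^2")
print(f"Sun about galactic center: {sun_accel:.2e} m/s^2")
print(f"ratio: {ratio:.1e}")
```

With these inputs the ratio comes out at a few times 10^7, consistent with the "thirty million" figure quoted above; different published values for the galactic orbit shift the result somewhat.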

To illustrate further, consider the question: "Does our Universe rotate?" To answer, we might attempt to explain the shape of the Milky Way galaxy using the laws of physics,[14] although other observations might be more definitive, that is, provide larger discrepancies or less measurement uncertainty, like the anisotropy of the microwave background radiation or Big Bang nucleosynthesis.[15][16] The flatness of the Milky Way depends on its rate of rotation in an inertial frame of reference. If we attribute its apparent rate of rotation entirely to rotation in an inertial frame, a different "flatness" is predicted than if we suppose part of this rotation actually is due to rotation of the universe and should not be included in the rotation of the galaxy itself. Based upon the laws of physics, a model is set up in which one parameter is the rate of rotation of the Universe. If the laws of physics agree more accurately with observations in a model with rotation than without it, we are inclined to select the best-fit value for rotation, subject to all other pertinent experimental observations. If no value of the rotation parameter is successful and theory is not within observational error, a modification of physical law is considered, for example, dark matter is invoked to explain the galactic rotation curve. So far, observations show any rotation of the universe is very slow, no faster than once every 60·10^12 years (10^−13 rad/yr),[17] and debate persists over whether there is any rotation. However, if rotation were found, interpretation of observations in a frame tied to the universe would have to be corrected for the fictitious forces inherent in such rotation in classical physics and special relativity, or interpreted as the curvature of spacetime and the motion of matter along the geodesics in general relativity.

When quantum effects are important, there are additional conceptual complications that arise in quantum reference frames.


A set of frames where the laws of physics are simple

According to the first postulate of special relativity, all physical laws take their simplest form in an inertial frame, and there exist multiple inertial frames interrelated by uniform translation: [18]
Special principle of relativity: If a system of coordinates K is chosen so that, in relation to it, physical laws hold good in their simplest form, the same laws hold good in relation to any other system of coordinates K' moving in uniform translation relatively to K.
— Albert Einstein: The foundation of the general theory of relativity, Section A, §1
This simplicity manifests in that inertial frames have self-contained physics without the need for external causes, while physics in non-inertial frames have external causes.[3] The principle of simplicity can be used within Newtonian physics as well as in special relativity; see Nagel[19] and also Blagojević.[20]
The laws of Newtonian mechanics do not always hold in their simplest form...If, for instance, an observer is placed on a disc rotating relative to the earth, he/she will sense a 'force' pushing him/her toward the periphery of the disc, which is not caused by any interaction with other bodies. Here, the acceleration is not the consequence of the usual force, but of the so-called inertial force. Newton's laws hold in their simplest form only in a family of reference frames, called inertial frames. This fact represents the essence of the Galilean principle of relativity:
   The laws of mechanics have the same form in all inertial frames.
— Milutin Blagojević: Gravitation and Gauge Symmetries, p. 4
In practical terms, the equivalence of inertial reference frames means that scientists within a box moving uniformly cannot determine their absolute velocity by any experiment. Otherwise, the differences would set up an absolute standard reference frame.[21][22] According to this definition, supplemented with the constancy of the speed of light, inertial frames of reference transform among themselves according to the Poincaré group of symmetry transformations, of which the Lorentz transformations are a subgroup.[23] In Newtonian mechanics, which can be viewed as a limiting case of special relativity in which the speed of light is infinite, inertial frames of reference are related by the Galilean group of symmetries.

Absolute space

Newton posited an absolute space considered well approximated by a frame of reference stationary relative to the fixed stars. An inertial frame was then one in uniform translation relative to absolute space. However, some scientists (called "relativists" by Mach[24]), even at the time of Newton, felt that absolute space was a defect of the formulation, and should be replaced.

Indeed, the expression inertial frame of reference (German: Inertialsystem) was coined by Ludwig Lange in 1885, to replace Newton's definitions of "absolute space and time" by a more operational definition.[25][26] As translated by Iro, Lange proposed the following definition:[27]
A reference frame in which a mass point thrown from the same point in three different (non co-planar) directions follows rectilinear paths each time it is thrown, is called an inertial frame.
A discussion of Lange's proposal can be found in Mach.[24]

The inadequacy of the notion of "absolute space" in Newtonian mechanics is spelled out by Blagojević:[28]
  • The existence of absolute space contradicts the internal logic of classical mechanics since, according to Galilean principle of relativity, none of the inertial frames can be singled out.
  • Absolute space does not explain inertial forces since they are related to acceleration with respect to any one of the inertial frames.
  • Absolute space acts on physical objects by inducing their resistance to acceleration but it cannot be acted upon.
— Milutin Blagojević: Gravitation and Gauge Symmetries, p. 5
The utility of operational definitions was carried much further in the special theory of relativity.[29] Some historical background including Lange's definition is provided by DiSalle, who says in summary:[30]
The original question, "relative to what frame of reference do the laws of motion hold?" is revealed to be wrongly posed. For the laws of motion essentially determine a class of reference frames, and (in principle) a procedure for constructing them.

Newton's inertial frame of reference

Figure 1: Two frames of reference moving with relative velocity \vec{v}. Frame S' has an arbitrary but fixed rotation with respect to frame S. They are both inertial frames provided a body not subject to forces appears to move in a straight line. If that motion is seen in one frame, it will also appear that way in the other.

Within the realm of Newtonian mechanics, an inertial frame of reference, or inertial reference frame, is one in which Newton's first law of motion is valid.[31] However, the principle of special relativity generalizes the notion of inertial frame to include all physical laws, not simply Newton's first law.

Newton viewed the first law as valid in any reference frame that is in uniform motion relative to the fixed stars;[32] that is, neither rotating nor accelerating relative to the stars.[33] Today the notion of "absolute space" is abandoned, and an inertial frame in the field of classical mechanics is defined as:[34][35]
An inertial frame of reference is one in which the motion of a particle not subject to forces is in a straight line at constant speed.
Hence, with respect to an inertial frame, an object or body accelerates only when a physical force is applied, and (following Newton's first law of motion), in the absence of a net force, a body at rest will remain at rest and a body in motion will continue to move uniformly—that is, in a straight line and at constant speed. Newtonian inertial frames transform among each other according to the Galilean group of symmetries.

If this rule is interpreted as saying that straight-line motion is an indication of zero net force, the rule does not identify inertial reference frames because straight-line motion can be observed in a variety of frames. If the rule is interpreted as defining an inertial frame, then we have to be able to determine when zero net force is applied. The problem was summarized by Einstein:[36]
The weakness of the principle of inertia lies in this, that it involves an argument in a circle: a mass moves without acceleration if it is sufficiently far from other bodies; we know that it is sufficiently far from other bodies only by the fact that it moves without acceleration.
— Albert Einstein: The Meaning of Relativity, p. 58
There are several approaches to this issue. One approach is to argue that all real forces drop off with distance from their sources in a known manner, so we have only to be sure that a body is far enough away from all sources to ensure that no force is present.[37] A possible issue with this approach is the historically long-lived view that the distant universe might affect matters (Mach's principle). Another approach is to identify all real sources for real forces and account for them. A possible issue with this approach is that we might miss something, or account inappropriately for their influence, perhaps, again, due to Mach's principle and an incomplete understanding of the universe. A third approach is to look at the way the forces transform when we shift reference frames. Fictitious forces, those that arise due to the acceleration of a frame, disappear in inertial frames, and have complicated rules of transformation in general cases. On the basis of universality of physical law and the request for frames where the laws are most simply expressed, inertial frames are distinguished by the absence of such fictitious forces.

Newton enunciated a principle of relativity himself in one of his corollaries to the laws of motion:[38][39]
The motions of bodies included in a given space are the same among themselves, whether that space is at rest or moves uniformly forward in a straight line.
— Isaac Newton: Principia, Corollary V, p. 88 in Andrew Motte translation
This principle differs from the special principle in two ways: first, it is restricted to mechanics, and second, it makes no mention of simplicity. It shares with the special principle the invariance of the form of the description among mutually translating reference frames.[40] The role of fictitious forces in classifying reference frames is pursued further below.

Separating non-inertial from inertial reference frames


Figure 2: Two spheres tied with a string and rotating at an angular rate ω. Because of the rotation, the string tying the spheres together is under tension.

Figure 3: Exploded view of rotating spheres in an inertial frame of reference showing the centripetal forces on the spheres provided by the tension in the tying string.

Inertial and non-inertial reference frames can be distinguished by the absence or presence of fictitious forces, as explained shortly.[7][8]
The effect of this being in the noninertial frame is to require the observer to introduce a fictitious force into his calculations….
— Sidney Borowitz and Lawrence A Bornstein in A Contemporary View of Elementary Physics, p. 138
The presence of fictitious forces indicates the physical laws are not the simplest laws available so, in terms of the special principle of relativity, a frame where fictitious forces are present is not an inertial frame:[41]
The equations of motion in a non-inertial system differ from the equations in an inertial system by additional terms called inertial forces. This allows us to detect experimentally the non-inertial nature of a system.
— V. I. Arnol'd: Mathematical Methods of Classical Mechanics Second Edition, p. 129
Bodies in non-inertial reference frames are subject to so-called fictitious forces (pseudo-forces); that is, forces that result from the acceleration of the reference frame itself and not from any physical force acting on the body. Examples of fictitious forces are the centrifugal force and the Coriolis force in rotating reference frames.

How then, are "fictitious" forces to be separated from "real" forces? It is hard to apply the Newtonian definition of an inertial frame without this separation. For example, consider a stationary object in an inertial frame. Being at rest, no net force is applied. But in a frame rotating about a fixed axis, the object appears to move in a circle, and is subject to centripetal force (which is made up of the Coriolis force and the centrifugal force). How can we decide that the rotating frame is a non-inertial frame? There are two approaches to this resolution: one approach is to look for the origin of the fictitious forces (the Coriolis force and the centrifugal force). We will find there are no sources for these forces, no associated force carriers, no originating bodies.[42] A second approach is to look at a variety of frames of reference. For any inertial frame, the Coriolis force and the centrifugal force disappear, so application of the principle of special relativity would identify these frames where the forces disappear as sharing the same and the simplest physical laws, and hence rule that the rotating frame is not an inertial frame.

Newton examined this problem himself using rotating spheres, as shown in Figure 2 and Figure 3. He pointed out that if the spheres are not rotating, the tension in the tying string is measured as zero in every frame of reference.[43] If the spheres only appear to rotate (that is, we are watching stationary spheres from a rotating frame), the zero tension in the string is accounted for by observing that the centripetal force is supplied by the centrifugal and Coriolis forces in combination, so no tension is needed. If the spheres really are rotating, the tension observed is exactly the centripetal force required by the circular motion. Thus, measurement of the tension in the string identifies the inertial frame: it is the one where the tension in the string provides exactly the centripetal force demanded by the motion as it is observed in that frame, and not a different value. That is, the inertial frame is the one where the fictitious forces vanish.
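The bookkeeping in the "apparent rotation" case can be checked numerically. The sketch below takes a sphere that is at rest in an inertial frame, views it from a frame rotating about the z axis, and adds the Coriolis and centrifugal terms from the rotating-frame form of Newton's second law given earlier. The helper names and the sample values (1 kg, 2 rad/s, 3 m) are assumptions for illustration.

```python
import math


def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(k, a):
    return tuple(k * component for component in a)


def add(a, b):
    return tuple(p + q for p, q in zip(a, b))


MASS = 1.0
OMEGA = (0.0, 0.0, 2.0)  # angular velocity of the observer's frame, rad/s
X_B = (3.0, 0.0, 0.0)    # sphere position, 3 m from the axis

# A sphere at rest in the inertial frame appears to circle backwards
# in the rotating frame: v_B = -(Omega x x_B).
V_B = scale(-1.0, cross(OMEGA, X_B))

coriolis = scale(-2.0 * MASS, cross(OMEGA, V_B))
centrifugal = scale(-MASS, cross(OMEGA, cross(OMEGA, X_B)))
net_fictitious = add(coriolis, centrifugal)

# Centripetal force needed for the apparent circular motion, m v^2 / r.
radius = math.sqrt(sum(c * c for c in X_B))
required_centripetal = MASS * sum(c * c for c in V_B) / radius


def string_tension(mass, angular_rate, radius_m):
    """Tension needed when the sphere really rotates: T = m * omega^2 * r."""
    return mass * angular_rate ** 2 * radius_m


print(net_fictitious, required_centripetal, string_tension(MASS, 2.0, radius))
```

For these values the two fictitious terms sum to a force of magnitude 12 N directed toward the axis, which is exactly the centripetal force the apparent circular motion requires, so the string can remain slack. If the sphere really rotates at the same rate, the same 12 N must instead come from the string tension.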

So much for fictitious forces due to rotation. However, for linear acceleration, Newton expressed the idea of undetectability of straight-line accelerations held in common:[39]
If bodies, any how moved among themselves, are urged in the direction of parallel lines by equal accelerative forces, they will continue to move among themselves, after the same manner as if they had been urged by no such forces.
— Isaac Newton: Principia Corollary VI, p. 89, in Andrew Motte translation
This principle generalizes the notion of an inertial frame. For example, an observer confined in a free-falling lift will assert that he himself is a valid inertial frame, even if he is accelerating under gravity, so long as he has no knowledge about anything outside the lift. So, strictly speaking, inertial frame is a relative concept. With this in mind, we can define inertial frames collectively as a set of frames which are stationary or moving at constant velocity with respect to each other, so that a single inertial frame is defined as an element of this set.

For these ideas to apply, everything observed in the frame has to be subject to a base-line, common acceleration shared by the frame itself. That situation would apply, for example, to the elevator example, where all objects are subject to the same gravitational acceleration, and the elevator itself accelerates at the same rate.


Inertial navigation systems use a cluster of gyroscopes and accelerometers to determine accelerations relative to inertial space. After a gyroscope is spun up in a particular orientation in inertial space, the law of conservation of angular momentum requires that it retain that orientation as long as no external torques are applied to it.[44]:59 Three orthogonal gyroscopes establish an inertial reference frame, and the accelerometers measure acceleration relative to that frame. The accelerations, along with a clock, can then be used to calculate the change in position. Thus, inertial navigation is a form of dead reckoning that requires no external input, and therefore cannot be jammed by any external or internal signal source.[45]
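The dead-reckoning step can be sketched as a double integration of accelerometer samples against a clock. This one-dimensional toy version assumes the samples are already expressed along a fixed inertial axis (the role played by the gyroscopes) and that gravity has been subtracted; the function name and the sample data are hypothetical.

```python
def dead_reckon(accelerations, dt, x0=0.0, v0=0.0):
    """Integrate acceleration samples (m/s^2) taken every dt seconds.

    Each sample is treated as constant over its interval, so position is
    advanced with x += v*dt + a*dt^2/2 before velocity is updated.
    Returns the final (position, velocity).
    """
    x, v = x0, v0
    for a in accelerations:
        x += v * dt + 0.5 * a * dt * dt
        v += a * dt
    return x, v


# 10 s of constant 2 m/s^2 acceleration sampled at 10 Hz.
samples = [2.0] * 100
position, velocity = dead_reckon(samples, dt=0.1)
print(f"position = {position:.2f} m, velocity = {velocity:.2f} m/s")
```

Because position comes from integrating twice, any constant bias in the accelerometer readings grows quadratically with time in the position estimate, which is one reason practical systems are sensitive to sensor quality.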

A gyrocompass, employed for navigation of seagoing vessels, finds geographic north. It does so, not by sensing the Earth's magnetic field, but by using inertial space as its reference. The outer casing of the gyrocompass device is held in such a way that it remains aligned with the local plumb line. When the gyroscope wheel inside the gyrocompass device is spun up, the way the gyroscope wheel is suspended causes the gyroscope wheel to gradually align its spinning axis with the Earth's axis. Alignment with the Earth's axis is the only direction for which the gyroscope's spinning axis can be stationary with respect to the Earth and not be required to change direction with respect to inertial space. After being spun up, a gyrocompass can reach the direction of alignment with the Earth's axis in as little as a quarter of an hour.[46]

Newtonian mechanics

Classical theories that use the Galilean transformation postulate the equivalence of all inertial reference frames. Some theories may even postulate the existence of a privileged frame which provides absolute space and absolute time. The Galilean transformation transforms coordinates from one inertial reference frame, \mathbf{s}, to another, \mathbf{s}^{\prime}, by simple addition or subtraction of coordinates:

\mathbf{r}^{\prime} = \mathbf{r} - \mathbf{r}_{0} - \mathbf{v} t

t^{\prime} = t - t_{0}
where r_0 and t_0 represent shifts in the origin of space and time, and v is the relative velocity of the two inertial reference frames. Under Galilean transformations, the time t_2 − t_1 between two events is the same for all reference frames and the distance between two simultaneous events (or, equivalently, the length of any object, |r_2 − r_1|) is also the same.
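The two invariance claims can be checked directly from the transformation equations. The sketch below is a minimal illustration; the function name and the sample events are made up for the example.

```python
import math


def galilean_transform(r, t, v, r0=(0.0, 0.0, 0.0), t0=0.0):
    """Apply r' = r - r0 - v*t and t' = t - t0 to a single event."""
    r_prime = tuple(ri - r0i - vi * t for ri, r0i, vi in zip(r, r0, v))
    return r_prime, t - t0


V = (10.0, 0.0, 0.0)    # relative velocity of the two frames, m/s
R0 = (1.0, 1.0, 1.0)    # shift of the spatial origin, m
T0 = 2.0                # shift of the time origin, s

# Two simultaneous events (t = 5 s) separated by 5 m in the first frame.
r1_prime, t1_prime = galilean_transform((0.0, 0.0, 0.0), 5.0, V, R0, T0)
r2_prime, t2_prime = galilean_transform((3.0, 4.0, 0.0), 5.0, V, R0, T0)
distance_prime = math.dist(r1_prime, r2_prime)

# A later event, to compare time intervals between frames.
_, t3_prime = galilean_transform((0.0, 0.0, 0.0), 9.0, V, R0, T0)
interval_prime = t3_prime - t1_prime

print(distance_prime, interval_prime)
```

The separation of the simultaneous events is still 5 m and the time interval is still 4 s in the primed frame, whatever values are chosen for `V`, `R0`, and `T0`.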

Special relativity

Einstein's theory of special relativity, like Newtonian mechanics, postulates the equivalence of all inertial reference frames. However, because special relativity postulates that the speed of light in free space is invariant, the transformation between inertial frames is the Lorentz transformation, not the Galilean transformation which is used in Newtonian mechanics. The invariance of the speed of light leads to counter-intuitive phenomena, such as time dilation and length contraction, and the relativity of simultaneity, which have been extensively verified experimentally.[47] The Lorentz transformation reduces to the Galilean transformation as the speed of light approaches infinity or as the relative velocity between frames approaches zero.[48]
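The low-velocity limit can be seen numerically by comparing a one-dimensional Lorentz boost, x′ = γ(x − vt) and t′ = γ(t − vx/c²), with the corresponding Galilean boost. This is a sketch in standard configuration (aligned axes, coincident origins); the function names and sample event are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def lorentz_boost(x, t, v):
    """Coordinates of the event (x, t) in a frame moving at velocity v along x."""
    gamma = lorentz_factor(v)
    return gamma * (x - v * t), gamma * (t - v * x / C ** 2)


def galilean_boost(x, t, v):
    """Newtonian counterpart: x' = x - v*t, t' = t."""
    return x - v * t, t


event_x, event_t = 1000.0, 1.0  # an event 1 km away, 1 s after the origin event

slow_lorentz = lorentz_boost(event_x, event_t, 30.0)
slow_galilean = galilean_boost(event_x, event_t, 30.0)
fast_gamma = lorentz_factor(0.6 * C)

print(slow_lorentz, slow_galilean, fast_gamma)
```

At 30 m/s the two transformations agree to far better than a micrometre and a nanosecond for this event, while at 0.6 c the Lorentz factor is already 1.25, so time dilation and length contraction are no longer negligible.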

General relativity

General relativity is based upon the principle of equivalence:[49][50]
There is no experiment observers can perform to distinguish whether an acceleration arises because of a gravitational force or because their reference frame is accelerating.
— Douglas C. Giancoli, Physics for Scientists and Engineers with Modern Physics, p. 155.
This idea was introduced in Einstein's 1907 article "Principle of Relativity and Gravitation" and later developed in 1911.[51] Support for this principle is found in the Eötvös experiment, which determines whether the ratio of inertial to gravitational mass is the same for all bodies, regardless of size or composition. To date no difference has been found to a few parts in 10^11.[52] For some discussion of the subtleties of the Eötvös experiment, such as the local mass distribution around the experimental site (including a quip about the mass of Eötvös himself), see Franklin.[53]

Einstein’s general theory modifies the distinction between nominally "inertial" and "noninertial" effects by replacing special relativity's "flat" Minkowski Space with a metric that produces non-zero curvature. In general relativity, the principle of inertia is replaced with the principle of geodesic motion, whereby objects move in a way dictated by the curvature of spacetime. As a consequence of this curvature, it is not a given in general relativity that inertial objects moving at a particular rate with respect to each other will continue to do so. This phenomenon of geodesic deviation means that inertial frames of reference do not exist globally as they do in Newtonian mechanics and special relativity.

However, the general theory reduces to the special theory over sufficiently small regions of spacetime, where curvature effects become less important and the earlier inertial frame arguments can come back into play.[54][55] Consequently, modern special relativity is now sometimes described as only a "local theory".[56] "Local" can encompass, for example, the entire Milky Way galaxy: The astronomer Karl Schwarzschild observed the motion of pairs of stars orbiting each other. He found that the two orbits of the stars of such a system lie in a plane, and the perihelion of the orbits of the two stars remains pointing in the same direction with respect to the solar system. Schwarzschild pointed out that that was invariably seen: the direction of the angular momentum of all observed double star systems remains fixed with respect to the direction of the angular momentum of the Solar System. These observations allowed him to conclude that inertial frames inside the galaxy do not rotate with respect to one another, and that the space of the Milky Way is approximately Galilean or Minkowskian.[57]

Yang–Mills theory

Yang–Mills theory is a gauge theory based on the SU(N) group, or more generally any compact, reductive Lie algebra. Yang–Mills theory seeks to describe the behavior of elementary particles using these non-Abelian Lie groups and is at the core of the unification of the electromagnetic force and weak forces (i.e. U(1) × SU(2)) as well as quantum chromodynamics, the theory of the strong force (based on SU(3)). Thus it forms the basis of our understanding of the Standard Model of particle physics.

History and theoretical description

In a private correspondence, Wolfgang Pauli formulated in 1953 a six-dimensional theory of Einstein's field equations of general relativity, extending the five-dimensional theory of Kaluza, Klein, Fock and others to a higher-dimensional internal space.[1] However, there is no evidence that Pauli developed the Lagrangian of a gauge field or the quantization of it. Because Pauli found that his theory "leads to some rather unphysical shadow particles", he refrained from publishing his results formally.[1] Although Pauli did not publish his six-dimensional theory, he gave two talks about it in Zürich.[2] Recent research shows that an extended Kaluza–Klein theory is in general not equivalent to Yang–Mills theory, as the former contains additional terms.[3]

In early 1954, Chen Ning Yang and Robert Mills[4] extended the concept of gauge theory for abelian groups, e.g. quantum electrodynamics, to nonabelian groups to provide an explanation for strong interactions. The idea by Yang–Mills was criticized by Pauli,[5] as the quanta of the Yang–Mills field must be massless in order to maintain gauge invariance. The idea was set aside until 1960, when the concept of particles acquiring mass through symmetry breaking in massless theories was put forward, initially by Jeffrey Goldstone, Yoichiro Nambu, and Giovanni Jona-Lasinio.

This prompted a significant restart of Yang–Mills theory studies that proved successful in the formulation of both electroweak unification and quantum chromodynamics (QCD). The electroweak interaction is described by SU(2) × U(1) group while QCD is an SU(3) Yang–Mills theory. The electroweak theory is obtained by combining SU(2) with U(1), where quantum electrodynamics (QED) is described by a U(1) group, and is replaced in the unified electroweak theory by a U(1) group representing a weak hypercharge rather than electric charge. The massless bosons from the SU(2) × U(1) theory mix after spontaneous symmetry breaking to produce the 3 massive weak bosons, and the photon field. The Standard Model combines the strong interaction with the unified electroweak interaction (unifying the weak and electromagnetic interaction) through the symmetry group SU(2) × U(1) × SU(3). In the current epoch the strong interaction is not unified with the electroweak interaction, but from the observed running of the coupling constants it is believed[citation needed] they all converge to a single value at very high energies.

Phenomenology at lower energies in quantum chromodynamics is not completely understood because of the difficulty of handling such a theory at strong coupling. This may be the reason why confinement has not been theoretically proven, though it is a consistent experimental observation. A proof that QCD confines at low energy is a mathematical problem of great relevance, and the Clay Mathematics Institute has offered a prize to whoever can show that Yang–Mills theory exists and has a mass gap.

Mathematical overview

Yang–Mills theories are a special example of gauge theory with a non-abelian symmetry group given by the Lagrangian
{\mathcal {L}}_{\mathrm {gf} }=-{\frac {1}{2}}\operatorname {Tr} (F^{2})=-{\frac {1}{4}}F^{a\mu \nu }F_{\mu \nu }^{a}
with the generators T^{a} of the Lie algebra, indexed by a and corresponding to the F-quantities (the curvature or field-strength form), satisfying
\operatorname {Tr} (T^{a}T^{b})={\frac {1}{2}}\delta ^{ab},\quad [T^{a},T^{b}]=if^{abc}T^{c},
where the f^{abc} are the structure constants of the Lie algebra, and the covariant derivative is defined as
D_{\mu }=I\partial _{\mu }-igT^{a}A_{\mu }^{a}
where I is the identity matrix (matching the size of the generators), A_{\mu }^{a} is the vector potential, and g is the coupling constant. In four dimensions, the coupling constant g is a pure number, and for an SU(N) group one has a,b,c=1,\ldots ,N^{2}-1.
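The normalization and commutation relations above can be checked numerically in the simplest case. The following is a minimal sketch, assuming the fundamental representation of SU(2) with T^a = σ^a/2 (the Pauli matrices are a choice made here for illustration, and the helper names `trace_metric` and `structure_constants` are hypothetical). It extracts f^{abc} from the commutator by using the trace normalization.

```python
import itertools
import numpy as np

# Pauli matrices; the SU(2) generators in the fundamental representation
# are taken to be T^a = sigma^a / 2 (an illustrative choice).
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
T = sigma / 2


def trace_metric(T):
    """Matrix of Tr(T^a T^b), expected to equal delta^{ab} / 2."""
    n = len(T)
    return np.array([[np.trace(T[a] @ T[b]) for b in range(n)] for a in range(n)])


def structure_constants(T):
    """Extract f^{abc} from [T^a, T^b] = i f^{abc} T^c.

    Multiplying by T^c and taking the trace gives
    Tr([T^a, T^b] T^c) = (i / 2) f^{abc}, hence f^{abc} = -2i Tr([T^a, T^b] T^c).
    """
    n = len(T)
    f = np.zeros((n, n, n))
    for a, b, c in itertools.product(range(n), repeat=3):
        comm = T[a] @ T[b] - T[b] @ T[a]
        f[a, b, c] = (-2j * np.trace(comm @ T[c])).real
    return f


gram = trace_metric(T)
f = structure_constants(T)
```

For SU(2) the number of generators is N^2 - 1 = 3, and the extracted constants coincide with the Levi-Civita symbol, so the index position of a, b, c plays no role.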

The relation
F_{\mu \nu }^{a}=\partial _{\mu }A_{\nu }^{a}-\partial _{\nu }A_{\mu }^{a}+gf^{abc}A_{\mu }^{b}A_{\nu }^{c}
can be derived by the commutator
[D_{\mu },D_{\nu }]=-igT^{a}F_{\mu \nu }^{a}.
The field is self-interacting, and the resulting equations of motion are said to be semilinear, as the nonlinearities appear both with and without derivatives. This means that the theory can be handled by perturbation theory only when the nonlinearities are small.
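The step from the covariant derivative to the field strength can be spelled out as a short worked derivation. The sketch below uses the shorthand A_{\mu }=T^{a}A_{\mu }^{a} (introduced here only for brevity) and lets the commutator act on an arbitrary test field, so that derivatives acting on the test field cancel.

```latex
\begin{aligned}
[D_{\mu },D_{\nu }]
 &=[\partial _{\mu }-igA_{\mu },\,\partial _{\nu }-igA_{\nu }]\\
 &=-ig(\partial _{\mu }A_{\nu }-\partial _{\nu }A_{\mu })+(-ig)^{2}[A_{\mu },A_{\nu }]\\
 &=-igT^{a}(\partial _{\mu }A_{\nu }^{a}-\partial _{\nu }A_{\mu }^{a})
   -g^{2}A_{\mu }^{b}A_{\nu }^{c}\,[T^{b},T^{c}]\\
 &=-igT^{a}(\partial _{\mu }A_{\nu }^{a}-\partial _{\nu }A_{\mu }^{a})
   -ig^{2}f^{bca}T^{a}A_{\mu }^{b}A_{\nu }^{c}\\
 &=-igT^{a}\left(\partial _{\mu }A_{\nu }^{a}-\partial _{\nu }A_{\mu }^{a}
   +gf^{abc}A_{\mu }^{b}A_{\nu }^{c}\right)
  =-igT^{a}F_{\mu \nu }^{a}.
\end{aligned}
```

The last line uses the cyclic property f^{bca}=f^{abc} of the totally antisymmetric structure constants.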

Note that the transition between "upper" ("contravariant") and "lower" ("covariant") vector or tensor components is trivial for a indices (e.g. f^{abc}=f_{abc}), whereas for μ and ν it is nontrivial, corresponding e.g. to the usual Lorentz signature, \eta _{\mu \nu }={\rm {diag}}(+---).

From the given Lagrangian one can derive the equations of motion given by
\partial ^{\mu }F_{\mu \nu }^{a}+gf^{abc}A^{\mu b}F_{\mu \nu }^{c}=0.
Putting F_{\mu \nu }=T^{a}F_{\mu \nu }^{a}, these can be rewritten as
(D^{\mu }F_{\mu \nu })^{a}=0.
A Bianchi identity holds
(D_{\mu }F_{\nu \kappa })^{a}+(D_{\kappa }F_{\mu \nu })^{a}+(D_{\nu }F_{\kappa \mu })^{a}=0
which is equivalent to the Jacobi identity
[D_{\mu },[D_{\nu },D_{\kappa }]]+[D_{\kappa },[D_{\mu },D_{\nu }]]+[D_{\nu },[D_{\kappa },D_{\mu }]]=0
since [D_{\mu },F_{\nu \kappa }^{a}]=D_{\mu }F_{\nu \kappa }^{a}. Defining the dual field-strength tensor {\tilde {F}}^{\mu \nu }={\frac {1}{2}}\varepsilon ^{\mu \nu \rho \sigma }F_{\rho \sigma }, the Bianchi identity can be rewritten as
D_{\mu }{\tilde {F}}^{\mu \nu }=0.
A source J_{\mu }^{a} enters into the equations of motion as
\partial ^{\mu }F_{\mu \nu }^{a}+gf^{abc}A^{b\mu }F_{\mu \nu }^{c}=-J_{\nu }^{a}.
Note that the currents must properly change under gauge group transformations.

A few comments on the physical dimensions of the coupling are in order. In D dimensions, the field scales as [A]=[L^{\frac {2-D}{2}}][citation needed] and so the coupling must scale as [g^{2}]=[L^{D-4}]. This implies that Yang–Mills theory is not renormalizable for dimensions greater than four. Furthermore, for D = 4, the coupling is dimensionless, and both the field and the square of the coupling have the same dimensions as the field and the coupling of a massless quartic scalar field theory. So, these theories share scale invariance at the classical level.
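The scaling quoted above follows from a short power-counting argument. The sketch below assumes natural units (\hbar =c=1), in which the action is dimensionless and a derivative carries dimension L^{-1}; these conventions are not stated in the text and are adopted here for illustration.

```latex
\begin{aligned}
&\text{Kinetic term: } \left[\int d^{D}x\,(\partial A)^{2}\right]=L^{D}\,L^{-2}\,[A]^{2}=1
 \;\Longrightarrow\; [A]=L^{\frac {2-D}{2}},\\
&\text{Covariant derivative: } [gA]=[\partial ]=L^{-1}
 \;\Longrightarrow\; [g]=L^{-1}\,L^{\frac {D-2}{2}}=L^{\frac {D-4}{2}}
 \;\Longrightarrow\; [g^{2}]=L^{D-4}.
\end{aligned}
```

For D > 4 the squared coupling has positive length dimension (negative mass dimension), which is the usual power-counting signal of non-renormalizability. For D = 4 the exponent vanishes and g is a pure number.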


A method of quantizing the Yang–Mills theory is by functional methods, i.e. path integrals. One introduces a generating functional for n-point functions as
Z[j]=\int [dA]\exp \left[-{\frac {i}{2}}\int d^{4}x\operatorname {Tr} (F^{\mu \nu }F_{\mu \nu })+i\int d^{4}x\,j_{\mu }^{a}(x)A^{a\mu }(x)\right],

but this integral has no meaning as it stands, because the vector potential can be chosen arbitrarily owing to the gauge freedom. This problem was already known for quantum electrodynamics but becomes more severe here because of the non-abelian properties of the gauge group. A way out was given by Ludvig Faddeev and Victor Popov with the introduction of a ghost field, which is unphysical: although it obeys Fermi–Dirac statistics, it is a complex scalar field, which violates the spin–statistics theorem. So, we can write the generating functional as
{\begin{aligned}Z[j,{\bar {\varepsilon }},\varepsilon ]&=\int [dA][d{\bar {c}}][dc]\exp \left\{iS_{F}[\partial A,A]+iS_{gf}[\partial A]+iS_{g}[\partial c,\partial {\bar {c}},c,{\bar {c}},A]\right\}\\&\exp \left\{i\int d^{4}xj_{\mu }^{a}(x)A^{a\mu }(x)+i\int d^{4}x[{\bar {c}}^{a}(x)\varepsilon ^{a}(x)+{\bar {\varepsilon }}^{a}(x)c^{a}(x)]\right\}\end{aligned}}

where
S_{F}=-{\frac {1}{2}}\operatorname {Tr} (F^{\mu \nu }F_{\mu \nu })
for the field,
S_{gf}=-{\frac {1}{2\xi }}(\partial \cdot A)^{2}
for the gauge fixing and
S_{g}=-({\bar {c}}^{a}\partial _{\mu }\partial ^{\mu }c^{a}+g{\bar {c}}^{a}f^{abc}\partial _{\mu }A^{b\mu }c^{c})
for the ghost. This is the expression commonly used to derive Feynman's rules. Here c^{a} is the ghost field, while ξ fixes the choice of gauge for the quantization.

These rules for Feynman diagrams can be obtained when the generating functional given above is rewritten as
{\begin{aligned}Z[j,{\bar {\varepsilon }},\varepsilon ]&=\exp \left(-ig\int d^{4}x\,{\frac {\delta }{i\delta {\bar {\varepsilon }}^{a}(x)}}f^{abc}\partial _{\mu }{\frac {i\delta }{\delta j_{\mu }^{b}(x)}}{\frac {i\delta }{\delta \varepsilon ^{c}(x)}}\right)\\&\qquad \times \exp \left(-ig\int d^{4}xf^{abc}\partial _{\mu }{\frac {i\delta }{\delta j_{\nu }^{a}(x)}}{\frac {i\delta }{\delta j_{\mu }^{b}(x)}}{\frac {i\delta }{\delta j^{c\nu }(x)}}\right)\\&\qquad \qquad \times \exp \left(-i{\frac {g^{2}}{4}}\int d^{4}xf^{abc}f^{ars}{\frac {i\delta }{\delta j_{\mu }^{b}(x)}}{\frac {i\delta }{\delta j_{\nu }^{c}(x)}}{\frac {i\delta }{\delta j^{r\mu }(x)}}{\frac {i\delta }{\delta j^{s\nu }(x)}}\right)\\&\qquad \qquad \qquad \times Z_{0}[j,{\bar {\varepsilon }},\varepsilon ]\end{aligned}}

with
Z_{0}[j,{\bar {\varepsilon }},\varepsilon ]=\exp \left(-\int d^{4}xd^{4}y{\bar {\varepsilon }}^{a}(x)C^{ab}(x-y)\varepsilon ^{b}(y)\right)\exp \left({\tfrac {1}{2}}\int d^{4}xd^{4}yj_{\mu }^{a}(x)D^{ab\mu \nu }(x-y)j_{\nu }^{b}(y)\right)

being the generating functional of the free theory. Expanding in g and computing the functional derivatives, we can obtain all the n-point functions in perturbation theory. Using the LSZ reduction formula, we get from the n-point functions the corresponding process amplitudes, cross sections and decay rates. The theory is renormalizable, and corrections are finite at any order of perturbation theory.

For quantum electrodynamics the ghost field decouples because the gauge group is abelian. This can be seen from the coupling between the gauge field and the ghost field, which is {\bar {c}}^{a}f^{abc}\partial _{\mu }A^{b\mu }c^{c}. In the abelian case, all the structure constants f^{abc} are zero and so there is no coupling. In the non-abelian case, the ghost field appears as a useful way to rewrite the quantum field theory without physical consequences for the observables of the theory, such as cross sections or decay rates.

One of the most important results obtained for Yang–Mills theory is asymptotic freedom. This result can be obtained by assuming that the coupling constant g is small (so that the nonlinearities are small), as it is at high energies, and applying perturbation theory. The result is relevant because a Yang–Mills theory that describes the strong interaction and exhibits asymptotic freedom permits a proper treatment of experimental results from deep inelastic scattering.
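Asymptotic freedom can be illustrated with the standard one-loop running of the coupling. The formula below is not given in the text; it is the textbook one-loop result for an SU(N) gauge theory with n_f fundamental Dirac fermions, written in terms of α = g²/(4π). The function names and the reference values are hypothetical and chosen only for illustration.

```python
import math


def beta0(n_colors: int, n_flavors: int = 0) -> float:
    """One-loop coefficient b0 = 11 N / 3 - 2 n_f / 3 (textbook result, assumed here)."""
    return 11.0 * n_colors / 3.0 - 2.0 * n_flavors / 3.0


def alpha_one_loop(alpha_ref: float, mu_ref: float, mu: float,
                   n_colors: int = 3, n_flavors: int = 0) -> float:
    """One-loop running coupling alpha(mu) given alpha(mu_ref) = alpha_ref.

    alpha(mu) = alpha_ref / (1 + alpha_ref * b0 / (2 pi) * ln(mu / mu_ref))
    """
    b0 = beta0(n_colors, n_flavors)
    denom = 1.0 + alpha_ref * b0 / (2.0 * math.pi) * math.log(mu / mu_ref)
    if denom <= 0.0:
        # The one-loop expression diverges here; perturbation theory is not reliable.
        raise ValueError("scale lies at or below the one-loop pole")
    return alpha_ref / denom


# Illustrative numbers in arbitrary energy units (not fitted to any data).
alpha_low = alpha_one_loop(0.3, 1.0, 1.0)
alpha_high = alpha_one_loop(0.3, 1.0, 100.0)
```

With b0 > 0 the coupling decreases logarithmically as the scale grows, which justifies perturbation theory a posteriori at high energies. Towards low scales the same formula grows without bound, signalling that perturbation theory breaks down there.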

To obtain the behavior of Yang–Mills theory at high energies, and so to prove asymptotic freedom, one applies perturbation theory assuming a small coupling. This is verified a posteriori in the ultraviolet limit. In the opposite limit, the infrared limit, the situation is reversed, as the coupling is too large for perturbation theory to be reliable. Most of the difficulties that research meets lie in handling the theory at low energies. That is the interesting case, being inherent to the description of hadronic matter and, more generally, to all the observed bound states of gluons and quarks and their confinement (see hadrons). The most widely used method to study the theory in this limit is to try to solve it on computers (see lattice gauge theory). In this case, large computational resources are needed to be sure that the correct limit of infinite volume (smaller lattice spacing) is obtained. This is the limit with which the results must be compared. Smaller spacing and larger coupling are not independent of each other, and larger computational resources are needed for each. As of today, the situation appears somewhat satisfactory for the hadronic spectrum and the computation of the gluon and ghost propagators, but the glueball and hybrid spectra are still an open question in view of the experimental observation of such exotic states. Indeed, the σ resonance[6][7] is not seen in any such lattice computations, and contrasting interpretations have been put forward. This is a hotly debated issue.
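To give a flavor of what "solving the theory on computers" involves, here is a minimal sketch of the basic lattice building block: SU(2) link variables on a small two-dimensional periodic lattice and the gauge-invariant average plaquette. The lattice size, the dimension, the gauge group and all function names are assumptions made for illustration. No Monte Carlo sampling is performed; the code only demonstrates that the plaquette trace is unchanged by a local gauge transformation.

```python
import numpy as np


def random_su2(rng):
    """Random SU(2) matrix a0*1 + i*(a . sigma) built from a unit 4-vector a."""
    a = rng.normal(size=4)
    a /= np.linalg.norm(a)
    return np.array([[a[0] + 1j * a[3], a[2] + 1j * a[1]],
                     [-a[2] + 1j * a[1], a[0] - 1j * a[3]]])


def average_plaquette(U):
    """Average of (1/2) Re Tr of the elementary plaquette over an L x L periodic lattice.

    U[x, y, mu] is the link matrix leaving site (x, y) in direction mu.
    """
    L = U.shape[0]
    total = 0.0
    for x in range(L):
        for y in range(L):
            xp, yp = (x + 1) % L, (y + 1) % L
            P = (U[x, y, 0] @ U[xp, y, 1]
                 @ U[x, yp, 0].conj().T @ U[x, y, 1].conj().T)
            total += 0.5 * np.trace(P).real
    return total / L**2


def gauge_transform(U, G):
    """Apply U_mu(n) -> G(n) U_mu(n) G(n + mu)^dagger with site matrices G[x, y]."""
    L = U.shape[0]
    V = np.empty_like(U)
    for x in range(L):
        for y in range(L):
            V[x, y, 0] = G[x, y] @ U[x, y, 0] @ G[(x + 1) % L, y].conj().T
            V[x, y, 1] = G[x, y] @ U[x, y, 1] @ G[x, (y + 1) % L].conj().T
    return V


rng = np.random.default_rng(0)
L = 4
# Random ("hot") configuration of links and a random gauge transformation.
U = np.array([[[random_su2(rng) for _ in range(2)] for _ in range(L)] for _ in range(L)])
G = np.array([[random_su2(rng) for _ in range(L)] for _ in range(L)])
# Trivial ("cold") configuration with every link equal to the identity.
cold = np.broadcast_to(np.eye(2, dtype=complex), (L, L, 2, 2, 2)).copy()

plaq_before = average_plaquette(U)
plaq_after = average_plaquette(gauge_transform(U, G))
```

An actual lattice computation would sample such configurations in four dimensions with a weight given by the lattice action and then extrapolate in volume and lattice spacing, which is where the computational cost mentioned above arises.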

Open problems

Yang–Mills theories met with general acceptance in the physics community after Gerard 't Hooft, in 1972, worked out their renormalization, relying on a formulation of the problem worked out by his advisor Martinus Veltman. (Their work[8] was recognized by the 1999 Nobel Prize in Physics.) Renormalizability is obtained even if the gauge bosons described by this theory are massive, as in the electroweak theory, provided the mass is only an "acquired" one, generated by the Higgs mechanism.

On the mathematical side, Yang–Mills theory is a very active field of research, yielding e.g. invariants of differentiable structures on four-dimensional manifolds via the work of Simon Donaldson. Furthermore, the field of Yang–Mills theories was included in the Clay Mathematics Institute's list of "Millennium Prize Problems". Here the prize problem consists, in particular, of proving the conjecture that the lowest excitations of a pure Yang–Mills theory (i.e. without matter fields) have a finite mass gap with regard to the vacuum state. Another open problem, connected with this conjecture, is a proof of the confinement property in the presence of additional fermions.

In physics, the study of Yang–Mills theories usually starts not from perturbative analysis or analytical methods but, more recently, from the systematic application of numerical methods to lattice gauge theories.

Scientific community

