A Medley of Potpourri

Friday, August 3, 2018

Path integral formulation

From Wikipedia, the free encyclopedia

The path integral formulation of quantum mechanics is a description of quantum theory that generalizes the action principle of classical mechanics. It replaces the classical notion of a single, unique classical trajectory for a system with a sum, or functional integral, over an infinity of quantum-mechanically possible trajectories to compute a quantum amplitude.

This formulation has proven crucial to the subsequent development of theoretical physics, because manifest Lorentz covariance (time and space components of quantities enter equations in the same way) is easier to achieve than in the operator formalism of canonical quantization. Unlike previous methods, the path integral allows a physicist to easily change coordinates between very different canonical descriptions of the same quantum system. Another advantage is that it is in practice easier to guess the correct form of the Lagrangian of a theory, which naturally enters the path integrals (for interactions of a certain type, these are coordinate space or Feynman path integrals), than the Hamiltonian. Possible downsides of the approach include that unitarity (this is related to conservation of probability; the probabilities of all physically possible outcomes must add up to one) of the S-matrix is obscure in the formulation. The path-integral approach has been proved to be equivalent to the other formalisms of quantum mechanics and quantum field theory. Thus, by deriving either approach from the other, problems associated with one or the other approach (as exemplified by Lorentz covariance or unitarity) go away.^[1]

The path integral also relates quantum and stochastic processes, and this provided the basis for the grand synthesis of the 1970s, which unified quantum field theory with the statistical field theory of a fluctuating field near a second-order phase transition. The Schrödinger equation is a diffusion equation with an imaginary diffusion constant, and the path integral is an analytic continuation of a method for summing up all possible random walks.

The basic idea of the path integral formulation can be traced back to Norbert Wiener, who introduced the Wiener integral for solving problems in diffusion and Brownian motion.^[2] This idea was extended to the use of the Lagrangian in quantum mechanics by P. A. M. Dirac in his 1933 article.^[3]^[4] The complete method was developed in 1948 by Richard Feynman. Some preliminaries were worked out earlier in his doctoral work under the supervision of John Archibald Wheeler. The original motivation stemmed from the desire to obtain a quantum-mechanical formulation for the Wheeler–Feynman absorber theory using a Lagrangian (rather than a Hamiltonian) as a starting point.

These are just three of the paths that contribute to the quantum amplitude for a particle moving from point A at some time $t_0$ to point B at some other time $t_1$.

Quantum action principle

In quantum mechanics, as in classical mechanics, the Hamiltonian is the generator of time translations. This means that the state at a slightly later time differs from the state at the current time by the result of acting with the Hamiltonian operator (multiplied by the negative imaginary unit, $-i$). For states with a definite energy, this is a statement of the de Broglie relation between frequency and energy, and the general relation is consistent with that plus the superposition principle.

The Hamiltonian in classical mechanics is derived from a Lagrangian, which is a more fundamental quantity relative to special relativity. The Hamiltonian indicates how to march forward in time, but the time is different in different reference frames. The Lagrangian is a Lorentz scalar, while the Hamiltonian is the time component of a four-vector. So the Hamiltonian is different in different frames, and this type of symmetry is not apparent in the original formulation of quantum mechanics.
The Hamiltonian is a function of the position and momentum at one time, and it determines the position and momentum a little later. The Lagrangian is a function of the position now and the position a little later (or, equivalently for infinitesimal time separations, it is a function of the position and velocity). The relation between the two is by a Legendre transformation, and the condition that determines the classical equations of motion (the Euler–Lagrange equations) is that the action has an extremum.

In quantum mechanics, the Legendre transform is hard to interpret, because the motion is not over a definite trajectory. In classical mechanics, with discretization in time, the Legendre transform becomes

\varepsilon H=p(t){\big (}q(t+\varepsilon )-q(t){\big )}-\varepsilon L

and

p={\frac {\partial L}{\partial {\dot {q}}}},

where the partial derivative with respect to $\dot{q}$ holds $q(t+\varepsilon)$ fixed. The inverse Legendre transform is

\varepsilon L=\varepsilon p{\dot {q}}-\varepsilon H,

where

{\dot {q}}={\frac {\partial H}{\partial p}},

and the partial derivative now is with respect to $p$ at fixed $q$.
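The discretized Legendre transform above can be checked numerically. The following is a minimal sketch, assuming an illustrative Lagrangian $L = \tfrac{1}{2}m\dot{q}^2 - \tfrac{1}{2}kq^2$ with $m = k = 1$; the function names are hypothetical and the momentum is obtained by a central finite difference rather than analytically.

```python
def lagrangian(q, v, m=1.0, k=1.0):
    """Illustrative Lagrangian L = (1/2) m v^2 - (1/2) k q^2 (an assumption)."""
    return 0.5 * m * v * v - 0.5 * k * q * q


def momentum(q, v, h=1e-6):
    """p = dL/dv, approximated by a central finite difference in v."""
    return (lagrangian(q, v + h) - lagrangian(q, v - h)) / (2.0 * h)


def hamiltonian_from_legendre(q_now, q_next, eps):
    """Apply eps*H = p*(q(t+eps) - q(t)) - eps*L for one time step."""
    v = (q_next - q_now) / eps          # discrete velocity
    p = momentum(q_now, v)
    eps_h = p * (q_next - q_now) - eps * lagrangian(q_now, v)
    return p, eps_h / eps


p, H = hamiltonian_from_legendre(q_now=0.3, q_next=0.32, eps=0.01)
# For this Lagrangian the Hamiltonian should be p^2/2 + q^2/2.
expected_H = p * p / 2.0 + 0.5 * 0.3 ** 2
```

For this quadratic example the value of `H` returned by the discrete transform agrees with the familiar kinetic-plus-potential form up to finite-difference rounding.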

In quantum mechanics, the state is a superposition of different states with different values of $q$, or different values of $p$, and the quantities $p$ and $q$ can be interpreted as noncommuting operators. The operator $p$ is only definite on states that are indefinite with respect to $q$. So consider two states separated in time and act with the operator corresponding to the Lagrangian:

e^{i{\big [}p{\big (}q(t+\varepsilon )-q(t){\big )}-\varepsilon H(p,q){\big ]}}.

If the multiplications implicit in this formula are reinterpreted as matrix multiplications, the first factor is

e^{-ipq(t)},

and if this is also interpreted as a matrix multiplication, the sum over all states integrates over all $q(t)$, and so it takes the Fourier transform in $q(t)$ to change basis to $p(t)$. That is the action on the Hilbert space: change basis to $p$ at time $t$.

Next comes

e^{-i\varepsilon H(p,q)},

or evolve an infinitesimal time into the future.

Finally, the last factor in this interpretation is

e^{ipq(t+\varepsilon )},

which means change basis back to $q$ at a later time.

This is not very different from just ordinary time evolution: the $H$ factor contains all the dynamical information; it pushes the state forward in time. The first part and the last part are just Fourier transforms to change to a pure $q$ basis from an intermediate $p$ basis. In Dirac's words:

...we see that the integrand in (11) must be of the form $e^{iF/h}$, where $F$ is a function of $q_T, q_1, q_2, \dots, q_m, q_t$, which remains finite as $h$ tends to zero. Let us now picture one of the intermediate $q$s, say $q_k$, as varying continuously while the other ones are fixed. Owing to the smallness of $h$, we shall then in general have $F/h$ varying extremely rapidly. This means that $e^{iF/h}$ will vary periodically with a very high frequency about the value zero, as a result of which its integral will be practically zero. The only important part in the domain of integration of $q_k$ is thus that for which a comparatively large variation in $q_k$ produces only a very small variation in $F$. This part is the neighbourhood of a point for which $F$ is stationary with respect to small variations in $q_k$. We can apply this argument to each of the variables of integration ... and obtain the result that the only important part in the domain of integration is that for which $F$ is stationary for small variations in all intermediate $q$s. ... We see that $F$ has for its classical analogue $\int_T^t L\,dt$, which is just the action function, which classical mechanics requires to be stationary for small variations in all the intermediate $q$s. This shows the way in which equation (11) goes over into classical results when $h$ becomes extremely small.

Dirac (1933), p. 69

Another way of saying this is that since the Hamiltonian is naturally a function of $p$ and $q$, exponentiating this quantity and changing basis from $p$ to $q$ at each step allows the matrix element of $H$ to be expressed as a simple function along each path. This function is the quantum analog of the classical action. This observation is due to Paul Dirac.^[5]

Dirac further noted that one could square the time-evolution operator in the $S$ representation:

e^{i\varepsilon S},

and this gives the time-evolution operator between time $t$ and time $t + 2\varepsilon$. While in the $H$ representation the quantity that is being summed over the intermediate states is an obscure matrix element, in the $S$ representation it is reinterpreted as a quantity associated to the path. In the limit that one takes a large power of this operator, one reconstructs the full quantum evolution between two states, the early one with a fixed value of $q(0)$ and the later one with a fixed value of $q(t)$. The result is a sum over paths with a phase, which is the quantum action. Crucially, Dirac identified in this article the deep quantum-mechanical reason for the principle of least action controlling the classical limit (see the quotation above).

Feynman's interpretation

Dirac's work did not provide a precise prescription to calculate the sum over paths, and he did not show that one could recover the Schrödinger equation or the canonical commutation relations from this rule. This was done by Feynman.^{[nb 1]} That is, the classical path arises naturally in the classical limit.

Feynman showed that Dirac's quantum action was, for most cases of interest, simply equal to the classical action, appropriately discretized. This means that the classical action is the phase acquired by quantum evolution between two fixed endpoints. He proposed to recover all of quantum mechanics from the following postulates:

1. The probability for an event is given by the squared modulus of a complex number called the "probability amplitude".
2. The probability amplitude is given by adding together the contributions of all paths in configuration space.
3. The contribution of a path is proportional to $e^{iS/\hbar}$, where $S$ is the action given by the time integral of the Lagrangian along the path.

In order to find the overall probability amplitude for a given process, then, one adds up, or integrates, the amplitude of the 3rd postulate over the space of all possible paths of the system in between the initial and final states, including those that are absurd by classical standards. In calculating the probability amplitude for a single particle to go from one space-time coordinate to another, it is correct to include paths in which the particle describes elaborate curlicues, curves in which the particle shoots off into outer space and flies back again, and so forth. The path integral assigns to all these amplitudes equal weight but varying phase, or argument of the complex number. Contributions from paths wildly different from the classical trajectory may be suppressed by interference (see below).

Feynman showed that this formulation of quantum mechanics is equivalent to the canonical approach to quantum mechanics when the Hamiltonian is at most quadratic in the momentum. An amplitude computed according to Feynman's principles will also obey the Schrödinger equation for the Hamiltonian corresponding to the given action.

The path integral formulation of quantum field theory represents the transition amplitude (corresponding to the classical correlation function) as a weighted sum of all possible histories of the system from the initial to the final state. A Feynman diagram is a graphical representation of a perturbative contribution to the transition amplitude.

Path integral in quantum mechanics

Time-slicing derivation

One common approach to deriving the path integral formula is to divide the time interval into small pieces. Once this is done, the Trotter product formula tells us that the noncommutativity of the kinetic and potential energy operators can be ignored.
For a particle in a smooth potential, the path integral is approximated by zigzag paths, which in one dimension is a product of ordinary integrals. For the motion of the particle from position $x_a$ at time $t_a$ to $x_b$ at time $t_b$, the time sequence $t_a = t_0 < t_1 < \cdots < t_n < t_{n+1} = t_b$ can be divided up into $n + 1$ smaller segments $t_j - t_{j-1}$, where $j = 1, \ldots, n + 1$, of fixed duration

\varepsilon =\Delta t={\frac {t_{b}-t_{a}}{n+1}}.

This process is called time-slicing.

An approximation for the path integral can be computed as proportional to

{\displaystyle \int \limits _{-\infty }^{+\infty }\cdots \int \limits _{-\infty }^{+\infty }\exp \left({\frac {i}{\hbar }}\int _{t_{a}}^{t_{b}}L{\big (}x(t),v(t){\big )}\,dt\right)\,dx_{0}\,\cdots \,dx_{n},}

where $L(x, v)$ is the Lagrangian of the one-dimensional system with position variable $x(t)$ and velocity $v = \dot{x}(t)$ considered (see below), and $dx_j$ corresponds to the position at the $j$th time step, if the time integral is approximated by a sum of $n$ terms.^{[nb 2]}

In the limit $n \to \infty$, this becomes a functional integral, which, apart from a nonessential factor, is directly the product of the probability amplitudes $\langle x_b, t_b | x_a, t_a \rangle$ (more precisely, since one must work with a continuous spectrum, the respective densities) to find the quantum mechanical particle at $t_a$ in the initial state $x_a$ and at $t_b$ in the final state $x_b$.

Actually $L$ is the classical Lagrangian of the one-dimensional system considered,

L(x,{\dot {x}})=T-V={\frac {1}{2}}m|{\dot {x}}|^{2}-V(x)

and the abovementioned "zigzagging" corresponds to the appearance of the terms

{\displaystyle \exp \left({\frac {i}{\hbar }}\varepsilon \sum _{j=1}^{n+1}L\left({\tilde {x}}_{j},{\frac {x_{j}-x_{j-1}}{\varepsilon }},j\right)\right)}

in the Riemann sum approximating the time integral, which are finally integrated over $x_1$ to $x_n$ with the integration measure $dx_1 \cdots dx_n$. Here $\tilde{x}_j$ is an arbitrary value of the interval corresponding to $j$, e.g. its center, $\frac{x_j + x_{j-1}}{2}$.

Thus, in contrast to classical mechanics, not only does the stationary path contribute, but actually all virtual paths between the initial and the final point also contribute.
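The Riemann sum in the exponent can be evaluated for any zigzag path. The sketch below is illustrative only: it assumes the midpoint choice for $\tilde{x}_j$, a free particle by default, and hypothetical names (`discretized_action`, `straight`, `wiggly`). It compares the discretized action of the straight-line path with that of a path that wiggles between the same endpoints.

```python
def discretized_action(xs, eps, m=1.0, potential=lambda x: 0.0):
    """eps * sum_j L(x_mid_j, (x_j - x_{j-1}) / eps) along the zigzag path xs."""
    total = 0.0
    for x_prev, x_next in zip(xs, xs[1:]):
        x_mid = 0.5 * (x_prev + x_next)      # midpoint choice for x-tilde_j
        v = (x_next - x_prev) / eps          # velocity on this segment
        total += eps * (0.5 * m * v * v - potential(x_mid))
    return total


n = 9                                        # number of intermediate points
t_a, t_b = 0.0, 2.0
x_a, x_b = 0.0, 1.0
eps = (t_b - t_a) / (n + 1)

# Straight line between the endpoints (the classical free-particle path).
straight = [x_a + (x_b - x_a) * j / (n + 1) for j in range(n + 2)]
# Same endpoints, but every odd interior point is displaced by 0.3.
wiggly = [x + (0.3 if 0 < j < n + 1 and j % 2 else 0.0)
          for j, x in enumerate(straight)]

s_straight = discretized_action(straight, eps)
s_wiggly = discretized_action(wiggly, eps)
```

Both paths enter the sum with a phase of unit modulus; what differs is the value of the action, which for the free particle is smallest on the straight path, where it equals $m(x_b - x_a)^2 / 2(t_b - t_a)$.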

Path integral formula

In terms of the wave function in the position representation, the path integral formula reads as follows:

{\displaystyle \psi (x,t)={\frac {1}{Z}}\int _{\mathbf {x} (0)=x}{\mathcal {D}}\mathbf {x} \,e^{iS[\mathbf {x} ,{\dot {\mathbf {x} }}]}\psi _{0}(\mathbf {x} (t))\,}

where $\mathcal{D}\mathbf{x}$ denotes integration over all paths $\mathbf{x}$ with $\mathbf{x}(0) = x$ and where $Z$ is a normalization factor. Here $S$ is the action, given by

S[\mathbf {x} ,{\dot {\mathbf {x} }}]=\int dt\,L(\mathbf {x} (t),{\dot {\mathbf {x} }}(t))


The diagram shows the contribution to the path integral of a free particle for a set of paths.

Free particle

The path integral representation gives the quantum amplitude to go from point $x$ to point $y$ as an integral over all paths. For a free-particle action (for simplicity let $m = 1$, $\hbar = 1$)

S=\int {\frac {{\dot {x}}^{2}}{2}}\,dt,

the integral can be evaluated explicitly.

To do this, it is convenient to start without the factor $i$ in the exponential, so that large deviations are suppressed by small numbers, not by cancelling oscillatory contributions:

K(x-y;T)=\int _{x(0)=x}^{x(T)=y}\exp \left(-\int _{0}^{T}{\frac {{\dot {x}}^{2}}{2}}\,dt\right)\,Dx.

Splitting the integral into time slices:

{\displaystyle K(x,y;T)=\int _{x(0)=x}^{x(T)=y}\prod _{t}\exp \left(-{\tfrac {1}{2}}\left({\frac {x(t+\varepsilon )-x(t)}{\varepsilon }}\right)^{2}\varepsilon \right)\,Dx,}

where the $Dx$ is interpreted as a finite collection of integrations at each integer multiple of $\varepsilon$. Each factor in the product is a Gaussian as a function of $x(t+\varepsilon)$ centered at $x(t)$ with variance $\varepsilon$. The multiple integrals are a repeated convolution of this Gaussian $G_\varepsilon$ with copies of itself at adjacent times:

K(x-y;T)=G_{\varepsilon }*G_{\varepsilon }*\cdots *G_{\varepsilon },

where the number of convolutions is $T/\varepsilon$. The result is easy to evaluate by taking the Fourier transform of both sides, so that the convolutions become multiplications:

{\tilde {K}}(p;T)={\tilde {G}}_{\varepsilon }(p)^{T/\varepsilon }.

The Fourier transform of the Gaussian $G$ is another Gaussian of reciprocal variance:

{\tilde {G}}_{\varepsilon }(p)=e^{-{\frac {\varepsilon p^{2}}{2}}},

and the result is

{\tilde {K}}(p;T)=e^{-{\frac {Tp^{2}}{2}}}.

The Fourier transform gives $K$, and it is a Gaussian again with reciprocal variance:

K(x-y;T)\propto e^{-{\frac {(x-y)^{2}}{2T}}}.

The proportionality constant is not really determined by the time-slicing approach; only the ratio of values for different endpoint choices is determined. The proportionality constant should be chosen to ensure that between each two time slices the time evolution is quantum-mechanically unitary, but a more illuminating way to fix the normalization is to consider the path integral as a description of a stochastic process.

The result has a probability interpretation. The sum over all paths of the exponential factor can be seen as the sum over each path of the probability of selecting that path. The probability is the product over each segment of the probability of selecting that segment, so that each segment is probabilistically independently chosen. The fact that the answer is a Gaussian spreading linearly in time is the central limit theorem, which can be interpreted as the first historical evaluation of a statistical path integral.

The probability interpretation gives a natural normalization choice. The path integral should be defined so that

\int K(x-y;T)\,dy=1.

This condition normalizes the Gaussian and produces a kernel that obeys the diffusion equation:

{\frac {d}{dt}}K(x;T)={\frac {\nabla ^{2}}{2}}K.
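The repeated-convolution argument can be reproduced on a grid. This is a rough sketch under stated assumptions: the grid spacing, the cutoff at $|x| \le 6$, and the choice $\varepsilon = 0.25$ with four time slices are arbitrary illustrative values, and each slice uses the unit-normalized Gaussian of variance $\varepsilon$ (the probabilistic normalization discussed above). The function names are hypothetical.

```python
import math


def gaussian(x, var):
    """Unit-normalized Gaussian with the given variance."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def convolve(a, b, dx):
    """Discrete convolution of two functions sampled on the same symmetric grid."""
    n, c = len(a), len(a) // 2               # c is the index of x = 0
    out = []
    for i in range(n):
        s = 0.0
        for j in range(n):
            k = i - j + c                    # index of x_i - x_j
            if 0 <= k < n:
                s += a[j] * b[k]
        out.append(s * dx)
    return out


dx, half = 0.05, 120
grid = [(i - half) * dx for i in range(2 * half + 1)]

eps, steps = 0.25, 4                         # four slices of duration eps
T = eps * steps
g = [gaussian(x, eps) for x in grid]

kernel = g
for _ in range(steps - 1):                   # G * G * ... * G with `steps` factors
    kernel = convolve(kernel, g, dx)
```

On this grid the resulting `kernel` should match a single Gaussian of variance $T$, which is the statement that variances add under convolution.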

For oscillatory path integrals, ones with an $i$ in the numerator, the time slicing produces convolved Gaussians, just as before. Now, however, the convolution product is marginally singular, since it requires careful limits to evaluate the oscillating integrals. To make the factors well defined, the easiest way is to add a small imaginary part to the time increment $\varepsilon$. This is closely related to Wick rotation. Then the same convolution argument as before gives the propagation kernel:

K(x-y;T)\propto e^{\frac {i(x-y)^{2}}{2T}},

which, with the same normalization as before (not the sum-squares normalization – this function has a divergent norm), obeys a free Schrödinger equation:

{\frac {d}{dt}}K(x;T)=i{\frac {\nabla ^{2}}{2}}K.

This means that any superposition of $K$s will also obey the same equation, by linearity. Defining

\psi _{t}(y)=\int \psi _{0}(x)K(x-y;t)\,dx=\int \psi _{0}(x)\int _{x(0)=x}^{x(t)=y}e^{iS}\,Dx,

then $\psi_t$ obeys the free Schrödinger equation just as $K$ does:

i{\frac {\partial }{\partial t}}\psi _{t}=-{\frac {\nabla ^{2}}{2}}\psi _{t}.

Simple harmonic oscillator

The Lagrangian for the simple harmonic oscillator is

{\mathcal {L}}={\tfrac {1}{2}}m{\dot {x}}^{2}-{\tfrac {1}{2}}m\omega ^{2}x^{2}.

Write its trajectory $x(t)$ as the classical trajectory plus some perturbation, $x(t) = x_c(t) + \delta x(t)$, and the action as $S = S_c + \delta S$. The classical trajectory can be written as

{\displaystyle x_{\text{c}}(t)=x_{i}{\frac {\sin \omega (t_{f}-t)}{\sin \omega (t_{f}-t_{i})}}+x_{f}{\frac {\sin \omega (t-t_{i})}{\sin \omega (t_{f}-t_{i})}}.}

This trajectory yields the classical action

{\displaystyle {\begin{aligned}S_{\text{c}}&=\int _{t_{i}}^{t_{f}}{\mathcal {L}}\,dt=\int _{t_{i}}^{t_{f}}\left({\tfrac {1}{2}}m{\dot {x}}^{2}-{\tfrac {1}{2}}m\omega ^{2}x^{2}\right)\,dt\\[6pt]&={\frac {1}{2}}m\omega \left({\frac {(x_{i}^{2}+x_{f}^{2})\cos \omega (t_{f}-t_{i})-2x_{i}x_{f}}{\sin \omega (t_{f}-t_{i})}}\right)~.\end{aligned}}}
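As a consistency check, the closed form for $S_c$ can be compared with a direct numerical integration of the Lagrangian along $x_c(t)$. The sketch below uses a midpoint rule; the parameter values ($m = 1$, $\omega = 1.3$, $x_i = 0.4$, $x_f = -0.7$, $t_i = 0$, $t_f = 1$) and the function names are arbitrary illustrative choices.

```python
import math


def classical_path(t, x_i, x_f, t_i, t_f, omega):
    """Position and velocity on the classical oscillator path at time t."""
    s = math.sin(omega * (t_f - t_i))
    x = (x_i * math.sin(omega * (t_f - t)) + x_f * math.sin(omega * (t - t_i))) / s
    v = omega * (-x_i * math.cos(omega * (t_f - t))
                 + x_f * math.cos(omega * (t - t_i))) / s
    return x, v


def action_numeric(x_i, x_f, t_i, t_f, omega, m=1.0, steps=20000):
    """Midpoint-rule integral of L = m v^2 / 2 - m omega^2 x^2 / 2 along the path."""
    dt = (t_f - t_i) / steps
    total = 0.0
    for k in range(steps):
        t = t_i + (k + 0.5) * dt
        x, v = classical_path(t, x_i, x_f, t_i, t_f, omega)
        total += (0.5 * m * v * v - 0.5 * m * omega ** 2 * x * x) * dt
    return total


def action_closed_form(x_i, x_f, t_i, t_f, omega, m=1.0):
    """S_c = (m omega / 2) ((x_i^2 + x_f^2) cos(omega T) - 2 x_i x_f) / sin(omega T)."""
    big_t = t_f - t_i
    return 0.5 * m * omega * ((x_i ** 2 + x_f ** 2) * math.cos(omega * big_t)
                              - 2.0 * x_i * x_f) / math.sin(omega * big_t)


args = dict(x_i=0.4, x_f=-0.7, t_i=0.0, t_f=1.0, omega=1.3)
s_num = action_numeric(**args)
s_exact = action_closed_form(**args)
```

The two values agree to within the quadrature error as long as $\omega(t_f - t_i)$ is not a multiple of $\pi$, where the closed form is singular.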

Next, expand the non-classical contribution to the action $\delta S$ as a Fourier series, which gives

{\displaystyle S=S_{\text{c}}+\sum _{n=1}^{\infty }{\tfrac {1}{2}}a_{n}^{2}{\frac {m}{2}}\left({\frac {(n\pi )^{2}}{t_{f}-t_{i}}}-\omega ^{2}(t_{f}-t_{i})\right).}

This means that the propagator is

{\displaystyle {\begin{aligned}K(x_{f},t_{f};x_{i},t_{i})&=Qe^{\frac {iS_{\text{c}}}{\hbar }}\prod _{j=1}^{\infty }{\frac {j\pi }{\sqrt {2}}}\int da_{j}\exp {\left({\frac {i}{2\hbar }}a_{j}^{2}{\frac {m}{2}}\left({\frac {(j\pi )^{2}}{t_{f}-t_{i}}}-\omega ^{2}(t_{f}-t_{i})\right)\right)}\\[6pt]&=e^{\frac {iS_{\text{c}}}{\hbar }}Q\prod _{j=1}^{\infty }\left(1-\left({\frac {\omega (t_{f}-t_{i})}{j\pi }}\right)^{2}\right)^{-{\frac {1}{2}}}\end{aligned}}}

for some normalization

Q={\sqrt {\frac {m}{2\pi i\hbar (t_{f}-t_{i})}}}~.

Using the infinite-product representation of the sinc function,

\prod _{j=1}^{\infty }\left(1-{\frac {x^{2}}{j^{2}}}\right)={\frac {\sin \pi x}{\pi x}},

the propagator can be written as

{\displaystyle K(x_{f},t_{f};x_{i},t_{i})=Qe^{\frac {iS_{\text{c}}}{\hbar }}{\sqrt {\frac {\omega (t_{f}-t_{i})}{\sin \omega (t_{f}-t_{i})}}}=e^{\frac {iS_{c}}{\hbar }}{\sqrt {\frac {m\omega }{2\pi i\hbar \sin \omega (t_{f}-t_{i})}}}.}
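The infinite-product identity used in this step converges slowly but can be checked directly. This is a small sketch with a truncated product; the number of terms and the sample values of $x$ and $\omega(t_f - t_i)$ are arbitrary assumptions.

```python
import math


def sinc_product(x, terms=200000):
    """Truncated product prod_{j=1}^{terms} (1 - x^2 / j^2)."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod *= 1.0 - (x * x) / (j * j)
    return prod


x = 0.3
lhs = sinc_product(x)
rhs = math.sin(math.pi * x) / (math.pi * x)

# Fluctuation factor of the oscillator propagator for omega * (t_f - t_i) = 1:
omega_T = 1.0
prefactor_from_product = sinc_product(omega_T / math.pi) ** -0.5
prefactor_closed_form = math.sqrt(omega_T / math.sin(omega_T))
```

The relative truncation error of the product is roughly $x^2$ divided by the number of terms, so a few hundred thousand factors are enough for agreement to about six digits.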

Let $T = t_f - t_i$. One may write this propagator in terms of energy eigenstates as

{\displaystyle {\begin{aligned}K(x_{f},t_{f};x_{i},t_{i})&=\left({\frac {m\omega }{2\pi i\hbar \sin \omega T}}\right)^{\frac {1}{2}}\exp {\left({\frac {i}{\hbar }}{\tfrac {1}{2}}m\omega {\frac {(x_{i}^{2}+x_{f}^{2})\cos \omega T-2x_{i}x_{f}}{\sin \omega T}}\right)}\\[6pt]&=\sum _{n=0}^{\infty }\exp {\left(-{\frac {iE_{n}T}{\hbar }}\right)}\psi _{n}(x_{f})^{*}\psi _{n}(x_{i})~.\end{aligned}}}

Using the identities $i \sin \omega T = \tfrac{1}{2} e^{i\omega T}\left(1 - e^{-2i\omega T}\right)$ and $\cos \omega T = \tfrac{1}{2} e^{i\omega T}\left(1 + e^{-2i\omega T}\right)$, this amounts to

{\displaystyle K(x_{f},t_{f};x_{i},t_{i})=\left({\frac {m\omega }{\pi \hbar }}\right)^{\frac {1}{2}}e^{\frac {-i\omega T}{2}}\left(1-e^{-2i\omega T}\right)^{-{\frac {1}{2}}}\exp {\left(-{\frac {m\omega }{2\hbar }}\left(\left(x_{i}^{2}+x_{f}^{2}\right){\frac {1+e^{-2i\omega T}}{1-e^{-2i\omega T}}}-{\frac {4x_{i}x_{f}e^{-i\omega T}}{1-e^{-2i\omega T}}}\right)\right)}.}

One may absorb all terms after the first $e^{-i\omega T/2}$ into $R(T)$, thereby obtaining

{\displaystyle K(x_{f},t_{f};x_{i},t_{i})=\left({\frac {m\omega }{\pi \hbar }}\right)^{\frac {1}{2}}e^{\frac {-i\omega T}{2}}\cdot R(T).}

One may finally expand $R(T)$ in powers of $e^{-i\omega T}$: all terms in this expansion get multiplied by the $e^{-i\omega T/2}$ factor in the front, yielding terms of the form

{\displaystyle e^{\frac {-i\omega T}{2}}e^{-in\omega T}=e^{-i\omega T\left({\frac {1}{2}}+n\right)}\quad {\text{for }}n=0,1,2,\ldots .}

Comparison to the above eigenstate expansion yields the standard energy spectrum for the simple harmonic oscillator,

E_{n}=\left(n+{\tfrac {1}{2}}\right)\hbar \omega ~.
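The eigenstate expansion can also be tested numerically. The sketch below makes two assumptions that are not part of the derivation above: it sets $m = \hbar = \omega = 1$, and it evaluates the propagator at imaginary time $T = -i\tau$ so that the sum over $n$ converges rapidly (the closed form then has $\sinh$ and $\cosh$ in place of $\sin$ and $\cos$). The eigenfunctions are generated by the standard three-term recurrence for normalized Hermite functions; function names are hypothetical.

```python
import math


def hermite_functions(x, n_max):
    """Normalized oscillator eigenfunctions psi_0..psi_{n_max} at x (m = hbar = omega = 1)."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(1, n_max):
        psi.append(math.sqrt(2.0 / (n + 1)) * x * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi


def kernel_from_spectrum(x_i, x_f, tau, n_max=80):
    """Sum over n of exp(-E_n tau) psi_n(x_f) psi_n(x_i) with E_n = n + 1/2."""
    a = hermite_functions(x_i, n_max)
    b = hermite_functions(x_f, n_max)
    return sum(math.exp(-(n + 0.5) * tau) * a[n] * b[n] for n in range(n_max + 1))


def kernel_closed_form(x_i, x_f, tau):
    """Oscillator propagator continued to imaginary time T = -i tau."""
    s, c = math.sinh(tau), math.cosh(tau)
    exponent = -((x_i ** 2 + x_f ** 2) * c - 2.0 * x_i * x_f) / (2.0 * s)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * s)


k_sum = kernel_from_spectrum(0.5, -0.3, tau=1.0)
k_exact = kernel_closed_form(0.5, -0.3, tau=1.0)
```

Agreement between `k_sum` and `k_exact` depends on the energies entering the sum being $n + \tfrac{1}{2}$; replacing them by, say, $n$ shifts every term by a common factor and the two values no longer match.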

Coulomb potential

Feynman's time-sliced approximation does not, however, exist for the most important quantum-mechanical path integrals of atoms, due to the singularity of the Coulomb potential $e^2/r$ at the origin. Only after replacing the time $t$ by another path-dependent pseudo-time parameter

s=\int {\frac {dt}{r(t)}}

is the singularity removed, and a time-sliced approximation then exists that is exactly integrable, since it can be made harmonic by a simple coordinate transformation, as discovered in 1979 by İsmail Hakkı Duru and Hagen Kleinert.^[6] The combination of a path-dependent time transformation and a coordinate transformation is an important tool to solve many path integrals and is called generically the Duru–Kleinert transformation.

The Schrödinger equation

The path integral reproduces the Schrödinger equation for the initial and final state even when a potential is present. This is easiest to see by taking a path-integral over infinitesimally separated times.

{\displaystyle \psi (y;t+\varepsilon )=\int _{-\infty }^{\infty }\psi (x;t)\int _{x(t)=x}^{x(t+\varepsilon )=y}e^{i\int \limits _{t}^{t+\varepsilon }\left({\frac {{\dot {x}}^{2}}{2}}-V(x)\right)\,dt}\,Dx(t)\,dx\qquad (1)}

Since the time separation is infinitesimal and the cancelling oscillations become severe for large values of $\dot{x}$, the path integral has most weight for $y$ close to $x$. In this case, to lowest order the potential energy is constant, and only the kinetic energy contribution is nontrivial. (This separation of the kinetic and potential energy terms in the exponent is essentially the Trotter product formula.) The exponential of the action is

e^{-i\varepsilon V(x)}e^{i{\frac {{\dot {x}}^{2}}{2}}\varepsilon }

The first term rotates the phase of $\psi(x)$ locally by an amount proportional to the potential energy. The second term is the free particle propagator, corresponding to $i$ times a diffusion process. To lowest order in $\varepsilon$ they are additive; in any case one has with (1):

\psi (y;t+\varepsilon )\approx \int \psi (x;t)e^{-i\varepsilon V(x)}e^{\frac {i(x-y)^{2}}{2\varepsilon }}\,dx\,.

As mentioned, the spread in $\psi$ is diffusive from the free particle propagation, with an extra infinitesimal rotation in phase which slowly varies from point to point from the potential:

{\frac {\partial \psi }{\partial t}}=i\cdot \left({\tfrac {1}{2}}\nabla ^{2}-V(x)\right)\psi \,

and this is the Schrödinger equation. Note that the normalization of the path integral needs to be fixed in exactly the same way as in the free particle case. An arbitrary continuous potential does not affect the normalization, although singular potentials require careful treatment.

Equations of motion

Since the states obey the Schrödinger equation, the path integral must reproduce the Heisenberg equations of motion for the averages of $x$ and $\dot{x}$ variables, but it is instructive to see this directly. The direct approach shows that the expectation values calculated from the path integral reproduce the usual ones of quantum mechanics.

Start by considering the path integral with some fixed initial state

\int \psi _{0}(x)\int _{x(0)=x}e^{iS(x,{\dot {x}})}\,Dx\,

Now note that $x(t)$ at each separate time is a separate integration variable. So it is legitimate to change variables in the integral by shifting: $x(t) = u(t) + \varepsilon(t)$, where $\varepsilon(t)$ is a different shift at each time but $\varepsilon(0) = \varepsilon(T) = 0$, since the endpoints are not integrated:

\int \psi _{0}(x)\int _{u(0)=x}e^{iS(u+\varepsilon ,{\dot {u}}+{\dot {\varepsilon }})}\,Du\,

The change in the integral from the shift is, to first infinitesimal order in $\varepsilon$,

{\displaystyle \int \psi _{0}(x)\int _{u(0)=x}\left(\int {\frac {\partial S}{\partial u}}\varepsilon +{\frac {\partial S}{\partial {\dot {u}}}}{\dot {\varepsilon }}\,dt\right)e^{iS}\,Du\,}

which, integrating by parts in $t$, gives:

{\displaystyle \int \psi _{0}(x)\int _{u(0)=x}-\left(\int \left({\frac {d}{dt}}{\frac {\partial S}{\partial {\dot {u}}}}-{\frac {\partial S}{\partial u}}\right)\varepsilon (t)\,dt\right)e^{iS}\,Du\,}

But this was just a shift of integration variables, which doesn't change the value of the integral for any choice of $\varepsilon(t)$. The conclusion is that this first order variation is zero for an arbitrary initial state and at any arbitrary point in time:

\left\langle \psi _{0}\left|{\frac {\delta S}{\delta x}}(t)\right|\psi _{0}\right\rangle =0

This is the Heisenberg equation of motion.

If the action contains terms which multiply $\dot{x}$ and $x$ at the same moment in time, the manipulations above are only heuristic, because the multiplication rules for these quantities are just as noncommuting in the path integral as they are in the operator formalism.

Stationary-phase approximation

If the variation in the action exceeds $\hbar$ by many orders of magnitude, we typically have destructive interference other than in the vicinity of those trajectories satisfying the Euler–Lagrange equation, which is now reinterpreted as the condition for constructive interference. This can be shown using the method of stationary phase applied to the propagator. As $\hbar$ decreases, the exponential in the integral oscillates rapidly in the complex domain for any change in the action. Thus, in the limit that $\hbar$ goes to zero, only points where the classical action does not vary contribute to the propagator.
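A one-dimensional toy integral illustrates the cancellation. The sketch below is not a path integral: it assumes a single integration variable $q$ with a hypothetical phase function $F(q) = (q - 1)^2$, stationary at $q = 1$, and compares the contribution of a window around the stationary point with that of an equally wide window far from it. The value $\hbar = 0.01$ and the window positions are arbitrary.

```python
import cmath
import math


def window_integral(center, half_width, hbar, q_classical=1.0, steps=20000):
    """Midpoint-rule integral of exp(i F(q) / hbar) over a window, F = (q - q_classical)^2."""
    a = center - half_width
    dq = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        q = a + (k + 0.5) * dq
        total += cmath.exp(1j * (q - q_classical) ** 2 / hbar)
    return total * dq


hbar = 0.01
near = window_integral(center=1.0, half_width=0.5, hbar=hbar)   # contains the stationary point
far = window_integral(center=3.0, half_width=0.5, hbar=hbar)    # phase varies rapidly everywhere

# Leading stationary-phase estimate of the modulus of the full integral.
estimate = math.sqrt(math.pi * hbar)
```

With these values the window around the stationary point carries a contribution of modulus close to $\sqrt{\pi\hbar}$, while the distant window contributes only a small remainder from its endpoints, and that remainder shrinks in proportion to $\hbar$.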

Canonical commutation relations

The formulation of the path integral does not make it clear at first sight that the quantities $x$ and $p$ do not commute. In the path integral, these are just integration variables and they have no obvious ordering. Feynman discovered that the non-commutativity is still present.^[7]

To see this, consider the simplest path integral, the Brownian walk. This is not yet quantum mechanics, so in the path integral the action is not multiplied by $i$:

S=\int \left({\frac {dx}{dt}}\right)^{2}\,dt

The quantity $x(t)$ is fluctuating, and the derivative is defined as the limit of a discrete difference.

{\frac {dx}{dt}}={\frac {x(t+\varepsilon )-x(t)}{\varepsilon }}

Note that the distance that a random walk moves is proportional to $\sqrt{t}$, so that:

x(t+\varepsilon )-x(t)\approx {\sqrt {\varepsilon }}

This shows that the random walk is not differentiable, since the ratio that defines the derivative diverges with probability one.

The quantity $x\dot{x}$ is ambiguous, with two possible meanings:

[1]=x{\frac {dx}{dt}}=x(t){\frac {x(t+\varepsilon )-x(t)}{\varepsilon }}

[2]=x{\frac {dx}{dt}}=x(t+\varepsilon ){\frac {x(t+\varepsilon )-x(t)}{\varepsilon }}

In elementary calculus, the two are only different by an amount which goes to 0 as $\varepsilon$ goes to 0. But in this case, the difference between the two is not 0:

{\displaystyle [2]-[1]={\frac {{\big (}x(t+\varepsilon )-x(t){\big )}^{2}}{\varepsilon }}\approx {\frac {\varepsilon }{\varepsilon }}}

Give a name to the value of the difference for any one random walk:

{\frac {{\big (}x(t+\varepsilon )-x(t){\big )}^{2}}{\varepsilon }}=f(t)

and note that $f(t)$ is a rapidly fluctuating statistical quantity, whose average value is 1, i.e. a normalized "Gaussian process". The fluctuations of such a quantity can be described by a statistical Lagrangian

{\mathcal {L}}=(f(t)-1)^{2}\,,

and the equations of motion for $f$ derived from extremizing the action $S$ corresponding to $\mathcal{L}$ just set it equal to 1. In physics, such a quantity is "equal to 1 as an operator identity". In mathematics, it "weakly converges to 1". In either case, it is 1 in any expectation value, or when averaged over any interval, or for all practical purposes.
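Both statements, that the difference quotient diverges while $f$ averages to 1, can be seen in a quick simulation. This is a sketch under an explicit assumption: the increments $x(t+\varepsilon) - x(t)$ are drawn as independent Gaussians of variance $\varepsilon$, the normalization for which the average of $f$ is 1 (it matches the diffusion kernel of the free-particle section). The sample sizes and the seed are arbitrary, and the function names are hypothetical.

```python
import math
import random


def mean_f(eps, samples, rng):
    """Average of f = (x(t+eps) - x(t))^2 / eps over independent Gaussian steps."""
    total = 0.0
    for _ in range(samples):
        dx = rng.gauss(0.0, math.sqrt(eps))   # step with variance eps
        total += dx * dx / eps
    return total / samples


def mean_abs_slope(eps, samples, rng):
    """Average of |x(t+eps) - x(t)| / eps, the size of the difference quotient."""
    total = 0.0
    for _ in range(samples):
        total += abs(rng.gauss(0.0, math.sqrt(eps))) / eps
    return total / samples


rng = random.Random(0)                        # fixed seed for reproducibility
f_avg = mean_f(1e-3, 100000, rng)
slope_coarse = mean_abs_slope(1e-2, 20000, rng)
slope_fine = mean_abs_slope(1e-4, 20000, rng)
```

Shrinking $\varepsilon$ by a factor of 100 increases the typical difference quotient by about a factor of 10, in line with the $1/\sqrt{\varepsilon}$ scaling, while the average of $f$ stays near 1.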

Defining the time order to be the operator order:

[x,{\dot {x}}]=x{\frac {dx}{dt}}-{\frac {dx}{dt}}x=1

This is called the Itō lemma in stochastic calculus, and the (euclideanized) canonical commutation relations in physics.

For a general statistical action, a similar argument shows that

\left[x,{\frac {\partial S}{\partial {\dot {x}}}}\right]=1

and in quantum mechanics, the extra imaginary unit in the action converts this to the canonical commutation relation,

[x,p]=i

Particle in curved space

For a particle in curved space the kinetic term depends on the position, and the above time slicing cannot be applied, this being a manifestation of the notorious operator ordering problem in Schrödinger quantum mechanics. One may, however, solve this problem by transforming the time-sliced flat-space path integral to curved space using a multivalued coordinate transformation (a nonholonomic mapping).

Measure-theoretic factors

Sometimes (e.g. a particle moving in curved space) we also have measure-theoretic factors in the functional integral:

\int \mu [x]e^{iS[x]}\,{\mathcal {D}}x.

This factor is needed to restore unitarity.
For instance, if

S=\int \left({\frac {m}{2}}g_{ij}{\dot {x}}^{i}{\dot {x}}^{j}-V(x)\right)\,dt,

then it means that each spatial slice is multiplied by the measure $\sqrt{g}$. This measure cannot be expressed as a functional multiplying the $\mathcal{D}x$ measure because they belong to entirely different classes.

Euclidean path integrals

It is very common in path integrals to perform a Wick rotation from real to imaginary times. In the setting of quantum field theory, the Wick rotation changes the geometry of space-time from Lorentzian to Euclidean; as a result, Wick-rotated path integrals are often called Euclidean path integrals.

Wick rotation and the Feynman–Kac formula

If we replace t by −it, the time-evolution operator

e^{-it{\hat {H}}/\hbar }

is replaced by

e^{-t{\hat {H}}/\hbar }

. (This change is known as a Wick rotation.) If we repeat the derivation of the path-integral formula in this setting, we obtain^[8]

{\displaystyle \psi (x,t)={\frac {1}{Z}}\int _{\mathbf {x} (0)=x}e^{-S_{\mathrm {Euclidean} }(\mathbf {x} ,{\dot {\mathbf {x} }})/\hbar }\psi _{0}(\mathbf {x} (t))\,{\mathcal {D}}\mathbf {x} \,}

where

S_{\mathrm {Euclidean} }

is the Euclidean action, given by

{\displaystyle S_{\mathrm {Euclidean} }(\mathbf {x} ,{\dot {\mathbf {x} }})=\int \left[{\frac {m}{2}}|{\dot {\mathbf {x} }}(t)|^{2}+V(\mathbf {x} (t))\right]\,dt}

Note the sign change between this and the normal action, where the potential energy term is negative. (The term Euclidean is from the context of quantum field theory, where the change from real to imaginary time changes the space-time geometry from Lorentzian to Euclidean.)

Now, the contribution of the kinetic energy to the path integral is as follows:

{\displaystyle {\frac {1}{Z}}\int _{\mathbf {x} (0)=x}f(\mathbf {x} )e^{-{\frac {m}{2}}\int |{\dot {\mathbf {x} }}|^{2}dt}\,{\mathcal {D}}\mathbf {x} \,}

where

f(\mathbf {x} )

includes all the remaining dependence of the integrand on the path. This integral has a rigorous mathematical interpretation as integration against the Wiener measure, denoted

\mu _{x}

. The Wiener measure, constructed by Norbert Wiener, gives a rigorous foundation to Einstein's mathematical model of Brownian motion. The subscript

x

indicates that the measure

\mu _{x}

is supported on paths

\mathbf {x}

with

\mathbf {x} (0)=x

.

We then have a rigorous version of the Feynman path integral, known as the Feynman–Kac formula:^[9]

\psi (x,t)=\int e^{-\int V(\mathbf {x} (t))\,dt/\hbar }\,\psi _{0}(\mathbf {x} (t))\,d\mu _{x}(\mathbf {x} )

where now

\psi (x,t)

satisfies the Wick-rotated version of the Schrödinger equation,

\hbar {\frac {\partial }{\partial t}}\psi (x,t)=-{\hat {H}}\psi (x,t)

Although the Wick-rotated Schrödinger equation does not have a direct physical meaning, interesting properties of the Schrödinger operator

{\hat {H}}

can be extracted by studying it.^[10]
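As an illustration, the Feynman–Kac formula can be estimated by averaging over sampled Brownian paths. The sketch below assumes units with ħ = m = 1, a harmonic potential V(x) = x²/2, and the initial condition ψ₀(x) = exp(−x²/2); these choices, the function name, and the sample sizes are assumptions made for the example. With them the Wick-rotated equation has the exact solution ψ(x,t) = exp(−t/2) ψ₀(x), which gives a value to compare against.

```python
import math
import random

def feynman_kac(x0, t, potential, psi0, paths=8000, steps=40, seed=1):
    """Monte Carlo estimate of E_x[ exp(-int_0^t V(X_s) ds) * psi0(X_t) ]
    over Brownian paths X starting at x0 (units hbar = m = 1)."""
    rng = random.Random(seed)
    dt = t / steps
    sigma = math.sqrt(dt)
    total = 0.0
    for _ in range(paths):
        x = x0
        integral = 0.0
        for _ in range(steps):
            x_next = x + rng.gauss(0.0, sigma)
            # trapezoidal rule for the time integral of V along the path
            integral += 0.5 * (potential(x) + potential(x_next)) * dt
            x = x_next
        total += math.exp(-integral) * psi0(x)
    return total / paths

V = lambda x: 0.5 * x * x                 # assumed potential
psi0 = lambda x: math.exp(-0.5 * x * x)   # assumed initial condition
estimate = feynman_kac(0.0, 1.0, V, psi0)
exact = math.exp(-0.5)                    # exp(-t/2) * psi0(0) at t = 1
print(estimate, exact)
```

The agreement is only statistical: the estimate carries both sampling noise and a small time-discretization error.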

Much of the study of quantum field theories from the path-integral perspective, in both the mathematics and physics literatures, is done in the Euclidean setting, that is, after a Wick rotation. In particular, there are various results showing that if a Euclidean field theory with suitable properties can be constructed, one can then undo the Wick rotation to recover the physical, Lorentzian theory.^[11] On the other hand, it is much more difficult to give a meaning to path integrals (even Euclidean path integrals) in quantum field theory than in quantum mechanics.^[12]

The path integral and the partition function

The path integral is just the generalization of the integral above to all quantum mechanical problems—

{\displaystyle Z=\int e^{\frac {i{\mathcal {S}}[\mathbf {x} ]}{\hbar }}\,{\mathcal {D}}\mathbf {x} \quad {\text{where }}{\mathcal {S}}[\mathbf {x} ]=\int _{0}^{T}L[\mathbf {x} (t),{\dot {\mathbf {x} }}(t)]\,dt}

is the action of the classical problem in which one investigates the path starting at time

t = 0

and ending at time

t = T

, and

{\mathcal {D}}\mathbf {x}

denotes integration over all paths. In the classical limit,

{\mathcal {S}}[\mathbf {x} ]\gg \hbar

, the path of minimum action dominates the integral, because the phase of any path away from this fluctuates rapidly and different contributions cancel.^[13]

The connection with statistical mechanics follows. Considering only paths which begin and end in the same configuration, perform the Wick rotation

it = τ

, i.e., make time imaginary, and integrate over all possible beginning-ending configurations. The Wick-rotated path integral—described in the previous subsection, with the ordinary action replaced by its "Euclidean" counterpart—now resembles the partition function of statistical mechanics defined in a canonical ensemble with inverse temperature proportional to imaginary time,

{\frac {1}{T}}={\frac {k_{\text{B}}\tau }{\hbar }}

. Strictly speaking, though, this is the partition function for a statistical field theory.

Clearly, such a deep analogy between quantum mechanics and statistical mechanics cannot be dependent on the formulation. In the canonical formulation, one sees that the unitary evolution operator of a state is given by

|\alpha ;t\rangle =e^{-{\frac {iHt}{\hbar }}}|\alpha ;0\rangle

where the state

α

is evolved from time

t = 0

. If one makes a Wick rotation here, the amplitude to go from any state back to the same state in (imaginary) time

iT

is given by

Z=\operatorname {Tr} \left[e^{\frac {-HT}{\hbar }}\right]

which is precisely the partition function of statistical mechanics for the same system at the temperature quoted earlier. One aspect of this equivalence was also known to Erwin Schrödinger, who remarked that the equation named after him looked like the diffusion equation after Wick rotation. Note, however, that the Euclidean path integral is actually in the form of a classical statistical mechanics model.
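The equality between the trace and the Wick-rotated path integral over closed paths can be checked for a harmonic oscillator. The sketch below assumes ħ = m = 1 and a time lattice with N slices of spacing a = β/N; on such a lattice the Gaussian integral over the N positions can be done exactly, leaving a product over lattice frequencies. The function names and the choice β = 2, ω = 1 are illustrative.

```python
import math

def lattice_partition_function(beta, omega, slices):
    """Z for periodic Euclidean paths of a harmonic oscillator on a time lattice
    (hbar = m = 1). The Gaussian integral over the N positions gives
    Z_N = prod_k (2 + (a*omega)^2 - 2*cos(2*pi*k/N))^(-1/2), with a = beta/N."""
    a = beta / slices
    log_z = 0.0
    for k in range(slices):
        mode = 2.0 + (a * omega) ** 2 - 2.0 * math.cos(2.0 * math.pi * k / slices)
        log_z -= 0.5 * math.log(mode)   # sum logarithms to avoid underflow
    return math.exp(log_z)

def trace_partition_function(beta, omega, levels=200):
    """Z = Tr exp(-beta*H), summed over the levels E_n = omega*(n + 1/2)."""
    return sum(math.exp(-beta * omega * (n + 0.5)) for n in range(levels))

beta, omega = 2.0, 1.0
z_path = lattice_partition_function(beta, omega, slices=2000)
z_trace = trace_partition_function(beta, omega)
print(z_path, z_trace)
```

Both numbers approach 1/(2 sinh(βω/2)); the lattice value differs from it only by a discretization error that shrinks as the number of slices grows.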

Quantum field theory

At first, the path integral was viewed mostly as a curiosity. However, it proved very important for the development of quantum field theory, and by the late 1970s it had emerged as the standard way to define one.^[14] Both the Schrödinger and Heisenberg approaches to quantum mechanics single out time and are not in the spirit of relativity. For example, the Heisenberg approach requires that scalar field operators obey the commutation relation

[\varphi (x),\partial _{t}\varphi (y)]=i\delta ^{3}(x-y)

for two simultaneous spatial positions

x

and

y

, and this is not a relativistically invariant concept. The results of a calculation are covariant, but the symmetry is not apparent in intermediate stages. If naive field-theory calculations did not produce infinite answers in the continuum limit, this would not have been such a big problem – it would just have been a bad choice of coordinates. But the lack of symmetry means that the infinite quantities must be cut off, and the bad coordinates make it nearly impossible to cut off the theory without spoiling the symmetry. This makes it difficult to extract the physical predictions, which require a careful limiting procedure.

The problem of lost symmetry also appears in classical mechanics, where the Hamiltonian formulation also superficially singles out time. The Lagrangian formulation makes the relativistic invariance apparent. In the same way, the path integral is manifestly relativistic. It reproduces the Schrödinger equation, the Heisenberg equations of motion, and the canonical commutation relations and shows that they are compatible with relativity. It extends the Heisenberg-type operator algebra to operator product rules, which are new relations difficult to see in the old formalism.

Further, different choices of canonical variables lead to very different-seeming formulations of the same theory. The transformations between the variables can be very complicated, but the path integral makes them into reasonably straightforward changes of integration variables. For these reasons, the Feynman path integral has made earlier formalisms largely obsolete.

The price of a path integral representation is that the unitarity of a theory is no longer self-evident, but it can be proven by changing variables to some canonical representation. The path integral itself also deals with larger mathematical spaces than is usual, which requires more careful mathematics, not all of which has been fully worked out. The path integral historically was not immediately accepted, partly because it took many years to incorporate fermions properly. This required physicists to invent an entirely new mathematical object – the Grassmann variable – which also allowed changes of variables to be done naturally, as well as allowing constrained quantization.

The integration variables in the path integral are subtly non-commuting. The value of the product of two field operators at what looks like the same point depends on how the two points are ordered in space and time. This makes some naive identities fail.

The propagator

In relativistic theories, there is both a particle and field representation for every theory. The field representation is a sum over all field configurations, and the particle representation is a sum over different particle paths.

The nonrelativistic formulation is traditionally given in terms of particle paths, not fields. There, the path integral in the usual variables, with fixed boundary conditions, gives the probability amplitude for a particle to go from point

x

to point

y

in time

T

K(x,y;T)=\langle y;T\mid x;0\rangle =\int _{x(0)=x}^{x(T)=y}e^{iS[x]}\,Dx.

This is called the propagator. Superposing different values of the initial position

x

with an arbitrary initial state

\psi _{0}(x)

constructs the final state:

\psi _{T}(y)=\int _{x}\psi _{0}(x)K(x,y;T)\,dx=\int ^{x(T)=y}\psi _{0}(x(0))e^{iS[x]}\,Dx.

For a spatially homogeneous system, where

K (x, y)

is only a function of

(x - y)

, the integral is a convolution, so the final state is the initial state convolved with the propagator:

\psi _{T}=\psi _{0}*K(;T).

For a free particle of mass

m

, the propagator can be evaluated either explicitly from the path integral or by noting that the Schrödinger equation is a diffusion equation in imaginary time, and the solution must be a normalized Gaussian:

K(x,y;T)\propto e^{\frac {im(x-y)^{2}}{2T}}.

Taking the Fourier transform in

(x - y)

produces another Gaussian:

K(p;T)=e^{\frac {iTp^{2}}{2m}},

and in

p

-space the proportionality factor here is constant in time, as will be verified in a moment. The Fourier transform in time, extending

K (p; T)

to be zero for negative times, gives the Green's function, or the frequency-space propagator:

G_{\text{F}}(p,E)={\frac {-i}{E-{\frac {{\vec {p}}^{2}}{2m}}+i\varepsilon }},

which is the reciprocal of the operator that annihilates the wavefunction in the Schrödinger equation, which wouldn't have come out right if the proportionality factor weren't constant in the

p

-space representation.

The infinitesimal term in the denominator is a small positive number, which guarantees that the inverse Fourier transform in

E

will be nonzero only for future times. For past times, the inverse Fourier transform contour closes toward values of

E

where there is no singularity. This guarantees that

K

propagates the particle into the future and is the reason for the subscript "F" on

G

. The infinitesimal term can be interpreted as an infinitesimal rotation toward imaginary time.

It is also possible to reexpress the nonrelativistic time evolution in terms of propagators going toward the past, since the Schrödinger equation is time-reversible. The past propagator is the same as the future propagator except for the obvious difference that it vanishes in the future, and in the Gaussian

t

is replaced by

- t

. In this case, the interpretation is that these are the quantities to convolve the final wavefunction so as to get the initial wavefunction:

G_{\text{B}}(p,E)={\frac {-i}{-E-{\frac {{\vec {p}}^{2}}{2m}}+i\varepsilon }}.

Because the two expressions are nearly identical (the only change is the sign of E and ε), the parameter

E

in Green's function can either be the energy if the paths are going toward the future, or the negative of the energy if the paths are going toward the past.

For a nonrelativistic theory, the time as measured along the path of a moving particle and the time as measured by an outside observer are the same. In relativity, this is no longer true. For a relativistic theory the propagator should be defined as the sum over all paths that travel between two points in a fixed proper time, as measured along the path (these paths describe the trajectory of a particle in space and in time):

{\displaystyle K(x-y,\mathrm {T} )=\int _{x(0)=x}^{x(\mathrm {T} )=y}e^{i\int _{0}^{\mathrm {T} }{\sqrt {{\dot {x}}^{2}}}-\alpha \,d\tau }.}

The integral above is not trivial to interpret because of the square root. Fortunately, there is a heuristic trick. The sum is over the relativistic arc length of the path of an oscillating quantity, and like the nonrelativistic path integral should be interpreted as slightly rotated into imaginary time. The function

K (x - y, Τ)

can be evaluated when the sum is over paths in Euclidean space:

K(x-y,\mathrm {T} )=e^{-\alpha \mathrm {T} }\int _{x(0)=x}^{x(\mathrm {T} )=y}e^{-L}.

This describes a sum over all paths of length

Τ

of the exponential of minus the length. This can be given a probability interpretation. The sum over all paths is a probability average over a path constructed step by step. The total number of steps is proportional to

Τ

, and each step is less likely the longer it is. By the central limit theorem, the result of many independent steps is a Gaussian of variance proportional to

Τ

K(x-y,\mathrm {T} )=e^{-\alpha \mathrm {T} }e^{-{\frac {(x-y)^{2}}{\mathrm {T} }}}.
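The central-limit step can be illustrated with a much simpler walk than the one in the text. The sketch below assumes independent steps of ±1 in one dimension, a stand-in for the length-weighted steps described above, and checks that the endpoint variance grows in proportion to the number of steps. The function name and sample sizes are illustrative.

```python
import random

def endpoint_variance(steps, walks=2000, seed=2):
    """Sample variance of the endpoint of a walk made of independent +-1 steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walks):
        position = sum(rng.choice((-1, 1)) for _ in range(steps))
        total += position * position   # the mean is zero by symmetry
    return total / walks

var_short = endpoint_variance(50)
var_long = endpoint_variance(200, seed=3)
ratio = var_long / var_short
print(var_short, var_long, ratio)
```

Quadrupling the number of steps should roughly quadruple the variance, up to sampling noise.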

The usual definition of the relativistic propagator only asks for the amplitude to travel from x to y, after summing over all the possible proper times it could take:

K(x-y)=\int _{0}^{\infty }K(x-y,\mathrm {T} )W(\mathrm {T} )\,d\mathrm {T} ,

where

W (Τ)

is a weight factor, the relative importance of paths of different proper time. By the translation symmetry in proper time, this weight can only be an exponential factor and can be absorbed into the constant

α

K(x-y)=\int _{0}^{\infty }e^{-{\frac {(x-y)^{2}}{\mathrm {T} }}-\alpha \mathrm {T} }\,d\mathrm {T} .

This is the Schwinger representation. Taking a Fourier transform over the variable

(x - y)

can be done for each value of

Τ

separately, and because each separate

Τ

contribution is a Gaussian, the result is another Gaussian with reciprocal width. So in

p

-space, the propagator can be reexpressed simply:

K(p)=\int _{0}^{\infty }e^{-\mathrm {T} p^{2}-\mathrm {T} \alpha }\,d\mathrm {T} ={\frac {1}{p^{2}+\alpha }},

which is the Euclidean propagator for a scalar particle. Rotating

p_{0}

to be imaginary gives the usual relativistic propagator, up to a factor of

- i

and an ambiguity, which will be clarified below:

K(p)={\frac {i}{p_{0}^{2}-{\vec {p}}^{2}-m^{2}}}.
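The proper-time integral that produces the Euclidean propagator, the integral of exp(−Τ(p² + α)) over Τ from 0 to infinity, can be verified numerically. This is a minimal sketch using the trapezoidal rule on a truncated interval; the function name, the cutoff, and the sample values p² = 2, α = 1 are assumptions made for the example.

```python
import math

def schwinger_integral(p_squared, alpha, points=20000):
    """Trapezoidal estimate of int_0^inf exp(-T*(p^2 + alpha)) dT."""
    rate = p_squared + alpha
    t_max = 40.0 / rate     # the integrand has fallen to about exp(-40) here
    h = t_max / points
    total = 0.5 * (1.0 + math.exp(-rate * t_max))   # endpoint terms
    for k in range(1, points):
        total += math.exp(-rate * k * h)
    return total * h

value = schwinger_integral(2.0, 1.0)
print(value, 1.0 / (2.0 + 1.0))
```

The numerical value agrees with 1/(p² + α) up to the quadrature error.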

This expression can be interpreted in the nonrelativistic limit, where it is convenient to split it by partial fractions:

2p_{0}K(p)={\frac {i}{p_{0}-{\sqrt {{\vec {p}}^{2}+m^{2}}}}}+{\frac {i}{p_{0}+{\sqrt {{\vec {p}}^{2}+m^{2}}}}}.

For states where one nonrelativistic particle is present, the initial wavefunction has a frequency distribution concentrated near

p_{0}=m

. When convolving with the propagator, which in

p

space just means multiplying by the propagator, the second term is suppressed and the first term is enhanced. For frequencies near

p_{0}=m

, the dominant first term has the form

2mK_{\text{NR}}(p)={\frac {i}{(p_{0}-m)-{\frac {{\vec {p}}^{2}}{2m}}}}.

This is the expression for the nonrelativistic Green's function of a free Schrödinger particle.
The second term has a nonrelativistic limit also, but this limit is concentrated on frequencies that are negative. The second pole is dominated by contributions from paths where the proper time and the coordinate time are ticking in an opposite sense, which means that the second term is to be interpreted as the antiparticle. The nonrelativistic analysis shows that with this form the antiparticle still has positive energy.
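The partial-fraction split and its nonrelativistic limit can be checked with ordinary complex arithmetic. The sketch below evaluates the expressions above at an illustrative point with small spatial momentum and p₀ slightly off the mass shell; the function names and numerical values are assumptions chosen for the example.

```python
def relativistic_propagator(p0, p_vec_sq, m):
    """K(p) = i / (p0^2 - |p|^2 - m^2), evaluated away from the poles."""
    return 1j / (p0 * p0 - p_vec_sq - m * m)

def particle_pole(p0, p_vec_sq, m):
    """First partial fraction: i / (p0 - sqrt(|p|^2 + m^2))."""
    return 1j / (p0 - (p_vec_sq + m * m) ** 0.5)

def antiparticle_pole(p0, p_vec_sq, m):
    """Second partial fraction: i / (p0 + sqrt(|p|^2 + m^2))."""
    return 1j / (p0 + (p_vec_sq + m * m) ** 0.5)

def nonrelativistic_green(p0, p_vec_sq, m):
    """Right-hand side of 2m K_NR(p) = i / ((p0 - m) - |p|^2 / 2m)."""
    return 1j / ((p0 - m) - p_vec_sq / (2.0 * m))

m, p_vec_sq = 1.0, 1e-4
p0 = m + 3e-4     # near p0 = m, but not on a pole
lhs = 2.0 * p0 * relativistic_propagator(p0, p_vec_sq, m)
first = particle_pole(p0, p_vec_sq, m)
second = antiparticle_pole(p0, p_vec_sq, m)
approx = nonrelativistic_green(p0, p_vec_sq, m)
print(lhs, first + second, approx)
```

At this point the first term is several thousand times larger than the second, and it matches the nonrelativistic form to a few parts per million.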

The proper way to express this mathematically is that, adding a small suppression factor in proper time, the limit where

t \to -\infty

of the first term must vanish, while the

t \to +\infty

limit of the second term must vanish. In the Fourier transform, this means shifting the pole in

p_{0}

slightly, so that the inverse Fourier transform will pick up a small decay factor in one of the time directions:

{\displaystyle 2p_{0}K(p)={\frac {i}{p_{0}-{\sqrt {{\vec {p}}^{2}+m^{2}}}+i\varepsilon }}+{\frac {i}{p_{0}+{\sqrt {{\vec {p}}^{2}+m^{2}}}-i\varepsilon }}.}

Without these terms, the pole contribution could not be unambiguously evaluated when taking the inverse Fourier transform of

p_{0}

. The terms can be recombined:

K(p)={\frac {i}{p^{2}-m^{2}+i\varepsilon }},

which, when factored, produces opposite-sign infinitesimal terms in each factor. This is the mathematically precise form of the relativistic particle propagator, free of any ambiguities. The

ε

term introduces a small imaginary part to the

\alpha =m^{2}

, which in the Minkowski version is a small exponential suppression of long paths.
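The factoring can be written out explicitly. The following worked step is a sketch under the assumption that the two infinitesimal shifts are equal in size; the shorthand ω_p and the symbol ε′ are introduced here for convenience and do not appear in the original text.

```latex
% Assumed shorthand: \omega_p = \sqrt{\vec{p}^{2}+m^{2}}, \varepsilon' > 0 infinitesimal,
% and p^{2} = p_{0}^{2} - \vec{p}^{2}.
\begin{aligned}
\left(p_{0}-\omega_{p}+i\varepsilon'\right)\left(p_{0}+\omega_{p}-i\varepsilon'\right)
  &= p_{0}^{2}-\left(\omega_{p}-i\varepsilon'\right)^{2} \\
  &= p_{0}^{2}-\omega_{p}^{2}+2i\omega_{p}\varepsilon'+\varepsilon'^{2} \\
  &\approx p^{2}-m^{2}+i\varepsilon , \qquad \varepsilon = 2\omega_{p}\varepsilon' > 0 .
\end{aligned}
```

Since ω_p is positive, a single positive ε in the combined denominator corresponds to opposite-sign shifts of the two poles.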

So in the relativistic case, the Feynman path-integral representation of the propagator includes paths going backwards in time, which describe antiparticles. The paths that contribute to the relativistic propagator go forward and backwards in time, and the interpretation of this is that the amplitude for a free particle to travel between two points includes amplitudes for the particle to fluctuate into an antiparticle, travel back in time, then forward again.

Unlike the nonrelativistic case, it is impossible to produce a relativistic theory of local particle propagation without including antiparticles. All local differential operators have inverses that are nonzero outside the light cone, meaning that it is impossible to keep a particle from travelling faster than light. Such a particle cannot have a Green's function which is only nonzero in the future in a relativistically invariant theory.

Functionals of fields

However, the path integral formulation is also extremely important in direct application to quantum field theory, in which the "paths" or histories being considered are not the motions of a single particle, but the possible time evolutions of a field over all space. The action is referred to technically as a functional of the field:

S [ϕ]

, where the field

\phi (x^{\mu })

is itself a function of space and time, and the square brackets are a reminder that the action depends on all the field's values everywhere, not just some particular value. One such given function

\phi (x^{\mu })

of spacetime is called a field configuration. In principle, one integrates Feynman's amplitude over the class of all possible field configurations.
Much of the formal study of QFT is devoted to the properties of the resulting functional integral, and much effort (not yet entirely successful) has been made toward making these functional integrals mathematically precise.

Such a functional integral is extremely similar to the partition function in statistical mechanics. Indeed, it is sometimes called a partition function, and the two are essentially mathematically identical except for the factor of

i

in the exponent in Feynman's postulate 3. Analytically continuing the integral to an imaginary time variable (called a Wick rotation) makes the functional integral even more like a statistical partition function and also tames some of the mathematical difficulties of working with these integrals.

Expectation values

In quantum field theory, if the action is given by the functional S of field configurations (which only depends locally on the fields), then the time-ordered vacuum expectation value of a polynomially bounded functional F, written ⟨F⟩, is given by

{\displaystyle \langle F\rangle ={\frac {\int {\mathcal {D}}\varphi F[\varphi ]e^{i{\mathcal {S}}[\varphi ]}}{\int {\mathcal {D}}\varphi e^{i{\mathcal {S}}[\varphi ]}}}.}

The symbol

\int {\mathcal {D}}\varphi

here is a concise way to represent the infinite-dimensional integral over all possible field configurations on all of space-time. As stated above, the unadorned path integral in the denominator ensures proper normalization.
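The structure of this ratio is easiest to see in a toy model. The sketch below replaces the field by a single real variable x and uses the Wick-rotated weight exp(−S) in place of exp(iS); both simplifications are assumptions made so that the integrals converge and can be computed by ordinary quadrature. The function name and grid parameters are illustrative.

```python
import math

def expectation(observable, action, x_max=10.0, points=4001):
    """<F> = int F(x) exp(-S(x)) dx / int exp(-S(x)) dx by the trapezoidal rule.
    A zero-dimensional, Wick-rotated stand-in for the field-theory ratio."""
    h = 2.0 * x_max / (points - 1)
    numerator = denominator = 0.0
    for k in range(points):
        x = -x_max + k * h
        weight = math.exp(-action(x))
        if k in (0, points - 1):
            weight *= 0.5   # trapezoidal endpoint weights
        numerator += observable(x) * weight
        denominator += weight
    return numerator / denominator   # the grid spacing cancels in the ratio

free_action = lambda x: 0.5 * x * x
second_moment = expectation(lambda x: x * x, free_action)
fourth_moment = expectation(lambda x: x ** 4, free_action)
print(second_moment, fourth_moment)
```

For the quadratic action the moments are 1 and 3, and the overall normalization of the measure drops out because it appears in both numerator and denominator.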

As a probability

Strictly speaking, the only question that can be asked in physics is: What fraction of states satisfying condition A also satisfy condition B? The answer to this is a number between 0 and 1, which can be interpreted as a conditional probability, written as

P(B | A)

. In terms of path integration, since

P(B | A) = P(A \cap B) / P(A)

, this means

{\displaystyle \operatorname {P} (B\mid A)={\frac {\sum _{F\subset A\cap B}\left|\int {\mathcal {D}}\varphi O_{\text{in}}[\varphi ]e^{i{\mathcal {S}}[\varphi ]}F[\varphi ]\right|^{2}}{\sum _{F\subset A}\left|\int {\mathcal {D}}\varphi O_{\text{in}}[\varphi ]e^{i{\mathcal {S}}[\varphi ]}F[\varphi ]\right|^{2}}},}

where the functional

O_{\text{in}}[\varphi ]

is the superposition of all incoming states that could lead to the states we are interested in. In particular, this could be a state corresponding to the state of the Universe just after the Big Bang, although for actual calculation this can be simplified using heuristic methods. Since this expression is a quotient of path integrals, it is naturally normalised.

Schwinger–Dyson equations

Since this formulation of quantum mechanics is analogous to the classical action principle, one might expect that identities concerning the action in classical mechanics would have quantum counterparts derivable from a functional integral. This is often the case.
In the language of functional analysis, we can write the Euler–Lagrange equations as

{\frac {\delta {\mathcal {S}}[\varphi ]}{\delta \varphi }}=0

(the left-hand side is a functional derivative; the equation means that the action is stationary under small changes in the field configuration). The quantum analogues of these equations are called the Schwinger–Dyson equations.

If the functional measure

{\mathcal {D}}\varphi

turns out to be translationally invariant (we'll assume this for the rest of this article, although this does not hold for, say, nonlinear sigma models), and if we assume that after a Wick rotation

e^{i{\mathcal {S}}[\varphi ]},

which now becomes

e^{-H[\varphi ]}

for some

H

, goes to zero faster than the reciprocal of any polynomial for large values of

φ

, then we can integrate by parts (after a Wick rotation, followed by a Wick rotation back) to get the following Schwinger–Dyson equations for the expectation:

{\displaystyle \left\langle {\frac {\delta F[\varphi ]}{\delta \varphi }}\right\rangle =-i\left\langle F[\varphi ]{\frac {\delta {\mathcal {S}}[\varphi ]}{\delta \varphi }}\right\rangle }

for any polynomially-bounded functional

F

. In the DeWitt notation this looks like^[15]

\left\langle F_{,i}\right\rangle =-i\left\langle F{\mathcal {S}}_{,i}\right\rangle .

These equations are the analog of the on-shell EL equations. The time ordering is taken before the time derivatives inside the

{\mathcal {S}}_{,i}

.
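The integration by parts behind the Schwinger–Dyson equations amounts to the statement that the functional integral of a total derivative vanishes. A short worked version, under the translation-invariance and decay assumptions stated above:

```latex
\begin{aligned}
0 &= \int {\mathcal D}\varphi \, \frac{\delta}{\delta\varphi}
     \left( F[\varphi]\, e^{i{\mathcal S}[\varphi]} \right) \\
  &= \int {\mathcal D}\varphi
     \left( \frac{\delta F[\varphi]}{\delta\varphi}
          + i\,F[\varphi]\,\frac{\delta {\mathcal S}[\varphi]}{\delta\varphi} \right)
     e^{i{\mathcal S}[\varphi]} \\
\Longrightarrow\quad
\left\langle \frac{\delta F}{\delta\varphi} \right\rangle
  &= -i \left\langle F\, \frac{\delta {\mathcal S}}{\delta\varphi} \right\rangle .
\end{aligned}
```

The last line follows after dividing by the unadorned integral of exp(iS), which turns each term into an expectation value.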

If

J

(called the source field) is an element of the dual space of the field configurations (which has at least an affine structure because of the assumption of the translational invariance for the functional measure), then the generating functional

Z

of the source fields is defined to be

Z[J]=\int {\mathcal {D}}\varphi e^{i\left({\mathcal {S}}[\varphi ]+\langle J,\varphi \rangle \right)}.

Note that

{\displaystyle {\frac {\delta ^{n}Z}{\delta J(x_{1})\cdots \delta J(x_{n})}}[J]=i^{n}\,Z[J]\,\left\langle \varphi (x_{1})\cdots \varphi (x_{n})\right\rangle _{J},}

Z^{,i_{1}\cdots i_{n}}[J]=i^{n}Z[J]\left\langle \varphi ^{i_{1}}\cdots \varphi ^{i_{n}}\right\rangle _{J},

where

{\displaystyle \langle F\rangle _{J}={\frac {\int {\mathcal {D}}\varphi F[\varphi ]e^{i\left({\mathcal {S}}[\varphi ]+\langle J,\varphi \rangle \right)}}{\int {\mathcal {D}}\varphi e^{i\left({\mathcal {S}}[\varphi ]+\langle J,\varphi \rangle \right)}}}.}

Basically, if

{\mathcal {D}}\varphi \,e^{i{\mathcal {S}}[\varphi ]}

is viewed as a functional distribution (this shouldn't be taken too literally as an interpretation of QFT, unlike its Wick-rotated statistical mechanics analogue, because we have time ordering complications here!), then

\left\langle \varphi (x_{1})\cdots \varphi (x_{n})\right\rangle

are its moments, and

Z

is its Fourier transform.
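The statement that the moments are derivatives of the generating functional can be tried out in a toy model. The sketch below again uses a single variable x and the Wick-rotated weight exp(−S + Jx), so the factors of i in the formulas above become 1; these simplifications, the function names, and the finite-difference step are assumptions made for the example.

```python
import math

def generating_function(source, action, x_max=10.0, points=4001):
    """Z(J) = int exp(-S(x) + J*x) dx by the trapezoidal rule (Euclidean toy model)."""
    h = 2.0 * x_max / (points - 1)
    total = 0.0
    for k in range(points):
        x = -x_max + k * h
        weight = 0.5 if k in (0, points - 1) else 1.0
        total += weight * math.exp(-action(x) + source * x)
    return total * h

def second_moment_from_z(action, step=1e-2):
    """<x^2> at J = 0 from a central second difference of Z(J), divided by Z(0)."""
    z_plus = generating_function(step, action)
    z_zero = generating_function(0.0, action)
    z_minus = generating_function(-step, action)
    return (z_plus - 2.0 * z_zero + z_minus) / (step * step) / z_zero

moment = second_moment_from_z(lambda x: 0.5 * x * x)
print(moment)
```

For the quadratic action the result is close to 1, the same second moment obtained by averaging x² directly against the weight.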

If

F

is a functional of

φ

, then for an operator K, F[K] is defined to be the operator that substitutes

K

for

φ

. For example, if

{\displaystyle F[\varphi ]={\frac {\partial ^{k_{1}}}{\partial x_{1}^{k_{1}}}}\varphi (x_{1})\cdots {\frac {\partial ^{k_{n}}}{\partial x_{n}^{k_{n}}}}\varphi (x_{n}),}

and

G

is a functional of

J

, then

{\displaystyle F\left[-i{\frac {\delta }{\delta J}}\right]G[J]=(-i)^{n}{\frac {\partial ^{k_{1}}}{\partial x_{1}^{k_{1}}}}{\frac {\delta }{\delta J(x_{1})}}\cdots {\frac {\partial ^{k_{n}}}{\partial x_{n}^{k_{n}}}}{\frac {\delta }{\delta J(x_{n})}}G[J].}

Then, from the properties of the functional integrals

\left\langle {\frac {\delta {\mathcal {S}}}{\delta \varphi (x)}}[\varphi ]+J(x)\right\rangle _{J}=0

we get the "master" Schwinger–Dyson equation:

{\frac {\delta {\mathcal {S}}}{\delta \varphi (x)}}\left[-i{\frac {\delta }{\delta J}}\right]Z[J]+J(x)Z[J]=0,

{\mathcal {S}}_{,i}[-i\partial ]Z+J_{i}Z=0.
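As a worked example, the master equation can be solved in closed form for a quadratic action. The sketch below assumes, for illustration only, an action built from a symmetric invertible matrix A in the DeWitt notation; A is not a symbol from the original text.

```latex
% Assumption: quadratic action with symmetric, invertible A_{ij}
\begin{aligned}
{\mathcal S}[\varphi] &= \tfrac{1}{2}\,\varphi^{i}A_{ij}\varphi^{j},
  \qquad {\mathcal S}_{,i}[\varphi] = A_{ij}\varphi^{j}, \\
-i\,A_{ij}\,Z^{,j}[J] + J_{i}\,Z[J] &= 0
  \quad\Longrightarrow\quad
  Z[J] = Z[0]\,\exp\!\left(-\tfrac{i}{2}\,J_{i}\,(A^{-1})^{ij}J_{j}\right), \\
Z^{,ij}[0] = i^{2}\,Z[0]\left\langle \varphi^{i}\varphi^{j}\right\rangle
  &\quad\Longrightarrow\quad
  \left\langle \varphi^{i}\varphi^{j}\right\rangle = i\,(A^{-1})^{ij}.
\end{aligned}
```

The two-point function is i times the inverse of the quadratic operator, the same structure as the momentum-space propagator i/(p² − m²) found earlier.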

If the functional measure is not translationally invariant, it might be possible to express it as the product

M[\varphi ]\,{\mathcal {D}}\varphi

, where

M

is a functional and

{\mathcal {D}}\varphi

is a translationally invariant measure. This is true, for example, for nonlinear sigma models where the target space is diffeomorphic to

\mathbb {R} ^{n}

. However, if the target manifold is some topologically nontrivial space, the concept of a translation does not even make any sense.

In that case, we would have to replace the S in this equation by another functional

{\hat {\mathcal {S}}}={\mathcal {S}}-i\ln M.

If we expand this equation as a Taylor series about J = 0, we get the entire set of Schwinger–Dyson equations.

Localization

The path integrals are usually thought of as being the sum of all paths through an infinite space–time. However, in local quantum field theory we would restrict everything to lie within a finite causally complete region, for example inside a double light-cone. This gives a more mathematically precise and physically rigorous definition of quantum field theory.

Ward–Takahashi identities

Now how about the on-shell Noether's theorem for the classical case? Does it have a quantum analog as well? Yes, but with a caveat: the functional measure would have to be invariant under the one-parameter group of symmetry transformations as well.
Let's just assume for simplicity here that the symmetry in question is local (not local in the sense of a gauge symmetry, but in the sense that the transformed value of the field at any given point under an infinitesimal transformation would only depend on the field configuration over an arbitrarily small neighborhood of the point in question). Let's also assume that the action is local in the sense that it is the integral over spacetime of a Lagrangian, and that

Q[{\mathcal {L}}(x)]=\partial _{\mu }f^{\mu }(x)

for some function

f

where

f

only depends locally on

φ

(and possibly the spacetime position).
If we don't assume any special boundary conditions, this would not be a "true" symmetry in the true sense of the term in general unless

f = 0

or something. Here,

Q

is a derivation which generates the one parameter group in question. We could have antiderivations as well, such as BRST and supersymmetry.

Let's also assume

\int {\mathcal {D}}\varphi \,Q[F][\varphi ]=0

for any polynomially-bounded functional

F

. This property is called the invariance of the measure, and it does not hold in general. See anomaly (physics) for more details.

Then,

\int {\mathcal {D}}\varphi \,Q\left[Fe^{iS}\right][\varphi ]=0,

which implies

\langle Q[F]\rangle +i\left\langle F\int _{\partial V}f^{\mu }\,ds_{\mu }\right\rangle =0

where the integral is over the boundary. This is the quantum analog of Noether's theorem.

Now, let's assume even further that

Q

is a local integral

Q=\int d^{d}x\,q(x)

where

q(x)[\varphi (y)]=\delta ^{(d)}(x-y)Q[\varphi (y)]\,

so that

q(x)[S]=\partial _{\mu }j^{\mu }(x)\,

where

j^{\mu }(x)=f^{\mu }(x)-{\frac {\partial }{\partial (\partial _{\mu }\varphi )}}{\mathcal {L}}(x)Q[\varphi ]\,

(this is assuming the Lagrangian only depends on

φ

and its first partial derivatives! More general Lagrangians would require a modification to this definition!). Note that we're NOT insisting that

q (x)

is the generator of a symmetry (i.e. we are not insisting upon the gauge principle), but just that

Q

is. And we also assume the even stronger assumption that the functional measure is locally invariant:

\int {\mathcal {D}}\varphi \,q(x)[F][\varphi ]=0.

Then, we would have

{\displaystyle \langle q(x)[F]\rangle +i\langle Fq(x)[S]\rangle =\langle q(x)[F]\rangle +i\left\langle F\partial _{\mu }j^{\mu }(x)\right\rangle =0.}

Alternatively,

{\displaystyle q(x)[S]\left[-i{\frac {\delta }{\delta J}}\right]Z[J]+J(x)Q[\varphi (x)]\left[-i{\frac {\delta }{\delta J}}\right]Z[J]=\partial _{\mu }j^{\mu }(x)\left[-i{\frac {\delta }{\delta J}}\right]Z[J]+J(x)Q[\varphi (x)]\left[-i{\frac {\delta }{\delta J}}\right]Z[J]=0.}

The above two equations are the Ward–Takahashi identities.

Now for the case where

f = 0

, we can forget about all the boundary conditions and locality assumptions. We'd simply have

\left\langle Q[F]\right\rangle =0.

Alternatively,

\int d^{d}x\,J(x)Q[\varphi (x)]\left[-i{\frac {\delta }{\delta J}}\right]Z[J]=0.
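The identity ⟨Q[F]⟩ = 0 can be checked in a finite-dimensional toy model. The sketch below assumes two real variables (x, y) in place of a field, the Wick-rotated weight exp(−S), and a rotation-invariant action S = r²/2 + g r⁴; Q is then the generator of rotations, and the flat measure dx dy is invariant. The function name, the coupling g = 0.1, and the test functional F = x³y are choices made for the example.

```python
import math

def rotation_average(integrand, coupling=0.1, x_max=6.0, points=121):
    """<G> under the weight exp(-S), S = r^2/2 + coupling*r^4 with r^2 = x^2 + y^2,
    computed on a uniform grid. S is invariant under rotations of the (x, y) plane."""
    h = 2.0 * x_max / (points - 1)
    numerator = denominator = 0.0
    for i in range(points):
        x = -x_max + i * h
        for j in range(points):
            y = -x_max + j * h
            r2 = x * x + y * y
            weight = math.exp(-0.5 * r2 - coupling * r2 * r2)
            numerator += integrand(x, y) * weight
            denominator += weight
    return numerator / denominator

# Q = x d/dy - y d/dx generates rotations. Applied to F = x^3 * y it gives
# Q[F] = x^4 - 3 x^2 y^2.
q_of_f = lambda x, y: x ** 4 - 3.0 * x * x * y * y
ward = rotation_average(q_of_f)
x4 = rotation_average(lambda x, y: x ** 4)
print(ward, x4)
```

The average of Q[F] vanishes to numerical precision even though ⟨x⁴⟩ itself does not, which is the content of the identity: the symmetry relates ⟨x⁴⟩ to 3⟨x²y²⟩.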

The need for regulators and renormalization

Path integrals as they are defined here require the introduction of regulators. Changing the scale of the regulator leads to the renormalization group. In fact, renormalization is the major obstruction to making path integrals well-defined.

The path integral in quantum-mechanical interpretation

In one interpretation of quantum mechanics, the "sum over histories" interpretation, the path integral is taken to be fundamental, and reality is viewed as a single indistinguishable "class" of paths that all share the same events. For this interpretation, it is crucial to understand what exactly an event is. The sum-over-histories method gives identical results to canonical quantum mechanics, and Sinha and Sorkin^[16] claim the interpretation explains the Einstein–Podolsky–Rosen paradox without resorting to nonlocality.

Some^[who?] advocates of interpretations of quantum mechanics emphasizing decoherence have attempted to make more rigorous the notion of extracting a classical-like "coarse-grained" history from the space of all possible histories.

Quantum gravity

Whereas in quantum mechanics the path integral formulation is fully equivalent to other formulations, it may be that it can be extended to quantum gravity, which would make it different from the Hilbert space model. Feynman had some success in this direction, and his work has been extended by Hawking and others.^[17] Approaches that use this method include causal dynamical triangulations and spinfoam models.

Quantum tunneling

Quantum tunneling can be modeled by using the path integral formulation to determine the action of the trajectory through a potential barrier. Using the WKB approximation, the tunneling rate (

Γ

) can be determined to be of the form

\Gamma =A_{\mathrm {o} }\exp \left(-{\frac {S_{\mathrm {eff} }}{\hbar }}\right)

with the effective action

S_{\mathrm {eff} }

and pre-exponential factor

A_{\mathrm {o} }

. This form is specifically useful in a dissipative system, in which the system and its surroundings must be modeled together. Using the Langevin equation to model Brownian motion, the path integral formulation can be used to determine an effective action and pre-exponential factor to see the effect of dissipation on tunneling.^[18] From this model, tunneling rates of macroscopic systems (at finite temperatures) can be predicted.
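For a single particle without dissipation, the exponent can be computed directly. The sketch below assumes the standard WKB expression S_eff = 2∫√(2m(V(x) − E)) dx over the classically forbidden region, a rectangular barrier, and a pre-exponential factor supplied by hand. None of these specifics come from the text above; the function names and numbers are hypothetical.

```python
import math

def effective_action(potential, energy, x_left, x_right, mass=1.0, points=2001):
    """S_eff = 2 * int sqrt(2 m (V(x) - E)) dx across the classically forbidden
    region [x_left, x_right], by the trapezoidal rule."""
    h = (x_right - x_left) / (points - 1)
    total = 0.0
    for k in range(points):
        x = x_left + k * h
        kappa = math.sqrt(max(2.0 * mass * (potential(x) - energy), 0.0))
        total += kappa * (0.5 if k in (0, points - 1) else 1.0)
    return 2.0 * total * h

def tunneling_rate(prefactor, s_eff, hbar=1.0):
    """Gamma = A_o * exp(-S_eff / hbar)."""
    return prefactor * math.exp(-s_eff / hbar)

barrier = lambda x: 2.0    # rectangular barrier of height 2 (hypothetical numbers)
s_eff = effective_action(barrier, energy=1.0, x_left=0.0, x_right=3.0)
rate = tunneling_rate(prefactor=1.0, s_eff=s_eff)
print(s_eff, rate)
```

For the rectangular barrier the integral reduces to 2L√(2m(V₀ − E)), which the quadrature reproduces. The dissipative case described above needs an effective action that also accounts for the surroundings, which this sketch does not attempt.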

High-speed light-based systems could replace supercomputers for certain ‘deep learning’ calculations

Low power requirements for photons (instead of electrons) may make deep learning more practical in future self-driving cars and mobile consumer devices.

June 14, 2017
Original link: http://www.kurzweilai.net/learning-with-light-new-system-allows-optical-deep-learning

(a) Optical micrograph of an experimentally fabricated on-chip optical interference unit; the physical region where the optical neural network program exists is highlighted in gray. A programmable nanophotonic processor uses a field-programmable gate array (similar to an FPGA integrated circuit): an array of interconnected waveguides, allowing the light beams to be modified as needed for a specific deep-learning matrix computation. (b) Schematic illustration of the optical neural network program, which performs matrix multiplication and amplification fully optically. (credit: Yichen Shen et al./Nature Photonics)

A team of researchers at MIT and elsewhere has developed a new approach to deep learning systems — using light instead of electricity, which they say could vastly improve the speed and efficiency of certain deep-learning computations.

Deep-learning systems are based on artificial neural networks that mimic the way the brain learns from an accumulation of examples. They can enable technologies such as face- and voice-recognition software, or scour vast amounts of medical data to find patterns that could be useful diagnostically, for example.

But the computations these systems carry out are highly complex and demanding, even for supercomputers. Traditional computer architectures are not very efficient for calculations needed for neural-network tasks that involve repeated multiplications of matrices (arrays of numbers). These can be computationally intensive for conventional CPUs or even GPUs.

Programmable nanophotonic processor

Instead, the new approach uses an optical device that the researchers call a “programmable nanophotonic processor.” Multiple light beams are directed in such a way that their waves interact with each other, producing interference patterns that “compute” the intended operation.

The optical chips using this architecture could, in principle, carry out dense matrix multiplications (the most power-hungry and time-consuming part in AI algorithms) for learning tasks much faster, compared to conventional electronic chips. The researchers expect a computational speed enhancement of at least two orders of magnitude over the state-of-the-art and three orders of magnitude in power efficiency.

“This chip, once you tune it, can carry out matrix multiplication with, in principle, zero energy, almost instantly,” says Marin Soljacic, one of the MIT researchers on the team.

To demonstrate the concept, the team set the programmable nanophotonic processor to implement a neural network that recognizes four basic vowel sounds. Even with the prototype system, they were able to achieve a 77 percent accuracy level, compared to about 90 percent for conventional systems. There are “no substantial obstacles” to scaling up the system for greater accuracy, according to Soljacic.

The team says it will still take a lot more time and effort to make this system useful. However, once the system is scaled up and fully functioning, the low-power system should find many uses, especially in situations where power is limited, such as in self-driving cars, drones, and mobile consumer devices. Other uses include signal processing for data transmission and computer centers.

The research was published Monday (June 12, 2017) in a paper in the journal Nature Photonics (open-access version available on arXiv).

The team also included researchers at Elenion Technologies of New York and the Université de Sherbrooke in Quebec. The work was supported by the U.S. Army Research Office through the Institute for Soldier Nanotechnologies, the National Science Foundation, and the Air Force Office of Scientific Research.

Abstract of Deep learning with coherent nanophotonic circuits

Artificial neural networks are computational network models inspired by signal processing in the brain. These models have dramatically improved performance for many machine-learning tasks, including speech and image recognition. However, today’s computing hardware is inefficient at implementing neural networks, in large part because much of it was designed for von Neumann computing schemes. Significant effort has been made towards developing electronic architectures tuned to implement artificial neural networks that exhibit improved computational speed and accuracy. Here, we propose a new architecture for a fully optical neural network that, in principle, could offer an enhancement in computational speed and power efficiency over state-of-the-art electronics for conventional inference tasks. We experimentally demonstrate the essential part of the concept using a programmable nanophotonic processor featuring a cascaded array of 56 programmable Mach–Zehnder interferometers in a silicon photonic integrated circuit and show its utility for vowel recognition.
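To make the idea of "computing by interference" concrete, here is a minimal Python sketch of a single idealised, lossless Mach–Zehnder interferometer acting on two optical modes. The convention used (external phase shifter φ, 50:50 coupler, internal phase shifter θ, 50:50 coupler) is one common textbook choice and is an assumption here. It is not claimed to match the layout of the device in the paper. The function names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi_transfer_matrix(theta, phi):
    """Transfer matrix of an idealised Mach-Zehnder interferometer.

    Assumed order of elements seen by the light: external phase (phi),
    50:50 coupler, internal phase (theta), 50:50 coupler.
    """
    s = 1.0 / math.sqrt(2.0)
    coupler = [[s, 1j * s], [1j * s, s]]
    internal = [[cmath.exp(1j * theta), 0], [0, 1]]
    external = [[cmath.exp(1j * phi), 0], [0, 1]]
    return matmul(coupler, matmul(internal, matmul(coupler, external)))


def output_powers(matrix, amplitudes):
    """Apply the matrix to input field amplitudes and return output powers."""
    out = [sum(matrix[i][j] * amplitudes[j] for j in range(2)) for i in range(2)]
    return [abs(x) ** 2 for x in out]


# Light enters the first port only; theta steers it between the two outputs.
for theta in (0.0, math.pi / 2, math.pi):
    powers = output_powers(mzi_transfer_matrix(theta, 0.0), [1.0, 0.0])
    print(f"theta = {theta:.3f}: output powers = {powers[0]:.3f}, {powers[1]:.3f}")
```

Each interferometer therefore applies a tunable 2×2 unitary matrix to a pair of modes. A cascaded array of such elements, like the 56 interferometers mentioned in the abstract, can in principle be programmed to realise larger matrix operations.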


Pion

From Wikipedia, the free encyclopedia

Pion
The quark structure of the pion.
Composition	π⁺: u d̄; π⁰: u ū or d d̄; π⁻: d ū
Statistics	Bosonic
Interactions	Strong, Weak, Electromagnetic and Gravity
Symbol	π⁺, π⁰, and π⁻
Theorized	Hideki Yukawa (1935)
Discovered	César Lattes, Giuseppe Occhialini (1947) and Cecil Powell
Types	3
Mass	π^±: 139.57018(35) MeV/c²; π⁰: 134.9766(6) MeV/c²
Electric charge	π⁺: +1 e; π⁰: 0 e; π⁻: −1 e
Spin	0
Parity	−1

In particle physics, a pion (or a pi meson, denoted with the Greek letter pi: π) is any of three subatomic particles: π⁰, π⁺, and π⁻. Each pion consists of a quark and an antiquark and is therefore a meson. Pions are the lightest mesons and, more generally, the lightest hadrons. They are unstable, with the charged pions π⁺ and π⁻ decaying with a mean lifetime of 26.033 nanoseconds (2.6033×10⁻⁸ seconds), and the neutral pion π⁰ decaying with a much shorter lifetime of 8.4×10⁻¹⁷ seconds. Charged pions most often decay into muons and muon neutrinos, while neutral pions generally decay into gamma rays.

The exchange of virtual pions, along with the vector, rho and omega mesons, provides an explanation for the residual strong force between nucleons. Pions are not produced in radioactive decay, but are commonly produced in high energy accelerators in collisions between hadrons. All types of pions are also produced in natural processes when high energy cosmic ray protons and other hadronic cosmic ray components interact with matter in the Earth's atmosphere. Recently, the detection of characteristic gamma rays originating from the decay of neutral pions in two supernova remnants has shown that pions are produced copiously after supernovas, most probably in conjunction with production of high energy protons that are detected on Earth as cosmic rays.^[1]

The concept of mesons as the carrier particles of the nuclear force was first proposed in 1935 by Hideki Yukawa. While the muon was first proposed to be this particle after its discovery in 1936, later work found that it did not participate in the strong nuclear interaction. The pions, which turned out to be examples of Yukawa's proposed mesons, were discovered later: the charged pions in 1947, and the neutral pion in 1950.

History

An animation of the nuclear force (or residual strong force) interaction. The small colored double disks are gluons.

The same process as in the animation with the individual quark constituents shown, to illustrate how the fundamental strong interaction gives rise to the nuclear force. Straight lines are quarks, while multi-colored loops are gluons (the carriers of the fundamental force). Other gluons, which bind together the proton, neutron, and pion "in-flight," are not shown.

Theoretical work by Hideki Yukawa in 1935 had predicted the existence of mesons as the carrier particles of the strong nuclear force. From the range of the strong nuclear force (inferred from the radius of the atomic nucleus), Yukawa predicted the existence of a particle having a mass of about 100 MeV. Initially after its discovery in 1936, the muon (initially called the "mu meson") was thought to be this particle, since it has a mass of 106 MeV. However, later experiments showed that the muon did not participate in the strong nuclear interaction. In modern terminology, this makes the muon a lepton, and not a meson. However, some communities of astrophysicists continue to call the muon a "mu-meson".
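Yukawa's estimate can be reproduced with a one-line relation between the range r of a force and the mass of its carrier, m c² ≈ ħc / r. The Python sketch below applies it for two assumed ranges. The values 1.4 fm and 2.0 fm are illustrative choices for the size scale of the nuclear force, not figures taken from the text.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm


def mediator_mass_mev(range_fm):
    """Estimate the rest energy (MeV) of a force carrier from the force range (fm)."""
    return HBAR_C_MEV_FM / range_fm


# Assumed ranges of the nuclear force, in femtometres (illustrative values).
for range_fm in (1.4, 2.0):
    print(f"range {range_fm} fm -> mass of about {mediator_mass_mev(range_fm):.0f} MeV")
```

Both assumed ranges give a mass of order 100 MeV, which is why the 106 MeV muon initially looked like a plausible candidate.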

In 1947, the first true mesons, the charged pions, were found by the collaboration of Cecil Powell, César Lattes, Giuseppe Occhialini, et al., at the University of Bristol, in England. Since the era of particle accelerators had not yet begun, high-energy subatomic particles were only obtainable from atmospheric cosmic rays. Photographic emulsions based on the gelatin-silver process were placed for long periods of time at high-altitude mountain sites, first at Pic du Midi de Bigorre in the Pyrenees, and later at Chacaltaya in the Andes Mountains, where the plates were struck by cosmic rays.

After the development of the photographic plates, microscopic inspection of the emulsions revealed the tracks of charged subatomic particles. Pions were first identified by their unusual "double meson" tracks, which were left by their decay into a putative meson. The particle was identified as a muon, which is not typically classified as a meson in modern particle physics. In 1948, Lattes, Eugene Gardner, and their team first artificially produced pions at the University of California's cyclotron in Berkeley, California, by bombarding carbon atoms with high-speed alpha particles. Further advanced theoretical work was carried out by Riazuddin, who, in 1959, used the dispersion relation for Compton scattering of virtual photons on pions to analyze their charge radius.^[2]

Nobel Prizes in Physics were awarded to Yukawa in 1949 for his theoretical prediction of the existence of mesons, and to Cecil Powell in 1950 for developing and applying the technique of particle detection using photographic emulsions.

Since the neutral pion is not electrically charged, it is more difficult to detect and observe than the charged pions are. Neutral pions do not leave tracks in photographic emulsions or Wilson cloud chambers. The existence of the neutral pion was inferred from observing its decay products from cosmic rays, a so-called "soft component" of slow electrons with photons. The π⁰ was identified definitively at the University of California's cyclotron in 1950 by observing its decay into two photons.^[3] Later in the same year, they were also observed in cosmic-ray balloon experiments at Bristol University.

The pion also plays a crucial role in cosmology, by imposing an upper limit on the energies of cosmic rays surviving collisions with the cosmic microwave background, through the Greisen–Zatsepin–Kuzmin limit.

In the standard understanding of the strong force interaction as defined by quantum chromodynamics, pions are loosely portrayed as Goldstone bosons of spontaneously broken chiral symmetry. That explains why the masses of the three kinds of pions are considerably less than those of the other mesons, such as the scalar or vector mesons. If their current quarks were massless particles, the chiral symmetry would be exact and the Goldstone theorem would dictate that all pions have zero mass. Empirically, since the light quarks actually have minuscule nonzero masses, the pions also have nonzero rest masses. However, those masses are almost an order of magnitude smaller than that of the nucleons, roughly^[4] m_π ≈ √(v m_q) / f_π ≈ √(m_q) × 45 MeV, where m_q are the relevant current quark masses in MeV, around 5−10 MeV.
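The rough scaling quoted above, m_π ≈ 45 MeV × √m_q with m_q expressed in MeV, is easy to evaluate. The short Python sketch below does so across the stated 5−10 MeV window; the sampled values and the function name are illustrative choices.

```python
import math

PION_MASS_CHARGED = 139.57018  # MeV/c^2, as quoted in this article
PION_MASS_NEUTRAL = 134.9766   # MeV/c^2, as quoted in this article


def estimated_pion_mass(quark_mass_mev):
    """Rough estimate m_pi ~ 45 MeV * sqrt(m_q), with m_q given in MeV."""
    return 45.0 * math.sqrt(quark_mass_mev)


# Sample the quoted window of current quark masses (illustrative sampling).
for m_q in (5.0, 7.5, 10.0):
    print(f"m_q = {m_q:4.1f} MeV -> m_pi of roughly {estimated_pion_mass(m_q):.0f} MeV")
```

The estimate is only an order-of-magnitude relation, but the measured pion masses fall within the range it spans for quark masses between 5 and 10 MeV.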

The use of pions in medical radiation therapy, such as for cancer, was explored at a number of research institutions, including the Los Alamos National Laboratory's Meson Physics Facility, which treated 228 patients between 1974 and 1981 in New Mexico,^[5] and the TRIUMF laboratory in Vancouver, British Columbia.

Theoretical overview

The pion can be thought of as one of the particles that mediate the interaction between a pair of nucleons. This interaction is attractive: it pulls the nucleons together. Written in a non-relativistic form, it is called the Yukawa potential. The pion, being spinless, has kinematics described by the Klein–Gordon equation. In the terms of quantum field theory, the effective field theory Lagrangian describing the pion-nucleon interaction is called the Yukawa interaction.

The nearly identical masses of π^± and π⁰ imply that there must be a symmetry at play; this symmetry is called the SU(2) flavour symmetry or isospin. The reason that there are three pions, π⁺, π⁻ and π⁰, is that these are understood to belong to the triplet representation or the adjoint representation 3 of SU(2). By contrast, the up and down quarks transform according to the fundamental representation 2 of SU(2), whereas the anti-quarks transform according to the conjugate representation 2*.

With the addition of the strange quark, one can say that the pions participate in an SU(3) flavour symmetry, belonging to the adjoint representation 8 of SU(3). The other members of this octet are the four kaons and the eta meson.

Pions are pseudoscalars under a parity transformation. Pion currents thus couple to the axial vector current and pions participate in the chiral anomaly.

Basic properties

Pions, which are mesons with zero spin, are composed of first-generation quarks. In the quark model, an up quark and an anti-down quark make up a π⁺, whereas a down quark and an anti-up quark make up the π⁻, and these are the antiparticles of one another. The neutral pion π⁰ is a combination of an up quark with an anti-up quark or a down quark with an anti-down quark. The two combinations have identical quantum numbers, and hence they are only found in superpositions. The lowest-energy superposition of these is the π⁰, which is its own antiparticle. Together, the pions form a triplet of isospin. Each pion has isospin (I = 1) and third-component isospin equal to its charge (I_z = +1, 0 or −1).

Charged pion decays

Feynman diagram of the dominating leptonic pion decay.

The π^± mesons have a mass of 139.6 MeV/c² and a mean lifetime of 2.6033×10⁻⁸ s. They decay due to the weak interaction. The primary decay mode of a pion, with a branching fraction of 0.999877, is a leptonic decay into a muon and a muon neutrino:

π⁺	→	μ⁺	+	ν_μ
π⁻	→	μ⁻	+	ν̄_μ

The second most common decay mode of a pion, with a branching fraction of 0.000123, is also a leptonic decay into an electron and the corresponding electron antineutrino. This "electronic mode" was discovered at CERN in 1958:^[6]

π⁺	→	e⁺	+	ν_e
π⁻	→	e⁻	+	ν̄_e

The suppression of the electronic decay mode with respect to the muonic one is given approximately (up to a few percent effect of the radiative corrections) by the ratio of the half-widths of the pion–electron and the pion–muon decay reactions:

R_{\pi }=\left({\frac {m_{e}}{m_{\mu }}}\right)^{2}\left({\frac {m_{\pi }^{2}-m_{e}^{2}}{m_{\pi }^{2}-m_{\mu }^{2}}}\right)^{2}=1.283\times 10^{-4}

and is a spin effect known as helicity suppression. Its mechanism is as follows: the negative pion has spin zero, so the lepton and the antineutrino must be emitted with opposite spins (and opposite linear momenta) to preserve net zero spin (and conserve linear momentum). However, because the weak interaction is sensitive only to the left-chirality component of fields, the antineutrino always has left chirality, which means it is right-handed, since for massless antiparticles the helicity is opposite to the chirality. This implies that the lepton must be emitted with its spin in the direction of its linear momentum (i.e., also right-handed). If, however, leptons were massless, they would interact with the pion only in the left-handed form (because for massless particles helicity is the same as chirality) and this decay mode would be prohibited. The suppression of the electron decay channel therefore comes from the fact that the electron's mass is much smaller than the muon's: the electron is nearly massless compared with the muon, and so the electronic mode is almost prohibited.^[7] Although this explanation suggests that parity violation causes the helicity suppression, the fundamental reason lies in the vector nature of the interaction, which demands a different handedness for the neutrino and the charged lepton. Thus, even a parity-conserving interaction would yield the same suppression.

Measurements of the above ratio have been considered for decades to be a test of lepton universality. Experimentally, this ratio is 1.230(4)×10⁻⁴.^[8]
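The tree-level ratio R_π given above can be checked numerically. The Python sketch below evaluates it from the particle masses. The charged pion mass is the value quoted in this article, while the electron and muon masses are standard reference values supplied here as assumptions, since the text does not list them.

```python
M_PI = 139.57018    # charged pion mass, MeV/c^2 (value quoted in this article)
M_E = 0.51099895    # electron mass, MeV/c^2 (assumed reference value)
M_MU = 105.6583755  # muon mass, MeV/c^2 (assumed reference value)

R_MEASURED = 1.230e-4  # experimental ratio quoted in the text


def helicity_suppression_ratio(m_e, m_mu, m_pi):
    """R_pi = (m_e/m_mu)^2 * ((m_pi^2 - m_e^2) / (m_pi^2 - m_mu^2))^2."""
    mass_factor = (m_e / m_mu) ** 2
    phase_space = ((m_pi ** 2 - m_e ** 2) / (m_pi ** 2 - m_mu ** 2)) ** 2
    return mass_factor * phase_space


r_tree = helicity_suppression_ratio(M_E, M_MU, M_PI)
relative_gap = (r_tree - R_MEASURED) / R_MEASURED
print(f"tree-level R_pi = {r_tree:.4e}")
print(f"relative difference from measurement = {relative_gap:.1%}")
```

The few-percent gap between the tree-level number and the measured value is the size of effect that the text attributes to radiative corrections.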

Besides the purely leptonic decays of pions, some structure-dependent radiative leptonic decays (that is, decay to the usual leptons plus a gamma ray) have also been observed.

Also observed, for charged pions only, is the very rare "pion beta decay" (with branching fraction of about 10⁻⁸) into a neutral pion, an electron and an electron antineutrino (or for positive pions, a neutral pion, a positron, and electron neutrino).

π⁻	→	π⁰	+	e⁻	+	ν̄_e
π⁺	→	π⁰	+	e⁺	+	ν_e

The rate at which pions decay is a prominent quantity in many sub-fields of particle physics, such as chiral perturbation theory. This rate is parametrized by the pion decay constant (ƒ_π), related to the wave function overlap of the quark and antiquark, which is about 130 MeV.^[9]

Neutral pion decays

The π⁰ meson has a mass of 135.0 MeV/c² and a mean lifetime of 8.4×10⁻¹⁷ s. It decays via the electromagnetic force, which explains why its mean lifetime is much smaller than that of the charged pion (which can only decay via the weak force). The main π⁰ decay mode, with a branching ratio of BR=0.98823, is into two photons:

π⁰	→	2γ.

The decay π⁰ → 3γ (as well as decays into any odd number of photons) is forbidden by the C-symmetry of the electromagnetic interaction. The intrinsic C-parity of the π⁰ is +1, while the C-parity of a system of n photons is (−1)ⁿ.

The second largest π⁰ decay mode (BR=0.01174) is the Dalitz decay (named after Richard Dalitz), which is a two-photon decay with an internal photon conversion, resulting in a photon and an electron-positron pair in the final state:

π⁰	→	γ	+	e⁻	+	e⁺.

The third largest established decay mode (BR=3.34×10⁻⁵) is the double Dalitz decay, with both photons undergoing internal conversion which leads to further suppression of the rate:

π⁰	→	e⁻	+	e⁺	+	e⁻	+	e⁺.

The fourth largest established decay mode is the loop-induced and therefore suppressed (and additionally helicity-suppressed) leptonic decay mode (BR=6.46×10⁻⁸):

π⁰	→	e⁻	+	e⁺.

The neutral pion has also been observed to decay into positronium with a branching fraction of the order of 10⁻⁹. No other decay modes have been established experimentally. The branching fractions above are the PDG central values, and their uncertainties are not quoted.
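As a quick consistency check on the numbers in this section, the Python sketch below sums the four quoted π⁰ branching fractions and encodes the C-parity selection rule for decays into n photons. The dictionary keys and function name are hypothetical labels chosen for illustration.

```python
# Central values of the branching fractions quoted in this section.
BRANCHING_FRACTIONS = {
    "2 gamma": 0.98823,
    "gamma e- e+ (Dalitz)": 0.01174,
    "e- e+ e- e+ (double Dalitz)": 3.34e-5,
    "e- e+": 6.46e-8,
}

PI0_C_PARITY = +1  # intrinsic C-parity of the neutral pion


def photon_decay_allowed(n_photons):
    """A decay into n photons conserves C only if (-1)^n equals the pi0 C-parity."""
    return (-1) ** n_photons == PI0_C_PARITY


total = sum(BRANCHING_FRACTIONS.values())
print(f"sum of quoted branching fractions = {total:.7f}")
for n in (2, 3, 4):
    status = "allowed" if photon_decay_allowed(n) else "forbidden"
    print(f"pi0 -> {n} gamma: {status} by C-symmetry")
```

The sum differs from one only at the level expected from rounding of the quoted central values, which is consistent with the statement that no other decay modes have been established.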

Pions
Particle name	Particle symbol	Antiparticle symbol	Quark content^[10]	Rest mass (MeV/c²)	I^G	J^PC	S	C	B'	Mean lifetime (s)	Commonly decays to (>5% of decays)
Pion^[8]	π⁺	π⁻	u d̄	139.570 18 ± 0.000 35	1⁻	0⁻	0	0	0	(2.6033 ± 0.0005) × 10⁻⁸	μ⁺ + ν_μ
Pion^[11]	π⁰	Self	(u ū − d d̄)/√2 ^[a]	134.976 6 ± 0.000 6	1⁻	0⁻⁺	0	0	0	(8.4 ± 0.6) × 10⁻¹⁷	γ + γ

^[a] Make-up inexact due to non-zero quark masses.
