Development of matrix mechanics
In 1925, Werner Heisenberg, Max Born, and Pascual Jordan formulated the matrix mechanics representation of quantum mechanics.
Epiphany at Helgoland
In 1925 Werner Heisenberg was working in Göttingen on the problem of calculating the spectral lines of hydrogen. By May 1925 he began trying to describe atomic systems by observables only. On June 7, to escape the effects of a bad attack of hay fever, Heisenberg left for the pollen-free North Sea island of Helgoland. While there, in between climbing and learning poems from Goethe's West-östlicher Diwan by heart, he continued to ponder the spectral issue and eventually realised that adopting non-commuting observables might solve the problem. He later wrote:[3]
"It was about three o'clock at night when the final result of the calculation lay before me. At first I was deeply shaken. I was so excited that I could not think of sleep. So I left the house and awaited the sunrise on the top of a rock."
The Three Fundamental Papers
After Heisenberg returned to Göttingen, he showed Wolfgang Pauli his calculations, commenting at one point:[4]
"Everything is still vague and unclear to me, but it seems as if the electrons will no more move on orbits."
On July 9 Heisenberg gave the same paper of his calculations to Max Born, saying, "...he had written a crazy paper and did not dare to send it in for publication, and that Born should read it and advise him on it..." prior to publication. Heisenberg then departed for a while, leaving Born to analyse the paper.[5]
In the paper, Heisenberg formulated quantum theory without sharp electron orbits. Hendrik Kramers had earlier calculated the relative intensities of spectral lines in the Sommerfeld model by interpreting the Fourier coefficients of the orbits as intensities. But his answer, like all other calculations in the old quantum theory, was only correct for large orbits.
Heisenberg, after a collaboration with Kramers,[6] began to understand that the transition probabilities were not quite classical quantities, because the only frequencies that appear in the Fourier series should be the ones that are observed in quantum jumps, not the fictional ones that come from Fourier-analyzing sharp classical orbits. He replaced the classical Fourier series with a matrix of coefficients, a fuzzed-out quantum analog of the Fourier series. Classically, the Fourier coefficients give the intensity of the emitted radiation, so in quantum mechanics the magnitudes of the matrix elements of the position operator gave the intensities of the lines in the bright-line spectrum. The quantities in Heisenberg's formulation were the classical position and momentum, but now they were no longer sharply defined. Each quantity was represented by a collection of Fourier coefficients with two indices, corresponding to the initial and final states.[7]
[7]
When Born read the paper, he recognized the formulation as one which could be transcribed and extended to the systematic language of matrices,[8] which he had learned from his study under Jakob Rosanes[9] at Breslau University. Born, with the help of his assistant and former student Pascual Jordan, began immediately to make the transcription and extension, and they submitted their results for publication; the paper was received for publication just 60 days after Heisenberg's paper.[10] A follow-on paper was submitted for publication before the end of the year by all three authors.[11]
[11]
(A brief review of Born's role in the development of the matrix mechanics formulation of quantum mechanics, along with a discussion of the key formula involving the non-commutativity of the probability amplitudes, can be found in an article by Jeremy Bernstein.[12] A detailed historical and technical account can be found in Mehra and Rechenberg's book The Historical Development of Quantum Theory. Volume 3. The Formulation of Matrix Mechanics and Its Modifications 1925–1926.[13])
- W. Heisenberg, Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen, Zeitschrift für Physik, 33, 879-893, 1925 (received July 29, 1925). [English translation in: B. L. van der Waerden, editor, Sources of Quantum Mechanics (Dover Publications, 1968) ISBN 0-486-61881-1 (English title: Quantum-Theoretical Re-interpretation of Kinematic and Mechanical Relations).]
- M. Born and P. Jordan, Zur Quantenmechanik, Zeitschrift für Physik, 34, 858-888, 1925 (received September 27, 1925). [English translation in: B. L. van der Waerden, editor, Sources of Quantum Mechanics (Dover Publications, 1968) ISBN 0-486-61881-1 (English title: On Quantum Mechanics).]
- M. Born, W. Heisenberg, and P. Jordan, Zur Quantenmechanik II, Zeitschrift für Physik, 35, 557-615, 1926 (received November 16, 1925). [English translation in: B. L. van der Waerden, editor, Sources of Quantum Mechanics (Dover Publications, 1968) ISBN 0-486-61881-1 (English title: On Quantum Mechanics II).]
Up until this time, matrices were seldom used by physicists; they were considered to belong to the realm of pure mathematics. Gustav Mie had used them in a paper on electrodynamics in 1912, and Born had used them in his work on the lattice theory of crystals in 1921. While matrices were used in these cases, the algebra of matrices with their multiplication did not enter the picture as it did in the matrix formulation of quantum mechanics.[14]
[14]
Born, however, had learned matrix algebra from Rosanes, as already noted, but Born had also learned Hilbert's theory of integral equations and quadratic forms for an infinite number of variables, as was apparent from a citation by Born of Hilbert's work Grundzüge einer allgemeinen Theorie der Linearen Integralgleichungen, published in 1912.[15][16] Jordan, too, was well equipped for the task. For a number of years, he had been an assistant to Richard Courant at Göttingen in the preparation of Courant and David Hilbert's book Methoden der mathematischen Physik I, which was published in 1924.[17]
[17]
This book, fortuitously, contained a great many of the mathematical tools necessary for the continued development of quantum mechanics. In 1926, John von Neumann became assistant to David Hilbert, and he would coin the term Hilbert space to describe the algebra and analysis which were used in the development of quantum mechanics.[18][19]
Heisenberg's reasoning
Before matrix mechanics, the old quantum theory described the motion of a particle by a classical orbit, with well defined position and momentum X(t), P(t), with the restriction that the time integral over one period T of the momentum times the velocity must be a positive integer multiple of Planck's constant,

$$\int_0^T P\,\frac{dX}{dt}\,dt = n h.$$

While this restriction correctly selects orbits with more or less the right energy values E_n, the old quantum mechanical formalism did not describe time dependent processes, such as the emission or absorption of radiation.
When a classical particle is weakly coupled to a radiation field, so that the radiative damping can be neglected, it will emit radiation in a pattern which repeats itself every orbital period. The frequencies which make up the outgoing wave are then integer multiples of the orbital frequency, and this is a reflection of the fact that X(t) is periodic, so that its Fourier representation has frequencies 2πn/T only:

$$X(t) = \sum_{n=-\infty}^{\infty} e^{2\pi i n t / T}\, X_n.$$
The coefficients X_n are complex numbers. The ones with negative frequencies must be the complex conjugates of the ones with positive frequencies, so that X(t) will always be real,

$$X_{-n} = X_n^*.$$
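As an illustrative numerical sketch (not part of the article), the reality condition on the Fourier coefficients can be checked for an arbitrary real periodic "orbit"; the particular X(t) below is a hypothetical choice.

```python
import numpy as np

# Illustrative sketch: a real periodic "orbit" X(t) with period T = 1,
# a slightly anharmonic superposition chosen only for demonstration.
T = 1.0
def X(t):
    return np.cos(2 * np.pi * t / T) + 0.3 * np.cos(4 * np.pi * t / T + 0.5)

# Fourier coefficients X_n = (1/T) * integral of X(t) exp(-2*pi*i*n*t/T) dt,
# approximated by an average over a uniform grid of sample times.
ts = np.linspace(0.0, T, 4096, endpoint=False)
def coeff(n):
    return np.mean(X(ts) * np.exp(-2j * np.pi * n * ts / T))

# Reality of X(t) forces X_{-n} = conjugate(X_n) for every harmonic.
for n in range(1, 4):
    assert np.allclose(coeff(-n), np.conj(coeff(n)))

# The lowest harmonic of cos(2*pi*t) carries coefficient 1/2.
assert abs(coeff(1) - 0.5) < 1e-9
```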
A quantum mechanical particle, on the other hand, cannot emit radiation continuously; it can only emit photons. Assuming that the quantum particle started in orbit number n, emitted a photon, then ended up in orbit number m, the energy of the photon is E_n − E_m, which means that its frequency is (E_n − E_m)/h. For large n and m, but with n − m relatively small, these are the classical frequencies, by Bohr's correspondence principle:

$$\frac{E_n - E_m}{h} \approx \frac{n - m}{T}.$$
In the formula above, T is the classical period of either orbit n or orbit m, since the difference between them is of higher order in h. But for n and m small, or if n − m is large, the frequencies are not integer multiples of any single frequency.
Since the frequencies which the particle emits are the same as the frequencies in the Fourier description of its motion, this suggests that something in the time-dependent description of the particle is oscillating with frequency (E_n − E_m)/h. Heisenberg called this quantity X_nm, and demanded that it should reduce to the classical Fourier coefficients in the classical limit. For large values of n, m but with n − m relatively small, X_nm is the (n − m)th Fourier coefficient of the classical motion at orbit n. Since X_nm has opposite frequency to X_mn, the condition that X is real becomes

$$X_{nm} = X_{mn}^*.$$
By definition, X_nm only has the frequency (E_n − E_m)/h, so its time evolution is simple:

$$X_{nm}(t) = e^{2\pi i (E_n - E_m) t / h}\, X_{nm}(0).$$

This is the original form of Heisenberg's equation of motion.
Given two arrays X_nm and P_nm describing two physical quantities, Heisenberg could form a new array of the same type by combining the terms X_nk P_km, which also oscillate with the right frequency. Since the Fourier coefficients of the product of two quantities are the convolution of the Fourier coefficients of each one separately, the correspondence with Fourier series allowed Heisenberg to deduce the rule by which the arrays should be multiplied:

$$(XP)_{nm} = \sum_k X_{nk}\, P_{km}.$$
Born pointed out that this is the law of matrix multiplication, so that the position, the momentum, the energy, all the observable quantities in the theory, are interpreted as matrices. Under this multiplication rule, the product depends on the order: XP is different from PX.
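Born's observation can be illustrated with a minimal numerical sketch (the two Hermitian arrays below are arbitrary, chosen only for illustration): writing out Heisenberg's sum over the intermediate index k reproduces the row-by-column matrix product, and the product indeed depends on the order of the factors.

```python
import numpy as np

# Two arbitrary Hermitian arrays standing in for "position" and "momentum".
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Xm = (M + M.conj().T) / 2
N = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Pm = (N + N.conj().T) / 2

# Heisenberg's combination rule, written out with an explicit sum over k ...
XP = np.array([[sum(Xm[n, k] * Pm[k, m] for k in range(4))
                for m in range(4)] for n in range(4)])

# ... coincides with the matrix product, as Born recognized.
assert np.allclose(XP, Xm @ Pm)

# The product depends on the order: XP differs from PX in general.
assert not np.allclose(Xm @ Pm, Pm @ Xm)
```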
The X matrix is a complete description of the motion of a quantum mechanical particle. Because the frequencies in the quantum motion are not multiples of a common frequency, the matrix elements cannot be interpreted as the Fourier coefficients of a sharp classical trajectory. Nevertheless, as matrices, X(t) and P(t) satisfy the classical equations of motion; also see Ehrenfest's theorem, below.
Matrix basics
When it was introduced by Werner Heisenberg, Max Born and Pascual Jordan in 1925, matrix mechanics was not immediately accepted and was at first a source of controversy. Schrödinger's later introduction of wave mechanics was greatly favored.
Part of the reason was that Heisenberg's formulation was in an odd mathematical language, for the time, while Schrödinger's formulation was based on familiar wave equations. But there was also a deeper sociological reason. Quantum mechanics had been developing by two paths, one under the direction of Einstein and the other under the direction of Bohr. Einstein emphasized wave–particle duality, while Bohr emphasized the discrete energy states and quantum jumps. De Broglie had shown how to reproduce the discrete energy states in Einstein's framework – the quantum condition is the standing wave condition – and this gave hope to those in the Einstein school that all the discrete aspects of quantum mechanics would be subsumed into a continuous wave mechanics.
Matrix mechanics, on the other hand, came from the Bohr school, which was concerned with discrete energy states and quantum jumps. Bohr's followers did not appreciate physical models which pictured electrons as waves, or as anything at all. They preferred to focus on the quantities which were directly connected to experiments.
In atomic physics, spectroscopy gave observational data on atomic transitions arising from the interactions of atoms with light quanta. The Bohr school required that only those quantities which were in principle measurable by spectroscopy should appear in the theory. These quantities include the energy levels and their intensities, but they do not include the exact location of a particle in its Bohr orbit. It is very hard to imagine an experiment which could determine whether an electron in the ground state of a hydrogen atom is to the right or to the left of the nucleus. It was a deep conviction that such questions did not have an answer.
The matrix formulation was built on the premise that all physical observables are represented by matrices, whose elements are indexed by two different energy levels. The set of eigenvalues of the matrix was eventually understood to be the set of all possible values that the observable can have. Since Heisenberg's matrices are Hermitian, the eigenvalues are real.
If an observable is measured and the result is a certain eigenvalue, the corresponding eigenvector is the state of the system immediately after the measurement. The act of measurement in matrix mechanics 'collapses' the state of the system. If one measures two observables simultaneously, the state of the system collapses to a common eigenvector of the two observables. Since most matrices don't have any eigenvectors in common, most observables can never be measured precisely at the same time. This is the uncertainty principle.
If two matrices share their eigenvectors, they can be simultaneously diagonalized. In the basis where they are both diagonal, it is clear that their product does not depend on their order, because multiplication of diagonal matrices is just multiplication of numbers. The uncertainty principle, by contrast, is an expression of the fact that two matrices A and B do not always commute, i.e., that AB − BA does not necessarily equal 0. The fundamental commutation relation of matrix mechanics,

$$\sum_k \left( X_{nk}\, P_{km} - P_{nk}\, X_{km} \right) = i\hbar\, \delta_{nm},$$

implies then that there are no states which simultaneously have a definite position and momentum.
This principle of uncertainty holds for many other pairs of observables as well. For example, the energy does not commute with the position either, so it is impossible to precisely determine the position and energy of an electron in an atom.
Nobel Prize
In 1928, Albert Einstein nominated Heisenberg, Born, and Jordan for the Nobel Prize in Physics.[20] The announcement of the Nobel Prize in Physics for 1932 was delayed until November 1933.[21] It was announced at that time that Heisenberg had won the Prize for 1932 "for the creation of quantum mechanics, the application of which has, inter alia, led to the discovery of the allotropic forms of hydrogen"[22] and that Erwin Schrödinger and Paul Adrien Maurice Dirac shared the 1933 Prize "for the discovery of new productive forms of atomic theory".[22]
One can rightly ask why Born was not awarded the Prize in 1932 along with Heisenberg, and Bernstein gives some speculations on this matter. One of them relates to Jordan joining the Nazi Party on May 1, 1933 and becoming a Storm Trooper.[23] Hence, Jordan's Party affiliations and Jordan's links to Born may have affected Born's chance at the Prize at that time. Bernstein also notes that when Born won the Prize in 1954, Jordan was still alive, and the Prize was awarded for the statistical interpretation of quantum mechanics, attributable to Born alone.[24]
Heisenberg's reactions to Born, on Heisenberg's receiving the Prize for 1932 and on Born's receiving it in 1954, are also instructive in evaluating whether Born should have shared the Prize with Heisenberg. On November 25, 1933 Born received a letter from Heisenberg in which he said he had been delayed in writing due to a "bad conscience" that he alone had received the Prize "for work done in Göttingen in collaboration – you, Jordan and I." Heisenberg went on to say that Born and Jordan's contribution to quantum mechanics cannot be changed by "a wrong decision from the outside."[25]
In 1954, Heisenberg wrote an article honoring Max Planck for his insight in 1900. In the article, Heisenberg credited Born and Jordan for the final mathematical formulation of matrix mechanics and went on to stress how great their contributions were to quantum mechanics, which were not "adequately acknowledged in the public eye."[26]
Mathematical development
Once Heisenberg introduced the matrices for X and P, he could find their matrix elements in special cases by guesswork, guided by the correspondence principle. Since the matrix elements are the quantum mechanical analogs of Fourier coefficients of the classical orbits, the simplest case is the harmonic oscillator, where the classical position and momentum, X(t) and P(t), are sinusoidal.
Harmonic oscillator
In units where the mass and frequency of the oscillator are equal to one (see nondimensionalization), the energy of the oscillator is

$$H = \frac{1}{2}\left(P^2 + X^2\right).$$

The level sets of H are the clockwise orbits, and they are nested circles in phase space. The classical orbit with energy E is

$$X(t) = \sqrt{2E}\,\cos(t), \qquad P(t) = -\sqrt{2E}\,\sin(t).$$

The old quantum condition dictates that the integral of P dX over an orbit, which is the area of the circle in phase space, must be an integer multiple of Planck's constant. The area of the circle of radius √(2E) is 2πE. So

$$2\pi E = n h,$$

or, in natural units where ħ = 1, the energy is an integer.
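A quick numerical sketch (the energy E below is an arbitrary illustrative value) confirms that the action integral over one classical orbit equals the enclosed phase-space area 2πE:

```python
import numpy as np

# Sketch: with mass = frequency = 1 as in the text, integrate P dX
# around one oscillator orbit of energy E and compare with 2*pi*E.
E = 1.7                                   # illustrative energy
t = np.linspace(0.0, 2 * np.pi, 20_001)
X = np.sqrt(2 * E) * np.cos(t)
P = -np.sqrt(2 * E) * np.sin(t)

# Action J = closed line integral of P dX, by the trapezoid rule.
J = np.sum((P[:-1] + P[1:]) / 2 * np.diff(X))
assert abs(J - 2 * np.pi * E) < 1e-4      # area of the phase-space circle
```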
The Fourier components of X(t) and P(t) are simple, and more so if they are combined into the quantities

$$A = X + iP, \qquad A^\dagger = X - iP.$$
Both A and A† have only a single frequency, and X and P can be recovered from their sum and difference.
Since A(t) has a classical Fourier series with only the lowest frequency, and the matrix element A_mn is the (m − n)th Fourier coefficient of the classical orbit, the matrix for A is nonzero only on the line just above the diagonal, where it is equal to √(2E_n). The matrix for A† is likewise only nonzero on the line below the diagonal, with the same elements. Thus, from A and A†, reconstruction yields

$$X = \frac{1}{\sqrt{2}}\begin{pmatrix}
0 & \sqrt{1} & 0 & 0 & \cdots \\
\sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\
0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\
0 & 0 & \sqrt{3} & 0 & \ddots \\
\vdots & \vdots & \vdots & \ddots & \ddots
\end{pmatrix}$$

and

$$P = \frac{1}{\sqrt{2}}\begin{pmatrix}
0 & -i\sqrt{1} & 0 & 0 & \cdots \\
i\sqrt{1} & 0 & -i\sqrt{2} & 0 & \cdots \\
0 & i\sqrt{2} & 0 & -i\sqrt{3} & \cdots \\
0 & 0 & i\sqrt{3} & 0 & \ddots \\
\vdots & \vdots & \vdots & \ddots & \ddots
\end{pmatrix},$$

which, up to the choice of units, are the Heisenberg matrices for the harmonic oscillator. Note that both matrices are Hermitian, since they are constructed from the Fourier coefficients of real quantities.
Finding X(t) and P(t) is direct, since they are quantum Fourier coefficients, so they evolve simply with time:

$$X_{mn}(t) = X_{mn}(0)\, e^{i(E_m - E_n)t}, \qquad P_{mn}(t) = P_{mn}(0)\, e^{i(E_m - E_n)t}.$$

The matrix product of X and P is not Hermitian, but has a real and an imaginary part. The real part is one half the symmetric expression XP + PX, while the imaginary part is proportional to the commutator

$$[X, P] = XP - PX.$$
It is simple to verify explicitly that XP − PX, in the case of the harmonic oscillator, is iħ multiplied by the identity. It is likewise simple to verify that the matrix

$$H = \frac{1}{2}\left(X^2 + P^2\right)$$

is a diagonal matrix, with eigenvalues E_i.
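These verifications can be sketched numerically with finite matrices (the truncation to a finite dimension is an assumption forced by finite arrays, and it corrupts the last row and column; the code uses the conventionally normalized lowering matrix a, which is the article's A divided by √2):

```python
import numpy as np

# Units hbar = mass = frequency = 1. Build a truncated lowering matrix
# with sqrt(n) just above the diagonal, then recover X and P from it.
dim = 12
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # a[n-1, n] = sqrt(n)
ad = a.conj().T                                 # the raising matrix
X = (a + ad) / np.sqrt(2)
P = (a - ad) / (1j * np.sqrt(2))

# XP - PX = i times the identity, away from the truncation edge.
C = X @ P - P @ X
assert np.allclose(C[:dim - 1, :dim - 1], 1j * np.eye(dim - 1))

# H = (X^2 + P^2)/2 is exactly diagonal, with eigenvalues n + 1/2.
H = (X @ X + P @ P) / 2
assert np.allclose(H, np.diag(np.diag(H)))
assert np.allclose(np.diag(H)[:dim - 1], np.arange(dim - 1) + 0.5)
```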
Conservation of energy
The harmonic oscillator is an important case. Finding the matrices is easier than determining the general conditions from these special forms. For this reason, Heisenberg investigated the anharmonic oscillator, with Hamiltonian

$$H = \frac{P^2}{2} + \frac{X^2}{2} + \epsilon X^3.$$
In this case, the X and P matrices are no longer simple off-diagonal matrices, since the corresponding classical orbits are slightly squashed and displaced, so that they have Fourier coefficients at every classical frequency. To determine the matrix elements, Heisenberg required that the classical equations of motion be obeyed as matrix equations,

$$\frac{dX}{dt} = P, \qquad \frac{dP}{dt} = -X - 3\epsilon X^2.$$

He noticed that if this could be done, then H, considered as a matrix function of X and P, will have zero time derivative,

$$\dot{H} = P * \dot{P} + X * \dot{X} + 3\epsilon\, X^2 * \dot{X} = 0,$$

where A∗B is the anticommutator,

$$A * B = \frac{1}{2}\left(AB + BA\right).$$
Given that all the off-diagonal elements have a nonzero frequency, H being constant implies that H is diagonal. It was clear to Heisenberg that in this formalism the energy of an arbitrary quantum system could be exactly conserved, a very encouraging sign.
The process of emission and absorption of photons seemed to demand that the conservation of energy will hold at best on average. If a wave containing exactly one photon passes over some atoms, and one of them absorbs it, that atom needs to tell the others that they can't absorb the photon anymore. But if the atoms are far apart, any signal cannot reach the other atoms in time, and they might end up absorbing the same photon anyway and dissipating the energy to the environment. When the signal reached them, the other atoms would have to somehow recall that energy. This paradox led Bohr, Kramers and Slater to abandon exact conservation of energy. Heisenberg's formalism, when extended to include the electromagnetic field, was obviously going to sidestep this problem, a hint that the interpretation of the theory will involve wavefunction collapse.
Differentiation trick — canonical commutation relations
Demanding that the classical equations of motion are preserved is not a strong enough condition to determine the matrix elements. Planck's constant does not appear in the classical equations, so that the matrices could be constructed for many different values of ħ and still satisfy the equations of motion, but with different energy levels. So, in order to implement his program, Heisenberg needed to use the old quantum condition to fix the energy levels, then fill in the matrices with Fourier coefficients of the classical equations, then alter the matrix coefficients and the energy levels slightly to make sure the classical equations are satisfied. This is clearly not satisfactory. The old quantum conditions refer to the area enclosed by the sharp classical orbits, which do not exist in the new formalism.
The most important thing that Heisenberg discovered is how to translate the old quantum condition into a simple statement in matrix mechanics. To do this, he investigated the action integral as a matrix quantity,

$$J_{mn} = \int_0^T \left( P\, \frac{dX}{dt} \right)_{mn}\, dt.$$

There are several problems with this integral, all stemming from the incompatibility of the matrix formalism with the old picture of orbits. Which period T should be used? Semiclassically, it should be either the period of orbit m or that of orbit n, but the difference is of order ħ, and an answer to order ħ is sought. The quantum condition tells us that J_mn is 2πn on the diagonal, so the fact that J is classically constant tells us that the off-diagonal elements are zero.
His crucial insight was to differentiate the quantum condition with respect to n. This idea only makes complete sense in the classical limit, where n is not an integer but the continuous action variable J, but Heisenberg performed analogous manipulations with matrices, where the intermediate expressions are sometimes discrete differences and sometimes derivatives. In the following discussion, for the sake of clarity, the differentiation will be performed on the classical variables, and the transition to matrix mechanics will be done afterwards, guided by the correspondence principle.
In the classical setting, the derivative is the derivative with respect to J of the integral which defines J, so it is tautologically equal to 1:

$$\frac{dJ}{dJ} = 1 = \frac{d}{dJ}\int_0^T P\, \frac{dX}{dt}\, dt = \int_0^T \left( \frac{dP}{dJ}\, \frac{dX}{dt} + P\, \frac{d}{dJ}\frac{dX}{dt} \right) dt,$$

where the derivatives dP/dJ and dX/dJ should be interpreted as differences with respect to J at corresponding times on nearby orbits, exactly what would be obtained if the Fourier coefficients of the orbital motion were differentiated. (These derivatives are symplectically orthogonal in phase space to the time derivatives dP/dt and dX/dt.)
The final expression is clarified by introducing the variable canonically conjugate to J, which is called the angle variable θ: the derivative with respect to time is a derivative with respect to θ, up to a factor of 2π/T,

$$\frac{d}{dt} = \frac{2\pi}{T}\, \frac{d}{d\theta}.$$

So the quantum condition integral is the average value over one cycle of the Poisson bracket of X and P,

$$1 = \frac{2\pi}{T} \int_0^T \left( \frac{dP}{dJ}\, \frac{dX}{d\theta} - \frac{dP}{d\theta}\, \frac{dX}{dJ} \right) dt.$$
An analogous differentiation of the Fourier series of P dX demonstrates that the off-diagonal elements of the Poisson bracket are all zero. The Poisson bracket of two canonically conjugate variables, such as X and P, is the constant value 1, so this integral really is the average value of 1; so it is 1, as we knew all along, because it is dJ/dJ after all. But Heisenberg, Born and Jordan, unlike Dirac, were not familiar with the theory of Poisson brackets, so, for them, the differentiation effectively evaluated {X, P} in J, θ coordinates.
The Poisson bracket, unlike the action integral, does have a simple translation to matrix mechanics: it normally corresponds to the imaginary part of the product of two variables, the commutator. To see this, examine the (antisymmetrized) product of two matrices A and B in the correspondence limit, where the matrix elements are slowly varying functions of the index, keeping in mind that the answer is zero classically.
In the correspondence limit, when indices m, n are large and nearby, while k, r are small, the rate of change of the matrix elements in the diagonal direction is the matrix element of the J derivative of the corresponding classical quantity. So it is possible to shift any matrix element diagonally through the correspondence, keeping in mind that the shifted expression is really only the (m − n)th Fourier component of dA/dJ at the orbit near m to this semiclassical order, not a full well-defined matrix.
The semiclassical time derivative of a matrix element is obtained, up to a factor of i, by multiplying by the distance from the diagonal, since the coefficient A_m(m+k) is semiclassically the kth Fourier coefficient of the mth classical orbit.
The imaginary part of the product of A and B can be evaluated by shifting the matrix elements around so as to reproduce the classical answer, which is zero. The leading nonzero residual is then given entirely by the shifting. Since all the matrix elements are at indices which have a small distance from the large index position (m, m), it helps to introduce two temporary notations: A[r, k] = A_(m+r)(m+k) for the matrices, and (dA/dJ)[r] for the rth Fourier components of classical quantities.
Flipping the summation variable in the first sum from r to r′ = k − r rewrites the matrix element as a sum in which the principal (classical) part manifestly cancels. The leading quantum part, neglecting the higher-order product of derivatives in the residual expression, is then

$$(AB - BA)[k] \;\approx\; i\hbar\, \{A, B\}[k],$$

which can be identified with iħ times the kth classical Fourier component of the Poisson bracket.
Heisenberg's original differentiation trick was eventually extended to a full semiclassical derivation of the quantum condition, in collaboration with Born and Jordan. Once they were able to establish that

$$XP - PX = i\hbar,$$

this condition replaced and extended the old quantization rule, allowing the matrix elements of P and X for an arbitrary system to be determined simply from the form of the Hamiltonian.
The new quantization rule was assumed to be universally true, even though the derivation from the old quantum theory required semiclassical reasoning. (A full quantum treatment, for more elaborate arguments of the brackets, was appreciated in the 1940s to amount to extending Poisson brackets to Moyal brackets.)
State vectors and the Heisenberg equation
To make the transition to standard quantum mechanics, the most important further addition was the quantum state vector, now written |ψ⟩, which is the vector that the matrices act on. Without the state vector, it is not clear which particular motion the Heisenberg matrices are describing, since they include all the motions somewhere. The interpretation of the state vector, whose components are written ψ_m, was furnished by Born. This interpretation is statistical: the result of a measurement of the physical quantity corresponding to the matrix A is random, with an average value equal to

$$\langle A \rangle = \sum_{mn} \psi_m^*\, A_{mn}\, \psi_n.$$

Alternatively, and equivalently, the state vector gives the probability amplitude ψ_n for the quantum system to be in the energy state n.
Once the state vector was introduced, matrix mechanics could be rotated to any basis, where the H matrix need no longer be diagonal. The Heisenberg equation of motion in its original form states that A_mn evolves in time like a Fourier component,

$$A_{mn}(t) = A_{mn}(0)\, e^{i(E_m - E_n)t/\hbar},$$

which can be recast in differential form

$$\frac{dA_{mn}}{dt} = \frac{i}{\hbar}\,(E_m - E_n)\, A_{mn},$$

and it can be restated so that it is true in an arbitrary basis, by noting that the H matrix is diagonal with diagonal values E_m:

$$\frac{dA}{dt} = \frac{i}{\hbar}\left(HA - AH\right).$$

This is now a matrix equation, so it holds in any basis. This is the modern form of the Heisenberg equation of motion. Its formal solution is:

$$A(t) = e^{iHt/\hbar}\, A(0)\, e^{-iHt/\hbar}.$$
All these forms of the equation of motion above say the same thing, that A(t) is equivalent to A(0) through a basis rotation by the unitary matrix e^{iHt/ħ}, a systematic picture elucidated by Dirac in his bra–ket notation.
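A short numerical sketch (the Hermitian H and A(0) below are arbitrary, chosen only for illustration, with ħ = 1) checks that the formal solution satisfies the differential form of the Heisenberg equation:

```python
import numpy as np

# Build the matrix exponential e^{iHt} from the eigendecomposition of a
# randomly chosen Hermitian H; check dA/dt = i(HA - AH) with a central
# finite difference.
rng = np.random.default_rng(1)
def hermitian(n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

H, A0 = hermitian(5), hermitian(5)
evals, V = np.linalg.eigh(H)

def U(t):                      # e^{iHt} via the spectral decomposition
    return V @ np.diag(np.exp(1j * evals * t)) @ V.conj().T

def A(t):                      # the formal solution A(t) = U A(0) U^dagger
    return U(t) @ A0 @ U(-t)

t, dt = 0.7, 1e-6
lhs = (A(t + dt) - A(t - dt)) / (2 * dt)   # numerical dA/dt
rhs = 1j * (H @ A(t) - A(t) @ H)           # i [H, A(t)]
assert np.allclose(lhs, rhs, atol=1e-5)
```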
Conversely, by rotating the basis for the state vector at each time by e^{iHt/ħ}, the time dependence in the matrices can be undone. The matrices are now time independent, but the state vector rotates,

$$|\psi_t\rangle = e^{-iHt/\hbar}\,|\psi_0\rangle, \qquad i\hbar\, \frac{d}{dt}\,|\psi_t\rangle = H\,|\psi_t\rangle.$$

This is the Schrödinger equation for the state vector, and this time-dependent change of basis amounts to transformation to the Schrödinger picture, with ⟨x|ψ⟩ = ψ(x).
In quantum mechanics in the Heisenberg picture the state vector |ψ⟩ does not change with time, while an observable A satisfies the Heisenberg equation of motion,

$$\frac{dA}{dt} = \frac{i}{\hbar}\,[H, A] + \frac{\partial A}{\partial t}.$$

The extra term is for operators which have an explicit time dependence, in addition to the time dependence from the unitary evolution discussed.
The Heisenberg picture does not distinguish time from space, so it is better suited to relativistic theories than the Schrödinger equation. Moreover, the similarity to classical physics is more manifest: the Hamiltonian equations of motion for classical mechanics are recovered by replacing the commutator above by the Poisson bracket (see also below). By the Stone–von Neumann theorem, the Heisenberg picture and the Schrödinger picture must be unitarily equivalent, as detailed below.
Further results
Matrix mechanics rapidly developed into modern quantum mechanics, and gave interesting physical results on the spectra of atoms.
Wave mechanics
Jordan noted that the commutation relations ensure that P acts as a differential operator. The operator identity

$$[P, X^n] = [P, X]\, X^{n-1} + X\, [P, X^{n-1}]$$

allows the evaluation of the commutator of P with any power of X, and it implies that

$$[P, X^n] = -\,i\hbar\, n\, X^{n-1},$$

which, together with linearity, implies that a P-commutator effectively differentiates any analytic matrix function of X. Assuming limits are defined sensibly, this extends to arbitrary functions, but the extension need not be made explicit until a certain degree of mathematical rigor is required:

$$[P, f(X)] = -\,i\hbar\, f'(X).$$
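A numerical sketch of Jordan's observation, in a basis where P acts as a derivative (the periodic grid, the FFT representation of P, and the choice f(x) = sin x are all illustrative assumptions, with ħ = 1):

```python
import numpy as np

# On a periodic grid, represent P = -i d/dx via the FFT and check
# [P, f(X)] psi = -i f'(X) psi for a smooth periodic f(x) = sin(x).
n = 256
x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
k = np.fft.fftfreq(n, d=2 * np.pi / n) * 2 * np.pi   # integer wavenumbers

def Pop(psi):                    # momentum operator as a spectral derivative
    return -1j * np.fft.ifft(1j * k * np.fft.fft(psi))

f, fprime = np.sin(x), np.cos(x)
psi = np.exp(np.cos(x))          # a smooth periodic test state

commutator = Pop(f * psi) - f * Pop(psi)   # [P, f(X)] applied to psi
assert np.allclose(commutator, -1j * fprime * psi, atol=1e-8)
```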
Since X is a Hermitian matrix, it should be diagonalizable, and it will be clear from the eventual form of P that every real number can be an eigenvalue. This makes some of the mathematics subtle, since there is a separate eigenvector for every point in space.
In the basis where X is diagonal, an arbitrary state can be written as a superposition of states with eigenvalues x,

$$|\psi\rangle = \int dx\; \psi(x)\, |x\rangle,$$

so that ψ(x) = ⟨x|ψ⟩, and the operator X multiplies each eigenvector by x,

$$X\,|x\rangle = x\,|x\rangle.$$
Define a linear operator D which differentiates ψ,

$$D\,\psi(x) = \frac{d\psi(x)}{dx},$$

and note that

$$D X - X D = 1,$$

so that the operator −iħD obeys the same commutation relation as P. Thus, the difference between P and −iħD must commute with X,
$$[\,P + i\hbar D,\; X\,] = 0,$$

so it may be simultaneously diagonalized with X: its value acting on any eigenstate of X is some function f of the eigenvalue x. This function must be real, because both P and −iħD are Hermitian,

$$P = -\,i\hbar\, D + f(X).$$

Rotating each state by a phase, that is, redefining the phase of the wavefunction,

$$\psi(x) \;\to\; e^{\frac{i}{\hbar}\int^x f(x')\,dx'}\, \psi(x),$$

shifts the operator −iħD by the amount f(X),

$$-\,i\hbar\, D \;\to\; -\,i\hbar\, D + f(X),$$

which means that, in the rotated basis, P is equal to −iħD.
Hence, there is always a basis for the eigenvalues of X where the action of P on any wavefunction is known:

$$P\,\psi(x) = -\,i\hbar\, \frac{d\psi(x)}{dx},$$

and the Hamiltonian in this basis is a linear differential operator on the state-vector components. For a Hamiltonian of the standard form H = P²/2m + V(X), the equation of motion for the state vector is but a celebrated differential equation, the Schrödinger equation:

$$i\hbar\, \frac{\partial \psi(x,t)}{\partial t} = -\,\frac{\hbar^2}{2m}\, \frac{\partial^2 \psi(x,t)}{\partial x^2} + V(x)\, \psi(x,t).$$
Since D is a differential operator, in order for it to be sensibly defined, there must be eigenvalues of X which neighbor every given value. This suggests that the only possibility is that the space of all eigenvalues of X is all real numbers, and that P is −iħD, up to a phase rotation.
To make this rigorous requires a sensible discussion of the limiting space of functions, and in this space this is the Stone–von Neumann theorem: any operators X and P which obey the commutation relations can be made to act on a space of wavefunctions, with P a derivative operator. This implies that a Schrödinger picture is always available.
Matrix mechanics easily extends to many degrees of freedom in a natural way. Each degree of freedom has a separate
X operator and a separate effective differential operator
P, and the wavefunction is a function of all the possible eigenvalues of the independent commuting
X variables.
In particular, this means that a system of N interacting particles in 3 dimensions is described by one vector whose components, in a basis where all the X are diagonal, form a mathematical function on 3N-dimensional space
describing all their possible positions, effectively a
much bigger collection of values than the mere collection of N
three-dimensional wavefunctions in one physical space. Schrödinger came
to the same conclusion independently, and eventually proved the
equivalence of his own formalism to Heisenberg's.
Since the wavefunction is a property of the whole system, not of any
one part, the description in quantum mechanics is not entirely local.
The description of several quantum particles has them correlated, or
entangled. This entanglement leads to strange correlations between distant particles which violate the classical
Bell's inequality.
Even if the particles can only be in just two positions, the wavefunction for N particles requires 2^N complex numbers, one for each total configuration of positions. This is exponentially many numbers in N,
so simulating quantum mechanics on a computer requires exponential
resources. Conversely, this suggests that it might be possible to find
quantum systems of size
N which physically compute the answers to problems which classically require 2^N bits to solve. This is the aspiration behind
quantum computing.
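The exponential growth is easy to make concrete. A minimal sketch (illustrative code, not from the article): storing the wavefunction of N two-state particles as an explicit vector requires 2^N complex amplitudes:

```python
import numpy as np

def uniform_state(n):
    """Equal superposition over all configurations of n two-state particles."""
    dim = 2 ** n                      # one amplitude per total configuration
    return np.full(dim, 1 / np.sqrt(dim), dtype=complex)

psi = uniform_state(10)               # already 1024 amplitudes for 10 particles
growth_per_particle = len(uniform_state(11)) // len(psi)
```

Each added particle doubles the storage required, which is why brute-force simulation of quantum mechanics becomes infeasible beyond a few dozen particles.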
Ehrenfest theorem
For the time-independent operators X and P, ∂A/∂t = 0, so the Heisenberg equation above reduces to:
[27]

iℏ dA/dt = [A, H],
where the square brackets [ , ] denote the
commutator. For a Hamiltonian which is H = P²/2m + V(X), the X and P operators satisfy:

dX/dt = P/m,   dP/dt = −∂V/∂X,
where the first is classically the
velocity, and second is classically the
force, or
potential gradient. These reproduce Hamilton's form of
Newton's laws of motion. In the
Heisenberg picture, the
X and
P
operators satisfy the classical equations of motion. Taking the expectation value of both sides of the equation shows that, in any state |ψ⟩,

d⟨X⟩/dt = ⟨P⟩/m,   d⟨P⟩/dt = ⟨−∂V/∂X⟩.

So Newton's laws are exactly obeyed by the expected values of the operators in any given state. This is
Ehrenfest's theorem,
which is an obvious corollary of the Heisenberg equations of motion,
but is less trivial in the Schrödinger picture, where Ehrenfest
discovered it.
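The operator equations above can be checked directly on Heisenberg's matrices. A sketch (illustrative code, not from the article) uses the harmonic oscillator in units where ℏ = m = ω = 1, with the matrices truncated to n states; the Heisenberg equation dX/dt = i[H, X] reproduces P exactly, except in the rows and columns spoiled by the truncation:

```python
import numpy as np

n = 20
a = np.diag(np.sqrt(np.arange(1.0, n)), k=1)   # annihilation matrix a
X = (a + a.T) / np.sqrt(2)                     # position matrix
P = 1j * (a.T - a) / np.sqrt(2)                # momentum matrix
H = P @ P / 2 + X @ X / 2                      # H = P^2/2m + V(X), V = X^2/2

dXdt = 1j * (H @ X - X @ H)                    # Heisenberg equation of motion
# compare with P/m = P, away from the truncation boundary
err = np.max(np.abs((dXdt - P)[: n - 2, : n - 2]))
```

In the infinite-dimensional limit the agreement is exact; the discrepancy lives entirely in the last two basis states, an artifact of cutting the matrices off.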
Transformation theory
In classical mechanics, a canonical transformation of phase space
coordinates is one which preserves the structure of the Poisson
brackets. The new variables
x',p' have the same Poisson brackets with each other as the original variables
x,p.
Time evolution is a canonical transformation, since the phase space at
any time is just as good a choice of variables as the phase space at any
other time.
The Hamiltonian flow is the canonical transformation:

x → x′ = x + dt ∂H/∂p,   p → p′ = p − dt ∂H/∂x.
Since the Hamiltonian can be an arbitrary function of x and p, there are infinitesimal canonical transformations corresponding to every classical quantity G, where G serves as the Hamiltonian to generate a flow of points in phase space for an increment of time s,

x → x′ = x + s ∂G/∂p,   p → p′ = p − s ∂G/∂x.
For a general function A(x, p) on phase space, its infinitesimal change at every step ds under this map is

dA = ds {A, G}.
The quantity
G is called the
infinitesimal generator of the canonical transformation.
In quantum mechanics, the quantum analog G is now a Hermitian matrix, and the equations of motion are given by commutators,

dA = i ds [G, A].

The infinitesimal canonical motions can be formally integrated, just as the Heisenberg equation of motion was integrated,

A′ = U A U†,

where U = e^{iGs} and s is an arbitrary parameter.
The definition of a quantum canonical transformation is thus an
arbitrary unitary change of basis on the space of all state vectors.
U is an arbitrary unitary matrix, a complex rotation in phase space,

A → A′ = U A U†.

These transformations leave the sum of the absolute squares of the wavefunction components invariant,
while they take states which are multiples of each other (including
states which are imaginary multiples of each other) to states which are
the
same multiple of each other.
The interpretation of the matrices is that they act as
generators of motions on the space of states.
For example, the motion generated by P can be found by solving the Heisenberg equation of motion using P as a Hamiltonian,

dX = i ds [P, X] = ds.

These are translations of the matrix X by a multiple of the identity matrix,

X → X + s I.

This is the interpretation of the derivative operator D: e^{iPs} = e^{sD}, the exponential of a derivative operator is a translation (so Lagrange's shift operator).
The
X operator likewise generates translations in
P. The Hamiltonian generates
translations in time, the angular momentum generates
rotations in physical space, and the operator
X² + P² generates
rotations in phase space.
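The translation generated by P can be demonstrated numerically. In the momentum basis P is diagonal, so e^{iPs} is just a phase on each Fourier mode; the sketch below (illustrative code, not from the article; ℏ = 1, with the FFT standing in for the momentum basis) shifts a Gaussian by s:

```python
import numpy as np

n, L = 1024, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)      # momentum eigenvalues
psi = np.exp(-x**2)                              # Gaussian centered at x = 0
s = 3.0

# e^{iPs} in the momentum basis: multiply each Fourier mode by e^{iks}.
shifted = np.fft.ifft(np.exp(1j * k * s) * np.fft.fft(psi))

# e^{iPs} = e^{sD} is Lagrange's shift psi(x) -> psi(x + s),
# so the Gaussian ends up centered at x = -s.
err = np.max(np.abs(shifted - np.exp(-(x + s) ** 2)))
```

The agreement is spectrally accurate because the Gaussian is smooth and well contained in the periodic box.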
When a transformation, like a rotation in physical space, commutes with the Hamiltonian, the transformation is called a symmetry (behind a degeneracy) of the Hamiltonian: the Hamiltonian expressed in terms of rotated coordinates is the same as the original Hamiltonian. This means that the change in the Hamiltonian under the infinitesimal symmetry generator L vanishes,

dH = i ds [L, H] = 0.

It then follows that the change in the generator under time translation also vanishes,

dL = i dt [H, L] = 0,

so that the matrix L is constant in time: it is conserved.
The one-to-one association of infinitesimal symmetry generators and conservation laws was discovered by
Emmy Noether for classical mechanics, where the commutators are
Poisson brackets,
but the quantum-mechanical reasoning is identical. In quantum
mechanics, any unitary symmetry transformation yields a conservation
law, since if the matrix U has the property that

U† H U = H,

it follows that

H U = U H,

and that the time derivative of U is zero: it is conserved.
The eigenvalues of unitary matrices are pure phases, so that the
value of a unitary conserved quantity is a complex number of unit
magnitude, not a real number. Another way of saying this is that a
unitary matrix is the exponential of
i times a Hermitian matrix,
so that the additive conserved real quantity, the phase, is only
well-defined up to an integer multiple of
2π. Only when the
unitary symmetry matrix is part of a family that comes arbitrarily close
to the identity are the conserved real quantities single-valued, and
then the demand that they are conserved becomes a much more exacting constraint.
Symmetries which can be continuously connected to the identity are called
continuous, and translations, rotations, and boosts are examples. Symmetries which cannot be continuously connected to the identity are
discrete, and the operation of space-inversion, or
parity, and
charge conjugation are examples.
The interpretation of the matrices as generators of canonical transformations is due to
Paul Dirac.
[28] The correspondence between symmetries and matrices was shown by
Eugene Wigner to be complete, if
antiunitary matrices which describe symmetries which include time-reversal are included.
Selection rules
It was physically clear to Heisenberg that the absolute squares of the matrix elements of
X, which are the Fourier coefficients of the oscillation, would yield the rate of emission of electromagnetic radiation.
In the classical limit of large orbits, if a charge with position
X(t) and charge
q is oscillating next to an equal and opposite charge at position 0, the instantaneous dipole moment is
q X(t),
and the time variation of this moment translates directly into the
space-time variation of the vector potential, which yields nested
outgoing spherical waves.
For atoms, the wavelength of the emitted light is about 10,000 times
the atomic radius, and the dipole moment is the only contribution to the
radiative field, while all other details of the atomic charge
distribution can be ignored.
Ignoring back-reaction, the power radiated in each outgoing mode is a sum of separate contributions from the square of each independent time Fourier mode of the dipole moment d.
Now, in Heisenberg's representation, the Fourier coefficients of the dipole moment are the matrix elements of
X.
This correspondence allowed Heisenberg to provide the rule for the transition intensities, the fraction of the time that, starting from an initial state i, a photon is emitted and the atom jumps to a final state j.
This then allowed the magnitude of the matrix elements to be interpreted statistically:
they give the intensity of the spectral lines, the probability for quantum jumps from the emission of dipole radiation.
Since the transition rates are given by the matrix elements of
X, wherever
Xij is zero, the corresponding transition should be absent. These were called the
selection rules, which were a puzzle until the advent of matrix mechanics.
An arbitrary state of the hydrogen atom, ignoring spin, is labelled by |n; ℓ, m⟩, where the value of ℓ is a measure of the total orbital angular momentum and m is its z-component, which defines the orbit orientation. The components of the angular momentum pseudovector are

Lx = Y Pz − Z Py,   Ly = Z Px − X Pz,   Lz = X Py − Y Px,

where the products in this expression are independent of order and real, because different components of X and P commute.
The commutation relations of L with all three coordinate matrices X, Y, Z (or with any vector) are easy to find,

[Lx, X] = 0,   [Lx, Y] = iℏ Z,   [Lx, Z] = −iℏ Y,

and cyclically for Ly and Lz, which confirms that the operator L generates rotations between the three components of the vector of coordinate matrices X.
From this, the commutator of Lz and the coordinate matrices X, Y, Z can be read off,

[Lz, X] = iℏ Y,   [Lz, Y] = −iℏ X,

[Lz, Z] = 0.
This means that the quantities X + iY, X − iY have a simple commutation rule,

[Lz, X + iY] = ℏ (X + iY),

[Lz, X − iY] = −ℏ (X − iY).
Just like the matrix elements of
X + iP and
X − iP for
the harmonic oscillator Hamiltonian, this commutation law implies that
these operators only have certain off diagonal matrix elements in states
of definite
m,
meaning that the matrix
(X + iY) takes an eigenvector of
Lz with eigenvalue
m to an eigenvector with eigenvalue
m + 1. Similarly, (X − iY) decreases
m by one unit, while
Z does not change the value of
m.
So, in a basis of |
ℓ,m⟩ states where
L2 and
Lz have definite values, the matrix elements of any of the three components of the position are zero, except when
m is the same or changes by one unit.
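The raising argument rests only on the commutation rule [Lz, A] = ℏA, and can be illustrated with explicit matrices. A sketch (illustrative code, not from the article; ℏ = 1, using the spin-1 representation, with the angular momentum raising operator L+ standing in for X + iY since it obeys the same rule):

```python
import numpy as np

# Spin-1 (l = 1) matrices in the basis of Lz eigenstates m = +1, 0, -1.
Lz = np.diag([1.0, 0.0, -1.0])
Lplus = np.sqrt(2) * np.diag([1.0, 1.0], k=1)    # raising operator L+

# the commutation rule shared with X + iY: [Lz, A] = A (hbar = 1)
rule_holds = np.allclose(Lz @ Lplus - Lplus @ Lz, Lplus)

v = np.array([0.0, 0.0, 1.0])                    # eigenvector of Lz, m = -1
w = Lplus @ v                                    # act with the raising operator
# w is an eigenvector of Lz with eigenvalue m + 1 = 0
raised_ok = np.allclose(Lz @ w, 0 * w)
```

Any operator satisfying the same commutator with Lz moves eigenvectors up the ladder of m values in exactly this way, which is the content of the selection rule on m.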
This places a constraint on the change in total angular momentum. Any
state can be rotated so that its angular momentum is in the
z-direction as much as possible, where
m = ℓ. The matrix element of the position acting on |ℓ,m⟩ can only produce values of m which are bigger by one unit, so that if the coordinates are rotated so that the final state is |ℓ′,ℓ′⟩, the value of ℓ′ can be at most one bigger than the biggest value of ℓ that occurs in the initial state. So ℓ′ is at most ℓ + 1. The matrix elements vanish for ℓ′ > ℓ + 1, and the reverse matrix element is determined by Hermiticity, so these vanish also when ℓ′ < ℓ − 1: dipole transitions are forbidden with a change in angular momentum of more than one unit.
Sum rules
The Heisenberg equation of motion determines the matrix elements of P in the Heisenberg basis from the matrix elements of X,

Pnk = i m (En − Ek) Xnk / ℏ,

which turns the diagonal part of the commutation relation into a sum rule for the magnitude of the matrix elements:

∑k (Ek − En) |Xnk|² = ℏ²/2m.
This yields a relation for the sum of the spectroscopic intensities to and from any given state, although to be absolutely correct, contributions from the radiative capture probability for unbound scattering states must be included in the sum:

∑k (Ek − En) |Xnk|² = ℏ²/2m.
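The sum rule can be verified on the oscillator matrices, where every matrix element is known. A sketch (illustrative code, not from the article; units ℏ = m = ω = 1, so the right-hand side ℏ²/2m becomes 1/2):

```python
import numpy as np

nmax = 30
a = np.diag(np.sqrt(np.arange(1.0, nmax)), k=1)  # annihilation matrix
X = (a + a.T) / np.sqrt(2)                       # position matrix X_nk
E = np.arange(nmax) + 0.5                        # oscillator energies E_n

n = 3                                            # any level far from the cutoff
S = np.sum((E - E[n]) * np.abs(X[n]) ** 2)       # sum_k (E_k - E_n)|X_nk|^2
```

Only the two neighboring levels contribute, with weights (n+1)/2 and −n/2, so the sum collapses to 1/2, as the sum rule demands.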