
Monday, July 6, 2015

Bra–ket notation


From Wikipedia, the free encyclopedia
 
In quantum mechanics, bra–ket notation is a standard notation for describing quantum states, composed of angle brackets and vertical bars. It can also be used to denote abstract vectors and linear functionals in mathematics. It is so called because the inner product (or dot product on a complex vector space) of two states is denoted by
\langle\phi\mid\psi\rangle,
consisting of a left part, \langle\phi| called the bra /brɑː/, and a right part, |\psi\rangle, called the ket /kɛt/. The notation was introduced in 1939 by Paul Dirac[1] and is also known as Dirac notation, though the notation has precursors in Grassmann's use of the notation [\phi\mid\psi] for his inner products nearly 100 years earlier.[2][3]

Bra–ket notation is widespread in quantum mechanics: almost every phenomenon that is explained using quantum mechanics—including a large portion of modern physics—is usually explained with the help of bra–ket notation. Part of the appeal of the notation is the abstract representation-independence it encodes, together with its versatility in producing a specific representation (e.g. x, or p, or eigenfunction base) without much ado, or excessive reliance on the nature of the linear spaces involved. The overlap expression \langle\phi\mid\psi\rangle is typically interpreted as the probability amplitude for the state ψ to collapse into the state φ.

Vector spaces

Background: Vector spaces

In physics, basis vectors allow any Euclidean vector to be represented geometrically using angles and lengths, in different directions, i.e. in terms of the spatial orientations. To see the notational equivalence between ordinary vector notation and bra–ket notation, consider first a vector A starting at the origin and ending at an element of 3-d Euclidean space; the vector is then specified by this end-point, a triplet of elements in the field of real numbers, written symbolically as A ∈ ℝ3.
The vector A can be written using any set of basis vectors and corresponding coordinate system. Informally, basis vectors are like "building blocks of a vector": they are added together to compose a vector, and the coordinates are the numerical coefficients of basis vectors in each direction. Two useful representations of a vector are a linear combination of basis vectors and a column matrix. Using the familiar Cartesian basis, a vector A may be written as
  
\mathbf{A}  \doteq \!\, A_x \mathbf{e}_x + A_y \mathbf{e}_y + A_z \mathbf{e}_z  
 = A_x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} +
A_y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} +
A_z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

 = \begin{pmatrix} A_x \\ 0 \\ 0 \end{pmatrix} +
\begin{pmatrix} 0 \\ A_y \\ 0 \end{pmatrix} +
\begin{pmatrix} 0 \\ 0 \\ A_z \end{pmatrix} 
  = \begin{pmatrix}
A_x \\
A_y \\
A_z \\
\end{pmatrix}

respectively, where ex, ey, ez denote the Cartesian basis vectors (all are orthogonal unit vectors) and Ax, Ay, Az are the corresponding coordinates, in the x, y, z directions. In a more general notation, for any basis in 3-d space one writes
\mathbf{A} \doteq \!\, A_1 \mathbf{e}_1 + A_2 \mathbf{e}_2 + A_3 \mathbf{e}_3 = \begin{pmatrix}
A_1 \\
A_2 \\
A_3 \\
\end{pmatrix}
Generalizing further, consider a vector A in an N-dimensional vector space over the field of complex numbers ℂ, symbolically stated as A ∈ ℂN. The vector A is still conventionally represented by a linear combination of basis vectors or a column matrix:
\mathbf{A} \doteq \!\, \sum_{n=1}^N A_n \mathbf{e}_n = \begin{pmatrix}
A_1 \\
A_2 \\
\vdots \\
A_N \\
\end{pmatrix}
though the coordinates are now all complex-valued.

Even more generally, A can be a vector in a complex Hilbert space. Some Hilbert spaces, like ℂN, have finite dimension, while others have infinite dimension. In an infinite-dimensional space, the column-vector representation of A would be a list of infinitely many complex numbers.

Ket notation for vectors

Rather than the boldface, over-arrows, underscores, etc. conventionally used elsewhere (\mathbf{A},\,\vec{A},\,\underline{A}), Dirac's notation for a vector uses vertical bars and angle brackets: |A\rangle. When this notation is used, these vectors are called "kets", and |A\rangle is read as "ket-A".[4] This applies to all vectors, the resultant vector and the basis alike. The previous vectors are now written
 |A \rangle = A_x|e_x \rangle + A_y|e_y \rangle + A_z|e_z \rangle {\doteq \!\,}
\begin{pmatrix} A_x \\ A_y \\ A_z \end{pmatrix},
or in a more easily generalized notation,
 |A \rangle = A_1|e_1 \rangle + A_2|e_2 \rangle + A_3|e_3 \rangle {\doteq \!\,}
\begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix},
The last one may be written in short as
|A \rangle = A_1|1 \rangle + A_2|2 \rangle + A_3|3 \rangle ~.
Note how any symbols, letters, numbers, or even words—whatever serves as a convenient label—can be used as the label inside a ket. In other words, the symbol "|A\rangle" has a specific and universal mathematical meaning, while just the "A" by itself does not. Nevertheless, for convenience, there is usually some logical scheme behind the labels inside kets, such as the common practice of labeling energy eigenkets in quantum mechanics through a listing of their quantum numbers. Further note that a ket and its representation by a coordinate vector are not the same mathematical object: a ket does not require specification of a basis, whereas the coordinate vector needs a basis in order to be well defined (the same holds for an operator and its representation by a matrix).[5] In this context, it is best to use a symbol different from the equal sign, for example the symbol ≐, read as "is represented by".
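To make the distinction between a ket and its coordinate column concrete, here is a minimal numerical sketch in Python with NumPy (the particular basis, rotation angle, and names such as A_coords are invented for illustration): the same ket gets different coordinate columns in different bases, related by a unitary matrix.

```python
import numpy as np

# Coordinates (A1, A2, A3) of a ket |A> in a chosen orthonormal basis {|e1>,|e2>,|e3>}.
A_coords = np.array([1.0 + 2.0j, 0.5j, 3.0])

# A second orthonormal basis {|f1>,|f2>,|f3>}; column j of U holds <e_i|f_j>.
theta = np.pi / 6
U = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

# The ket itself is unchanged; its coordinates transform with U^dagger.
A_coords_f = U.conj().T @ A_coords
print(A_coords_f)
```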

Inner products and bras

An inner product is a generalization of the dot product. The inner product of two vectors is a scalar. Bra–ket notation uses a specific notation for inner products:
 \langle A | B \rangle = \text{the inner product of ket } | A \rangle \text{ with ket } | B \rangle
For example, in three-dimensional complex Euclidean space,
\langle A | B \rangle \doteq \!\, A_x^*B_x + A_y^*B_y + A_z^*B_z
where A_i^* denotes the complex conjugate of Ai. A special case is the inner product of a vector with itself, which is the square of its norm (magnitude):
\langle A | A \rangle \doteq \!\, |A_x|^2 + |A_y|^2 + |A_z|^2
Bra–ket notation splits this inner product (also called a "bracket") into two pieces, the "bra" and the "ket":
 \langle A | B \rangle = \left( \, \langle A | \, \right) \,\, \left( \, | B \rangle \, \right)
where \langle A| is called a bra, read as "bra-A", and |B\rangle is a ket as above.

The purpose of "splitting" the inner product into a bra and a ket is that both the bra \langle A| and the ket |B\rangle are meaningful on their own, and can be used in other contexts besides within an inner product. There are two main ways to think about the meanings of separate bras and kets:

Bras and kets as row and column vectors

For a finite-dimensional vector space, using a fixed orthonormal basis, the inner product can be written as a matrix multiplication of a row vector with a column vector:
 \langle A | B \rangle \doteq \!\, A_1^* B_1 + A_2^* B_2 + \cdots + A_N^* B_N {=}
\begin{pmatrix} A_1^* & A_2^* & \cdots & A_N^* \end{pmatrix}
\begin{pmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{pmatrix}
Based on this, the bras and kets can be defined as:
 \langle A | {\doteq \!\,} \begin{pmatrix} A_1^* & A_2^* & \cdots & A_N^* \end{pmatrix}
 | B \rangle {\doteq \!\,} \begin{pmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{pmatrix}
and then it is understood that a bra next to a ket implies matrix multiplication.

The conjugate transpose (also called Hermitian conjugate) of a bra is the corresponding ket and vice versa:
\langle A |^\dagger = |A \rangle, \quad |A \rangle^\dagger = \langle A |
because if one starts with the bra
\begin{pmatrix} A_1^* & A_2^* & \cdots & A_N^* \end{pmatrix},
then performs a complex conjugation, and then a matrix transpose, one ends up with the ket
\begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_N \end{pmatrix}
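As a hedged illustration of the row-vector/column-vector picture (Python with NumPy; the particular vectors are made up), the bra is the complex conjugate transpose of the ket, and the bracket is an ordinary matrix product:

```python
import numpy as np

A = np.array([1.0 + 1.0j, 2.0, 3.0j])   # column vector representing |A>
B = np.array([0.5, 1.0j, -1.0])         # column vector representing |B>

bra_A = A.conj()                        # <A| : conjugated (row) version of |A>
inner = bra_A @ B                       # <A|B> = A1* B1 + A2* B2 + A3* B3
print(inner)
print(np.vdot(A, B))                    # np.vdot conjugates its first argument: same value
```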

Bras as linear operators on kets

A more abstract definition, which is equivalent but more easily generalized to infinite-dimensional spaces, is to say that bras are linear functionals on kets, i.e. operators that input a ket and output a complex number. The bra operators are defined to be consistent with the inner product.
In mathematics terminology, the vector space of bras is the dual space to the vector space of kets, and corresponding bras and kets are related by the Riesz representation theorem.

Non-normalizable states and non-Hilbert spaces

Bra–ket notation can be used even if the vector space is not a Hilbert space.

In quantum mechanics, it is common practice to write down kets which have infinite norm, i.e. non-normalizable wavefunctions. Examples include states whose wavefunctions are Dirac delta functions or infinite plane waves. These do not, technically, belong to the Hilbert space itself. However, the definition of "Hilbert space" can be broadened to accommodate these states (see the Gelfand–Naimark–Segal construction or rigged Hilbert spaces). Bra–ket notation continues to work in an analogous way in this broader context.

For a rigorous treatment of the Dirac inner product of non-normalizable states, see the definition given by D. Carfì.[6][7] For a rigorous definition of basis with a continuous set of indices and consequently for a rigorous definition of position and momentum basis, see.[8] For a rigorous statement of the expansion of an S-diagonalizable operator, or observable, in its eigenbasis or in another basis, see.[9]

Banach spaces are a different generalization of Hilbert spaces. In a Banach space B, the vectors may be notated by kets and the continuous linear functionals by bras. Over any vector space without topology, we may also notate the vectors by kets and the linear functionals by bras. In these more general contexts, the bracket does not have the meaning of an inner product, because the Riesz representation theorem does not apply.

Usage in quantum mechanics

The mathematical structure of quantum mechanics is based in large part on linear algebra:
  • Wave functions and other quantum states can be represented as vectors in a complex Hilbert space. (The exact structure of this Hilbert space depends on the situation.) In bra–ket notation, for example, an electron might be in the "state" |ψ⟩. (Technically, the quantum states are rays of vectors in the Hilbert space, as c|ψ⟩ corresponds to the same state for any nonzero complex number c.)
  • Quantum superpositions can be described as vector sums of the constituent states. For example, an electron in the state |1⟩ + i|2⟩ is in a quantum superposition of the states |1⟩ and |2⟩.
  • Measurements are associated with linear operators (called observables) on the Hilbert space of quantum states.
  • Dynamics are also described by linear operators on the Hilbert space. For example, in the Schrödinger picture, there is a linear time evolution operator U with the property that if an electron is in state |ψ⟩ right now, then in one second it will be in the state U|ψ⟩, the same U for every possible |ψ⟩.
  • Wave function normalization is scaling a wave function so that its norm is 1.
Since virtually every calculation in quantum mechanics involves vectors and linear operators, it can involve, and often does involve, bra-ket notation. A few examples follow:

Spinless position–space wave function

Discrete components Ak of a complex vector |A⟩ = ∑k Ak|ek⟩, which belongs to a countably infinite-dimensional Hilbert space; there are countably infinitely many k values and basis vectors |ek⟩.
Continuous components ψ(x) of a complex vector |ψ⟩ = ∫ dx ψ(x)|x⟩, which belongs to an uncountably infinite-dimensional Hilbert space; there are infinitely many x values and basis vectors |x⟩.

Figure caption: components of complex vectors plotted against index number; discrete k and continuous x. Two particular components out of infinitely many are highlighted.

The Hilbert space of a spin-0 point particle is spanned by a "position basis" { |r⟩ }, where the label r extends over the set of all points in position space. Since there are uncountably infinitely many vector components in the basis, this is an uncountably infinite-dimensional Hilbert space. The dimensions of the Hilbert space (usually infinite) and position space (usually 1, 2 or 3) are not to be conflated.

Starting from any ket |Ψ⟩ in this Hilbert space, we can define a complex scalar function of r, known as a wavefunction:
\Psi(\mathbf{r}) \ \stackrel{\text{def}}{=}\ \langle \mathbf{r}|\Psi\rangle .
On the left side, Ψ(r) is a function mapping any point in space to a complex number; on the right side, |Ψ⟩ = ∫ d³r Ψ(r)|r⟩ is a ket.

It is then customary to define linear operators acting on wavefunctions in terms of linear operators acting on kets, by
A \Psi(\mathbf{r}) \ \stackrel{\text{def}}{=}\ \langle \mathbf{r}|A|\Psi\rangle .
For instance, the momentum operator p has the following form,
\mathbf{p} \Psi(\mathbf{r}) \ \stackrel{\text{def}}{=}\ \langle \mathbf{r} |\mathbf{p}|\Psi\rangle = - i \hbar \nabla \Psi(\mathbf{r}) .
One occasionally encounters a sloppy expression like
\nabla |\Psi\rangle ,
though this is something of a (common) abuse of notation. The differential operator must be understood to be an abstract operator, acting on kets, that has the effect of differentiating wavefunctions once the expression is projected onto the position basis,
\nabla \langle\mathbf{r}|\Psi\rangle ,
even though, in the momentum basis, this operator amounts to a mere multiplication operator (by ip/ħ).

Overlap of states

In quantum mechanics the expression ⟨φ|ψ⟩ is typically interpreted as the probability amplitude for the state ψ to collapse into the state φ; the probability itself is the squared modulus |⟨φ|ψ⟩|². Mathematically, this means the coefficient for the projection of ψ onto φ. It is also described as the projection of state ψ onto state φ.

Changing basis for a spin-1/2 particle

A stationary spin-½ particle has a two-dimensional Hilbert space. One orthonormal basis is:
|\uparrow_z \rangle, \; |\downarrow_z \rangle
where |\uparrow_z \rangle is the state with a definite value of the spin operator Sz equal to +1/2 and |\downarrow_z \rangle is the state with a definite value of the spin operator Sz equal to −1/2.

Since these are a basis, any quantum state of the particle can be expressed as a linear combination (i.e., quantum superposition) of these two states:
|\psi \rangle = a_{\psi} |\uparrow_z \rangle + b_{\psi} |\downarrow_z \rangle
where aψ, bψ are complex numbers.

A different basis for the same Hilbert space is:
|\uparrow_x \rangle, \; |\downarrow_x \rangle
defined in terms of Sx rather than Sz.

Again, any state of the particle can be expressed as a linear combination of these two:
|\psi \rangle = c_{\psi} |\uparrow_x \rangle + d_{\psi} |\downarrow_x \rangle
In vector form, you might write
|\psi\rangle {\doteq \!\,} \begin{pmatrix} a_\psi \\ b_\psi \end{pmatrix}, \;\; \text{OR} \;\; |\psi\rangle {\doteq \!\,} \begin{pmatrix} c_\psi \\ d_\psi \end{pmatrix}
depending on which basis you are using. In other words, the "coordinates" of a vector depend on the basis used.

There is a mathematical relationship between aψ, bψ, cψ, dψ; see change of basis.
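A hedged numerical sketch of this change of basis (Python with NumPy; the state (aψ, bψ) is invented): using the standard relations |↑x⟩ = (|↑z⟩ + |↓z⟩)/√2 and |↓x⟩ = (|↑z⟩ − |↓z⟩)/√2, the new coordinates (cψ, dψ) are the overlaps of |ψ⟩ with the new basis kets.

```python
import numpy as np

ab = np.array([0.6, 0.8j])            # (a_psi, b_psi) in the S_z eigenbasis
ab = ab / np.linalg.norm(ab)          # normalize |psi>

up_x   = np.array([1.0,  1.0]) / np.sqrt(2)   # |up_x> written in z-coordinates
down_x = np.array([1.0, -1.0]) / np.sqrt(2)   # |down_x> written in z-coordinates

c = np.vdot(up_x, ab)                 # c_psi = <up_x|psi>
d = np.vdot(down_x, ab)               # d_psi = <down_x|psi>
print(c, d)
print(abs(c)**2 + abs(d)**2)          # still 1: the change of basis is unitary
```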

Misleading uses

There are a few conventions and abuses of notation that are generally accepted by the physics community, but which might confuse the non-initiated.

It is common among physicists to use the same symbol for labels and constants in the same equation. The claim is that this makes it easier to see that the constant is related to the labeled object, and that the different roles of each occurrence remove any ambiguity, so no further differentiation is required. For example, α̂|α⟩ = α|α⟩, where the symbol α is used simultaneously as the name of the operator α̂, its eigenvector |α⟩ and the associated eigenvalue α.

Something similar occurs in component notation of vectors. While Ψ (uppercase) is traditionally associated with wavefunctions, ψ (lowercase) may be used to denote a label, a wave function or complex constant in the same context, usually differentiated only by a subscript.

The main abuse is including operations inside the vector labels. This is usually done as a quick notation for scaled vectors: e.g., if the vector |α⟩ is scaled by 1/2, it might be denoted |α/2⟩, which makes no sense, since α is a label, not a function or a number, so no operation can be performed on it.

This is especially common when denoting vectors as tensor products, where part of the labels are moved outside the designated slot, e.g. |α⟩ = |α/√2⟩₁|α/√2⟩₂. Here, part of the labeling that should state that all three vectors are different was moved outside the kets, as the subscripts 1 and 2. And a further abuse occurs, since α here is meant to refer to the norm of the first vector, so a label is being used to denote a value.

Linear operators

Linear operators acting on kets

A linear operator is a map that inputs a ket and outputs a ket. (In order to be called "linear", it is required to have certain properties.) In other words, if A is a linear operator and |ψ⟩ is a ket, then A|ψ⟩ is another ket.

In an N-dimensional Hilbert space, |ψ⟩ can be written as an N×1 column vector, and then A is an N×N matrix with complex entries. The ket A|ψ⟩ can be computed by normal matrix multiplication.

Linear operators are ubiquitous in the theory of quantum mechanics. For example, observable physical quantities are represented by self-adjoint operators, such as energy or momentum, whereas transformative processes are represented by unitary linear operators such as rotation or the progression of time.

Linear operators acting on bras

Operators can also be viewed as acting on bras from the right hand side. Specifically, if A is a linear operator and ⟨φ| is a bra, then ⟨φ|A is another bra defined by the rule
\bigg(\langle\phi|A\bigg) \; |\psi\rangle = \langle\phi| \; \bigg(A|\psi\rangle\bigg) ,
(in other words, a function composition). This expression is commonly written as (cf. energy inner product)
\langle\phi|A|\psi\rangle .
In an N-dimensional Hilbert space, ⟨φ| can be written as a 1×N row vector, and A (as in the previous section) is an N×N matrix. Then the bra ⟨φ|A can be computed by normal matrix multiplication.

If the same state vector appears on both bra and ket side,
\langle\psi|A|\psi\rangle ,
then this expression gives the expectation value, or mean or average value, of the observable represented by the operator A for the physical system in the state |ψ⟩.

Outer products

A convenient way to define linear operators on H is given by the outer product: if |φ⟩ is a ket and ⟨ψ| is a bra, the outer product
 |\phi\rangle \langle \psi|
denotes the rank-one operator with the rule
 (|\phi\rangle \langle \psi|)(x) = \langle \psi, x \rangle\, \phi .
For a finite-dimensional vector space, the outer product can be understood as simple matrix multiplication:
 |\phi \rangle \, \langle \psi | {\doteq \!\,}
\begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_N \end{pmatrix}
\begin{pmatrix} \psi_1^* & \psi_2^* & \cdots & \psi_N^* \end{pmatrix}
= \begin{pmatrix}
\phi_1 \psi_1^* & \phi_1 \psi_2^* & \cdots & \phi_1 \psi_N^* \\
\phi_2 \psi_1^* & \phi_2 \psi_2^* & \cdots & \phi_2 \psi_N^* \\
\vdots & \vdots & \ddots & \vdots \\
\phi_N \psi_1^* & \phi_N \psi_2^* & \cdots & \phi_N \psi_N^* \end{pmatrix}
The outer product is an N×N matrix, as expected for a linear operator.

One of the uses of the outer product is to construct projection operators. Given a ket |ψ⟩ of norm 1, the orthogonal projection onto the subspace spanned by |ψ⟩ is
|\psi\rangle\langle\psi|.
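A short sketch of this construction (Python with NumPy; the vector is arbitrary): the outer product |ψ⟩⟨ψ| of a norm-1 ket is idempotent and self-adjoint, the defining properties of an orthogonal projection.

```python
import numpy as np

psi = np.array([1.0, 1.0j, 0.0])
psi = psi / np.linalg.norm(psi)     # a norm-1 ket |psi>

P = np.outer(psi, psi.conj())       # |psi><psi| as an N x N matrix

print(np.allclose(P @ P, P))        # P^2 = P  (idempotent)
print(np.allclose(P, P.conj().T))   # P = P^dagger  (self-adjoint)
print(np.allclose(P @ psi, psi))    # projecting |psi> returns |psi>
```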

Hermitian conjugate operator

Just as kets and bras can be transformed into each other (making |ψ⟩ into ⟨ψ|), the element from the dual space corresponding to A|ψ⟩ is ⟨ψ|A†, where A† denotes the Hermitian conjugate (or adjoint) of the operator A. In other words,
 |\phi\rangle = A |\psi\rangle   if and only if   \qquad \langle\phi| = \langle \psi | A^\dagger .
If A is expressed as an N×N matrix, then A† is its conjugate transpose.

Self-adjoint operators, where A† = A, play an important role in quantum mechanics; for example, an observable is always described by a self-adjoint operator. If A is a self-adjoint operator, then ⟨ψ|A|ψ⟩ is always a real number (not complex). This implies that expectation values of observables are real.
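The reality of expectation values is easy to check numerically. A hedged sketch (Python with NumPy; the random matrix is only a stand-in for an observable): any matrix plus its conjugate transpose is self-adjoint, and ⟨ψ|A|ψ⟩ then has vanishing imaginary part up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = M + M.conj().T                     # a self-adjoint operator: A = A^dagger

psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)

expectation = np.vdot(psi, A @ psi)    # <psi|A|psi>
print(expectation.imag)                # ~0: expectation values of observables are real
```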

Properties

Bra–ket notation was designed to facilitate the formal manipulation of linear-algebraic expressions. Some of the properties that allow this manipulation are listed herein. In what follows, c1 and c2 denote arbitrary complex numbers, c* denotes the complex conjugate of c, A and B denote arbitrary linear operators, and these properties are to hold for any choice of bras and kets.

Linearity

  • Since bras are linear functionals,
\langle\phi| \; \bigg( c_1|\psi_1\rangle + c_2|\psi_2\rangle \bigg) = c_1\langle\phi|\psi_1\rangle + c_2\langle\phi|\psi_2\rangle.
  • By the definition of addition and scalar multiplication of linear functionals in the dual space,[10]
\bigg(c_1 \langle\phi_1| + c_2 \langle\phi_2|\bigg) \; |\psi\rangle = c_1 \langle\phi_1|\psi\rangle + c_2 \langle\phi_2|\psi\rangle.

Associativity

Given any expression involving complex numbers, bras, kets, inner products, outer products, and/or linear operators (but not addition), written in bra-ket notation, the parenthetical groupings do not matter (i.e., the associative property holds). For example:
 \langle \psi| (A |\phi\rangle) = (\langle \psi|A)|\phi\rangle \, \stackrel{\text{def}}{=} \, \langle \psi | A | \phi \rangle
 (A|\psi\rangle)\langle \phi| = A(|\psi\rangle \langle \phi|) \, \stackrel{\text{def}}{=} \, A | \psi \rangle \langle \phi |
and so forth. The expressions on the right (with no parentheses whatsoever) are allowed to be written unambiguously because of the equalities on the left. Note that the associative property does not hold for expressions that include non-linear operators, such as the antilinear time reversal operator in physics.

Hermitian conjugation

Bra–ket notation makes it particularly easy to compute the Hermitian conjugate (also called dagger, and denoted †) of expressions. The formal rules are:
  • The Hermitian conjugate of a bra is the corresponding ket, and vice versa.
  • The Hermitian conjugate of a complex number is its complex conjugate.
  • The Hermitian conjugate of the Hermitian conjugate of anything (linear operators, bras, kets, numbers) is itself—i.e.,
(x†)† = x.
  • Given any combination of complex numbers, bras, kets, inner products, outer products, and/or linear operators, written in bra-ket notation, its Hermitian conjugate can be computed by reversing the order of the components, and taking the Hermitian conjugate of each.
These rules are sufficient to formally write the Hermitian conjugate of any such expression; some examples are as follows:
  • Kets:

\left(c_1|\psi_1\rangle + c_2|\psi_2\rangle\right)^\dagger = c_1^* \langle\psi_1| + c_2^* \langle\psi_2| ~.
  • Inner products:
\langle \phi | \psi \rangle^* = \langle \psi|\phi\rangle ~.
  • Matrix elements:
\langle \phi| A | \psi \rangle^* = \langle \psi | A^\dagger |\phi \rangle
\langle \phi| A^\dagger B^\dagger | \psi \rangle^* = \langle \psi | BA |\phi \rangle ~.
  • Outer products:
\left((c_1|\phi_1\rangle\langle \psi_1|) + (c_2|\phi_2\rangle\langle\psi_2|)\right)^\dagger = (c_1^* |\psi_1\rangle\langle \phi_1|) + (c_2^*|\psi_2\rangle\langle\phi_2|)~.

Composite bras and kets

Two Hilbert spaces V and W may form a third space V⊗W by a tensor product. In quantum mechanics, this is used for describing composite systems. If a system is composed of two subsystems described in V and W respectively, then the Hilbert space of the entire system is the tensor product of the two spaces. (The exception to this is if the subsystems are actually identical particles. In that case, the situation is a little more complicated.)

If |ψ⟩ is a ket in V and |φ⟩ is a ket in W, the direct product of the two kets is a ket in V⊗W. This is written in various notations:
|\psi\rangle|\phi\rangle \,,\quad |\psi\rangle \otimes |\phi\rangle\,,\quad|\psi \phi\rangle\,,\quad|\psi ,\phi\rangle\,.
See quantum entanglement and the EPR paradox for applications of this product.
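In coordinates, the direct product is the Kronecker product of the two columns. A minimal sketch (Python with NumPy; the states are invented):

```python
import numpy as np

psi = np.array([1.0, 1.0j]) / np.sqrt(2)   # ket in V (dimension 2)
phi = np.array([0.0, 1.0, 0.0])            # ket in W (dimension 3)

product = np.kron(psi, phi)                # |psi>|phi>, a ket in V (x) W (dimension 6)
print(product.shape)                       # (6,)

# Not every ket in V (x) W factorizes this way; e.g. the entangled state
# (|0>|0> + |1>|1>)/sqrt(2) in a 2 (x) 2 space is not a single Kronecker product.
bell = (np.kron([1.0, 0.0], [1.0, 0.0]) + np.kron([0.0, 1.0], [0.0, 1.0])) / np.sqrt(2)
print(bell)
```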

The unit operator

Consider a complete orthonormal system (basis), \{ e_i \ | \ i \in \mathbb{N} \}, for a Hilbert space H, with respect to the norm from an inner product \langle\cdot,\cdot\rangle. From basic functional analysis we know that any ket |ψ can also be written as
|\psi\rangle = \sum_{i \in \mathbb{N}} \langle e_i | \psi \rangle | e_i \rangle,
with \langle\cdot|\cdot\rangle the inner product on the Hilbert space.

From the commutativity of kets with (complex) scalars, it now follows that
\sum_{i \in \mathbb{N}} | e_i \rangle \langle e_i | = \hat{1}
must be the identity operator, which sends each vector to itself. This can be inserted in any expression without affecting its value, for example
 \langle v | w \rangle = \langle v | \sum_{i \in \mathbb{N}} | e_i \rangle \langle e_i | w \rangle = \langle v | \sum_{i \in \mathbb{N}} | e_i \rangle \langle e_i | \sum_{j \in \mathbb{N}} | e_j \rangle \langle e_j | w \rangle = \langle v | e_i \rangle \langle e_i | e_j \rangle \langle e_j | w \rangle ,
where, in the last identity, the Einstein summation convention has been used.

In quantum mechanics, it often occurs that little or no information about the inner product \langle\psi|\phi\rangle of two arbitrary (state) kets is present, while it is still possible to say something about the expansion coefficients \langle\psi|e_i\rangle = \langle e_i|\psi\rangle^* and \langle e_i|\phi\rangle of those vectors with respect to a specific (orthonormalized) basis. In this case, it is particularly useful to insert the unit operator into the bracket one time or more.

For more information, see Resolution of the identity, 1 = \int dx\, |x\rangle\langle x| = \int dp\, |p\rangle\langle p|, where |p\rangle = \int dx\, e^{ixp/\hbar}|x\rangle/\sqrt{2\pi\hbar}; since \langle x'|x\rangle = \delta(x-x'), plane waves follow: \langle x|p\rangle = e^{ixp/\hbar}/\sqrt{2\pi\hbar}.
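For a finite-dimensional space the resolution of the identity can be verified directly. A hedged sketch (Python with NumPy; the dimension and the vectors are arbitrary): summing the outer products |e_i⟩⟨e_i| over an orthonormal basis gives the identity matrix, so inserting it inside a bracket leaves the value unchanged.

```python
import numpy as np

N = 4
basis = [np.eye(N)[:, i] for i in range(N)]      # orthonormal kets |e_i>

resolution = sum(np.outer(e, e.conj()) for e in basis)
print(np.allclose(resolution, np.eye(N)))        # sum_i |e_i><e_i| = identity

v = np.array([1.0, 2.0j, 0.0, 1.0])
w = np.array([0.5, 1.0, 1.0j, -1.0])
direct   = np.vdot(v, w)                                      # <v|w>
inserted = sum(np.vdot(v, e) * np.vdot(e, w) for e in basis)  # <v|e_i><e_i|w>
print(np.allclose(direct, inserted))
```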

Notation used by mathematicians

The object physicists are considering when using the "bra-ket" notation is a Hilbert space (a complete inner product space).

Let \mathcal{H} be a Hilbert space and let h\in\mathcal{H} be a vector in \mathcal{H}. What physicists would denote by |h\rangle is the vector itself. That is,
 |h\rangle\in \mathcal{H} .
Let  \mathcal{H}^* be the dual space of  \mathcal{H} . This is the space of linear functionals on \mathcal{H}. The isomorphism  \Phi:\mathcal{H}\to\mathcal{H}^* is defined by  \Phi(h) = \phi_h where for all  g\in\mathcal{H} we have
 \phi_h(g) = \mbox{IP}(h,g) = (h,g) = \langle h,g \rangle = \langle h|g \rangle ,
where  \mbox{IP}(\cdot,\cdot), (\cdot,\cdot),\langle \cdot,\cdot \rangle and \langle \cdot | \cdot \rangle are just different notations for expressing an inner product between two elements in a Hilbert space (or for the first three, in any inner product space). Notational confusion arises when identifying  \phi_h and  g with  \langle h | and |g \rangle respectively. This is because of literal symbolic substitutions. Let  \phi_h = H = \langle h| and let  g=G=|g\rangle . This gives
 \phi_h(g) = H(g) = H(G) = \langle h|(G) = \langle h|\left(|g\rangle\right).
One ignores the parentheses and removes the double bars. Some properties of this notation are convenient since we are dealing with linear operators and composition acts like a ring multiplication.

Moreover, mathematicians usually write the dual entity not in the first place, as physicists do, but in the second one, and they don't use the *-symbol but an overline (which physicists reserve for averages and the Dirac conjugate) to denote complex-conjugate numbers; i.e., for scalar products mathematicians usually write
(\phi ,\psi )=\int \phi (x)\cdot \overline{\psi(x)}\, {\rm d}x \,,
whereas physicists would write for the same quantity
 \langle\psi |\phi \rangle=\int {\rm d}x\,\psi^*(x)\cdot\phi(x)\,.

Sunday, July 5, 2015

Statistical mechanics



From Wikipedia, the free encyclopedia

Statistical mechanics is a branch of theoretical physics and chemistry (and mathematical physics) that studies, using probability theory, the average behaviour of a mechanical system where the state of the system is uncertain.[1][2][3][note 1]

The classical view of the universe was that its fundamental laws are mechanical in nature, and that all physical systems are therefore governed by mechanical laws at a microscopic level. These laws are precise equations of motion that map any given initial state to a corresponding future state at a later time. There is however a disconnection between these laws and everyday life experiences, as we do not find it necessary (nor even theoretically possible) to know exactly at a microscopic level the simultaneous positions and velocities of each molecule while carrying out processes at the human scale (for example, when performing a chemical reaction).
Statistical mechanics is a collection of mathematical tools that are used to fill this disconnection between the laws of mechanics and the practical experience of incomplete knowledge.

A common use of statistical mechanics is in explaining the thermodynamic behaviour of large systems. Microscopic mechanical laws do not contain concepts such as temperature, heat, or entropy; however, statistical mechanics shows how these concepts arise from the natural uncertainty about the state of a system when that system is prepared in practice. The benefit of using statistical mechanics is that it provides exact methods to connect thermodynamic quantities (such as heat capacity) to microscopic behaviour, whereas in classical thermodynamics the only available option would be to measure and tabulate such quantities for various materials. Statistical mechanics also makes it possible to extend the laws of thermodynamics to cases which are not considered in classical thermodynamics, for example microscopic systems and other mechanical systems with few degrees of freedom.[1] The branch of statistical mechanics which treats and extends classical thermodynamics is known as statistical thermodynamics or equilibrium statistical mechanics.

Statistical mechanics also finds use outside equilibrium. An important subbranch known as non-equilibrium statistical mechanics deals with the issue of microscopically modelling the speed of irreversible processes that are driven by imbalances. Examples of such processes include chemical reactions, or flows of particles and heat. Unlike with equilibrium, there is no exact formalism that applies to non-equilibrium statistical mechanics in general and so this branch of statistical mechanics remains an active area of theoretical research.

Principles: mechanics and ensembles

In physics there are two types of mechanics usually examined: classical mechanics and quantum mechanics. For both types of mechanics, the standard mathematical approach is to consider two ingredients:
  1. The complete state of the mechanical system at a given time, mathematically encoded as a phase point (classical mechanics) or a pure quantum state vector (quantum mechanics).
  2. An equation of motion which carries the state forward in time: Hamilton's equations (classical mechanics) or the time-dependent Schrödinger equation (quantum mechanics)
Using these two ingredients, the state at any other time, past or future, can in principle be calculated.

Whereas ordinary mechanics only considers the behaviour of a single state, statistical mechanics introduces the statistical ensemble, which is a large collection of virtual, independent copies of the system in various states. The statistical ensemble is a probability distribution over all possible states of the system. In classical statistical mechanics, the ensemble is a probability distribution over phase points (as opposed to a single phase point in ordinary mechanics), usually represented as a distribution in a phase space with canonical coordinates. In quantum statistical mechanics, the ensemble is a probability distribution over pure states,[note 2] and can be compactly summarized as a density matrix.

As is usual for probabilities, the ensemble can be interpreted in different ways:[1]
  • an ensemble can be taken to represent the various possible states that a single system could be in (epistemic probability, a form of knowledge), or
  • the members of the ensemble can be understood as the states of the systems in experiments repeated on independent systems which have been prepared in a similar but imperfectly controlled manner (empirical probability), in the limit of an infinite number of trials.
These two meanings are equivalent for many purposes, and will be used interchangeably in this article.

However the probability is interpreted, each state in the ensemble evolves over time according to the equation of motion. Thus, the ensemble itself (the probability distribution over states) also evolves, as the virtual systems in the ensemble continually leave one state and enter another. The ensemble evolution is given by the Liouville equation (classical mechanics) or the von Neumann equation (quantum mechanics). These equations are simply derived by the application of the mechanical equation of motion separately to each virtual system contained in the ensemble, with the probability of the virtual system being conserved over time as it evolves from state to state.

One special class of ensembles consists of those ensembles that do not evolve over time. These ensembles are known as equilibrium ensembles and their condition is known as statistical equilibrium. Statistical equilibrium occurs if, for each state in the ensemble, the ensemble also contains all of its future and past states with probabilities equal to the probability of being in that state.[note 3] The study of equilibrium ensembles of isolated systems is the focus of statistical thermodynamics. Non-equilibrium statistical mechanics addresses the more general case of ensembles that change over time, and/or ensembles of non-isolated systems.

Statistical thermodynamics

The primary goal of statistical thermodynamics (also known as equilibrium statistical mechanics) is to explain the classical thermodynamics of materials in terms of the properties of their constituent particles and the interactions between them. In other words, statistical thermodynamics provides a connection between the macroscopic properties of materials in thermodynamic equilibrium, and the microscopic behaviours and motions occurring inside the material.

As an example, one might ask what it is about a thermodynamic system of NH3 molecules that determines the free energy characteristic of that compound. Classical thermodynamics does not provide the answer. If, for example, we were given spectroscopic data of this body of gas molecules, such as bond length, bond angle, bond rotation, and flexibility of the bonds in NH3, we should see that the free energy could not be other than it is. To prove this true, we need to bridge the gap between the microscopic realm of atoms and molecules and the macroscopic realm of classical thermodynamics. Statistical mechanics demonstrates how the thermodynamic parameters of a system, such as temperature and pressure, are related to microscopic behaviours of such constituent atoms and molecules.[4]

Although we may understand a system generically, in general we lack information about the state of a specific instance of that system. For this reason the notion of statistical ensemble (a probability distribution over possible states) is necessary. Furthermore, in order to reflect that the material is in a thermodynamic equilibrium, it is necessary to introduce a corresponding statistical mechanical definition of equilibrium. The analogue of thermodynamic equilibrium in statistical thermodynamics is the ensemble property of statistical equilibrium, described in the previous section. An additional assumption in statistical thermodynamics is that the system is isolated (no varying external forces are acting on the system), so that its total energy does not vary over time. A sufficient (but not necessary) condition for statistical equilibrium with an isolated system is that the probability distribution is a function only of conserved properties (total energy, total particle numbers, etc.).[1]

Fundamental postulate

There are many different equilibrium ensembles that can be considered, and only some of them correspond to thermodynamics.[1] An additional postulate is necessary to motivate why the ensemble for a given system should have one form or another.

A common approach found in many textbooks is to take the equal a priori probability postulate.[2] This postulate states that
For an isolated system with an exactly known energy and exactly known composition, the system can be found with equal probability in any microstate consistent with that knowledge.
The equal a priori probability postulate therefore provides a motivation for the microcanonical ensemble described below. There are various arguments in favour of the equal a priori probability postulate:
  • Ergodic hypothesis: An ergodic state is one that evolves over time to explore "all accessible" states: all those with the same energy and composition. In an ergodic system, the microcanonical ensemble is the only possible equilibrium ensemble with fixed energy. This approach has limited applicability, since most systems are not ergodic.
  • Principle of indifference: In the absence of any further information, we can only assign equal probabilities to each compatible situation.
  • Maximum information entropy: A more elaborate version of the principle of indifference states that the correct ensemble is the ensemble that is compatible with the known information and that has the largest Gibbs entropy (information entropy).[5]
Other fundamental postulates for statistical mechanics have also been proposed.[6]

In any case, the reason for establishing the microcanonical ensemble is mainly axiomatic.[6] The microcanonical ensemble itself is mathematically awkward to use for real calculations, and even very simple finite systems can only be solved approximately. However, it is possible to use the microcanonical ensemble to construct a hypothetical infinite thermodynamic reservoir that has an exactly defined notion of temperature and chemical potential. Once this reservoir has been established, it can be used to justify exactly the canonical ensemble or grand canonical ensemble (see below) for any other system by considering the contact of this system with the reservoir.[1] These other ensembles are those actually used in practical statistical mechanics calculations as they are mathematically simpler and also correspond to a much more realistic situation (energy not known exactly).[2]

Three thermodynamic ensembles

There are three equilibrium ensembles with a simple form that can be defined for any isolated system bounded inside a finite volume.[1] These are the most often discussed ensembles in statistical thermodynamics. In the macroscopic limit (defined below) they all correspond to classical thermodynamics.
  • The microcanonical ensemble describes a system with a precisely given energy and fixed composition (precise number of particles). The microcanonical ensemble contains with equal probability each possible state that is consistent with that energy and composition.
  • The canonical ensemble describes a system of fixed composition that is in thermal equilibrium[note 4] with a heat bath of a precise temperature. The canonical ensemble contains states of varying energy but identical composition; the different states in the ensemble are accorded different probabilities depending on their total energy.
  • The grand canonical ensemble describes a system with non-fixed composition (uncertain particle numbers) that is in thermal and chemical equilibrium with a thermodynamic reservoir. The reservoir has a precise temperature, and precise chemical potentials for various types of particle. The grand canonical ensemble contains states of varying energy and varying numbers of particles; the different states in the ensemble are accorded different probabilities depending on their total energy and total particle numbers.
Thermodynamic ensembles[1]
  • Microcanonical — fixed variables: N, E, V; microscopic feature: number of microstates W; macroscopic function: the Boltzmann entropy S = k_B log W.
  • Canonical — fixed variables: N, T, V; microscopic feature: canonical partition function Z; macroscopic function: the Helmholtz free energy F = −k_B T log Z.
  • Grand canonical — fixed variables: μ, T, V; microscopic feature: grand partition function 𝒵; macroscopic function: the grand potential Ω = −k_B T log 𝒵.
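As a hedged numerical sketch of how the canonical ensemble weights states by total energy (Python; the two-level system and its energies are invented for illustration), the partition function Z normalizes the Boltzmann factors and yields the Helmholtz free energy:

```python
import numpy as np

k_B = 1.380649e-23                     # Boltzmann constant, J/K
T = 300.0                              # bath temperature, K
energies = np.array([0.0, 4.0e-21])    # two energy levels, J (made-up values)

weights = np.exp(-energies / (k_B * T))   # Boltzmann factors exp(-E_i / k_B T)
Z = weights.sum()                         # canonical partition function
p = weights / Z                           # probability of each state in the ensemble
mean_E = (p * energies).sum()             # ensemble-average energy
F = -k_B * T * np.log(Z)                  # Helmholtz free energy, F = -k_B T log Z
print(p, mean_E, F)
```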

Statistical fluctuations and the macroscopic limit

The thermodynamic ensembles' most significant difference is that they either admit uncertainty in the variables of energy or particle number, or that those variables are fixed to particular values. While this difference can be observed in some cases, for macroscopic systems the thermodynamic ensembles are usually observationally equivalent.
The limit of large systems in statistical mechanics is known as the thermodynamic limit. In the thermodynamic limit the microcanonical, canonical, and grand canonical ensembles tend to give identical predictions about thermodynamic characteristics. This means that one can specify either total energy or temperature and arrive at the same result; likewise one can specify either total particle number or chemical potential. Given these considerations, the best ensemble to choose for the calculation of the properties of a macroscopic system is usually just the ensemble which allows the result to be derived most easily.[7]

Important cases where the thermodynamic ensembles do not give identical results include:
  • Systems at a phase transition.
  • Systems with long-range interactions.
  • Microscopic systems.
In these cases the correct thermodynamic ensemble must be chosen as there are observable differences between these ensembles not just in the size of fluctuations, but also in average quantities such as the distribution of particles. The correct ensemble is that which corresponds to the way the system has been prepared and characterized—in other words, the ensemble that reflects the knowledge about that system.[2]

Illustrative example (a gas)

The above concepts can be illustrated for the specific case of one liter of ammonia gas at standard conditions. (Note that statistical thermodynamics is not restricted to the study of macroscopic gases, and the example of a gas is given here to illustrate concepts. Statistical mechanics and statistical thermodynamics apply to all mechanical systems (including microscopic systems) and to all phases of matter: liquids, solids, plasmas, gases, nuclear matter, quark matter.)

A simple way to prepare a one-litre sample of ammonia at standard conditions is to take a very large reservoir of ammonia at those standard conditions, and connect it to a previously evacuated one-litre container. After ammonia gas has entered the container and the container has been given time to reach thermodynamic equilibrium with the reservoir, the container is sealed and isolated. In thermodynamics, this is a repeatable process resulting in a very well defined sample of gas with a precise description. We now consider the corresponding precise description in statistical thermodynamics.

Although this process is well defined and repeatable in a macroscopic sense, we have no information about the exact locations and velocities of each and every molecule in the container of gas. Moreover, we do not even know exactly how many molecules are in the container; even supposing we knew exactly the average density of the ammonia gas in general, we do not know how many molecules of the gas happened to be inside our container at the moment when we sealed it. The sample is in equilibrium and is in equilibrium with the reservoir: we could reconnect it to the reservoir for some time, and then re-seal it, and our knowledge about the state of the gas would not change. In this case, our knowledge about the state of the gas is precisely described by the grand canonical ensemble. Provided we have an accurate microscopic model of the ammonia gas, we could in principle compute all thermodynamic properties of this sample of gas by using the distribution provided by the grand canonical ensemble.

Hypothetically, we could use an extremely sensitive weight scale to measure exactly the mass of the container before and after introducing the ammonia gas, so that we can exactly know the number of ammonia molecules. After we make this measurement, then our knowledge about the gas would correspond to the canonical ensemble. Finally, suppose by some hypothetical apparatus we can measure exactly the number of molecules and also measure exactly the total energy of the system. Supposing furthermore that this apparatus gives us no further information about the molecules' positions and velocities, our knowledge about the system would correspond to the microcanonical ensemble.

Even after making such measurements, however, our expectations about the behaviour of the gas do not change appreciably. This is because the gas sample is macroscopic and approximates very well the thermodynamic limit, so the different ensembles behave similarly. This can be demonstrated by considering how small the actual fluctuations would be. Suppose that we knew the number density of ammonia gas was exactly 3.04×10^22 molecules per litre inside the reservoir of ammonia gas used to fill the one-litre container. In describing the container with the grand canonical ensemble, then, the average number of molecules would be \langle N\rangle = 3.04\times 10^{22} and the uncertainty (standard deviation) in the number of molecules would be \sigma_N = \sqrt{\langle N \rangle} \approx 2\times 10^{11} (assuming a Poisson distribution), which is relatively very small compared to the total number of molecules. Upon measuring the particle number (thus arriving at a canonical ensemble) we should find very nearly 3.04×10^22 molecules. For example, the probability of finding more than 3.040001×10^22 or fewer than 3.039999×10^22 molecules would be about 1 in 10^3000000000.[note 5]
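The smallness of these fluctuations is a one-line computation. A sketch (Python; the number density is the one assumed above):

```python
import math

N_mean = 3.04e22                 # expected number of molecules in the container
sigma = math.sqrt(N_mean)        # Poisson standard deviation, ~1.7e11
print(sigma)
print(sigma / N_mean)            # relative fluctuation, ~6e-12
```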

Calculation methods

Once the characteristic state function for an ensemble has been calculated for a given system, that system is 'solved' (macroscopic observables can be extracted from the characteristic state function). Calculating the characteristic state function of a thermodynamic ensemble is not necessarily a simple task, however, since it involves considering every possible state of the system. While some hypothetical systems have been exactly solved, the most general (and realistic) case is too complex for exact solution. Various approaches exist to approximate the true ensemble and allow calculation of average quantities.

Exact

There are some cases which allow exact solutions.
  • For very small microscopic systems, the ensembles can be directly computed by simply enumerating over all possible states of the system (using exact diagonalization in quantum mechanics, or integral over all phase space in classical mechanics).
  • Some large systems consist of many separable microscopic systems, and each of the subsystems can be analysed independently. Notably, idealized gases of non-interacting particles have this property, allowing exact derivations of Maxwell–Boltzmann statistics, Fermi–Dirac statistics, and Bose–Einstein statistics.[2]
  • A few large systems with interaction have been solved. By the use of subtle mathematical techniques, exact solutions have been found for a few toy models.[8] Some examples include the Bethe ansatz, the square-lattice Ising model in zero field, and the hard hexagon model.

Monte Carlo

One approximate approach that is particularly well suited to computers is the Monte Carlo method, which examines just a few of the possible states of the system, with the states chosen randomly (with a fair weight). As long as these states form a representative sample of the whole set of states of the system, the approximate characteristic function is obtained. As more and more random samples are included, the errors are reduced to an arbitrarily low level.
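A hedged sketch of the Monte Carlo idea (Python with NumPy; the one-dimensional Ising chain, its size, and its temperature are toy choices, with k_B = J = 1): the Metropolis rule accepts each proposed spin flip with probability min(1, e^{−βΔE}), so the visited states sample the canonical ensemble and averages converge as more samples accumulate.

```python
import numpy as np

rng = np.random.default_rng(1)
L, beta, steps = 10, 0.5, 50_000
spins = rng.choice([-1, 1], size=L)              # random initial configuration

def energy(s):
    return -np.sum(s * np.roll(s, 1))            # nearest-neighbour bonds, periodic chain

E, samples = energy(spins), []
for _ in range(steps):
    i = rng.integers(L)
    # Energy change from flipping spin i (only its two bonds change):
    dE = 2 * spins[i] * (spins[i - 1] + spins[(i + 1) % L])
    if dE <= 0 or rng.random() < np.exp(-beta * dE):
        spins[i] = -spins[i]                     # accept the move
        E += dE
    samples.append(E)

print(np.mean(samples[steps // 2:]))             # mean energy, discarding burn-in
```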

Other

  • For rarefied non-ideal gases, approaches such as the cluster expansion use perturbation theory to include the effect of weak interactions, leading to a virial expansion.[3]
  • For dense fluids, another approximate approach is based on reduced distribution functions, in particular the radial distribution function.[3]
  • Molecular dynamics computer simulations can be used to calculate microcanonical ensemble averages, in ergodic systems. With the inclusion of a connection to a stochastic heat bath, they can also model canonical and grand canonical conditions.
  • Mixed methods involving non-equilibrium statistical mechanical results (see below) may be useful.

Non-equilibrium statistical mechanics

There are many physical phenomena of interest that involve quasi-thermodynamic processes out of equilibrium, for example: flows of heat and particles, electric currents, chemical reactions, and friction. All of these processes occur over time with characteristic rates, and these rates are of importance for engineering.
The field of non-equilibrium statistical mechanics is concerned with understanding these non-equilibrium processes at the microscopic level. (Statistical thermodynamics can only be used to calculate the final result, after the external imbalances have been removed and the ensemble has settled back down to equilibrium.)

In principle, non-equilibrium statistical mechanics could be mathematically exact: ensembles for an isolated system evolve over time according to deterministic equations such as Liouville's equation or its quantum equivalent, the von Neumann equation. These equations are the result of applying the mechanical equations of motion independently to each state in the ensemble. Unfortunately, these ensemble evolution equations inherit much of the complexity of the underlying mechanical motion, and so exact solutions are very difficult to obtain. Moreover, the ensemble evolution equations are fully reversible and do not destroy information (the ensemble's Gibbs entropy is preserved). In order to make headway in modelling irreversible processes, it is necessary to add additional ingredients besides probability and reversible mechanics.

Non-equilibrium mechanics is therefore an active area of theoretical research as the range of validity of these additional assumptions continues to be explored. A few approaches are described in the following subsections.

Stochastic methods

One approach to non-equilibrium statistical mechanics is to incorporate stochastic (random) behaviour into the system. Stochastic behaviour destroys information contained in the ensemble. While this is technically inaccurate (aside from hypothetical situations involving black holes, a system cannot in itself cause loss of information), the randomness is added to reflect that information of interest becomes converted over time into subtle correlations within the system, or to correlations between the system and environment. These correlations appear as chaotic or pseudorandom influences on the variables of interest. By replacing these correlations with randomness proper, the calculations can be made much easier.
  • Boltzmann transport equation: An early form of stochastic mechanics appeared even before the term "statistical mechanics" had been coined, in studies of kinetic theory. James Clerk Maxwell had demonstrated that molecular collisions would lead to apparently chaotic motion inside a gas. Ludwig Boltzmann subsequently showed that, by taking this molecular chaos for granted as a complete randomization, the motions of particles in a gas would follow a simple Boltzmann transport equation that would rapidly restore a gas to an equilibrium state (see H-theorem).
    The Boltzmann transport equation and related approaches are important tools in non-equilibrium statistical mechanics due to their extreme simplicity. These approximations work well in systems where the "interesting" information is immediately (after just one collision) scrambled up into subtle correlations, which essentially restricts them to rarefied gases. The Boltzmann transport equation has been found to be very useful in simulations of electron transport in lightly doped semiconductors (in transistors), where the electrons are indeed analogous to a rarefied gas.

    A quantum technique related in theme is the random phase approximation.
  • BBGKY hierarchy: In liquids and dense gases, it is not valid to immediately discard the correlations between particles after one collision. The BBGKY hierarchy (Bogoliubov–Born–Green–Kirkwood–Yvon hierarchy) gives a method for deriving Boltzmann-type equations but also extending them beyond the dilute gas case, to include correlations after a few collisions.
  • Keldysh formalism (a.k.a. NEGF—non-equilibrium Green functions): A quantum approach to including stochastic dynamics is found in the Keldysh formalism. This approach is often used in electronic quantum transport calculations.

Near-equilibrium methods

Another important class of non-equilibrium statistical mechanical models deals with systems that are only very slightly perturbed from equilibrium. With very small perturbations, the response can be analysed in linear response theory. A remarkable result, as formalized by the fluctuation-dissipation theorem, is that the response of a system when near equilibrium is precisely related to the fluctuations that occur when the system is in total equilibrium.
Essentially, a system that is slightly away from equilibrium—whether put there by external forces or by fluctuations—relaxes towards equilibrium in the same way, since the system cannot tell the difference or "know" how it came to be away from equilibrium.[3]:664

This provides an indirect avenue for obtaining numbers such as ohmic conductivity and thermal conductivity by extracting results from equilibrium statistical mechanics. Since equilibrium statistical mechanics is mathematically well defined and (in some cases) more amenable for calculations, the fluctuation-dissipation connection can be a convenient shortcut for calculations in near-equilibrium statistical mechanics.

A few of the theoretical tools used to make this connection include the fluctuation-dissipation theorem, the Onsager reciprocal relations, and the Green–Kubo relations.

Hybrid methods

An advanced approach uses a combination of stochastic methods and linear response theory. As an example, one approach to compute quantum coherence effects (weak localization, conductance fluctuations) in the conductance of an electronic system is the use of the Green-Kubo relations, with the inclusion of stochastic dephasing by interactions between various electrons by use of the Keldysh method.[9][10]

Applications outside thermodynamics

The ensemble formalism can also be used to analyze general mechanical systems with uncertainty in knowledge about the state of a system. Ensembles are also used in, for example, the propagation of uncertainty over time and ensemble forecasting of weather.

History

In 1738, Swiss physicist and mathematician Daniel Bernoulli published Hydrodynamica which laid the basis for the kinetic theory of gases. In this work, Bernoulli posited the argument, still used to this day, that gases consist of great numbers of molecules moving in all directions, that their impact on a surface causes the gas pressure that we feel, and that what we experience as heat is simply the kinetic energy of their motion.[6]

In 1859, after reading a paper on the diffusion of molecules by Rudolf Clausius, Scottish physicist James Clerk Maxwell formulated the Maxwell distribution of molecular velocities, which gave the proportion of molecules having a certain velocity in a specific range. This was the first-ever statistical law in physics.[11] Five years later, in 1864, Ludwig Boltzmann, a young student in Vienna, came across Maxwell’s paper and was so inspired by it that he spent much of his life developing the subject further.

Statistical mechanics proper was initiated in the 1870s with the work of Boltzmann, much of which was collectively published in his 1896 Lectures on Gas Theory.[12] Boltzmann's original papers on the statistical interpretation of thermodynamics, the H-theorem, transport theory, thermal equilibrium, the equation of state of gases, and similar subjects, occupy about 2,000 pages in the proceedings of the Vienna Academy and other societies. Boltzmann introduced the concept of an equilibrium statistical ensemble and also investigated for the first time non-equilibrium statistical mechanics, with his H-theorem.

The term "statistical mechanics" was coined by the American mathematical physicist J. Willard Gibbs in 1884.[13][note 6] "Probabilistic mechanics" might today seem a more appropriate term, but "statistical mechanics" is firmly entrenched.[14] Shortly before his death, Gibbs published in 1902 Elementary Principles in Statistical Mechanics, a book which formalized statistical mechanics as a fully general approach to address all mechanical systems—macroscopic or microscopic, gaseous or non-gaseous.[1] Gibbs' methods were initially derived in the framework of classical mechanics; however, they were of such generality that they were found to adapt easily to the later quantum mechanics, and they still form the foundation of statistical mechanics to this day.[2]
