The H-theorem is a natural consequence of the kinetic equation derived by Boltzmann that has come to be known as Boltzmann's equation. The H-theorem has led to considerable discussion about its actual implications, with major themes being:
- What is entropy? In what sense does Boltzmann's quantity H correspond to the thermodynamic entropy?
- Are the assumptions (especially the assumption of molecular chaos) behind Boltzmann's equation too strong? When are these assumptions violated?
Etymology and pronunciation
Boltzmann in his original publication writes the symbol E (as in entropy) for its statistical function. Years later, Samuel Hawksley Burbury, one of the critics of the theorem, wrote the function with the symbol H, a notation that was subsequently adopted by Boltzmann when referring to its "H-theorem".
The notation has led to some confusion regarding the name of the
theorem. Even though the statement is usually referred to as the "Aitch theorem", sometimes it is instead called the "Eta theorem", as the capital Greek letter Eta (Η) is undistinguishable from the capital version of Latin letter h (H).
Discussions have been raised on how the symbol should be understood,
but it remains unclear due to the lack of written sources from the time
of the theorem. Studies of the typography and the work of J.W. Gibbs, seem to favour the interpretation of H as Eta.
Definition and meaning of Boltzmann's H
The H value is determined from the function f(E, t) dE, which is the energy distribution function of molecules at time t. The value f(E, t) dE is the number of molecules that have kinetic energy between E and E + dE. H itself is defined as
For an isolated ideal gas (with fixed total energy and fixed total number of particles), the function H is at a minimum when the particles have a Maxwell–Boltzmann distribution;
if the molecules of the ideal gas are distributed in some other way
(say, all having the same kinetic energy), then the value of H will be higher. Boltzmann's H-theorem,
described in the next section, shows that when collisions between
molecules are allowed, such distributions are unstable and tend to
irreversibly seek towards the minimum value of H (towards the Maxwell–Boltzmann distribution).
(Note on notation: Boltzmann originally used the letter E for quantity H; most of the literature after Boltzmann uses the letter H as here. Boltzmann also used the symbol x to refer to the kinetic energy of a particle.)
Boltzmann's H theorem
Boltzmann considered what happens during the collision between two
particles. It is a basic fact of mechanics that in the elastic collision
between two particles (such as hard spheres), the energy transferred
between the particles varies depending on initial conditions (angle of
collision, etc.).
Boltzmann made a key assumption known as the Stosszahlansatz (molecular chaos
assumption), that during any collision event in the gas, the two
particles participating in the collision have 1) independently chosen
kinetic energies from the distribution, 2) independent velocity
directions, 3) independent starting points. Under these assumptions, and
given the mechanics of energy transfer, the energies of the particles
after the collision will obey a certain new random distribution that can
be computed.
Considering repeated uncorrelated collisions, between any and all
of the molecules in the gas, Boltzmann constructed his kinetic equation
(Boltzmann's equation). From this kinetic equation, a natural outcome is that the continual process of collision causes the quantity H to decrease until it has reached a minimum.
Impact
Although Boltzmann's H-theorem
turned out not to be the absolute proof of the second law of
thermodynamics as originally claimed (see Criticisms below), the H-theorem
led Boltzmann in the last years of the 19th century to more and more
probabilistic arguments about the nature of thermodynamics. The
probabilistic view of thermodynamics culminated in 1902 with Josiah Willard Gibbs's statistical mechanics for fully general systems (not just gases), and the introduction of generalized statistical ensembles.
The kinetic equation and in particular Boltzmann's molecular chaos assumption inspired a whole family of Boltzmann equations
that are still used today to model the motions of particles, such as
the electrons in a semiconductor. In many cases the molecular chaos
assumption is highly accurate, and the ability to discard complex
correlations between particles makes calculations much simpler.
Criticism and exceptions
There are several notable reasons described below why the H-theorem,
at least in its original 1871 form, is not completely rigorous. As
Boltzmann would eventually go on to admit, the arrow of time in the H-theorem is not in fact purely mechanical, but really a consequence of assumptions about initial conditions.
Loschmidt's paradox
Soon after Boltzmann published his H theorem, Johann Josef Loschmidt
objected that it should not be possible to deduce an irreversible
process from time-symmetric dynamics and a time-symmetric formalism. If
the H decreases over time in one state, then there must be a matching reversed state where H increases over time (Loschmidt's paradox). The explanation is that Boltzmann's equation is based on the assumption of "molecular chaos",
i.e., that it follows from, or at least is consistent with, the
underlying kinetic model that the particles be considered independent
and uncorrelated. It turns out that this assumption breaks time reversal
symmetry in a subtle sense, and therefore begs the question. Once the particles are allowed to collide, their velocity directions and positions in fact do
become correlated (however, these correlations are encoded in an
extremely complex manner). This shows that an (ongoing) assumption of
independence is not consistent with the underlying particle model.
Boltzmann's reply to Loschmidt was to concede the possibility of
these states, but noting that these sorts of states were so rare and
unusual as to be impossible in practice. Boltzmann would go on to
sharpen this notion of the "rarity" of states, resulting in his famous
equation, his entropy formula of 1877.
Spin echo
As a demonstration of Loschmidt's paradox, a famous modern counterexample (not to Boltzmann's original gas-related H-theorem, but to a closely related analogue) is the phenomenon of spin echo. In the spin echo effect, it is physically possible to induce time reversal in an interacting system of spins.
An analogue to Boltzmann's H for the spin system can be
defined in terms of the distribution of spin states in the system. In
the experiment, the spin system is initially perturbed into a
non-equilibrium state (high H), and, as predicted by the H theorem the quantity H
soon decreases to the equilibrium value. At some point, a carefully
constructed electromagnetic pulse is applied that reverses the motions
of all the spins. The spins then undo the time evolution from before the
pulse, and after some time the H actually increases away from equilibrium (once the evolution has completely unwound, the H
decreases once again to the minimum value). In some sense, the time
reversed states noted by Loschmidt turned out to be not completely
impractical.
Poincaré recurrence
In 1896, Ernst Zermelo noted a further problem with the H theorem, which was that if the system's H is at any time not a minimum, then by Poincaré recurrence, the non-minimal H must recur (though after some extremely long time). Boltzmann admitted that these recurring rises in H
technically would occur, but pointed out that, over long times, the
system spends only a tiny fraction of its time in one of these recurring
states.
The second law of thermodynamics states that the entropy of an isolated system
always increases to a maximum equilibrium value. This is strictly true
only in the thermodynamic limit of an infinite number of particles. For a
finite number of particles, there will always be entropy fluctuations.
For example, in the fixed volume of the isolated system, the maximum
entropy is obtained when half the particles are in one half of the
volume, half in the other, but sometimes there will be temporarily a few
more particles on one side than the other, and this will constitute a
very small reduction in entropy. These entropy fluctuations are such
that the longer one waits, the larger an entropy fluctuation one will
probably see during that time, and the time one must wait for a given
entropy fluctuation is always finite, even for a fluctuation to its
minimum possible value. For example, one might have an extremely low
entropy condition of all particles being in one half of the container.
The gas will quickly attain its equilibrium value of entropy, but given
enough time, this same situation will happen again. For practical
systems, e.g. a gas in a 1-liter container at room temperature and
atmospheric pressure, this time is truly enormous, many multiples of the
age of the universe, and, practically speaking, one can ignore the
possibility.
Fluctuations of H in small systems
Since H is a mechanically defined variable that is not conserved, then like any other such variable (pressure, etc.) it will show thermal fluctuations. This means that H regularly shows spontaneous increases from the minimum value. Technically this is not an exception to the H theorem, since the H
theorem was only intended to apply for a gas with a very large number
of particles. These fluctuations are only perceptible when the system is
small and the time interval over which it is observed is not enormously
large.
If H is interpreted as entropy as Boltzmann intended, then this can be seen as a manifestation of the fluctuation theorem.
Connection to information theory
H is a forerunner of Shannon's information entropy. Claude Shannon denoted his measure of information entropy H after the H-theorem. The article on Shannon's information entropy contains an
explanation of the discrete counterpart of the quantity H, known as the information entropy or information uncertainty (with a minus sign). By extending the discrete information entropy to the continuous information entropy, also called differential entropy, one obtains the expression in Eq.(1), and thus a better feel for the meaning of H.
The H-theorem's connection between information and entropy plays a central role in a recent controversy called the Black hole information paradox.
Tolman's H-theorem
Richard C. Tolman's 1938 book The Principles of Statistical Mechanics dedicates a whole chapter to the study of Boltzmann's H theorem, and its extension in the generalized classical statistical mechanics of Gibbs. A further chapter is devoted to the quantum mechanical version of the H-theorem.
Classical mechanical
We let and be our generalized coordinates for a set of particles. Then we consider a function that returns the probability density of particles, over the states in phase space. Note how this can be multiplied by a small region in phase space, denoted by , to yield the (average) expected number of particles in that region:
Tolman offers the following equations for the definition of the quantity H in Boltzmann's original H theorem.
- .
Here we sum over the regions into which phase space is divided, indexed by . And in the limit for an infinitesimal phase space volume , we can write the sum as an integral:
- .
H can also be written in terms of the number of molecules present in each of the cells:
- .
An additional way to calculate the quantity H is:
- ,
where P is the probability of finding a system chosen at random from the specified microcanonical ensemble. It can finally be written as:
- ,
where G is the number of classical states.
The quantity H can also be defined as the integral over velocity space:
,
where P(v) is the probability distribution.
Using the Boltzmann equation one can prove that H can only decrease.
For a system of N statistically independent particles, H is related to the thermodynamic entropy S through:
- .
So, according to the H-theorem, S can only increase.
Quantum mechanical
In
quantum statistical mechanics (which is the quantum version of
classical statistical mechanics), the H-function is the function:
where summation runs over all possible distinct states of the system, and pi is the probability that the system could be found in the i-th state.
This is closely related to the entropy formula of Gibbs,
- ,
and we shall (following e.g., Waldram (1985), p. 39) proceed using S rather than H.
First, differentiating with respect to time gives
- ,
(using the fact that ∑ dpi/dt = 0, since ∑ pi = 1).
Now Fermi's golden rule gives a master equation
for the average rate of quantum jumps from state α to β; and from state
β to α. (Of course, Fermi's golden rule itself makes certain
approximations, and the introduction of this rule is what introduces
irreversibility. It is essentially the quantum version of Boltzmann's Stosszahlansatz.) For an isolated system the jumps will make contributions
- ,
where the reversibility of the dynamics ensures that the same transition constant ναβ appears in both expressions.
So
But the two differences terms in the summation always have the same sign, so each contribution to dS/dt cannot be negative. Therefore,
for an isolated system.
The same mathematics is sometimes used to show that relative entropy is a Lyapunov function of a Markov process in detailed balance, and other chemistry contexts.
Gibbs' H-theorem
Josiah Willard Gibbs described another way in which the entropy of a microscopic system would tend to increase over time. Later writers have called this "Gibbs' H-theorem" as its conclusion resembles that of Boltzmann's. Gibbs himself never called it an H-theorem,
and in fact his definition of entropy—and mechanism of increase—are
very different from Boltzmann's. This section is included for historical
completeness.
The setting of Gibbs' entropy production theorem is in ensemble statistical mechanics, and the entropy quantity is the Gibbs entropy
(information entropy) defined in terms of the probability distribution
for the entire state of the system. This is in contrast to Boltzmann's H defined in terms of the distribution of states of individual molecules, within a specific state of the system.
Gibbs considered the motion of an ensemble which initially starts
out confined to a small region of phase space, meaning that the state
of the system is known with fair precision though not quite exactly (low
Gibbs entropy). The evolution of this ensemble over time proceeds
according to Liouville's equation.
For almost any kind of realistic system, the Liouville evolution tends
to "stir" the ensemble over phase space, a process analogous to the
mixing of a dye in an incompressible fluid.
After some time, the ensemble appears to be spread out over phase
space, although it is actually a finely striped pattern, with the total
volume of the ensemble (and its Gibbs entropy) conserved. Liouville's
equation is guaranteed to conserve Gibbs entropy since there is no
random process acting on the system; in principle, the original ensemble
can be recovered at any time by reversing the motion.
The critical point of the theorem is thus: If the fine structure
in the stirred-up ensemble is very slightly blurred, for any reason,
then the Gibbs entropy increases, and the ensemble becomes an
equilibrium ensemble. As to why this blurring should occur in reality,
there are a variety of suggested mechanisms. For example, one suggested
mechanism is that the phase space is coarse-grained for some reason
(analogous to the pixelization in the simulation of phase space shown in
the figure). For any required finite degree of fineness the ensemble
becomes "sensibly uniform" after a finite time. Or, if the system
experiences a tiny uncontrolled interaction with its environment, the
sharp coherence of the ensemble will be lost. Edwin Thompson Jaynes argued that the blurring is subjective in nature, simply corresponding to a loss of knowledge about the state of the system. In any case, however it occurs, the Gibbs entropy increase is irreversible provided the blurring cannot be reversed.
The exactly evolving entropy, which does not increase, is known as fine-grained entropy. The blurred entropy is known as coarse-grained entropy.
Leonard Susskind analogizes this distinction to the notion of the volume of a fibrous ball of cotton: on one hand the volume of the fibers themselves is constant, but in
another sense there is a larger coarse-grained volume, corresponding to
the outline of the ball.
Gibbs' entropy increase mechanism solves some of the technical difficulties found in Boltzmann's H-theorem:
The Gibbs entropy does not fluctuate nor does it exhibit Poincare
recurrence, and so the increase in Gibbs entropy, when it occurs, is
therefore irreversible as expected from thermodynamics. The Gibbs
mechanism also applies equally well to systems with very few degrees of
freedom, such as the single-particle system shown in the figure. To the
extent that one accepts that the ensemble becomes blurred, then, Gibbs'
approach is a cleaner proof of the second law of thermodynamics.