Force (in units of 10,000 N) between two nucleons as a function of distance, as computed from the Reid potential (1968).[1] The spins of the neutron and proton are aligned, and they are in the S angular momentum state. The attractive (negative) force has a maximum at a distance of about 1 fm with a force of about 25,000 N. Particles much closer than a distance of 0.8 fm experience a large repulsive (positive) force. Particles separated by a distance greater than 1 fm are still attracted (Yukawa potential), but the force falls as an exponential function of distance.

Corresponding potential energy (in units of MeV) of two nucleons as a function of distance, as computed from the Reid potential. The potential well has its minimum at a distance of about 0.8 fm. With this potential, nucleons can become bound with a negative "binding energy."
The nuclear force (or nucleon–nucleon interaction or residual strong force) is a force that acts between the protons and neutrons of atoms. Neutrons and protons, both nucleons, are affected by the nuclear force almost identically. Since protons have charge +1 e, they experience an electric force that tends to push them apart, but at short range the attractive nuclear force is strong enough to overcome the electromagnetic force. The nuclear force binds nucleons into atomic nuclei.
The nuclear force is powerfully attractive between nucleons at distances of about 1 femtometre (fm, or 1.0 × 10−15 m),
but it rapidly decreases to insignificance at distances beyond about
2.5 fm. At distances less than 0.7 fm, the nuclear force becomes
repulsive. This repulsive component is responsible for the physical size
of nuclei, since the nucleons can come no closer than the force allows.
By comparison, the size of an atom, measured in angstroms (Å, or 1.0 × 10−10
m), is five orders of magnitude larger. The nuclear force is not
simple, however, since it depends on the nucleon spins, has a tensor
component, and may depend on the relative momentum of the nucleons. The strong nuclear force is one of the fundamental forces of nature.
The nuclear force plays an essential role in storing energy that is used in nuclear power and nuclear weapons.
Work (energy) is required to bring charged protons together against
their electric repulsion. This energy is stored when the protons and
neutrons are bound together by the nuclear force to form a nucleus. The
mass of a nucleus is less than the sum total of the individual masses of the protons and neutrons. The difference in masses is known as the mass defect, which can be expressed as an energy equivalent. Energy
is released when a heavy nucleus breaks apart into two or more lighter
nuclei. This energy is the electromagnetic potential energy that is
released when the nuclear force no longer holds the charged nuclear
fragments together.
A quantitative description of the nuclear force relies on equations that are partly empirical.
These equations model the internucleon potential energies, or
potentials. (Generally, forces within a system of particles can be more
simply modeled by describing the system's potential energy; the negative
gradient of a potential
is equal to the vector force.) The constants for the equations are
phenomenological, that is, determined by fitting the equations to
experimental data. The internucleon potentials attempt to describe the
properties of nucleon–nucleon interaction. Once determined, any given
potential can be used in, e.g., the Schrödinger equation to determine the quantum mechanical properties of the nucleon system.
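To make the "negative gradient" relationship concrete, here is a minimal numerical sketch using an invented central potential; the functional form and constants below are illustrative placeholders, not the Reid or any other published potential.

```python
import numpy as np

# Toy central potential V(r) (illustrative only, not a fitted NN potential):
# a strong short-range repulsion plus a weaker longer-range attraction.
def V(r_fm):
    return 2000.0 * np.exp(-r_fm / 0.3) - 100.0 * np.exp(-r_fm / 1.0)   # MeV

r = np.linspace(0.4, 3.0, 261)            # nucleon separation in fm
force = -np.gradient(V(r), r)             # F(r) = -dV/dr, in MeV/fm
# A positive value pushes the nucleons apart (repulsion); a negative value
# pulls them together (attraction).
for ri in (0.5, 1.0, 2.0):
    print(f"r = {ri} fm: F ≈ {np.interp(ri, r, force):+.1f} MeV/fm")
```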
The discovery of the neutron
in 1932 revealed that atomic nuclei were made of protons and neutrons,
held together by an attractive force. By 1935 the nuclear force was
conceived to be transmitted by particles called mesons. This theoretical development included a description of the Yukawa potential, an early example of a nuclear potential. Mesons, predicted by theory, were discovered experimentally in 1947. By the 1970s, the quark model
had been developed, by which the mesons and nucleons were viewed as
composed of quarks and gluons. By this new model, the nuclear force,
resulting from the exchange of mesons between neighboring nucleons, is a
residual effect of the strong force.
Description
While the nuclear force is usually associated with nucleons, more generally this force is felt between hadrons, or particles composed of quarks.
At small separations between nucleons (less than ~ 0.7 fm between their
centers, depending upon spin alignment) the force becomes repulsive,
which keeps the nucleons at a certain average separation, even if they
are of different types. This repulsion arises from the Pauli exclusion
force for identical nucleons (such as two neutrons or two protons). A
Pauli exclusion force also occurs between quarks of the same type within
nucleons, when the nucleons are different (a proton and a neutron, for
example).
Field strength
At
distances larger than 0.7 fm the force becomes attractive between
spin-aligned nucleons, becoming maximal at a center–center distance of
about 0.9 fm. Beyond this distance the force drops exponentially, until
beyond about 2.0 fm separation, the force is negligible. Nucleons have a
radius of about 0.8 fm.
At short distances (less than 1.7 fm or so), the attractive nuclear force is stronger than the repulsive Coulomb force
between protons; it thus overcomes the repulsion of protons within the
nucleus. However, the Coulomb force between protons has a much greater
range as it varies as the inverse square of the charge separation, and
Coulomb repulsion thus becomes the only significant force between
protons when their separation exceeds about 2 to 2.5 fm.
The nuclear force has a spin-dependent component. The force is
stronger for particles with their spins aligned than for those with
their spins anti-aligned. If two particles are the same, such as two
neutrons or two protons, the force is not enough to bind the particles,
since the spin vectors of two particles of the same type must point in
opposite directions when the particles are near each other and are (save
for spin) in the same quantum state. This requirement for fermions stems from the Pauli exclusion principle.
For fermion particles of different types, such as a proton and neutron,
particles may be close to each other and have aligned spins without
violating the Pauli exclusion principle, and the nuclear force may bind
them (in this case, into a deuteron),
since the nuclear force is much stronger for spin-aligned particles.
But if the particles' spins are anti-aligned the nuclear force is too
weak to bind them, even if they are of different types.
The nuclear force also has a tensor component which depends on
the interaction between the nucleon spins and the angular momentum of
the nucleons, leading to deformation from a simple spherical shape.
Nuclear binding
To
disassemble a nucleus into unbound protons and neutrons requires work
against the nuclear force. Conversely, energy is released when a nucleus
is created from free nucleons or other nuclei: the nuclear binding energy. Because of mass–energy equivalence (i.e. Einstein's famous formula E = mc²),
releasing this energy causes the mass of the nucleus to be lower than
the total mass of the individual nucleons, leading to the so-called "mass defect".
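As a worked example of the mass defect, the short calculation below uses rounded reference rest energies for the proton, neutron, and deuteron (standard tabulated values, not figures taken from this article) to recover the deuteron binding energy of about 2.2 MeV.

```python
# Rest energies in MeV (rounded reference values).
m_proton = 938.272
m_neutron = 939.565
m_deuteron = 1875.613

# The bound nucleus is lighter than its parts; the difference, via E = mc^2,
# is the energy released when the deuteron is formed (its binding energy).
mass_defect = m_proton + m_neutron - m_deuteron
print(f"deuteron binding energy ≈ {mass_defect:.3f} MeV")   # ≈ 2.224 MeV
```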
The nuclear force is nearly independent of whether the nucleons are neutrons or protons. This property is called charge independence. The force depends on whether the spins of the nucleons are parallel or antiparallel, as it has a non-central or tensor component. This part of the force does not conserve orbital angular momentum, which is conserved under the action of central forces.
The symmetry underlying the strong force, proposed by Werner Heisenberg, is that protons and neutrons are identical in every respect other than their charge. This is not completely true, because neutrons are a tiny bit heavier, but it is an approximate symmetry. Protons and neutrons are therefore viewed as the same particle, distinguished by different isospin quantum numbers. The strong force is invariant under SU(2) isospin transformations, just as other interactions are invariant under SU(2) transformations of intrinsic spin. Isospin and intrinsic spin are related under this SU(2) symmetry group. There is a strong attraction only when the total isospin is 0, as is confirmed by experiment.
Our understanding of the nuclear force is obtained by scattering experiments and the binding energy of light nuclei.
The nuclear force occurs by the exchange of virtual light mesons, such as the virtual pions, as well as two types of virtual mesons with spin (vector mesons), the rho mesons and the omega mesons. The vector mesons account for the spin-dependence of the nuclear force in this "virtual meson" picture.
The nuclear force is distinct from what historically was known as the weak nuclear force. The weak interaction is one of the four fundamental interactions, and plays a role in such processes as beta decay.
The weak force plays no role in the interaction of nucleons, though it
is responsible for the decay of neutrons to protons and vice versa.
History
The nuclear force has been at the heart of nuclear physics ever since the field was born in 1932 with the discovery of the neutron by James Chadwick. The traditional goal of nuclear physics is to understand the properties of atomic nuclei in terms of the 'bare' interaction between pairs of nucleons, or nucleon–nucleon forces (NN forces).
Within months after the discovery of the neutron, Werner Heisenberg and Dmitri Ivanenko had proposed proton–neutron models for the nucleus.
Heisenberg approached the description of protons and neutrons in the
nucleus through quantum mechanics, an approach that was not at all
obvious at the time. Heisenberg's theory for protons and neutrons in the
nucleus was a "major step toward understanding the nucleus as a quantum
mechanical system."
Heisenberg introduced the first theory of nuclear exchange forces that
bind the nucleons. He considered protons and neutrons to be different
quantum states of the same particle, i.e., nucleons distinguished by the
value of their nuclear isospin quantum numbers.
One of the earliest models for the nucleus was the liquid drop model
developed in the 1930s. One property of nuclei is that the average
binding energy per nucleon is approximately the same for all stable
nuclei, which is similar to a liquid drop. The liquid drop model treated
the nucleus as a drop of incompressible nuclear fluid, with nucleons
behaving like molecules in a liquid. The model was first proposed by George Gamow and then developed by Niels Bohr, Werner Heisenberg and Carl Friedrich von Weizsäcker.
This crude model did not explain all the properties of the nucleus, but
it did explain the spherical shape of most nuclei. The model also gave
good predictions for the nuclear binding energy of nuclei.
In 1934, Hideki Yukawa made the earliest attempt to explain the nature of the nuclear force. According to his theory, massive bosons (mesons) mediate the interaction between two nucleons. Although, in light of quantum chromodynamics (QCD), meson theory is no longer perceived as fundamental, the meson-exchange concept (where hadrons are treated as elementary particles) continues to represent the best working model for a quantitative NN potential. The Yukawa potential (also called a screened Coulomb potential) is a potential of the form

V_Yukawa(r) = −g² e^(−μr) / r,

where g is a magnitude scaling constant, i.e., the amplitude of the potential, μ is the Yukawa particle mass, and r is the radial distance to the particle. The potential is monotonically increasing, implying that the force is always attractive. The constants are determined empirically. The Yukawa potential depends only on the distance between particles, r, hence it models a central force.
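To see how the screened form behaves, the sketch below evaluates a Yukawa potential for illustrative constants (the amplitude g² and the range 1/μ ≈ 1.4 fm are placeholder choices, the latter suggested by the pion Compton wavelength) and compares its falloff with an unscreened 1/r potential.

```python
import numpy as np

g2 = 1.0          # g^2, illustrative amplitude (arbitrary units)
mu = 1.0 / 1.4    # inverse range in fm^-1; 1.4 fm ~ pion Compton wavelength

def yukawa(r):
    return -g2 * np.exp(-mu * r) / r      # screened, short-ranged

def coulomb_like(r):
    return -g2 / r                        # unscreened 1/r for comparison

for r in (1.0, 2.0, 4.0):
    ratio = yukawa(r) / coulomb_like(r)   # equals exp(-mu * r)
    print(f"r = {r} fm: Yukawa/Coulomb = {ratio:.3f}")
# The ratio shrinks exponentially with r, which is why the nuclear force
# becomes negligible beyond a few femtometres while Coulomb repulsion persists.
```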
Throughout the 1930s a group at Columbia University led by I. I. Rabi
developed magnetic resonance techniques to determine the magnetic
moments of nuclei. These measurements led to the discovery in 1939 that
the deuteron also possessed an electric quadrupole moment.
This electrical property of the deuteron had been interfering with the
measurements by the Rabi group. The deuteron, composed of a proton and a
neutron, is one of the simplest nuclear systems. The discovery meant
that the physical shape of the deuteron was not symmetric, which
provided valuable insight into the nature of the nuclear force binding
nucleons. In particular, the result showed that the nuclear force was
not a central force, but had a tensor character. Hans Bethe
identified the discovery of the deuteron's quadrupole moment as one of
the important events during the formative years of nuclear physics.
Historically, the task of describing the nuclear force
phenomenologically was formidable. The first semi-empirical quantitative
models came in the mid-1950s, such as the Woods–Saxon potential
(1954). There was substantial progress in experiment and theory related
to the nuclear force in the 1960s and 1970s. One influential model was
the Reid potential (1968).
In recent years, experimenters have concentrated on the subtleties of
the nuclear force, such as its charge dependence, the precise value of
the πNN coupling constant, improved phase shift analysis, high-precision NN data, high-precision NN potentials, NN scattering at intermediate and high energies, and attempts to derive the nuclear force from QCD.
The nuclear force as a residual of the strong force
An animation of the interaction. The colored double circles are gluons; anticolors are shown as in the accompanying diagram.
The same diagram as above, with the individual quark constituents shown, illustrating how the fundamental strong interaction gives rise to the nuclear force. Straight lines are quarks, while multi-colored loops are gluons (the carriers of the fundamental force). Other gluons, which bind together the proton, neutron, and pion "in-flight," are not shown.
The nuclear force is a residual effect of the more fundamental strong force, or strong interaction. The strong interaction is the attractive force that binds the elementary particles called quarks together to form the nucleons (protons and neutrons) themselves. This more powerful force is mediated by particles called gluons.
Gluons hold quarks together with a force like that of electric charge,
but of far greater strength. Quarks, gluons and their dynamics are
mostly confined within nucleons, but residual influences extend slightly
beyond nucleon boundaries to give rise to the nuclear force.
The nuclear forces arising between nucleons are analogous to the forces in chemistry between neutral atoms or molecules known as London forces.
Such forces between atoms are much weaker than the attractive
electrical forces that hold the atoms themselves together (i.e., that
bind electrons to the nucleus), and their range between atoms is
shorter, because they arise from small separation of charges inside the
neutral atom. Similarly, even though nucleons are made of quarks in
combinations which cancel most gluon forces (they are "color neutral"),
some combinations of quarks and gluons nevertheless leak away from
nucleons, in the form of short-range nuclear force fields that extend
from one nucleon to another nearby nucleon. These nuclear forces are
very weak compared to direct gluon forces ("color forces" or strong forces)
inside nucleons, and the nuclear forces extend only over a few nuclear
diameters, falling exponentially with distance. Nevertheless, they are
strong enough to bind neutrons and protons over short distances, and
overcome the electrical repulsion between protons in the nucleus.
Sometimes, the nuclear force is called the residual strong force, in contrast to the strong interactions which arise from QCD. This phrasing arose during the 1970s when QCD was being established. Before that time, the strong nuclear force referred to the inter-nucleon potential. After the verification of the quark model, strong interaction has come to mean QCD.
Nucleon–nucleon potentials
Two-nucleon systems such as the deuteron, the nucleus of a deuterium atom, as well as proton–proton or neutron–proton scattering are ideal for studying the NN force. Such systems can be described by attributing a potential (such as the Yukawa potential) to the nucleons and using the potentials in a Schrödinger equation.
The form of the potential is derived phenomenologically (by
measurement), although for the long-range interaction, meson-exchange
theories help to construct the potential. The parameters of the
potential are determined by fitting to experimental data such as the deuteron binding energy or NN elastic scattering cross sections (or, equivalently in this context, so-called NN phase shifts).
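As a toy version of such a fit, the sketch below uses a spherical square-well model of the deuteron (a textbook simplification, not a modern NN potential): with the well radius fixed at an assumed 2.1 fm, the single free parameter, the well depth, is adjusted with a root finder until the s-wave bound state reproduces the measured binding energy of about 2.22 MeV.

```python
import numpy as np
from scipy.optimize import brentq

hbarc = 197.327                     # MeV fm
m_p, m_n = 938.272, 939.565         # nucleon rest energies, MeV
mu = m_p * m_n / (m_p + m_n)        # reduced mass of the n-p system, MeV
B = 2.224                           # deuteron binding energy, MeV (the "data")
R = 2.1                             # assumed well radius, fm (illustrative)

def matching(V0):
    """s-wave matching condition k*cot(kR) = -kappa for a square well of depth V0."""
    k = np.sqrt(2 * mu * (V0 - B)) / hbarc      # interior wavenumber, fm^-1
    kappa = np.sqrt(2 * mu * B) / hbarc         # exterior decay constant, fm^-1
    return k / np.tan(k * R) + kappa

# Adjust the one free parameter (well depth) until the bound state matches the data.
V0 = brentq(matching, 5.0, 90.0)
print(f"fitted well depth ≈ {V0:.1f} MeV")      # roughly 30-40 MeV for R = 2.1 fm
```

The same logic, adjusting free parameters until calculated observables match measured ones, carries over to realistic potentials with many more parameters and many more data points.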
A more recent approach is to develop effective field theories for a consistent description of nucleon–nucleon and three-nucleon forces. Quantum hadrodynamics is an effective field theory of the nuclear force, comparable to QCD for color interactions and QED for electromagnetic interactions. Additionally, chiral symmetry breaking can be analyzed in terms of an effective field theory (called chiral perturbation theory) which allows perturbative calculations of the interactions between nucleons with pions as exchange particles.
From nucleons to nuclei
The ultimate goal of nuclear physics would be to describe all nuclear interactions from the basic interactions between nucleons. This is called the microscopic or ab initio approach of nuclear physics. There are two major obstacles to overcome before this dream can become reality:
Calculations in many-body systems are difficult and require advanced computation techniques.
There is evidence that three-nucleon forces
(and possibly higher multi-particle interactions) play a significant
role. This means that three-nucleon potentials must be included into the
model.
This is an active area of research with ongoing advances in
computational techniques leading to better first-principles calculations
of the nuclear shell structure. Two- and three-nucleon potentials have been implemented for nuclides up to A = 12.
Nuclear potentials
A
successful way of describing nuclear interactions is to construct one
potential for the whole nucleus instead of considering all its nucleon
components. This is called the macroscopic approach. For example,
scattering of neutrons from nuclei can be described by considering a
plane wave in the potential of the nucleus, which comprises a real part
and an imaginary part. This model is often called the optical model
since it resembles the case of light scattered by an opaque glass
sphere.
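A minimal sketch of this idea, assuming a complex Woods–Saxon shape with placeholder depths and geometry: the real part refracts the incident neutron wave and the imaginary part absorbs flux, much as an opaque glass sphere absorbs light.

```python
import numpy as np

def optical_potential(r_fm, A=56, V=-45.0, W=-10.0, r0=1.25, a=0.65):
    """Complex Woods-Saxon potential U(r) = (V + i W) * f(r), in MeV.

    V, W  : real (refracting) and imaginary (absorbing) depths -- placeholders
    r0, a : radius parameter (fm) and surface diffuseness (fm) -- placeholders
    A     : mass number of the target nucleus
    """
    R = r0 * A ** (1.0 / 3.0)                    # nuclear radius, fm
    f = 1.0 / (1.0 + np.exp((r_fm - R) / a))     # Woods-Saxon form factor
    return (V + 1j * W) * f

for r in (0.0, 4.0, 8.0):
    U = optical_potential(r)
    print(f"r = {r} fm: Re U = {U.real:6.1f} MeV, Im U = {U.imag:6.1f} MeV")
```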
Nuclear potentials can be local or global: local
potentials are limited to a narrow energy range and/or a narrow nuclear
mass range, while global potentials, which have more parameters and are
usually less accurate, are functions of the energy and the nuclear mass
and can therefore be used in a wider range of applications.
How
can the world achieve the deep carbon emissions reductions that are
necessary to slow or reverse the impacts of climate change? The authors
of a new MIT study say that unless nuclear energy is meaningfully
incorporated into the global mix of low-carbon energy technologies, the
challenge of climate change will be much more difficult and costly to
solve. For nuclear energy to take its place as a major low-carbon energy
source, however, issues of cost and policy need to be addressed.
In "The Future of Nuclear Energy in a Carbon-Constrained World,"
released by the MIT Energy Initiative (MITEI) on Sept. 3, the authors
analyze the reasons for the current global stall of nuclear energy
capacity — which currently accounts for only 5 percent of global primary
energy production — and discuss measures that could be taken to arrest
and reverse that trend.
The study group, led by MIT researchers in collaboration with
colleagues from Idaho National Laboratory and the University of
Wisconsin at Madison, is presenting its findings and recommendations at
events in London, Paris, and Brussels this week, followed by events on
Sept. 25 in Washington, and on Oct. 9 in Tokyo. MIT graduate and
undergraduate students and postdocs, as well as faculty from Harvard
University and members of various think tanks, also contributed to the
study as members of the research team.
“Our analysis demonstrates that realizing nuclear energy’s potential
is essential to achieving a deeply decarbonized energy future in many
regions of the world,” says study co-chair Jacopo Buongiorno, the TEPCO
Professor and associate department head of the Department of Nuclear
Science and Engineering at MIT. He adds, “Incorporating new policy and
business models, as well as innovations in construction that may make
deployment of cost-effective nuclear power plants more affordable, could
enable nuclear energy to help meet the growing global demand for energy
generation while decreasing emissions to address climate change.”
The study team notes that the electricity sector in particular is a
prime candidate for deep decarbonization. Global electricity consumption
is on track to grow 45 percent by 2040, and the team’s analysis shows
that the exclusion of nuclear from low-carbon scenarios could cause the
average cost of electricity to escalate dramatically.
“Understanding the opportunities and challenges facing the nuclear
energy industry requires a comprehensive analysis of technical,
commercial, and policy dimensions,” says Robert Armstrong, director of
MITEI and the Chevron Professor of Chemical Engineering. “Over the past
two years, this team has examined each issue, and the resulting report
contains guidance policymakers and industry leaders may find valuable as
they evaluate options for the future.”
The report discusses recommendations for nuclear plant construction,
current and future reactor technologies, business models and policies,
and reactor safety regulation and licensing. The researchers find that
changes in reactor construction are needed to usher in an era of safer,
more cost-effective reactors, including proven construction management
practices that can keep nuclear projects on time and on budget.
“A shift towards serial manufacturing of standardized plants,
including more aggressive use of fabrication in factories and shipyards,
can be a viable cost-reduction strategy in countries where the
productivity of the traditional construction sector is low,” says MIT
visiting research scientist David Petti, study executive director and
Laboratory Fellow at the Idaho National Laboratory. “Future projects
should also incorporate reactor designs with inherent and passive safety
features.”
These safety features could include core materials with high chemical
and physical stability and engineered safety systems that require
limited or no emergency AC power and minimal external intervention.
Features like these can reduce the probability of severe accidents
occurring and mitigate offsite consequences in the event of an incident.
Such designs can also ease the licensing of new plants and accelerate
their global deployment.
“The role of government will be critical if we are to take advantage
of the economic opportunity and low-carbon potential that nuclear has to
offer,” says John Parsons, study co-chair and senior lecturer at MIT’s
Sloan School of Management. “If this future is to be realized,
government officials must create new decarbonization policies that put
all low-carbon energy technologies (i.e. renewables, nuclear, fossil
fuels with carbon capture) on an equal footing, while also exploring
options that spur private investment in nuclear advancement.”
The study lays out detailed options for government support of
nuclear. For example, the authors recommend that policymakers should
avoid premature closures of existing plants, which undermine efforts to
reduce emissions and increase the cost of achieving emission reduction
targets. One way to avoid these closures is the implementation of
zero-emissions credits — payments made to electricity producers where
electricity is generated without greenhouse gas emissions — which the
researchers note are currently in place in New York, Illinois, and New
Jersey.
Another suggestion from the study is that the government support
development and demonstration of new nuclear technologies through the
use of four “levers”: funding to share regulatory licensing costs;
funding to share research and development costs; funding for the
achievement of specific technical milestones; and funding for production
credits to reward successful demonstration of new designs.
The study includes an examination of the current nuclear regulatory
climate, both in the United States and internationally. While the
authors note that significant social, political, and cultural
differences may exist among many of the countries in the nuclear energy
community, they say that the fundamental basis for assessing the safety
of nuclear reactor programs is fairly uniform, and should be reflected
in a series of basic aligned regulatory principles. They recommend
regulatory requirements for advanced reactors be coordinated and aligned
internationally to enable international deployment of commercial
reactor designs, and to standardize and ensure a high level of safety
worldwide.
The study concludes with an emphasis on the urgent need for both
cost-cutting advancements and forward-thinking policymaking to make the
future of nuclear energy a reality.
"The Future of Nuclear Energy in a Carbon-Constrained World" is the
eighth in the "Future of…" series of studies that are intended to serve
as guides to researchers, policymakers, and industry. Each report
explores the role of technologies that might contribute at scale in
meeting rapidly growing global energy demand in a carbon-constrained
world. Nuclear power was the subject of the first of these
interdisciplinary studies, with the 2003 "Future of Nuclear Power" report
(an update was published in 2009). The series has also included a study
on the future of the nuclear fuel cycle. Other reports in the series
have focused on carbon dioxide sequestration, natural gas, the electric
grid, and solar power. These comprehensive reports are written by
multidisciplinary teams of researchers. The research is informed by a
distinguished external advisory committee.
Occam's razor (also Ockham's razor or Ocham's razor; Latin: lex parsimoniae "law of parsimony") is the problem-solving principle that the simplest solution tends to be the right one. When presented with competing hypotheses to solve a problem, one should select the solution with the fewest assumptions. The idea is attributed to William of Ockham (c. 1287–1347), who was an English Franciscan friar, scholastic philosopher, and theologian.
In science, Occam's razor is used as an abductive heuristic in the development of theoretical models, rather than as a rigorous arbiter between candidate models. In the scientific method, Occam's razor is not considered an irrefutable principle of logic or a scientific result; the preference for simplicity in the scientific method is based on the falsifiability
criterion. For each accepted explanation of a phenomenon, there may be
an extremely large, perhaps even incomprehensible, number of possible
and more complex alternatives. Since one can always burden failing
explanations with ad hoc hypotheses to prevent them from being falsified, simpler theories are preferable to more complex ones because they are more testable.
History
The term Occam's razor did not appear until a few centuries after William of Ockham's death in 1347. Libert Froidmont, in his On Christian Philosophy of the Soul, takes credit for the phrase, speaking of "novacula occami".
Ockham did not invent this principle, but the "razor"—and its
association with him—may be due to the frequency and effectiveness with
which he used it.
Ockham stated the principle in various ways, but the most popular
version, "Entities are not to be multiplied without necessity" (Non sunt multiplicanda entia sine necessitate) was formulated by the Irish Franciscan philosopher John Punch in his 1639 commentary on the works of Duns Scotus.
Formulations before William of Ockham
Part of a page from Duns Scotus' book Commentaria oxoniensia ad IV libros magistri Sententiarum, novis curis edidit p. Marianus Fernandez Garcia (1914, p. 211): "Pluralitas non est ponenda sine necessitate", i.e., "Plurality is not to be posited without necessity"
The origins of what has come to be known as Occam's razor are traceable to the works of earlier philosophers such as John Duns Scotus (1265–1308), Robert Grosseteste (1175–1253), Maimonides (Moses ben-Maimon, 1138–1204), and even Aristotle (384–322 BC). Aristotle writes in his Posterior Analytics, "We may assume the superiority ceteris paribus [other things being equal] of the demonstration which derives from fewer postulates or hypotheses." Ptolemy (c. AD 90 – c. AD 168) stated, "We consider it a good principle to explain the phenomena by the simplest hypothesis possible."
Phrases such as "It is vain to do with more what can be done with
fewer" and "A plurality is not to be posited without necessity" were
commonplace in 13th-century scholastic writing. Robert Grosseteste, in Commentary on [Aristotle's] the Posterior Analytics Books (Commentarius in Posteriorum Analyticorum Libros)
(c. 1217–1220), declares: "That is better and more valuable which
requires fewer, other circumstances being equal... For if one thing were
demonstrated from many and another thing from fewer equally known
premises, clearly that is better which is from fewer because it makes us
know quickly, just as a universal demonstration is better than
particular because it produces knowledge from fewer premises. Similarly
in natural science, in moral science, and in metaphysics the best is
that which needs no premises and the better that which needs the fewer,
other circumstances being equal."
The Summa Theologica of Thomas Aquinas
(1225–1274) states that "it is superfluous to suppose that what can be
accounted for by a few principles has been produced by many." Aquinas
uses this principle to construct an objection to God's existence, an objection that he in turn answers and refutes generally (cf. quinque viae), and specifically, through an argument based on causality.
Hence, Aquinas acknowledges the principle that today is known as
Occam's razor, but prefers causal explanations to other simple
explanations (cf. also Correlation does not imply causation).
William of Ockham
William of Ockham (circa 1287–1347) was an English Franciscan friar and theologian, an influential medieval philosopher and a nominalist. His popular fame as a great logician rests chiefly on the maxim attributed to him and known as Occam's razor. The term razor
refers to distinguishing between two hypotheses either by "shaving
away" unnecessary assumptions or cutting apart two similar conclusions.
While it has been claimed that Occam's razor is not found in any of William's writings, one can cite statements such as Numquam ponenda est pluralitas sine necessitate [Plurality must never be posited without necessity], which occurs in his theological work on the 'Sentences of Peter Lombard' (Quaestiones et decisiones in quattuor libros Sententiarum Petri Lombardi (ed. Lugd., 1495), i, dist. 27, qu. 2, K).
Nevertheless, the precise words sometimes attributed to William of Ockham, entia non sunt multiplicanda praeter necessitatem (entities must not be multiplied beyond necessity), are absent in his extant works; this particular phrasing comes from John Punch, who described the principle as a "common axiom" (axioma vulgare) of the Scholastics.
William of Ockham's contribution seems to restrict the operation of
this principle in matters pertaining to miracles and God's power; so, in
the Eucharist, a plurality of miracles is possible, simply because it pleases God.
This principle is sometimes phrased as "pluralitas non est ponenda sine necessitate" ("plurality should not be posited without necessity"). In his Summa Totius Logicae, i. 12, William of Ockham cites the principle of economy, Frustra fit per plura quod potest fieri per pauciora ("It is futile to do with more things that which can be done with fewer"). (Thorburn, 1918, pp. 352–53; Kneale and Kneale, 1962, p. 243.)
Later formulations
To quote Isaac Newton,
"We are to admit no more causes of natural things than such as are both
true and sufficient to explain their appearances. Therefore, to the
same natural effects we must, as far as possible, assign the same
causes."
Bertrand Russell
offers a particular version of Occam's razor: "Whenever possible,
substitute constructions out of known entities for inferences to unknown
entities."
Around 1960, Ray Solomonoff founded the theory of universal inductive inference,
the theory of prediction based on observations; for example, predicting
the next symbol based upon a given series of symbols. The only
assumption is that the environment follows some unknown but computable
probability distribution. This theory is a mathematical formalization of
Occam's razor.
Another technical approach to Occam's razor is ontological parsimony.
Parsimony means spareness and is also referred to as the Rule of
Simplicity. This is considered a strong version of Occam's razor. A variation used in medicine is called the "Zebra": a doctor should reject an exotic medical diagnosis when a more commonplace explanation is more likely, derived from Theodore Woodward's dictum "When you hear hoofbeats, think of horses not zebras".
Ernst Mach formulated a stronger version of Occam's razor in physics, which he called the Principle of Economy, stating: "Scientists must use the simplest means of arriving at their results and exclude everything not perceived by the senses."
This principle goes back at least as far as Aristotle, who wrote "Nature operates in the shortest way possible."
The idea of parsimony or simplicity in deciding between theories,
though not the intent of the original expression of Occam's razor, has
been assimilated into our culture as the widespread layman's formulation
that "the simplest explanation is usually the correct one."
Justifications
Aesthetic
Prior
to the 20th century, it was a commonly held belief that nature itself
was simple and that simpler hypotheses about nature were thus more
likely to be true. This notion was deeply rooted in the aesthetic value
that simplicity holds for human thought and the justifications presented
for it often drew from theology. Thomas Aquinas
made this argument in the 13th century, writing, "If a thing can be
done adequately by means of one, it is superfluous to do it by means of
several; for we observe that nature does not employ two instruments [if]
one suffices."
Empirical
Occam's razor has gained strong empirical support in helping to converge on better theories (see "Applications" section below for some examples).
In the related concept of overfitting, excessively complex models are affected by statistical noise
(a problem also known as the bias-variance trade-off), whereas simpler
models may capture the underlying structure better and may thus have
better predictive performance. It is, however, often difficult to deduce which part of the data is noise (cf. model selection, test set, minimum description length, Bayesian inference, etc.).
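A small self-contained illustration of this point, using synthetic data and arbitrary parameters: a straight-line fit and a high-degree polynomial are both fit to noisy samples of a simple linear rule, and their errors are compared on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = lambda x: 2.0 * x + 1.0                       # simple underlying rule
x_train = np.linspace(0, 1, 10)
x_test = np.linspace(0, 1, 100)
y_train = truth(x_train) + rng.normal(0, 0.2, x_train.size)   # noisy samples
y_test = truth(x_test)

for degree in (1, 8):                                  # simple vs complex model
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE = {train_err:.4f}, test MSE = {test_err:.4f}")
# The complex model fits the noisy training points more closely but typically
# generalizes worse to the held-out points drawn from the same simple rule.
```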
Testing the razor
The
razor's statement that "other things being equal, simpler explanations
are generally better than more complex ones" is amenable to empirical
testing. Another interpretation of the razor's statement would be that
"simpler hypotheses are generally better than the complex ones". The
procedure to test the former interpretation would compare the track
records of simple and comparatively complex explanations. If one accepts
the first interpretation, the validity of Occam's razor as a tool would
then have to be rejected if the more complex explanations were more
often correct than the less complex ones (while the converse would lend
support to its use). If the latter interpretation is accepted, the
validity of Occam's razor as a tool could possibly be accepted if the
simpler hypotheses led to correct conclusions more often than not.
Possible explanations can become needlessly complex. It is coherent, for instance, to add the involvement of leprechauns to any explanation, but Occam's razor would prevent such additions unless they were necessary.
Increases in complexity are sometimes necessary. So there
remains a justified general bias toward the simpler of two competing
explanations. To understand why, consider that for each accepted
explanation of a phenomenon, there is always an infinite number of
possible, more complex, and ultimately incorrect, alternatives. This is
so because one can always burden a failing explanation with an ad hoc hypothesis. Ad hoc hypotheses are justifications that prevent theories from being falsified. Even other empirical criteria, such as consilience,
can never truly eliminate such explanations as competition. Each true
explanation, then, may have had many alternatives that were simpler and
false, but also an infinite number of alternatives that were more
complex and false. But if an alternative ad hoc hypothesis were indeed
justifiable, its implicit conclusions would be empirically verifiable.
On a commonly accepted repeatability principle, these alternative
theories have never been observed and continue to escape observation. In addition, one does not say an explanation is true if it has not withstood this principle.
Put another way, any new, and even more complex, theory can still possibly be true. For example, if an individual makes supernatural claims that leprechauns
were responsible for breaking a vase, the simpler explanation would be
that he is mistaken, but ongoing ad hoc justifications (e.g. "... and
that's not me on the film; they tampered with that, too") successfully
prevent outright falsification. This endless supply of elaborate
competing explanations, called saving hypotheses, cannot be ruled out—except by using Occam's razor.
A study of the predictive validity of Occam's razor found 32 published
papers that included 97 comparisons of economic forecasts from simple
and complex forecasting methods. None of the papers provided a balance
of evidence that complexity of method improved forecast accuracy. In the
25 papers with quantitative comparisons, complexity increased forecast
errors by an average of 27 percent.
Mathematical
One justification of Occam's razor is a direct result of basic probability theory.
By definition, all assumptions introduce possibilities for error; if an
assumption does not improve the accuracy of a theory, its only effect
is to increase the probability that the overall theory is wrong.
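To make the arithmetic explicit (with purely illustrative numbers): if each independent assumption holds with probability 0.95, the chance that every assumption in a theory holds falls geometrically as assumptions are added.

```python
p_each = 0.95          # assumed probability that any single assumption holds
for n_assumptions in (1, 3, 5, 10):
    p_all = p_each ** n_assumptions     # independence assumed, for illustration
    print(f"{n_assumptions:2d} assumptions -> P(all hold) = {p_all:.3f}")
```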
There have also been other attempts to derive Occam's razor from probability theory, including notable attempts made by Harold Jeffreys and E. T. Jaynes. The probabilistic (Bayesian) basis for Occam's razor is elaborated by David J. C. MacKay in chapter 28 of his book Information Theory, Inference, and Learning Algorithms, where he emphasizes that a prior bias in favour of simpler models is not required.
William H. Jefferys and James O. Berger
(1991) generalize and quantify the original formulation's "assumptions"
concept as the degree to which a proposition is unnecessarily
accommodating to possible observable data.
They state, "A hypothesis with fewer adjustable parameters will
automatically have an enhanced posterior probability, due to the fact
that the predictions it makes are sharp." The model they propose balances the precision of a theory's predictions against their sharpness—preferring theories that sharply
make correct predictions over theories that accommodate a wide range of
other possible results. This, again, reflects the mathematical
relationship between key concepts in Bayesian inference (namely marginal probability, conditional probability, and posterior probability).
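The following sketch gives a minimal numeric version of this Bayesian comparison, in the spirit of MacKay's treatment; the coin-flip setup and the numbers are illustrative choices, not taken from the sources cited here. A zero-parameter "fair coin" model is compared with a one-parameter "unknown bias" model by marginal likelihood, and the extra parameter is penalized automatically, without any explicit simplicity term.

```python
from math import factorial

def evidence_fair(heads, tails):
    """Marginal likelihood of a specific flip sequence under a fair coin."""
    return 0.5 ** (heads + tails)

def evidence_biased(heads, tails):
    """Marginal likelihood under an unknown bias with a uniform prior:
    integral of theta^h * (1 - theta)^t over [0, 1] = h! t! / (h + t + 1)!."""
    return factorial(heads) * factorial(tails) / factorial(heads + tails + 1)

for heads, tails in [(6, 4), (9, 1)]:
    bayes_factor = evidence_fair(heads, tails) / evidence_biased(heads, tails)
    print(f"{heads} heads, {tails} tails: "
          f"P(D|fair)/P(D|biased) = {bayes_factor:.2f}")
```

For mildly unbalanced data the simpler model wins even though the biased model fits better at its best-fitting parameter value; only strongly unbalanced data overcome this automatic Occam penalty.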
The bias–variance tradeoff is a framework that incorporates the Occam's razor principle in its balance between overfitting (excess variance) and underfitting (excess bias).
Other philosophers
Karl Popper
Karl Popper
argues that a preference for simple theories need not appeal to
practical or aesthetic considerations. Our preference for simplicity may
be justified by its falsifiability
criterion: we prefer simpler theories to more complex ones "because
their empirical content is greater; and because they are better
testable" (Popper 1992). The idea here is that a simple theory applies
to more cases than a more complex one, and is thus more easily
falsifiable. This is again comparing a simple theory to a more complex
theory where both explain the data equally well.
Elliott Sober
The philosopher of science Elliott Sober
once argued along the same lines as Popper, tying simplicity with
"informativeness": The simplest theory is the more informative, in the
sense that it requires less information to a question. He has since rejected this account of simplicity, purportedly because it fails to provide an epistemic
justification for simplicity. He now believes that simplicity
considerations (and considerations of parsimony in particular) do not
count unless they reflect something more fundamental. Philosophers, he
suggests, may have made the error of hypostatizing simplicity (i.e.,
endowed it with a sui generis
existence), when it has meaning only when embedded in a specific
context (Sober 1992). If we fail to justify simplicity considerations on
the basis of the context in which we use them, we may have no
non-circular justification: "Just as the question 'why be rational?' may
have no non-circular answer, the same may be true of the question 'why
should simplicity be considered in evaluating the plausibility of
hypotheses?'"
... the simplest hypothesis
proposed as an explanation of phenomena is more likely to be the true
one than is any other available hypothesis, that its predictions are
more likely to be true than those of any other available hypothesis, and
that it is an ultimate a priori epistemic principle that simplicity is
evidence for truth.
— Swinburne 1997
According to Swinburne, since our choice of theory cannot be determined by data (see Underdetermination and Duhem-Quine thesis),
we must rely on some criterion to determine which theory to use. Since
it is absurd to have no logical method for settling on one hypothesis
amongst an infinite number of equally data-compliant hypotheses, we
should choose the simplest theory: "Either science is irrational [in the
way it judges theories and predictions probable] or the principle of
simplicity is a fundamental synthetic a priori truth." (Swinburne 1997).
3.328 "If a sign is not necessary then it is meaningless. That is the meaning of Occam's Razor."
(If everything in the symbolism works as though a sign had meaning, then it has meaning.)
4.04 "In the proposition there must be exactly as many things
distinguishable as there are in the state of affairs, which it
represents. They must both possess the same logical (mathematical)
multiplicity (cf. Hertz's Mechanics, on Dynamic Models)."
5.47321 "Occam's Razor is, of course, not an arbitrary rule nor one
justified by its practical success. It simply says that unnecessary
elements in a symbolism mean nothing. Signs which serve one purpose are
logically equivalent; signs which serve no purpose are logically
meaningless."
and on the related concept of "simplicity":
6.363 "The procedure of induction consists in accepting as true the simplest law that can be reconciled with our experiences."
Applications
Science and the scientific method
Andreas Cellarius's illustration of the Copernican system, from the Harmonia Macrocosmica (1660). Future positions of the sun, moon and other solar system bodies can be calculated using a geocentric model (the earth is at the centre) or using a heliocentric model
(the sun is at the centre). Both work, but the geocentric model arrives
at the same conclusions through a much more complex system of
calculations than the heliocentric model. This was pointed out in a
preface to Copernicus' first edition of De revolutionibus orbium coelestium.
In chemistry, Occam's razor is often an important heuristic when developing a model of a reaction mechanism.
Although it is useful as a heuristic in developing models of reaction
mechanisms, it has been shown to fail as a criterion for selecting among
some selected published models.
In this context, Einstein himself expressed caution when he formulated
Einstein's Constraint: "It can scarcely be denied that the supreme goal
of all theory is to make the irreducible basic elements as simple and as
few as possible without having to surrender the adequate representation
of a single datum of experience". An often-quoted version of this
constraint (which cannot be verified as posited by Einstein himself) says "Everything should be kept as simple as possible, but no simpler."
In the scientific method, parsimony is an epistemological, metaphysical or heuristic preference, not an irrefutable principle of logic or a scientific result.
As a logical principle, Occam's razor would demand that scientists
accept the simplest possible theoretical explanation for existing data.
However, science has shown repeatedly that future data often support
more complex theories than do existing data. Science prefers the
simplest explanation that is consistent with the data available at a
given time, but the simplest explanation may be ruled out as new data
become available.
That is, science is open to the possibility that future experiments
might support more complex theories than demanded by current data and is
more interested in designing experiments to discriminate between
competing theories than favoring one theory over another based merely on
philosophical principles.
When scientists use the idea of parsimony, it has meaning only in
a very specific context of inquiry. Several background assumptions are
required for parsimony to connect with plausibility in a particular
research problem. The reasonableness of parsimony in one research
context may have nothing to do with its reasonableness in another. It is
a mistake to think that there is a single global principle that spans
diverse subject matter.
It has been suggested that Occam's razor is a widely accepted
example of extraevidential consideration, even though it is entirely a
metaphysical assumption. There is little empirical evidence that the world is actually simple or that simple accounts are more likely to be true than complex ones.
Most of the time, Occam's razor is a conservative tool, cutting
out "crazy, complicated constructions" and assuring "that hypotheses are
grounded in the science of the day", thus yielding "normal" science:
models of explanation and prediction.
There are, however, notable exceptions where Occam's razor turns a
conservative scientist into a reluctant revolutionary. For example, Max Planck interpolated between the Wien and Jeans
radiation laws and used Occam's razor logic to formulate the quantum
hypothesis, even resisting that hypothesis as it became more obvious
that it was correct.
Appeals to simplicity were used to argue against the phenomena of meteorites, ball lightning, continental drift, and reverse transcriptase.
One can argue for atomic building blocks for matter, because it
provides a simpler explanation for the observed reversibility of both
mixing and chemical reactions as simple separation and rearrangements of
atomic building blocks. At the time, however, the atomic theory was considered more complex because it implied the existence of invisible particles that had not been directly detected. Ernst Mach and the logical positivists rejected John Dalton's atomic theory until the reality of atoms was more evident in Brownian motion, as shown by Albert Einstein.
In the same way, postulating the aether is more complex than transmission of light through a vacuum.
At the time, however, all known waves propagated through a physical
medium, and it seemed simpler to postulate the existence of a medium
than to theorize about wave propagation without a medium. Likewise,
Newton's idea of light particles seemed simpler than Christiaan
Huygens's idea of waves, so many favored it. In this case, as it turned
out, neither the wave explanation nor the particle explanation alone suffices, as light behaves both like waves and like particles.
Three axioms presupposed by the scientific method are realism
(the existence of objective reality), the existence of natural laws, and
the constancy of natural law. Rather than depend on provability of
these axioms, science depends on the fact that they have not been
objectively falsified. Occam's razor and parsimony support, but do not
prove, these axioms of science. The general principle of science is that
theories (or models) of natural law must be consistent with repeatable
experimental observations. This ultimate arbiter (selection criterion)
rests upon the axioms mentioned above.
There are examples where Occam's razor would have favored the
wrong theory given the available data. Simplicity principles are useful
philosophical preferences for choosing a more likely theory from among
several possibilities that are all consistent with available data. A
single instance of Occam's razor favoring a wrong theory falsifies the
razor as a general principle. Michael Lee and others
provide cases in which a parsimonious approach does not guarantee a
correct conclusion and, if based on incorrect working hypotheses or
interpretations of incomplete data, may even strongly support a false
conclusion.
If multiple models of natural law make exactly the same testable
predictions, they are equivalent and there is no need for parsimony to
choose a preferred one. For example, Newtonian, Hamiltonian and
Lagrangian classical mechanics are equivalent. Physicists have no
interest in using Occam's razor to say the other two are wrong.
Likewise, there is no demand for simplicity principles to arbitrate
between wave and matrix formulations of quantum mechanics. Science often
does not demand arbitration or selection criteria between models that
make the same testable predictions.
Biology
Biologists or philosophers of biology use Occam's razor in either of two contexts both in evolutionary biology: the units of selection controversy and systematics. George C. Williams in his book Adaptation and Natural Selection (1966) argues that the best way to explain altruism
among animals is based on low-level (i.e., individual) selection as
opposed to high-level group selection. Altruism is defined by some
evolutionary biologists (e.g., R. Alexander, 1987; W. D. Hamilton, 1964)
as behavior that is beneficial to others (or to the group) at a cost to
the individual, and many posit individual selection as the mechanism
that explains altruism solely in terms of the behaviors of individual
organisms acting in their own self-interest (or in the interest of their
genes, via kin selection). Williams was arguing against the perspective
of others who propose selection at the level of the group as an
evolutionary mechanism that selects for altruistic traits (e.g., D. S.
Wilson & E. O. Wilson, 2007). The basis for Williams' contention is
that of the two, individual selection is the more parsimonious theory.
In doing so he is invoking a variant of Occam's razor known as Morgan's Canon:
"In no case is an animal activity to be interpreted in terms of higher
psychological processes, if it can be fairly interpreted in terms of
processes which stand lower in the scale of psychological evolution and
development." (Morgan 1903).
However, more recent biological analyses, such as Richard Dawkins' The Selfish Gene,
have contended that Morgan's Canon is not the simplest and most basic
explanation. Dawkins argues the way evolution works is that the genes
propagated in most copies end up determining the development of that
particular species, i.e., natural selection turns out to select specific
genes, and this is really the fundamental underlying principle that
automatically gives individual and group selection as emergent features of evolution.
Zoology provides an example. Muskoxen, when threatened by wolves,
form a circle with the males on the outside and the females and young
on the inside. This is an example of a behavior by the males that seems
to be altruistic. The behavior is disadvantageous to them individually
but beneficial to the group as a whole and was thus seen by some to
support the group selection theory. Another interpretation is kin
selection: if the males are protecting their offspring, they are
protecting copies of their own alleles. Engaging in this behavior would
be favored by individual selection if the cost to the male musk ox is
less than half of the benefit received by his calf – which could easily
be the case if wolves have an easier time killing calves than adult
males. It could also be the case that male musk oxen would be
individually less likely to be killed by wolves if they stood in a
circle with their horns pointing out, regardless of whether they were
protecting the females and offspring. That would be an example of
regular natural selection – a phenomenon called "the selfish herd".
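The quantitative condition invoked here is Hamilton's rule, r × b > c (relatedness times benefit must exceed cost). A toy check follows, with the standard parent-offspring relatedness of 1/2 and made-up benefit and cost values.

```python
def favored_by_kin_selection(relatedness, benefit, cost):
    """Hamilton's rule: an altruistic trait can spread if r * b > c."""
    return relatedness * benefit > cost

# A male musk ox defending its calf: parent-offspring relatedness is 1/2,
# so defence pays off whenever the cost is less than half the calf's benefit.
print(favored_by_kin_selection(relatedness=0.5, benefit=1.0, cost=0.4))  # True
print(favored_by_kin_selection(relatedness=0.5, benefit=1.0, cost=0.6))  # False
```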
Systematics is the branch of biology
that attempts to establish patterns of genealogical relationship among
biological taxa. It is also concerned with their classification. There
are three primary camps in systematics: cladists, pheneticists, and
evolutionary taxonomists. The cladists hold that genealogy
alone should determine classification, pheneticists contend that
overall similarity is the determining criterion, while evolutionary
taxonomists say that both genealogy and similarity count in
classification.
It is among the cladists that Occam's razor is to be found, although their term for it is cladistic parsimony. Cladistic parsimony (or maximum parsimony) is a method of phylogenetic inference in the construction of types of phylogenetic trees (more specifically, cladograms). Cladograms
are branching, tree-like structures used to represent hypotheses of
relative degree of relationship, based on shared, derived character
states. Cladistic parsimony is used to select as the preferred
hypothesis of relationships the cladogram that requires the fewest
implied character state transformations. Critics of the cladistic
approach often observe that for some types of tree, parsimony
consistently produces the wrong results, regardless of how much data is
collected (this is called statistical inconsistency, or long branch attraction).
However, this criticism is also potentially true for any type of
phylogenetic inference, unless the model used to estimate the tree
reflects the way that evolution actually happened. Because this
information is not empirically accessible, the criticism of statistical
inconsistency against parsimony holds no force.[52] For a book-length treatment of cladistic parsimony, see Elliott Sober's Reconstructing the Past: Parsimony, Evolution, and Inference (1988). For a discussion of both uses of Occam's razor in biology, see Sober's article "Let's Razor Ockham's Razor" (1990).
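To see what "fewest implied character state transformations" means operationally, here is a small sketch of the Fitch parsimony count for a single character on a fixed binary tree; the taxa, states, and tree shapes are invented for illustration. Cladistic parsimony scores every candidate cladogram this way and prefers the lowest total over all characters.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree        : nested tuples of leaf names, e.g. (("A", "B"), ("C", "D"))
    leaf_states : dict mapping each leaf name to its observed character state
    """
    def visit(node):
        if isinstance(node, str):                    # leaf: its own state, no cost
            return {leaf_states[node]}, 0
        left, right = node
        set_l, cost_l = visit(left)
        set_r, cost_r = visit(right)
        common = set_l & set_r
        if common:                                   # children can agree: no change
            return common, cost_l + cost_r
        return set_l | set_r, cost_l + cost_r + 1    # disagreement: one implied change

    return visit(tree)[1]

states = {"A": "wings", "B": "wings", "C": "no wings", "D": "no wings"}
print(fitch_score((("A", "B"), ("C", "D")), states))   # 1 change suffices
print(fitch_score((("A", "C"), ("B", "D")), states))   # this grouping needs 2
```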
Other methods for inferring evolutionary relationships use parsimony in a more traditional way. Likelihood
methods for phylogeny use parsimony as they do for all likelihood
tests, with hypotheses requiring few differing parameters (i.e., numbers
of different rates of character change or different frequencies of
character state transitions) being treated as null hypotheses relative
to hypotheses requiring many differing parameters. Thus, complex
hypotheses must predict data much better than do simple hypotheses
before researchers reject the simple hypotheses. Recent advances employ information theory, a close cousin of likelihood, which uses Occam's razor in the same way.
Francis Crick
has commented on potential limitations of Occam's razor in biology. He
advances the argument that because biological systems are the products
of (an ongoing) natural selection, the mechanisms are not necessarily
optimal in an obvious sense. He cautions: "While Ockham's razor is a
useful tool in the physical sciences, it can be a very dangerous
implement in biology. It is thus very rash to use simplicity and
elegance as a guide in biological research."
In biogeography, parsimony is used to infer ancient migrations of species or populations by observing the geographic distribution and relationships of existing organisms. Given the phylogenetic tree, ancestral migrations are inferred to be those that require the minimum amount of total movement.
Religion
In the philosophy of religion, Occam's razor is sometimes applied to the existence of God. William of Ockham
himself was a Christian. He believed in God, and in the authority of
Scripture; he writes that "nothing ought to be posited without a reason
given, unless it is self-evident (literally, known through itself) or
known by experience or proved by the authority of Sacred Scripture."
Ockham believed that an explanation has no sufficient basis in reality
when it does not harmonize with reason, experience, or the Bible.
However, unlike many theologians of his time, Ockham did not believe God
could be logically proven with arguments. To Ockham, science was a
matter of discovery, but theology was a matter of revelation and faith.
He states: "only faith gives us access to theological truths. The ways
of God are not open to reason, for God has freely chosen to create a
world and establish a way of salvation within it apart from any
necessary laws that human logic or rationality can uncover."
St. Thomas Aquinas, in the Summa Theologica,
uses a formulation of Occam's razor to construct an objection to the
idea that God exists, which he refutes directly with a counterargument:
Further, it is superfluous to suppose that what can be
accounted for by a few principles has been produced by many. But it
seems that everything we see in the world can be accounted for by other
principles, supposing God did not exist. For all natural things can be
reduced to one principle which is nature; and all voluntary things can
be reduced to one principle which is human reason, or will. Therefore
there is no need to suppose God's existence.
In turn, Aquinas answers this with the quinque viae, and addresses the particular objection above with the following answer:
Since nature works for a determinate end under the
direction of a higher agent, whatever is done by nature must needs be
traced back to God, as to its first cause. So also whatever is done
voluntarily must also be traced back to some higher cause other than
human reason or will, since these can change or fail; for all things
that are changeable and capable of defect must be traced back to an
immovable and self-necessary first principle, as was shown in the body
of the Article.
Rather than argue for the necessity of a god, some theists base their
belief upon grounds independent of, or prior to, reason, making Occam's
razor irrelevant. This was the stance of Søren Kierkegaard, who viewed belief in God as a leap of faith that sometimes directly opposed reason. This is also the doctrine of Gordon Clark's presuppositional apologetics, with the exception that Clark never thought the leap of faith was contrary to reason (see also Fideism).
Various arguments in favor of God establish God as a useful or even necessary assumption. By contrast, some anti-theists hold firmly to the belief that assuming the existence of God introduces unnecessary complexity (Schmitt 2005; e.g., the Ultimate Boeing 747 gambit).
Another application of the principle is to be found in the work of George Berkeley
(1685–1753). Berkeley was an idealist who believed that all of reality
could be explained in terms of the mind alone. He invoked Occam's razor
against materialism,
stating that matter was not required by his metaphysic and was thus
eliminable. One potential problem with this belief is that it's
possible, given Berkeley's position, to find solipsism itself more in
line with the razor than a God-mediated world beyond a single thinker.
Occam's razor may also be recognized in the apocryphal story about an exchange between Pierre-Simon Laplace and Napoleon.
It is said that in praising Laplace for one of his recent publications,
the emperor asked how it was that the name of God, which featured so
frequently in the writings of Lagrange, appeared nowhere in Laplace's. Laplace is said to have replied, "It's because I had no need of that hypothesis." Though some point to this story as illustrating Laplace's atheism, more careful consideration suggests that he may instead have intended merely to illustrate the power of methodological naturalism, or even simply that the fewer logical premises one assumes, the stronger is one's conclusion.
In his article "Sensations and Brain Processes" (1959), J. J. C. Smart invoked Occam's razor with the aim to justify his preference of the mind-brain identity theory over spirit-body dualism.
Dualists state that there are two kinds of substances in the universe:
physical (including the body) and spiritual, which is non-physical. In
contrast, identity theorists state that everything is physical,
including consciousness, and that there is nothing nonphysical. Though
it is impossible to appreciate the spiritual when limiting oneself to
the physical, Smart maintained that identity theory explains all
phenomena by assuming only a physical reality. Subsequently, Smart has
been severely criticized for his use (or misuse) of Occam's razor and
ultimately retracted his advocacy of it in this context. Paul Churchland
(1984) states that by itself Occam's razor is inconclusive regarding
duality. In a similar way, Dale Jacquette (1994) stated that Occam's
razor has been used in attempts to justify eliminativism and
reductionism in the philosophy of mind. Eliminativism is the thesis that
the ontology of folk psychology, including such entities as "pain", "joy",
"desire", and "fear", is eliminable in favor of the ontology of a completed
neuroscience.
Penal ethics
In penal theory and the philosophy of punishment, parsimony refers specifically to taking care in the distribution of punishment in order to avoid excessive punishment. In the utilitarian approach to the philosophy of punishment, Jeremy Bentham's
"parsimony principle" states that any punishment greater than is
required to achieve its end is unjust. The concept is related but not
identical to the legal concept of proportionality. Parsimony is a key consideration of modern restorative justice, and is a component of utilitarian approaches to punishment, as well as of the prison abolition movement. Bentham believed that true parsimony would require punishment to be individualised to take account of the sensibility
of the individual—an individual more sensitive to punishment should be
given a proportionately lesser one, since otherwise needless pain would
be inflicted. Later utilitarian writers have tended to abandon this
idea, in large part due to the impracticality of determining each
alleged criminal's relative sensitivity to specific punishments.
Probability theory and statistics
There are various papers in scholarly journals deriving formal versions of Occam's razor from probability theory, applying it in statistical inference, and using it to derive criteria for penalizing complexity. Papers have suggested a connection between Occam's razor and Kolmogorov complexity.
One of the problems with the original formulation of the razor is
that it only applies to models with the same explanatory power (i.e.,
it only tells us to prefer the simplest of equally good models). A more
general form of the razor can be derived from Bayesian model comparison,
which is based on Bayes factors
and can be used to compare models that don't fit the observations
equally well. These methods can sometimes optimally balance the
complexity and power of a model. Generally, the exact Occam factor is
intractable, but approximations such as Akaike information criterion, Bayesian information criterion, Variational Bayesian methods, false discovery rate, and Laplace's method are used. Many artificial intelligence researchers are now employing such techniques, for instance through work on Occam Learning or more generally on the Free energy principle.
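As a toy illustration of Bayes-factor model comparison and the built-in "Occam factor", the following sketch compares a zero-parameter fair-coin model against a one-parameter biased-coin model; the coin-flip counts are hypothetical and chosen only for the example.

```python
# A minimal sketch of Bayesian model comparison via a Bayes factor, with
# hypothetical coin-flip data. Model M0 fixes the bias at 0.5; model M1 puts
# a uniform prior on the bias. Integrating over that prior penalizes M1's
# extra flexibility automatically, so M1 only wins when the data demand it.
from math import log, lgamma, exp

def log_evidence_fair(heads, flips):
    """log P(sequence | M0): every sequence has probability 0.5**flips."""
    return flips * log(0.5)

def log_evidence_uniform(heads, flips):
    """log P(sequence | M1) = log Beta(heads+1, tails+1), the marginal
    likelihood under a Uniform(0, 1) prior on the bias."""
    tails = flips - heads
    return lgamma(heads + 1) + lgamma(tails + 1) - lgamma(flips + 2)

for heads, flips in [(52, 100), (75, 100)]:
    log_bf = log_evidence_fair(heads, flips) - log_evidence_uniform(heads, flips)
    winner = "fair coin (M0)" if log_bf > 0 else "biased coin (M1)"
    print(f"{heads}/{flips} heads: Bayes factor B01 = {exp(log_bf):.3g} -> {winner}")
# 52/100 heads: the simpler fair-coin model is favored.
# 75/100 heads: the evidence overwhelms the penalty and the biased model wins.
```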
Statistical versions of Occam's razor have a more rigorous
formulation than what philosophical discussions produce. In particular,
they must have a specific definition of the term simplicity, and that definition can vary. For example, in the Kolmogorov–Chaitin minimum description length approach, the subject must pick a Turing machine whose operations describe the basic operations believed by the subject to represent "simplicity". However, one could always
choose a Turing machine with a simple operation that happened to
construct one's entire theory and would hence score highly under the
razor. This has led to two opposing camps: one that believes Occam's
razor is objective, and one that believes it is subjective.
Objective razor
The minimum instruction set of a universal Turing machine requires approximately the same length description across different formulations, and is small compared to the Kolmogorov complexity of most practical theories. Marcus Hutter
has used this consistency to define a "natural" Turing machine of small
size as the proper basis for excluding arbitrarily complex instruction
sets in the formulation of razors.
Describing the program for the universal Turing machine as the "hypothesis",
and the representation of the evidence as program data, it has been
formally proven under Zermelo–Fraenkel set theory
that "the sum of the log universal probability of the model plus the
log of the probability of the data given the model should be minimized."
Interpreting this as minimising the total length of a two-part message
encoding model followed by data given model gives us the minimum message length (MML) principle.
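A stripped-down sketch of the two-part message idea follows; it is not the full MML machinery (which derives an optimal precision for stating parameters), and the data and the 7-bit parameter code below are hypothetical choices made only for illustration.

```python
# A minimal sketch of the two-part message idea: total length in bits =
# length(model) + length(data | model). The model whose total message is
# shortest is preferred.
from math import log2

def data_length_bits(bits, p):
    """Bits needed to encode the sequence under a Bernoulli(p) model
    with an ideal (Shannon) code: -log2 of its probability."""
    return sum(-log2(p if b == 1 else 1.0 - p) for b in bits)

def two_part_length(bits, p, param_bits):
    """Part 1: state the model (its parameter to `param_bits` precision);
    Part 2: encode the data assuming that model."""
    return param_bits + data_length_bits(bits, p)

# Hypothetical data: 100 bits with 75 ones.
data = [1] * 75 + [0] * 25

fair = two_part_length(data, p=0.5, param_bits=0)      # no parameter to state
fitted = two_part_length(data, p=0.75, param_bits=7)   # ~7 bits to state p = 0.75

print(f"fair-coin message:   {fair:.1f} bits")
print(f"fitted-bias message: {fitted:.1f} bits")
# Here the fitted model wins: its savings in part 2 more than pay for the
# cost of stating the parameter in part 1, so the shorter total message
# picks out the more complex hypothesis only because the data warrant it.
```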
One possible conclusion from mixing the concepts of Kolmogorov
complexity and Occam's razor is that an ideal data compressor would also
be a scientific explanation/formulation generator. Some attempts have
been made to re-derive known laws from considerations of simplicity or
compressibility.
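The following sketch gestures at that intuition using an off-the-shelf compressor as a very crude stand-in for Kolmogorov complexity (which is uncomputable): data produced by a simple rule compress far better than patternless data. The sequences are hypothetical examples, not physical measurements, and zlib is only a proxy for an ideal compressor.

```python
# A crude illustration of the compression intuition: rule-generated data
# admit a much shorter description than patternless data of the same size.
import zlib, os

# "Law-governed" data: the squares of the first 3000 integers, as text.
lawful = ",".join(str(n * n) for n in range(3000)).encode()

# Patternless data of the same length: uniformly random bytes.
noise = os.urandom(len(lawful))

for name, blob in [("lawful", lawful), ("noise", noise)]:
    ratio = len(zlib.compress(blob, 9)) / len(blob)
    print(f"{name}: {len(blob)} bytes -> compressed to {ratio:.0%} of original")
# The rule-generated sequence compresses substantially, while the random
# bytes barely compress at all (the ratio stays near 100%); on the
# compression reading of the razor, the rule is the "simpler" hypothesis.
```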
According to Jürgen Schmidhuber, the appropriate mathematical theory of Occam's razor already exists, namely, Solomonoff's theory of optimal inductive inference and its extensions. See discussions in David L. Dowe's "Foreword re C. S. Wallace" for the subtle distinctions between the algorithmic probability work of Solomonoff and the MML work of Chris Wallace, and see Dowe's "MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness"
both for such discussions and for (in section 4) discussions of MML and
Occam's razor. For a specific example of MML as Occam's razor in the
problem of decision tree induction, see Dowe and Needham's "Message
Length as an Effective Ockham's Razor in Decision Tree Induction".
Controversial aspects of the razor
Occam's
razor is not an embargo against the positing of any kind of entity, or a
recommendation of the simplest theory come what may.
Occam's razor is used to adjudicate between theories that have already
passed "theoretical scrutiny" tests and are equally well-supported by
evidence.
Furthermore, it may be used to prioritize empirical testing of two
equally plausible but unequally testable hypotheses, thereby minimizing
costs and waste while increasing the chance of falsifying the
simpler-to-test hypothesis.
Another contentious aspect of the razor is that a theory can become more complex in terms of its structure (or syntax), while its ontology (or semantics) becomes simpler, or vice versa.
Quine, in a discussion on definition, referred to these two
perspectives as "economy of practical expression" and "economy in
grammar and vocabulary", respectively.
Galileo Galilei lampooned the misuse of Occam's razor in his Dialogue.
The principle is represented in the dialogue by Simplicio. The telling
point that Galileo presented ironically was that if one really wanted to
start from a small number of entities, one could always consider the
letters of the alphabet as the fundamental entities, since one could
construct the whole of human knowledge out of them.
Anti-razors
Occam's razor has met some opposition from people who have considered it too extreme or rash. Walter Chatton
(c. 1290–1343) was a contemporary of William of Ockham (c. 1287–1347)
who took exception to Occam's razor and Ockham's use of it. In response
he devised his own anti-razor: "If three things are not enough to
verify an affirmative proposition about things, a fourth must be added,
and so on." Although there have been a number of philosophers who have
formulated similar anti-razors since Chatton's time, no one anti-razor
has perpetuated in as much notability as Chatton's anti-razor, although
this could be the case of the Late Renaissance Italian motto of unknown
attribution Se non è vero, è ben trovato ("Even if it is not true, it is well conceived") when referred to a particularly artful explanation.
Anti-razors have also been created by Gottfried Wilhelm Leibniz (1646–1716), Immanuel Kant (1724–1804), and Karl Menger (1902–1985). Leibniz's version took the form of a principle of plenitude, as Arthur Lovejoy
has called it: the idea being that God created the most varied and
populous of possible worlds. Kant felt a need to moderate the effects of
Occam's razor and thus created his own counter-razor: "The variety of
beings should not rashly be diminished."
Karl Menger found mathematicians to be too parsimonious with
regard to variables, so he formulated his Law Against Miserliness, which
took one of two forms: "Entities must not be reduced to the point of
inadequacy" and "It is vain to do with fewer what requires more." A less
serious but (some might say) even more extremist anti-razor is 'Pataphysics, the "science of imaginary solutions" developed by Alfred Jarry
(1873–1907). Perhaps the ultimate in anti-reductionism, "'Pataphysics
seeks no less than to view each event in the universe as completely
unique, subject to no laws but its own." Variations on this theme were
subsequently explored by the Argentine writer Jorge Luis Borges in his story/mock-essay "Tlön, Uqbar, Orbis Tertius". There is also Crabtree's Bludgeon,
which cynically states that "[n]o set of mutually inconsistent
observations can exist for which some human intellect cannot conceive a
coherent explanation, however complicated."