The first, 1704, edition of Opticks: or, a treatise of the reflexions, refractions, inflexions and colours of light.
Opticks: or, A Treatise of the Reflexions, Refractions, Inflexions and Colours of Light is a book by English natural philosopher Isaac Newton that was published in English in 1704. (A scholarly Latin translation appeared in 1706.) The book analyzes the fundamental nature of light by means of the refraction of light with prisms and lenses, the diffraction of light by closely spaced sheets of glass, and the behaviour of color mixtures with spectral lights or pigment powders. It is considered one of the great works of science in history. Opticks was Newton's second major book on physical science. Newton's name did not appear on the title page of the first edition of Opticks.
Overview
The publication of Opticks represented a major contribution to science, different from but in some ways rivalling the Principia. Opticks is largely a record of experiments and the deductions made from them, covering a wide range of topics in what was later to be known as physical optics. That is, this work is not a geometric discussion of catoptrics or dioptrics, the traditional subjects of reflection of light by mirrors of different shapes and the exploration of how light is "bent" as it passes from one medium, such as air, into another, such as water or glass. Rather, the Opticks is a study of the nature of light and colour and the various phenomena of diffraction, which Newton called the "inflexion" of light.
In this book Newton sets forth in full his experiments, first reported to the Royal Society of London in 1672, on dispersion, or the separation of light into a spectrum of its component colours. He demonstrates how the appearance of color arises from selective absorption, reflection, or transmission of the various component parts of the incident light.
The major significance of Newton's work is that it overturned the dogma, attributed to Aristotle or Theophrastus
and accepted by scholars in Newton's time, that "pure" light (such as
the light attributed to the Sun) is fundamentally white or colourless,
and is altered into color by mixture with darkness caused by
interactions with matter. Newton showed just the opposite was true:
light is composed of different spectral hues (he describes seven — red,
orange, yellow, green, blue, indigo and violet), and all colours,
including white, are formed by various mixtures of these hues. He
demonstrates that color arises from a physical property of light — each
hue is refracted at a characteristic angle by a prism or lens — but he
clearly states that color is a sensation within the mind and not an
inherent property of material objects or of light itself. For example,
he demonstrates that a red violet (magenta) color can be mixed by
overlapping the red and violet ends of two spectra, although this color
does not appear in the spectrum and therefore is not a "color of light".
By connecting the red and violet ends of the spectrum, he organised all
colours as a color circle that both quantitatively predicts color mixtures and qualitatively describes the perceived similarity among hues.
Opticks and the Principia
Opticks differs in many respects from the Principia. It was first published in English rather than in the Latin
used by European philosophers, contributing to the development of a
vernacular science literature. This marks a significant transition in
the history of the English Language. With Britain's growing confidence
and world influence, due at least in part to people like Newton, the
English language was rapidly becoming the language of science and
business. The book is a model of popular science exposition: although
Newton's English is somewhat dated—he shows a fondness for lengthy
sentences with many embedded qualifications—the book can still be easily
understood by a modern reader. In contrast, few readers of Newton's
time found the Principia accessible or even comprehensible. Newton's formal but flexible style admits colloquialisms and metaphorical word choices.
Unlike the Principia, Opticks is not developed using the geometric convention of propositions proved by deduction from either previous propositions, lemmas or first principles (or axioms).
Instead, axioms define the meaning of technical terms or fundamental
properties of matter and light, and the stated propositions are
demonstrated by means of specific, carefully described experiments. The
first sentence of the book declares, "My Design in this Book is not to explain the Properties of Light by Hypotheses, but to propose and prove them by Reason and Experiments." In an Experimentum crucis
or "critical experiment" (Book I, Part II, Theorem ii), Newton showed
that the color of light corresponded to its "degree of refrangibility"
(angle of refraction), and that this angle cannot be changed by
additional reflection or refraction or by passing the light through a
coloured filter.
The work is a vade mecum
of the experimenter's art, displaying in many examples how to use
observation to propose factual generalisations about the physical world
and then exclude competing explanations by specific experimental tests.
However, unlike the Principia, which vowed Hypotheses non fingo or "I frame no hypotheses" outside the deductive method, the Opticks
develops conjectures about light that go beyond the experimental
evidence: for example, that the physical behaviour of light was due to its "corpuscular" nature as small particles, or that perceived colours were harmonically proportioned like the tones of a diatonic musical scale.
The Queries
Opticks concludes with a set of "Queries." In the first
edition, there were sixteen such Queries; that number was increased in
the Latin edition, published in 1706, and then in the revised English
edition, published in 1717/18. The first set of Queries were brief, but
the later ones became short essays, filling many pages. In the fourth
edition of 1730, there were 31 Queries, and it was the famous "31st
Query" that, over the next two hundred years, stimulated a great deal of
speculation and development on theories of chemical affinity.
These Queries, especially the later ones, deal with a wide range
of physical phenomena, far transcending any narrow interpretation of the
subject matter of "optics." They concern the nature and transmission of
heat; the possible cause of gravity; electrical phenomena; the nature
of chemical action; the way in which God created matter in "the Beginning;" the proper way to do science; and even the ethical
conduct of human beings. These Queries are not really questions in the
ordinary sense. They are almost all posed in the negative, as rhetorical questions.
That is, Newton does not ask whether light "is" or "may be" a "body."
Rather, he declares: "Is not Light a Body?" Not only does this form indicate that Newton had an answer, but also that the answer may go on for many pages. Clearly, as Stephen Hales (a firm Newtonian of the early eighteenth century) declared, this was Newton's mode of explaining "by Query."
Multiverse
Newton suggests the idea of a multiverse in this passage:
And
since Space is divisible in infinitum, and Matter is not necessarily in
all places, it may be also allow'd that God is able to create Particles
of Matter of several Sizes and Figures, and in several Proportions to
Space, and perhaps of different Densities and Forces, and thereby to
vary the Laws of Nature, and make Worlds of several sorts in several
Parts of the Universe. At least, I see nothing of Contradiction in all
this.
Reception
The Opticks
was widely read and debated in England and on the Continent. The early
presentation of the work to the Royal Society stimulated a bitter
dispute between Newton and Robert Hooke over the "corpuscular" or particle theory of light,
which prompted Newton to postpone publication of the work until after
Hooke's death in 1703. On the Continent, and in France in particular,
both the Principia and the Opticks were initially rejected
by many natural philosophers, who continued to defend Cartesian natural
philosophy and the Aristotelian version of color, and claimed to find
Newton's prism experiments difficult to replicate. Indeed, the
Aristotelian theory of the fundamental nature of white light was
defended into the 19th century, for example by the German writer Johann Wolfgang von Goethe in his Farbenlehre.
Newtonian science became a central issue in the assault waged by the philosophes in the Age of Enlightenment against a natural philosophy
based on the authority of ancient Greek or Roman naturalists or on
deductive reasoning from first principles (the method advocated by
French philosopher René Descartes), rather than on the application of mathematical reasoning to experience or experiment. Voltaire popularised Newtonian science, including the content of both the Principia and the Opticks, in his Éléments de la philosophie de Newton (1738), and after about 1750 the combination of the experimental methods exemplified by the Opticks and the mathematical methods exemplified by the Principia was established as a unified and comprehensive model of Newtonian science. Some of the primary adepts in this new philosophy were such prominent figures as Benjamin Franklin, Antoine-Laurent Lavoisier, and Joseph Black.
Subsequent to Newton, much has been amended. Young and Fresnel
combined Newton's particle theory with Huygens' wave theory to show that
colour is the visible manifestation of light's wavelength. Science also
slowly came to realise the difference between perception of colour and
mathematisable optics. The German poet Goethe, with his epic diatribe Theory of Colours,
could not shake the Newtonian foundation, but "one hole Goethe did find in Newton's armour... Newton had committed himself to the doctrine that refraction without colour was impossible. He therefore thought that the object-glasses of telescopes must for ever remain imperfect, achromatism and refraction being incompatible. This inference was proved by Dollond to be wrong." (John Tyndall, 1880)
The Bertrand paradox is a problem within the classical interpretation of probability theory. Joseph Bertrand introduced it in his work Calcul des probabilités (1889) as an example to show that probabilities may not be well defined if the mechanism or method that produces the random variable is not clearly defined.
Bertrand's formulation of the problem
The Bertrand paradox goes as follows: Consider an equilateral triangle inscribed in a circle. Suppose a chord of the circle is chosen at random. What is the probability that the chord is longer than a side of the triangle?
Bertrand gave three arguments, all apparently valid, yet yielding different results.
Random chords, selection method 1; red = longer than triangle side, blue = shorter
The "random endpoints" method: Choose two random points on the
circumference of the circle and draw the chord joining them. To
calculate the probability in question imagine the triangle rotated so
its vertex coincides with one of the chord endpoints. Observe that if
the other chord endpoint lies on the arc between the endpoints of the
triangle side opposite the first point, the chord is longer than a side
of the triangle. The length of the arc is one third of the circumference
of the circle, therefore the probability that a random chord is longer
than a side of the inscribed triangle is 1/3.
Random chords, selection method 2
The "random radius" method: Choose a radius of the circle, choose a
point on the radius and construct the chord through this point and perpendicular to the radius. To calculate the probability in question imagine the triangle rotated so a side is perpendicular
to the radius. The chord is longer than a side of the triangle if the
chosen point is nearer the center of the circle than the point where the
side of the triangle intersects the radius. The side of the triangle
bisects the radius, therefore the probability a random chord is longer
than a side of the inscribed triangle is 1/2.
Random chords, selection method 3
The "random midpoint" method: Choose a point anywhere within the circle
and construct a chord with the chosen point as its midpoint. The chord
is longer than a side of the inscribed triangle if the chosen point
falls within a concentric circle of radius 1/2
the radius of the larger circle. The area of the smaller circle is one
fourth the area of the larger circle, therefore the probability a random
chord is longer than a side of the inscribed triangle is 1/4.
As presented above, the selection methods differ in the weight they give to chords which are diameters.
In method 1, each chord can be chosen in exactly one way, regardless
of whether or not it is a diameter. In method 2, each diameter can be
chosen in two ways, whereas each other chord can be chosen in only one
way. In method 3, each choice of midpoint corresponds to a single
chord, except the center of the circle, which is the midpoint of all the
diameters. These issues can be avoided by "regularizing" the problem
so as to exclude diameters, without affecting the resulting
probabilities.
The selection methods can also be visualized as follows. A chord
which is not a diameter is uniquely identified by its midpoint. Each of
the three selection methods presented above yields a different
distribution of midpoints. Methods 1 and 2 yield two different
nonuniform distributions, while method 3 yields a uniform distribution.
On the other hand, if one looks at the images of the chords below, the
chords of method 2 give the circle a homogeneously shaded look, while
methods 1 and 3 do not.
Midpoints of the chords chosen at random using method 1
Midpoints of the chords chosen at random using method 2
Midpoints of the chords chosen at random using method 3
Chords chosen at random, method 1
Chords chosen at random, method 2
Chords chosen at random, method 3
Other distributions can easily be imagined, many of which will yield a
different proportion of chords which are longer than a side of the
inscribed triangle.
Classical solution
The problem's classical solution hinges on the method by which a
chord is chosen "at random". It turns out that if, and only if, the
method of random selection is specified, does the problem have a
well-defined solution. This is because each different method has a
different underlying distribution of chords. The three solutions
presented by Bertrand correspond to different selection methods, and in
the absence of further information there is no reason to prefer one over
another; accordingly the problem as stated has no unique solution.
An example of how to make the solution unique is to specify that
the endpoints of the chord are uniformly distributed between 0 and c, where c
is the circumference of the circle. This distribution is the same as
that in Bertrand's first argument, and the resulting unique probability
is 1/3.
This and other paradoxes of the classical interpretation of probability justified more stringent formulations, including frequentist probability and subjectivist Bayesian probability.
Jaynes's solution using the "maximum ignorance" principle
In his 1973 paper "The Well-Posed Problem", Edwin Jaynes
proposed a solution to Bertrand's paradox, based on the principle of
"maximum ignorance"—that we should not use any information that is not
given in the statement of the problem. Jaynes pointed out that
Bertrand's problem does not specify the position or size of the circle,
and argued that therefore any definite and objective solution must be
"indifferent" to size and position. In other words: the solution must
be both scale and translation invariant.
To illustrate: assume that chords are laid at random onto a
circle with a diameter of 2, for example by throwing straws onto it from
far away. Now another circle with a smaller diameter (e.g., 1.1) is
laid into the larger circle. Then the distribution of the chords on that
smaller circle needs to be the same as on the larger circle. If the
smaller circle is moved around within the larger circle, the probability
must not change either. It can be seen very easily that there would be a
change for method 3: the chord distribution on the small red circle
looks qualitatively different from the distribution on the large circle.
The same occurs for method 1, though it is harder to see in a
graphical representation. Method 2 is the only one that is both scale
invariant and translation invariant; method 3 is just scale invariant,
method 1 is neither.
However, Jaynes did not just use invariances to accept or reject
given methods: this would leave the possibility that there is another
not yet described method that would meet his common-sense criteria.
Jaynes used the integral equations describing the invariances to
directly determine the probability distribution. In this problem, the
integral equations indeed have a unique solution, and it is precisely
what was called "method 2" above, the random radius method.
In a 2015 article, Alon Drory claims that Jaynes' principle can
also yield Bertrand's other two solutions. Drory argues that the
mathematical implementation of the above invariance properties is not
unique, but depends on the underlying procedure of random selection that
one uses. He shows that each of Bertrand's three solutions can be
derived using rotational, scaling, and translational invariance,
concluding that Jaynes' principle is just as subject to interpretation
as the principle of indifference itself.
Physical experiments
"Method
2" is the only solution that fulfills the transformation invariants
that are present in certain physical systems—such as in statistical
mechanics and gas physics—as well as in Jaynes's proposed experiment of
throwing straws from a distance onto a small circle. Nevertheless, one
can design other practical experiments that give answers according to
the other methods. For example, in order to arrive at the solution of
"method 1", the random endpoints method, one can affix a spinner
to the center of the circle, and let the results of two independent
spins mark the endpoints of the chord. In order to arrive at the
solution of "method 3", one could cover the circle with molasses and
mark the first point that a fly lands on as the midpoint of the chord. Several observers have designed experiments in order to obtain the different solutions and verified the results empirically.
Recent developments
In his 2007 paper, "Bertrand’s Paradox and the Principle of Indifference",
Nicholas Shackel affirms that after more than a century the paradox
remains unresolved, and continues to stand in refutation of the principle of indifference. Also, in his 2013 paper, "Bertrand’s paradox revisited: Why Bertrand’s ‘solutions’ are all inapplicable",
Darrell P. Rowbottom shows that Bertrand’s proposed solutions are all
inapplicable to his own question, so that the paradox would be much
harder to solve than previously anticipated.
Shackel
emphasizes that two different approaches have been generally adopted so
far in trying to solve Bertrand's paradox: those where a distinction between non-equivalent problems was considered, and those where the problem was assumed to be a well-posed one. Shackel cites Louis Marinoff
as a typical representative of the distinction strategy, and Edwin Jaynes as a typical representative of the well-posing strategy.
However, in a recent work, "Solving the hard problem of Bertrand's paradox",
Diederik Aerts
and Massimiliano Sassoli de Bianchi consider that a mixed strategy is
necessary to tackle Bertrand's paradox. According to these authors, the
problem needs first to be disambiguated by specifying in a very clear
way the nature of the entity which is subjected to the randomization,
and only once this is done can the problem be considered to be a
well-posed one, in the Jaynes sense, so that the principle of maximum ignorance
can be used to solve it. To this end, and since the problem doesn't
specify how the chord has to be selected, the principle needs to be
applied not at the level of the different possible choices of a chord,
but at the much deeper level of the different possible ways of choosing
a chord. This requires the calculation of a meta average over all the
possible ways of selecting a chord, which the authors call a universal average. To handle it, they use a discretization method inspired by the way the probability law of the Wiener process is defined.
The result they obtain is in agreement with the numerical result of
Jaynes, although their well-posed problem is different from that of
Jaynes.
The Doomsday argument (DA) is a probabilistic argument that claims to predict the number of future members of the human species
given an estimate of the total number of humans born so far. Simply
put, it says that supposing that all humans are born in a random order,
chances are that any one human is born roughly in the middle.
It was first proposed in an explicit way by the astrophysicist Brandon Carter in 1983, for which reason it is sometimes called the Carter catastrophe; the argument was subsequently championed by the philosopher John A. Leslie and has since been independently discovered by J. Richard Gott and Holger Bech Nielsen. Similar principles of eschatology were proposed earlier by Heinz von Foerster, among others. A more general form was given earlier in the Lindy effect, in which for certain phenomena the future life expectancy is proportional to (though not necessarily equal to) the current age, and is based on a decreasing mortality rate over time: old things endure.
Denoting by N the total number of humans who were ever or will ever be born, the Copernican principle suggests that any one human is equally likely (along with the other N − 1 humans) to find themselves at any position n of the total population N, so we assume that our fractional position f = n/N is uniformly distributed on the interval [0, 1] prior to learning our absolute position.
The argument further assumes that f remains uniformly distributed on (0, 1] even after we learn our absolute position n. That is, for example, there is a 95% chance that f is in the interval (0.05, 1], that is f > 0.05.
In other words, we can be 95% certain that we are within the last 95% of all the humans ever to be born. If we know our absolute position n, this implies an upper bound for N obtained by rearranging n/N > 0.05 to give N < 20n.
If Leslie's figure
is used, then 60 billion humans have been born so far, so it can be
estimated that there is a 95% chance that the total number of humans N will be less than 20 × 60 billion = 1.2 trillion. Assuming that the world population stabilizes at 10 billion and a life expectancy of 80 years,
it can be estimated that the remaining 1140 billion humans will be born
in 9120 years. Depending on the projection of world population in the
forthcoming centuries, estimates may vary, but the main point of the
argument is that it is unlikely that more than 1.2 trillion humans will
ever live on Earth. This problem is similar to the famous German tank problem.
The title "Doomsday Argument" is arguably a misnomer. Its
popularity as a way of referring to this concept is perhaps based on the
widespread belief that there are more people now alive than have ever
lived, which would make the current generation of humans statistically
likely to be the last one. According to the Population Reference Bureau,
however, the number of biologically modern humans who have ever lived
and died is closer to 107 billion,
which is considerably more than the 7 billion alive today. That being
the case, the argument actually implies it is unlikely that this is the
last generation. Instead, it paints a relatively optimistic
portrait of how long humanity is likely to last, even given current
population growth. It is further worth noting that even if the argument
is accepted at face value, it does not entail extinction; humanity could instead evolve into something distinctly enough different that people
born after that point would no longer compose part of the same
reference group. For both these reasons, the invocation of "doomsday" is
misleading.
Aspects
Remarks
The step that converts N into an extinction time depends upon a finite human lifespan. If immortality
becomes common, and the birth rate drops to zero, then the human race
could continue forever even if the total number of humans N is finite.
A precise formulation of the Doomsday Argument requires the Bayesian interpretation of probability.
Even among Bayesians some of the assumptions of the argument's logic
would not be acceptable; for instance, the fact that it is applied to a
temporal phenomenon (how long something lasts) means that N's distribution simultaneously represents an "aleatory probability" (as a future event), and an "epistemic probability" (as a decided value about which we are uncertain).
The assumption that f is distributed U(0, 1] is derived from two choices, which despite being the default are also arbitrary:
The principle of indifference, so that it is as likely for any other randomly selected person to be born after you as before you.
The assumption of no 'prior' knowledge on the distribution of N.
Simplification: two possible total numbers of humans
Assume for simplicity that the total number of humans who will ever be born is 60 billion (N1), or 6,000 billion (N2). If there is no prior knowledge of the position that a currently living individual, X, has in the history of humanity, we may instead compute how many humans were born before X, and arrive at (say) 59,854,795,447, which would roughly place X amongst the first 60 billion humans who have ever lived.
Now, if we assume that the number of humans who will ever be born equals N1, the probability that X
is amongst the first 60 billion humans who have ever lived is of course
100%. However, if the number of humans who will ever be born equals N2, then the probability that X
is amongst the first 60 billion humans who have ever lived is only 1%.
Since X is in fact amongst the first 60 billion humans who have ever
lived, this means that the total number of humans who will ever be born
is more likely to be much closer to 60 billion than to 6,000 billion. In
essence the DA therefore suggests that human extinction is more likely to occur sooner rather than later.
It is possible to sum the probabilities for each value of N and therefore to compute a statistical 'confidence limit' on N. For example, taking the numbers above, it is 99% certain that N is smaller than 6,000 billion.
Note that as remarked above, this argument assumes that the prior probability for N is flat, or 50% for N1 and 50% for N2 in the absence of any information about X. On the other hand, it is possible to conclude, given X, that N2 is more likely than N1, if a different prior is used for N. More precisely, Bayes' theorem tells us that P(N|X)=P(X|N)P(N)/P(X), and the conservative application of the Copernican principle tells us only how to calculate P(X|N). Taking P(X) to be flat, we still have to make an assumption about the prior probability P(N) that the total number of humans is N. If we conclude that N2 is much more likely than N1
(for example, because producing a larger population takes more time,
increasing the chance that a low-probability but cataclysmic natural
event will take place in that time), then P(N|X) can become more heavily weighted towards the bigger value of N. A further, more detailed discussion, as well as relevant distributions P(N), are given below in the Rebuttals section.
What the argument is not
The Doomsday argument (DA) does not
say that humanity cannot or will not exist indefinitely. It does not
put any upper limit on the number of humans that will ever exist, nor
provide a date for when humanity will become extinct.
An abbreviated form of the argument does make these claims, by confusing probability with certainty. However, the actual DA's conclusion is:
There is a 95% chance of extinction within 9,120 years.
The DA gives a 5% chance that some humans will still be alive at the
end of that period. (These dates are based on the assumptions above; the
precise numbers vary among specific Doomsday arguments.)
Variations
This
argument has generated a lively philosophical debate, and no consensus
has yet emerged on its solution. The variants described below produce
the DA by separate derivations.
Gott's formulation: 'vague prior' total population
Gott specifically proposes the functional form for the prior distribution of the number of people who will ever be born (N). Gott's DA used the vague prior distribution:
P(N) = k/N
where
P(N) is the probability prior to discovering n, the total number of humans who have so far been born.
The constant, k, is chosen to normalize the sum of P(N). The value chosen isn't important here, just the functional form (this is an improper prior, so no value of k gives a valid distribution, but Bayesian inference is still possible using it.)
Since Gott specifies the prior distribution of total humans, P(N), Bayes's theorem and the principle of indifference alone give us P(N|n), the probability of N humans being born if n is a random draw from N:
P(N|n) = P(n|N) P(N) / P(n).
This is Bayes's theorem for the posterior probability of the total population ever born, N, conditioned on the population born thus far, n. Now, using the indifference principle:
P(n|N) = 1/N.
The unconditioned n distribution of the current population is identical to the vague prior N probability density function, so:
P(n) = k/n,
giving P(N|n) for each specific N (through a substitution into the posterior probability equation):
P(N|n) = n/N².
The easiest way to produce the doomsday estimate with a given confidence (say 95%) is to pretend that N is a continuous variable (since it is very large) and integrate over the probability density from N = n to N = Z. (This will give a function for the probability that N ≤ Z):
P(N ≤ Z) = (Z − n)/Z.
Defining Z = 20n gives:
P(N ≤ 20n) = (20n − n)/20n = 19/20 = 95%.
This is the simplest Bayesian derivation of the Doomsday Argument:
The chance that the total number of humans that will ever be born (N) is greater than twenty times the total that have been born is below 5%.
The use of a vague prior distribution seems well-motivated as it assumes as little knowledge as possible about N,
given that any particular function must be chosen. It is equivalent to
the assumption that the probability density of one's fractional position
remains uniformly distributed even after learning of one's absolute
position (n).
Gott's 'reference class' in his original 1993 paper was not the
number of births, but the number of years 'humans' had existed as a
species, which he put at 200,000. Also, Gott tried to give a 95% confidence interval between a minimum
survival time and a maximum. Because of the 2.5% chance that he gives
to underestimating the minimum he has only a 2.5% chance of
overestimating the maximum. This equates to 97.5% confidence that
extinction occurs before the upper boundary of his confidence interval.
97.5% is one chance in forty, which can be used in the integral above with Z = 40n, and n = 200,000 years:
P(N ≤ 40n) = (40n − n)/40n = 39/40 = 97.5%.
This is how Gott produces a 97.5% confidence of extinction within N ≤ 8,000,000 years. The number he quoted was the likely time remaining, N − n = 7.8 million years.
This was much higher than the temporal confidence bound produced by
counting births, because it applied the principle of indifference to
time. (Producing different estimates by sampling different parameters in
the same hypothesis is Bertrand's paradox.)
His choice of 95% confidence bounds (rather than 80% or 99.9%, say) matched the scientifically accepted limit of statistical significance for hypothesis rejection. Therefore, he argued that the hypothesis: "humanity will cease to exist before 5,100 years or thrive beyond 7.8 million years" can be rejected.
Leslie's argument differs from Gott's version in that he does not assume a vague prior probability distribution for N.
Instead he argues that the force of the Doomsday Argument resides
purely in the increased probability of an early Doomsday once you take
into account your birth position, regardless of your prior probability
distribution for N. He calls this the probability shift.
Heinz von Foerster
argued that humanity's abilities to construct societies, civilizations
and technologies do not result in self inhibition. Rather, societies'
success varies directly with population size. Von Foerster found that
this model fit some 25 data points from the birth of Jesus to 1958, with only 7% of the variance left unexplained. Several follow-up letters (1961, 1962, …) were published in Science
showing that von Foerster's equation was still on track. The data
continued to fit up until 1973. The most remarkable thing about von Foerster's model was that it predicted the human population would reach infinity, or a mathematical singularity, on Friday, November 13, 2026. In
fact, von Foerster did not imply that the world population on that day
could actually become infinite. The real implication was that the world
population growth pattern followed for many centuries prior to 1960 was
about to come to an end and be transformed into a radically different
pattern. Note that this prediction began to be fulfilled only a few years after the "Doomsday" paper was published.
Reference classes
One of the major areas of Doomsday Argument debate is the reference class from which n is drawn, and of which N is the ultimate size. The 'standard' Doomsday Argument hypothesis
doesn't spend very much time on this point, and simply says that the
reference class is the number of 'humans'. Given that you are human, the
Copernican principle could be applied to ask if you were born unusually
early, but the grouping of 'human' has been widely challenged on practical and philosophical grounds. Nick Bostrom has argued that consciousness is (part of) the discriminator between what is in and what is out of the reference class, and that extraterrestrial intelligences might affect the calculation dramatically.
The following sub-sections relate to different suggested
reference classes, each of which has had the standard Doomsday Argument
applied to it.
Sampling only WMD-era humans
The Doomsday clock shows the expected time to nuclear doomsday by the judgment of an expert board,
rather than a Bayesian model. If the twelve hours of the clock
symbolize the lifespan of the human species, its current time of 23:58 implies that we are among the last 1% of people who will ever be born (i.e., that n > 0.99N). J. Richard Gott's
temporal version of the Doomsday argument (DA) would require very
strong prior evidence to overcome the improbability of being born in
such a special time.
If the clock's doomsday estimate is correct, there is less than 1
chance in 100 of seeing it show such a late time in human history, if
observed at a random time within that history.
The scientists' warning can be reconciled with the DA, however. The Doomsday clock specifically estimates the proximity of atomic self-destruction—which has only been possible for about seventy years.
If doomsday requires nuclear weaponry then the Doomsday Argument
'reference class' is people contemporaneous with nuclear weapons. In
this model, the number of people living through, or born after Hiroshima is n, and the number of people who ever will is N. Applying Gott's DA to these variable definitions gives a 50% chance of doomsday within 50 years.
"In this model, the clock's hands are so close to midnight because a condition
of doomsday is living post-1945, a condition which applies now but not
to the earlier 11 hours and 53 minutes of the clock's metaphorical human
'day'."
If your life is randomly selected from all lives lived under the
shadow of the bomb, this simple model gives a 95% chance of doomsday
within 1000 years.
The scientists' more recent practice of moving the clock forward to warn of the dangers posed by global warming muddles this reasoning, however.
SSSA: Sampling from observer-moments
Nick Bostrom, considering observation selection effects, has produced a Self-Sampling Assumption (SSA): "that you should think of yourself as if you were a random observer from a suitable reference class". If the 'reference class' is the set of humans to ever be born, this gives N < 20n with 95% confidence (the standard Doomsday argument). However, he has refined this idea to apply to observer-moments rather than just observers. He has formalized this as:
The Strong Self-Sampling Assumption (SSSA): Each
observer-moment should reason as if it were randomly selected from the
class of all observer-moments in its reference class.
If the minute in which you read this article is randomly selected
from every minute in every human's lifespan then (with 95% confidence)
this event has occurred after the first 5% of human observer-moments. If
the mean lifespan in the future is twice the historic mean lifespan,
this implies 95% confidence that N < 10n (the average
future human will account for twice the observer-moments of the average
historic human). Therefore, the 95th percentile extinction-time estimate
in this version is 4560 years.
Rebuttals
We are in the earliest 5%, a priori
If one agrees with the statistical methods, still disagreeing with the Doomsday argument (DA) implies that:
The current generation of humans are within the first 5% of humans to be born.
This is not purely a coincidence.
Therefore, these rebuttals try to give reasons for believing that the currently living humans are some of the earliest beings.
For instance, if one is a member of 50,000 people in a
collaborative project, the Doomsday Argument implies a 95% chance that
there will never be more than a million members of that project. This
can be refuted if one's other characteristics are typical of the early adopter.
The mainstream of potential users will prefer to be involved when the
project is nearly complete. If one were to enjoy the project's
incompleteness, it is already known that he or she is unusual, prior to
the discovery of his or her early involvement.
If one has measurable attributes that set one apart from the typical long-run user, the project DA can be refuted based on the fact that one could expect to be within the first 5% of members, a priori. The analogy to the total-human-population form of the argument is: confidence in a prediction of the distribution of human characteristics that places modern and historic humans outside the mainstream implies that it is already known, before examining n, that we are likely to be very early in N.
For example, if one is certain that 99% of humans who will ever live will be cyborgs,
but that only a negligible fraction of humans who have been born to
date are cyborgs, one could be equally certain that at least one hundred
times as many people remain to be born as have been.
Robin Hanson's paper sums up these criticisms of the DA:
"All else is not equal; we have good reasons for thinking we are not randomly selected humans from all who will ever live."
Drawbacks of this rebuttal:
The question of how the confident prediction is derived: an uncannily prescient picture of humanity's statistical distribution through all time is needed before we can pronounce ourselves extreme members of that population. (In contrast, project pioneers have a clearly distinct psychology from the mainstream.)
If the majority of humans have characteristics that they do not
share, some would argue that this is equivalent to the Doomsday
argument, since people similar to those observing these matters will become extinct.
Critique: Human extinction is distant, a posteriori
The a posteriori observation that extinction-level events are rare could be offered as evidence that the DA's predictions are implausible; typically, extinctions of a dominant species happen less often than once in a million years. Therefore, it is argued that human extinction is unlikely within the next ten millennia. (This is another probabilistic argument, drawing a different conclusion from the DA.)
In Bayesian terms, this response to the DA says that our
knowledge of history (or ability to prevent disaster) produces a prior
marginal for N with a minimum value in the trillions. If N is distributed uniformly from 10^12 to 10^13, for example, then the probability of N < 1,200 billion inferred from n = 60 billion will be extremely small. This is an equally impeccable Bayesian calculation, rejecting the Copernican principle
on the grounds that we must be 'special observers' since there is no
likely mechanism for humanity to go extinct within the next hundred
thousand years.
This response is accused of overlooking the technological threats to humanity's survival, to which earlier life was not subject, and is specifically rejected by most of the DA's academic critics (arguably excepting Robin Hanson).
In fact, many futurologists believe the empirical situation is worse than Gott's DA estimate. For instance, Sir Martin Rees believes that the technological dangers give an estimated human survival duration of ninety-five years (with 50% confidence.) Earlier prophets made similar predictions and were 'proven' wrong (e.g., on surviving the nuclear arms race). It is possible that their estimates were accurate, and that their common image as alarmists is a survivorship bias.
The prior N distribution may make n very uninformative
Here, c and q are constants. If q is large, then our 95% confidence upper bound is on the uniform draw, not the exponential value of N.
The best way to compare this with Gott's Bayesian argument is to
flatten the distribution from the vague prior by having the probability
fall off more slowly with N (than inverse proportionally). This
corresponds to the idea that humanity's growth may be exponential in
time with doomsday having a vague prior pdf in time. This would mean that N, the last birth, would have a distribution looking like the following:
Pr(N) = k/N^α.
This prior N distribution is all that is required (with the principle of indifference) to produce the inference of N from n, and this is done in an identical way to the standard case, as described by Gott (equivalent to α = 1 in this distribution). Substituting into the posterior probability equation:
Pr(N|n) = α n^α / N^(α+1).
Integrating the probability of any N above xn:
Pr(N > xn) = 1/x^α.
For example, if x = 20 and α = 0.5, this becomes:
Pr(N > 20n) = 1/√20 ≈ 22%.
Therefore, with this prior, the chance of a trillion births is well
over 20%, rather than the 5% chance given by the standard DA. If α is reduced further by assuming a flatter prior N distribution, then the limits on N given by n become weaker. An α of one reproduces Gott's calculation with a birth reference class, and an α of around 0.5 could approximate his temporal confidence interval calculation (if the population were expanding exponentially). As α gets smaller, n becomes less and less informative about N. In the limit this distribution approaches an (unbounded) uniform distribution, where all values of N are equally likely. This is Page et al.'s "Assumption 3", which they find few reasons to reject, a priori. (Although all distributions with α ≤ 1 are improper priors, this applies to Gott's vague-prior distribution also, and they can all be converted to produce proper integrals by postulating a finite upper population limit.) Since the probability of reaching a population of size 2N is usually thought of as the chance of reaching N multiplied by the survival probability from N to 2N, it seems that Pr(N) must be a monotonically decreasing function of N, but this doesn't necessarily require an inverse proportionality.
A prior distribution with a very low α parameter makes the DA's ability to constrain the ultimate size of humanity very weak.
Infinite expectation
Another objection to the Doomsday Argument is that the expected total human population is actually infinite. The calculation is as follows:
The total human population N = n/f, where n is the human population to date and f is our fractional position in the total.
We assume that f is uniformly distributed on (0,1].
The expectation of N is E(N) = ∫ (n/f) df taken over f from 0 to 1, which diverges, so the expected total population is infinite.
Self-Indication Assumption: The possibility of not existing at all
One objection is that the possibility of your existing at all depends on how many humans will ever exist (N).
If this is a high number, then the possibility of your existing is
higher than if only a few humans will ever exist. Since you do indeed
exist, this is evidence that the number of humans that will ever exist
is high.
This objection, originally by Dennis Dieks (1992), is now known by Nick Bostrom's name for it: the "Self-Indication Assumption objection". It can be shown that some SIAs prevent any inference of N from n (the current population).
Carlton M. Caves gives a number of examples to argue that Gott's rule is
implausible. For instance, he says, imagine stumbling into a birthday
party, about which you know nothing:
Your friendly enquiry about the age of the celebrant elicits the reply that she is celebrating her (tp = )
50th birthday. According to Gott, you can predict with 95% confidence
that the woman will survive between [50]/39 = 1.28 years and 39[×50] =
1,950 years into the future. Since the wide range encompasses reasonable
expectations regarding the woman's survival, it might not seem so bad,
till one realizes that [Gott's rule] predicts that with probability 1/2
the woman will survive beyond 100 years old and with probability 1/3
beyond 150. Few of us would want to bet on the woman's survival using
Gott's rule.
Although this example exposes a weakness in J. Richard Gott's
"Copernicus method" DA (that he does not specify when the "Copernicus
method" can be applied) it is not precisely analogous with the modern DA; epistemological refinements of Gott's argument by philosophers such as Nick Bostrom specify that:
Knowing the absolute birth rank (n) must give no information on the total population (N).
Careful DA variants specified with this rule aren't shown implausible by Caves' "Old Lady" example above, because the woman's age is given prior to the estimate of her lifespan. Since human age gives an estimate of survival time (via actuarial tables), Caves' birthday party age-estimate could not fall into the class of DA problems defined with this proviso.
To produce a comparable "Birthday party example" of the carefully
specified Bayesian DA we would need to completely exclude all prior
knowledge of likely human life spans; in principle this could be done
(e.g., by a hypothetical amnesia chamber).
However, this would remove the modified example from everyday
experience. To keep it in the everyday realm the lady's age must be hidden prior to the survival estimate being made. (Although this is no longer exactly the DA, it is much more comparable to it.)
Without knowing the lady’s age, the DA reasoning produces a rule to convert the birthday (n) into a maximum lifespan with 50% confidence (N). Gott's Copernicus method rule is simply: Prob(N < 2n) = 50%. How accurate would this estimate turn out to be? Western demographics are now fairly uniform across ages, so a random birthday (n) could be (very roughly) approximated by a U(0, M] draw where M is the maximum lifespan in the census. In this 'flat' model, everyone shares the same lifespan, so N = M. If n happens to be less than M/2, then Gott's 2n estimate of N will be under M, its true figure. The other half of the time 2n overestimates M, and in this case (the one Caves highlights in his example) the subject will die before the 2n estimate is reached. In this 'flat demographics' model Gott's 50% confidence figure is proven right 50% of the time.
Self-referencing doomsday argument rebuttal
Some philosophers have been bold enough to suggest that only people
who have contemplated the Doomsday argument (DA) belong in the reference
class 'human'. If that is the appropriate reference class, Carter defied his own prediction when he first described the argument (to the Royal Society). A member present could have argued thus:
Presently, only one person in the world understands the
Doomsday argument, so by its own logic there is a 95% chance that it is a
minor problem which will only ever interest twenty people, and I should
ignore it.
If a member did pass such a comment, it would indicate that they
understood the DA sufficiently well that in fact 2 people could be
considered to understand it, and thus there would be a 5% chance that 40
or more people would actually be interested. Also, of course, ignoring
something because you only expect a small number of people to be
interested in it is extremely short sighted—if this approach were to be
taken, nothing new would ever be explored, if we assume no a priori knowledge of the nature of interest and attentional mechanisms.
Additionally, it should be considered that because Carter did present and describe his argument, the people to whom he explained it inevitably did contemplate the DA, and so the conclusion could be drawn that in the moment of explanation Carter created the basis for his own prediction.
Conflation of future duration with total duration
Various
authors have argued that the doomsday argument rests on an incorrect
conflation of future duration with total duration. This occurs in the
specification of the two time periods as "doom soon" and "doom deferred"
which means that both periods are selected to occur after the observed value of the birth order. A rebuttal in Pisaturo (2009) argues that the Doomsday Argument relies on the equivalent of this equation:
P(HFS | Dp, X) / P(HFL | Dp, X) = [P(HFS | X) / P(HFL | X)] × [P(Dp | HTS, X) / P(Dp | HTL, X)],
where:
X = the prior information;
Dp = the data that past duration is tp;
HFS = the hypothesis that the future duration of the phenomenon will be short;
HFL = the hypothesis that the future duration of the phenomenon will be long;
HTS = the hypothesis that the total duration of the phenomenon will be short—i.e., that tt, the phenomenon’s total longevity, = tTS;
HTL = the hypothesis that the total duration of the phenomenon will be long—i.e., that tt, the phenomenon’s total longevity, = tTL, with tTL > tTS.
Pisaturo then observes:
Clearly, this is an invalid application of Bayes’ theorem, as it conflates future duration and total duration.
Pisaturo takes numerical examples based on two possible corrections
to this equation: considering only future durations, and considering
only total durations. In both cases, he concludes that the Doomsday
Argument’s claim, that there is a ‘Bayesian shift’ in favor of the
shorter future duration, is fallacious.
This argument is also echoed in O'Neill (2014).
In this work the author argues that a unidirectional "Bayesian Shift"
is an impossibility within the standard formulation of probability
theory and is contradictory to the rules of probability. As with
Pisaturo, he argues that the doomsday argument conflates future duration
with total duration by specification of doom times that occur after the
observed birth order. According to O'Neill:
The reason for the hostility to the doomsday argument and its
assertion of a "Bayesian shift" is that many people who are familiar
with probability theory are implicitly aware of the absurdity of the
claim that one can have an automatic unidirectional shift in beliefs
regardless of the actual outcome that is observed. This is an example of
the "reasoning to a foregone conclusion" that arises in certain kinds
of failures of an underlying inferential mechanism. An examination of
the inference problem used in the argument shows that this suspicion is
indeed correct and the doomsday argument is invalid. (pp. 216-217)
Mathematics-free explanation by analogy
Assume the human species is a car driver. The driver has encountered some bumps but no catastrophes, and the car (Earth)
is still road-worthy. However, insurance is required. The cosmic
insurer has not dealt with humanity before, and needs some basis on
which to calculate the premium. According to the Doomsday Argument, the
insurer merely need ask how long the car and driver have been on the
road—currently at least 40,000 years without an "accident"—and use the
response to calculate insurance based on a 50% chance that a fatal
"accident" will occur inside that time period.
Consider a hypothetical insurance company that tries to attract
drivers with long accident-free histories not because they necessarily
drive more safely than newly qualified drivers, but for statistical
reasons: the hypothetical insurer estimates that each driver looks for
insurance quotes every year, so that the time since the last accident
is an evenly distributed random sample between accidents. The chance of
being more than halfway through an evenly distributed random sample is
one-half, and (ignoring old-age effects) if the driver is more than
halfway between accidents then he is closer to his next accident than
his previous one. A driver who was accident-free for 10 years would be
quoted a very low premium for this reason, but someone should not expect
cheap insurance if he only passed his test two hours ago (equivalent to
the accident-free record of the human species in relation to 40,000
years of geological time.)
Analogy to the estimated final score of a cricket batsman
A random in-progress cricket test match is sampled for a single piece of information: the current batsman's
run tally so far. If the batsman is dismissed (rather than his team
declaring because it has enough runs), what is the chance that he will
end up with a score more than double his current total?
A rough empirical result is that the chance is half (on average).
The Doomsday argument (DA) is that even if we were completely
ignorant of the game we could make the same prediction, or profit by
offering a bet paying odds of 2-to-3 on the batsman doubling his current score.
Importantly, we can only offer the bet before the current score
is given (this is necessary because the absolute value of the current
score would give a cricket expert a lot of information about the chance
of that tally doubling). It is necessary to be ignorant of the absolute
run tally before making the prediction because this is linked to the
likely total, but if the likely total and absolute value are not linked, the survival prediction can be made after discovering the batter's current score. Analogously, the DA says that if the absolute number of humans born gives no information on the number that will be,
we can predict the species’ total number of births after discovering
that 60 billion people have ever been born: with 50% confidence it is
120 billion people, so that there is a better-than-even chance that the last human birth will occur before the 23rd century.
It is not true that the chance is half, whatever the number of runs currently scored; batting records give an empirical correlation
between reaching a given score (50 say) and reaching any other, higher
score (say 100). On the average, the chance of doubling the current
tally may be half, but the chance of reaching 100 having scored 50 is
much lower than reaching ten from five. Thus, the absolute value of the score gives information about the likely final total the batsman will reach, beyond the scale-invariant "half chance of doubling" rule.
An analogous Bayesian critique of the DA is that it somehow possessed prior
knowledge of the all-time human population distribution (total runs
scored), and that this is more significant than the finding of a low
number of births until now (a low current run count).
There are two alternative methods of making uniform draws from the current score (n):
Put the runs actually scored by the dismissed player in order, say 200 in total, and randomly choose between these scoring increments by a U(0, 200] draw.
Select a time randomly from the beginning of the match to the final dismissal.
The second sampling-scheme will include those lengthy periods of a
game where a dismissed player is replaced, during which the ‘current
batsman’ is preparing to take the field and has no runs. If people
sample based on time-of-day rather than running-score they will often
find that a new batsman has a score of zero when the total score that day was low,
but humans will rarely sample a zero if one batsman continued piling on
runs all day long. Therefore, sampling a non-zero score would tell us
something about the likely final score the current batsman will achieve.
Choosing sampling method 2 rather than method 1 would give a
different statistical link between current and final score: any non-zero
score would imply that the batsman reached a high final total,
especially if the time to replace batsman is very long. This is analogous to the SIA-DA-refutation that N's distribution should include N = 0 states, which leads to the DA having reduced predictive power (in the extreme, no power to predict N from n at all).
The Doomsday Argument as a tricky problem
Sometimes, the Doomsday Argument is presented as a probability problem using Bayes’ formula.
Hypotheses
Two hypotheses are in competition:
The theory A says that humanity will disappear in 2150,
and the theory B says that it will be much later.
Under assumption A, a tenth of humanity was alive in the year 2000, and humanity has included 50 billion individuals.
Under assumption B, one thousandth of humanity was alive in the year 2000, and humanity has included 5 trillion individuals.
The first theory seems less likely, and its a priori probability is set at 1%, while the probability of the second is logically set to 99%.
Now consider an event E, for example: "a person is part of the 5
billion people alive in the year 2000". One may ask "What is the most
likely hypothesis, if you take into account this event?" and apply
Bayes' formula:
P(A | E) = P(E | A) P(A) / [P(E | A) P(A) + P(E | B) P(B)].
According to the above figures: P(E | A) = 1/10 and P(E | B) = 1/1000.
Now with P(A) = 0.01 and P(B) = 0.99, we get:
P(A | E) = (0.1 × 0.01) / (0.1 × 0.01 + 0.001 × 0.99) ≈ 0.50.
Finally the probabilities have changed dramatically: P(A | E) ≈ 50% and P(B | E) ≈ 50%.
Because an individual was chosen randomly, the probability of the end of the world has significantly increased.
Attempted refutations
A potential refutation was provided in July 2003:
Jean-Paul Delahaye showed that Bayes' formula introduces "probabilistic
anamorphosis", and demonstrated that Bayes' formula is prone to
misleading errors made in good faith by its users. In 2011,
Philippe Gay showed that many similar problems can lead to these mistakes: each replacement of a weighted average by a simple average leads to odd results.
In 2010,[18]
Philippe Gay and Édouard Thomas described a slightly different
understanding: the formula must take into account the number of humans
involved in each case. These explanations rely on the same algebra.