Wednesday, September 27, 2023

History of special relativity

From Wikipedia, the free encyclopedia

The history of special relativity consists of many theoretical results and empirical findings obtained by Albert A. Michelson, Hendrik Lorentz, Henri Poincaré and others. It culminated in the theory of special relativity proposed by Albert Einstein and subsequent work of Max Planck, Hermann Minkowski and others.

Introduction

Although Isaac Newton based his physics on absolute time and space, he also adhered to the principle of relativity of Galileo Galilei, restating it precisely for mechanical systems. This can be stated as: as far as the laws of mechanics are concerned, all observers in inertial motion are equally privileged, and no preferred state of motion can be attributed to any particular inertial observer. However, as to electromagnetic theory and electrodynamics, during the 19th century the wave theory of light as a disturbance of a "light medium" or luminiferous aether was widely accepted, the theory reaching its most developed form in the work of James Clerk Maxwell. According to Maxwell's theory, all optical and electrical phenomena propagate through that medium, which suggested that it should be possible to experimentally determine motion relative to the aether.

The failure of any known experiment to detect motion through the aether led Hendrik Lorentz, starting in 1892, to develop a theory of electrodynamics based on an immobile luminiferous aether (about whose material constitution Lorentz did not speculate), physical length contraction, and a "local time" in which Maxwell's equations retain their form in all inertial frames of reference. Working with Lorentz's aether theory, Henri Poincaré, having earlier proposed the "relativity principle" as a general law of nature (including electrodynamics and gravitation), used this principle in 1905 to correct Lorentz's preliminary transformation formulas, resulting in an exact set of equations that are now called the Lorentz transformations. A little later in the same year Albert Einstein published his original paper on special relativity in which, again based on the relativity principle, he independently derived and radically reinterpreted the Lorentz transformations by changing the fundamental definitions of space and time intervals, while abandoning the absolute simultaneity of Galilean kinematics, thus avoiding the need for any reference to a luminiferous aether in classical electrodynamics. Subsequent work of Hermann Minkowski, in which he introduced a 4-dimensional geometric "spacetime" model for Einstein's version of special relativity, paved the way for Einstein's later development of his general theory of relativity and laid the foundations of relativistic field theories.

Aether and electrodynamics of moving bodies

Aether models and Maxwell's equations

Following the work of Thomas Young (1804) and Augustin-Jean Fresnel (1816), it was believed that light propagates as a transverse wave within an elastic medium called luminiferous aether. However, a distinction was made between optical and electrodynamical phenomena so it was necessary to create specific aether models for all phenomena. Attempts to unify those models or to create a complete mechanical description of them did not succeed, but after considerable work by many scientists, including Michael Faraday and Lord Kelvin, James Clerk Maxwell (1864) developed an accurate theory of electromagnetism by deriving a set of equations in electricity, magnetism and inductance, named Maxwell's equations. He first proposed that light was in fact undulations (electromagnetic radiation) in the same aetherial medium that is the cause of electric and magnetic phenomena. However, Maxwell's theory was unsatisfactory regarding the optics of moving bodies, and while he was able to present a complete mathematical model, he was not able to provide a coherent mechanical description of the aether.

After Heinrich Hertz in 1887 demonstrated the existence of electromagnetic waves, Maxwell's theory was widely accepted. In addition, Oliver Heaviside and Hertz further developed the theory and introduced modernized versions of Maxwell's equations. The "Maxwell–Hertz" or "Heaviside–Hertz" equations subsequently formed an important basis for the further development of electrodynamics, and Heaviside's notation is still used today. Other important contributions to Maxwell's theory were made by George FitzGerald, Joseph John Thomson, John Henry Poynting, Hendrik Lorentz, and Joseph Larmor.

Search for the aether

Regarding the relative motion and the mutual influence of matter and aether, there were two theories, neither entirely satisfactory. One was developed by Fresnel (and subsequently Lorentz). This model (Stationary Aether Theory) supposed that light propagates as a transverse wave and aether is partially dragged with a certain coefficient by matter. Based on this assumption, Fresnel was able to explain the aberration of light and many optical phenomena.
The other hypothesis was proposed by George Gabriel Stokes, who stated in 1845 that the aether was fully dragged by matter (later this view was also shared by Hertz). In this model the aether might be (by analogy with pine pitch) rigid for fast objects and fluid for slower objects. Thus the Earth could move through it fairly freely, but it would be rigid enough to transport light. Fresnel's theory was preferred because his dragging coefficient was confirmed by the Fizeau experiment in 1851, which measured the speed of light in moving liquids.

A. A. Michelson

Albert A. Michelson (1881) tried to measure the relative motion of the Earth and aether (Aether-Wind), as it was expected in Fresnel's theory, by using an interferometer. He could not determine any relative motion, so he interpreted the result as a confirmation of the thesis of Stokes. However, Lorentz (1886) showed Michelson's calculations were wrong and that he had overestimated the accuracy of the measurement. This, together with the large margin of error, made the result of Michelson's experiment inconclusive. In addition, Lorentz showed that Stokes' completely dragged aether led to contradictory consequences, and therefore he supported an aether theory similar to Fresnel's. To check Fresnel's theory again, Michelson and Edward W. Morley (1886) performed a repetition of the Fizeau experiment. Fresnel's dragging coefficient was confirmed very exactly on that occasion, and Michelson was now of the opinion that Fresnel's stationary aether theory was correct. To clarify the situation, Michelson and Morley (1887) repeated Michelson's 1881 experiment, and they substantially increased the accuracy of the measurement. However, this now famous Michelson–Morley experiment again yielded a negative result, i.e., no motion of the apparatus through the aether was detected (although the Earth's velocity is 60 km/s different in the northern winter than summer). So the physicists were confronted with two seemingly contradictory experiments: the 1886 experiment as an apparent confirmation of Fresnel's stationary aether, and the 1887 experiment as an apparent confirmation of Stokes' completely dragged aether.

A possible solution to the problem was shown by Woldemar Voigt (1887), who investigated the Doppler effect for waves propagating in an incompressible elastic medium and deduced transformation relations that left the wave equation in free space unchanged, and explained the negative result of the Michelson–Morley experiment. The Voigt transformations include the Lorentz factor for the y- and z-coordinates, and a new time variable which later was called "local time". However, Voigt's work was completely ignored by his contemporaries.
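In modern notation (a reconstruction in today's symbols, not Voigt's own), the transformations he deduced can be written as:

```latex
\begin{aligned}
x' &= x - vt, & y' &= y\,\sqrt{1 - v^2/c^2},\\
z' &= z\,\sqrt{1 - v^2/c^2}, & t' &= t - \frac{vx}{c^2}.
\end{aligned}
```

These differ from the modern Lorentz transformations only by a common scale factor of √(1 − v²/c²), which is why they also leave the free-space wave equation form-invariant.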

FitzGerald (1889) offered another explanation of the negative result of the Michelson–Morley experiment. Contrary to Voigt, he speculated that the intermolecular forces are possibly of electrical origin so that material bodies would contract in the line of motion (length contraction). This was in connection with the work of Heaviside (1887), who determined that the electrostatic fields in motion were deformed (Heaviside Ellipsoid), which leads to physically undetermined conditions at the speed of light. However, FitzGerald's idea remained widely unknown and was not discussed before Oliver Lodge published a summary of the idea in 1892. Also Lorentz (1892b) proposed length contraction independently from FitzGerald in order to explain the Michelson–Morley experiment. For plausibility reasons, Lorentz referred to the analogy of the contraction of electrostatic fields. However, even Lorentz admitted that that was not a necessary reason and length contraction consequently remained an ad hoc hypothesis.

Lorentz's theory of electrons

Hendrik Antoon Lorentz

Lorentz (1892a) set the foundations of Lorentz aether theory, by assuming the existence of electrons which he separated from the aether, and by replacing the "Maxwell–Hertz" equations by the "Maxwell–Lorentz" equations. In his model, the aether is completely motionless and, contrary to Fresnel's theory, also is not partially dragged by matter. An important consequence of this notion was that the velocity of light is totally independent of the velocity of the source. Lorentz gave no statements about the mechanical nature of the aether and the electromagnetic processes, but, rather, tried to explain the mechanical processes by electromagnetic ones and therefore created an abstract electromagnetic aether. In the framework of his theory, Lorentz calculated, like Heaviside, the contraction of the electrostatic fields. Lorentz (1895) also introduced what he called the "Theorem of Corresponding States" for terms of first order in v/c. This theorem states that a moving observer (relative to the aether) in his "fictitious" field makes the same observations as a resting observer in his "real" field. An important part of it was local time t′ = t − vx/c², which paved the way to the Lorentz transformation and which he introduced independently of Voigt. With the help of this concept, Lorentz could explain the aberration of light, the Doppler effect and the Fizeau experiment as well. However, Lorentz's local time was only an auxiliary mathematical tool to simplify the transformation from one system into another – it was Poincaré in 1900 who recognized that "local time" is actually indicated by moving clocks. Lorentz also recognized that his theory violated the principle of action and reaction, since the aether acts on matter, but matter cannot act on the immobile aether.
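In modern symbols (a reconstruction, not Lorentz's own notation), the first-order corresponding states rest on the auxiliary variables:

```latex
x' = x - vt, \qquad t' = t - \frac{vx}{c^2},
```

in which, to first order in v/c, Maxwell's equations take the same form as in the aether frame; for Lorentz the primed time was pure bookkeeping, not something a clock would read.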

A very similar model was created by Joseph Larmor (1897, 1900). Larmor was the first to put Lorentz's 1895 transformation into a form algebraically equivalent to the modern Lorentz transformations, however, he stated that his transformations preserved the form of Maxwell's equations only to second order of v/c. Lorentz later noted that these transformations did in fact preserve the form of Maxwell's equations to all orders of v/c. Larmor noticed on that occasion that length contraction was derivable from the model; furthermore, he calculated some manner of time dilation for electron orbits. Larmor specified his considerations in 1900 and 1904. Independently of Larmor, Lorentz (1899) extended his transformation for second-order terms and noted a (mathematical) time dilation effect as well.

Other physicists besides Lorentz and Larmor also tried to develop a consistent model of electrodynamics. For example, Emil Cohn (1900, 1901) created an alternative electrodynamics in which he, as one of the first, discarded the existence of the aether (at least in the previous form) and would use, like Ernst Mach, the fixed stars as a reference frame instead. Due to inconsistencies within his theory, like different light speeds in different directions, it was superseded by Lorentz's and Einstein's.

Electromagnetic mass

During his development of Maxwell's Theory, J. J. Thomson (1881) recognized that charged bodies are harder to set in motion than uncharged bodies. Electrostatic fields behave as if they add an "electromagnetic mass" to the mechanical mass of the bodies. I.e., according to Thomson, electromagnetic energy corresponds to a certain mass. This was interpreted as some form of self-inductance of the electromagnetic field. He also noticed that the mass of a body in motion is increased by a constant quantity. Thomson's work was continued and perfected by FitzGerald, Heaviside (1888), and George Frederick Charles Searle (1896, 1897). For the electromagnetic mass they gave — in modern notation — the formula m = (4/3)E/c², where m is the electromagnetic mass and E is the electromagnetic energy. Heaviside and Searle also recognized that the increase of the mass of a body is not constant and varies with its velocity. Consequently, Searle noted the impossibility of superluminal velocities, because infinite energy would be needed to exceed the speed of light. Also for Lorentz (1899), the integration of the speed-dependence of masses recognized by Thomson was especially important. He noticed that the mass not only varied due to speed, but is also dependent on the direction, and he introduced what Abraham later called "longitudinal" and "transverse" mass. (The transverse mass corresponds to what later was called relativistic mass.)
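In modern symbols the rest-frame result of Heaviside and Searle reads:

```latex
m_{\mathrm{em}} = \frac{4}{3}\,\frac{E_{\mathrm{em}}}{c^2},
```

where E_em is the energy of the electrostatic field. The factor 4/3, rather than 1, later became known as the "4/3 problem" of the classical electron model.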

Wilhelm Wien (1900) assumed (following the works of Thomson, Heaviside, and Searle) that the entire mass is of electromagnetic origin, which was formulated in the context that all forces of nature are electromagnetic ones (the "Electromagnetic World View"). Wien stated that, if it is assumed that gravitation is an electromagnetic effect too, then there has to be a proportionality between electromagnetic energy, inertial mass and gravitational mass. Henri Poincaré (1900b) found another way of combining the concepts of mass and energy. He recognized that electromagnetic energy behaves like a fictitious fluid with mass density of E/c² (or m = E/c²) and defined a fictitious electromagnetic momentum as well. However, he arrived at a radiation paradox which was fully explained by Einstein in 1905.

Walter Kaufmann (1901–1903) was the first to confirm the velocity dependence of electromagnetic mass by analyzing the ratio e/m (where e is the charge and m the mass) of cathode rays. He found that the value of e/m decreased with the speed, showing that, assuming the charge constant, the mass of the electron increased with the speed. He also believed that those experiments confirmed the assumption of Wien, that there is no "real" mechanical mass, but only the "apparent" electromagnetic mass, or in other words, the mass of all bodies is of electromagnetic origin.

Max Abraham (1902–1904), who was a supporter of the electromagnetic world view, quickly offered an explanation for Kaufmann's experiments by deriving expressions for the electromagnetic mass. Together with this concept, Abraham introduced (like Poincaré in 1900) the notion of "electromagnetic momentum" which is proportional to E/c². But unlike the fictitious quantities introduced by Poincaré, he considered it as a real physical entity. Abraham also noted (like Lorentz in 1899) that this mass also depends on the direction and coined the names "longitudinal" and "transverse" mass. In contrast to Lorentz, he did not incorporate the contraction hypothesis into his theory, and therefore his mass terms differed from those of Lorentz.

Based on the preceding work on electromagnetic mass, Friedrich Hasenöhrl suggested that part of the mass of a body (which he called apparent mass) can be thought of as radiation bouncing around a cavity. The "apparent mass" of radiation depends on the temperature (because every heated body emits radiation) and is proportional to its energy. Hasenöhrl stated that this energy-apparent-mass relation only holds as long as the body radiates, i.e., if the temperature of a body is greater than 0 K. At first he gave the expression m = (8/3)E/c² for the apparent mass; however, Abraham and Hasenöhrl himself in 1905 changed the result to m = (4/3)E/c², the same value as for the electromagnetic mass for a body at rest.

Absolute space and time

Some scientists and philosophers of science were critical of Newton's definitions of absolute space and time. Ernst Mach (1883) argued that absolute time and space are essentially metaphysical concepts and thus scientifically meaningless, and suggested that only relative motion between material bodies is a useful concept in physics. Mach argued that even effects that according to Newton depend on accelerated motion with respect to absolute space, such as rotation, could be described purely with reference to material bodies, and that the inertial effects cited by Newton in support of absolute space might instead be related purely to acceleration with respect to the fixed stars. Carl Neumann (1870) introduced a "Body alpha", which represents some sort of rigid and fixed body for defining inertial motion. Based on the definition of Neumann, Heinrich Streintz (1883) argued that in a coordinate system where gyroscopes do not measure any signs of rotation inertial motion is related to a "Fundamental body" and a "Fundamental Coordinate System". Eventually, Ludwig Lange (1885) was the first to coin the expression inertial frame of reference and "inertial time scale" as operational replacements for absolute space and time; he defined "inertial frame" as "a reference frame in which a mass point thrown from the same point in three different (non-co-planar) directions follows rectilinear paths each time it is thrown". In 1902, Henri Poincaré published a collection of essays titled Science and Hypothesis, which included: detailed philosophical discussions on the relativity of space, time, and on the conventionality of distant simultaneity; the conjecture that a violation of the relativity principle can never be detected; the possible non-existence of the aether, together with some arguments supporting the aether; and many remarks on non-Euclidean vs. Euclidean geometry.

There were also some attempts to use time as a fourth dimension. This was done as early as 1754 by Jean le Rond d'Alembert in the Encyclopédie, and by some authors in the 19th century like H. G. Wells in his novel The Time Machine (1895). In 1901 a philosophical model was developed by Menyhért Palágyi, in which space and time were only two sides of some sort of "spacetime". He used time as an imaginary fourth dimension, which he gave the form it (where i = √−1, the imaginary unit). However, Palágyi's time coordinate is not connected to the speed of light. He also rejected any connection with the existing constructions of n-dimensional spaces and non-Euclidean geometry, so his philosophical model bears little resemblance to spacetime physics as it was later developed by Minkowski.

Light constancy and the principle of relative motion

Henri Poincaré

In the second half of the 19th century, there were many attempts to develop a worldwide clock network synchronized by electrical signals. For that endeavor, the finite propagation speed of light had to be considered, because synchronization signals could travel no faster than the speed of light.

In his paper The Measure of Time (1898), Henri Poincaré described some important consequences of this process and explained that astronomers, in determining the speed of light, simply assumed that light has a constant speed and that this speed is the same in all directions. Without this postulate, it would be impossible to infer the speed of light from astronomical observations, as Ole Rømer did based on observations of the moons of Jupiter. Poincaré also noted that the propagation speed of light can be (and in practice often is) used to define simultaneity between spatially separate events:

The simultaneity of two events, or the order of their succession, the equality of two durations, are to be so defined that the enunciation of the natural laws may be as simple as possible. In other words, all these rules, all these definitions are only the fruit of an unconscious opportunism.

In some other papers (1895, 1900b), Poincaré argued that experiments like that of Michelson and Morley show the impossibility of detecting the absolute motion of matter, i.e., the relative motion of matter in relation to the aether. He called this the "principle of relative motion". In the same year, he interpreted Lorentz's local time as the result of a synchronization procedure based on light signals. He assumed that two observers who are moving in the aether synchronize their clocks by optical signals. Since they believe themselves to be at rest, they consider only the transmission time of the signals and then cross-reference their observations to examine whether their clocks are synchronous. From the point of view of an observer at rest in the aether, the clocks are not synchronous and indicate the local time t′ = t − vx/c², but the moving observers fail to recognize this because they are unaware of their movement. So, contrary to Lorentz's view, local time as defined by Poincaré can be measured and indicated by clocks. Therefore, in his recommendation of Lorentz for the Nobel Prize in 1902, Poincaré argued that Lorentz had convincingly explained the negative outcome of the aether drift experiments by inventing the "diminished" or "local" time, i.e. a time coordinate in which two events at different places could appear as simultaneous, although they are not simultaneous in reality.
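The synchronization argument can be checked numerically. The sketch below is my own illustration, not a calculation from the text: the values of v and L are arbitrary, and the one-way signal is a simplification of Poincaré's round-trip procedure.

```python
# Two clocks a distance L apart, co-moving at speed v through a stationary
# aether, are synchronized by a one-way light signal while the observers
# assume (wrongly, on the aether picture) that they are at rest.
c = 299_792_458.0   # speed of light, m/s
v = 30_000.0        # Earth's orbital speed, m/s (illustrative choice)
L = 1_000.0         # separation of the clocks, m (illustrative choice)

# In the aether frame the signal must chase the receding clock B,
# so the true transit time exceeds the assumed L/c.
true_transit = L / (c - v)
assumed_transit = L / c

# B's clock therefore ends up behind aether time by this amount ...
offset = true_transit - assumed_transit

# ... which to first order in v/c is exactly Lorentz's local-time term v*L/c^2.
first_order = v * L / c**2

print(offset, first_order)
```

The relative difference between the exact offset and the first-order local-time term is of order v/c, which is why Lorentz's first-order theory sufficed for the optical experiments of the time.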

Like Poincaré, Alfred Bucherer (1903) believed in the validity of the relativity principle within the domain of electrodynamics, but contrary to Poincaré, Bucherer even assumed that this implies the nonexistence of the aether. However, the theory that he created later in 1906 was incorrect and not self-consistent, and the Lorentz transformation was absent within his theory as well.

Lorentz's 1904 model

In his paper Electromagnetic phenomena in a system moving with any velocity smaller than that of light, Lorentz (1904) was following the suggestion of Poincaré and attempted to create a formulation of electrodynamics that explains the failure of all known aether drift experiments, i.e. the validity of the relativity principle. He tried to prove the applicability of the Lorentz transformation for all orders, although he did not succeed completely. Like Wien and Abraham, he argued that there exists only electromagnetic mass, not mechanical mass, and derived the correct expression for longitudinal and transverse mass, which were in agreement with Kaufmann's experiments (even though those experiments were not precise enough to distinguish between the theories of Lorentz and Abraham). And using the electromagnetic momentum, he could explain the negative result of the Trouton–Noble experiment, in which a charged parallel-plate capacitor moving through the aether should orient itself perpendicular to the motion. Also the experiments of Rayleigh and Brace could be explained. Another important step was the postulate that the Lorentz transformation has to be valid for non-electrical forces as well.

At the same time, when Lorentz worked out his theory, Wien (1903) recognized an important consequence of the velocity dependence of mass. He argued that superluminal velocities were impossible, because that would require an infinite amount of energy — the same was already noted by Thomson (1893) and Searle (1897). And in June 1904, after he had read Lorentz's 1904 paper, he noticed the same in relation to length contraction, because at superluminal velocities the factor √(1 − v²/c²) becomes imaginary.

Lorentz's theory was criticized by Abraham, who demonstrated that on the one hand the theory obeys the relativity principle, while on the other hand it assumes the electromagnetic origin of all forces. Abraham showed that both assumptions were incompatible, because in Lorentz's theory of the contracted electrons, non-electric forces were needed in order to guarantee the stability of matter, whereas in Abraham's theory of the rigid electron no such forces were needed. Thus the question arose whether the electromagnetic conception of the world (compatible with Abraham's theory) or the relativity principle (compatible with Lorentz's theory) was correct.

In a September 1904 lecture in St. Louis named The Principles of Mathematical Physics, Poincaré drew some consequences from Lorentz's theory and defined (in modification of Galileo's Relativity Principle and Lorentz's Theorem of Corresponding States) the following principle: "The Principle of Relativity, according to which the laws of physical phenomena must be the same for a stationary observer as for one carried along in a uniform motion of translation, so that we have no means, and can have none, of determining whether or not we are being carried along in such a motion." He also specified his clock synchronization method and explained the possibility of a "new method" or "new mechanics", in which no velocity can surpass that of light for all observers. However, he critically noted that the relativity principle, Newton's action and reaction, the conservation of mass, and the conservation of energy are not fully established and are even threatened by some experiments.

Also Emil Cohn (1904) continued to develop his alternative model (as described above), and while comparing his theory with that of Lorentz, he discovered some important physical interpretations of the Lorentz transformations. He illustrated (like Joseph Larmor in the same year) this transformation by using rods and clocks: If they are at rest in the aether, they indicate the true length and time, and if they are moving, they indicate contracted and dilated values. Like Poincaré, Cohn defined local time as the time that is based on the assumption of isotropic propagation of light. Contrary to Lorentz and Poincaré, Cohn noticed that within Lorentz's theory the separation of "real" and "apparent" coordinates is artificial, because no experiment can distinguish between them. Yet according to Cohn's own theory, the Lorentz transformed quantities would only be valid for optical phenomena, while mechanical clocks would indicate the "real" time.

Poincaré's dynamics of the electron

On June 5, 1905, Henri Poincaré submitted the summary of a work which closed the existing gaps of Lorentz's work. (This short paper contained the results of a more complete work which would be published later, in January 1906.) He showed that Lorentz's equations of electrodynamics were not fully Lorentz-covariant. So he pointed out the group characteristics of the transformation, and he corrected Lorentz's formulas for the transformations of charge density and current density (which implicitly contained the relativistic velocity-addition formula, which he elaborated in May in a letter to Lorentz). Poincaré used for the first time the term "Lorentz transformation", and he gave the transformations their symmetrical form used to this day. He introduced a non-electrical binding force (the so-called "Poincaré stresses") to ensure the stability of the electrons and to explain length contraction. He also sketched a Lorentz-invariant model of gravitation (including gravitational waves) by extending the validity of Lorentz-invariance to non-electrical forces.

Eventually Poincaré (independently of Einstein) finished a substantially extended work of his June paper (the so-called "Palermo paper", received July 23, printed December 14, published January 1906). He spoke literally of "the postulate of relativity". He showed that the transformations are a consequence of the principle of least action and developed the properties of the Poincaré stresses. He demonstrated in more detail the group characteristics of the transformation, which he called the Lorentz group, and he showed that the combination x² + y² + z² − c²t² is invariant. While elaborating his gravitational theory, he said the Lorentz transformation is merely a rotation in four-dimensional space about the origin, by introducing ict as a fourth imaginary coordinate (contrary to Palágyi, he included the speed of light), and he already used four-vectors. He wrote that the discovery of magneto-cathode rays by Paul Ulrich Villard (1904) seemed to threaten the entire theory of Lorentz, but this problem was quickly solved. However, although in his philosophical writings Poincaré rejected the ideas of absolute space and time, in his physical papers he continued to refer to an (undetectable) aether. He also continued (1900b, 1904, 1906, 1908b) to describe coordinates and phenomena as local/apparent (for moving observers) and true/real (for observers at rest in the aether). So, with a few exceptions, most historians of science argue that Poincaré did not invent what is now called special relativity, although it is admitted that Poincaré anticipated much of Einstein's methods and terminology.
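In modern notation, the symmetric form Poincaré gave to the transformations, and the invariant he exhibited, can be written as:

```latex
x' = \gamma\,(x - vt), \qquad t' = \gamma\!\left(t - \frac{vx}{c^2}\right), \qquad
y' = y, \qquad z' = z, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}},
```

with the quadratic form x² + y² + z² − c²t² left unchanged; taking ict as the fourth coordinate turns the transformation into a Euclidean rotation in four dimensions, the observation Minkowski later developed into his geometric spacetime model.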

Special relativity

Einstein 1905

Electrodynamics of moving bodies

Albert Einstein, 1921

On September 26, 1905 (received June 30), Albert Einstein published his annus mirabilis paper on what is now called special relativity. Einstein's paper includes a fundamental description of the kinematics of the rigid body, and it does not require an absolutely stationary space, such as the aether. Einstein identified two fundamental principles, the principle of relativity and the principle of the constancy of light (light principle), which served as the axiomatic basis of his theory. To better understand Einstein's step, a summary of the situation before 1905, as described above, is given here (it must be remarked that Einstein was familiar with the 1895 theory of Lorentz and with Poincaré's Science and Hypothesis, but possibly not with their papers of 1904–1905):

a) Maxwell's electrodynamics, as presented by Lorentz in 1895, was the most successful theory at this time. Here, the speed of light is constant in all directions in the stationary aether and completely independent of the velocity of the source;
b) The inability to find an absolute state of motion, i.e. the validity of the relativity principle as the consequence of the negative results of all aether drift experiments and effects like the moving magnet and conductor problem which only depend on relative motion;
c) The Fizeau experiment;
d) The aberration of light;

with the following consequences for the speed of light and the theories known at that time:

  1. The speed of light is not composed of the speed of light in vacuum and the velocity of a preferred frame of reference, by b. This contradicts the theory of the (nearly) stationary aether.
  2. The speed of light is not composed of the speed of light in vacuum and the velocity of the light source, by a and c. This contradicts the emission theory.
  3. The speed of light is not composed of the speed of light in vacuum and the velocity of an aether that would be dragged within or in the vicinity of matter, by a, c, and d. This contradicts the hypothesis of the complete aether drag.
  4. The speed of light in moving media is not composed of the speed of light when the medium is at rest and the velocity of the medium, but is determined by Fresnel's dragging coefficient, by c.
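Point 4 can be illustrated numerically: to first order in v/c, the relativistic velocity-addition formula reproduces Fresnel's dragging coefficient. The sketch below is my own illustration; the refractive index and flow speed are arbitrary values.

```python
# Compare the exact relativistic composition of velocities with Fresnel's
# partial-drag prediction for light in a moving medium (the Fizeau setup).
c = 299_792_458.0   # speed of light in vacuum, m/s
n = 1.333           # refractive index of water (approximate)
v = 10.0            # speed of the flowing water, m/s (illustrative choice)

# Light speed in the medium at rest:
u_rest = c / n

# Exact relativistic composition of u_rest and the medium's velocity v:
u_exact = (u_rest + v) / (1 + u_rest * v / c**2)

# Fresnel's prediction: partial drag with coefficient f = 1 - 1/n^2.
f = 1 - 1 / n**2
u_fresnel = u_rest + f * v

print(u_exact - u_rest)     # extra speed the light gains from the flow
print(u_fresnel - u_rest)   # Fresnel's first-order prediction
```

The discrepancy between the exact composition and Fresnel's first-order formula is of order v²/c, far below what Fizeau's interferometer could resolve, which is why the 1851 experiment read as a confirmation of partial drag.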

In order to make the principle of relativity as required by Poincaré an exact law of nature in the immobile aether theory of Lorentz, the introduction of a variety of ad hoc hypotheses was required, such as the contraction hypothesis, local time, the Poincaré stresses, etc. This method was criticized by many scholars, since the assumption of a conspiracy of effects which completely prevent the discovery of the aether drift is considered to be very improbable, and it would violate Occam's razor as well. Einstein is considered the first to completely dispense with such auxiliary hypotheses and draw the direct conclusions from the facts stated above: that the relativity principle is correct and the directly observed speed of light is the same in all inertial reference frames. Based on his axiomatic approach, Einstein was able to derive all results obtained by his predecessors – and in addition the formulas for the relativistic Doppler effect and relativistic aberration – in a few pages, while prior to 1905 his competitors had devoted years of long, complicated work to arrive at the same mathematical formalism. Before 1905 Lorentz and Poincaré had adopted these same principles, as necessary to achieve their final results, but did not recognize that they were also sufficient in the sense that there was no immediate logical need to assume the existence of a stationary aether in order to arrive at the Lorentz transformations. As Lorentz later said, "Einstein simply postulates what we have deduced". Another reason for Einstein's early rejection of the aether in any form (which he later partially retracted) may have been related to his work on quantum physics. Einstein discovered that light can also be described (at least heuristically) as a kind of particle, so the aether as the medium for electromagnetic "waves" (which was highly important for Lorentz and Poincaré) no longer fitted into his conceptual scheme.

Notably, Einstein's paper contains no direct references to other papers. However, many historians of science, such as Holton, Miller, and Stachel, have tried to identify possible influences on Einstein. He stated that his thinking was influenced by the empiricist philosophers David Hume and Ernst Mach. Regarding the relativity principle, the moving magnet and conductor problem (possibly after reading a book by August Föppl) and the various negative aether drift experiments were important for him in accepting that principle – but he denied any significant influence of the most important experiment: the Michelson–Morley experiment. Other likely influences include Poincaré's Science and Hypothesis, where Poincaré presented the principle of relativity (which, as reported by Einstein's friend Maurice Solovine, was closely studied and discussed by Einstein and his friends over a period of years before the publication of Einstein's 1905 paper), and the writings of Max Abraham, from whom he borrowed the terms "Maxwell–Hertz equations" and "longitudinal and transverse mass".

Regarding his views on Electrodynamics and the Principle of the Constancy of Light, Einstein stated that Lorentz's theory of 1895 (or the Maxwell–Lorentz electrodynamics) and also the Fizeau experiment had considerable influence on his thinking. He said in 1909 and 1912 that he borrowed that principle from Lorentz's stationary aether (which implies validity of Maxwell's equations and the constancy of light in the aether frame), but he recognized that this principle together with the principle of relativity makes any reference to an aether unnecessary (at least as to the description of electrodynamics in inertial frames). As he wrote in 1907 and in later papers, the apparent contradiction between those principles can be resolved if it is admitted that Lorentz's local time is not an auxiliary quantity, but can simply be defined as time and is connected with signal velocity. Before Einstein, Poincaré also developed a similar physical interpretation of local time and noticed the connection with signal velocity, but contrary to Einstein he continued to argue that clocks at rest in the stationary aether show the true time, while clocks in inertial motion relative to the aether show only the apparent time. Eventually, near the end of his life in 1953 Einstein described the advantages of his theory over that of Lorentz as follows (although Poincaré had already stated in 1905 that Lorentz invariance is an exact condition for any physical theory):

There is no doubt, that the special theory of relativity, if we regard its development in retrospect, was ripe for discovery in 1905. Lorentz had already recognized that the transformations named after him are essential for the analysis of Maxwell's equations, and Poincaré deepened this insight still further. Concerning myself, I knew only Lorentz's important work of 1895 [...] but not Lorentz's later work, nor the consecutive investigations by Poincaré. In this sense my work of 1905 was independent. [..] The new feature of it was the realization of the fact that the bearing of the Lorentz transformation transcended its connection with Maxwell's equations and was concerned with the nature of space and time in general. A further new result was that the "Lorentz invariance" is a general condition for any physical theory. This was for me of particular importance because I had already previously found that Maxwell's theory did not account for the micro-structure of radiation and could therefore have no general validity.

Mass–energy equivalence

Already in §10 of his paper on electrodynamics, Einstein used the formula

E_kin = mc²(1/√(1 − v²/c²) − 1)

for the kinetic energy of an electron. In elaboration of this he published a paper (received September 27, published November 1905), in which Einstein showed that when a material body loses energy (either radiation or heat) of amount E, its mass decreases by the amount E/c². This led to the famous mass–energy equivalence formula: E = mc². Einstein considered the equivalence equation to be of paramount importance because it showed that a massive particle possesses an energy, the "rest energy", distinct from its classical kinetic and potential energies. As was shown above, many authors before Einstein arrived at similar formulas (including a 4/3-factor) for the relation of mass to energy. However, their work was focused on electromagnetic energy, which (as we know today) only represents a small part of the entire energy within matter. So Einstein was the first to: (a) ascribe this relation to all forms of energy, and (b) understand the connection of mass–energy equivalence with the relativity principle.
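A quick numerical sketch (my own illustration, not part of the original article) of how the relativistic kinetic energy mc²(γ − 1) reduces to the Newtonian ½mv² at low speeds; the electron mass used is an assumed round value:

```python
import math

C = 299_792_458.0  # speed of light in m/s

def kinetic_energy(m, v):
    """Relativistic kinetic energy: total energy minus rest energy m*c**2."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return m * C**2 * (gamma - 1.0)

# An electron (mass taken as ~9.109e-31 kg) at one percent of light speed:
m, v = 9.109e-31, 0.01 * C
newtonian = 0.5 * m * v**2
# The ratio approaches 1 as v/c -> 0; here it exceeds 1 by about (3/4)*(v/c)**2.
print(kinetic_energy(m, v) / newtonian)
```

At v = 0.01c the two values already differ by less than 0.01 percent, which is why no pre-relativistic experiment at everyday speeds could detect the deviation.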

Early reception

First assessments

Walter Kaufmann (1905, 1906) was probably the first to refer to Einstein's work. He compared the theories of Lorentz and Einstein and, although he said Einstein's method was to be preferred, he argued that both theories are observationally equivalent. Therefore, he spoke of the relativity principle as the "Lorentz–Einsteinian" basic assumption. Shortly afterwards, Max Planck (1906a) was the first to publicly defend the theory, interesting his students Max von Laue and Kurd von Mosengeil in this formulation. He described Einstein's theory as a "generalization" of Lorentz's theory and, to this "Lorentz–Einstein theory", he gave the name "relative theory"; Alfred Bucherer later changed Planck's nomenclature into the now common "theory of relativity" ("Einsteinsche Relativitätstheorie"). On the other hand, Einstein himself and many others continued to refer simply to the new method as the "relativity principle". And in an important overview article on the relativity principle (1908a), Einstein described SR as a "union of Lorentz's theory and the relativity principle", including the fundamental assumption that Lorentz's local time can be described as real time. (Yet Poincaré's contributions were rarely mentioned in the first years after 1905.) All of these expressions (Lorentz–Einstein theory, relativity principle, relativity theory) were used by different physicists interchangeably in the following years.

Following Planck, other German physicists quickly became interested in relativity, including Arnold Sommerfeld, Wilhelm Wien, Max Born, Paul Ehrenfest, and Alfred Bucherer. Von Laue, who learned about the theory from Planck, published the first definitive monograph on relativity in 1911. By 1911, Sommerfeld altered his plan to speak about relativity at the Solvay Congress because the theory was already considered well established.

Kaufmann–Bucherer–Neumann experiments

Kaufmann (1903) presented the results of his experiments on the charge-to-mass ratio of beta rays from a radium source, showing the dependence of the mass on velocity. He announced that these results confirmed Abraham's theory. However, Lorentz (1904a) reanalyzed the results from Kaufmann (1903) against his own theory and, based on the data in the tables, concluded (p. 828) that the agreement with his theory "is seen to come out no less satisfactory than" with Abraham's theory. A recent reanalysis of the data from Kaufmann (1903) confirms that Lorentz's theory (1904a) agrees substantially better with those data than Abraham's theory. Kaufmann (1905, 1906) presented further results, this time with electrons from cathode rays. They represented, in his opinion, a clear refutation of the relativity principle and the Lorentz–Einstein theory, and a confirmation of Abraham's theory. For some years Kaufmann's experiments represented a weighty objection against the relativity principle, although they were criticized by Planck and Adolf Bestelmeyer (1906). Other physicists working with beta rays from radium, like Alfred Bucherer (1908) and Günther Neumann (1914), following on Bucherer's work and improving on his methods, also examined the velocity-dependence of mass, and this time it was thought that the "Lorentz–Einstein theory" and the relativity principle were confirmed, and Abraham's theory disproved. A distinction needs to be made between work with beta-ray electrons and cathode-ray electrons, since beta rays from radium have substantially larger velocities than cathode-ray electrons, so relativistic effects are substantially easier to detect with beta rays. Kaufmann's experiments with electrons from cathode rays showed only a qualitative mass increase of moving electrons, but they were not precise enough to distinguish between the models of Lorentz–Einstein and Abraham.
It was not until 1940 that experiments with electrons from cathode rays were repeated with sufficient accuracy to confirm the Lorentz–Einstein formula. However, this problem occurred only with this kind of experiment. Investigations of the fine structure of the hydrogen lines had already provided a clear confirmation of the Lorentz–Einstein formula, and a refutation of Abraham's theory, by 1917.

Relativistic momentum and mass

Max Planck

Planck (1906a) defined the relativistic momentum and gave the correct values for the longitudinal and transverse mass by correcting a slight mistake in the expression given by Einstein in 1905. Planck's expressions were in principle equivalent to those used by Lorentz in 1899. Based on the work of Planck, the concept of relativistic mass was developed by Gilbert Newton Lewis and Richard C. Tolman (1908, 1909), who defined mass as the ratio of momentum to velocity. The older definition of longitudinal and transverse mass, in which mass was defined as the ratio of force to acceleration, thereby became superfluous. Finally, Tolman (1912) interpreted relativistic mass simply as the mass of the body. However, many modern textbooks on relativity no longer use the concept of relativistic mass, and mass in special relativity is considered an invariant quantity.

Mass and energy

Einstein (1906) showed that the inertia of energy (mass–energy equivalence) is a necessary and sufficient condition for the conservation of the center of mass theorem. On that occasion, he noted that the formal mathematical content of Poincaré's paper on the center of mass (1900b) and his own paper were mainly the same, although the physical interpretation was different in light of relativity.

Kurd von Mosengeil (1906), by extending Hasenöhrl's calculation of black-body radiation in a cavity, derived the same expression for the additional mass of a body due to electromagnetic radiation as Hasenöhrl. Hasenöhrl's idea was that the mass of bodies included a contribution from the electromagnetic field; he imagined a body as a cavity containing light. His relationship between mass and energy, like all other pre-Einstein ones, contained incorrect numerical prefactors (see Electromagnetic mass). Eventually Planck (1907) derived the mass–energy equivalence in general within the framework of special relativity, including the binding forces within matter. He acknowledged the priority of Einstein's 1905 work on E = mc², but Planck judged his own approach as more general than Einstein's.

Experiments by Fizeau and Sagnac

As explained above, already in 1895 Lorentz succeeded in deriving Fresnel's dragging coefficient (to first order in v/c) and explaining the Fizeau experiment by using electromagnetic theory and the concept of local time. After first attempts by Jakob Laub (1907) to create a relativistic "optics of moving bodies", it was Max von Laue (1907) who derived the coefficient to all orders in v/c by using the collinear case of the relativistic velocity addition law. In addition, Laue's calculation was much simpler than the complicated methods used by Lorentz.
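Laue's derivation can be sketched numerically: applying the collinear velocity addition law to light moving at c/n in a medium of velocity v reproduces Fresnel's first-order coefficient (1 − 1/n²). The refractive index and velocity below are illustrative values of my own choosing:

```python
def add_velocities(u, v, c=1.0):
    """Relativistic collinear velocity addition: (u + v) / (1 + u*v/c**2)."""
    return (u + v) / (1.0 + u * v / c**2)

n = 1.5      # illustrative refractive index (water would be about 4/3)
v = 1e-4     # medium velocity in units of c

exact = add_velocities(1.0 / n, v)            # Laue's all-orders result
fresnel = 1.0 / n + v * (1.0 - 1.0 / n**2)    # Fresnel's first-order drag
print(abs(exact - fresnel))  # difference is only of order v**2
```

The naive Galilean sum c/n + v, by contrast, differs from the exact result at first order in v, which is exactly the discrepancy Fizeau measured.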

In 1911 Laue also discussed a situation where a beam of light on a platform is split and the two beams are made to follow a trajectory in opposite directions. On return to the point of entry the light is allowed to exit the platform in such a way that an interference pattern is obtained. Laue calculated a displacement of the interference pattern if the platform is in rotation – because the speed of light is independent of the velocity of the source, one beam covers less distance than the other. An experiment of this kind was performed by Georges Sagnac in 1913, who actually measured a displacement of the interference pattern (Sagnac effect). While Sagnac himself concluded that his results confirmed the theory of an aether at rest, Laue's earlier calculation showed that they are compatible with special relativity as well, because in both theories the speed of light is independent of the velocity of the source. This effect can be understood as the electromagnetic counterpart of the mechanics of rotation, for example in analogy to a Foucault pendulum. Already in 1909–11, Franz Harress (1912) performed an experiment which can be considered a synthesis of the experiments of Fizeau and Sagnac. He tried to measure the dragging coefficient within glass. Unlike Fizeau, he used a rotating device, so he found the same effect as Sagnac. While Harress himself misunderstood the meaning of the result, it was shown by Laue that the theoretical explanation of Harress' experiment is in accordance with the Sagnac effect. Eventually, the Michelson–Gale–Pearson experiment (1925, a variation of the Sagnac experiment) measured the angular velocity of the Earth itself in accordance with both special relativity and a resting aether.
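To first order, the time difference between the counter-propagating beams in such a rotating loop is Δt = 4AΩ/c² for enclosed area A and angular velocity Ω – the standard textbook form of the Sagnac delay, not a formula quoted in the text above; the numbers are illustrative:

```python
import math

C = 299_792_458.0  # speed of light in m/s

def sagnac_delay(area, omega):
    """First-order Sagnac time difference: 4 * A * omega / c**2."""
    return 4.0 * area * omega / C**2

# Illustrative: a loop enclosing 1 m^2, rotating once per second.
print(sagnac_delay(1.0, 2 * math.pi))  # on the order of 1e-16 seconds
```

The 1/c² suppression shows why Michelson, Gale and Pearson needed a loop enclosing an enormous area to resolve the Earth's own rotation.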

Relativity of simultaneity

The first derivations of the relativity of simultaneity by synchronization with light signals were also simplified. Daniel Frost Comstock (1910) placed an observer midway between two clocks A and B. From this observer a signal is sent to both clocks, and in the frame in which A and B are at rest, they synchronously start to run. But from the perspective of a system in which A and B are moving, clock B is first set in motion, and then clock A follows – so the clocks are not synchronized. Einstein (1917) likewise created a model with an observer midway between A and B. However, in his description two signals are sent from A and B to an observer aboard a moving train. From the perspective of the frame in which A and B are at rest, the signals are sent at the same time and the observer "is hastening towards the beam of light coming from B, whilst he is riding on ahead of the beam of light coming from A. Hence the observer will see the beam of light emitted from B earlier than he will see that emitted from A. Observers who take the railway train as their reference-body must therefore come to the conclusion that the lightning flash B took place earlier than the lightning flash A."

Spacetime physics

Minkowski's spacetime

Hermann Minkowski

Poincaré's attempt at a four-dimensional reformulation of the new mechanics was not continued by himself, so it was Hermann Minkowski (1907) who worked out the consequences of that notion (other contributions were made by Roberto Marcolongo (1906) and Richard Hargreaves (1908)). This was based on the work of many mathematicians of the 19th century, such as Arthur Cayley, Felix Klein, and William Kingdon Clifford, who contributed to group theory, invariant theory and projective geometry, formulating concepts such as the Cayley–Klein metric or the hyperboloid model, in which the interval and its invariance were defined in terms of hyperbolic geometry. Using similar methods, Minkowski succeeded in formulating a geometrical interpretation of the Lorentz transformation. He completed, for example, the concept of four-vectors; he created the Minkowski diagram for the depiction of spacetime; he was the first to use expressions like world line, proper time, and Lorentz invariance/covariance; and most notably he presented a four-dimensional formulation of electrodynamics. Like Poincaré, he tried to formulate a Lorentz-invariant law of gravity, but that work was subsequently superseded by Einstein's elaborations on gravitation.

In 1907 Minkowski named four predecessors who contributed to the formulation of the relativity principle: Lorentz, Einstein, Poincaré and Planck. And in his famous lecture Space and Time (1908) he mentioned Voigt, Lorentz and Einstein. Minkowski himself considered Einstein's theory a generalization of Lorentz's and credited Einstein for completely stating the relativity of time, but he criticized his predecessors for not fully developing the relativity of space. However, modern historians of science argue that Minkowski's claim of priority was unjustified, because Minkowski (like Wien or Abraham) adhered to the electromagnetic world picture and apparently did not fully understand the difference between Lorentz's electron theory and Einstein's kinematics. In 1908, Einstein and Laub rejected the four-dimensional electrodynamics of Minkowski as overly complicated "learned superfluousness" and published a "more elementary", non-four-dimensional derivation of the basic equations for moving bodies. But it was Minkowski's geometric model that (a) showed that special relativity is a complete and internally self-consistent theory, (b) added the Lorentz-invariant proper time interval (which accounts for the actual readings shown by moving clocks), and (c) served as a basis for the further development of relativity. Eventually, Einstein (1912) recognized the importance of Minkowski's geometric spacetime model and used it as the basis for his work on the foundations of general relativity.

Today special relativity is seen as an application of linear algebra, but at the time special relativity was being developed the field of linear algebra was still in its infancy. There were no textbooks on linear algebra as modern vector space and transformation theory, and the matrix notation of Arthur Cayley (which unifies the subject) had not yet come into widespread use. Cayley's matrix calculus notation was used by Minkowski (1908) in formulating relativistic electrodynamics, even though it was later replaced by Sommerfeld using vector notation. According to a recent source, the Lorentz transformations are equivalent to hyperbolic rotations. However, Varićak (1910) had shown that the standard Lorentz transformation is a translation in hyperbolic space.
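The hyperbolic-rotation claim is easy to verify numerically: a boost with velocity β equals a rotation through the rapidity φ = artanh β, with cosh φ = γ and sinh φ = γβ. A sketch of my own, in units with c = 1:

```python
import math

def boost(beta):
    """2x2 Lorentz boost acting on (ct, x), in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta], [-g * beta, g]]

beta = 0.6
phi = math.atanh(beta)  # rapidity
hyperbolic_rotation = [[math.cosh(phi), -math.sinh(phi)],
                       [-math.sinh(phi), math.cosh(phi)]]

# The two matrices agree entry by entry.
print(all(abs(boost(beta)[i][j] - hyperbolic_rotation[i][j]) < 1e-12
          for i in range(2) for j in range(2)))  # True
```

Because rapidities of successive collinear boosts simply add, this parametrization also turns Einstein's velocity addition formula into ordinary addition of hyperbolic angles.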

Vector notation and closed systems

Minkowski's spacetime formalism was quickly accepted and further developed. For example, Arnold Sommerfeld (1910) replaced Minkowski's matrix notation by an elegant vector notation and coined the terms "four-vector" and "six-vector". He also introduced a trigonometric formulation of the relativistic velocity addition rule, which, according to Sommerfeld, removes much of the strangeness of that concept. Other important contributions were made by Laue (1911, 1913), who used the spacetime formalism to create a relativistic theory of deformable bodies and an elementary particle theory. He extended Minkowski's expressions for electromagnetic processes to all possible forces and thereby clarified the concept of mass–energy equivalence. Laue also showed that non-electrical forces are needed to ensure the proper Lorentz transformation properties and the stability of matter – he could show that the "Poincaré stresses" (as mentioned above) are a natural consequence of relativity theory, so that the electron can be a closed system.

Lorentz transformation without second postulate

There have been some attempts to derive the Lorentz transformation without the postulate of the constancy of the speed of light. Vladimir Ignatowski (1910), for example, used for this purpose (a) the principle of relativity, (b) homogeneity and isotropy of space, and (c) the requirement of reciprocity. Philipp Frank and Hermann Rothe (1911) argued that this derivation is incomplete and needs additional assumptions. Their own calculation was based on the assumptions that: (a) the Lorentz transformations form a homogeneous linear group, (b) when changing frames, only the sign of the relative speed changes, and (c) length contraction depends solely on the relative speed. However, according to Pauli and Miller, such models were insufficient to identify the invariant speed in their transformation with the speed of light – for example, Ignatowski was forced to have recourse to electrodynamics to include the speed of light. So Pauli and others argued that both postulates are needed to derive the Lorentz transformation. Nevertheless, to this day others have continued the attempts to derive special relativity without the light postulate.

Non-Euclidean formulations without imaginary time coordinate

Minkowski in his earlier works of 1907 and 1908 followed Poincaré in representing space and time together in complex form (x, y, z, ict), emphasizing the formal similarity with Euclidean space. He noted that spacetime is in a certain sense a four-dimensional non-Euclidean manifold. Sommerfeld (1910) used Minkowski's complex representation to combine non-collinear velocities by spherical geometry and so derive Einstein's addition formula. Subsequent writers, principally Varićak, dispensed with the imaginary time coordinate and wrote in explicitly non-Euclidean (i.e., Lobachevskian) form, reformulating relativity using the concept of rapidity previously introduced by Alfred Robb (1911); Edwin Bidwell Wilson and Gilbert N. Lewis (1912) introduced a vector notation for spacetime; Émile Borel (1913) showed how parallel transport in non-Euclidean space provides the kinematic basis of Thomas precession twelve years before its discovery by Thomas; Felix Klein (1910) and Ludwik Silberstein (1914) employed such methods as well. One historian argues that the non-Euclidean style had little to show "in the way of creative power of discovery", but it offered notational advantages in some cases, particularly in the law of velocity addition. In the years before World War I, the acceptance of the non-Euclidean style was approximately equal to that of the initial spacetime formalism, and it continued to be employed in relativity textbooks of the 20th century.

Time dilation and twin paradox

Einstein (1907a) proposed a method for detecting the transverse Doppler effect as a direct consequence of time dilation. That effect was in fact measured in 1938 by Herbert E. Ives and G. R. Stilwell (Ives–Stilwell experiment). Lewis and Tolman (1909) described the reciprocity of time dilation by using two light clocks A and B, traveling with a certain relative velocity to each other. The clocks consist of two plane mirrors parallel to one another and to the line of motion. Between the mirrors a light signal bounces, and for the observer resting in the same reference frame as A, the period of clock A is the distance between the mirrors divided by the speed of light. But if the observer looks at clock B, he sees that within that clock the signal traces out a longer, angled path, so clock B is slower than A. For the observer moving alongside B, however, the situation is completely reversed: clock B is faster and A is slower. Lorentz (1910–1912) discussed the reciprocity of time dilation and analyzed a clock "paradox" which apparently occurs as a consequence of that reciprocity. Lorentz showed that there is no paradox if one considers that in one system only one clock is used, while in the other system two clocks are necessary, and the relativity of simultaneity is fully taken into account.
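The light-clock argument can be made quantitative: requiring each half-tick time t to satisfy ct = √(L² + (vt)²) stretches the period by exactly the Lorentz factor γ. A sketch of my own, with c = 1 and hypothetical names:

```python
import math

def light_clock_period_ratio(beta, L=1.0):
    """Ratio of a moving light clock's period to its rest period (c = 1).

    At rest the light covers 2*L per tick.  Seen from a frame where the
    clock moves at speed beta, each half-tick time t solves
    t = sqrt(L**2 + (beta * t)**2), i.e. t = L / sqrt(1 - beta**2).
    """
    t_half = L / math.sqrt(1.0 - beta**2)
    return (2.0 * t_half) / (2.0 * L)  # equals the Lorentz factor gamma

print(light_clock_period_ratio(0.8))  # 1 / sqrt(1 - 0.64) = 1.666...
```

Note the result is independent of L, which is why any clock, not just a light clock, must dilate by the same factor.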

Max von Laue

A similar situation was created by Paul Langevin in 1911 with what was later called the "twin paradox", where he replaced the clocks by persons (Langevin never used the word "twins" but his description contained all other features of the paradox). Langevin solved the paradox by alluding to the fact that one twin accelerates and changes direction, so Langevin could show that the symmetry is broken and the accelerated twin is younger. However, Langevin himself interpreted this as a hint as to the existence of an aether. Although Langevin's explanation is still accepted by some, his conclusions regarding the aether were not generally accepted. Laue (1913) pointed out that any acceleration can be made arbitrarily small in relation to the inertial motion of the twin, and that the real explanation is that one twin is at rest in two different inertial frames during his journey, while the other twin is at rest in a single inertial frame. Laue was also the first to analyze the situation based on Minkowski's spacetime model for special relativity – showing how the world lines of inertially moving bodies maximize the proper time elapsed between two events.

Acceleration

Einstein (1908) tried – as a preliminary in the framework of special relativity – also to include accelerated frames within the relativity principle. In the course of this attempt he recognized that for any single moment of acceleration of a body one can define an inertial reference frame in which the accelerated body is temporarily at rest. It follows that in accelerated frames defined in this way, the application of the constancy of the speed of light to define simultaneity is restricted to small localities. However, the equivalence principle that was used by Einstein in the course of that investigation, which expresses the equality of inertial and gravitational mass and the equivalence of accelerated frames and homogeneous gravitational fields, transcended the limits of special relativity and resulted in the formulation of general relativity.

Nearly simultaneously with Einstein, Minkowski (1908) considered the special case of uniform accelerations within the framework of his spacetime formalism. He recognized that the worldline of such an accelerated body corresponds to a hyperbola. This notion was further developed by Born (1909) and Sommerfeld (1910), with Born introducing the expression "hyperbolic motion". He noted that uniform acceleration can be used as an approximation for any form of acceleration within special relativity. In addition, Harry Bateman and Ebenezer Cunningham (1910) showed that Maxwell's equations are invariant under a much wider group of transformations than the Lorentz group, namely the spherical wave transformations, a form of conformal transformations. Under those transformations the equations preserve their form for some types of accelerated motion. A generally covariant formulation of electrodynamics in Minkowski space was eventually given by Friedrich Kottler (1912), whose formulation is also valid for general relativity. Concerning the further development of the description of accelerated motion in special relativity, the works by Langevin and others for rotating frames (Born coordinates), and by Wolfgang Rindler and others for uniformly accelerated frames (Rindler coordinates), must be mentioned.
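Born's "hyperbolic motion" can be checked directly: parametrizing the worldline by proper time τ as ct = (c²/a)·sinh(aτ/c), x = (c²/a)·cosh(aτ/c) gives x² − (ct)² = (c²/a)², i.e. a hyperbola in the Minkowski diagram. A sketch of my own, in units with c = 1:

```python
import math

def worldline(a, tau, c=1.0):
    """Coordinates of a body with constant proper acceleration a at proper time tau."""
    ct = (c**2 / a) * math.sinh(a * tau / c)
    x = (c**2 / a) * math.cosh(a * tau / c)
    return ct, x

a = 2.0
# The invariant x**2 - (ct)**2 stays fixed at (c**2/a)**2 = 0.25: a hyperbola.
for tau in (0.0, 0.5, 1.0, 2.0):
    ct, x = worldline(a, tau)
    print(x**2 - ct**2)  # 0.25 up to floating-point rounding
```

The invariance of x² − (ct)² along the curve is exactly what makes the worldline a hyperbola asymptotic to the light cone.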

Rigid bodies and Ehrenfest paradox

Einstein (1907b) discussed the question of whether, in rigid bodies as well as in all other cases, the velocity of information can exceed the speed of light, and explained that information could then be transmitted into the past, so causality would be violated. Since this radically contradicts all experience, superluminal velocities are thought impossible. He added that a dynamics of the rigid body must be created in the framework of SR. Eventually, Max Born (1909), in the course of his above-mentioned work concerning accelerated motion, tried to include the concept of rigid bodies in SR. However, Paul Ehrenfest (1909) showed that Born's concept leads to the so-called Ehrenfest paradox, in which, due to length contraction, the circumference of a rotating disk is shortened while the radius stays the same. This question was also considered by Gustav Herglotz (1910), Fritz Noether (1910), and von Laue (1911). Laue recognized that the classical concept is not applicable in SR, since a "rigid" body possesses infinitely many degrees of freedom. Yet, while Born's definition was not applicable to rigid bodies, it was very useful in describing rigid motions of bodies. In connection with the Ehrenfest paradox, it was also discussed (by Vladimir Varićak and others) whether length contraction is "real" or "apparent", and whether there is a difference between the dynamic contraction of Lorentz and the kinematic contraction of Einstein. However, this was rather a dispute over words because, as Einstein said, the kinematic length contraction is "apparent" for a co-moving observer, but for an observer at rest it is "real" and its consequences are measurable.

Acceptance of special relativity

Planck, in 1909, compared the implications of the modern relativity principle – he particularly referred to the relativity of time – with the revolution brought about by the Copernican system. Poincaré had made a similar analogy in 1905. An important factor in the adoption of special relativity by physicists was its development by Poincaré and Minkowski into a spacetime theory. Consequently, by about 1911, most theoretical physicists accepted special relativity. In 1912 Wilhelm Wien recommended both Lorentz (for the mathematical framework) and Einstein (for reducing it to a simple principle) for the Nobel Prize in Physics – although the Nobel committee decided not to award the prize for special relativity. Only a minority of theoretical physicists such as Abraham, Lorentz, Poincaré, or Langevin still believed in the existence of an aether. Einstein later (1918–1920) qualified his position by arguing that one can speak about a relativistic aether, but the "idea of motion" cannot be applied to it. Lorentz and Poincaré had always argued that motion through the aether was undetectable. Einstein used the expression "special theory of relativity" in 1915, to distinguish it from general relativity.

Relativistic theories

Gravitation

The first attempt to formulate a relativistic theory of gravitation was undertaken by Poincaré (1905). He tried to modify Newton's law of gravitation so that it assumes a Lorentz-covariant form. He noted that there were many possibilities for a relativistic law, and he discussed two of them. It was shown by Poincaré that the argument of Pierre-Simon Laplace, who argued that the speed of gravity is many times faster than the speed of light, is not valid within a relativistic theory. That is, in a relativistic theory of gravitation, planetary orbits are stable even when the speed of gravity is equal to that of light. Similar models to that of Poincaré were discussed by Minkowski (1907b) and Sommerfeld (1910). However, it was shown by Abraham (1912) that those models belong to the class of "vector theories" of gravitation. The fundamental defect of those theories is that they implicitly contain a negative value for the gravitational energy in the vicinity of matter, which would violate the energy principle. As an alternative, Abraham (1912) and Gustav Mie (1913) proposed different "scalar theories" of gravitation. While Mie never formulated his theory in a consistent way, Abraham completely gave up the concept of Lorentz-covariance (even locally), and therefore it was irreconcilable with relativity.

In addition, all of those models violated the equivalence principle, and Einstein argued that it is impossible to formulate a theory which is both Lorentz-covariant and satisfies the equivalence principle. However, Gunnar Nordström (1912, 1913) was able to create a model which fulfilled both conditions. This was achieved by making both the gravitational and the inertial mass dependent on the gravitational potential. Nordström's theory of gravitation was remarkable because it was shown by Einstein and Adriaan Fokker (1914), that in this model gravitation can be completely described in terms of spacetime curvature. Although Nordström's theory is without contradiction, from Einstein's point of view a fundamental problem persisted: It does not fulfill the important condition of general covariance, as in this theory preferred frames of reference can still be formulated. So contrary to those "scalar theories", Einstein (1911–1915) developed a "tensor theory" (i.e. general relativity), which fulfills both the equivalence principle and general covariance. As a consequence, the notion of a complete "special relativistic" theory of gravitation had to be given up, as in general relativity the constancy of light speed (and Lorentz covariance) is only locally valid. The decision between those models was brought about by Einstein, when he was able to exactly derive the perihelion precession of Mercury, while the other theories gave erroneous results. In addition, only Einstein's theory gave the correct value for the deflection of light near the Sun.

Quantum field theory

The need to reconcile relativity with quantum mechanics was one of the major motivations in the development of quantum field theory. Pascual Jordan and Wolfgang Pauli showed in 1928 that quantum fields could be made relativistic, and Paul Dirac produced the Dirac equation for electrons, in so doing predicting the existence of antimatter.

Many other domains have since been reformulated with relativistic treatments: relativistic thermodynamics, relativistic statistical mechanics, relativistic hydrodynamics, relativistic quantum chemistry, relativistic heat conduction, etc.

Experimental evidence

Important early experiments confirming special relativity as mentioned above were the Fizeau experiment, the Michelson–Morley experiment, the Kaufmann–Bucherer–Neumann experiments, the Trouton–Noble experiment, the experiments of Rayleigh and Brace, and the Trouton–Rankine experiment.

In the 1920s, a series of Michelson–Morley type experiments were conducted, confirming relativity to even higher precision than the original experiment. Another type of interferometer experiment was the Kennedy–Thorndike experiment in 1932, which confirmed the independence of the speed of light from the velocity of the apparatus. Time dilation was directly measured in the Ives–Stilwell experiment in 1938 and by measuring the decay rates of moving particles in 1940. All of those experiments have been repeated several times with increased precision. In addition, many tests of relativistic energy and momentum have confirmed that the speed of light is unreachable for massive bodies; knowledge of those relativistic effects is therefore required in the construction of particle accelerators.

In 1962 J. G. Fox pointed out that all previous experimental tests of the constancy of the speed of light were conducted using light which had passed through stationary material: glass, air, or the incomplete vacuum of deep space. All were thus subject to the effects of the extinction theorem, which implied that the light being measured would have had a velocity different from that of the original source. He concluded that there was likely as yet no acceptable proof of the second postulate of special relativity. This surprising gap in the experimental record was quickly closed in the ensuing years by experiments by Fox, and by Alvager et al., which used gamma rays sourced from high-energy mesons. The high energy levels of the measured photons, along with very careful accounting for extinction effects, eliminated any significant doubt from their results.

Many other tests of special relativity have been conducted, testing possible violations of Lorentz invariance in certain variations of quantum gravity. However, no sign of anisotropy of the speed of light has been found even at the 10⁻¹⁷ level, and some experiments even ruled out Lorentz violations at the 10⁻⁴⁰ level; see Modern searches for Lorentz violation.

Priority

Some claim that Poincaré and Lorentz, not Einstein, are the true discoverers of special relativity. For more see the article on relativity priority dispute.

Criticisms

Special relativity has been criticized for various reasons, such as alleged lack of empirical evidence, internal inconsistencies, rejection of mathematical physics per se, or philosophical objections. Although there are still critics of relativity outside the scientific mainstream, the overwhelming majority of scientists agree that special relativity has been verified in many different ways and that there are no inconsistencies within the theory.

Thought experiment

https://en.wikipedia.org/wiki/Thought_experiment
Schrödinger's cat (1935) presents a cat that is in a superposition of alive and dead states, depending on a random quantum event. It illustrates the counter-intuitive implications of Bohr's Copenhagen interpretation when applied to everyday objects.

A thought experiment is a hypothetical situation in which a hypothesis, theory, or principle is laid out for the purpose of thinking through its consequences.

History

The ancient Greek δείκνυμι, deiknymi, 'to show', "was the most ancient pattern of mathematical proof", and existed before Euclidean mathematics, where the emphasis was on the conceptual rather than on the experimental part of a thought experiment.

Johann Witt-Hansen established that Hans Christian Ørsted was the first to use the term Gedankenexperiment (from German: 'thought experiment') circa 1812. Ørsted was also the first to use the equivalent term Gedankenversuch in 1820.

By 1883, Ernst Mach used the term Gedankenexperiment in a different way, to denote exclusively the imaginary conduct of a real experiment that would be subsequently performed as a real physical experiment by his students. Physical and mental experimentation could then be contrasted: Mach asked his students to provide him with explanations whenever the results from their subsequent, real, physical experiment differed from those of their prior, imaginary experiment.

The English term thought experiment was coined (as a calque) from Mach's Gedankenexperiment, and it first appeared in the 1897 English translation of one of Mach's papers. Prior to its emergence, the activity of posing hypothetical questions that employed subjunctive reasoning had existed for a very long time, for both scientists and philosophers; the irrealis moods of grammar offer one way to categorize and speak about such reasoning. This helps explain the extremely wide and diverse range of application of the term "thought experiment" once it had been introduced into English.

Galileo's thought experiment concerned the outcome (c) of attaching a small stone (a) to a larger one (b)

Galileo's demonstration that falling objects must fall at the same rate regardless of their masses was a significant step forward in the history of modern science. This is widely thought to have been a straightforward physical demonstration, involving climbing up the Leaning Tower of Pisa and dropping two heavy weights off it, whereas in fact, it was a logical demonstration, using the 'thought experiment' technique. The 'experiment' is described by Galileo in Discorsi e dimostrazioni matematiche (1638) (from Italian: 'Discourses and Mathematical Demonstrations') thus:

Salviati. If then we take two bodies whose natural speeds are different, it is clear that on uniting the two, the more rapid one will be partly retarded by the slower, and the slower will be somewhat hastened by the swifter. Do you not agree with me in this opinion?

Simplicio. You are unquestionably right.

Salviati. But if this is true, and if a large stone moves with a speed of, say, eight while a smaller moves with a speed of four, then when they are united, the system will move with a speed less than eight; but the two stones when tied together make a stone larger than that which before moved with a speed of eight. Hence the heavier body moves with less speed than the lighter; an effect which is contrary to your supposition. Thus you see how, from your assumption that the heavier body moves more rapidly than the lighter one, I infer that the heavier body moves more slowly.

Uses

The common goal of a thought experiment is to explore the potential consequences of the principle in question:

A thought experiment is a device with which one performs an intentional, structured process of intellectual deliberation in order to speculate, within a specifiable problem domain, about potential consequents (or antecedents) for a designated antecedent (or consequent).

Given the structure of the experiment, it may not be possible to perform it, and even if it could be performed, there need not be an intention to perform it.

Examples of thought experiments include Schrödinger's cat, illustrating quantum indeterminacy through the manipulation of a perfectly sealed environment and a tiny bit of radioactive substance, and Maxwell's demon, which attempts to demonstrate the ability of a hypothetical finite being to violate the 2nd law of thermodynamics.

It is a common element of science-fiction stories.

Thought experiments, which are well-structured, well-defined hypothetical questions that employ subjunctive reasoning (irrealis moods) – "What might happen (or, what might have happened) if . . . " – have been used to pose questions in philosophy at least since Greek antiquity, some pre-dating Socrates. In physics and other sciences many thought experiments date from the 19th and especially the 20th Century, but examples can be found at least as early as Galileo.

In thought experiments, we gain new information by rearranging or reorganizing already known empirical data in a new way and drawing new (a priori) inferences from them or by looking at these data from a different and unusual perspective. In Galileo's thought experiment, for example, the rearrangement of empirical experience consists of the original idea of combining bodies of different weights.

Thought experiments have been used in philosophy (especially ethics), physics, and other fields (such as cognitive psychology, history, political science, economics, social psychology, law, organizational studies, marketing, and epidemiology). In law, the synonym "hypothetical" is frequently used for such experiments.

Regardless of their intended goal, all thought experiments display a patterned way of thinking that is designed to allow us to explain, predict and control events in a better and more productive way.

Theoretical consequences

In terms of their theoretical consequences, thought experiments generally:

  • challenge (or even refute) a prevailing theory, often involving the device known as reductio ad absurdum (as in Galileo's original argument, a proof by contradiction);
  • confirm a prevailing theory;
  • establish a new theory; or
  • simultaneously refute a prevailing theory and establish a new theory through a process of mutual exclusion.

Practical applications

Thought experiments can produce some very important and different outlooks on previously unknown or unaccepted theories. However, they may make those theories themselves irrelevant, and could possibly create new problems that are just as difficult, or possibly more difficult to resolve.

In terms of their practical application, thought experiments are generally created to:

  • challenge the prevailing status quo (which includes activities such as correcting misinformation or misapprehension, identifying flaws in the argument(s) presented, preserving (for the long term) objectively established fact, and refuting specific assertions that some particular thing is permissible, forbidden, known, believed, possible, or necessary);
  • extrapolate beyond (or interpolate within) the boundaries of already established fact;
  • predict and forecast the (otherwise) indefinite and unknowable future;
  • explain the past;
  • retrodict, postdict, and hindcast the (otherwise) indefinite and unknowable past;
  • facilitate decision making, choice, and strategy selection;
  • solve problems, and generate ideas;
  • move current (often insoluble) problems into another, more helpful, and more productive problem space (e.g.: functional fixedness);
  • attribute causation, preventability, blame, and responsibility for specific outcomes;
  • assess culpability and compensatory damages in social and legal contexts;
  • ensure the repeat of past success;
  • ensure the (future) avoidance of past failures; or
  • examine the extent to which past events might have occurred differently.

Types

Temporal representation of a prefactual thought experiment.

Generally speaking, there are seven types of thought experiments in which one reasons from causes to effects, or effects to causes:

Prefactual

Prefactual (before the fact) thought experiments – the term prefactual was coined by Lawrence J. Sanna in 1998 – speculate on possible future outcomes, given the present, and ask "What will be the outcome if event E occurs?"

Counterfactual

Temporal representation of a counterfactual thought experiment.

Counterfactual (contrary to established fact) thought experiments – the term counterfactual was coined by Nelson Goodman in 1947, extending Roderick Chisholm's (1946) notion of a "contrary-to-fact conditional" – speculate on the possible outcomes of a different past; and ask "What might have happened if A had happened instead of B?" (e.g., "If Isaac Newton and Gottfried Leibniz had cooperated with each other, what would mathematics look like today?").

The study of counterfactual speculation has increasingly engaged the interest of scholars in a wide range of domains such as philosophy, psychology, cognitive psychology, history, political science, economics, social psychology, law, organizational theory, marketing, and epidemiology.

Semifactual

Temporal representation of a semifactual thought experiment.

Semifactual thought experiments – the term semifactual was coined by Nelson Goodman in 1947 – speculate on the extent to which things might have remained the same, despite there being a different past, and ask the question "Even though X happened instead of E, would Y have still occurred?" (e.g., "Even if the goalie had moved left, rather than right, could he have intercepted a ball that was traveling at such a speed?").

Semifactual speculations are an important part of clinical medicine.

Predictive

Temporal representation of prediction, forecasting and nowcasting.

The activity of prediction attempts to project the circumstances of the present into the future. According to David Sarewitz and Roger Pielke (1999, p. 123), scientific prediction takes two forms:

  1. "The elucidation of invariant – and therefore predictive – principles of nature"; and
  2. "[Using] suites of observational data and sophisticated numerical models in an effort to foretell the behavior or evolution of complex phenomena".

Although they perform different social and scientific functions, the only difference between the qualitatively identical activities of predicting, forecasting, and nowcasting is the distance of the speculated future from the present moment occupied by the user. Whilst the activity of nowcasting, defined as "a detailed description of the current weather along with forecasts obtained by extrapolation up to 2 hours ahead", is essentially concerned with describing the current state of affairs, it is common practice to extend the term "to cover very-short-range forecasting up to 12 hours ahead" (Browning, 1982, p.ix).

Hindcasting

Temporal representation of hindcasting.

The activity of hindcasting involves running a forecast model after an event has happened in order to test whether the model's simulation is valid.

Retrodiction

Temporal representation of retrodiction or postdiction.

The activity of retrodiction (or postdiction) involves moving backward in time, step-by-step, in as many stages as are considered necessary, from the present into the speculated past to establish the ultimate cause of a specific event (e.g., reverse engineering and forensics).

Given that retrodiction is a process in which "past observations, events, and data are used as evidence to infer the process(es) that produced them" and that diagnosis "involve[s] going from visible effects such as symptoms, signs and the like to their prior causes", the essential balance between prediction and retrodiction could be characterized as:

retrodiction : diagnosis :: prediction : prognosis

regardless of whether the prognosis is of the course of the disease in the absence of treatment, or of the application of a specific treatment regimen to a specific disorder in a particular patient.

Backcasting

Temporal representation of backcasting.

The activity of backcasting – the term backcasting was coined by John Robinson in 1982 – involves establishing the description of a very definite and very specific future situation. It then involves an imaginary moving backward in time, step-by-step, in as many stages as are considered necessary, from the future to the present to reveal the mechanism through which that particular specified future could be attained from the present.

Backcasting is not concerned with predicting the future:

The major distinguishing characteristic of backcasting analyses is the concern, not with likely energy futures, but with how desirable futures can be attained. It is thus explicitly normative, involving 'working backward' from a particular future end-point to the present to determine what policy measures would be required to reach that future.

According to Jansen (1994, p. 503):

Within the framework of technological development, "forecasting" concerns the extrapolation of developments towards the future and the exploration of achievements that can be realized through technology in the long term. Conversely, the reasoning behind "backcasting" is: on the basis of an interconnecting picture of demands technology must meet in the future – "sustainability criteria" – to direct and determine the process that technology development must take and possibly also the pace at which this development process must take effect. Backcasting [is] both an important aid in determining the direction technology development must take and in specifying the targets to be set for this purpose. As such, backcasting is an ideal search toward determining the nature and scope of the technological challenge posed by sustainable development, and it can thus serve to direct the search process toward new – sustainable – technology.

Fields

Thought experiments have been used in a variety of fields, including philosophy, law, physics, and mathematics. In philosophy they have been used at least since classical antiquity, some pre-dating Socrates. In law, they were well known to Roman lawyers quoted in the Digest. In physics and other sciences, notable thought experiments date from the 19th and especially the 20th century, but examples can be found at least as early as Galileo.

Philosophy

In philosophy, a thought experiment typically presents an imagined scenario with the intention of eliciting an intuitive or reasoned response about the way things are in the thought experiment. (Philosophers might also supplement their thought experiments with theoretical reasoning designed to support the desired intuitive response.) The scenario will typically be designed to target a particular philosophical notion, such as morality, or the nature of the mind or linguistic reference. The response to the imagined scenario is supposed to tell us about the nature of that notion in any scenario, real or imagined.

For example, a thought experiment might present a situation in which an agent intentionally kills an innocent for the benefit of others. Here, the relevant question is not whether the action is moral or not, but more broadly whether a moral theory is correct that says morality is determined solely by an action's consequences (See Consequentialism). John Searle imagines a man in a locked room who receives written sentences in Chinese, and returns written sentences in Chinese, according to a sophisticated instruction manual. Here, the relevant question is not whether or not the man understands Chinese, but more broadly, whether a functionalist theory of mind is correct.

It is generally hoped that there is universal agreement about the intuitions that a thought experiment elicits. (Hence, in assessing their own thought experiments, philosophers may appeal to "what we should say," or some such locution.) A successful thought experiment will be one in which intuitions about it are widely shared. But often, philosophers differ in their intuitions about the scenario.

Other philosophical uses of imagined scenarios arguably are thought experiments also. In one use of scenarios, philosophers might imagine persons in a particular situation (maybe ourselves), and ask what they would do.

For example, in the veil of ignorance, John Rawls asks us to imagine a group of persons in a situation where they know nothing about themselves, and are charged with devising a social or political organization. The use of the state of nature to imagine the origins of government, as by Thomas Hobbes and John Locke, may also be considered a thought experiment. Søren Kierkegaard explored the possible ethical and religious implications of Abraham's binding of Isaac in Fear and Trembling. Similarly, Friedrich Nietzsche, in On the Genealogy of Morals, speculated about the historical development of Judeo-Christian morality, with the intent of questioning its legitimacy.

An early written thought experiment was Plato's allegory of the cave. Another historic thought experiment was Avicenna's "Floating Man" thought experiment in the 11th century. He asked his readers to imagine themselves suspended in the air isolated from all sensations in order to demonstrate human self-awareness and self-consciousness, and the substantiality of the soul.

Science

Scientists tend to use thought experiments as imaginary, "proxy" experiments prior to a real, "physical" experiment (Ernst Mach always argued that these gedankenexperiments were "a necessary precondition for physical experiment"). In these cases, the result of the "proxy" experiment will often be so clear that there will be no need to conduct a physical experiment at all.

Scientists also use thought experiments when particular physical experiments are impossible to conduct (Carl Gustav Hempel labeled these sorts of experiment "theoretical experiments-in-imagination"), such as Einstein's thought experiment of chasing a light beam, leading to special relativity. This is a unique use of a scientific thought experiment, in that it was never carried out, but led to a successful theory, later confirmed by other empirical means.

Properties

Further categorization of thought experiments can be attributed to specific properties.

Possibility

In many thought experiments, the scenario would be nomologically possible, or possible according to the laws of nature. John Searle's Chinese room is nomologically possible.

Some thought experiments present scenarios that are not nomologically possible. In his Twin Earth thought experiment, Hilary Putnam asks us to imagine a scenario in which there is a substance with all of the observable properties of water (e.g., taste, color, boiling point), but is chemically different from water. It has been argued that this thought experiment is not nomologically possible, although it may be possible in some other sense, such as metaphysical possibility. It is debatable whether the nomological impossibility of a thought experiment renders intuitions about it moot.

In some cases, the hypothetical scenario might be considered metaphysically impossible, or impossible in any sense at all. David Chalmers says that we can imagine that there are zombies, or persons who are physically identical to us in every way but who lack consciousness. This is supposed to show that physicalism is false. However, some argue that zombies are inconceivable: we can no more imagine a zombie than we can imagine that 1+1=3. Others have claimed that the conceivability of a scenario may not entail its possibility.

Causal reasoning

The first characteristic pattern that thought experiments display is their orientation in time. They are either:

  • Antefactual speculations: experiments that speculate about what might have happened prior to a specific, designated event, or
  • Postfactual speculations: experiments that speculate about what may happen subsequent to (or consequent upon) a specific, designated event.

The second characteristic pattern is their movement in time in relation to "the present moment standpoint" of the individual performing the experiment; namely, in terms of:

  • Their temporal direction: are they past-oriented or future-oriented?
  • Their temporal sense:
    • (a) in the case of past-oriented thought experiments, are they examining the consequences of temporal "movement" from the present to the past, or from the past to the present? or,
    • (b) in the case of future-oriented thought experiments, are they examining the consequences of temporal "movement" from the present to the future, or from the future to the present?

Relation to real experiments

The relation to real experiments can be quite complex, as can be seen again from an example going back to Albert Einstein. In 1935, with two coworkers, he published a paper on what was later called the EPR effect (EPR paradox). In this paper, starting from certain philosophical assumptions and on the basis of a rigorous analysis of a complicated model that was claimed to be realizable, he came to the conclusion that quantum mechanics should be described as "incomplete". Niels Bohr immediately asserted a refutation of Einstein's analysis, and his view prevailed. After some decades, it was argued that feasible experiments could demonstrate the error of the EPR paper. These experiments tested the Bell inequalities, published in 1964 in a purely theoretical paper. The EPR philosophical starting assumptions mentioned above were considered to be falsified by empirical results (e.g. by the optical experiments of Alain Aspect).

Thus thought experiments belong to a theoretical discipline, usually theoretical physics but often theoretical philosophy. In any case, a thought experiment must be distinguished from a real experiment, which belongs naturally to the experimental discipline and has "the final decision on true or not true", at least in physics.

Interactivity

Thought experiments can also be interactive, where the author invites people into their thought process by providing alternative paths with alternative outcomes within the narrative, or through interaction with a programmed machine, such as a computer program.

Thanks to the advent of the Internet, the digital space has lent itself as a new medium for a new kind of thought experiments. The philosophical work of Stefano Gualeni, for example, focuses on the use of virtual worlds to materialize thought experiments and to playfully negotiate philosophical ideas. His arguments were originally presented in his book Virtual Worlds as Philosophical Tools.

Gualeni's argument is that the history of philosophy has, until recently, merely been the history of written thought, and digital media can complement and enrich the limited and almost exclusively linguistic approach to philosophical thought. He considers virtual worlds to be philosophically viable and advantageous in contexts like those of thought experiments, when the recipients of a certain philosophical notion or perspective are expected to objectively test and evaluate different possible courses of action, or in cases where they are confronted with interrogatives concerning non-actual or non-human phenomenologies.

Intrinsic dimension

https://en.wikipedia.org/wiki/Intrinsic_dimension

The intrinsic dimension for a data set can be thought of as the number of variables needed in a minimal representation of the data. Similarly, in signal processing of multidimensional signals, the intrinsic dimension of the signal describes how many variables are needed to generate a good approximation of the signal.

When estimating intrinsic dimension, however, a slightly broader definition based on manifold dimension is often used, where a representation in the intrinsic dimension does only need to exist locally. Such intrinsic dimension estimation methods can thus handle data sets with different intrinsic dimensions in different parts of the data set. This is often referred to as local intrinsic dimensionality.

The intrinsic dimension can be used as a lower bound on the number of dimensions into which a data set can be compressed through dimension reduction, but it can also be used as a measure of the complexity of the data set or signal. For a data set or signal of N variables, its intrinsic dimension M satisfies 0 ≤ M ≤ N, although estimators may yield higher values.

Example

Let f(x1, x2) be a two-variable function (or signal) which is of the form f(x1, x2) = g(x1) for some one-variable function g which is not constant. This means that f varies, in accordance with g, with the first variable or along the first coordinate. On the other hand, f is constant with respect to the second variable or along the second coordinate. It is only necessary to know the value of one variable, namely the first, in order to determine the value of f. Hence, it is a two-variable function but its intrinsic dimension is one.

A slightly more complicated example is f(x1, x2) = g(x1 + x2). f is still intrinsically one-dimensional, which can be seen by making the variable transformation y1 = x1 + x2 and y2 = x1 − x2, which gives f(x1, x2) = g(y1). Since the variation in f can be described by the single variable y1, its intrinsic dimension is one.

For the case that f is constant, its intrinsic dimension is zero since no variable is needed to describe variation. For the general case, when the intrinsic dimension of the two-variable function f is neither zero nor one, it is two.

In the literature, functions which are of intrinsic dimension zero, one, or two are sometimes referred to as i0D, i1D or i2D, respectively.
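The examples above are easy to check numerically. The sketch below is an illustration only (the particular non-constant g is an arbitrary choice): it samples f(x1, x2) = g(x1 + x2) along the transformed coordinate y2 and confirms that f does not vary with it.

```python
import numpy as np

# Hypothetical non-constant one-variable function g (any choice works).
def g(t):
    return np.sin(t)

# Two-variable signal of the form f(x1, x2) = g(x1 + x2).
def f(x1, x2):
    return g(x1 + x2)

# Change of variables y1 = x1 + x2, y2 = x1 - x2: f should depend on y1 only.
y1 = 0.7
y2 = np.linspace(-3.0, 3.0, 101)
values = f((y1 + y2) / 2.0, (y1 - y2) / 2.0)

# Constant along y2: the intrinsic dimension of f is one.
assert np.allclose(values, g(y1))
```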

Formal definition for signals

For an N-variable function f, the set of variables can be represented as an N-dimensional vector x = (x1, x2, ..., xN)^T.

If for some M-variable function g and M × N matrix A it is the case that

  • f(x) = g(Ax) for all x;
  • M is the smallest number for which the above relation between f and g can be found,

then the intrinsic dimension of f is M.

The intrinsic dimension is a characterization of f; it is not an unambiguous characterization of g nor of A. That is, if the above relation is satisfied for some f, g, and A, it must also be satisfied for the same f with g′ and A′ given by g′(y) = g(B^(−1)y) and A′ = BA, where B is a non-singular M × M matrix, since f(x) = g′(A′x) = g(B^(−1)(BA)x) = g(Ax).
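This ambiguity of g and A can be checked numerically. In the sketch below (an illustration; the particular g, A, and B are arbitrary choices with M = 2, N = 3), replacing (g, A) with (g′, A′) leaves f unchanged.

```python
import numpy as np

# Hypothetical M-variable function g (M = 2) and M x N matrix A (N = 3).
def g(y):
    return np.sin(y[0]) + np.cos(y[1])

A = np.array([[1.0, 0.0,  2.0],
              [0.0, 1.0, -1.0]])

def f(x):
    return g(A @ x)          # f has intrinsic dimension 2

# Any non-singular M x M matrix B yields an equally valid pair:
# g'(y) = g(B^(-1) y) and A' = B A, since g'(A' x) = g(B^(-1) B A x) = g(A x).
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
A_prime = B @ A
B_inv = np.linalg.inv(B)

def g_prime(y):
    return g(B_inv @ y)

x = np.array([0.3, -1.2, 0.5])
assert np.isclose(f(x), g_prime(A_prime @ x))
```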

The Fourier transform of signals of low intrinsic dimension

An N-variable function which has intrinsic dimension M < N has a characteristic Fourier transform. Intuitively, since this type of function is constant along one or several dimensions, its Fourier transform must appear like an impulse (the Fourier transform of a constant) along the same dimensions in the frequency domain.

A simple example

Let f be a two-variable function which is i1D. This means that there exists a normalized vector n and a one-variable function g such that f(x) = g(x^T n) for all x. If F is the Fourier transform of f (both are two-variable functions) it must be the case that F(u) = G(u^T n) · δ(u^T m).

Here G is the Fourier transform of g (both are one-variable functions), δ is the Dirac impulse function and m is a normalized vector perpendicular to n. This means that F vanishes everywhere except on a line which passes through the origin of the frequency domain and is parallel to n. Along this line F varies according to G.
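This concentration of the spectrum can be observed with a discrete Fourier transform. The sketch below is an illustrative NumPy check, assuming n = (1, 0) so the image varies only along its first axis: the 2-D transform then vanishes off the frequency line u2 = 0 and varies along it according to the 1-D transform of g.

```python
import numpy as np

# An i1D image f(x) = g(x^T n) with n = (1, 0): rows vary, columns do not.
n1, n2 = 64, 64
x1 = np.arange(n1)
g = np.cos(2 * np.pi * 5 * x1 / n1)      # one-variable profile
f = np.repeat(g[:, None], n2, axis=1)    # constant along the second coordinate

F = np.fft.fft2(f)

# F vanishes (numerically) everywhere off the line u2 = 0 through the origin,
# which is parallel to n.
assert np.allclose(np.delete(F, 0, axis=1), 0.0, atol=1e-8)

# Along that line, F varies according to G, the 1-D Fourier transform of g
# (scaled by n2, the length of the constant direction in the DFT).
assert np.allclose(F[:, 0], n2 * np.fft.fft(g))
```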

The general case

Let f be an N-variable function which has intrinsic dimension M, that is, there exists an M-variable function g and an M × N matrix A such that f(x) = g(Ax) for all x.

Its Fourier transform F can then be described as follows:

  • F vanishes everywhere except for a subspace of dimension M
  • The M-dimensional subspace is spanned by the rows of the matrix A
  • In the subspace, F varies according to G, the Fourier transform of g

Generalizations

The type of intrinsic dimension described above assumes that a linear transformation is applied to the coordinates of the N-variable function f to produce the M variables which are necessary to represent every value of f. This means that f is constant along lines, planes, or hyperplanes, depending on N and M.

In a general case, f has intrinsic dimension M if there exist M functions a1, a2, ..., aM and an M-variable function g such that

  • f(x) = g(a1(x), a2(x), ..., aM(x)) for all x
  • M is the smallest number of functions which allows the above transformation

A simple example is transforming a 2-variable function f to polar coordinates:

  • f(x, y) = g(r) with r = sqrt(x^2 + y^2): f is i1D and is constant along any circle centered at the origin
  • f(x, y) = g(θ) with θ = arctan(y / x): f is i1D and is constant along all rays from the origin
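The first case can be verified numerically (a sketch assuming NumPy; the choice g(r) = exp(−r) and the sampled radius are arbitrary):

```python
import numpy as np

# f(x, y) = g(sqrt(x^2 + y^2)) with g(r) = exp(-r): intrinsically 1-D,
# constant along every circle centred at the origin.
def f(x, y):
    return np.exp(-np.hypot(x, y))

theta = np.linspace(0.0, 2.0 * np.pi, 200)
r = 2.5
vals = f(r * np.cos(theta), r * np.sin(theta))
# Every sample on the circle of radius 2.5 equals g(2.5) = exp(-2.5).
```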

For the general case, a simple description of either the point sets for which f is constant or its Fourier transform is usually not possible.

Local Intrinsic Dimensionality

Local intrinsic dimensionality (LID) refers to the observation that data is often distributed on a lower-dimensional manifold when only a nearby subset of the data is considered. For example, a function f(x, y) can be considered one-dimensional when y is close to 0 (with the single variable x), two-dimensional when y is close to 1, and again one-dimensional when y is positive and much larger than 1 (with the single variable x + y).

Local intrinsic dimensionality is often used with respect to data. It is then usually estimated from the k nearest neighbors of a data point, often based on a concept related to the doubling dimension in mathematics. Since the volume of a d-ball grows exponentially in d, the rate at which new neighbors are found as the search radius is increased can be used to estimate the local intrinsic dimensionality (e.g., GED estimation). However, alternative approaches to estimation have been proposed, for example angle-based estimation.
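The neighbor-counting idea can be sketched as follows (an illustrative toy estimator in the spirit of the doubling dimension, not any specific published method; the function name and radius parameter are this example's choices): if a ball of radius r around a query point contains n1 data points and the ball of radius 2r contains n2, then on a d-dimensional manifold n2/n1 is roughly 2^d, so d is roughly log2(n2/n1).

```python
import numpy as np

def expansion_lid(data, query, r):
    """Toy local-intrinsic-dimension estimate at `query`: compare the number
    of neighbours within radius r and radius 2r; roughly uniform data on a
    d-dimensional manifold gives about 2**d times as many points in the
    larger ball."""
    dists = np.linalg.norm(np.asarray(data) - np.asarray(query), axis=1)
    n1 = np.count_nonzero(dists <= r)
    n2 = np.count_nonzero(dists <= 2.0 * r)
    if n1 == 0:
        raise ValueError("radius too small: no neighbours within r")
    return float(np.log2(n2 / n1))

# Points spread uniformly over a 2-D square should yield an estimate near 2
# at an interior query point.
rng = np.random.default_rng(0)
data = rng.uniform(-1.0, 1.0, size=(4000, 2))
d_hat = expansion_lid(data, np.zeros(2), r=0.2)
```

In practice the radius (or neighbor count k) trades bias against variance: too small a ball gives noisy counts, too large a ball leaves the locally flat regime.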

Intrinsic dimension estimation

The intrinsic dimension of data manifolds can be estimated by many methods, depending on the assumptions made about the data manifold; a 2016 review surveys many of them.

The two-nearest-neighbors (TwoNN) method estimates the intrinsic dimension of an immersed Riemannian manifold. The algorithm is as follows:

  • Scatter some points on the manifold.
  • Measure μ = r2 / r1 for many points, where r1 and r2 are the distances from a point to its two closest neighbors.
  • Fit the empirical CDF of μ to the function 1 − μ^(−d).
  • Return d.
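The steps above can be sketched in NumPy (a sketch with a brute-force neighbor search; the function name, the least-squares fit through the origin, and the common practice of discarding the largest 10% of ratios are this example's choices):

```python
import numpy as np

def twonn_dimension(points):
    """Estimate intrinsic dimension via the TwoNN recipe: for each point take
    mu = r2 / r1, the ratio of the distances to its two nearest neighbours,
    then fit the empirical CDF of mu to 1 - mu**(-d).  Taking logs turns the
    fit into a least-squares slope through the origin:
        -log(1 - F(mu)) = d * log(mu)."""
    X = np.asarray(points, dtype=float)
    # Brute-force pairwise distances; ignore each point's zero self-distance.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)
    r = np.sort(np.sqrt(d2), axis=1)
    mu = np.sort(r[:, 1] / r[:, 0])          # sorted ratios, all >= 1
    n = len(mu)
    F = np.arange(1, n + 1) / n              # empirical CDF at the sorted mu
    keep = int(0.9 * n)                      # drop the noisy upper tail
    x = np.log(mu[:keep])
    y = -np.log1p(-F[:keep])                 # -log(1 - F)
    return float(x @ y / (x @ x))            # slope through the origin = d

# Points on a 2-D plane embedded in 3-D should give an estimate near 2.
rng = np.random.default_rng(0)
plane = rng.standard_normal((1000, 2))
embedded = np.hstack([plane, np.zeros((1000, 1))])
d_hat = twonn_dimension(embedded)
```

Because only the ratio of the two nearest-neighbor distances is used, the estimate depends on the data only at the smallest available scale, which is what makes the method insensitive to curvature and to nonuniform density.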

History

During the 1950s, so-called "scaling" methods were developed in the social sciences to explore and summarize multidimensional data sets. After Shepard introduced non-metric multidimensional scaling in 1962, one of the major research areas within multidimensional scaling (MDS) was estimation of the intrinsic dimension. The topic was also studied in information theory, pioneered by Bennet, who in 1965 coined the term "intrinsic dimension" and wrote a computer program to estimate it.

During the 1970s, intrinsic dimensionality estimation methods were constructed that did not depend on dimensionality reductions such as MDS: methods based on local eigenvalues, on distance distributions, and on other dimension-dependent geometric properties.

Estimating intrinsic dimension of sets and probability measures has also been extensively studied since around 1980 in the field of dynamical systems, where dimensions of (strange) attractors have been the subject of interest. For strange attractors there is no manifold assumption, and the dimension measured is some version of fractal dimension — which also can be non-integer. However, definitions of fractal dimension yield the manifold dimension for manifolds.

In the 2000s, the "curse of dimensionality" was also exploited to estimate intrinsic dimension.

Applications

The case of a two-variable signal which is i1D appears frequently in computer vision and image processing and captures the idea of local image regions which contain lines or edges. The analysis of such regions has a long history, but it was not until a more formal and theoretical treatment of such operations began that the concept of intrinsic dimension was established, even though the name has varied.

For example, the concept which here is referred to as an image neighborhood of intrinsic dimension 1 or i1D neighborhood is called 1-dimensional by Knutsson (1982), linear symmetric by Bigün & Granlund (1987) and simple neighborhood in Granlund & Knutsson (1995).
