Wednesday, October 10, 2018

Allometry

From Wikipedia, the free encyclopedia
 
Skeleton of an elephant
 
Skeleton of a tiger quoll (Dasyurus maculatus).
The proportionately thicker bones in the elephant are an example of allometric scaling

Allometry is the study of the relationship of body size to shape, anatomy, physiology and finally behaviour, first outlined by Otto Snell in 1892, by D'Arcy Thompson in 1917 in On Growth and Form and by Julian Huxley in 1932.

Overview

Allometry is a well-known study, particularly in statistical shape analysis for its theoretical developments, as well as in biology for practical applications to the differential growth rates of the parts of a living organism's body. One application is in the study of various insect species (e.g., Hercules beetles), where a small change in overall body size can lead to an enormous and disproportionate increase in the dimensions of appendages such as legs, antennae, or horns. The relationship between the two measured quantities is often expressed as a power law equation which expresses a remarkable scale symmetry:
y=kx^{a}
or in a logarithmic form:
\log y=a\log x+\log k
where a is the scaling exponent of the law. Methods for estimating this exponent from data can use type-2 regressions, such as major axis regression or reduced major axis regression, as these account for the variation in both variables, contrary to least squares regression, which does not account for error variance in the independent variable (e.g., log body mass). Other methods include measurement-error models and a particular kind of principal component analysis.
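As a rough illustration of how the choice of regression affects the estimated exponent, the following Python sketch fits both an ordinary least squares slope and a reduced major axis slope to log-transformed data. The data are synthetic, and the function names (`ols_slope`, `rma_slope`) are hypothetical names chosen for this example rather than part of any library.

```python
import math

def log_xy(xs, ys):
    """Return base-10 logarithms of paired positive measurements."""
    return [math.log10(x) for x in xs], [math.log10(y) for y in ys]

def _sums(u, v):
    """Centered sums of squares and cross-products."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    return suu, svv, suv

def ols_slope(u, v):
    """Least squares slope of v on u (all error attributed to v)."""
    suu, _, suv = _sums(u, v)
    return suv / suu

def rma_slope(u, v):
    """Reduced major axis slope: sd(v) / sd(u), signed like the covariance."""
    suu, svv, suv = _sums(u, v)
    return math.copysign(math.sqrt(svv / suu), suv)

# Synthetic data (made-up numbers): y = 2 * x^0.75 with small multiplicative noise
xs = [1, 10, 100, 1000, 10000]
noise = [1.05, 0.95, 1.10, 0.92, 1.03]
ys = [2 * x ** 0.75 * e for x, e in zip(xs, noise)]

u, v = log_xy(xs, ys)
a_ols = ols_slope(u, v)
a_rma = rma_slope(u, v)
log_k = (sum(v) - a_ols * sum(u)) / len(u)  # OLS intercept, i.e. log k
print(f"OLS a = {a_ols:.3f}, RMA a = {a_rma:.3f}, k = {10 ** log_k:.2f}")
```

Because the reduced major axis slope equals the least squares slope divided by the absolute correlation coefficient, it is never shallower than the least squares estimate, and the two agree closely when the data lie near a straight line.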

Allometry often studies shape differences in terms of ratios of the objects' dimensions. Two objects of different size, but common shape, will have their dimensions in the same ratio. Take, for example, a biological object that grows as it matures. Its size changes with age, but the shapes are similar. Studies of ontogenetic allometry often use lizards or snakes as model organisms both because they lack parental care after birth or hatching and because they exhibit a large range of body sizes between the juvenile and adult stage. Lizards often exhibit allometric changes during their ontogeny.

In addition to studies that focus on growth, allometry also examines shape variation among individuals of a given age (and sex), which is referred to as static allometry. Comparisons of species are used to examine interspecific or evolutionary allometry.

Isometric scaling and geometric similarity

Scaling range for different organisms
Group            Factor    Length range
Insects          1000      10^{-4} to 10^{-1} m
Fish             1000      10^{-2} to 10^{+1} m
Mammals          1000      10^{-1} to 10^{+2} m
Vascular plants  10,000    10^{-2} to 10^{+2} m
Algae            100,000   10^{-5} to 10^{0} m
Isometric scaling happens when proportional relationships are preserved as size changes during growth or over evolutionary time. An example is found in frogs — aside from a brief period during the few weeks after metamorphosis, frogs grow isometrically. Therefore, a frog whose legs are as long as its body will retain that relationship throughout its life, even if the frog itself increases in size tremendously.

Isometric scaling is governed by the square-cube law. An organism which doubles in length isometrically will find that the surface area available to it will increase fourfold, while its volume and mass will increase by a factor of eight. This can present problems for organisms. In the case above, the animal now has eight times the biologically active tissue to support, but the surface area of its respiratory organs has only increased fourfold, creating a mismatch between scaling and physical demands. Similarly, the organism in the above example now has eight times the mass to support on its legs, but the strength of its bones and muscles is dependent upon their cross-sectional area, which has only increased fourfold. Therefore, this hypothetical organism would experience twice the bone and muscle loads of its smaller version. This mismatch can be avoided either by being "overbuilt" when small or by changing proportions during growth, which is called allometry.
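The arithmetic of the square-cube law can be sketched in a few lines of Python. The function name `isometric_scaling` is hypothetical, and the sketch assumes constant tissue density so that mass scales with volume.

```python
def isometric_scaling(length_factor):
    """Return (area factor, volume factor, load factor) for an isometric size change."""
    area = length_factor ** 2     # surfaces and cross-sections scale as L^2
    volume = length_factor ** 3   # volume, and mass at constant density, scale as L^3
    load = volume / area          # mass carried per unit cross-sectional area scales as L
    return area, volume, load

area, volume, load = isometric_scaling(2)
print(area, volume, load)  # 4 8 2.0
```

Doubling the length gives four times the area, eight times the mass, and therefore twice the load on each unit of bone or muscle cross-section.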

Isometric scaling is often used as a null hypothesis in scaling studies, with 'deviations from isometry' considered evidence of physiological factors forcing allometric growth.

Allometric scaling

Allometric scaling is any change that deviates from isometry. A classic example discussed by Galileo in his Dialogues Concerning Two New Sciences is the skeleton of mammals. The skeletal structure becomes much stronger and more robust relative to the size of the body as the body size increases. Allometry is often expressed in terms of a scaling exponent based on body mass, or body length (snout-vent length, total length, etc.). A perfectly isometrically scaling organism would see all volume-based properties change proportionally to the body mass, all surface area-based properties change with mass to the power of 2/3, and all length-based properties change with mass to the power of 1/3. If, after statistical analyses, for example, a volume-based property was found to scale to mass to the 0.9th power, then this would be called "negative allometry", as the values are smaller than predicted by isometry. Conversely, if a surface area-based property scales to mass to the 0.8th power, the values are higher than predicted by isometry and the organism is said to show "positive allometry". One example of positive allometry occurs among species of monitor lizards (family Varanidae), in which the limbs are relatively longer in larger-bodied species. The same is true for some fish, e.g. the muskellunge, the weight of which grows with about the power of 3.325 of its length. A 30-inch (76 cm) muskellunge will weigh about 8 pounds (3.6 kg), while a 40-inch (100 cm) muskellunge will weigh about 18 pounds (8.2 kg), so 33% longer length will more than double the weight.

Determining if a system is scaling with allometry

To determine whether isometry or allometry is present, an expected relationship between the variables must be established against which the data can be compared. This is important in determining whether the scaling relationship in a dataset deviates from an expected relationship (such as one that follows isometry). Tools such as dimensional analysis are very helpful in determining the expected slope. This ‘expected’ slope is essential for detecting allometry because a scaling exponent is meaningful only relative to a reference value. Saying that mass scales with a slope of 5 in relation to length has little meaning unless one knows that the isometric slope is 3; in this case, mass is increasing extremely fast relative to length. For example, frogs of different sizes should be able to jump the same distance according to the geometric similarity model proposed by Hill 1950 and interpreted by Wilson 2000, but in actuality larger frogs do jump longer distances. Dimensional analysis is extremely useful for balancing units in an equation or, in this case, determining the expected slope.

A few dimensional examples follow (M=Mass, L=Length, V=Volume, which is also L cubed because a volume is merely length cubed):

Allometric relations show as straight lines when plotted on double-logarithmic axes

To find the expected slope for the relationship between mass and the characteristic length of an animal (see figure), the units of mass (M = L^3, because mass is proportional to volume, and volumes are lengths cubed) from the Y-axis are divided by the units of the X-axis (in this case, L). The expected slope on a double-logarithmic plot of L^3/L^1 in this case is 3 (\log_{10}(L^3)/\log_{10}(L^1) = 3). This is the slope of a straight line, but most data gathered in science do not fall neatly in a straight line, so data transformations are useful. It is also important to keep in mind what is being compared in the data. Comparing a characteristic such as head length to head width might yield different results from comparing head length to body length. That is, different characteristics may scale differently.
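The dimensional reasoning above amounts to dividing the length exponent of the Y-axis quantity by that of the X-axis quantity. A minimal Python sketch follows; the table `LENGTH_DIMENSION` and the function `expected_slope` are hypothetical names, and mass is assumed to be proportional to volume.

```python
from fractions import Fraction

# Exponent of length L in each quantity under geometric similarity.
# Assumption: constant density, so mass has the same dimension as volume.
LENGTH_DIMENSION = {"length": 1, "area": 2, "volume": 3, "mass": 3}

def expected_slope(y_quantity, x_quantity):
    """Isometric slope of log(y) against log(x) from the length dimensions."""
    return Fraction(LENGTH_DIMENSION[y_quantity], LENGTH_DIMENSION[x_quantity])

print(expected_slope("mass", "length"))   # 3
print(expected_slope("area", "length"))   # 2
print(expected_slope("area", "mass"))     # 2/3
print(expected_slope("length", "mass"))   # 1/3
```

The last two lines reproduce the 2/3 and 1/3 exponents expected when surface areas and lengths are regressed on body mass rather than on length.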

A common way to analyze data such as those collected in scaling is to use log-transformation. There are two reasons for log transformation - a biological reason and a statistical reason. Biologically, log-log transformation places numbers into a geometric domain so that proportional deviations are represented consistently, independent of the scale and units of measurement. In biology this is appropriate because many biological phenomena (e.g. growth, reproduction, metabolism, sensation) are fundamentally multiplicative. Statistically, it is beneficial to transform both axes using logarithms and then perform a linear regression. This will normalize the data set and make it easier to analyze trends using the slope of the line. Before analyzing data though, it is important to have a predicted slope of the line to compare the analysis to.

After data are log-transformed and linearly regressed, comparisons can then use least squares regression with 95% confidence intervals or reduced major axis analysis. Sometimes the two analyses can yield different results, but often they do not. If the expected slope is outside the confidence intervals, then there is allometry present. If mass in this imaginary animal scaled with a slope of 5 and this was a statistically significant value, then mass would scale very fast in this animal versus the expected value. It would scale with positive allometry. If the expected slope were 3 and in reality in a certain organism mass scaled with 1 (assuming this slope is statistically significant), then it would be negatively allometric.
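The comparison of a fitted slope with its expected value can be sketched as follows. This is only an illustration under stated assumptions: the measurements are synthetic, the helper names are hypothetical, and the interval uses a normal quantile from the standard library, whereas a Student t quantile would be more appropriate (and wider) for so few points.

```python
import math
from statistics import NormalDist

def slope_with_ci(u, v, level=0.95):
    """Least squares slope of v on u with an approximate confidence interval."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suu = sum((a - mean_u) ** 2 for a in u)
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    slope = suv / suu
    intercept = mean_v - slope * mean_u
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(u, v))
    se = math.sqrt(rss / (n - 2) / suu)          # standard error of the slope
    z = NormalDist().inv_cdf(0.5 + level / 2)    # assumption: normal approximation
    return slope, slope - z * se, slope + z * se

def classify(expected, low, high):
    """Compare an expected slope with the confidence interval of the fit."""
    if low > expected:
        return "positive allometry"
    if high < expected:
        return "negative allometry"
    return "no evidence against the expected slope"

# Synthetic animal: mass = length^3 with made-up multiplicative noise
lengths = [1, 2, 4, 8, 16, 32]
noise = [1.10, 0.90, 1.05, 0.95, 1.08, 0.93]
masses = [length ** 3 * e for length, e in zip(lengths, noise)]

u = [math.log10(x) for x in lengths]
v = [math.log10(m) for m in masses]
slope, low, high = slope_with_ci(u, v)
print(f"slope = {slope:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
print(classify(3, low, high))
```

With these data the interval contains the isometric slope of 3, so isometry is not rejected. The same fit compared against an expected slope of 2 would be classed as positive allometry.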

Another example: Force is dependent on the cross-sectional area of muscle (CSA), which is L^2. If comparing force to a length, then the expected slope is 2. Alternatively, this analysis may be accomplished with a power regression. Plot the relationship between the data onto a graph. Fit this to a power curve (depending on the stats program, this can be done multiple ways), and it will give an equation of the form y = Zx^n, where the exponent n describes how y scales with x. The downside of this form of analysis is that it makes statistical analyses a little more difficult.

Physiological scaling

Many physiological and biochemical processes (such as heart rate, respiration rate or the maximum reproduction rate) show scaling, mostly associated with the ratio between surface area and mass (or volume) of the animal. The metabolic rate of an individual animal is also subject to scaling.

Metabolic rate and body mass

In plotting an animal's basal metabolic rate (BMR) against the animal's own body mass, a logarithmic straight line is obtained, indicating a power-law dependence. Overall metabolic rate in animals is generally accepted to show negative allometry, scaling to mass to a power of ≈ 0.75, known as Kleiber's law, 1932. This means that larger-bodied species (e.g., elephants) have lower mass-specific metabolic rates and lower heart rates, as compared with smaller-bodied species (e.g., mice). The straight line generated from a double logarithmic scale of metabolic rate in relation to body mass is known as the "mouse-to-elephant curve". These relationships of metabolic rates, times, and internal structure have been explained as, "an elephant is approximately a blown-up gorilla, which is itself a blown-up mouse."

Max Kleiber contributed the following allometric equation for relating the BMR to the body mass of an animal. Statistically, the intercept did not differ from 70 and the slope did not differ from 0.75, thus:
{\displaystyle {\text{Metabolic rate}}=70M^{0.75}} (although the universality of this relation has been disputed both empirically and theoretically)
where M is body mass, and metabolic rate is measured in kcal per day.
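A short Python sketch shows what the relation implies for mass-specific metabolic rate. The body masses are round illustrative values, body mass is assumed to be in kilograms, and `kleiber_bmr` is a hypothetical function name.

```python
def kleiber_bmr(mass_kg):
    """Basal metabolic rate in kcal per day from the relation 70 * M^0.75."""
    return 70 * mass_kg ** 0.75

# Illustrative body masses in kg (round numbers assumed for this example).
animals = {"mouse": 0.02, "human": 70, "elephant": 4000}
for name, mass in animals.items():
    bmr = kleiber_bmr(mass)
    print(f"{name}: {bmr:.1f} kcal/day, {bmr / mass:.1f} kcal/day per kg")
```

The total rate rises with mass while the rate per kilogram falls, which is the negative allometry described above.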

Consequently, the body mass itself can explain the majority of the variation in the BMR. After the body mass effect, the taxonomy of the animal plays the next most significant role in the scaling of the BMR. The further speculation that environmental conditions play a role in BMR can only be properly investigated once the role of taxonomy is established. The challenge with this lies in the fact that a shared environment also indicates a common evolutionary history and thus a close taxonomic relationship. There are strides currently in research to overcome these hurdles; for example, an analysis in muroid rodents, the mouse, hamster, and vole type, took into account taxonomy. Results revealed the hamster (warm dry habitat) had lowest BMR and the mouse (warm wet dense habitat) had the highest BMR. Larger organs could explain the high BMR groups, along with their higher daily energy needs. Analyses such as these demonstrate the physiological adaptations to environmental changes that animals undergo.

Energy metabolism is subject to the scaling of an animal and can be overcome by an individual's body design. The metabolic scope for an animal is the ratio of resting and maximum rate of metabolism for that particular species, as determined by oxygen consumption (VO2) and maximum oxygen consumption (VO2 max). Oxygen consumption in species that differ in body size and organ system dimensions shows a similarity in their charted VO2 distributions, indicating that, despite the complexity of their systems, there is a power law dependence of similarity; therefore, universal patterns are observed across diverse animal taxa.

Across a broad range of species, allometric relations are not necessarily linear on a log-log scale. For example, the maximal running speeds of mammals show a complicated relationship with body mass, and the fastest sprinters are of intermediate body size.

Allometric muscle characteristics

The muscle characteristics of animals are similar in a wide range of animal sizes, though muscle sizes and shapes can and often do vary depending on environmental constraints placed on them. The muscle tissue itself maintains its contractile characteristics and does not vary depending on the size of the animal. Physiological scaling in muscles affects the number of muscle fibers and their intrinsic speed to determine the maximum power and efficiency of movement in a given animal. The speed of muscle recruitment varies roughly in inverse proportion to the cube root of the animal’s weight (compare the intrinsic frequency of the sparrow’s flight muscle to that of a stork).
{\displaystyle \mathrm {frequency} \propto {\frac {1}{\mathrm {mass} ^{1/3}}}}
For inter-species allometric relations related to such ecological variables as maximal reproduction rate, attempts have been made to explain scaling within the context of dynamic energy budget theory and the metabolic theory of ecology. However, such ideas have been less successful.

Allometry of legged locomotion

Methods of study

Allometry has been used to study patterns in locomotive principles across a broad range of species. Such research has been done in pursuit of a better understanding of animal locomotion, including the factors that different gaits seek to optimize. Allometric trends observed in extant animals have even been combined with evolutionary algorithms to form realistic hypotheses concerning the locomotive patterns of extinct species. These studies have been made possible by the remarkable similarities among disparate species’ locomotive kinematics and dynamics, “despite differences in morphology and size”.

Allometric study of locomotion involves the analysis of the relative sizes, masses, and limb structures of similarly shaped animals and how these features affect their movements at different speeds. Patterns are identified based on dimensionless Froude numbers, which incorporate measures of animals’ leg lengths, speed or stride frequency, and weight.

Alexander incorporates Froude-number analysis into his “dynamic similarity hypothesis” of gait patterns. Dynamically similar gaits are those between which there are constant coefficients that can relate linear dimensions, time intervals, and forces. In other words, given a mathematical description of gait A and these three coefficients, one could produce gait B, and vice versa. The hypothesis itself is as follows: “animals of different sizes tend to move in dynamically similar fashion whenever the ratio of their speed allows it.” While the dynamic similarity hypothesis may not be a truly unifying principle of animal gait patterns, it is a remarkably accurate heuristic.
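The text does not spell out the Froude number, so the sketch below assumes the definition commonly used in gait studies, Fr = v^2/(gL), with v the speed, L the leg length and g the gravitational acceleration. The function names and the two animals are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def froude_number(speed, leg_length, g=G):
    """Dimensionless Froude number v^2 / (g * L), the assumed definition."""
    return speed ** 2 / (g * leg_length)

def dynamically_similar_speed(speed_a, leg_a, leg_b):
    """Speed at which animal B has the same Froude number as animal A."""
    return speed_a * math.sqrt(leg_b / leg_a)

# Hypothetical animals: leg lengths 0.25 m and 1.0 m, the smaller moving at 1.5 m/s
speed_b = dynamically_similar_speed(1.5, 0.25, 1.0)
print(speed_b)
print(froude_number(1.5, 0.25), froude_number(speed_b, 1.0))
```

Under this definition, equal Froude numbers mean that speed grows with the square root of leg length, so an animal with legs four times as long moves twice as fast in the dynamically similar gait.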

It has also been shown that living organisms of all shapes and sizes utilize spring mechanisms in their locomotive systems, probably in order to minimize the energy cost of locomotion. The allometric study of these systems has fostered a better understanding of why spring mechanisms are so common, how limb compliance varies with body size and speed, and how these mechanisms affect general limb kinematics and dynamics.

Principles of legged locomotion identified through allometry

  • Alexander found that animals of different sizes and masses traveling with the same Froude number consistently exhibit similar gait patterns.
  • Duty factors—percentages of a stride during which a foot maintains contact with the ground—remain relatively constant for different animals moving with the same Froude number.
  • The dynamic similarity hypothesis states that "animals of different sizes tend to move in dynamically similar fashion whenever the ratio of their speed allows it".
  • Body mass has even more of an effect than speed on limb dynamics.
  • Leg stiffness, {\displaystyle k_{\text{leg}}={\frac {\text{peak force}}{\text{peak displacement}}}}, is proportional to M^{0.67}, where M is body mass.
  • Peak force experienced throughout a stride is proportional to M^{0.97}.
  • The amount by which a leg shortens during a stride (i.e. its peak displacement) is proportional to M^{0.30}.
  • The angle swept by a leg during a stride is proportional to M^{-0.034}.
  • The mass-specific work rate of a limb is proportional to M^{0.11}.
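The exponents listed above for peak force, peak displacement and leg stiffness are mutually consistent, since stiffness is the ratio of the first two. A small Python check, with hypothetical names and an arbitrary thousandfold difference in mass:

```python
FORCE_EXPONENT = 0.97         # peak force ~ M^0.97
DISPLACEMENT_EXPONENT = 0.30  # peak displacement ~ M^0.30

def scale_factor(mass_ratio, exponent):
    """Factor by which a quantity ~ M^exponent changes for a given mass ratio."""
    return mass_ratio ** exponent

# k_leg = peak force / peak displacement, so the exponents subtract.
stiffness_exponent = FORCE_EXPONENT - DISPLACEMENT_EXPONENT

mass_ratio = 1000.0  # hypothetical: an animal 1000 times heavier
force_ratio = scale_factor(mass_ratio, FORCE_EXPONENT)
displacement_ratio = scale_factor(mass_ratio, DISPLACEMENT_EXPONENT)
stiffness_ratio = force_ratio / displacement_ratio

print(f"stiffness exponent = {stiffness_exponent:.2f}")
print(f"force x{force_ratio:.0f}, displacement x{displacement_ratio:.1f}, "
      f"stiffness x{stiffness_ratio:.0f}")
```

Dividing a quantity that scales as M^{0.97} by one that scales as M^{0.30} gives M^{0.67}, matching the reported stiffness exponent.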

Drug dose scaling

The physiological effect of drugs and other substances in many cases scales allometrically.

West, Brown, and Enquist in 1997 derived a hydrodynamic theory to explain the universal fact that metabolic rate scales as the ¾ power with body weight. They also showed why lifespan scales as the +¼ power and heart rate as the -¼ power. Blood flow (+¾) and resistance (-¾) scale in the same way, leading to blood pressure being constant across species.

Hu and Hayton in 2001 discussed whether the basal metabolic rate scale is a ⅔ or ¾ power of body mass. The exponent of ¾ might be used for substances that are eliminated mainly by metabolism, or by metabolism and excretion combined, while ⅔ might apply for drugs that are eliminated mainly by renal excretion.

An online allometric scaler of drug doses based on the above work is available.

The US Food and Drug Administration (FDA) published guidance in 2005 giving a flow chart that presents the decisions and calculations used to generate the maximum recommended starting dose in drug clinical trials from animal data.
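To show how much the choice between the ⅔ and ¾ exponents matters, the following Python sketch scales a total quantity between two body masses. It assumes the quantity scales as body mass raised to the chosen exponent, which is one simple reading of allometric scaling that the text does not spell out. It is not the FDA procedure mentioned above, and the function name and numbers are hypothetical.

```python
def scale_dose(dose_source, mass_source_kg, mass_target_kg, exponent=0.75):
    """Scale a total dose between body masses, assuming dose ~ M^exponent."""
    return dose_source * (mass_target_kg / mass_source_kg) ** exponent

# Hypothetical numbers: 1.0 unit in a 0.25 kg animal, scaled to a 70 kg body mass.
mass_ratio = 70.0 / 0.25
for label, b in [("3/4", 3 / 4), ("2/3", 2 / 3)]:
    total = scale_dose(1.0, 0.25, 70.0, b)
    print(f"exponent {label}: total x{total:.1f}, per kg x{total / mass_ratio:.2f}")
```

With either exponent the total grows more slowly than body mass, so the amount per kilogram falls in the larger animal, and the ⅔ exponent gives the smaller scaled total.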

Allometric scaling in fluid locomotion

The mass and density of an organism have a large effect on the organism's locomotion through a fluid. For example, a tiny organism such as a bacterium uses flagella and can move effectively through the fluid in which it is suspended. At the other end of the scale, a blue whale is much more massive and dense in comparison with the viscosity of the fluid than a bacterium in the same medium. The way in which the fluid interacts with the external boundaries of the organism is important for locomotion through the fluid. For streamlined swimmers, the resistance or drag determines the performance of the organism. This drag or resistance can be seen in two distinct flow patterns. In laminar flow, the fluid is relatively uninterrupted after the organism moves through it. Turbulent flow is the opposite: the fluid moves roughly around an organism, creating vortices that absorb energy from the propulsion or momentum of the organism. Scaling also affects locomotion through a fluid because of the energy needed to propel an organism and to keep up velocity through momentum. The rate of oxygen consumption per gram body size decreases consistently with increasing body size.

In general, smaller, more streamlined organisms create laminar flow (R < 0.5×10^6), whereas larger, less streamlined organisms produce turbulent flow (R > 2.0×10^6). Also, an increase in velocity (V) increases turbulence, which can be shown using the Reynolds equation. In nature, however, organisms such as a 6′6″ dolphin moving at 15 knots do not have the appropriate Reynolds number for laminar flow (R = 10^7), yet exhibit it. Mr. G. A. Steven observed and documented dolphins moving at 15 knots alongside his ship leaving a single trail of light when phosphorescent activity in the sea was high. The factors that contribute are:
  • Surface area of the organism and its effect on the fluid in which the organism lives is very important in determining the parameters of locomotion.
  • The velocity of an organism through fluid changes the dynamic of the flow around that organism, and as velocity increases the shape of the organism becomes more important for laminar flow.
  • Density and viscosity of fluid.
  • Length of the organism is factored into the equation because the surface area of just the front 2/3 of the organism has an effect on the drag.
The resistance to the motion of an approximately streamlined solid through a fluid can be expressed by the formula:
C_f ρ (total surface) V^2 / 2
where
V = velocity
ρ = density of fluid
C_f = 1.33R − 1 (laminar flow)
R = Reynolds number = VL/ν
L = axial length of organism
ν = kinematic viscosity (viscosity/density)
Notable Reynolds numbers:
R < 0.5×10^6 = laminar flow threshold
R > 2.0×10^6 = turbulent flow threshold
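The dolphin example can be checked with a short Python sketch of R = VL/ν. The kinematic viscosity of water is an assumed round value of 1.0×10^-6 m^2/s, the label for the range between the two thresholds is this example's own, and the function names are hypothetical.

```python
FOOT_M = 0.3048          # metres per foot
KNOT_M_S = 1852 / 3600   # one knot in metres per second

def reynolds_number(velocity, length, kinematic_viscosity):
    """Reynolds number R = V * L / nu."""
    return velocity * length / kinematic_viscosity

def flow_regime(r):
    """Classify a Reynolds number against the two thresholds listed above."""
    if r < 0.5e6:
        return "laminar"
    if r > 2.0e6:
        return "turbulent"
    return "between thresholds"  # label chosen for this sketch

length = 6.5 * FOOT_M     # a 6 ft 6 in dolphin
velocity = 15 * KNOT_M_S  # 15 knots
nu_water = 1.0e-6         # assumed kinematic viscosity of water, m^2/s

r = reynolds_number(velocity, length, nu_water)
print(f"R = {r:.2e} -> {flow_regime(r)}")
```

Under these assumptions R comes out on the order of 10^7, well above the turbulent threshold, which is why laminar flow around the dolphin is unexpected.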
Scaling also has an effect on the performance of organisms in fluid. This is extremely important for marine mammals and other marine organisms that rely on atmospheric oxygen to survive and carry out respiration. This can affect how fast an organism can propel itself efficiently and more importantly how long it can dive, or how long and how deep an organism can stay underwater. Heart mass and lung volume are important in determining how scaling can affect metabolic function and efficiency. Aquatic mammals, like other mammals, have the same size heart proportional to their bodies.

Mammals have a heart that is about 0.6% of the total body mass across the board, from a small mouse to a large blue whale. It can be expressed as: Heart Weight = 0.006M_b^{1.0}, where M_b is the body mass of the individual. Lung volume is also directly related to body mass in mammals (slope = 1.02). The lung has a volume of 63 ml for every kg of body mass. In addition, the tidal volume at rest in an individual is 1/10 the lung volume. Respiration costs with respect to oxygen consumption also scale in the order of M_b^{0.75}. This shows that mammals, regardless of size, have respiratory and cardiovascular systems of the same relative size and in turn have the same relative amount of blood: about 5.5% of body mass. This means that for similarly designed marine mammals, the larger the individual, the more efficiently it can travel compared to a smaller individual. It takes the same effort to move one body length whether the individual is one meter or ten meters. This can explain why large whales can migrate long distances in the oceans without stopping for rest. It is metabolically less expensive to be larger in body size. This goes for terrestrial and flying animals as well. In fact, for an organism to move any distance, regardless of type, from elephants to centipedes, smaller animals consume more oxygen per unit body mass than larger ones. This metabolic advantage that larger animals have makes it possible for larger marine mammals to dive for longer durations of time than their smaller counterparts. That the heart rate is lower means that larger animals can carry more blood, which carries more oxygen. In conjunction with the fact that mammals' respiration costs scale in the order of M_b^{0.75}, this shows how an advantage can be had in having a larger body mass. More simply, a larger whale can hold more oxygen and at the same time demand less metabolically than a smaller whale.
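The proportions quoted above can be collected into a small Python sketch. It treats lung volume as exactly proportional to body mass (ignoring the reported slope of 1.02), assumes body mass in kilograms, and uses a hypothetical function name.

```python
def mammal_estimates(body_mass_kg):
    """Rough organ sizes from the proportions quoted in the text."""
    lung_volume_ml = 63 * body_mass_kg           # 63 ml of lung per kg of body mass
    return {
        "heart_mass_kg": 0.006 * body_mass_kg,   # heart is about 0.6% of body mass
        "lung_volume_ml": lung_volume_ml,
        "tidal_volume_ml": lung_volume_ml / 10,  # resting tidal volume is 1/10 of lung volume
        "blood_mass_kg": 0.055 * body_mass_kg,   # blood is about 5.5% of body mass
    }

for mass in (0.02, 70, 100000):  # illustrative body masses in kg
    print(mass, mammal_estimates(mass))
```

Every entry is a fixed fraction of body mass, so these organs scale isometrically, while the oxygen cost of respiration, scaling as M_b^{0.75}, grows more slowly than the oxygen store.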

Traveling long distances and making deep dives require a combination of good stamina and moving at an efficient speed and in an efficient way to create laminar flow, reducing drag and turbulence. In sea water, long-distance travel by large mammals, such as whales, is facilitated by their neutral buoyancy, as their mass is completely supported by the density of the sea water. On land, animals have to expend a portion of their energy during locomotion to fight the effects of gravity.
Flying organisms such as birds are also considered to be moving through a fluid. In scaling birds of similar shape, it has also been seen that larger individuals have less metabolic cost per kg than smaller species, which would be expected because it holds true for every other form of animal. Birds also have a variance in wing beat frequency. Even with the compensation of larger wings per unit body mass, larger birds also have a slower wing beat frequency, which allows larger birds to fly at higher altitudes, longer distances, and faster absolute speeds than smaller birds. Because of the dynamics of lift-based locomotion and the fluid dynamics, birds have a U-shaped curve for metabolic cost and velocity: flight in air is metabolically more costly at the lowest and the highest velocities. At the other end of the size range, small organisms such as insects can gain an advantage from the viscosity of the fluid (air) that they are moving in. A perfectly timed wing beat can effectively take up energy from the previous stroke (Dickinson 2000). This form of wake capture allows an organism to recycle energy from the fluid or vortices within that fluid created by the organism itself. This same sort of wake capture occurs in aquatic organisms as well, and for organisms of all sizes. This dynamic of fluid locomotion allows smaller organisms to gain an advantage because the effect on them from the fluid is much greater because of their relatively smaller size.

Allometric engineering

Allometric engineering is a method for manipulating allometric relationships within or among groups.

In characteristics of a city

Arguing that there are a number of analogous concepts and mechanisms between cities and biological entities, Bettencourt et al. showed a number of scaling relationships between observable properties of a city and the city size. GDP, "supercreative" employment, number of inventors, crime, spread of disease, and even pedestrian walking speeds scale with city population.

Examples

Some examples of allometric laws:
  • Kleiber's law, metabolic rate q_0 is proportional to body mass M raised to the 3/4 power:
q_{0}\sim M^{\frac {3}{4}}
  • breathing and heart rate t are both inversely proportional to body mass M raised to the 1/4 power:
{\displaystyle t\sim M^{-{\frac {1}{4}}}}
  • mass transfer contact area A and body mass M:
A\sim M^{\frac {7}{8}}
  • the proportionality between the optimal cruising speed V_{opt} of flying bodies (insects, birds, airplanes) and body mass M raised to the power 1/6:
{\displaystyle V_{\text{opt}}\sim M^{\frac {1}{6}}}

Determinants of size in different species

Many factors go into the determination of body mass and size for a given animal. These factors often affect body size on an evolutionary scale, but conditions such as availability of food and habitat size can act much more quickly on a species. Other examples include the following:
  • Physiological design
Basic physiological design plays a role in the size of a given species. For example, animals with a closed circulatory system are larger than animals with open or no circulatory systems.
  • Mechanical design
Mechanical design can also determine the maximum allowable size for a species. Animals with tubular endoskeletons tend to be larger than animals with exoskeletons or hydrostatic skeletons.
  • Habitat
An animal’s habitat throughout its evolution is one of the largest determining factors in its size. On land, there is a positive correlation between body mass of the top species in the area and available land area. However, there is a much greater number of “small” species in any given area. This is most likely determined by ecological conditions, evolutionary factors, and the availability of food; a small population of large predators depends on a much greater population of small prey to survive. In an aquatic environment, the largest animals can grow to have a much greater body mass than land animals, for which gravitational weight constraints are a factor.

Power law

From Wikipedia, the free encyclopedia

An example power-law graph, being used to demonstrate ranking of popularity. To the right is the long tail, and to the left are the few that dominate (also known as the 80–20 rule).

In statistics, a power law is a functional relationship between two quantities, where a relative change in one quantity results in a proportional relative change in the other quantity, independent of the initial size of those quantities: one quantity varies as a power of another. For instance, considering the area of a square in terms of the length of its side, if the length is doubled, the area is multiplied by a factor of four.

Empirical examples

The distributions of a wide variety of physical, biological, and man-made phenomena approximately follow a power law over a wide range of magnitudes: these include the sizes of craters on the moon and of solar flares, the foraging pattern of various species, the sizes of activity patterns of neuronal populations, the frequencies of words in most languages, frequencies of family names, the species richness in clades of organisms, the sizes of power outages, criminal charges per convict, volcanic eruptions, human judgements of stimulus intensity and many other quantities. Few empirical distributions fit a power law for all their values, but rather follow a power law in the tail. Acoustic attenuation follows frequency power-laws within wide frequency bands for many complex media. Allometric scaling laws for relationships between biological variables are among the best known power-law functions in nature.

Properties

Scale invariance

One attribute of power laws is their scale invariance. Given a relation f(x)=ax^{-k}, scaling the argument x by a constant factor c causes only a proportionate scaling of the function itself. That is,
{\displaystyle f(cx)=a(cx)^{-k}=c^{-k}f(x)\propto f(x),\!}
where \propto denotes direct proportionality. That is, scaling by a constant c simply multiplies the original power-law relation by the constant c^{-k}. Thus, it follows that all power laws with a particular scaling exponent are equivalent up to constant factors, since each is simply a scaled version of the others. This behavior is what produces the linear relationship when logarithms are taken of both f(x) and x, and the straight line on the log–log plot is often called the signature of a power law. With real data, such straightness is a necessary, but not sufficient, condition for the data following a power-law relation. In fact, there are many ways to generate finite amounts of data that mimic this signature behavior, but, in their asymptotic limit, are not true power laws (e.g., if the generating process of some data follows a log-normal distribution). Thus, accurately fitting and validating power-law models is an active area of research in statistics; see below.
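Scale invariance is easy to verify numerically. In the Python sketch below the constants a = 3.0 and k = 1.5 are arbitrary example values and the function names are hypothetical.

```python
import math

def power_law(x, a=3.0, k=1.5):
    """f(x) = a * x^(-k) with arbitrary example constants."""
    return a * x ** (-k)

c = 10.0
k = 1.5

# Rescaling x by c multiplies f by the same factor c^(-k) at every x.
ratios = [power_law(c * x) / power_law(x) for x in (0.5, 2.0, 40.0, 1000.0)]
print(ratios, c ** (-k))

def loglog_slope(x1, x2):
    """Slope of log f against log x between two points."""
    return (math.log(power_law(x2)) - math.log(power_law(x1))) / (math.log(x2) - math.log(x1))

print(loglog_slope(1.0, 100.0))
```

The ratio f(cx)/f(x) does not depend on x, and the slope on logarithmic axes equals -k, which is the straight-line signature described above.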

Lack of well-defined average value

A power-law {\displaystyle x^{-k}} has a well-defined mean over {\displaystyle x\in [1,\infty )} only if {\displaystyle k>2}, and it has a finite variance only if {\displaystyle k>3}; most identified power laws in nature have exponents such that the mean is well-defined but the variance is not, implying they are capable of black swan behavior. This can be seen in the following thought experiment: imagine a room with your friends and estimate the average monthly income in the room. Now imagine the world's richest person entering the room, with a monthly income of about 1 billion US$. What happens to the average income in the room? Income is distributed according to a power-law known as the Pareto distribution (for example, the net worth of Americans is distributed according to a power law with an exponent of 2).

On the one hand, this makes it incorrect to apply traditional statistics that are based on variance and standard deviation (such as regression analysis). On the other hand, this also allows for cost-efficient interventions. For example, given that car exhaust is distributed according to a power-law among cars (very few cars contribute to most contamination) it would be sufficient to eliminate those very few cars from the road to reduce total exhaust substantially.

The median does exist, however: for a power law x^{-k}, with exponent k > 1, it takes the value 2^{1/(k-1)} x_\min, where x_\min is the minimum value for which the power law holds.
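The median formula can be verified against the survival function of the distribution, which for a density proportional to x^{-k} on x ≥ x_min is (x/x_min)^{-(k-1)}. The sketch below assumes that form; the function names are illustrative.

```python
def power_law_median(k, x_min=1.0):
    """Median 2**(1/(k-1)) * x_min of a density proportional to x**(-k), x >= x_min."""
    if k <= 1.0:
        raise ValueError("the density is only normalizable for k > 1")
    return 2.0 ** (1.0 / (k - 1.0)) * x_min


def survival(x, k, x_min=1.0):
    """P(X > x) = (x / x_min)**(-(k - 1)) for x >= x_min."""
    return (x / x_min) ** (-(k - 1.0))


# At the median, exactly half of the probability mass lies above x.
m = power_law_median(2.5, x_min=3.0)
print(m, survival(m, 2.5, x_min=3.0))
```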

Universality

The equivalence of power laws with a particular scaling exponent can have a deeper origin in the dynamical processes that generate the power-law relation. In physics, for example, phase transitions in thermodynamic systems are associated with the emergence of power-law distributions of certain quantities, whose exponents are referred to as the critical exponents of the system. Diverse systems with the same critical exponents (that is, systems which display identical scaling behaviour as they approach criticality) can be shown, via renormalization group theory, to share the same fundamental dynamics. For instance, the behavior of water and CO2 at their boiling points falls in the same universality class because they have identical critical exponents. In fact, almost all material phase transitions are described by a small set of universality classes. Similar observations have been made, though not as comprehensively, for various self-organized critical systems, where the critical point of the system is an attractor. Formally, this sharing of dynamics is referred to as universality, and systems with precisely the same critical exponents are said to belong to the same universality class.

Power-law functions

Scientific interest in power-law relations stems partly from the ease with which certain general classes of mechanisms generate them. The demonstration of a power-law relation in some data can point to specific kinds of mechanisms that might underlie the natural phenomenon in question, and can indicate a deep connection with other, seemingly unrelated systems; see also universality above. The ubiquity of power-law relations in physics is partly due to dimensional constraints, while in complex systems, power laws are often thought to be signatures of hierarchy or of specific stochastic processes. A few notable examples of power laws are Pareto's law of income distribution, structural self-similarity of fractals, and scaling laws in biological systems. Research on the origins of power-law relations, and efforts to observe and validate them in the real world, are active in many fields of science, including physics, computer science, linguistics, geophysics, neuroscience, sociology, economics and more.

However, much of the recent interest in power laws comes from the study of probability distributions: the distributions of a wide variety of quantities seem to follow the power-law form, at least in their upper tail (large events). The behavior of these large events connects these quantities to the theory of large deviations (also called extreme value theory), which considers the frequency of extremely rare events like stock market crashes and large natural disasters. It is primarily in the study of statistical distributions that the name "power law" is used.

In empirical contexts, an approximation to a power-law o(x^k) often includes a deviation term \varepsilon , which can represent uncertainty in the observed values (perhaps measurement or sampling errors) or provide a simple way for observations to deviate from the power-law function (perhaps for stochastic reasons):
y = ax^k + \varepsilon.\!
Mathematically, a strict power law cannot be a probability distribution, but a distribution that is a truncated power function is possible: p(x) = C x^{-\alpha} for x > x_\text{min}. Here the exponent \alpha (Greek letter alpha, not to be confused with the scaling factor a used above) must be greater than 1, since otherwise the tail has infinite area; the minimum value x_\text{min} is needed because otherwise the distribution has infinite area as x approaches 0; and the constant C is a scaling factor that ensures the total area is 1, as required of a probability distribution. More often one uses an asymptotic power law, one that is only true in the limit; see power-law probability distributions below for details. Typically the exponent falls in the range 2 < \alpha < 3, though not always.
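The value of the scaling factor C follows from the requirement that the total area be 1. A short derivation, assuming the truncated form p(x) = C x^{-\alpha} on x > x_min with \alpha > 1:

```latex
\begin{aligned}
1 &= \int_{x_\text{min}}^{\infty} C\,x^{-\alpha}\,\mathrm{d}x
   = C\left[\frac{x^{1-\alpha}}{1-\alpha}\right]_{x_\text{min}}^{\infty}
   = \frac{C\,x_\text{min}^{1-\alpha}}{\alpha-1}
   \qquad (\alpha > 1), \\
C &= (\alpha-1)\,x_\text{min}^{\alpha-1}.
\end{aligned}
```

For \alpha \leq 1 the upper limit of the integral diverges, and for x_min = 0 the lower limit diverges; these are the two conditions stated in the text.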

Examples

More than a hundred power-law distributions have been identified in physics (e.g. sandpile avalanches), biology (e.g. species extinction and body mass), and the social sciences (e.g. city sizes and income).

Variants

Broken power law

Some models of the initial mass function use a broken power law; here Kroupa (2001) in red.

A broken power law is a piecewise function, consisting of two or more power laws, combined with a threshold. For example, with two power laws:
f(x) \propto x^{\alpha_1} for x<x_\text{th},
f(x) \propto x^{\alpha_1-\alpha_2}_\text{th}x^{\alpha_2}\text{ for } x>x_\text{th}.
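The factor x_th^{\alpha_1-\alpha_2} in the second branch is what makes the two pieces meet at the threshold. The following sketch evaluates the two-piece form with the proportionality constant set to 1; the exponents and threshold in the usage lines are arbitrary illustrative values.

```python
def broken_power_law(x, alpha1, alpha2, x_th):
    """Two-piece broken power law with unit proportionality constant.

    Below the threshold the function is x**alpha1; above it, the prefactor
    x_th**(alpha1 - alpha2) keeps the function continuous at x = x_th.
    """
    if x < x_th:
        return x ** alpha1
    return x_th ** (alpha1 - alpha2) * x ** alpha2


# Values just below and just above the threshold should nearly coincide.
x_th = 0.5
print(broken_power_law(x_th - 1e-9, -1.3, -2.3, x_th))
print(broken_power_law(x_th + 1e-9, -1.3, -2.3, x_th))
```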

Power law with exponential cutoff

A power law with an exponential cutoff is simply a power law multiplied by an exponential function:
f(x) \propto x^{\alpha}e^{\beta x}.

Curved power law

f(x) \propto x^{\alpha + \beta x}

Power-law probability distributions

In a looser sense, a power-law probability distribution is a distribution whose density function (or mass function in the discrete case) has the form, for large values of x,
p(x) \sim L(x) x^{-\alpha}
where \alpha > 1, and L(x) is a slowly varying function, which is any function that satisfies \lim_{x\rightarrow \infty} L(r\,x)/L(x)=1 for any positive factor r. This property of L(x) follows directly from the requirement that p(x) be asymptotically scale invariant; thus, the form of L(x) only controls the shape and finite extent of the lower tail. For instance, if L(x) is the constant function, then we have a power law that holds for all values of x. In many cases, it is convenient to assume a lower bound x_{\mathrm{min}} from which the law holds. Combining these two cases, and where x is a continuous variable, the power law has the form
p(x) = \frac{\alpha-1}{x_\min} \left(\frac{x}{x_\min}\right)^{-\alpha},
where the pre-factor \frac{\alpha-1}{x_\min} is the normalizing constant. We can now consider several properties of this distribution. For instance, its moments are given by
\langle x^{m} \rangle = \int_{x_\min}^\infty x^{m} p(x) \,\mathrm{d}x = \frac{\alpha-1}{\alpha-1-m}x_\min^m
which is only well defined for m < \alpha -1. That is, all moments m \geq \alpha - 1 diverge: when {\displaystyle \alpha \leq 2}, the average and all higher-order moments are infinite; when 2<\alpha<3, the mean exists, but the variance and higher-order moments are infinite, etc. For finite-size samples drawn from such distribution, this behavior implies that the central moment estimators (like the mean and the variance) for diverging moments will never converge – as more data is accumulated, they continue to grow. These power-law probability distributions are also called Pareto-type distributions, distributions with Pareto tails, or distributions with regularly varying tails.
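The moment formula and its divergence condition can be written as a small helper. This is a sketch that assumes the continuous density p(x) = \frac{\alpha-1}{x_\min}(x/x_\min)^{-\alpha} given above; the function name is illustrative.

```python
import math


def pareto_moment(m, alpha, x_min=1.0):
    """Return <x**m> = (alpha - 1) / (alpha - 1 - m) * x_min**m.

    The integral only converges for m < alpha - 1; otherwise the moment
    is infinite and math.inf is returned.
    """
    if m >= alpha - 1.0:
        return math.inf
    return (alpha - 1.0) / (alpha - 1.0 - m) * x_min ** m


# With 2 < alpha < 3 the mean (m = 1) is finite but the second moment is not.
print(pareto_moment(1, 2.5), pareto_moment(2, 2.5))
```

Setting m = 0 returns 1, which confirms that the density is normalized.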

A modification, which does not satisfy the general form above, with an exponential cutoff, is
p(x) \propto L(x) x^{-\alpha} \mathrm{e}^{-\lambda x}.
In this distribution, the exponential decay term \mathrm{e}^{-\lambda x} eventually overwhelms the power-law behavior at very large values of x. This distribution does not scale and is thus not asymptotically a power law; however, it does approximately scale over a finite region before the cutoff. (Note that the pure form above is a subset of this family, with \lambda =0.) This distribution is a common alternative to the asymptotic power-law distribution because it naturally captures finite-size effects.
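The extent of the region of approximate scaling can be read off from the local slope on doubly logarithmic axes. Assuming for simplicity that L(x) is constant:

```latex
\ln p(x) = \text{const} - \alpha \ln x - \lambda x
\quad\Longrightarrow\quad
\frac{\mathrm{d}\ln p}{\mathrm{d}\ln x} = -\alpha - \lambda x .
```

For x \ll \alpha/\lambda the slope is close to -\alpha, as for a pure power law, while for x \gg \alpha/\lambda the term \lambda x dominates and the curve bends downward without limit.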

The Tweedie distributions are a family of statistical models characterized by closure under additive and reproductive convolution as well as under scale transformation. Consequently, these models all express a power-law relationship between the variance and the mean. These models have a fundamental role as foci of mathematical convergence similar to the role that the normal distribution has as a focus in the central limit theorem. This convergence effect explains why the variance-to-mean power law manifests so widely in natural processes, as with Taylor's law in ecology and with fluctuation scaling in physics. It can also be shown that this variance-to-mean power law, when demonstrated by the method of expanding bins, implies the presence of 1/f noise and that 1/f noise can arise as a consequence of this Tweedie convergence effect.

Graphical methods for identification

Although more sophisticated and robust methods have been proposed, the most frequently used graphical methods of identifying power-law probability distributions using random samples are Pareto quantile-quantile plots (or Pareto Q-Q plots), mean residual life plots and log–log plots. Another, more robust graphical method uses bundles of residual quantile functions. (Recall that power-law distributions are also called Pareto-type distributions.) It is assumed here that a random sample is obtained from a probability distribution, and that we want to know if the tail of the distribution follows a power law (in other words, we want to know if the distribution has a "Pareto tail"). Here, the random sample is called "the data".

Pareto Q-Q plots compare the quantiles of the log-transformed data to the corresponding quantiles of an exponential distribution with mean 1 (or to the quantiles of a standard Pareto distribution) by plotting the former versus the latter. If the resultant scatterplot suggests that the plotted points "asymptotically converge" to a straight line, then a power-law distribution should be suspected. A limitation of Pareto Q-Q plots is that they behave poorly when the tail index \alpha (also called Pareto index) is close to 0, because Pareto Q-Q plots are not designed to identify distributions with slowly varying tails.
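The coordinates of a Pareto Q-Q plot can be computed as follows. This is a minimal sketch under two assumptions that are not in the text: the plotting positions i/(n+1) for the exponential quantiles, and a known lower bound x_min by which the data are rescaled before taking logarithms. The function names are illustrative.

```python
import math


def pareto_qq_points(data, x_min):
    """Return (exponential quantile, log-data quantile) pairs for a Pareto Q-Q plot."""
    tail = sorted(x for x in data if x >= x_min)
    n = len(tail)
    points = []
    for i, x in enumerate(tail, start=1):
        # Quantile of the mean-1 exponential at the assumed plotting position i/(n+1).
        exp_quantile = -math.log(1.0 - i / (n + 1.0))
        points.append((exp_quantile, math.log(x / x_min)))
    return points


def slope_through_origin(points):
    """Least-squares slope of a line through the origin fitted to the points."""
    sxy = sum(q * y for q, y in points)
    sqq = sum(q * q for q, _ in points)
    return sxy / sqq
```

If the density of the data is proportional to x^{-\alpha} above x_min, the logarithm of x/x_min is exponentially distributed with mean 1/(\alpha-1), so the points should scatter around a straight line through the origin with that slope.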

On the other hand, in its version for identifying power-law probability distributions, the mean residual life plot consists of first log-transforming the data, and then plotting the average of those log-transformed data that are higher than the i-th order statistic versus the i-th order statistic, for i = 1, ..., n, where n is the size of the random sample. If the resultant scatterplot suggests that the plotted points tend to "stabilize" about a horizontal straight line, then a power-law distribution should be suspected. Since the mean residual life plot is very sensitive to outliers (it is not robust), it usually produces plots that are difficult to interpret; for this reason, such plots are usually called Hill horror plots.

A straight line on a log–log plot is necessary but insufficient evidence for a power law; the slope of the straight line corresponds to the power-law exponent.

Log–log plots are an alternative way of graphically examining the tail of a distribution using a random sample. Caution has to be exercised, however: a straight line on a log–log plot is necessary but insufficient evidence for a power-law relationship, as many non-power-law distributions will also appear as straight lines on a log–log plot. This method consists of plotting the logarithm of an estimator of the probability that a particular number of the distribution occurs versus the logarithm of that particular number. Usually, this estimator is the proportion of times that the number occurs in the data set. If the points in the plot tend to "converge" to a straight line for large numbers in the x axis, then the researcher concludes that the distribution has a power-law tail. Examples of the application of these types of plot have been published. A disadvantage of these plots is that, in order for them to provide reliable results, they require huge amounts of data. In addition, they are appropriate only for discrete (or grouped) data.

Another graphical method for the identification of power-law probability distributions using random samples has been proposed. This methodology consists of plotting a bundle for the log-transformed sample. Originally proposed as a tool to explore the existence of moments and the moment generation function using random samples, the bundle methodology is based on residual quantile functions (RQFs), also called residual percentile functions, which provide a full characterization of the tail behavior of many well-known probability distributions, including power-law distributions, distributions with other types of heavy tails, and even non-heavy-tailed distributions. Bundle plots do not have the disadvantages of Pareto Q-Q plots, mean residual life plots and log–log plots mentioned above (they are robust to outliers, allow visually identifying power laws with small values of \alpha , and do not demand the collection of much data). In addition, other types of tail behavior can be identified using bundle plots.

Plotting power-law distributions

In general, power-law distributions are plotted on doubly logarithmic axes, which emphasizes the upper tail region. The most convenient way to do this is via the (complementary) cumulative distribution (cdf), P(x) = \mathrm{Pr}(X > x),
P(x) = \Pr(X > x) = \int_x^\infty p(X)\,\mathrm{d}X = \frac{\alpha-1}{x_\min^{-\alpha+1}} \int_x^\infty X^{-\alpha}\,\mathrm{d}X = \left(\frac{x}{x_\min} \right)^{-\alpha+1}.
Note that the cdf is also a power-law function, but with a smaller scaling exponent. For data, an equivalent form of the cdf is the rank-frequency approach, in which we first sort the n observed values in ascending order, and plot them against the vector \left[1,\frac{n-1}{n},\frac{n-2}{n},\dots,\frac{1}{n}\right].
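The rank-frequency construction can be sketched in a few lines. The function name is illustrative, and the sketch only produces the coordinates; taking logarithms of both columns gives the points for the doubly logarithmic plot.

```python
def empirical_ccdf(data):
    """Pair each sorted value with the vector [1, (n-1)/n, ..., 1/n].

    The i-th smallest value (i = 0, 1, ...) is paired with (n - i) / n,
    the fraction of observations greater than or equal to it.
    """
    xs = sorted(data)
    n = len(xs)
    return [(x, (n - i) / n) for i, x in enumerate(xs)]


for x, p in empirical_ccdf([3, 1, 2, 10]):
    print(x, p)
```

Because the vector starts at 1, each value is effectively paired with the fraction of observations that are at least as large as itself, so no point has probability zero and every observation can be shown on logarithmic axes.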

Although it can be convenient to log-bin the data, or otherwise smooth the probability density (mass) function directly, these methods introduce an implicit bias in the representation of the data, and thus should be avoided. The cdf, on the other hand, is more robust to (but not free of) such biases in the data and preserves the linear signature on doubly logarithmic axes. Though a cdf representation is favored over that of the pdf when fitting a power law to the data with the linear least squares method, it is not devoid of mathematical inaccuracy. Thus, when estimating the exponent of a power-law distribution, the maximum likelihood estimator is recommended.

Estimating the exponent from empirical data

There are many ways of estimating the value of the scaling exponent for a power-law tail; however, not all of them yield unbiased and consistent answers. Some of the most reliable techniques are often based on the method of maximum likelihood. Alternative methods are often based on making a linear regression on either the log–log probability, the log–log cumulative distribution function, or on log-binned data, but these approaches should be avoided as they can all lead to highly biased estimates of the scaling exponent.

Maximum likelihood

For real-valued, independent and identically distributed data, we fit a power-law distribution of the form
p(x) = \frac{\alpha-1}{x_\min} \left(\frac{x}{x_\min}\right)^{-\alpha}
to the data x\geq x_\min, where the coefficient \frac{\alpha-1}{x_\min} is included to ensure that the distribution is normalized. Given a choice for x_\min, the log likelihood function becomes:
{\displaystyle {\mathcal {L}}(\alpha )=\log \prod _{i=1}^{n}{\frac {\alpha -1}{x_{\min }}}\left({\frac {x_{i}}{x_{\min }}}\right)^{-\alpha }}
The maximum of this likelihood is found by differentiating with respect to the parameter \alpha and setting the result equal to zero. Upon rearrangement, this yields the estimator equation:
\hat{\alpha} = 1 + n \left[ \sum_{i=1}^n \ln \frac{x_i}{x_\min} \right]^{-1}
where \{x_i\} are the n data points x_{i}\geq x_\min. This estimator exhibits a small finite sample-size bias of order O(n^{-1}), which is small when n > 100. Further, the standard error of the estimate is \sigma ={\frac  {{\hat  {\alpha }}-1}{{\sqrt  {n}}}}+O(n^{{-1}}). This estimator is equivalent to the popular Hill estimator from quantitative finance and extreme value theory.
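The continuous estimator and its leading-order standard error translate directly into code. The following is a minimal sketch, assuming x_min is already chosen and that observations below it are simply discarded; the function name is illustrative.

```python
import math


def fit_alpha_mle(data, x_min):
    """Continuous maximum likelihood estimate of the exponent.

    Returns (alpha_hat, std_err) with
        alpha_hat = 1 + n / sum(ln(x_i / x_min))
        std_err   = (alpha_hat - 1) / sqrt(n)   (leading-order term only)
    using only the observations with x_i >= x_min.
    """
    tail = [x for x in data if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    if n == 0 or log_sum <= 0.0:
        raise ValueError("need at least one observation strictly above x_min")
    alpha_hat = 1.0 + n / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(n)
    return alpha_hat, std_err
```

As a sanity check, if every observation equals e times x_min, each logarithm is 1 and the estimate is exactly 2.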

For a set of n integer-valued data points \{x_i\}, again where each x_i\geq x_\min, the maximum likelihood exponent is the solution to the transcendental equation
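\frac{\zeta'(\hat\alpha,x_\min)}{\zeta(\hat{\alpha},x_\min)} = -\frac{1}{n} \sum_{i=1}^n \ln x_i
where \zeta(\alpha,x_{\mathrm{min}}) = \sum_{k=0}^\infty (k+x_{\mathrm{min}})^{-\alpha} is the incomplete zeta function and the prime denotes differentiation with respect to \alpha. This follows from maximizing the log-likelihood of the discrete distribution p(x) = x^{-\alpha}/\zeta(\alpha,x_\min), which is -n\ln\zeta(\alpha,x_\min) - \alpha\sum_{i=1}^n \ln x_i. The uncertainty in this estimate follows the same formula as for the continuous equation. However, the two equations for \hat{\alpha} are not equivalent, and the continuous version should not be applied to discrete data, nor vice versa.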

Further, both of these estimators require the choice of x_\min. For distributions with a non-trivial L(x), choosing x_\min too small produces a significant bias in \hat{\alpha}, while choosing it too large increases the uncertainty in \hat{\alpha} and reduces the statistical power of the model. In general, the best choice of x_\min depends strongly on the particular form of the lower tail, represented by L(x) above.
More about these methods, and the conditions under which they can be used, can be found in a comprehensive review article, which also provides usable code (Matlab, Python, R and C++) for estimation and testing routines for power-law distributions.

Kolmogorov–Smirnov estimation

Another method for the estimation of the power-law exponent, which does not assume independent and identically distributed (iid) data, uses the minimization of the Kolmogorov–Smirnov statistic, D, between the cumulative distribution functions of the data and the power law:
\hat{\alpha} = \underset{\alpha}{\operatorname{arg\,min}} \, D_\alpha
with
 D_\alpha = \max_x | P_\mathrm{emp}(x) - P_\alpha(x) |
where P_\mathrm{emp}(x) and P_\alpha(x) denote the cdfs of the data and the power law with exponent \alpha , respectively. As this method does not assume iid data, it provides an alternative way to determine the power-law exponent for data sets in which the temporal correlation can not be ignored.
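A simple way to carry out this minimization is a search over a grid of candidate exponents. The grid search is an assumption of this sketch rather than part of the method as described, and the function names are illustrative. The model cdf used is 1 - (x/x_min)^{-(\alpha-1)}, the complement of the cumulative distribution given earlier.

```python
def ks_distance(data, alpha, x_min):
    """Kolmogorov-Smirnov distance between the data and a power law above x_min."""
    tail = sorted(x for x in data if x >= x_min)
    n = len(tail)
    if n == 0:
        raise ValueError("no observations at or above x_min")
    d = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (-(alpha - 1.0))
        # The empirical cdf jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return d


def fit_alpha_ks(data, x_min, alphas):
    """Return the candidate exponent in `alphas` with the smallest KS distance."""
    return min(alphas, key=lambda a: ks_distance(data, a, x_min))
```

A finer grid, or a one-dimensional numerical optimizer, could replace the list of candidates when more precision is needed.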

Two-point fitting method

The two-point fitting criterion can be applied for the estimation of the power-law exponent in the case of scale-free distributions and provides a more convergent estimate than the maximum likelihood method. It has been applied to study probability distributions of fracture apertures. In some contexts the probability distribution is described not by the cumulative distribution function but by the cumulative frequency of a property X, defined as the number of elements per meter (or area unit, second etc.) for which X > x applies, where x is a variable real number. As an example, the cumulative distribution of the fracture aperture, X, for a sample of N elements is defined as "the number of fractures per meter having aperture greater than x". Use of cumulative frequency has some advantages, e.g. it allows one to put on the same diagram data gathered from sample lines of different lengths at different scales (e.g. from outcrop and from microscope).

R function

The following function estimates the exponent in R, plotting the log–log data and the fitted line.

    pwrdist <- function(u, ...) {
        # u is a vector of positive integer event counts, e.g. how many
        # crimes a given perpetrator was charged with by the police
        fx <- table(u)                  # frequency of each observed count
        i <- as.numeric(names(fx))      # the distinct count values
        y <- rep(0, max(i))             # frequencies on the full grid 1..max(i)
        y[i] <- fx
        # quasi-Poisson regression with log link: the coefficient of
        # log(x) is the estimate of the power-law exponent
        m0 <- glm(y ~ log(1:max(i)), family = quasipoisson())
        print(summary(m0))
        sub <- paste("s=", round(m0$coef[2], 2), "lambda=", sum(u), "/", length(u))
        plot(i, as.vector(fx), log = "xy", xlab = "x", sub = sub, ylab = "counts", ...)
        grid()
        lines(1:max(i), fitted(m0), type = "b")   # fitted curve on the same axes
        return(m0)
    }

Validating power laws

Although power-law relations are attractive for many theoretical reasons, demonstrating that data does indeed follow a power-law relation requires more than simply fitting a particular model to the data. This is important for understanding the mechanism that gives rise to the distribution: superficially similar distributions may arise for significantly different reasons, and different models yield different predictions, such as extrapolation.

For example, log-normal distributions are often mistaken for power-law distributions: a data set drawn from a lognormal distribution will be approximately linear for large values (corresponding to the upper tail of the lognormal being close to a power law), but for small values the lognormal will drop off significantly (bowing down), corresponding to the lower tail of the lognormal being small (there are very few small values, rather than many small values in a power law).

For example, Gibrat's law about proportional growth processes produces distributions that are lognormal, although their log–log plots look linear over a limited range. An explanation of this is that although the logarithm of the lognormal density function is quadratic in log(x), yielding a "bowed" shape in a log–log plot, if the quadratic term is small relative to the linear term then the result can appear almost linear, and the lognormal behavior is only visible when the quadratic term dominates, which may require significantly more data. Therefore, a log–log plot that is slightly "bowed" downwards can reflect a log-normal distribution – not a power law.
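The quadratic and linear terms can be made explicit. Writing the lognormal density with parameters \mu and \sigma (the mean and standard deviation of \ln x) and substituting u = \ln x:

```latex
\begin{aligned}
\ln p(x) &= -\ln x - \ln\!\left(\sigma\sqrt{2\pi}\right) - \frac{(\ln x-\mu)^2}{2\sigma^2} \\
         &= -\frac{1}{2\sigma^2}\,u^2 + \left(\frac{\mu}{\sigma^2}-1\right)u
            - \frac{\mu^2}{2\sigma^2} - \ln\!\left(\sigma\sqrt{2\pi}\right).
\end{aligned}
```

The curvature is governed by 1/(2\sigma^2), so for large \sigma the quadratic term stays small over a wide range of u and the plot looks nearly straight, with an apparent slope close to \mu/\sigma^2 - 1.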

In general, many alternative functional forms can appear to follow a power-law form to some extent. Stumpf proposed plotting the empirical cumulative distribution function in the log-log domain and claimed that a candidate power law should cover at least two orders of magnitude. Also, researchers usually have to face the problem of deciding whether or not a real-world probability distribution follows a power law. As a solution to this problem, Diaz proposed a graphical methodology based on random samples that allows visually discerning between different types of tail behavior. This methodology uses bundles of residual quantile functions, also called percentile residual life functions, which characterize many different types of distribution tails, including both heavy and non-heavy tails. However, Stumpf claimed the need for both a statistical and a theoretical background in order to support a power law in the underlying mechanism driving the data generating process.

One method to validate a power-law relation tests many orthogonal predictions of a particular generative mechanism against data. Simply fitting a power-law relation to a particular kind of data is not considered a rational approach. As such, the validation of power-law claims remains a very active field of research in many areas of modern science.
