
Thursday, June 2, 2022

Arrow's impossibility theorem

From Wikipedia, the free encyclopedia

In social choice theory, Arrow's impossibility theorem, the general possibility theorem or Arrow's paradox is an impossibility theorem stating that when voters have three or more distinct alternatives (options), no ranked voting electoral system can convert the ranked preferences of individuals into a community-wide (complete and transitive) ranking while also meeting a specified set of criteria: unrestricted domain, non-dictatorship, Pareto efficiency, and independence of irrelevant alternatives. The theorem is often cited in discussions of voting theory as it is further interpreted by the Gibbard–Satterthwaite theorem. The theorem is named after economist and Nobel laureate Kenneth Arrow, who demonstrated the theorem in his doctoral thesis and popularized it in his 1951 book Social Choice and Individual Values. The original paper was titled "A Difficulty in the Concept of Social Welfare".

In short, the theorem states that no rank-order electoral system can be designed that always satisfies these three "fairness" criteria:

  • If every voter prefers alternative X over alternative Y, then the group prefers X over Y.
  • If every voter's preference between X and Y remains unchanged, then the group's preference between X and Y will also remain unchanged (even if voters' preferences between other pairs like X and Z, Y and Z, or Z and W change).
  • There is no "dictator": no single voter possesses the power to always determine the group's preference.

Cardinal voting electoral systems are not covered by the theorem, as they convey more information than rank orders. However, Gibbard's theorem shows that strategic voting remains a problem.

The axiomatic approach Arrow adopted can treat all conceivable rules (that are based on preferences) within one unified framework. In that sense, the approach is qualitatively different from the earlier one in voting theory, in which rules were investigated one by one. One can therefore say that the contemporary paradigm of social choice theory started from this theorem.

The practical consequences of the theorem are debatable: Arrow has said "Most systems are not going to work badly all of the time. All I proved is that all can work badly at times."

Statement

The need to aggregate preferences occurs in many disciplines: in welfare economics, where one attempts to find an economic outcome which would be acceptable and stable; in decision theory, where a person has to make a rational choice based on several criteria; and most naturally in electoral systems, which are mechanisms for extracting a governance-related decision from a multitude of voters' preferences.

The framework for Arrow's theorem assumes that we need to extract a preference order on a given set of options (outcomes). Each individual in the society (or equivalently, each decision criterion) gives a particular order of preferences on the set of outcomes. We are searching for a ranked voting electoral system, called a social welfare function (preference aggregation rule), which transforms the set of preferences (profile of preferences) into a single global societal preference order. Arrow's theorem says that if the decision-making body has at least two members and at least three options to decide among, then it is impossible to design a social welfare function that satisfies all these conditions (assumed to be a reasonable requirement of a fair electoral system) at once:

Non-dictatorship
The social welfare function should account for the wishes of multiple voters. It cannot simply mimic the preferences of a single voter.
Unrestricted domain, or universality
For any set of individual voter preferences, the social welfare function should yield a unique and complete ranking of societal choices. Thus:
  • It must do so in a manner that results in a complete ranking of preferences for society.
  • It must deterministically provide the same ranking each time voters' preferences are presented the same way.
Independence of irrelevant alternatives (IIA)
The social preference between x and y should depend only on the individual preferences between x and y (pairwise independence). More generally, changes in individuals' rankings of irrelevant alternatives (ones outside a certain subset) should have no impact on the societal ranking of the subset. For example, if candidate x ranks socially before candidate y, then x should rank socially before y even if a third candidate z is removed from participation. (See Remarks below.)
Monotonicity, or positive association of social and individual values
If any individual modifies his or her preference order by promoting a certain option, then the societal preference order should respond only by promoting that same option or not changing, never by placing it lower than before. An individual should not be able to hurt an option by ranking it higher.
Non-imposition, or citizen sovereignty
Every possible societal preference order should be achievable by some set of individual preference orders. This means that the social welfare function is surjective: It has an unrestricted target space.

A later (1963) version of Arrow's theorem replaced the monotonicity and non-imposition criteria with:

Pareto efficiency, or unanimity
If every individual prefers a certain option to another, then so must the resulting societal preference order. This, again, is a demand that the social welfare function will be minimally sensitive to the preference profile.

This later version is more general, having weaker conditions. The axioms of monotonicity, non-imposition, and IIA together imply Pareto efficiency, whereas Pareto efficiency (itself implying non-imposition) and IIA together do not imply monotonicity.

Independence of irrelevant alternatives (IIA)

The IIA condition has three purposes (or effects):

Normative
Irrelevant alternatives should not matter.
Practical
Use of minimal information.
Strategic
Providing the right incentives for the truthful revelation of individual preferences. Though the strategic property is conceptually different from IIA, it is closely related.

Arrow's death-of-a-candidate example (1963, page 26) suggests that the agenda (the set of feasible alternatives) shrinks from, say, X = {a, b, c} to S = {a, b} because of the death of candidate c. This example is misleading since it can give the reader an impression that IIA is a condition involving two agenda and one profile. The fact is that IIA involves just one agendum ({x, y} in case of pairwise independence) but two profiles. If the condition is applied to this confusing example, it requires this: Suppose an aggregation rule satisfying IIA chooses b from the agenda {a, b} when the profile is given by (cab, cba), that is, individual 1 prefers c to a to b, 2 prefers c to b to a. Then, it must still choose b from {a, b} if the profile were, say: (abc, bac); (acb, bca); (acb, cba); or (abc, cba).

In different words, Arrow defines IIA as saying that the social preferences between alternatives x and y depend only on the individual preferences between x and y (not on those involving other candidates).

Formal statement of the theorem

Let A be a set of outcomes, N a number of voters or decision criteria. We shall denote the set of all full linear orderings of A by L(A).

A (strict) social welfare function (preference aggregation rule) is a function

F : L(A)^N → L(A)

which aggregates voters' preferences into a single preference order on A.

An N-tuple (R1, …, RN) ∈ L(A)^N of voters' preferences is called a preference profile. In its strongest and simplest form, Arrow's impossibility theorem states that whenever the set A of possible alternatives has more than 2 elements, then the following three conditions become incompatible:

Unanimity, or weak Pareto efficiency
If alternative a is ranked strictly higher than b for all orderings R1, …, RN, then a is ranked strictly higher than b by F(R1, R2, …, RN). (Unanimity implies non-imposition.)
Non-dictatorship
There is no individual i whose strict preferences always prevail. That is, there is no i ∈ {1, …, N} such that for all (R1, …, RN) ∈ L(A)^N and all a and b, when a is ranked strictly higher than b by Ri then a is ranked strictly higher than b by F(R1, R2, …, RN).
Independence of irrelevant alternatives
For two preference profiles (R1, …, RN) and (S1, …, SN) such that for all individuals i, alternatives a and b have the same order in Ri as in Si, alternatives a and b have the same order in F(R1, …, RN) as in F(S1, …, SN).
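The three conditions can be checked by brute force on a very small electorate. The following is a minimal sketch, assuming N = 2 voters and three alternatives named a, b, c; the two rules tested (a dictatorship of voter 0 and a constant rule) and all function names are illustrative choices, not taken from Arrow's text.

```python
from itertools import combinations, permutations, product

ALTS = ("a", "b", "c")                       # assumed set A of alternatives
N = 2                                        # assumed number of voters
ORDERS = list(permutations(ALTS))            # L(A): all strict linear orders
PROFILES = list(product(ORDERS, repeat=N))   # L(A)^N: all preference profiles


def above(order, x, y):
    """True if x is ranked strictly higher than y in the given order."""
    return order.index(x) < order.index(y)


def satisfies_unanimity(F):
    """If every voter ranks x above y, society must rank x above y."""
    return all(above(F(p), x, y)
               for p in PROFILES
               for x, y in permutations(ALTS, 2)
               if all(above(r, x, y) for r in p))


def satisfies_iia(F):
    """Profiles that agree on x versus y voter by voter must agree socially."""
    for x, y in combinations(ALTS, 2):
        for p, s in product(PROFILES, repeat=2):
            same = all(above(r, x, y) == above(t, x, y) for r, t in zip(p, s))
            if same and above(F(p), x, y) != above(F(s), x, y):
                return False
    return True


def dictators(F):
    """Voters whose strict preferences always prevail in the social order."""
    return [i for i in range(N)
            if all(above(F(p), x, y)
                   for p in PROFILES
                   for x, y in permutations(ALTS, 2)
                   if above(p[i], x, y))]


def dictatorship(profile):
    """Hypothetical rule: copy the ballot of voter 0."""
    return profile[0]


def constant_rule(profile):
    """Hypothetical rule: ignore the ballots and always output a, b, c."""
    return ALTS


for name, rule in [("dictatorship", dictatorship), ("constant", constant_rule)]:
    print(name, satisfies_unanimity(rule), satisfies_iia(rule), dictators(rule))
```

Each of the two rules satisfies exactly two of the three conditions: the dictatorship fails non-dictatorship, and the constant rule fails unanimity. This is consistent with the theorem, which says that no rule passes all three.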

Informal proof

Based on two proofs appearing in Economic Theory. For simplicity we have presented all rankings as if ties are impossible. A complete proof taking possible ties into account is not essentially different from the one given here, except that one ought to say "not above" instead of "below" or "not below" instead of "above" in some cases. Full details are given in the original articles.

We will prove that any social choice system respecting unrestricted domain, unanimity, and independence of irrelevant alternatives (IIA) is a dictatorship. The key idea is to identify a pivotal voter whose ballot swings the societal outcome. We then prove that this voter is a partial dictator (in a specific technical sense, described below). Finally we conclude by showing that all of the partial dictators are the same person, hence this voter is a dictator.

Part one: There is a "pivotal" voter for B over A

Part one: Successively move B from the bottom to the top of voters' ballots. The voter whose change results in B being ranked over A is the pivotal voter for B over A.

Say there are three choices for society, call them A, B, and C. Suppose first that everyone prefers option B the least: everyone prefers A to B, and everyone prefers C to B. By unanimity, society must also prefer both A and C to B. Call this situation profile 0.

On the other hand, if everyone preferred B to everything else, then society would have to prefer B to everything else by unanimity. Now arrange all the voters in some arbitrary but fixed order, and for each i let profile i be the same as profile 0, but move B to the top of the ballots for voters 1 through i. So profile 1 has B at the top of the ballot for voter 1, but not for any of the others. Profile 2 has B at the top for voters 1 and 2, but no others, and so on.

Since B eventually moves to the top of the societal preference, there must be some profile, number k, for which B moves above A in the societal rank. We call the voter whose ballot change causes this to happen the pivotal voter for B over A. Note that the pivotal voter for B over A is not, a priori, the same as the pivotal voter for A over B. In part three of the proof we will show that these do turn out to be the same.

Also note that by IIA the same argument applies if profile 0 is any profile in which A is ranked above B by every voter, and the pivotal voter for B over A will still be voter k. We will use this observation below.
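The search for the pivotal voter described in part one can be sketched as a short procedure. This is only an illustration under stated assumptions: the Borda-style rule used as a stand-in, the three-voter profile, and the function names are hypothetical. The Borda rule satisfies unanimity, so a pivotal voter exists, but it does not satisfy IIA, so the later parts of the proof do not apply to it.

```python
from typing import Callable, List, Sequence

Ballot = List[str]  # alternatives listed from most to least preferred
Rule = Callable[[Sequence[Ballot]], Ballot]


def borda_rule(profile: Sequence[Ballot]) -> Ballot:
    """Stand-in rule (an assumption): Borda scores, ties broken alphabetically."""
    alternatives = sorted(profile[0])
    top_score = len(alternatives) - 1
    score = {alt: 0 for alt in alternatives}
    for ballot in profile:
        for position, alt in enumerate(ballot):
            score[alt] += top_score - position
    return sorted(alternatives, key=lambda alt: (-score[alt], alt))


def pivotal_voter(rule: Rule, profile0: Sequence[Ballot], b: str, a: str) -> int:
    """Return the 1-based position k of the pivotal voter for b over a.

    Starting from profile 0, move b to the top of the ballots of voters
    1, 2, ... in turn and stop at the first profile in which society
    ranks b above a.
    """
    profile = [list(ballot) for ballot in profile0]  # work on a copy
    for i, ballot in enumerate(profile):
        ballot.remove(b)
        ballot.insert(0, b)          # this is now profile i + 1
        social = rule(profile)
        if social.index(b) < social.index(a):
            return i + 1
    raise ValueError("b never rose above a, so the rule violates unanimity")


# Profile 0: every voter ranks B last.
profile0 = [["A", "C", "B"], ["C", "A", "B"], ["A", "C", "B"]]
k = pivotal_voter(borda_rule, profile0, "B", "A")
print("pivotal voter for B over A:", k)
```

For this particular profile and rule the pivot occurs at the second voter; a different fixed ordering of the voters or a different rule may give a different k.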

Part two: The pivotal voter for B over A is a dictator for B over C

In this part of the argument we refer to voter k, the pivotal voter for B over A, simply as the pivotal voter. We will show that the pivotal voter dictates society's decision for B over C. That is, we show that no matter how the rest of society votes, if the pivotal voter ranks B over C, then that is the societal outcome. Note again that the dictator for B over C is not a priori the same as that for C over B. In part three of the proof we will see that these turn out to be the same too.

Part two: Switching A and B on the ballot of voter k causes the same switch to the societal outcome, by part one of the argument. Making any or all of the indicated switches to the other ballots has no effect on the outcome.

In the following, we call voters 1 through k − 1, segment one, and voters k + 1 through N, segment two. To begin, suppose that the ballots are as follows:

  • Every voter in segment one ranks B above C and C above A.
  • Pivotal voter ranks A above B and B above C.
  • Every voter in segment two ranks A above B and B above C.

Then by the argument in part one (and the last observation in that part), the societal outcome must rank A above B. This is because, except for a repositioning of C, this profile is the same as profile k − 1 from part one. Furthermore, by unanimity the societal outcome must rank B above C. Therefore, we know the outcome in this case completely.

Now suppose that the pivotal voter moves B above A but keeps C in the same position, and imagine that any number (even all!) of the other voters change their ballots to move B below C, without changing the position of A. Then, aside from a repositioning of C, this is the same as profile k from part one, and hence the societal outcome ranks B above A. Furthermore, no voter has changed the relative position of A and C, so by IIA the societal outcome must rank A above C, as in the previous case. By transitivity, the societal outcome ranks B above C, even though the pivotal voter may have been the only voter to rank B above C. By IIA, this conclusion holds independently of how A is positioned on the ballots, so the pivotal voter is a dictator for B over C.

Part three: There exists a dictator

Part three: Since voter k is the dictator for B over C, the pivotal voter for B over C must appear among the first k voters. That is, outside of segment two. Likewise, the pivotal voter for C over B must appear among voters k through N. That is, outside of Segment One.

In this part of the argument we refer back to the original ordering of voters, and compare the positions of the different pivotal voters (identified by applying parts one and two to the other pairs of candidates). First, the pivotal voter for B over C must appear earlier (or at the same position) in the line than the dictator for B over C: As we consider the argument of part one applied to B and C, successively moving B to the top of voters' ballots, the pivot point where society ranks B above C must come at or before we reach the dictator for B over C. Likewise, reversing the roles of B and C, the pivotal voter for C over B must be at or later in line than the dictator for B over C. In short, if kX/Y denotes the position of the pivotal voter for X over Y (for any two candidates X and Y), then we have shown

kB/C ≤ kB/A ≤ kC/B.

Now repeating the entire argument above with B and C switched, we also have

kC/B ≤ kB/C.

Therefore, we have

kB/C = kB/A = kC/B

and the same argument for other pairs shows that all the pivotal voters (and hence all the dictators) occur at the same position in the list of voters. This voter is the dictator for the whole election.

Interpretations

Although Arrow's theorem is a mathematical result, it is often expressed in a non-mathematical way with a statement such as no voting method is fair, every ranked voting method is flawed, or the only voting method that isn't flawed is a dictatorship. These statements are simplifications of Arrow's result which are not universally considered to be true. What Arrow's theorem does state is that a deterministic preferential voting mechanism—that is, one where a preference order is the only information in a vote, and any possible set of votes gives a unique result—cannot comply with all of the conditions given above simultaneously.

Various theorists have suggested weakening the IIA criterion as a way out of the paradox. Proponents of ranked voting methods contend that the IIA is an unreasonably strong criterion. It is the one breached in most useful electoral systems. Advocates of this position point out that failure of the standard IIA criterion is trivially implied by the possibility of cyclic preferences. If voters cast ballots as follows:

  • 1 vote for A > B > C
  • 1 vote for B > C > A
  • 1 vote for C > A > B

then the pairwise majority preference of the group is that A wins over B, B wins over C, and C wins over A: these yield rock-paper-scissors preferences for any pairwise comparison. In this circumstance, any aggregation rule that satisfies the very basic majoritarian requirement that a candidate who receives a majority of votes must win the election, will fail the IIA criterion, if social preference is required to be transitive (or acyclic). To see this, suppose that such a rule satisfies IIA. Since majority preferences are respected, the society prefers A to B (two votes for A > B and one for B > A), B to C, and C to A. Thus a cycle is generated, which contradicts the assumption that social preference is transitive.
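The cycle in the three-ballot example can be verified directly. The sketch below computes the pairwise majority relation for those ballots and checks that it is not transitive; the helper names are illustrative.

```python
from itertools import permutations

# The three ballots from the example, most preferred alternative first.
ballots = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]


def majority_prefers(ballots, x, y):
    """True if strictly more voters rank x above y than y above x."""
    for_x = sum(1 for ballot in ballots if ballot.index(x) < ballot.index(y))
    return for_x > len(ballots) - for_x


def is_transitive(relation):
    """True if x > y and y > z always imply x > z within the relation."""
    return all((x, z) in relation
               for (x, y) in relation
               for (y2, z) in relation
               if y == y2 and x != z)


majority = {(x, y) for x, y in permutations("ABC", 2)
            if majority_prefers(ballots, x, y)}

print(sorted(majority))          # the pairs (winner, loser)
print(is_transitive(majority))   # whether the majority relation is transitive
```

The relation contains A over B, B over C, and C over A, each by two votes to one, so no transitive social ranking can agree with all three majorities.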

So, what Arrow's theorem really shows is that any majority-wins electoral system is a non-trivial game, and that game theory should be used to predict the outcome of most voting mechanisms. This could be seen as a discouraging result, because a game need not have efficient equilibria; e.g., a ballot could result in an alternative nobody really wanted in the first place, yet everybody voted for.

Remark: Scalar rankings from a vector of attributes and the IIA property

The IIA property might not be satisfied in human decision-making of realistic complexity because the scalar preference ranking is effectively derived from the weighting—not usually explicit—of a vector of attributes (one book dealing with the Arrow theorem invites the reader to consider the related problem of creating a scalar measure for the track and field decathlon event—e.g. how does one make scoring 600 points in the discus event "commensurable" with scoring 600 points in the 1500 m race) and this scalar ranking can depend sensitively on the weighting of different attributes, with the tacit weighting itself affected by the context and contrast created by apparently "irrelevant" choices. Edward MacNeal discusses this sensitivity problem with respect to the ranking of "most livable city" in the chapter "Surveys" of his book MathSemantics: making numbers talk sense (1994).

Alternatives based on functions of preference profiles

In an attempt to escape from the negative conclusion of Arrow's theorem, social choice theorists have investigated various possibilities ("ways out"). This section includes approaches that deal with

  • aggregation rules (functions that map each preference profile into a social preference), and
  • other functions, such as functions that map each preference profile into an alternative.

Since these two approaches often overlap, we discuss them at the same time. What is characteristic of these approaches is that they investigate various possibilities by eliminating or weakening or replacing one or more conditions (criteria) that Arrow imposed.

Infinitely many individuals

Several theorists (e.g., Fishburn and Kirman and Sondermann) point out that when one drops the assumption that there are only finitely many individuals, one can find aggregation rules that satisfy all of Arrow's other conditions.

However, such aggregation rules are practically of limited interest, since they are based on ultrafilters, highly non-constructive mathematical objects. In particular, Kirman and Sondermann argue that there is an "invisible dictator" behind such a rule. Mihara shows that such a rule violates algorithmic computability. These results can be seen to establish the robustness of Arrow's theorem.

On the other hand, the ultrafilters (indeed, constructing them in an infinite model relies on the axiom of choice) are inherent in finite models as well (with no need of the axiom of choice). They can be interpreted as decisive hierarchies, with the only difference that the hierarchy's top level - Arrow's dictator - always exists in a finite model but can be unattainable (= missing) in an infinite hierarchy. In the latter case, the "invisible dictator" is nothing else but the infinite decisive hierarchy itself. If desired, it can be complemented with a limit point, which then becomes a "visible dictator". Since dictators are inseparable from decisive hierarchies, the Dictatorship prohibition automatically prohibits decisive hierarchies, which is much less self-evident than the Dictatorship prohibition. See also paragraph "Relaxing the Dictatorship prohibition".

Limiting the number of alternatives

When there are only two alternatives to choose from, May's theorem shows that only simple majority rule satisfies a certain set of criteria (e.g., equal treatment of individuals and of alternatives; increased support for a winning alternative should not make it into a losing one). On the other hand, when there are at least three alternatives, Arrow's theorem points out the difficulty of collective decision making. Why is there such a sharp difference between the case of fewer than three alternatives and that of at least three alternatives?

Nakamura's theorem (about the core of simple games) gives an answer more generally. It establishes that if the number of alternatives is less than a certain integer called the Nakamura number, then the rule in question will identify "best" alternatives without any problem; if the number of alternatives is greater than or equal to the Nakamura number, then the rule will not always work, since for some profile a voting paradox (a cycle such as alternative A socially preferred to alternative B, B to C, and C to A) will arise. Since the Nakamura number of majority rule is 3 (except in the case of four individuals), one can conclude from Nakamura's theorem that majority rule can deal with up to two alternatives rationally. Some super-majority rules (such as those requiring 2/3 of the votes) can have a Nakamura number greater than 3, but such rules violate other conditions given by Arrow.
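The Nakamura numbers quoted above can be reproduced for small electorates. The text does not define the number, so the sketch below assumes the standard definition: the size of the smallest family of winning coalitions whose intersection is empty. It also assumes a quota rule in which a coalition wins when it has at least q members; the function names are illustrative.

```python
from itertools import combinations


def winning_coalitions(n, q):
    """All coalitions of at least q voters out of n (assumed quota rule)."""
    return [frozenset(c)
            for size in range(q, n + 1)
            for c in combinations(range(n), size)]


def nakamura_number(n, q):
    """Smallest number of winning coalitions with empty intersection.

    Returns None when no such family exists (q == n, where every voter
    belongs to every winning coalition).
    """
    coalitions = winning_coalitions(n, q)
    # If q < n, the n coalitions that each omit one voter are winning and
    # have empty intersection, so the search can stop at families of size n.
    for k in range(2, n + 1):
        for family in combinations(coalitions, k):
            if not frozenset.intersection(*family):
                return k
    return None


for n in (3, 4, 5):
    majority_quota = n // 2 + 1
    print(n, "voters, majority rule:", nakamura_number(n, majority_quota))

print("5 voters, quota 4:", nakamura_number(5, 4))
```

Under these assumptions majority rule gives 3 for three and five voters and 4 for four voters, matching the exception noted in the text, while the four-out-of-five super-majority rule gives a value above 3.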

Pairwise voting

A common way "around" Arrow's paradox is limiting the alternative set to two alternatives. Thus, whenever more than two alternatives should be put to the test, it seems very tempting to use a mechanism that pairs them and votes by pairs. As tempting as this mechanism seems at first glance, it is generally far from satisfying even Pareto efficiency, not to mention IIA. The specific order in which the pairs are decided strongly influences the outcome, so the person controlling that order (the agenda maker) has great control over the outcome. This is not necessarily a bad feature of the mechanism. Many sports use the tournament mechanism, essentially a pairing mechanism, to choose a winner. This gives considerable opportunity for weaker teams to win, thus adding interest and tension throughout the tournament. In any case, when viewing the entire voting process as one game, Arrow's theorem still applies.

Domain restrictions

Another approach is relaxing the universality condition, which means restricting the domain of aggregation rules. The best-known result along this line assumes "single peaked" preferences.

Duncan Black has shown that if there is only one dimension on which every individual has a "single-peaked" preference, then all of Arrow's conditions are met by majority rule. Suppose that there is some predetermined linear ordering of the alternative set. An individual's preference is single-peaked with respect to this ordering if he has some special place that he likes best along that line, and his dislike for an alternative grows larger as the alternative goes further away from that spot (i.e., the graph of his utility function has a single peak if alternatives are placed according to the linear ordering on the horizontal axis). For example, if voters were voting on where to set the volume for music, it would be reasonable to assume that each voter had their own ideal volume preference and that as the volume got progressively too loud or too quiet they would be increasingly dissatisfied. If the domain is restricted to profiles in which every individual has a single-peaked preference with respect to the linear ordering, then simple aggregation rules, which include majority rule, have an acyclic (defined below) social preference, and hence a "best" alternative. In particular, when there is an odd number of individuals, the social preference becomes transitive, and the socially "best" alternative is equal to the median of all the peaks of the individuals (Black's median voter theorem). Under single-peaked preferences, the majority rule is in some respects the most natural voting mechanism.
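The median voter result can be illustrated with the music-volume example. The sketch below makes several assumptions that are not in the text: volume settings are the integers 0 to 10, there are five voters with the listed ideal volumes, and each voter prefers whichever setting is closer to their ideal. That distance-based preference is one special case of single-peakedness.

```python
from statistics import median

alternatives = range(11)      # assumed volume settings 0..10
peaks = [2, 3, 7, 8, 9]       # assumed ideal volumes of five voters


def prefers(peak, x, y):
    """A voter with this peak strictly prefers x to y if x is closer."""
    return abs(x - peak) < abs(y - peak)


def beats(x, y):
    """True if strictly more voters prefer x to y than y to x."""
    for_x = sum(prefers(p, x, y) for p in peaks)
    for_y = sum(prefers(p, y, x) for p in peaks)
    return for_x > for_y


# An alternative that beats every other one in pairwise majority votes.
condorcet_winners = [x for x in alternatives
                     if all(beats(x, y) for y in alternatives if y != x)]

print("median peak:", median(peaks))
print("alternatives beating all others:", condorcet_winners)
```

With an odd number of voters the only alternative that wins every pairwise majority vote is the median of the peaks, as the theorem predicts.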

One can define the notion of "single-peaked" preferences on higher-dimensional sets of alternatives. However, one can identify the "median" of the peaks only in exceptional cases. Instead, we typically have the destructive situation suggested by McKelvey's Chaos Theorem: for any x and y, one can find a sequence of alternatives such that x is beaten by x1 by a majority, x1 by x2, up to xk by y.

Relaxing transitivity

By relaxing the transitivity of social preferences, we can find aggregation rules that satisfy Arrow's other conditions. If we impose neutrality (equal treatment of alternatives) on such rules, however, there exists an individual who has a "veto". So the possibility provided by this approach is also very limited.

First, suppose that a social preference is quasi-transitive (instead of transitive); this means that the strict preference ≻ ("better than") is transitive: if x ≻ y and y ≻ z, then x ≻ z. Then, there do exist non-dictatorial aggregation rules satisfying Arrow's conditions, but such rules are oligarchic. This means that there exists a coalition L such that L is decisive (if every member in L prefers x to y, then the society prefers x to y), and each member in L has a veto (if she prefers x to y, then the society cannot prefer y to x).

Second, suppose that a social preference is acyclic (instead of transitive): there do not exist alternatives x1, …, xk that form a cycle (x1 ≻ x2, x2 ≻ x3, …, xk−1 ≻ xk, xk ≻ x1). Then, provided that there are at least as many alternatives as individuals, an aggregation rule satisfying Arrow's other conditions is collegial. This means that there are individuals who belong to the intersection ("collegium") of all decisive coalitions. If there is someone who has a veto, then he belongs to the collegium. If the rule is assumed to be neutral, then it does have someone who has a veto.

Finally, Brown's theorem left open the case of acyclic social preferences where the number of alternatives is less than the number of individuals. One can give a definite answer for that case using the Nakamura number. See the section "Limiting the number of alternatives" above.

Relaxing assumption IIA

There are numerous examples of aggregation rules satisfying Arrow's conditions except IIA. The Borda rule is one of them. These rules, however, are susceptible to strategic manipulation by individuals.
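A small example shows how the Borda rule fails IIA. The sketch below assumes the common scoring convention in which an alternative in last place earns 0 points and each higher place earns one more; the two five-voter profiles and the function names are illustrative. Every voter ranks A against B the same way in both profiles, and only the position of C changes.

```python
def borda_ranking(profile):
    """Social ranking by Borda score; ties broken alphabetically (assumption)."""
    alternatives = sorted(profile[0])
    top_score = len(alternatives) - 1
    score = {alt: 0 for alt in alternatives}
    for ballot in profile:
        for position, alt in enumerate(ballot):
            score[alt] += top_score - position
    return sorted(alternatives, key=lambda alt: (-score[alt], alt))


def ranks_above(order, x, y):
    """True if x appears before y in the given order."""
    return order.index(x) < order.index(y)


# Three voters rank A > B > C in both profiles.
# The other two voters rank B > C > A in profile_r and B > A > C in profile_s.
profile_r = [["A", "B", "C"]] * 3 + [["B", "C", "A"]] * 2
profile_s = [["A", "B", "C"]] * 3 + [["B", "A", "C"]] * 2

# No individual changes their ranking of A versus B between the profiles.
same_individual_ab = all(
    ranks_above(r, "A", "B") == ranks_above(s, "A", "B")
    for r, s in zip(profile_r, profile_s)
)

social_r = borda_ranking(profile_r)
social_s = borda_ranking(profile_s)

print(same_individual_ab)
print(social_r, social_s)
```

Society ranks B above A in the first profile and A above B in the second, even though no individual's ranking of A versus B differs, which is exactly what IIA forbids.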

See also Interpretations of the theorem above.

Relaxing the Pareto criterion

Wilson (1972) shows that if an aggregation rule is non-imposed and non-null, then there is either a dictator or an inverse dictator, provided that Arrow's conditions other than Pareto are also satisfied. Here, an inverse dictator is an individual i such that whenever i prefers x to y, then the society prefers y to x.

Amartya Sen offered both relaxation of transitivity and removal of the Pareto principle. He demonstrated another interesting impossibility result, known as the "impossibility of the Paretian Liberal" (see liberal paradox for details). Sen went on to argue that this demonstrates the futility of demanding Pareto optimality in relation to voting mechanisms.

Relaxing the Dictatorship prohibition

Andranik Tangian (2010) introduced measures of dictator's "representativeness", for instance, the "popularity index" defined as the average size of the social group whose pairwise preferences are shared (= represented) by the dictator, averaged over all pairs of alternatives and all preference profiles. It was shown that there always exist "good" Arrow's dictators who on the average represent a majority. Since they are rather representatives of the society - like democratically elected presidents - there are no self-evident reasons to prohibit them. Restricting the notion of dictator to "bad" ones only, i.e. those who on the average represent a minority, Arrow's axioms were proven to be consistent.

Social choice instead of social preference

In social decision making, to rank all alternatives is not usually a goal. It often suffices to find some alternative. The approach focusing on choosing an alternative investigates either social choice functions (functions that map each preference profile into an alternative) or social choice rules (functions that map each preference profile into a subset of alternatives).

As for social choice functions, the Gibbard–Satterthwaite theorem is well-known, which states that if a social choice function whose range contains at least three alternatives is strategy-proof, then it is dictatorial.

As for social choice rules, we should assume there is a social preference behind them. That is, we should regard a rule as choosing the maximal elements ("best" alternatives) of some social preference. The set of maximal elements of a social preference is called the core. Conditions for existence of an alternative in the core have been investigated in two approaches. The first approach assumes that preferences are at least acyclic (which is necessary and sufficient for the preferences to have a maximal element on any finite subset). For this reason, it is closely related to relaxing transitivity. The second approach drops the assumption of acyclic preferences. Kumabe and Mihara adopt this approach. They make a more direct assumption that individual preferences have maximal elements, and examine conditions for the social preference to have a maximal element. See Nakamura number for details of these two approaches.

Other alternatives

Arrow originally rejected cardinal utility as a meaningful tool for expressing social welfare, and so focused his theorem on preference rankings, but later stated that a cardinal score system with three or four classes "is probably the best".

Arrow's framework assumes that individual and social preferences are "orderings" (i.e., satisfy completeness and transitivity) on the set of alternatives. This means that if the preferences are represented by a utility function, its value is an ordinal utility in the sense that it is meaningful so far as the greater value indicates the better alternative. For instance, having ordinal utilities of 4, 3, 2, 1 for alternatives a, b, c, d, respectively, is the same as having 1000, 100.01, 100, 0, which in turn is the same as having 99, 98, 1, .997. They all represent the ordering in which a is preferred to b to c to d. The assumption of ordinal preferences, which precludes interpersonal comparisons of utility, is an integral part of Arrow's theorem.
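The claim that these three utility assignments are ordinally equivalent can be checked by sorting the alternatives under each one. A minimal sketch, with illustrative names:

```python
alternatives = ["a", "b", "c", "d"]

# The three utility assignments from the paragraph above.
utility_scales = [
    [4, 3, 2, 1],
    [1000, 100.01, 100, 0],
    [99, 98, 1, 0.997],
]


def induced_ordering(utilities):
    """Alternatives sorted from highest to lowest utility."""
    utility_of = dict(zip(alternatives, utilities))
    return sorted(alternatives, key=lambda alt: utility_of[alt], reverse=True)


orderings = [induced_ordering(u) for u in utility_scales]
print(orderings)
```

All three scales induce the same ordering, so an ordinal framework such as Arrow's cannot distinguish them, however different the gaps between the numbers are.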

For various reasons, an approach based on cardinal utility, where the utility has a meaning beyond just giving a ranking of alternatives, is not common in contemporary economics. However, once one adopts that approach, one can take intensities of preferences into consideration, or one can compare (i) gains and losses of utility or (ii) levels of utility, across different individuals. In particular, Harsanyi (1955) gives a justification of utilitarianism (which evaluates alternatives in terms of the sum of individual utilities), originating from Jeremy Bentham. Hammond (1976) gives a justification of the maximin principle (which evaluates alternatives in terms of the utility of the worst-off individual), originating from John Rawls.

Not all voting methods use, as input, only an ordering of all candidates. Methods that do not, often called "rated" or "cardinal" (as opposed to "ranked", "ordinal", or "preferential") electoral systems, can be viewed as using information that only cardinal utility can convey. In that case, it is not surprising if some of them satisfy suitably reformulated versions of all of Arrow's conditions. Range voting is such a method. Whether such a claim is correct depends on how each condition is reformulated. Other rated electoral systems which pass certain generalizations of Arrow's criteria include approval voting and majority judgment. Note that Arrow's theorem does not apply to single-winner methods such as these, but Gibbard's theorem still does: no non-defective electoral system is fully strategy-free, so the informal dictum that "no electoral system is perfect" still has a mathematical basis.

Finally, though it is not an approach that investigates some kind of rule, there is a criticism by James M. Buchanan, Charles Plott, and others. The criticism holds that it is silly to think that there might be social preferences that are analogous to individual preferences. Arrow (1963, Chapter 8) answers criticism of this sort from the early period, which stems at least partly from misunderstanding.

Bremsstrahlung

From Wikipedia, the free encyclopedia
 
Bremsstrahlung produced by a high-energy electron deflected in the electric field of an atomic nucleus.

Bremsstrahlung /ˈbrɛmʃtrɑːləŋ/, from bremsen "to brake" and Strahlung "radiation"; i.e., "braking radiation" or "deceleration radiation", is electromagnetic radiation produced by the deceleration of a charged particle when deflected by another charged particle, typically an electron by an atomic nucleus. The moving particle loses kinetic energy, which is converted into radiation (i.e., photons), thus satisfying the law of conservation of energy. The term is also used to refer to the process of producing the radiation. Bremsstrahlung has a continuous spectrum, which becomes more intense and whose peak intensity shifts toward higher frequencies as the change of the energy of the decelerated particles increases.

Broadly speaking, bremsstrahlung or braking radiation is any radiation produced due to the deceleration (negative acceleration) of a charged particle, which includes synchrotron radiation (i.e., photon emission by a relativistic particle), cyclotron radiation (i.e. photon emission by a non-relativistic particle), and the emission of electrons and positrons during beta decay. However, the term is frequently used in the more narrow sense of radiation from electrons (from whatever source) slowing in matter.

Bremsstrahlung emitted from plasma is sometimes referred to as free–free radiation. This refers to the fact that the radiation in this case is created by electrons that are free (i.e., not in an atomic or molecular bound state) before, and remain free after, the emission of a photon. In the same parlance, bound–bound radiation refers to discrete spectral lines (an electron "jumps" between two bound states), while free–bound radiation refers to the radiative combination process, in which a free electron recombines with an ion.

Classical description

Field lines and modulus of the electric field generated by a (negative) charge first moving at a constant speed and then stopping quickly to show the generated Bremsstrahlung radiation.

If quantum effects are negligible, an accelerating charged particle radiates power as described by the Larmor formula and its relativistic generalization.

Total radiated power

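The total radiated power is

$$P = \frac{q^2 \gamma^4}{6 \pi \varepsilon_0 c} \left( \dot{\beta}^2 + \frac{\left(\boldsymbol{\beta} \cdot \dot{\boldsymbol{\beta}}\right)^2}{1 - \beta^2} \right),$$

where $\boldsymbol{\beta} = \mathbf{v}/c$ (the velocity of the particle divided by the speed of light), $\gamma = 1/\sqrt{1-\beta^2}$ is the Lorentz factor, $\dot{\boldsymbol{\beta}}$ signifies a time derivative of $\boldsymbol{\beta}$, and q is the charge of the particle. In the case where velocity is parallel to acceleration (i.e., linear motion), the expression reduces to

$$P_{a \parallel v} = \frac{q^2 a^2 \gamma^6}{6 \pi \varepsilon_0 c^3},$$

where $a \equiv \dot{v} = \dot{\beta} c$ is the acceleration. For the case of acceleration perpendicular to the velocity ($\boldsymbol{\beta} \cdot \dot{\boldsymbol{\beta}} = 0$), for example in synchrotrons, the total power is

$$P_{a \perp v} = \frac{q^2 a^2 \gamma^4}{6 \pi \varepsilon_0 c^3}.$$

Power radiated in the two limiting cases is proportional to $\gamma^4$ ($a \perp v$) or $\gamma^6$ ($a \parallel v$). Since $E = \gamma m c^2$, we see that for particles with the same energy $E$ the total radiated power goes as $m^{-4}$ or $m^{-6}$, which accounts for why electrons lose energy to bremsstrahlung radiation much more rapidly than heavier charged particles (e.g., muons, protons, alpha particles). This is the reason a TeV energy electron-positron collider (such as the proposed International Linear Collider) cannot use a circular tunnel (requiring constant acceleration), while a proton-proton collider (such as the Large Hadron Collider) can utilize a circular tunnel. The electrons lose energy due to bremsstrahlung at a rate $(m_p/m_e)^4 \approx 10^{13}$ times higher than protons do.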
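The size of the electron-to-proton ratio can be worked out directly. The sketch below assumes that, at fixed particle energy and fixed acceleration, the radiated power scales as m^-4 for acceleration perpendicular to the velocity and as m^-6 for acceleration parallel to it; the mass ratio is an approximate CODATA value and the variable names are illustrative.

```python
# Approximate proton-to-electron mass ratio (CODATA value, rounded).
PROTON_ELECTRON_MASS_RATIO = 1836.15267

# At equal energy E = gamma * m * c^2, gamma scales as 1/m, so
# P ~ gamma^4 gives m^-4 (perpendicular) and P ~ gamma^6 gives m^-6 (parallel).
ratio_perpendicular = PROTON_ELECTRON_MASS_RATIO ** 4
ratio_parallel = PROTON_ELECTRON_MASS_RATIO ** 6

print(f"perpendicular (circular collider): {ratio_perpendicular:.3e}")
print(f"parallel (linear motion):          {ratio_parallel:.3e}")
```

The perpendicular case is the one relevant to a circular collider, and it gives a ratio of roughly 10^13.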

Angular distribution

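The most general formula for radiated power as a function of angle is:

$$\frac{dP}{d\Omega} = \frac{q^2}{16 \pi^2 \varepsilon_0 c} \frac{\left| \hat{\mathbf{n}} \times \left( \left( \hat{\mathbf{n}} - \boldsymbol{\beta} \right) \times \dot{\boldsymbol{\beta}} \right) \right|^2}{\left( 1 - \hat{\mathbf{n}} \cdot \boldsymbol{\beta} \right)^5},$$

where $\hat{\mathbf{n}}$ is a unit vector pointing from the particle towards the observer, and $d\Omega$ is an infinitesimal bit of solid angle.

In the case where velocity is parallel to acceleration (for example, linear motion), this simplifies to

$$\frac{dP_{a \parallel v}}{d\Omega} = \frac{q^2 a^2}{16 \pi^2 \varepsilon_0 c^3} \frac{\sin^2 \theta}{\left( 1 - \beta \cos \theta \right)^5},$$

where $\theta$ is the angle between $\boldsymbol{\beta}$ and the direction of observation $\hat{\mathbf{n}}$.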
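To see how the emission becomes beamed in the forward direction, one can locate the peak of the angular factor for linear motion. The sketch below assumes that factor is sin²θ / (1 − β cos θ)⁵ and compares a brute-force grid search against the closed form cos θ_max = (√(1 + 15β²) − 1) / (3β), which follows from setting the derivative to zero. Function names are illustrative.

```python
import math


def angular_factor(theta, beta):
    """Angular dependence of dP/dOmega for acceleration parallel to velocity."""
    return math.sin(theta) ** 2 / (1.0 - beta * math.cos(theta)) ** 5


def peak_angle_numeric(beta, samples=20001):
    """Grid search for the angle of maximum emission on [0, pi]."""
    thetas = (math.pi * i / (samples - 1) for i in range(samples))
    return max(thetas, key=lambda theta: angular_factor(theta, beta))


def peak_angle_closed_form(beta):
    """Root of d/dtheta [sin^2 / (1 - beta cos)^5] = 0, valid for 0 <= beta < 1."""
    if beta == 0.0:
        return math.pi / 2
    return math.acos((math.sqrt(1.0 + 15.0 * beta ** 2) - 1.0) / (3.0 * beta))


for beta in (0.0, 0.5, 0.9, 0.99):
    angle = math.degrees(peak_angle_closed_form(beta))
    print(f"beta = {beta:4.2f}: peak emission at {angle:6.2f} degrees")
```

At β = 0 the pattern is the familiar dipole shape peaked at 90 degrees; as β approaches 1 the peak tilts toward the direction of motion.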

Simplified quantum description

This section gives a quantum-mechanical analog of the prior section, but with some simplifications. We give a non-relativistic treatment of the special case of an electron of mass $m_e$, charge $-e$, and initial speed $v$ decelerating in the Coulomb field of a gas of heavy ions of charge $Ze$ and number density $n_i$. The emitted radiation is a photon of frequency $\nu = c/\lambda$ and energy $h\nu$. We wish to find the emissivity $j(v,\nu)$, which is the power emitted per (solid angle in photon velocity space * photon frequency), summed over both transverse photon polarizations. We follow the common astrophysical practice of writing this result in terms of an approximate classical result times the free-free emission Gaunt factor $g_{ff}$, which incorporates quantum and other corrections:

$$j(v,\nu) = \frac{8\pi}{3\sqrt{3}} \left( \frac{e^2}{4\pi\varepsilon_0} \right)^3 \frac{Z^2 n_i}{c^3 m_e^2 v} \, g_{ff}(v,\nu).$$

A general, quantum-mechanical formula for $g_{ff}$ exists but is very complicated, and usually is found by numerical calculations. We present some approximate results with the following additional assumptions:

  • Vacuum interaction: we neglect any effects of the background medium, such as plasma screening effects. This is reasonable for photon frequency much greater than the plasma frequency $\nu_{pe} \propto n_e^{1/2}$, with $n_e$ the plasma electron density. Note that light waves are evanescent for $\nu < \nu_{pe}$ and a significantly different approach would be needed.
  • Soft photons: $h\nu \ll m_e v^2/2$, that is, the photon energy is much less than the initial electron kinetic energy.

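With these assumptions, two unitless parameters characterize the process: $\eta_Z \equiv Z e^2 / (4\pi\varepsilon_0 \hbar v)$, which measures the strength of the electron-ion Coulomb interaction, and $\eta_\nu \equiv h\nu / (2 m_e v^2)$, which measures the photon "softness" and which we assume is always small (the choice of the factor 2 is for later convenience). In the limit $\eta_Z \ll 1$, the quantum-mechanical Born approximation gives:

$$g_{ff,\text{Born}} = \frac{\sqrt{3}}{\pi} \ln \frac{1}{\eta_\nu}.$$

In the opposite limit $\eta_Z \gg 1$, the full quantum-mechanical result reduces to the purely classical result

$$g_{ff,\text{class}} = \frac{\sqrt{3}}{\pi} \left[ \ln \left( \frac{1}{\eta_Z \eta_\nu} \right) - \gamma \right],$$

where $\gamma \approx 0.577$ is the Euler–Mascheroni constant. Note that $1/(\eta_Z \eta_\nu) = 4 \varepsilon_0 m_e v^3 / (Z e^2 \nu)$, which is a purely classical expression without Planck's constant $h$.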

A semi-classical, heuristic way to understand the Gaunt factor is to write it as $g_{ff} \approx \ln(b_{\max}/b_{\min})$, where $b_{\max}$ and $b_{\min}$ are maximum and minimum "impact parameters" for the electron-ion collision, in the presence of the photon electric field. With our assumptions, $b_{\max} = v/\nu$: for larger impact parameters, the sinusoidal oscillation of the photon field provides "phase mixing" that strongly reduces the interaction. $b_{\min}$ is the larger of the quantum-mechanical de Broglie wavelength $\approx h/(m_e v)$ and the classical distance of closest approach $\approx Z e^2/(4\pi\varepsilon_0 m_e v^2)$, where the electron-ion Coulomb potential energy is comparable to the electron's initial kinetic energy.

The above results generally apply as long as the argument of the logarithm is large, and break down when it is less than unity: in that case the Gaunt factor becomes negative, which is unphysical. A rough approximation to the full calculations, with the appropriate Born and classical limits, is

$$g_{ff} \approx \max \left[ 1, \; \frac{\sqrt{3}}{\pi} \ln \left[ \frac{1}{\eta_\nu \max \left( 1, e^{\gamma} \eta_Z \right)} \right] \right].$$
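The limiting forms of the Gaunt factor are easy to compare numerically. The sketch below assumes the Born form (√3/π) ln(1/η_ν), the classical form (√3/π)[ln(1/(η_Z η_ν)) − γ], and a rough interpolation that takes the Born logarithm with η_ν multiplied by max(1, e^γ η_Z) and floors the result at 1. The function names and sample parameter values are illustrative, not taken from the source.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
PREFACTOR = math.sqrt(3.0) / math.pi


def gaunt_born(eta_nu):
    """Born limit, eta_Z << 1."""
    return PREFACTOR * math.log(1.0 / eta_nu)


def gaunt_classical(eta_z, eta_nu):
    """Classical limit, eta_Z >> 1."""
    return PREFACTOR * (math.log(1.0 / (eta_z * eta_nu)) - EULER_GAMMA)


def gaunt_rough(eta_z, eta_nu):
    """Rough interpolation between the two limits, floored at 1."""
    effective = eta_nu * max(1.0, math.exp(EULER_GAMMA) * eta_z)
    return max(1.0, PREFACTOR * math.log(1.0 / effective))


print(gaunt_rough(0.01, 1e-4), gaunt_born(1e-4))            # weak coupling
print(gaunt_rough(10.0, 1e-6), gaunt_classical(10.0, 1e-6))  # strong coupling
print(gaunt_rough(0.01, 0.9))                                # floor applies
```

The floor at 1 prevents the unphysical negative values that the bare logarithms produce when their argument drops below unity.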

Thermal bremsstrahlung: emission and absorption

The bremsstrahlung power spectrum rapidly decreases for large $\omega$, and is also suppressed near $\omega = \omega_p$. This plot is for the quantum case $k_B T_e > Z_i^2 E_h$.

This section discusses bremsstrahlung emission and the inverse absorption process (called inverse bremsstrahlung) in a macroscopic medium. We start with the equation of radiative transfer, which applies to general processes and not just bremsstrahlung:

$$\frac{1}{c} \partial_t I_\nu + \hat{\mathbf{n}} \cdot \nabla I_\nu = j_\nu - k_\nu I_\nu.$$

$I_\nu(\mathbf{x}, t)$ is the radiation spectral intensity, or power per (area * solid angle in photon velocity space * photon frequency) summed over both polarizations. $j_\nu$ is the emissivity, analogous to $j(v,\nu)$ defined above, and $k_\nu$ is the absorptivity. $j_\nu$ and $k_\nu$ are properties of the matter, not the radiation, and account for all the particles in the medium, not just a pair of one electron and one ion as in the prior section. If $I_\nu$ is uniform in space and time, then the left-hand side of the transfer equation is zero, and we find

$$I_\nu = \frac{j_\nu}{k_\nu}.$$

If the matter and radiation are also in thermal equilibrium at some temperature $T_e$, then $I_\nu$ must be the blackbody spectrum:

$$B_\nu(\nu, T_e) = \frac{2 h \nu^3}{c^2} \frac{1}{e^{h\nu/k_B T_e} - 1}.$$

Since $j_\nu$ and $k_\nu$ are independent of $I_\nu$, this means that $j_\nu / k_\nu$ must be the blackbody spectrum whenever the matter is in equilibrium at some temperature, regardless of the state of the radiation. This allows us to immediately know both $j_\nu$ and $k_\nu$ once one is known, for matter in equilibrium.
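The relation between emissivity and absorptivity for matter in equilibrium can be sketched in a few lines. The example assumes the Planck blackbody spectrum B_ν = (2hν³/c²) / (exp(hν/k_B T) − 1) and the ratio j_ν / k_ν = B_ν; the function names and the sample emissivity value are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 299792458.0      # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_b_nu(nu, temperature):
    """Blackbody spectral intensity B_nu in W m^-2 sr^-1 Hz^-1."""
    x = H * nu / (K_B * temperature)
    # expm1 keeps precision when x is small (Rayleigh-Jeans regime).
    return 2.0 * H * nu ** 3 / C ** 2 / math.expm1(x)


def absorptivity_from_emissivity(j_nu, nu, temperature):
    """k_nu = j_nu / B_nu for matter in equilibrium at the given temperature."""
    return j_nu / planck_b_nu(nu, temperature)


nu, temperature = 1.0e9, 1.0e6   # 1 GHz radiation, 10^6 K matter
j_nu = 1.0e-30                   # hypothetical emissivity, W m^-3 sr^-1 Hz^-1
print(absorptivity_from_emissivity(j_nu, nu, temperature), "per metre")
```

Only one of the two coefficients has to be computed from microscopic physics; the other then follows from the temperature alone.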

In plasma

NOTE: this section currently gives formulas that apply in the Rayleigh-Jeans limit $\hbar\omega \ll k_B T_e$, and does not use a quantized (Planck) treatment of radiation. Thus a usual factor like $\exp(-\hbar\omega/k_B T_e)$ does not appear. The appearance of $\hbar\omega/k_B T_e$ in $y$ below is due to the quantum-mechanical treatment of collisions.

In a plasma, the free electrons continually collide with the ions, producing bremsstrahlung. A complete analysis requires accounting for both binary Coulomb collisions and collective (dielectric) behavior. A detailed treatment is given by Bekefi, while a simplified one is given by Ichimaru. In this section we follow Bekefi's dielectric treatment, with collisions included approximately via the cutoff wavenumber, $k_m$.

Consider a uniform plasma, with thermal electrons distributed according to the Maxwell–Boltzmann distribution with the temperature $T_e$. Following Bekefi, the power spectral density (power per angular frequency interval per volume, integrated over the whole $4\pi$ sr of solid angle, and in both polarizations) of the bremsstrahlung radiated is calculated to be

$$\frac{dP_{Br}}{d\omega} = \frac{8\sqrt{2}}{3\sqrt{\pi}} \left[ \frac{e^2}{4\pi\varepsilon_0} \right]^3 \frac{1}{(m_e c^2)^{3/2}} \left[ 1 - \frac{\omega_p^2}{\omega^2} \right]^{1/2} \frac{Z_i^2 n_i n_e}{(k_B T_e)^{1/2}} E_1(y),$$

where $\omega_p \equiv (n_e e^2 / \varepsilon_0 m_e)^{1/2}$ is the electron plasma frequency, $\omega$ is the photon frequency, $n_e$ and $n_i$ are the number densities of electrons and ions, and other symbols are physical constants. The second bracketed factor is the index of refraction of a light wave in a plasma, and shows that emission is greatly suppressed for $\omega < \omega_p$ (this is the cutoff condition for a light wave in a plasma; in this case the light wave is evanescent). This formula thus only applies for $\omega > \omega_p$. This formula should be summed over ion species in a multi-species plasma.

The special function $E_1$ is defined in the exponential integral article, and the unitless quantity $y$ is

$$y = \frac{1}{2} \frac{\omega^2 m_e}{k_m^2 k_B T_e}.$$

$k_m$ is a maximum or cutoff wavenumber, arising due to binary collisions, and can vary with ion species. Roughly, $k_m = 1/\lambda_B$ when $k_B T_e > Z_i^2 E_h$ (typical in plasmas that are not too cold), where $E_h \approx 27.2$ eV is the Hartree energy, and $\lambda_B = \hbar / (m_e k_B T_e)^{1/2}$ is the electron thermal de Broglie wavelength. Otherwise, $k_m \propto 1/l_c$, where $l_c$ is the classical Coulomb distance of closest approach.

For the usual case $k_m = 1/\lambda_B$, we find

$$y = \frac{1}{2} \left[ \frac{\hbar\omega}{k_B T_e} \right]^2.$$

The formula for $dP_{Br}/d\omega$ is approximate, in that it neglects enhanced emission occurring for $\omega$ slightly above $\omega_p$.

In the limit $y \ll 1$, we can approximate $E_1$ as $E_1(y) \approx -\ln[y e^{\gamma}] + O(y)$, where $\gamma \approx 0.577$ is the Euler–Mascheroni constant. The leading, logarithmic term is frequently used, and resembles the Coulomb logarithm that occurs in other collisional plasma calculations. For $y > e^{-\gamma}$ the log term is negative, and the approximation is clearly inadequate. Bekefi gives corrected expressions for the logarithmic term that match detailed binary-collision calculations.
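The quality of the logarithmic approximation to the exponential integral can be checked with a short, dependency-free sketch. It assumes the standard convergent series E_1(y) = −γ − ln y − Σ_{k≥1} (−y)^k / (k · k!) and compares it with the leading term −ln(y e^γ); the function names are illustrative, and the series is only numerically reliable for moderate y (say, below about 10).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp_integral_e1(y, terms=60):
    """E_1(y) from its power series; accurate for moderate positive y."""
    total = -EULER_GAMMA - math.log(y)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -y / k        # term now equals (-y)^k / k!
        total -= term / k
    return total


def e1_log_approximation(y):
    """Leading small-y term: -ln(y * e^gamma)."""
    return -math.log(y) - EULER_GAMMA


for y in (0.001, 0.01, 0.1, 0.6, 1.0):
    print(f"y = {y:5.3f}: E1 = {exp_integral_e1(y):8.5f}, "
          f"log term = {e1_log_approximation(y):8.5f}")
```

The two agree to within about y for small y, while beyond y = e^-γ ≈ 0.56 the log term turns negative even though E_1 itself stays positive.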

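The total emission power density, integrated over all frequencies, is

$$P_{Br} = \int_{\omega_p}^{\infty} d\omega \, \frac{dP_{Br}}{d\omega} = \frac{16}{3} \left[ \frac{e^2}{4\pi\varepsilon_0} \right]^3 \frac{1}{m_e^2 c^3} Z_i^2 n_i n_e k_m G(y_p),$$

$$G(y_p) = \frac{1}{2\sqrt{\pi}} \int_{y_p}^{\infty} dy \, y^{-1/2} \left[ 1 - \frac{y_p}{y} \right]^{1/2} E_1(y), \qquad y_p = y(\omega = \omega_p).$$

$G(y_p = 0) = 1$ and $G$ decreases with $y_p$; it is always positive. For $k_m = 1/\lambda_B$, we find

$$P_{Br} = \frac{16}{3} \frac{\left( e^2 / 4\pi\varepsilon_0 \right)^3}{(m_e c^2)^{3/2} \hbar} Z_i^2 n_i n_e (k_B T_e)^{1/2} G(y_p).$$

Note the appearance of $\hbar$ due to the quantum nature of $\lambda_B$. In practical units, a commonly used version of this formula for $G = 1$ is

$$P_{Br} \, [\text{W/m}^3] = \frac{Z_i^2 n_i n_e}{\left[ 7.69 \times 10^{18} \, \text{m}^{-3} \right]^2} \, T_e[\text{eV}]^{1/2}.$$

This formula is 1.59 times the one given above, with the difference due to details of binary collisions. Such ambiguity is often expressed by introducing a Gaunt factor $g_B$, e.g. one finds

$$\varepsilon_{ff} = 1.4 \times 10^{-27} \, T^{1/2} n_e n_i Z^2 g_B,$$

where everything is expressed in CGS units.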
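The factor of 1.59 between the two total-power formulas can be reproduced numerically. The sketch below assumes the Bekefi-style expression P = (16/3) (e²/4πε₀)³ Z² n_i n_e (k_B T_e)^(1/2) G / ((m_e c²)^(3/2) ħ) with G = 1, and the practical-units expression P [W/m³] = Z² n_i n_e T_e[eV]^(1/2) / (7.69×10^18 m^-3)². Constants are CODATA values; the function names and the sample plasma parameters are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
C = 299792458.0              # speed of light, m/s
HBAR = 1.054571817e-34       # reduced Planck constant, J s


def bekefi_power_density(z, n_i, n_e, t_e_ev, g=1.0):
    """Total bremsstrahlung power density in W/m^3 for k_m = 1/lambda_B."""
    coupling = E_CHARGE ** 2 / (4.0 * math.pi * EPS0)   # e^2 / (4 pi eps0)
    thermal_energy = t_e_ev * E_CHARGE                  # k_B T_e in joules
    prefactor = (16.0 / 3.0) * coupling ** 3 / ((M_E * C ** 2) ** 1.5 * HBAR)
    return prefactor * z ** 2 * n_i * n_e * math.sqrt(thermal_energy) * g


def practical_power_density(z, n_i, n_e, t_e_ev):
    """Commonly used practical-units version, W/m^3."""
    return z ** 2 * n_i * n_e / (7.69e18) ** 2 * math.sqrt(t_e_ev)


# Sample hydrogen plasma: 10^20 m^-3 at 10 keV (illustrative values).
sample = (1, 1.0e20, 1.0e20, 1.0e4)
ratio = practical_power_density(*sample) / bekefi_power_density(*sample)
print(f"practical / Bekefi = {ratio:.3f}")
```

Because both expressions share the same dependence on density, charge and temperature, the ratio is a pure number independent of the sample parameters.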

Relativistic corrections

Relativistic corrections to the emission of a 30-keV photon by an electron impacting on a proton.

For very high temperatures there are relativistic corrections to this formula, that is, additional terms of the order of $k_B T_e / m_e c^2$.

Bremsstrahlung cooling

If the plasma is optically thin, the bremsstrahlung radiation leaves the plasma, carrying part of the internal plasma energy. This effect is known as the bremsstrahlung cooling. It is a type of radiative cooling. The energy carried away by bremsstrahlung is called bremsstrahlung losses and represents a type of radiative losses. One generally uses the term bremsstrahlung losses in the context when the plasma cooling is undesired, as e.g. in fusion plasmas.

Polarizational bremsstrahlung

Polarizational bremsstrahlung (sometimes referred to as "atomic bremsstrahlung") is the radiation emitted by the target's atomic electrons as the target atom is polarized by the Coulomb field of the incident charged particle. Polarizational bremsstrahlung contributions to the total bremsstrahlung spectrum have been observed in experiments involving relatively massive incident particles, resonance processes, and free atoms. However, there is still some debate as to whether or not there are significant polarizational bremsstrahlung contributions in experiments involving fast electrons incident on solid targets.

It is worth noting that the term "polarizational" is not meant to imply that the emitted bremsstrahlung is polarized. Also, the angular distribution of polarizational bremsstrahlung is theoretically quite different than ordinary bremsstrahlung.

Sources

X-ray tube

Spectrum of the X-rays emitted by an X-ray tube with a rhodium target, operated at 60 kV. The continuous curve is due to bremsstrahlung, and the spikes are characteristic K lines for rhodium. The curve goes to zero at 21 pm in agreement with the Duane–Hunt law, as described in the text.
 

In an X-ray tube, electrons are accelerated in a vacuum by an electric field towards a piece of metal called the "target". X-rays are emitted as the electrons slow down (decelerate) in the metal. The output spectrum consists of a continuous spectrum of X-rays, with additional sharp peaks at certain energies. The continuous spectrum is due to bremsstrahlung, while the sharp peaks are characteristic X-rays associated with the atoms in the target. For this reason, bremsstrahlung in this context is also called continuous X-rays.

The shape of this continuum spectrum is approximately described by Kramers' law.

The formula for Kramers' law is usually given as the distribution of intensity (photon count) $I$ against the wavelength $\lambda$ of the emitted radiation:

$$I(\lambda) \, d\lambda = K \left( \frac{\lambda}{\lambda_{\min}} - 1 \right) \frac{1}{\lambda^2} \, d\lambda.$$

The constant K is proportional to the atomic number of the target element, and $\lambda_{\min}$ is the minimum wavelength given by the Duane–Hunt law.

The spectrum has a sharp cutoff at $\lambda_{\min}$, which is due to the limited energy of the incoming electrons. For example, if an electron in the tube is accelerated through 60 kV, then it will acquire a kinetic energy of 60 keV, and when it strikes the target it can create X-rays with energy of at most 60 keV, by conservation of energy. (This upper limit corresponds to the electron coming to a stop by emitting just one X-ray photon. Usually the electron emits many photons, and each has an energy less than 60 keV.) A photon with energy of at most 60 keV has wavelength of at least 21 pm, so the continuous X-ray spectrum has exactly that cutoff, as seen in the graph. More generally the formula for the low-wavelength cutoff, the Duane–Hunt law, is:

$$\lambda_{\min} = \frac{h c}{e V} \approx \frac{1239.8}{V} \ \text{pm} \cdot \text{kV},$$

where h is Planck's constant, c is the speed of light, V is the voltage that the electrons are accelerated through, e is the elementary charge, and pm is picometres.
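The 60 kV example can be reproduced with a few lines. The sketch assumes the Duane–Hunt cutoff λ_min = hc / (eV) and Kramers' law in the form I(λ) = K (λ/λ_min − 1) / λ², with K set to 1 in arbitrary units; function names are illustrative.

```python
H = 6.62607015e-34          # Planck constant, J s
C = 299792458.0             # speed of light, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def duane_hunt_cutoff_pm(voltage_kv):
    """Minimum X-ray wavelength in picometres for a tube voltage in kV."""
    return H * C / (E_CHARGE * voltage_kv * 1.0e3) * 1.0e12


def kramers_intensity(wavelength_pm, cutoff_pm, k=1.0):
    """Kramers' law continuum intensity (arbitrary units); zero below cutoff."""
    if wavelength_pm <= cutoff_pm:
        return 0.0
    return k * (wavelength_pm / cutoff_pm - 1.0) / wavelength_pm ** 2


cutoff = duane_hunt_cutoff_pm(60.0)
print(f"cutoff at 60 kV: {cutoff:.2f} pm")
for multiple in (1.0, 1.5, 2.0, 3.0):
    value = kramers_intensity(multiple * cutoff, cutoff)
    print(f"  I({multiple:.1f} x cutoff) = {value:.3e}")
```

In this form of Kramers' law the continuum peaks at twice the cutoff wavelength, which can be seen by differentiating the expression with respect to λ.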

Beta decay

Beta particle-emitting substances sometimes exhibit a weak radiation with continuous spectrum that is due to bremsstrahlung (see the "outer bremsstrahlung" below). In this context, bremsstrahlung is a type of "secondary radiation", in that it is produced as a result of stopping (or slowing) the primary radiation (beta particles). It is very similar to X-rays produced by bombarding metal targets with electrons in X-ray generators (as above) except that it is produced by high-speed electrons from beta radiation.

Inner and outer bremsstrahlung

The "inner" bremsstrahlung (also known as "internal bremsstrahlung") arises from the creation of the electron and its loss of energy (due to the strong electric field in the region of the nucleus undergoing decay) as it leaves the nucleus. Such radiation is a feature of beta decay in nuclei, but it is occasionally (less commonly) seen in the beta decay of free neutrons to protons, where it is created as the beta electron leaves the proton.

In electron and positron emission by beta decay the photon's energy comes from the electron-nucleon pair, with the spectrum of the bremsstrahlung decreasing continuously with increasing energy of the beta particle. In electron capture, the energy comes at the expense of the neutrino, and the spectrum is greatest at about one third of the normal neutrino energy, decreasing to zero electromagnetic energy at normal neutrino energy. Note that in the case of electron capture, bremsstrahlung is emitted even though no charged particle is emitted. Instead, the bremsstrahlung radiation may be thought of as being created as the captured electron is accelerated toward being absorbed. Such radiation may be at frequencies that are the same as soft gamma radiation, but it exhibits none of the sharp spectral lines of gamma decay, and thus is not technically gamma radiation.

The internal process is to be contrasted with the "outer" bremsstrahlung due to the impingement on the nucleus of electrons coming from the outside (i.e., emitted by another nucleus), as discussed above.

Radiation safety

In some cases, e.g. $^{32}$P, the bremsstrahlung produced by shielding the beta radiation with the normally used dense materials (e.g. lead) is itself dangerous; in such cases, shielding must be accomplished with low-density materials, e.g. Plexiglas (Lucite), plastic, wood, or water. Because the atomic number is lower for these materials, the intensity of bremsstrahlung is significantly reduced, but a larger thickness of shielding is required to stop the electrons (beta radiation).

In astrophysics

The dominant luminous component in a cluster of galaxies is the $10^7$ to $10^8$ kelvin intracluster medium. The emission from the intracluster medium is characterized by thermal bremsstrahlung. This radiation is in the energy range of X-rays and can be easily observed with space-based telescopes such as Chandra X-ray Observatory, XMM-Newton, ROSAT, ASCA, EXOSAT, Suzaku, RHESSI and future missions like IXO and Astro-H.

Bremsstrahlung is also the dominant emission mechanism for H II regions at radio wavelengths.

In electric discharges

In electric discharges, for example as laboratory discharges between two electrodes or as lightning discharges between cloud and ground or within clouds, electrons produce Bremsstrahlung photons while scattering off air molecules. These photons become manifest in terrestrial gamma-ray flashes and are the source for beams of electrons, positrons, neutrons and protons. The appearance of Bremsstrahlung photons also influences the propagation and morphology of discharges in nitrogen-oxygen mixtures with low percentages of oxygen.

Quantum mechanical description

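The complete quantum mechanical description was first performed by Bethe and Heitler. They assumed plane waves for electrons which scatter at the nucleus of an atom, and derived a cross section which relates the complete geometry of that process to the frequency of the emitted photon. The quadruply differential cross section, which shows a quantum mechanical symmetry to pair production, is:

$$\begin{aligned}
d^4\sigma = {} & \frac{Z^2 \alpha_{\text{fine}}^3 \hbar^2}{(2\pi)^2} \frac{|\mathbf{p}_f|}{|\mathbf{p}_i|} \frac{d\omega}{\omega} \frac{d\Omega_i \, d\Omega_f \, d\Phi}{|\mathbf{q}|^4} \\
& \times \left[ \frac{\mathbf{p}_f^2 \sin^2\Theta_f}{(E_f - c|\mathbf{p}_f| \cos\Theta_f)^2} \left( 4E_i^2 - c^2\mathbf{q}^2 \right) + \frac{\mathbf{p}_i^2 \sin^2\Theta_i}{(E_i - c|\mathbf{p}_i| \cos\Theta_i)^2} \left( 4E_f^2 - c^2\mathbf{q}^2 \right) \right. \\
& \quad + 2\hbar^2\omega^2 \frac{\mathbf{p}_i^2 \sin^2\Theta_i + \mathbf{p}_f^2 \sin^2\Theta_f}{(E_f - c|\mathbf{p}_f| \cos\Theta_f)(E_i - c|\mathbf{p}_i| \cos\Theta_i)} \\
& \quad \left. - 2 \frac{|\mathbf{p}_i| |\mathbf{p}_f| \sin\Theta_i \sin\Theta_f \cos\Phi}{(E_f - c|\mathbf{p}_f| \cos\Theta_f)(E_i - c|\mathbf{p}_i| \cos\Theta_i)} \left( 2E_i^2 + 2E_f^2 - c^2\mathbf{q}^2 \right) \right].
\end{aligned}$$

Here $Z$ is the atomic number, $\alpha_{\text{fine}} \approx 1/137$ the fine-structure constant, $\hbar$ the reduced Planck constant and $c$ the speed of light. The kinetic energy $E_{\text{kin},i/f}$ of the electron in the initial and final state is connected to its total energy $E_{i,f}$ or its momenta $\mathbf{p}_{i,f}$ via

$$E_{i,f} = E_{\text{kin},i/f} + m_e c^2 = \sqrt{m_e^2 c^4 + \mathbf{p}_{i,f}^2 c^2},$$

where $m_e$ is the mass of an electron. Conservation of energy gives

$$E_f = E_i - \hbar\omega,$$

where $\hbar\omega$ is the photon energy. The directions of the emitted photon and the scattered electron are given by

$$\Theta_i = \sphericalangle(\mathbf{p}_i, \mathbf{k}), \qquad \Theta_f = \sphericalangle(\mathbf{p}_f, \mathbf{k}), \qquad \Phi = \text{angle between the planes } (\mathbf{p}_i, \mathbf{k}) \text{ and } (\mathbf{p}_f, \mathbf{k}),$$

where $\mathbf{k}$ is the momentum of the photon.

The differentials are given as

$$d\Omega_i = \sin\Theta_i \, d\Theta_i, \qquad d\Omega_f = \sin\Theta_f \, d\Theta_f.$$

The absolute value of the momentum $\mathbf{q}$ of the virtual photon between the nucleus and electron is given by

$$-\mathbf{q}^2 = -|\mathbf{p}_i|^2 - |\mathbf{p}_f|^2 - \left( \frac{\hbar}{c}\omega \right)^2 + 2|\mathbf{p}_i| \frac{\hbar}{c}\omega \cos\Theta_i - 2|\mathbf{p}_f| \frac{\hbar}{c}\omega \cos\Theta_f + 2|\mathbf{p}_i| |\mathbf{p}_f| \left( \cos\Theta_f \cos\Theta_i + \sin\Theta_f \sin\Theta_i \cos\Phi \right).$$

The range of validity is given by the Born approximation

$$v \gg \frac{Z c}{137},$$

where this relation has to be fulfilled for the velocity $v$ of the electron in the initial and final state.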
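The kinematic relations and the Born validity condition can be evaluated for a concrete case. The sketch below assumes the relativistic relation E_kin + m_e c² = √(m_e² c⁴ + p² c²) and expresses the validity condition as the ratio Zα/β being much smaller than 1, with α ≈ 1/137; the function names and the choice of nitrogen (Z = 7) at 1 MeV are illustrative.

```python
import math

ELECTRON_REST_ENERGY_KEV = 510.99895   # m_e c^2 in keV
FINE_STRUCTURE = 1.0 / 137.035999      # fine-structure constant


def momentum_times_c_kev(kinetic_kev):
    """|p| c in keV from the kinetic energy in keV."""
    total = kinetic_kev + ELECTRON_REST_ENERGY_KEV
    return math.sqrt(total ** 2 - ELECTRON_REST_ENERGY_KEV ** 2)


def beta_from_kinetic(kinetic_kev):
    """Electron speed as a fraction of c: beta = p c / E."""
    total = kinetic_kev + ELECTRON_REST_ENERGY_KEV
    return momentum_times_c_kev(kinetic_kev) / total


def born_ratio(z, kinetic_kev):
    """Z * alpha / beta; the Born approximation needs this to be << 1."""
    return z * FINE_STRUCTURE / beta_from_kinetic(kinetic_kev)


for kinetic in (1.0, 30.0, 1000.0):
    print(f"E_kin = {kinetic:7.1f} keV: beta = {beta_from_kinetic(kinetic):.4f}, "
          f"Z*alpha/beta (Z=7) = {born_ratio(7, kinetic):.3f}")
```

Since the condition must hold in both the initial and the final state, the check should also be run for the electron energy after the photon has been emitted.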

For practical applications (e.g. in Monte Carlo codes) it can be interesting to focus on the relation between the frequency $\omega$ of the emitted photon and the angle between this photon and the incident electron. Köhn and Ebert integrated the quadruply differential cross section by Bethe and Heitler over $\Phi$ and $\Theta_f$ and obtained:

with

and

However, a much simpler expression for the same integral can be found in (Eq. 2BN) and in (Eq. 4.1).

An analysis of the doubly differential cross section above shows that electrons whose kinetic energy is larger than the rest energy (511 keV) emit photons in forward direction while electrons with a small energy emit photons isotropically.

Electron–electron bremsstrahlung

One mechanism, considered important for small atomic numbers $Z$, is the scattering of a free electron at the shell electrons of an atom or molecule. Since electron–electron bremsstrahlung is a function of $Z$ and the usual electron-nucleus bremsstrahlung is a function of $Z^2$, electron–electron bremsstrahlung is negligible for metals. For air, however, it plays an important role in the production of terrestrial gamma-ray flashes.

Self-awareness

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Sel...