An ecological fallacy (also ecological inference fallacy or population fallacy) is a formal fallacy in the interpretation of statistical data that occurs when inferences
about the nature of individuals are deduced from inferences about the
group to which those individuals belong. 'Ecological fallacy' is a term
that is sometimes used to describe the fallacy of division,
which is not a statistical fallacy. The four common statistical
ecological fallacies are: confusion between ecological correlations and
individual correlations, confusion between group average and total
average, Simpson's paradox, and confusion between higher average and higher likelihood.
Examples
Mean and median
An example of ecological fallacy is the assumption that a population mean
has a simple interpretation when considering likelihoods for an
individual.
For instance, if the mean score of a group is larger than zero,
this does not imply that a random individual of that group is more
likely to have a positive score than a negative one (as long as there
are more negative scores than positive scores, an individual is more
likely to have a negative score). Similarly, if a particular group of
people is measured to have a lower mean IQ than the general population,
it is an error to conclude that a randomly selected member of the group
is more likely than not to have a lower IQ than the mean IQ of the
general population; it is also not necessarily the case that a randomly
selected member of the group is more likely than not to have a lower IQ
than a randomly selected member of the general population.
Mathematically, this comes from the fact that a distribution can have a
positive mean but a negative median. This property is linked to the skewness of the distribution.
Consider the following numerical example:
Group A: 80% of people got 40 points and 20% of them got 95 points. The mean score is 51 points.
Group B: 50% of people got 45 points and 50% got 55 points. The mean score is 50 points.
If we pick two people at random from A and B, there are 4 possible outcomes:
A – 40, B – 45 (B wins, 40% probability – 0.8 × 0.5)
A – 40, B – 55 (B wins, 40% probability – 0.8 × 0.5)
A – 95, B – 45 (A wins, 10% probability – 0.2 × 0.5)
A – 95, B – 55 (A wins, 10% probability – 0.2 × 0.5)
Although Group A has a higher mean score, 80% of the time a random
individual of A will score lower than a random individual of B.
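The arithmetic of this example can be checked by enumerating the four outcomes directly. The following is a minimal sketch in Python; the helper names `mean` and `prob_first_lower` are illustrative and not part of any source, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Score distributions from the example: score -> share of the group.
group_a = {40: Fraction(4, 5), 95: Fraction(1, 5)}
group_b = {45: Fraction(1, 2), 55: Fraction(1, 2)}


def mean(dist):
    """Mean score of a discrete distribution given as {score: probability}."""
    return sum(score * p for score, p in dist.items())


def prob_first_lower(first, second):
    """Probability that a random draw from `first` scores below a random draw from `second`."""
    return sum(
        p * q
        for (x, p), (y, q) in product(first.items(), second.items())
        if x < y
    )


mean_a = mean(group_a)
mean_b = mean(group_b)
p_a_lower = prob_first_lower(group_a, group_b)

print(mean_a, mean_b, p_a_lower)  # prints: 51 50 4/5
```

Group A has the higher mean (51 against 50), yet a random member of A scores below a random member of B with probability 4/5.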
Individual and aggregate correlations
Research dating back to Émile Durkheim suggests that predominantly Protestant localities have higher suicide rates than predominantly Catholic localities. According to Freedman,
the idea that Durkheim's findings link, at an individual level, a
person's religion to his or her suicide risk is an example of the
ecological fallacy. A group-level relationship does not automatically
characterize the relationship at the level of the individual.
Similarly, even if at the individual level, wealth is positively
correlated to tendency to vote Republican, we observe that wealthier
states tend to vote Democratic. For example, in 2004, the Republican
candidate, George W. Bush, won the fifteen poorest states, and the Democratic candidate, John Kerry,
won 9 of the 11 wealthiest states. Yet 62% of voters with annual
incomes over $200,000 voted for Bush, whereas only 36% of voters with annual
incomes of $15,000 or less did.
Aggregate-level correlation will differ from individual-level
correlation if voting preferences are affected by the total wealth of
the state even after controlling for individual wealth. It could be that
the true driving factor in voting preference is self-perceived relative
wealth; perhaps those who see themselves as better off than their
neighbors are more likely to vote Republican. In this case, an
individual would be more likely to vote Republican if she became
wealthier, but she would be more likely to vote for a Democrat if her
neighbor's wealth increased (resulting in a wealthier state).
However, the observed difference in voting habits based on
state-level and individual-level wealth could also be explained by the
common confusion between higher averages and higher likelihoods as
discussed above. States may not be wealthier because they contain more
wealthy people (i.e. more people with annual incomes over $200,000), but
rather because they contain a small number of super-rich individuals;
the ecological fallacy then results from incorrectly assuming that
individuals in wealthier states are more likely to be wealthy.
Many examples of ecological fallacies can be found in studies of
social networks, which often combine analysis and implications from
different levels. This has been illustrated in an academic paper on
networks of farmers in Sumatra.
Robinson's paradox
A 1950 paper by William S. Robinson computed the illiteracy rate and the
proportion of the population born outside the US for each state and for
the District of Columbia, as of the 1930 census.
He showed that these two figures were associated with a negative
correlation of −0.53; in other words, the greater the proportion of
immigrants in a state, the lower its average illiteracy. However, when
individuals were considered, the correlation was +0.12 (immigrants were
on average more illiterate than native citizens). Robinson showed that
the negative correlation at the level of state populations was because
immigrants tended to settle in states where the native population was
more literate. He cautioned against deducing conclusions about
individuals on the basis of population-level, or "ecological" data. In
2011, it was found that Robinson's calculations of the ecological
correlations were based on the wrong state-level data: the correlation of
−0.53 mentioned above is in fact −0.46. Robinson's paper was seminal, but the term 'ecological fallacy' was not coined until 1958, by Selvin.
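The sign reversal Robinson described can be reproduced on a small synthetic data set. The sketch below uses hypothetical counts for three made-up states (not Robinson's census data): immigrants are more often illiterate than natives within every state, but they are concentrated in the states where natives are most literate. The function `pearson` and all variable names are assumptions introduced for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Hypothetical counts per state (NOT real census figures):
# (illiterate immigrants, literate immigrants, illiterate natives, literate natives)
states = {
    "S1": (3, 7, 18, 72),
    "S2": (6, 24, 7, 63),
    "S3": (5, 45, 1, 49),
}

# Expand the counts into one record per person: (state, immigrant, illiterate).
individuals = []
for name, (imm_ill, imm_lit, nat_ill, nat_lit) in states.items():
    individuals += [(name, 1, 1)] * imm_ill
    individuals += [(name, 1, 0)] * imm_lit
    individuals += [(name, 0, 1)] * nat_ill
    individuals += [(name, 0, 0)] * nat_lit

# Individual-level correlation between being an immigrant and being illiterate.
individual_r = pearson([p[1] for p in individuals], [p[2] for p in individuals])

# Ecological correlation between each state's immigrant share and illiteracy rate.
immigrant_share, illiteracy_rate = [], []
for name in states:
    members = [p for p in individuals if p[0] == name]
    immigrant_share.append(sum(p[1] for p in members) / len(members))
    illiteracy_rate.append(sum(p[2] for p in members) / len(members))
ecological_r = pearson(immigrant_share, illiteracy_rate)

print(f"individual: {individual_r:+.2f}, ecological: {ecological_r:+.2f}")
```

With these made-up counts the individual-level correlation is positive while the state-level correlation is negative, which is the same qualitative pattern as in Robinson's paper.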
Formal problem
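The correlation of aggregate quantities (or ecological correlation) is not equal to the correlation of individual quantities. Denote by $X_i$, $Y_i$ two quantities at the individual level. The formula for the covariance of the aggregate quantities in groups of size $N$ is
$$\operatorname{cov}\left(\sum_{i=1}^{N} Y_i, \sum_{i=1}^{N} X_i\right) = \sum_{i=1}^{N} \operatorname{cov}(Y_i, X_i) + \sum_{i=1}^{N} \sum_{l \neq i} \operatorname{cov}(Y_l, X_i)$$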
The covariance of two aggregated variables depends not only on the
covariance of two variables within the same individuals but also on
covariances of the variables between different individuals. In other
words, the correlation of aggregate variables takes into account
cross-sectional effects which are not relevant at the individual level.
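As a worked illustration of why covariances between different individuals enter the aggregate covariance, consider the smallest case of a group of two individuals. This is an illustrative special case, with $X_1, X_2, Y_1, Y_2$ denoting the two individuals' quantities; it follows from bilinearity of covariance alone:

```latex
\begin{aligned}
\operatorname{cov}(Y_1 + Y_2,\; X_1 + X_2)
  &= \underbrace{\operatorname{cov}(Y_1, X_1) + \operatorname{cov}(Y_2, X_2)}_{\text{within individuals}} \\
  &\quad + \underbrace{\operatorname{cov}(Y_1, X_2) + \operatorname{cov}(Y_2, X_1)}_{\text{between individuals}}
\end{aligned}
```

The aggregate covariance equals the sum of the individual covariances only when the two cross terms vanish or cancel each other.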
The problem for correlations entails naturally a problem for
regressions on aggregate variables: the correlation fallacy is therefore
an important issue for a researcher who wants to measure causal
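impacts. Start with a regression model where the outcome $Y_i$ is impacted by $X_i$, with error term $u_i$:
$$Y_i = \alpha + \beta X_i + u_i, \qquad \operatorname{cov}(u_i, X_i) = 0.$$
The regression model at the aggregate level is obtained by summing the individual equations:
$$\sum_{i=1}^{N} Y_i = \alpha N + \beta \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} u_i.$$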
Nothing prevents the regressors and the errors from being correlated
at the aggregate level. Therefore, generally, running a regression on
aggregate data does not estimate the same model as running a
regression with individual data.
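The aggregate model is correct if and only if, writing $u_i$ for the individual error term,
$$\mathbb{E}\left[u_i \,\middle|\, \sum_{k=1}^{N} X_k\right] = 0 \quad \text{for all } i.$$
This means that, controlling for $X_i$, $\sum_{k=1}^{N} X_k$ does not determine $Y_i$.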
Choosing between aggregate and individual inference
There is nothing wrong with running regressions on aggregate data if one is
interested in the aggregate model. For instance, for the governor of a
state, it is correct to regress the crime rate on police force
at the state level if one is interested in the policy implication
of a rise in police force. However, an ecological fallacy would occur
if a city council deduced the impact of an increase in police force on
the crime rate at the city level from the correlation at the state
level.
Choosing to run aggregate or individual regressions to understand
aggregate impacts on some policy depends on the following trade-off:
aggregate regressions lose individual level data but individual
regressions add strong modeling assumptions. Some researchers suggest
that the ecological correlation gives a better picture of the outcome of
public policy actions, thus they recommend the ecological correlation
over the individual level correlation for this purpose (Lubinski &
Humphreys, 1996). Other researchers disagree, especially when the
relationships among the levels are not clearly modeled. To prevent
ecological fallacy, researchers with no individual data can model first
what is occurring at the individual level, then model how the individual
and group levels are related, and finally examine whether anything
occurring at the group level adds to the understanding of the
relationship. For instance, in evaluating the impact of state policies,
it is helpful to know that policy impacts vary less among the states
than do the policies themselves, suggesting that the policy differences
are not well translated into results, despite high ecological
correlations (Rose, 1973).
Group and total averages
Ecological fallacy can also refer to the following fallacy: the average for a
group is approximated by the average in the total population divided by
the group size. Suppose one knows the number of Protestants and the
suicide rate in the USA, but one does not have data linking religion and
suicide at the individual level. If one is interested in the suicide
rate of Protestants, it is a mistake to estimate it by the total suicide
rate divided by the number of Protestants.
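Formally, denoting by $\bar{Y}_P$ the mean of the group, by $\bar{Y}$ the mean in the total population, and by $N_P$ the group size, we generally have:
$$\bar{Y}_P \neq \frac{\bar{Y}}{N_P}$$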
A striking ecological fallacy is Simpson's paradox: the fact
that when comparing two populations divided into groups, the average of
some variable in the first population can be higher in every group and
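yet lower in the total population. Formally, when each value of $Z$ refers to a different group and $X$ refers to some treatment, it can happen that
$$\mathbb{E}[Y \mid Z = z, X = 1] > \mathbb{E}[Y \mid Z = z, X = 0] \ \text{ for all } z, \quad \text{while} \quad \mathbb{E}[Y \mid X = 1] < \mathbb{E}[Y \mid X = 0].$$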
When $X$ does not depend on $Z$, Simpson's paradox is exactly the omitted variable bias for the regression of $Y$ on $X$ where the regressor is a dummy variable and the omitted variable is a categorical variable
defining groups for each value it takes. The application is striking
because the bias is high enough that parameters have opposite signs.
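A small numerical instance makes the reversal concrete. The sketch below uses hypothetical success counts for two groups and a binary treatment (the counts and the names `counts`, `rate`, `within`, `overall_treated` and `overall_control` are invented for illustration): the treated do better inside each group, but worse once the groups are pooled, because the treated are concentrated in the group with the lower success rate.

```python
from fractions import Fraction

# Hypothetical counts: (successes, total) for each (group z, treatment x).
counts = {
    ("z1", 1): (18, 20),
    ("z1", 0): (64, 80),
    ("z2", 1): (40, 80),
    ("z2", 0): (8, 20),
}


def rate(cells):
    """Pooled success rate over a list of (successes, total) cells."""
    successes = sum(s for s, _ in cells)
    total = sum(t for _, t in cells)
    return Fraction(successes, total)


groups = sorted({z for z, _ in counts})

# Success rate of treated vs. untreated within each group.
within = {z: (rate([counts[(z, 1)]]), rate([counts[(z, 0)]])) for z in groups}

# Success rate of treated vs. untreated after pooling the groups.
overall_treated = rate([counts[(z, 1)] for z in groups])
overall_control = rate([counts[(z, 0)] for z in groups])

for z, (treated, control) in within.items():
    print(z, float(treated), float(control))
print("overall", float(overall_treated), float(overall_control))
```

In both groups the treated rate is higher (0.9 against 0.8, and 0.5 against 0.4), while the pooled treated rate is 0.58 against 0.72 for the untreated.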
Legal applications
The ecological fallacy was discussed in a court challenge to the 2004 Washington gubernatorial election, in which a number of illegal voters were identified after the election; their votes were unknown because the vote was by secret ballot.
The challengers argued that illegal votes cast in the election would
have followed the voting patterns of the precincts in which they had
been cast, and thus adjustments should be made accordingly. An expert witness said this approach was like trying to figure out Ichiro Suzuki's batting average by looking at the batting average of the entire Seattle Mariners
team, since the illegal votes were cast by an unrepresentative sample
of each precinct's voters, and might be as different from the average
voter in the precinct as Ichiro was from the rest of his team. The judge determined that the challengers' argument was an ecological fallacy and rejected it.
Europeans are the focus of European ethnology, the field of anthropology related to the various indigenous groups that reside in the states of Europe. Groups may be defined by common genetic ancestry, common language, or both. Pan and Pfeil (2004) count 87 distinct "peoples of Europe", of which 33 form the majority population in at least one sovereign state, while the remaining 54 constitute ethnic minorities.
The total number of national minority populations in Europe is
estimated at 105 million people, or 14% of 770 million Europeans. The Russians are the most populous among Europeans, with a population over 134 million.
There are no universally accepted and precise definitions of the terms "ethnic group" and "nationality". In the context of European ethnography in particular, the terms ethnic group, people, nationality and ethno-linguistic group,
are used as mostly synonymous, although preference may vary in usage
with respect to the situation specific to the individual countries of
Europe.
About 20–25 million residents (3%) are members of diasporas of non-European origin. The population of the European Union, with some 450 million residents, accounts for two thirds of the current European population.
Of the total population of Europe of some 740 million (as of 2010),
close to 90% (or some 650 million) fall within three large branches of Indo-European languages.
Indo-Aryan is represented by the Romani language spoken by Roma people of eastern Europe, and is at root related to the Indo-Aryan languages of the Indian subcontinent.
Besides the Indo-European languages, there are other language families on the European continent which are considered unrelated to Indo-European:
Language isolates: Basque,
spoken in the Basque regions of Spain and France, is an isolate
language, the only one in Europe, and is believed to be unrelated to any
other language, living or extinct.
The Basques have been found to descend directly from the population of the late Neolithic or early Bronze Age. By contrast, Indo-European groups of Europe (the Centum, Balto-Slavic, and Albanian groups) migrated throughout most of Europe from the Pontic steppe. They are assumed to have developed in situ through admixture of earlier Mesolithic and Neolithic populations with Bronze Age proto-Indo-Europeans.
The Finnic peoples are assumed to also be descended from Proto-Uralic populations further to the east, nearer to the Ural Mountains, that had migrated to their historical homelands in Europe by about 3,000 years ago. A more recent study in 2019 found that Proto-Uralic may have originated further east in Siberia
among East Asian-related populations. The authors link the arrival
of Uralic languages to the arrival of Siberian-like ancestry to the Baltic region.
Regarding the European Bronze Age, the only relatively likely reconstruction is that of Proto-Greek (ca. 2000 BC). A Proto-Italo-Celtic ancestor of both Italic and Celtic (assumed for the Bell beaker period), and a Proto-Balto-Slavic language (assumed for roughly the Corded Ware horizon) have been postulated with less confidence. Old European hydronymy has been taken as indicating an early (Bronze Age) Indo-European predecessor of the later centum languages.
According to geneticist David Reich, based on ancient human genomes that his laboratory sequenced in 2016, Europeans descend from a mixture of four distinct ancestral components.
The western Kipchaks known as Cumans entered the lands of present-day Ukraine in the 11th century.
The Mongol/Tatar invasions (1223–1480), and Ottoman control of the Balkans (1389–1878). These medieval incursions account for the presence of European Turks and Tatars.
Book IX of Isidore's Etymologiae (7th century) treats de linguis, gentibus, regnis, militia, civibus (concerning languages, peoples, realms, war and cities).
Ahmad ibn Fadlan in the 10th century gives an account of the Bolghar and the Rus' peoples.
William Rubruck, while most notable for his account of the Mongols, in his account of his journey to Asia also gives accounts of the Tatars and the Alans.
Saxo Grammaticus and Adam of Bremen give an account of pre-Christian Scandinavia. The Chronicon Slavorum (12th century) gives an account of the northwestern Slavic tribes.
Gottfried Hensel in his 1741 Synopsis Universae Philologiae published one of the earliest ethno-linguistic maps of Europe, showing the beginning of the pater noster in the various European languages and scripts.
In the 19th century, ethnicity was discussed in terms of scientific racism, and the ethnic groups of Europe were grouped into a number of "races", Mediterranean, Alpine and Nordic, all part of a larger "Caucasian" group.
The beginnings of ethnic geography as an academic subdiscipline lie in the period following World War I, in the context of nationalism, and in the 1930s exploitation for the purposes of fascist and Nazi propaganda, so that it was only in the 1960s that ethnic geography began to thrive as a bona fide academic subdiscipline.
The origins of modern ethnography are often traced to the work of Bronisław Malinowski, who emphasized the importance of fieldwork.
The emergence of population genetics further undermined the categorisation of Europeans into clearly defined racial groups. A 2007 study on the genetic history of Europe
found that the most important genetic differentiation in Europe occurs
on a line from the north to the south-east (northern Europe to the
Balkans), with another east–west axis of differentiation across Europe,
separating the indigenous Basques, Sardinians and Sami
from other European populations.
Despite these stratifications it noted the unusually high degree of
European homogeneity: "there is low apparent diversity in Europe with
the entire continent-wide samples only marginally more dispersed than
single population samples elsewhere in the world."
The total number of national minority populations in Europe is estimated at 105 million people, or 14% of Europeans.
The member states of the Council of Europe in 1995 signed the Framework Convention for the Protection of National Minorities.
The broad aims of the Convention are to ensure that the signatory
states respect the rights of national minorities, undertaking to combat
discrimination, promote equality, preserve and develop the culture and
identity of national minorities, guarantee certain freedoms in relation
to access to the media, minority languages and education and encourage
the participation of national minorities in public life. The Framework
Convention for the Protection of National Minorities defines a national
minority implicitly to include minorities possessing a territorial
identity and a distinct cultural heritage. By 2008, 39 member states had
signed and ratified the Convention, with the notable exception of France.
Indigenous minorities
Various European ethnic groups have lived there for millennia.
However, the UN recognizes very few indigenous populations of Europe,
which are confined to the far north and far east of the continent.
Many non-European ethnic groups and nationalities have migrated to
Europe over the centuries. Some arrived centuries ago. However, the vast
majority arrived more recently, mostly in the 20th and 21st centuries.
Often, they come from former colonies of the British, Dutch, French,
Portuguese and Spanish empires.
Ashkenazi Jews: approx. 1.4 million, mostly in the United Kingdom, France, Russia, Germany and Ukraine. They are believed by scholars to have arrived from Israel via southern Europe in the Roman era and settled in France and Germany towards the end of the first millennium. The Nazi Holocaust wiped out the vast majority during World War II and forced most to flee, with many of them going to Israel.
Sephardi Jews: approx. 0.3 million, mostly in France. They arrived via Spain and Portugal in the pre-Roman and Roman eras, and were forcibly converted or expelled in the 15th and 16th centuries.
Mizrahi Jews: approx. 0.3 million, mostly in France, via Islamic-majority countries of the Middle East.
Italqim: approx. 50,000, mostly in Italy, since the 2nd century BC.
Romaniotes: approx. 6,000, mostly in Greece, with communities dating at least from the 1st century AD.
Assyrians: mostly in Sweden and Germany, as well as in Russia, Armenia, Denmark and Great Britain (see Assyrian diaspora). Assyrians have been present in Eastern Turkey since the Bronze Age (circa 2000 BCE).
Kurds: approx. 2.5 million, mostly in the UK, Germany, Sweden and Turkey.
Horn Africans (Somalis, Ethiopians, Eritreans, Djiboutians, and the Northern Sudanese):
approx. 700,000, mostly in Scandinavia, the UK, the Netherlands,
Germany, Switzerland, Austria, Finland, and Italy. The majority arrived in
Europe as refugees. Proportionally few live in Italy despite former colonial ties; most live in the Nordic countries.
Sub-Saharan Africans (many ethnicities including Afro-Caribbeans, African-Americans, Afro-Latinos
and others by descent): approx. 5 million, mostly in the UK and France,
with smaller numbers in the Netherlands, Germany, Italy, Spain,
Portugal and elsewhere.
Latin Americans: approx. 2.2 million, mainly in Spain and to a lesser extent Italy and the UK. See also Latin American Britons (80,000 Latin American born in 2001).
Chilean refugees escaping the Augusto Pinochet regime of the 1970s formed communities in France, Sweden, the UK, former East Germany and the Netherlands.
Mexicans: about 21,000 in Spain and 14,000 in Germany
Venezuelans:
around 520,000 mostly in Spain (200,000), Portugal (100,000), France
(30,000), Germany (20,000), UK (15,000), Ireland (5,000), Italy (5,000)
and the Netherlands (1,000).
South Asians: approx. 3–4 million, mostly in the UK, with smaller numbers in Germany and France.
Romani
(Gypsies): approx. 4 to 10 million (estimates vary widely),
dispersed throughout Europe but with large numbers concentrated in the
Balkans. They are of ancestral South Asian and European descent, originating from the northern regions of the Indian subcontinent.
Indians: approx. 2 million, mostly in the UK, also in Italy, in Germany and smaller numbers in Ireland.
Pakistanis: approx. 1,000,000, mostly in the UK and in Italy, but also in Norway and Sweden.
Bangladeshi residing in Europe estimated at over 500,000, mostly in the UK and in Italy.
Sri Lankans: approx. 200,000, mainly in the UK and in Italy.
Amerindians and Inuit, a scant few in the European continent of American Indian ancestry (often Latin Americans in Spain, France and the UK; Inuit in Denmark),
but most may be children or grandchildren of U.S. soldiers from
American Indian tribes by intermarriage with local European women.
Medieval notions of a relation of the peoples of Europe are expressed in terms of genealogy of mythical founders of the individual groups.
The Europeans were considered the descendants of Japheth from early times, corresponding to the division of the known world into three continents, the descendants of Shem peopling Asia and those of Ham peopling Africa. Identification of Europeans as "Japhetites" is also reflected in early suggestions for terming the Indo-European languages "Japhetic".
The first man that dwelt in Europe was Alanus,
with his three sons, Hisicion, Armenon, and Neugio. Hisicion had four
sons, Francus, Romanus, Alamanus, and Bruttus. Armenon had five sons,
Gothus, Valagothus, Cibidus, Burgundus, and Longobardus. Neugio had
three sons, Vandalus, Saxo, and Boganus.
European culture is largely rooted in what is often referred to as its "common cultural heritage".
Due to the great number of perspectives which can be taken on the
subject, it is impossible to form a single, all-embracing conception of
European culture. Nonetheless, there are core elements which are generally agreed upon as forming the cultural foundation of modern Europe. One list of these elements given by K. Bochmann includes:
A specific conception of the individual expressed by the existence of, and respect for, a legality that guarantees human rights and the liberty of the individual;
A plurality of states with different political orders, which are condemned to live together in one way or another;
Respect for peoples, states and nations outside Europe.
Berting says that these points fit with "Europe's most positive realisations".
The concept of European culture is generally linked to the classical definition of the Western world. In this definition, Western culture is the set of literary, scientific, political, artistic and philosophical principles which set it apart from other civilizations. Much of this set of traditions and knowledge is collected in the Western canon.
The term has come to apply to countries whose history has been strongly
marked by European immigration or settlement during the 18th and 19th
centuries, such as the Americas, and Australasia, and is not restricted to Europe.
Christianity has been the dominant religion shaping European culture for at least the last 1700 years.
Modern philosophical thought has been greatly influenced by Christian
philosophers such as St Thomas Aquinas and Erasmus. Throughout most
of its history, Europe has been nearly equivalent to Christian culture. Christian culture was the predominant force in Western civilization, guiding the course of philosophy, art, and science. The notion of "Europe" and the "Western World" has been intimately connected with the concept of "Christianity and Christendom"; many even credit Christianity with being the link that created a unified European identity.
Christianity is still the largest religion in Europe; according to a 2011 survey, 76.2% of Europeans considered themselves Christians. Also according to a study on Religiosity in the European Union in 2012, by Eurobarometer, Christianity is the largest religion in the European Union, accounting for 72% of the EU's population. As of 2010 Catholics were the largest Christian group in Europe, accounting for more than 48% of European Christians. The second-largest Christian group in Europe were the Orthodox, who made up 32% of European Christians. About 19% of European Christians were part of the Protestant tradition. Russia is the largest Christian country in Europe by population, followed by Germany and Italy.
Islam has some tradition in the Balkans and the Caucasus due to conquest and colonization from the Ottoman Empire in the 16th to 19th centuries, as well as earlier though discontinued long-term presence in much of Iberia as well as Sicily. Muslims account for the majority of the populations in Albania, Azerbaijan, Kosovo, Northern Cyprus (controlled by Turks), and Bosnia and Herzegovina. Significant minorities are present in the rest of Europe. Russia also has one of the largest Muslim communities in Europe, including the Tatars of the Middle Volga and multiple groups in the Caucasus, including Chechens, Avars, Ingush and others. With 20th-century migrations, Muslims in Western Europe have become a noticeable minority. According to the Pew Forum, the total number of Muslims in Europe in 2010 was about 44 million (6%), while the total number of Muslims in the European Union in 2007 was about 16 million (3.2%).
Judaism has a long history in Europe, but is a small minority religion, with France
(1%) the only European country with a Jewish population in excess of
0.5%. The Jewish population of Europe is composed primarily of two groups, the Ashkenazi and the Sephardi. Ancestors of Ashkenazi Jews likely migrated to Central Europe at least as early as the 8th century, while Sephardi Jews established themselves in Spain and Portugal at least one thousand years before that. Jews originated in the Levant
where they resided for thousands of years until the 2nd century AD,
when they spread around the Mediterranean and into Europe, although
small communities were known to exist in Greece as well as the Balkans
since at least the 1st century BC. Jewish history was notably affected
by the Holocaust and emigration (including Aliyah, as well as emigration to America)
in the 20th century. The Jewish population of Europe in 2010 was
estimated to be approximately 1.4 million (0.2% of European population)
or 10% of the world's Jewish population. In the 21st century, France has the largest Jewish population in Europe, followed by the United Kingdom, Germany, Russia and Ukraine.
In modern times, Europe has seen significant secularization since the 20th century, notably in secularist France, Estonia and the Czech Republic. Currently, the distribution of theism in Europe is very heterogeneous, with more than 95% in Poland, and less than 20% in the Czech Republic and Estonia. The 2005 Eurobarometer poll found that 52% of EU citizens believe in God. According to a Pew Research Center Survey in 2012, the Religiously Unaffiliated (Atheists and Agnostics) made up about 18.2% of the European population in 2010.
According to the same Survey the Religiously Unaffiliated make up the
majority of the population in only two European countries: Czech
Republic (76%) and Estonia (60%).
"Pan-European identity" or "Europatriotism" is an emerging sense of personal identification with Europe, or the European Union as a result of the gradual process of European integration taking place over the last quarter of the 20th century, and especially in the period after the end of the Cold War, since the 1990s. The foundation of the OSCE following the 1990s Paris Charter has facilitated this process on a political level during the 1990s and 2000s.
From the later 20th century, 'Europe' has come to be widely used as a synonym for the European Union even though there are millions of people living on the European continent in non-EU member states. The prefix pan
implies that the identity applies throughout Europe, and especially in
an EU context, and 'pan-European' is often contrasted with national identity.