Human taxonomy is the classification of the human species (systematic name Homo sapiens, Latin: "wise man") within zoological taxonomy. The systematic genus, Homo, is designed to include both anatomically modern humans and extinct varieties of archaic humans. Current humans have been designated as subspecies Homo sapiens sapiens, differentiated, according to some, from the direct ancestor, Homo sapiens idaltu (with some other research instead classifying idaltu and current humans as belonging to the same subspecies).
Since the introduction of systematic names in the 18th century, knowledge of human evolution
has increased drastically, and a number of intermediate taxa have been
proposed in the 20th and early 21st centuries. The most widely accepted
taxonomy grouping takes the genus Homo as originating between two and three million years ago, divided into at least two species, archaic Homo erectus and modern Homo sapiens, with about a dozen further suggestions for species without universal recognition.
The genus Homo is placed in the tribe Hominini alongside Pan (chimpanzees). The two genera are estimated to have diverged
over an extended time of hybridization, spanning roughly 10 to 6
million years ago, with possible admixture as late as 4 million years
ago. A subtribe of uncertain validity, grouping archaic "pre-human" or
"para-human" species younger than the Homo-Pan split, is Australopithecina (proposed in 1939).
A proposal by Wood and Richmond (2000) would introduce Hominina as a subtribe alongside Australopithecina, with Homo
the only known genus within Hominina. Alternatively, following
Cela-Conde and Ayala (2003), the "pre-human" or "proto-human" genera of Australopithecus, Ardipithecus, Praeanthropus, and possibly Sahelanthropus, may be placed on equal footing alongside the genus Homo. An even more extreme view rejects the division of Pan and Homo as separate genera, which based on the Principle of Priority would imply the reclassification of chimpanzees as Homo paniscus (or similar).
Categorizing humans based on phenotypes is a socially controversial subject. Biologists originally classified races as subspecies,
but contemporary anthropologists reject the concept of race as a useful
tool for understanding humanity, and instead view humanity as a complex,
interrelated genetic continuum. Taxonomy of the hominins continues to
evolve.
Human taxonomy on one hand involves the placement of humans within the taxonomy of the hominids (great apes), and on the other the division of archaic and modern humans into species and, if applicable, subspecies. Modern zoological taxonomy was developed by Carl Linnaeus
during the 1730s to 1750s. He was the first to develop the idea that,
like other biological entities, groups of people could also share
taxonomic classifications. He named the human species Homo sapiens in 1758, as the only member species of the genus Homo, divided into several subspecies corresponding to the great races. The Latin noun homō (genitive hominis) means "human being". The systematic name Hominidae for the family of the great apes was introduced by John Edward Gray (1825). Gray also supplied Hominini as the name of the tribe including both chimpanzees (genus Pan) and humans (genus Homo).
The discovery of the first extinct archaic human species from the fossil record dates to the mid-19th century: Homo neanderthalensis,
classified in 1864. Since then, a number of other archaic species have
been named, but there is no universal consensus as to their exact
number. After the discovery of H. neanderthalensis, which even if
"archaic" is recognizable as clearly human, late 19th to early 20th
century anthropology for a time was occupied with finding the supposedly
"missing link" between Homo and Pan. The "Piltdown Man"
hoax of 1912 was the fraudulent presentation of such a transitional
species. Since the mid-20th century, knowledge of the development of
Hominini has become much more detailed, and taxonomical terminology has
been altered a number of times to reflect this.
The introduction of Australopithecus as a third genus, alongside Homo and Pan, in the tribe Hominini is due to Raymond Dart (1925). Australopithecina as a subtribe containing Australopithecus as well as Paranthropus (Broom
1938) is a proposal by Gregory & Hellman (1939). More recently
proposed additions to the Australopithecina subtribe include Ardipithecus (1995) and Kenyanthropus (2001). The position of Sahelanthropus (2002) relative to Australopithecina within Hominini is unclear. Cela-Conde and Ayala (2003) propose the recognition of Australopithecus, Ardipithecus, Praeanthropus, and Sahelanthropus (the latter incertae sedis) as separate genera.
The genus Homo has been taken to originate some two million years ago, since the discovery of stone tools in Olduvai Gorge, Tanzania, in the 1960s. Homo habilis (Leakey et al., 1964) would be the first "human" species (member of genus Homo) by definition, its type specimen being the OH 7 fossils. However, the discovery of more fossils of this type has opened up the debate on the delineation of H. habilis from Australopithecus. In particular, the LD 350-1 jawbone fossil discovered in 2013, dated to 2.8 Mya, has been argued to be transitional between the two. It is also disputed whether H. habilis was the first hominin to use stone tools, as Australopithecus garhi, dated to c. 2.5 Mya, has been found along with stone tool implements. Fossil KNM-ER 1470 (discovered in 1972, designated Pithecanthropus rudolfensis by Alekseyev 1978) is now seen as either a third early species of Homo (alongside H. habilis and H. erectus) at about 2 million years ago, or alternatively as transitional between Australopithecus and Homo.
Wood and Richmond (2000) proposed that Gray's tribe Hominini ("hominins") be designated as comprising all species after the chimpanzee–human last common ancestor by definition, to the inclusion of Australopithecines and other possible pre-human or para-human species (such as Ardipithecus and Sahelanthropus) not known in Gray's time. In this suggestion, the new subtribe of Hominina was to be designated as including the genus Homo
exclusively, so that Hominini would have two subtribes,
Australopithecina and Hominina, with the only known genus in Hominina
being Homo. Orrorin (2001) has been proposed as a possible ancestor of Hominina but not Australopithecina.
Designations alternative to Hominina have been proposed:
Australopithecinae (Gregory & Hellman 1939) and Preanthropinae
(Cela-Conde & Altaba 2002).
At least a dozen species of Homo other than Homo sapiens have been proposed, with varying degrees of consensus. Homo erectus is widely recognized as the species directly ancestral to Homo sapiens. Most other proposed species are proposed as alternatively belonging to either Homo erectus or Homo sapiens as a subspecies. This concerns Homo ergaster in particular. One proposal divides Homo erectus into an African and an Asian variety; the African is Homo ergaster, and the Asian is Homo erectus sensu stricto. (Inclusion of Homo ergaster with Asian Homo erectus is Homo erectus sensu lato.) There appears to be a recent trend, with the availability of ever more difficult-to-classify fossils such as the Dmanisi skulls (2013) or Homo naledi fossils (2015) to subsume all archaic varieties under Homo erectus.
The recognition or nonrecognition of subspecies of Homo sapiens
has a complicated history. The rank of subspecies in zoology is
introduced for convenience, and not by objective criteria, based on
pragmatic consideration of factors such as geographic isolation and sexual selection. The informal taxonomic rank of race is variously considered equivalent or subordinate to the rank of subspecies, and the division of anatomically modern humans (H. sapiens) into subspecies is closely tied to the recognition of major racial groupings based on human genetic variation.
A subspecies cannot be recognized independently: a species will
either be recognized as having no subspecies at all or at least two
(including any that are extinct). Therefore, the designation of an
extant subspecies Homo sapiens sapiens only makes sense if at least one other subspecies is recognized. H. s. sapiens is attributed to "Linnaeus (1758)" by the taxonomic Principle of Coordination. During the 19th to mid-20th century, it was common practice to classify the major divisions of extant H. sapiens as subspecies, following Linnaeus (1758), who had recognized H. s. americanus, H. s. europaeus, H. s. asiaticus and H. s. afer as grouping the native populations of the Americas, West Eurasia, East Asia and Sub-Saharan Africa, respectively. Linnaeus also included H. s. ferus, for the "wild" form which he identified with feral children, and two other "wild" forms for reported specimens now considered very dubious (see cryptozoology), H. s. monstrosus and H. s. troglodytes.
Homo sapiens neanderthalensis was proposed by King (1864) as an alternative to Homo neanderthalensis. There have been "taxonomic wars" over whether Neanderthals were a separate species since their discovery in the 1860s. Pääbo
(2014) frames this as a debate that is unresolvable in principle,
"since there is no definition of species perfectly describing the case." Louis Lartet (1869) proposed Homo sapiens fossilis based on the Cro-Magnon fossils.
There are a number of proposals of extinct varieties of Homo sapiens made in the 20th century. Many of the original proposals were not using explicit trinomial nomenclature, even though they are still cited as valid synonyms of H. sapiens by Wilson & Reeder (2005). These include: Homo grimaldii (Lapouge, 1906),
Homo aurignacensis hauseri (Klaatsch & Hauser, 1910),
Notanthropus eurafricanus (Sergi, 1911),
Homo fossilis infrasp. proto-aethiopicus (Giuffrida-Ruggeri, 1915),
Telanthropus capensis (Broom, 1917),
Homo wadjakensis (Dubois, 1921),
Homo sapiens cro-magnonensis, Homo sapiens grimaldiensis (Gregory, 1921),
Homo drennani (Kleinschmidt, 1931),
Homo galilensis (Joleaud, 1931) = Paleanthropus palestinus (McCown & Keith, 1932).
Rightmire (1983) proposed Homo sapiens rhodesiensis.
After World War II, the practice of dividing extant populations of Homo sapiens into subspecies declined. An early authority explicitly avoiding the division of H. sapiens into subspecies was Grzimeks Tierleben, published 1967–1972.
A late example of an academic authority proposing that the human racial groups should be considered taxonomical subspecies is John Baker (1974). The trinomial nomenclature Homo sapiens sapiens became popular for "modern humans" in the context of Neanderthals being considered a subspecies of H. sapiens in the second half of the 20th century. Derived from the convention, widespread in the 1980s, of considering two subspecies, H. s. neanderthalensis and H. s. sapiens, the explicit claim that "H. s. sapiens is the only extant human subspecies" appears in the early 1990s.
Since the 2000s, the extinct Homo sapiens idaltu (White et al., 2003) has gained wide recognition as a subspecies of Homo sapiens,
but even in this case there is a dissenting view arguing that "the
skulls may not be distinctive enough to warrant a new subspecies name". H. s. neanderthalensis and H. s. rhodesiensis continue to be considered separate species by some authorities, but the 2010s discovery of genetic evidence of archaic human admixture with modern humans has reopened the details of taxonomy of archaic humans.
Homo erectus
since its introduction in 1892 has been divided into numerous
subspecies, many of them formerly considered individual species of Homo. None of these subspecies have universal consensus among paleontologists.
Researchers have investigated the relationship between race and genetics as part of efforts to understand how biology may or may not contribute to human racial categorization. Today, the consensus among scientists is that race is a social construct, and that using it as a proxy for genetic differences among populations is misleading.
Many constructions of race are associated with phenotypical traits and geographic ancestry, and scholars like Carl Linnaeus have proposed scientific models for the organization of race since at least the 18th century. Following the discovery of Mendelian genetics and the mapping of the human genome, questions about the biology of race have often been framed in terms of genetics.
A wide range of research methods have been employed to examine patterns
of human variation and their relations to ancestry and racial groups,
including studies of individual traits, studies of large populations and genetic clusters, and studies of genetic risk factors for disease.
Research into race and genetics has also been criticized as emerging from, or contributing to, scientific racism. Genetic studies of traits and populations have been used to justify social inequalities associated with race, despite the fact that patterns of human variation have been shown to be mostly clinal,
with the human genetic code being approximately 99.6%–99.9% identical
between individuals, and with no clear boundaries between groups.
Some researchers have argued that race can act as a proxy for
genetic ancestry because individuals of the same racial category may
share a common ancestry, but this view has fallen increasingly out of
favor among experts.
The mainstream view is that it is necessary to distinguish between
biology and the social, political, cultural, and economic factors that
contribute to conceptions of race.
The concept of "race" as a classification system of humans based on
visible physical characteristics emerged over the last five centuries,
influenced by European colonialism. However, there is widespread evidence of what would be described in modern terms as racial consciousness throughout the entirety of recorded history. For example, in Ancient Egypt there were four broad racial divisions of human beings: Egyptians, Asiatics, Libyans, and Nubians. There was also Aristotle of Ancient Greece, who once wrote: "The peoples of Asia... lack spirit, so that they are in continuous subjection and slavery." The concept has manifested in different forms
based on social conditions of a particular group, often used to justify
unequal treatment. Early influential attempts to classify humans into
discrete races include 4 races in Carl Linnaeus's Systema Naturae (Homo europaeus, asiaticus, americanus, and afer) and 5 races in Johann Friedrich Blumenbach's On the Natural Variety of Mankind. Notably, over the next centuries, scholars argued for anywhere from 3 to more than 60 race categories.
Race concepts have changed within a society over time; for example, in
the United States social and legal designations of "White" have been
inconsistently applied to Native Americans, Arab Americans, and Asian
Americans, among other groups (See main article: Definitions of whiteness in the United States).
Race categories also vary worldwide; for example, the same person might
be perceived as belonging to a different category in the United States
versus Brazil. Because of the arbitrariness inherent in the concept of race, it is difficult to relate it to biology in a straightforward way.
Race and human genetic variation
There
is broad consensus across the biological and social sciences that race
is a social construct, not an accurate representation of human genetic
variation. Humans are remarkably genetically similar, sharing approximately 99.6%–99.9% of their genetic code with one another.
We nonetheless see wide individual variation in phenotype, which arises
from both genetic differences and complex gene-environment
interactions. The vast majority of this genetic variation occurs within groups; very little genetic variation differentiates between groups.
Crucially, the between-group genetic differences that do exist do not
map onto socially recognized categories of race. Furthermore, although
human populations show some genetic clustering across geographic space,
human genetic variation is "clinal", or continuous. This, in addition to the fact that different traits vary on different
clines, makes it impossible to draw discrete genetic boundaries around
human groups. Finally, insights from ancient DNA are revealing that no
human population is "pure" – all populations represent a long history of
migration and mixing.
Genetic variation arises from mutations, from natural selection, migration between populations (gene flow) and from the reshuffling of genes through sexual reproduction.
Mutations lead to a change in the DNA structure, as the order of the
bases is rearranged. As a result, different proteins may be
encoded. Some mutations may be positive and can help the individual
survive more effectively in their environment. Mutation is counteracted
by natural selection and by genetic drift; note too the founder effect,
when a small number of initial founders establish a population which
hence starts with a correspondingly small degree of genetic variation. Epigenetic inheritance involves heritable changes in phenotype (appearance) or gene expression caused by mechanisms other than changes in the DNA sequence.
Human phenotypes are highly polygenic (dependent on interaction by many genes) and are influenced by environment as well as by genetics.
Nucleotide diversity is based on single mutations, single nucleotide polymorphisms (SNPs). The nucleotide diversity between humans is about 0.1 percent (one difference per one thousand nucleotides
between two humans chosen at random). This amounts to approximately
three million SNPs (since the human genome has about three billion
nucleotides). There are an estimated ten million SNPs in the human
population.
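The arithmetic behind the three-million figure can be checked with a short sketch. It uses only the round numbers quoted above (a genome of about three billion nucleotides and one difference per thousand nucleotides); the function and constant names are illustrative and not drawn from any source.

```python
# Round figures taken from the paragraph above; both are approximations.
GENOME_SIZE = 3_000_000_000    # about three billion nucleotides
SITES_PER_DIFFERENCE = 1_000   # one difference per thousand nucleotides (0.1 percent)


def expected_differences(genome_size: int, sites_per_difference: int) -> int:
    """Expected number of single-nucleotide differences between two genomes."""
    return genome_size // sites_per_difference


pairwise = expected_differences(GENOME_SIZE, SITES_PER_DIFFERENCE)
print(f"{pairwise:,} expected single-nucleotide differences")
```

Integer division is used so that the result stays exact. Note that this is the expected difference between two individuals; the estimate of ten million SNPs refers to variable sites across the whole population.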
Research has shown that non-SNP (structural) variation accounts for more human genetic variation than single nucleotide diversity. Structural variation includes copy-number variation and results from deletions, inversions, insertions and duplications. It is estimated that approximately 0.4 to 0.6 percent of the genomes of unrelated people differ.
Genetic basis for race
Much scientific research has been organized around the question of whether or not there is genetic basis for race. In Luigi Luca Cavalli-Sforza's book (circa 1994) "The History and Geography of Human Genes"
he writes, "From a scientific point of view, the concept of race has
failed to obtain any consensus; none is likely, given the gradual
variation in existence. It may be objected that the racial stereotypes
have a consistency that allows even the layman to classify individuals.
However, the major stereotypes, all based on skin color, hair color and
form, and facial traits, reflect superficial differences that are not
confirmed by deeper analysis with more reliable genetic traits and whose
origin dates from recent evolution mostly under the effect of climate
and perhaps sexual selection".
A more up-to-date and comprehensive book authored by geneticist David Reich (2018) reaffirms the conclusion that the traditional views which assert a biological basis for race are wrong:
Today, many people assume that
humans can be grouped biologically into "primeval" groups, corresponding
to our notion of "races"... But this long-held view about "race" has
just in the last years been proven wrong.
— David Reich, Who We Are and How We Got Here, (Introduction, pg. xxiv).
Research methods
Scientists investigating human variation have used a series of methods to characterize how different populations vary.
Early racial classification attempts measured surface traits, particularly skin color, hair color and texture, eye color, and head size and shape. (Measurements of the latter through craniometry
were repeatedly discredited in the late 19th and mid-20th centuries due
to a lack of correlation of phenotypic traits with racial
categorization.)
In actuality, biological adaptation plays the biggest role in these
bodily features and skin type. A relative handful of genes accounts for
the inherited factors shaping a person's appearance. Humans have an estimated 19,000–20,000 protein-coding genes. Richard Sturm and David Duffy describe 11 genes that affect skin pigmentation and explain most variations in human skin color, the most significant of which are MC1R, ASIP, OCA2, and TYR. There is evidence that as many as 16 different genes could be responsible for eye color in humans; however, the main two genes associated with eye color variation are OCA2 and HERC2, and both are located on chromosome 15.
Analysis of blood proteins and between-group genetics
Before the discovery of DNA, scientists used blood proteins (the human blood group systems) to study human genetic variation. Research by Ludwik and Hanka Herschfeld during World War I found that the incidence of blood groups
A and B differed by region; for example, among Europeans 15 percent
were group B and 40 percent group A. Eastern Europeans and Russians had a
higher incidence of group B; people from India had the greatest
incidence. The Herschfelds concluded that humans comprised two
"biochemical races", originating separately. It was hypothesized that
these two races later mixed, resulting in the patterns of groups A and
B. This was one of the first theories of racial differences to include
the idea that human variation did not correlate with genetic variation.
It was expected that groups with similar proportions of blood groups
would be more closely related, but instead it was often found that
groups separated by great distances (such as those from Madagascar and
Russia), had similar incidences. It was later discovered that the ABO blood group system is not just common to humans, but shared with other primates, and likely predates all human groups.
In 1972, Richard Lewontin performed an FST
statistical analysis using 17 markers (including blood-group proteins).
He found that the majority of genetic differences between humans (85.4
percent) were found within a population, 8.3 percent were found between
populations within a race and 6.3 percent were found to differentiate
races (Caucasian, African, Mongoloid, South Asian Aborigines, Amerinds,
Oceanians, and Australian Aborigines in his study). Since then, other
analyses have found FST values of 6–10 percent between
continental human groups, 5–15 percent between different populations on
the same continent and 75–85 percent within populations. This view has been affirmed by the American Anthropological Association and the American Association of Physical Anthropologists since.
Critiques of blood protein analysis
While acknowledging Lewontin's observation that humans are genetically homogeneous, A. W. F. Edwards in his 2003 paper "Human Genetic Diversity: Lewontin's Fallacy"
argued that information distinguishing populations from each other is
hidden in the correlation structure of allele frequencies, making it
possible to classify individuals using mathematical techniques. Edwards
argued that even if the probability of misclassifying an individual
based on a single genetic marker is as high as 30 percent (as Lewontin
reported in 1972), the misclassification probability nears zero if
enough genetic markers are studied simultaneously. Edwards saw
Lewontin's argument as based on a political stance, denying biological
differences to argue for social equality. Edwards' paper is reprinted, commented upon by experts such as Noah Rosenberg, and given further context in an interview with philosopher of science Rasmus Grønfeldt Winther in a recent anthology.
As noted above, Edwards criticises Lewontin's paper for taking 17
different traits and analysing each of them independently, without
looking at them in conjunction with one another. On Edwards' account,
this approach made it easy for Lewontin to conclude that racial
naturalism is not tenable.
Sesardic supported Edwards' view with an illustration involving
squares and triangles, showing that a single trait looked at in
isolation will most likely be a bad predictor of which group an
individual belongs to.
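The statistical point, that combining many weak markers can give a reliable classification, can be illustrated with a minimal sketch. It is not Edwards' own calculation. It assumes two groups, markers that each point to the wrong group with probability 0.3 (the single-marker figure cited above), markers that are independent of one another, and a simple majority vote over an odd number of markers. The function name is hypothetical.

```python
from math import comb


def majority_vote_error(n_markers: int, single_marker_error: float = 0.3) -> float:
    """Probability that a majority vote over n independent markers is wrong.

    Assumes each marker independently points to the wrong group with
    probability `single_marker_error`. Odd n avoids tied votes.
    """
    if n_markers % 2 == 0:
        raise ValueError("use an odd number of markers to avoid ties")
    q = single_marker_error
    # The vote is wrong when more than half of the markers are wrong.
    return sum(
        comb(n_markers, k) * q**k * (1 - q) ** (n_markers - k)
        for k in range(n_markers // 2 + 1, n_markers + 1)
    )


for n in (1, 3, 11, 51, 101):
    print(n, majority_vote_error(n))
```

Under these assumptions the error falls steadily as markers are added. The sketch says nothing about whether the groups being distinguished are biologically meaningful, which is the question the surrounding debate turns on.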
In contrast, in a 2014 paper, reprinted in the 2018 Edwards Cambridge
University Press volume, Rasmus Grønfeldt Winther argues that
"Lewontin's Fallacy" is effectively a misnomer, as there really are two
different sets of methods and questions at play in studying the genomic
population structure of our species: "variance partitioning" and
"clustering analysis." According to Winther, they are "two sides of the
same mathematics coin" and neither "necessarily implies anything about
the reality of human groups."
Current studies of population genetics
Researchers currently use genetic testing, which may involve hundreds (or thousands) of genetic markers or the entire genome.
Structure
Several methods to examine and quantify genetic subgroups exist, including cluster and principal components analysis.
Genetic markers from individuals are examined to find a population's
genetic structure. While subgroups overlap when examining variants of
one marker only, when a number of markers are examined different
subgroups have different average genetic structure. An individual may be
described as belonging to several subgroups. These subgroups may be
more or less distinct, depending on how much overlap there is with other
subgroups.
In cluster analysis, the number of clusters to search for (K) is determined in advance; how distinct the resulting clusters are varies.
The results obtained from cluster analyses depend on several factors:
A large number of genetic markers studied facilitates finding distinct clusters.
Some genetic markers vary more than others, so fewer are required to find distinct clusters. Ancestry-informative markers
exhibit substantially different frequencies between populations from
different geographical regions. Using AIMs, scientists can determine a
person's ancestral continent of origin based solely on their DNA. AIMs
can also be used to determine someone's admixture proportions.
The more individuals studied, the easier it becomes to detect distinct clusters (statistical noise is reduced).
Low genetic variation makes it more difficult to find distinct clusters. Greater geographic distance generally increases genetic variation, making identifying clusters easier.
A similar cluster structure is seen with different genetic markers
when the number of genetic markers included is sufficiently large. The
clustering structure obtained with different statistical techniques is
similar. A similar cluster structure is found in the original sample
with a subsample of the original sample.
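The role of a pre-set K can be illustrated with a toy example. The sketch below uses plain k-means as a stand-in; published studies typically use model-based methods instead, so this is an assumption made for brevity. The genotype vectors (allele counts of 0, 1 or 2 at four markers) are invented, and the deterministic initialisation from the first K individuals is a simplification.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two genotype vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as starting centres."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every individual to its nearest centre.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Move each centre to the mean of its assigned individuals.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical allele counts (0, 1 or 2) at four markers for six individuals.
genotypes = [
    (0, 0, 1, 0), (2, 2, 1, 2), (0, 1, 0, 0),
    (2, 1, 2, 2), (1, 0, 0, 0), (2, 2, 2, 1),
]

labels_k2 = kmeans(genotypes, 2)
labels_k3 = kmeans(genotypes, 3)
print("K=2:", labels_k2)
print("K=3:", labels_k3)
```

The same six individuals are partitioned differently for K = 2 and K = 3, because the algorithm returns exactly as many clusters as it is asked for. The number of clusters is therefore an input of the analysis, not a finding.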
Recent studies have been published using an increasing number of genetic markers.
Focus on study of structure has been criticized for giving the
general public a misleading impression of human genetic variation,
obscuring the general finding that genetic variants which are limited to
one region tend to be rare within that region, variants that are common
within a region tend to be shared across the globe, and most
differences between individuals, whether they come from the same region
or different regions, are due to global variants.
Distance
Genetic distance
is genetic divergence between species or populations of a species. It
may compare the genetic similarity of related species, such as humans
and chimpanzees. Within a species, genetic distance measures divergence
between subgroups. Genetic distance significantly correlates to
geographic distance between populations, a phenomenon sometimes known as
"isolation by distance".
Genetic distance may be the result of physical boundaries restricting
gene flow such as islands, deserts, mountains or forests. Genetic
distance is measured by the fixation index (FST). FST is the correlation of randomly chosen alleles
in a subgroup to a larger population. It is often expressed as a
proportion of genetic diversity. This comparison of genetic variability
within (and between) populations is used in population genetics.
The values range from 0 to 1; zero indicates the two populations are
freely interbreeding, and one would indicate that two populations are
separate.
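One common way of expressing FST as a proportion of genetic diversity is (HT - HS) / HT, where HS is the average expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. The sketch below applies this to a single locus with two alleles and assumes equally sized subpopulations. The allele frequencies are invented, and real estimators correct for sample size and combine many loci.

```python
def fst_biallelic(subpop_freqs):
    """FST = (HT - HS) / HT for one biallelic locus.

    `subpop_freqs` holds the frequency of one allele in each
    subpopulation; subpopulations are assumed to be of equal size.
    """
    n = len(subpop_freqs)
    mean_p = sum(subpop_freqs) / n
    # Average expected heterozygosity within subpopulations.
    h_s = sum(2 * p * (1 - p) for p in subpop_freqs) / n
    # Expected heterozygosity of the pooled population.
    h_t = 2 * mean_p * (1 - mean_p)
    if h_t == 0:
        return 0.0  # locus is not variable anywhere
    return (h_t - h_s) / h_t


print(fst_biallelic([0.5, 0.5]))  # identical frequencies
print(fst_biallelic([0.4, 0.6]))  # modest divergence
print(fst_biallelic([0.0, 1.0]))  # fixed for different alleles
```

Identical frequencies give 0, subpopulations fixed for different alleles give 1, and the intermediate case gives a value of about 0.04, which matches the range described above.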
Many studies place the average FST distance between human races at about 0.125. Henry Harpending
argued that this value implies on a world scale a "kinship between two
individuals of the same human population is equivalent to kinship
between grandparent and grandchild or between half siblings". In fact,
the formulas derived in Harpending's paper in the "Kinship in a
subdivided population" section imply that two unrelated individuals of
the same race have a higher coefficient of kinship (0.125) than an
individual and their mixed race half-sibling (0.109).
Critiques of FST
While acknowledging that FST remains useful, a number of scientists have written about other approaches to characterizing human genetic variation. Long & Kittles (2009) stated that FST failed to identify important variation and that when the analysis includes only humans, FST = 0.119, but adding chimpanzees increases it only to FST = 0.183. Mountain & Risch (2004) argued that an FST estimate of 0.10–0.15 does not rule out a genetic basis for phenotypic differences between groups and that a low FST estimate implies little about the degree to which genes contribute to between-group differences. Pearse & Crandall (2004) wrote that FST
figures cannot distinguish between a situation of high migration
between populations with a long divergence time, and one of a relatively
recent shared history but no ongoing gene flow.
In their 2015 article, Keith Hunley, Graciela Cabana, and Jeffrey Long
(who had previously criticized Lewontin's statistical methodology with
Rick Kittles)
recalculate the apportionment of human diversity using a more complex
model than Lewontin and his successors. They conclude: "In sum, we
concur with Lewontin's conclusion that Western-based racial
classifications have no taxonomic significance, and we hope that this
research, which takes into account our current understanding of the
structure of human diversity, places his seminal finding on firmer
evolutionary footing."
Anthropologists (such as C. Loring Brace), philosopher Jonathan Kaplan and geneticist Joseph Graves
have argued that while it is possible to find biological and genetic
variation roughly corresponding to race, this is true for almost all
geographically distinct populations: the cluster structure of genetic
data is dependent on the initial hypotheses of the researcher and the
populations sampled. When one samples continental groups, the clusters
become continental; with other sampling patterns, the clusters would be
different. Weiss and Fullerton note that if one sampled only Icelanders,
Mayans and Maoris, three distinct clusters would form; all other
populations would be composed of genetic admixtures of Maori, Icelandic and Mayan material.
Kaplan therefore concludes that, while differences in particular allele
frequencies can be used to identify populations that loosely correspond
to the racial categories common in Western social discourse, the
differences are of no more biological significance than the differences
found between any human populations (e.g., the Spanish and Portuguese).
Historical and geographical analyses
Current-population
genetic structure does not imply that differing clusters or components
indicate only one ancestral home per group; for example, a genetic
cluster in the US comprises Hispanics with European, Native American and
African ancestry.
Geographic analyses attempt to identify places of origin, their
relative importance and possible causes of genetic variation in an area.
The results can be presented as maps showing genetic variation.
Cavalli-Sforza and colleagues argue that if genetic variations are
investigated, they often correspond to population migrations due to new
sources of food, improved transportation or shifts in political power.
For example, in Europe the most significant direction of genetic
variation corresponds to the spread of agriculture from the Middle East
to Europe between 10,000 and 6,000 years ago. Such geographic analysis works best in the absence of recent large-scale, rapid migrations.
Historic analyses use differences in genetic variation (measured by genetic distance) as a molecular clock indicating the evolutionary relation of species or groups, and can be used to create evolutionary trees reconstructing population separations.
Results of genetic-ancestry research are supported if they agree with research results from other fields, such as linguistics or archeology. Cavalli-Sforza and colleagues have argued that there is a correspondence between language families
found in linguistic research and the population tree they found in
their 1994 study. There are generally shorter genetic distances between
populations using languages from the same language family. Exceptions to
this rule are also found, for example the Sami, who are genetically associated with populations speaking languages from other language families. The Sami speak a Uralic language,
but are genetically primarily European. This is argued to have resulted
from migration (and interbreeding) with Europeans while retaining their
original language. Agreement also exists between research dates in
archeology and those calculated using genetic distance.
Self-identification studies
Jorde
and Wooding found that while clusters from genetic markers were
correlated with some traditional concepts of race, the correlations were
imperfect and imprecise due to the continuous and overlapping nature of
genetic variation, noting that ancestry, which can be accurately
determined, is not equivalent to the concept of race.
A 2005 study by Tang and colleagues used 326 genetic markers to determine genetic clusters. The 3,636 subjects, from the United States and Taiwan,
self-identified as belonging to white, African American, East Asian or
Hispanic ethnic groups. The study found "nearly perfect correspondence
between genetic cluster and SIRE for major ethnic groups living in the
United States, with a discrepancy rate of only 0.14 percent".
Paschou et al. found "essentially perfect" agreement between 51
self-identified populations of origin and the population's genetic
structure, using 650,000 genetic markers. Selecting for informative
genetic markers allowed a reduction to fewer than 650, while retaining
near-total accuracy.
Correspondence between genetic clusters in a population (such as
the current US population) and self-identified race or ethnic groups
does not mean that such a cluster (or group) corresponds to only one
ethnic group. African Americans have an estimated 20–25-percent European
genetic admixture; Hispanics have European, Native American and African
ancestry.
In Brazil there has been extensive admixture between Europeans,
Amerindians and Africans. As a result, skin color differences within the
population are not gradual, and there are relatively weak associations
between self-reported race and African ancestry.
Ethnoracial self-classification in Brazilians is certainly not random
with respect to genome individual ancestry, but the strength of the
association between the phenotype and median proportion of African
ancestry varies largely across population.
Critique of genetic-distance studies and clusters
Genetic distances generally increase continually with geographic
distance, which makes a dividing line arbitrary. Any two neighboring
settlements will exhibit some genetic difference from each other, which
could be defined as a race. Therefore, attempts to classify races impose
an artificial discontinuity on a naturally occurring phenomenon. This
explains why studies on population genetic structure yield varying
results, depending on methodology.
Rosenberg and colleagues (2005) have argued, based on cluster
analysis of the 52 populations in the Human Genome Diversity Panel,
that populations do not always vary continuously and a population's
genetic structure is consistent if enough genetic markers (and subjects)
are included.
Examination of the relationship
between genetic and geographic distance supports a view in which the
clusters arise not as an artifact of the sampling scheme, but from small
discontinuous jumps in genetic distance for most population pairs on
opposite sides of geographic barriers, in comparison with genetic
distance for pairs on the same side. Thus, analysis of the 993-locus
dataset corroborates our earlier results: if enough markers are used
with a sufficiently large worldwide sample, individuals can be
partitioned into genetic clusters that match major geographic
subdivisions of the globe, with some individuals from intermediate
geographic locations having mixed membership in the clusters that
correspond to neighboring regions.
They also wrote, regarding a model with five clusters corresponding
to Africa, Eurasia (Europe, Middle East, and Central/South Asia), East
Asia, Oceania, and the Americas:
For population pairs from the same
cluster, as geographic distance increases, genetic distance increases in
a linear manner, consistent with a clinal population structure.
However, for pairs from different clusters, genetic distance is
generally larger than that between intracluster pairs that have the same
geographic distance. For example, genetic distances for population
pairs with one population in Eurasia and the other in East Asia are
greater than those for pairs at equivalent geographic distance within
Eurasia or within East Asia. Loosely speaking, it is these small
discontinuous jumps in genetic distance—across oceans, the Himalayas, and the Sahara—that provide the basis for the ability of STRUCTURE to identify clusters that correspond to geographic regions.
This applies to populations in their ancestral homes when migrations and
gene flow were slow; large, rapid migrations exhibit different
characteristics. Tang and colleagues (2004) wrote, "we detected only
modest genetic differentiation between different current geographic
locales within each race/ethnicity group. Thus, ancient geographic
ancestry, which is highly correlated with self-identified
race/ethnicity—as opposed to current residence—is the major determinant
of genetic structure in the U.S. population".
Cluster analysis has been criticized because the number of clusters
to search for is decided in advance, with different values possible
(although with varying degrees of probability). Principal component analysis does not decide in advance how many components to search for.
The 2002 study by Rosenberg et al.
exemplifies why meanings of these clusterings are disputable. The study
shows that at the K=5 cluster analysis, genetic clusterings roughly map
onto each of the five major geographical regions. Similar results were
gathered in further studies in 2005.
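The dependence on a preset number of clusters can be illustrated with a toy example. The sketch below uses plain one-dimensional k-means as a stand-in for model-based programs such as STRUCTURE, which it does not reproduce. The function name kmeans_1d, the evenly spaced starting centroids and the nine made-up coordinates are all assumptions chosen for illustration, not data from any study cited here.

```python
def kmeans_1d(points, k, iterations=20):
    """Toy 1-D k-means; k (the number of clusters) must be chosen in advance."""
    if k < 2:
        raise ValueError("this sketch expects k >= 2")
    lo, hi = min(points), max(points)
    # Evenly spaced starting centroids keep the sketch deterministic (an assumption).
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            groups[nearest].append(x)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return groups


# Hypothetical positions of nine individuals along a single genetic axis.
samples = [0, 1, 2, 10, 11, 12, 30, 31, 32]

print(kmeans_1d(samples, 2))  # [[0, 1, 2, 10, 11, 12], [30, 31, 32]]
print(kmeans_1d(samples, 3))  # [[0, 1, 2], [10, 11, 12], [30, 31, 32]]
```

The data are identical in both runs. Asked for two clusters, the procedure merges the first two groups; asked for three, it separates them, so the partition reported depends on the value of K supplied by the researcher.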
Critique of ancestry-informative markers
Ancestry-informative markers
(AIMs) are a genealogy tracing technology that has come under much
criticism due to its reliance on reference populations. In a 2015
article, Troy Duster outlines how contemporary technology allows the
tracing of ancestral lineage, but only along one maternal
and one paternal line. That is, of 64 total
great-great-great-great-grandparents, only one from each parent is
identified, implying the other 62 ancestors are ignored in tracing
efforts.
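The arithmetic behind Duster's figure can be checked with a short sketch. It assumes that every ancestor in a generation is a distinct person (no pedigree collapse) and that exactly two lines are traced, one maternal and one paternal, as described above. The function names are hypothetical.

```python
def ancestors_in_generation(generations_back: int) -> int:
    """Ancestor slots n generations back (parents = 1), assuming all are distinct."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 2 ** generations_back


def share_traced(generations_back: int, traced_lines: int = 2) -> float:
    """Fraction of that generation reached when only `traced_lines` lines are followed."""
    return traced_lines / ancestors_in_generation(generations_back)


# Parents = 1, grandparents = 2, ..., great-great-great-great-grandparents = 6.
generation = 6
total = ancestors_in_generation(generation)
untraced = total - 2  # one maternal-line and one paternal-line ancestor are identified

print(total, untraced, share_traced(generation))  # 64 62 0.03125
```

Under these assumptions the two traced ancestors make up about 3 percent of that generation, and the share halves with each further generation back.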
Furthermore, the 'reference populations' used as markers for membership
of a particular group are designated arbitrarily and contemporarily. In
other words, using populations who currently reside in given places as
references for certain races and ethnic groups is unreliable due to the
demographic changes which have occurred over many centuries in those
places. Moreover, because ancestry-informative markers are widely shared
across the whole human population, it is their frequency that is tested,
not their mere presence or absence. A threshold of relative frequency
therefore has to be set. According to Duster, the criteria for setting
such thresholds are a trade secret of the companies marketing the tests,
so nothing conclusive can be said about whether they are
appropriate.
Results of AIMs are extremely sensitive to where this bar is set.
Given that many genetic traits are found to be very similar among many
different populations, the designated threshold frequencies are very
important. This can also lead to mistakes, given that many populations
may share the same patterns, if not exactly the same genes. "This means
that someone from Bulgaria whose ancestors go back to the fifteenth
century could (and sometime does) map as partly 'Native American'".
This happens because AIMs rely on a '100% purity' assumption of
reference populations. That is, they assume that a pattern of traits
would ideally be a necessary and sufficient condition for assigning an
individual to an ancestral reference population.
There are certain statistical differences between racial groups in susceptibility to certain diseases. Genes change in response to local diseases; for example, people who are Duffy-negative
tend to have a higher resistance to malaria. The Duffy-negative
phenotype is highly frequent in Central Africa, and its frequency
decreases with distance from Central Africa, with higher
frequencies in global populations with high degrees of recent African
immigration. This suggests that the Duffy-negative genotype evolved in
sub-Saharan Africa and was subsequently positively selected for in the
malaria-endemic zone. A number of genetic conditions prevalent in malaria-endemic areas may provide genetic resistance to malaria, including sickle cell disease, thalassaemias and glucose-6-phosphate dehydrogenase deficiency. Cystic fibrosis is the most common life-limiting autosomal recessive disease among people of European ancestry; a hypothesized heterozygote advantage, providing resistance to diseases formerly common in Europe, has been challenged.
Scientists Michael Yudell, Dorothy Roberts, Rob DeSalle, and Sarah
Tishkoff argue that using these associations in the practice of medicine
has led doctors to overlook or misidentify disease: "For example,
hemoglobinopathies can be misdiagnosed because of the identification of
sickle-cell as a 'Black' disease and thalassemia as a 'Mediterranean'
disease. Cystic fibrosis is underdiagnosed in populations of African
ancestry, because it is thought of as a 'White' disease."
Information about a person's population of origin may aid in diagnosis, and adverse drug responses may vary by group.
Because of the correlation between self-identified race and genetic
clusters, medical treatments influenced by genetics have varying rates
of success between self-defined racial groups. For this reason, some physicians consider a patient's race in choosing the most effective treatment, and some drugs are marketed with race-specific instructions.
Jorde and Wooding (2004) have argued that because of genetic variation
within racial groups, when "it finally becomes feasible and available,
individual genetic assessment of relevant genes will probably prove more
useful than race in medical decision making". However, race continues
to be a factor when examining groups (such as in epidemiologic research). Some doctors and scientists, such as geneticist Neil Risch,
argue that using self-identified race as a proxy for ancestry is
necessary to be able to get a sufficiently broad sample of different
ancestral populations, and in turn to be able to provide health care
that is tailored to the needs of minority groups.
Usage in scientific journals
Some
scientific journals have addressed previous methodological errors by
requiring more rigorous scrutiny of population variables. Since 2000, Nature Genetics
has required its authors to "explain why they make use of particular ethnic
groups or populations, and how classification was achieved". Editors of
Nature Genetics say that "[they] hope that this will raise awareness and inspire more rigorous designs of genetic and epidemiological studies".
A 2021 study that examined over 11,000 papers published from 1949 to 2018 in The American Journal of Human Genetics
found that "race" was used in only 5% of papers published in the last
decade, down from 22% in the first. Together with an increase in use of
the terms "ethnicity," "ancestry," and location-based terms, it suggests
that human geneticists have mostly abandoned the term "race."
Gene-environment interactions
Lorusso and Bacchini argue that self-identified race is of greater use in medicine as it correlates strongly with risk-related exposomes that are potentially heritable when they become embodied in the epigenome.
They summarise evidence of the link between racial discrimination and
health outcomes due to poorer food quality, access to healthcare,
housing conditions, education, access to information, exposure to
infectious agents and toxic substances, and material scarcity. They also
cite evidence that this process can work positively – for example, the
psychological advantage of perceiving oneself at the top of a social
hierarchy is linked to improved health. However, they caution that the
effects of discrimination do not offer a complete explanation for
differential rates of disease and risk factors between racial groups,
and the employment of self-identified race has the potential to
reinforce racial inequalities.
Objections to racial naturalism
Racial
naturalism is the view that racial classifications are grounded in
objective patterns of genetic similarities and differences. Proponents
of this view have justified it using the scientific evidence described
above. However, this view is controversial and philosophers of race have put forward four main objections to it.
Semantic objections, such as the discreteness objection, argue
that the human populations picked out in population-genetic research are
not races and do not correspond to what "race" means in the United
States. "The discreteness objection does not require there to be no
genetic admixture in the human species in order for there to be US
'racial groups' ... rather ... what the objection claims is that
membership in US racial groups is different from membership in
continental populations. ... Thus, strictly speaking, Blacks are not
identical to Africans, Whites are not identical to Eurasians, Asians are
not identical to East Asians and so forth." Therefore, it could be argued that scientific research is not really about race.
The next two objections are metaphysical objections, which argue
that even if the semantic objections fail, human genetic clustering
results do not support the biological reality of race. The 'very
important objection' stipulates that races in the US definition fail to
be important to biology, in the sense that continental populations do
not form biological subspecies. The 'objectively real objection' states
that "US racial groups are not biologically real because they are not
objectively real in the sense of existing independently of human
interest, belief, or some other mental state of humans."
Racial naturalists, such as Quayshawn Spencer, have responded to each
of these objections with counter-arguments. There are also
methodological critics who reject racial naturalism because of concerns
relating to the experimental design, execution, or interpretation of the
relevant population-genetic research.
Another semantic objection is the visibility objection, which
disputes the claim that there are US racial groups in human population
structures. Philosophers such as Joshua Glasgow and Naomi Zack
believe that US racial groups cannot be defined by visible traits, such
as skin colour and physical attributes: "The ancestral genetic tracking
material has no effect on phenotypes, or biological traits of
organisms, which would include the traits deemed racial, because the
ancestral tracking genetic material plays no role in the production of
proteins it is not the kind of material that 'codes' for protein
production."
Spencer contends that certain racial discourses require visible groups,
but disagrees that this is a requirement in all US racial discourse.
A different objection states that US racial groups are not
biologically real because they are not objectively real in the sense of
existing independently of some mental state of humans. Proponents of
this second metaphysical objection include Naomi Zack and Ron Sundstrom. Spencer argues that an entity can be both biologically real and
socially constructed. Spencer states that in order to accurately capture
real biological entities, social factors must also be considered.
It has been argued that knowledge of a person's race is limited in value, since people of the same race vary from one another. David J. Witherspoon
and colleagues have argued that when individuals are assigned to
population groups, two randomly chosen individuals from different
populations can resemble each other more than either resembles a randomly chosen member
of their own group. They found that many thousands of genetic markers
had to be used for the answer to "How often is a pair of individuals
from one population genetically more dissimilar than two individuals
chosen from two different populations?" to be "Never". This assumed
three population groups, separated by large geographic distances
(European, African and East Asian). The global human population is more
complex, and studying a large number of groups would require an
increased number of markers for the same answer. They conclude that
"caution should be used when using geographic or genetic ancestry to
make inferences about individual phenotypes",
and "The fact that, given enough genetic data, individuals can be
correctly assigned to their populations of origin is compatible with the
observation that most human genetic variation is found within
populations, not between them. It is also compatible with our finding
that, even when the most distinct populations are considered and
hundreds of loci are used, individuals are frequently more similar to
members of other populations than to members of their own population".
This is similar to the conclusion reached by anthropologist Norman Sauer
in a 1992 article on the ability of forensic anthropologists to assign
"race" to a skeleton, based on craniofacial features and limb
morphology. Sauer said, "the successful assignment of race to a skeletal
specimen is not a vindication of the race concept, but rather a
prediction that an individual, while alive was assigned to a particular
socially constructed 'racial' category. A specimen may display features
that point to African ancestry. In this country that person is likely to
have been labeled Black regardless of whether or not such a race
actually exists in nature".
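The question posed by Witherspoon and colleagues above (how often a pair from one population is more dissimilar than a pair drawn from two different populations) can be explored with a toy simulation. The sketch below is not their method or data. It assumes haploid individuals, independent two-allele loci, and the same allele frequency at every locus (0.3 in one hypothetical population and 0.7 in the other), which is a far stronger contrast than is typical of real markers. All names and parameters are illustrative assumptions.

```python
import random


def draw_individual(freqs, rng):
    """Haploid genotype: 1 where the individual carries the allele, else 0."""
    return [1 if rng.random() < p else 0 for p in freqs]


def distance(a, b):
    """Number of loci at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def dissimilarity_fraction(n_loci, trials=500, p_a=0.3, p_b=0.7, seed=0):
    """Share of trials in which a within-population pair differs at more loci
    than a between-population pair (toy model; all parameters are assumptions)."""
    rng = random.Random(seed)
    freqs_a = [p_a] * n_loci
    freqs_b = [p_b] * n_loci
    hits = 0
    for _ in range(trials):
        # Within pair: both individuals drawn from population A.
        within = distance(draw_individual(freqs_a, rng), draw_individual(freqs_a, rng))
        # Between pair: one individual from A, one from B.
        between = distance(draw_individual(freqs_a, rng), draw_individual(freqs_b, rng))
        if within > between:
            hits += 1
    return hits / trials


for n_loci in (4, 400):
    print(n_loci, dissimilarity_fraction(n_loci))
```

With only a handful of loci the within-population pair is often the more dissimilar one; with several hundred loci in this exaggerated setting the fraction falls toward zero. Real populations differ far less per locus, which is consistent with the report that many thousands of markers were needed before the answer became "Never".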
Criticism of race-based medicines
Troy Duster
points out that genetics is often not the predominant determinant of
disease susceptibilities, even though they might correlate with specific
socially defined categories. This is because such research often
fails to control for a multiplicity of socio-economic factors. He cites
data collected by King and Rewers that indicates how dietary differences
play a significant role in explaining variations of diabetes prevalence
between populations.
Duster elaborates by putting forward the example of the Pima of Arizona, a population suffering from disproportionately high rates of diabetes. The cause, he argues, was not necessarily the prevalence of the FABP2 gene, which is associated with insulin resistance.
Rather, he argues that scientists often discount lifestyle
effects that arise in specific socio-historical contexts. For instance,
near the end of the 19th century, the Pima economy was predominantly
agriculture-based. However, as the European American population settled
into traditionally Pima territory, the Pima lifestyle became heavily
Westernised. Within three decades, the incidence of diabetes increased
several-fold. Governmental provision of free, relatively high-fat food
to alleviate poverty in the population is noted as an
explanation of this phenomenon.
Lorusso and Bacchini argue against the assumption that "self-identified race is a good proxy for a specific genetic ancestry"
on the basis that self-identified race is complex: it depends on a
range of psychological, cultural and social factors, and is therefore
"not a robust proxy for genetic ancestry".
Furthermore, they explain that an individual's self-identified race is
made up of further, collectively arbitrary factors: personal opinions
about what race is and the extent to which it should be taken into
consideration in everyday life. In addition, individuals who share a
genetic ancestry may differ in their racial self-identification across
historical or socioeconomic contexts. From this, Lorusso and Bacchini
conclude that the accuracy in the prediction of genetic ancestry on the
basis of self-identification is low, specifically in racially admixed
populations born out of complex ancestral histories.