Julian Savulescu (born 1963) is an Australian philosopher and
bioethicist. He is Chen Su Lan Centennial Professor in Medical Ethics
and Director of the Centre for Biomedical Ethics at the Yong Loo Lin
School of Medicine, National University of Singapore. He is also the Uehiro Chair in Practical Ethics at the University of Oxford, and was previously a Fellow of St Cross College, Oxford, Director of the Oxford Uehiro Centre for Practical Ethics,
and co-director of the Wellcome Centre for Ethics and Humanities. He is
a visiting professorial fellow in Biomedical Ethics at the Murdoch Children's Research Institute in Australia, and distinguished visiting professor in Law at Melbourne University
since 2017. He directs the Biomedical Ethics Research Group and is a
member of the Centre for Ethics of Pediatric Genomics in Australia. He
is a former editor and current board member of the Journal of Medical Ethics (2001–2004 and 2011–2018).
Career
Savulescu completed a Bachelor of Medical Sciences and a PhD at Monash University, under the supervision of philosopher Peter Singer.
His doctoral thesis was on good reasons to die and euthanasia. After
graduating, he took a Menzies Foundation postdoctoral scholarship, supervised by Derek Parfit
before returning to Australia. He established a group on the ethics of
genetics at the Murdoch Children's Research Institute, Australia. In
2002, he took up the Uehiro Chair in Practical Ethics in Oxford. In
2003, he established the Oxford Uehiro Centre for Practical Ethics as
Director. He edits the Oxford University Press book series, the Uehiro Series in Practical Ethics.
Views
Procreative beneficence
Savulescu coined the term procreative beneficence. He describes it as the moral obligation (rather than mere permission) of parents who can select among potential children to choose those expected to have the best life prospects, for instance through preimplantation genetic diagnosis (PGD) and subsequent embryo selection or selective termination. A similar position was defended by John Harris.
One argument is that some traits such as memory are "all-purpose
means", in the sense of being instrumental in realizing whatever life
plans the child may come to have.
Philosopher Walter Veit has argued that if one accepts both procreative beneficence and consequentialism, then a parental obligation for genetic enhancement
logically follows, as there is no intrinsic moral difference between
selecting and enhancing embryos for welfare-maximizing traits.
The principle of procreative beneficence is controversial.
Bioethicist Rebecca Bennett argued against Savulescu's position,
contending that not selecting the best offspring harms no one since
those potential individuals would otherwise never have existed. She
further wrote that the intuitions supporting such a selection merely
reflect non-moral preferences rather than genuine moral obligations. Peter Herissone-Kelly argued against this criticism.
Moral Enhancement
In October 2009, Savulescu presented a paper entitled "Unfit for Life:
Genetically Enhance Humanity or Face Extinction" at the "Festival of
Dangerous Ideas", held at the Sydney Opera House; a recording is
available on Vimeo. Savulescu argues that unless humans are willing to undergo "moral enhancement",
they may be on the brink of disappearing in a metaphorical "Bermuda
Triangle", which he describes as a dangerous convergence of three
factors: widespread access to destructive technologies, inherent
limitations of human moral nature (such as parochialism and
self-interest), and inadequacies of liberal democracy to address global
challenges.
Norbert Paulo criticised Savulescu's argument for moral
enhancement, arguing that if democratic governments had to morally
enhance their populations because the majority of the population is
morally deficient, they could not be legitimate, as they would be
manipulating the population's will. Thus, in Paulo's view, those
advocating large-scale, state-driven and partially mandatory moral
enhancement are advocating a non-democratic order.
Embryonic stem cells
Savulescu also justifies the destruction of embryos and foetuses as a source of organs and tissue for transplantation to adults.
In his abstract he argues, "The most publicly justifiable application
of human cloning, if there is one at all, is to provide self-compatible
cells or tissues for medical use, especially transplantation. Some have
argued that this raises no new ethical issues above those raised by any
form of embryo experimentation. I argue that this research is less
morally problematic than other embryo research. Indeed, it is not merely
morally permissible but morally required that we employ cloning to
produce embryos or fetuses for the sake of providing cells, tissues or
even organs for therapy, followed by abortion of the embryo or fetus."
He argues that if it is permissible to destroy foetuses, for social
reasons, or no reasons at all, it must be justifiable to destroy them to
save lives.
He argues that stem cell research is important enough to be justifiable even if one conceptualizes the embryo as a person.
Abortion debate
As editor of the Journal of Medical Ethics,
he published, in 2012, an article by two Italian academics which stated
that a new-born baby is effectively no different from a foetus, is not a
"person" and, morally, could be killed at the decision of the parents.
This article was published as part of a special double issue, "Abortion, Infanticide, and Allowing Babies to Die". The double issue included articles by Peter Singer, Michael Tooley, Jeff McMahan, C. A. J. Coady, Leslie Francis, John Finnis,
and others. In an editorial, Savulescu wrote: "The Journal aims in this
issue to promote further and more extensive rational debate concerning
this controversial and important topic by providing a range of arguments
from a variety of perspectives. We have tried to be as inclusive as
possible and provided a double issue to include as many as possible of
the submissions we received. Infanticide is an important issue and one
worthy of scholarly attention because it touches on an area of concern
that few societies have had the courage to tackle honestly and openly:
euthanasia. We hope that the papers in this issue will stimulate ethical
reflection on practices of euthanasia that are occurring and its proper
justification and limits." He also stated, "I am strongly opposed to the legalisation of infanticide along the lines discussed by Giubilini and Minerva."
Other positions
Savulescu's article "Brain Damage and the Moral Significance of
Consciousness", co-authored with neuroethicist Guy Kahane, appears to
be the first mainstream publication to argue that increased evidence of
consciousness in patients diagnosed as being in a persistent vegetative
state actually supports withdrawing or withholding care.
Other information
He has co-authored two books: Medical Ethics and Law: The Core Curriculum with Tony Hope and Judith Hendrick and Unfit for the Future: The Need for Moral Enhancement (published by Oxford University Press) with Ingmar Persson.
He has also edited the books Der neue Mensch? Enhancement und Genetik (together with Nikolaus Knoepffler), Human Enhancement (together with Nick Bostrom), Enhancing Human Capacities, and The Ethics of Human Enhancement. He was also a co-author of Love Is the Drug: The Chemical Future of Our Relationships, which addresses the potential future widespread use of aphrodisiacs.
In it, he argued that certain forms of medication can be ethically
consumed as a "helpful complement" in relationships, both to fall in
love and to fall out of it.
Awards
In 2009, Savulescu was awarded a Distinguished Alumni Award by Monash University. In the same year, he was also announced as the winner in the Thinking category of The Australian newspaper's Emerging Leaders Awards.
Savulescu holds an honorary degree from the University of Bucharest (2014). He was awarded the 'Thinker' Award in the top 100 Australian Future Leaders (2009) and was ASMR Gold Medalist (2005).
In 2018, Savulescu and a team of co-authors were awarded the Daniel M. Wegner Theoretical Innovation Prize.
This prize recognises the author of an article or book chapter judged
to provide the most innovative theoretical contribution to
social/personality psychology within a given year. He was also shortlisted for the AHRC Medal for Leadership in Medical Humanities in 2018. He was elected a Corresponding Fellow of the Australian Academy of the Humanities in 2023.
A graphical representation of the typical human karyotype.
The human mitochondrial DNA.
Human genetic variation is the genetic differences in and among populations. There may be multiple variants of any given gene in the human population (alleles), a situation called polymorphism.
No two humans are genetically identical. Even monozygotic twins (who develop from one zygote) have infrequent genetic differences due to mutations occurring during development and gene copy-number variation. Differences between individuals, even closely related individuals, are the key to techniques such as genetic fingerprinting.
The human genome has a total length of approximately 3.2 billion base pairs (bp) in 46 chromosomes of DNA as well as slightly under 17,000 bp DNA in cellular mitochondria.
In 2015, the typical difference between an individual's genome and the
reference genome was estimated at 20 million base pairs (or 0.6% of the
total). As of 2017, there were a total of 324 million known variants from sequenced human genomes.
Comparatively speaking, humans are a genetically homogeneous
species. Although a small number of genetic variants are found more
frequently in certain geographic regions or in people with ancestry from
those regions, this variation accounts for a small portion (~15%) of
human genome variability. The majority of variation exists within the
members of each human population. For comparison, rhesus macaques exhibit 2.5-fold greater DNA sequence diversity compared to humans.
These comparisons differ depending on which macromolecules are analyzed.
Chimpanzees have more genetic variance than humans when nuclear DNA is
examined, but humans have more genetic variance at the level of
proteins.
The lack of discontinuities in genetic distances between human
populations, absence of discrete branches in the human species, and
striking homogeneity of human beings globally, imply that there is no
scientific basis for inferring races or subspecies in humans, and for
most traits, there is much more variation within populations than between them.
Despite this, modern genetic studies have found substantial average
genetic differences across human populations in traits such as skin
colour, bodily dimensions, lactose and starch digestion, high altitude
adaptations, drug response, taste receptors, and predisposition to
developing particular diseases. The greatest diversity is found within and among populations in Africa, and gradually declines with increasing distance from the African continent, consistent with the Out of Africa theory of human origins.
The study of human genetic variation has evolutionary
significance and medical applications. It can help scientists
reconstruct and understand patterns of past human migration. In
medicine, study of human genetic variation may be important because some
disease-causing alleles occur more often in certain population groups.
For instance, the mutation for sickle-cell anemia
is more often found in people with ancestry from certain sub-Saharan
African, south European, Arabian, and Indian populations, due to the
evolutionary pressure from mosquitoes carrying malaria in these regions.
Each human has, on average, 60 new mutations compared to their parents.
There are at least three reasons why genetic variation exists between populations. Natural selection
may confer an adaptive advantage to individuals in a specific
environment if an allele provides a competitive advantage. Alleles under
selection are likely to occur only in those geographic regions where
they confer an advantage. A second important process is genetic drift, which is the effect of random changes in the gene pool, under conditions where most mutations are neutral
(that is, they do not appear to have any positive or negative selective
effect on the organism). Finally, small migrant populations have
statistical differences – called the founder effect
– from the overall populations where they originated; when these
migrants settle new areas, their descendant population typically differs
from their population of origin: different genes predominate and it is
less genetically diverse.
In humans, the main cause is genetic drift. Serial founder effects
and past small population size (increasing the likelihood of genetic
drift) may have had an important influence on neutral differences
between populations. The second main cause of genetic variation is the high degree of neutrality of most mutations.
A small, but significant number of genes appear to have undergone
recent natural selection, and these selective pressures are sometimes
specific to one region.
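The role of genetic drift described above can be illustrated with a minimal Wright-Fisher-style simulation. This is only a sketch under stated assumptions, not a model taken from the studies summarized here: the function names, population sizes, and generation count are hypothetical, and the model assumes a single neutral locus with two alleles, constant population size, and random mating.

```python
import random

def wright_fisher(pop_size, start_freq, generations, seed=0):
    """Simulate neutral drift of one allele in a diploid population.

    Each generation, 2 * pop_size gene copies are drawn at random from
    the previous generation. No selection, mutation, or migration is
    modelled (illustrative assumptions).
    """
    rng = random.Random(seed)
    copies = 2 * pop_size
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each new gene copy carries the allele with probability freq.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory

def mean_abs_change(trajectory):
    """Average per-generation change in allele frequency."""
    steps = [abs(b - a) for a, b in zip(trajectory, trajectory[1:])]
    return sum(steps) / len(steps)

if __name__ == "__main__":
    small = wright_fisher(pop_size=20, start_freq=0.5, generations=50)
    large = wright_fisher(pop_size=2000, start_freq=0.5, generations=50)
    print("small population:", round(mean_abs_change(small), 4))
    print("large population:", round(mean_abs_change(large), 4))
```

In runs like this, allele frequencies in the small population wander much more per generation than in the large one, which is the sense in which past small population sizes increase the likelihood of drift.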
Measures of variation
Genetic variation among humans occurs on many scales, from gross alterations in the human karyotype to single nucleotide changes. Chromosome abnormalities are detected in 1 of 160 live human births. Apart from sex chromosome disorders, most cases of aneuploidy result in death of the developing fetus (miscarriage); the most common extra autosomal chromosomes among live births are 21, 18 and 13.
Nucleotide diversity
is the average proportion of nucleotides that differ between two
individuals. As of 2004, the human nucleotide diversity was estimated to
be 0.1% to 0.4% of base pairs. In 2015, the 1000 Genomes Project,
which sequenced 2,504 individuals from 26 human populations,
found that "a typical [individual] genome differs from the reference
human genome at 4.1 million to 5.0 million sites … affecting 20 million
bases of sequence"; the latter figure corresponds to 0.6% of the total
number of base pairs.
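The definition of nucleotide diversity as an average pairwise proportion of differing nucleotides can be sketched as follows. The function names and the toy sequences are made up for illustration and are not real data; the sketch assumes the sequences are already aligned to the same length. The last lines simply redo the arithmetic behind the 0.6% figure, using the roughly 3.2 billion base pair genome length quoted earlier.

```python
from itertools import combinations

def pairwise_difference(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)

def nucleotide_diversity(sequences):
    """Average pairwise difference over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, for illustration only).
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(nucleotide_diversity(toy))

    # Arithmetic behind the 0.6% figure: about 20 million differing
    # bases out of about 3.2 billion base pairs.
    print(round(20e6 / 3.2e9 * 100, 1), "percent")
```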
Nearly all (>99.9%) of these sites are small differences, either
single nucleotide polymorphisms or brief insertions or deletions (indels) in the genetic sequence, but structural variations account for a greater number of base-pairs than the SNPs and indels.
As of 2017, the Single Nucleotide Polymorphism Database (dbSNP), which lists SNP and other variants, listed 324 million variants found in sequenced human genomes.
Single nucleotide polymorphisms
DNA molecule 1 differs from DNA molecule 2 at a single base-pair location (a C/T polymorphism).
A single nucleotide polymorphism
(SNP) is a difference in a single nucleotide between members of one
species that occurs in at least 1% of the population. The 2,504
individuals characterized by the 1000 Genomes Project had 84.7 million SNPs among them. SNPs are the most common type of sequence variation, estimated in 1998 to account for 90% of all sequence variants. Other sequence variations are single base exchanges, deletions and insertions. SNPs occur on average about every 100 to 300 bases and so are the major source of heterogeneity.
A functional, or non-synonymous, SNP is one that affects some factor such as gene splicing or messenger RNA, and so causes a phenotypic difference between members of the species. About 3% to 5% of human SNPs are functional (see International HapMap Project). Neutral, or synonymous SNPs are still useful as genetic markers in genome-wide association studies, because of their sheer number and the stable inheritance over generations.
A coding SNP is one that occurs inside a gene. There are 105 Human Reference SNPs that result in premature stop codons
in 103 genes. This corresponds to 0.5% of coding SNPs. They occur due
to segmental duplication in the genome. These SNPs result in loss of
protein, yet all these SNP alleles are common and have not been removed by negative (purifying) selection.
According to the 1000 Genomes Project, a typical human has 2,100
to 2,500 structural variations, which include approximately 1,000 large
deletions, 160 copy-number variants, 915 Alu insertions, 128 L1 insertions, 51 SVA insertions, 4 NUMTs, and 10 inversions.
A copy-number variation (CNV) is a difference in the genome due to
deleting or duplicating large regions of DNA on some chromosome. It is
estimated that 0.4% of the genomes of unrelated humans differ with
respect to copy number. When copy number variation is included,
human-to-human genetic variation is estimated to be at least 0.5% (99.5%
similarity). Copy number variations are inherited but can also arise during development.
A visual map of the regions with high genomic variation in the modern-human reference assembly relative to a
50,000-year-old Neanderthal has been built by Pratas et al.
Epigenetics
Epigenetic variation is variation in the chemical tags that attach to DNA and affect how genes get read. The tags, "called epigenetic markings, act as switches that control how genes can be read." At some alleles, the epigenetic state of the DNA, and associated phenotype, can be inherited across generations of individuals.
Genetic variability is a measure of the tendency of individual genotypes in a population to vary (become different) from one another. Variability is different from genetic diversity,
which is the amount of variation seen in a particular population. The
variability of a trait is how much that trait tends to vary in response
to environmental and genetic influences.
In biology, a cline is a continuum of species,
populations, varieties, or forms of organisms that exhibit gradual
phenotypic and/or genetic differences over a geographical area,
typically as a result of environmental heterogeneity.
In the scientific study of human genetic variation, a gene cline can be
rigorously defined and subjected to quantitative metrics.
The most commonly studied human haplogroups are Y-chromosome (Y-DNA) haplogroups and mitochondrial DNA (mtDNA) haplogroups, both of which can be used to define genetic populations. Y-DNA is passed solely along the patrilineal line, from father to son, while mtDNA is passed down the matrilineal line, from mother to both daughters and sons. The Y-DNA and mtDNA may change by chance mutation at each generation.
A variable number tandem repeat (VNTR) is the variation of length of a tandem repeat. A tandem repeat is the adjacent repetition of a short nucleotide sequence. Tandem repeats exist on many chromosomes, and their length varies between individuals. Each variant acts as an inherited allele, so they are used for personal or parental identification. Their analysis is useful in genetics and biology research, forensics, and DNA fingerprinting.
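As a rough illustration of how alleles at a tandem-repeat locus differ in length, the sketch below counts the longest run of back-to-back copies of a motif. The function name, the "GATA" motif, and the two allele strings are hypothetical examples, not real loci; actual genotyping works from laboratory fragment sizes or sequencing reads rather than simple string matching.

```python
def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Walk forward one motif length at a time while copies repeat.
        while sequence.startswith(motif, pos):
            run += 1
            pos += step
        best = max(best, run)
    return best

if __name__ == "__main__":
    # Hypothetical alleles at one locus, differing only in repeat count.
    allele_1 = "TTG" + "GATA" * 5 + "CCA"
    allele_2 = "TTG" + "GATA" * 8 + "CCA"
    print(longest_tandem_run(allele_1, "GATA"))
    print(longest_tandem_run(allele_2, "GATA"))
```

Two individuals carrying different repeat counts at the same locus would be distinguishable by this count, which is the property that identification methods rely on.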
Map of the migration of modern humans out of Africa, based on mitochondrial DNA. Colored rings indicate thousand years before present.
Genetic distance map by Magalhães et al. (2012).
The recent African origin of modern humans paradigm assumes the dispersal of non-African populations of anatomically modern humans
after 70,000 years ago. Dispersal within Africa occurred significantly
earlier, at least 130,000 years ago. The "out of Africa" theory
originates in the 19th century, as a tentative suggestion in Charles
Darwin's Descent of Man,
but remained speculative until the 1980s when it was supported by the
study of present-day mitochondrial DNA, combined with evidence from physical anthropology of archaic specimens.
According to a 2000 study of Y-chromosome sequence variation,
human Y-chromosomes trace ancestry to Africa, and the descendants of
the derived lineage left Africa and eventually replaced archaic
human Y-chromosomes in Eurasia. The study also shows that a minority of
contemporary populations in East Africa and the Khoisan
are the descendants of the most ancestral patrilineages of anatomically
modern humans that left Africa 35,000 to 89,000 years ago.
Other evidence supporting the theory is that variations in skull
measurements decrease with distance from Africa at the same rate as the
decrease in genetic diversity. Human genetic diversity decreases in
native populations with migratory distance from Africa, and this is
thought to be due to bottlenecks during human migration, which are events that temporarily reduce population size.
A 2009 genetic clustering study, which genotyped 1327 polymorphic
markers in various African populations, identified six ancestral
clusters. The clustering corresponded closely with ethnicity, culture
and language. A 2018 whole genome sequencing
study of the world's populations observed similar clusters among the
populations in Africa. At K=9, distinct ancestral components defined the
Afroasiatic-speaking populations inhabiting North Africa and Northeast Africa; the Nilo-Saharan-speaking populations in Northeast Africa and East Africa; the Ari populations in Northeast Africa; the Niger-Congo-speaking populations in West-Central Africa, West Africa, East Africa and Southern Africa; the Pygmy populations in Central Africa; and the Khoisan populations in Southern Africa.
In May 2023, scientists reported, based on genetic studies, a
more complicated pathway of human evolution than previously understood.
According to the studies, humans evolved from different places and times
in Africa, instead of from a single location and period of time.
Because of the common ancestry of all humans, only a small number of
variants have large differences in frequency between populations.
However, some rare variants in the world's human population are much
more frequent in at least one population (more than 5%).
Genetic variation of Eurasian populations showing different frequency of West- and East-Eurasian components.
It is commonly assumed that early humans left Africa, and thus must
have passed through a population bottleneck before their
African-Eurasian divergence around 100,000 years ago (ca. 3,000
generations). The rapid expansion of a previously small population has two important effects on the distribution of genetic variation. First, the so-called founder effect
occurs when founder populations bring only a subset of the genetic
variation from their ancestral population. Second, as founders become
more geographically separated, the probability that two individuals from
different founder populations will mate becomes smaller. The effect of
this assortative mating is to reduce gene flow between geographical groups and to increase the genetic distance between groups.
The expansion of humans from Africa affected the distribution of
genetic variation in two other ways. First, smaller (founder)
populations experience greater genetic drift
because of increased fluctuations in neutral polymorphisms. Second, new
polymorphisms that arose in one group were less likely to be
transmitted to other groups as gene flow was restricted.
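The founder effect described above can be sketched by repeatedly founding a new population from a small random sample of the previous one and tracking how diversity changes. This is a deliberately crude toy model: the function names, the ten equally common starting alleles, and the founder size are all assumptions chosen for illustration, and the regrowth step ignores mutation, selection, and gene flow.

```python
import random

def heterozygosity(alleles):
    """Probability that two copies drawn with replacement differ."""
    n = len(alleles)
    freqs = [alleles.count(a) / n for a in set(alleles)]
    return 1 - sum(f * f for f in freqs)

def serial_founders(source, founder_size, steps, seed=1):
    """Repeatedly found a new population from a small random sample.

    Each step draws founder_size gene copies from the current pool and
    regrows the pool to its original size by resampling from those
    founders (an assumed, simplified model of expansion).
    """
    rng = random.Random(seed)
    pool = list(source)
    history = [heterozygosity(pool)]
    for _ in range(steps):
        founders = rng.sample(pool, founder_size)
        pool = [rng.choice(founders) for _ in range(len(source))]
        history.append(heterozygosity(pool))
    return history

if __name__ == "__main__":
    # Hypothetical source pool: 10 alleles, 100 copies each.
    source = [allele for allele in "ABCDEFGHIJ" for _ in range(100)]
    for step, h in enumerate(serial_founders(source, founder_size=12, steps=6)):
        print(step, round(h, 3))
```

Because each founder group carries only a subset of the source variation and no new alleles arise in this model, diversity after a chain of founding events cannot exceed that of the source pool.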
Populations in Africa tend to have lower amounts of linkage disequilibrium
than do populations outside Africa, partly because of the larger size
of human populations in Africa over the course of human history and
partly because the number of modern humans who left Africa to colonize
the rest of the world appears to have been relatively low.
In contrast, populations that have undergone dramatic size reductions
or rapid expansions in the past and populations formed by the mixture of
previously separate ancestral groups can have unusually high levels of
linkage disequilibrium.
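Linkage disequilibrium is commonly summarized by the coefficient D (the difference between an observed haplotype frequency and the frequency expected if the two loci were independent) and by the normalized statistic r². The sketch below computes both for two loci with two alleles each. The function name, the upper/lower-case labelling of alleles, and the example haplotype lists are assumptions made for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Compute D and r^2 for two biallelic loci.

    haplotypes: list of two-character strings such as "AB", "Ab",
    "aB", "ab", where upper and lower case mark the two alleles at
    each locus (a labelling convention assumed for this sketch).
    """
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == "A") / n   # frequency of A
    p_b = sum(1 for h in haplotypes if h[1] == "B") / n   # frequency of B
    p_ab = sum(1 for h in haplotypes if h == "AB") / n    # frequency of AB
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared

if __name__ == "__main__":
    # Alleles always travel together: strong disequilibrium.
    linked = ["AB"] * 5 + ["ab"] * 5
    # All four combinations equally common: no disequilibrium.
    shuffled = ["AB", "Ab", "aB", "ab"] * 5
    print(linkage_disequilibrium(linked))
    print(linkage_disequilibrium(shuffled))
```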
Distribution of variation
Human genetic variation calculated from genetic data representing 346 microsatellite
loci taken from 1484 individuals in 78 human populations. The upper
graph illustrates that as populations are further from East Africa, they
have declining genetic diversity as measured in average number of
microsatellite repeats at each of the loci. The bottom chart illustrates
isolation by distance.
Populations with a greater distance between them are more dissimilar
(as measured by the Fst statistic) than those which are geographically
close to one another. The horizontal axis of both charts is geographic
distance as measured along likely routes of human migration. (Chart from
Kanitz et al. 2018)
The distribution of genetic variants within and among human
populations is impossible to describe succinctly because of the
difficulty of defining a "population," the clinal nature of variation,
and heterogeneity across the genome (Long and Kittles 2003). In general,
however, an average of 85% of genetic variation exists within local
populations, ~7% is between local populations within the same continent,
and ~8% of variation occurs between large groups living on different
continents. The recent African origin
theory for humans would predict that in Africa there exists a great
deal more diversity than elsewhere and that diversity should decrease
the further from Africa a population is sampled.
Sub-Saharan Africa has the most human genetic diversity, and the same has been shown to hold true for phenotypic variation in skull form. Phenotype is connected to genotype through gene expression.
Genetic diversity decreases smoothly with migratory distance from that
region, which many scientists believe to be the origin of modern humans,
and that decrease is mirrored by a decrease in phenotypic variation.
Skull measurements are an example of a physical attribute whose
within-population variation decreases with distance from Africa.
The distribution of many physical traits resembles the distribution of genetic variation within and between human populations (American Association of Physical Anthropologists
1996; Keita and Kittles 1997). For example, ~90% of the variation in
human head shapes occurs within continental groups, and ~10% separates
groups, with a greater variability of head shape among individuals with
recent African ancestors (Relethford 2002).
A prominent exception to the common distribution of physical characteristics within and among groups is skin color.
Approximately 10% of the variance in skin color occurs within groups,
and ~90% occurs between groups (Relethford 2002). This distribution of
skin color and its geographic patterning – with people whose ancestors
lived predominantly near the equator having darker skin than those with
ancestors who lived predominantly in higher latitudes – indicate that
this attribute has been under strong selective pressure. Darker skin appears to be strongly selected for in equatorial regions to prevent sunburn, skin cancer, the photolysis of folate, and damage to sweat glands.
Understanding how genetic diversity in the human population
impacts various levels of gene expression is an active area of research.
While earlier studies focused on the relationship between DNA variation
and RNA expression, more recent efforts are characterizing the genetic
control of various aspects of gene expression including chromatin
states, translation, and protein levels.
A study published in 2007 found that 25% of genes showed different
levels of gene expression between populations of European and Asian
descent.
The primary cause of this difference in gene expression was thought to
be SNPs in gene regulatory regions of DNA. Another study published in
2007 found that approximately 83% of genes were expressed at different
levels among individuals and about 17% between populations of European
and African descent.
Wright's fixation index as measure of variation
The population geneticist Sewall Wright developed the fixation index (often abbreviated to FST)
as a way of measuring genetic differences between populations. This
statistic is often used in taxonomy to compare differences between any
two given populations by measuring the genetic differences among and
between populations for individual genes, or for many genes
simultaneously.
It is often stated that the fixation index for humans is about 0.15.
This means that an estimated 85% of the variation measured in the
overall human population is found among individuals of the same
population, and about 15% of the variation occurs between populations.
These estimates imply that any two individuals from different
populations may be more similar to each other than either is to a member
of their own group.
"The shared evolutionary history of living humans has resulted in a high
relatedness among all living people, as indicated for example by the
very low fixation index (FST) among living human populations." Richard Lewontin,
who affirmed these ratios, thus concluded neither "race" nor
"subspecies" were appropriate or useful ways to describe human
populations.
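The link between a fixation index of about 0.15 and the 85%/15% split can be illustrated with the heterozygosity-based form FST = (HT − HS) / HT, where HS is the average expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. The sketch below assumes a single locus with two alleles and equally weighted subpopulations; the function names and the allele frequencies are hypothetical, chosen only so that the result lands near the commonly quoted value.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2 * p * (1 - p)

def fst(subpop_freqs):
    """F_ST = (H_T - H_S) / H_T for equally weighted subpopulations."""
    k = len(subpop_freqs)
    # Average within-subpopulation heterozygosity.
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / k
    # Heterozygosity of the pooled population.
    p_bar = sum(subpop_freqs) / k
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0:
        return 0.0
    return (h_t - h_s) / h_t

if __name__ == "__main__":
    # Hypothetical allele frequencies in two populations.
    value = fst([0.3, 0.7])
    print("F_ST:", round(value, 2))
    print("share of variation within populations:", round(1 - value, 2))
```

With these made-up frequencies, most of the heterozygosity is found within each population and only a minority is attributable to the difference between them, which is the pattern the 85%/15% description refers to.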
Wright himself believed that values >0.25 represent very great genetic variation and that an FST
of 0.15–0.25 represented great variation. However, about 5% of human
variation occurs between populations within continents; FST
values between continental groups of humans (or races) as low as 0.1
(or possibly lower) have therefore been found in some studies, suggesting more
moderate levels of genetic variation. Graves (1996) has countered that FST
should not be used as a marker of subspecies status, as the statistic
is used to measure the degree of differentiation between populations, although see also Wright (1978).
Jeffrey Long and Rick Kittles give a long critique of the application of FST
to human populations in their 2003 paper "Human Genetic Diversity and
the Nonexistence of Biological Races". They find that the figure of 85%
is misleading because it implies that all human populations contain on
average 85% of all genetic diversity. They argue the underlying
statistical model incorrectly assumes equal and independent histories of
variation for each large human population. A more realistic approach is
to understand that some human groups are parental to other groups and
that these groups represent paraphyletic groups to their descent groups. For example, under the recent African origin
theory the human population in Africa is paraphyletic to all other
human groups because it represents the ancestral group from which all
non-African populations derive, but more than that, non-African groups
only derive from a small non-representative sample of this African
population. This means that all non-African groups are more closely
related to each other and to some African groups (probably east
Africans) than they are to others, and further that the migration out of
Africa represented a genetic bottleneck,
with much of the diversity that existed in Africa not being carried out
of Africa by the emigrating groups. Under this scenario, human
populations do not have equal amounts of local variability, but rather
diminished amounts of diversity the further from Africa any population
lives. Long and Kittles find that rather than 85% of human genetic
diversity existing in all human populations, about 100% of human
diversity exists in a single African population, whereas only about 70%
of human genetic diversity exists in a population derived from New
Guinea. Long and Kittles argued that this still produces a global human
population that is genetically homogeneous compared to other mammalian
populations.
Anatomically modern humans interbred with Neanderthals during the Middle Paleolithic. In May 2010, the Neanderthal Genome Project presented genetic evidence that interbreeding
took place and that a small but significant portion, around 2–4%, of
Neanderthal admixture is present in the DNA of modern Eurasians and
Oceanians, and nearly absent in sub-Saharan African populations.
Between 4% and 6% of the genome of Melanesians (represented by the Papua New Guinean and Bougainville Islander) appears to derive from Denisovans
– a previously unknown hominin which is more closely related to
Neanderthals than to Sapiens. It was possibly introduced during the
early migration of the ancestors of Melanesians into Southeast Asia.
This history of interaction suggests that Denisovans once ranged widely
over eastern Asia.
Thus, Melanesians emerge as one of the most archaic-admixed populations, having Denisovan/Neanderthal-related admixture of ~8%.
In a study published in 2013, Jeffrey Wall from the University of
California studied whole-genome sequence data and found higher rates of
introgression in Asians compared to Europeans.
Hammer et al. tested the hypothesis that contemporary African genomes
have signatures of gene flow with archaic human ancestors and found
evidence of archaic admixture in the genomes of some African groups,
suggesting that modest amounts of gene flow were widespread throughout
time and space during the evolution of anatomically modern humans.
A study published in 2020 found that the Yoruba and Mende
populations of West Africa derive between 2% and 19% of their genome
from an as-yet unidentified archaic hominin population that likely
diverged before the split of modern humans and the ancestors of
Neanderthals and Denisovans, potentially making these groups the most archaic-admixed human populations identified yet.
Categorization of the world population
Chart showing human genetic clustering.
Individuals mostly have genetic variants which are found in multiple regions of the
world. Based on data from "A unified genealogy of modern and ancient
genomes".
New data on human genetic variation has reignited the debate about a
possible biological basis for categorization of humans into races. Most
of the controversy surrounds the question of how to interpret the
genetic data and whether conclusions based on it are sound. Some
researchers argue that self-identified race can be used as an indicator
of geographic ancestry for certain health risks and medications.
Although the genetic differences among human groups are relatively small, differences in certain genes such as Duffy, ABCC11 and SLC24A5, called ancestry-informative markers
(AIMs), can nevertheless be used to reliably situate many individuals
within broad, geographically based groupings. For example, computer
analyses of hundreds of polymorphic loci sampled in globally distributed
populations have revealed the existence of genetic clustering that
roughly is associated with groups that historically have occupied large
continental and subcontinental regions (Rosenberg et al. 2002; Bamshad et al. 2003).
Some commentators have argued that these patterns of variation
provide a biological justification for the use of traditional racial
categories. They argue that the continental clusterings correspond
roughly with the division of human beings into sub-Saharan Africans; Europeans, Western Asians, Central Asians, Southern Asians and Northern Africans; Eastern Asians, Southeast Asians, Polynesians and Native Americans; and other inhabitants of Oceania (Melanesians, Micronesians & Australian Aborigines) (Risch et al.
2002). Other observers disagree, saying that the same data undercut
traditional notions of racial groups (King and Motulsky 2002; Calafell
2003; Tishkoff and Kidd 2004).
They point out, for example, that major populations considered races or
subgroups within races do not necessarily form their own clusters.
Racial categories are also undermined by findings that genetic
variants which are limited to one region tend to be rare within that
region, variants that are common within a region tend to be shared
across the globe, and most differences between individuals, whether they
come from the same region or different regions, are due to global
variants. No genetic variants have been found which are fixed within a continent or major region and found nowhere else.
Furthermore, because human genetic variation is clinal, many
individuals affiliate with two or more continental groups. Thus, the
genetically based "biogeographical ancestry" assigned to any given
person generally will be broadly distributed and will be accompanied by
sizable uncertainties (Pfaff et al. 2004).
In many parts of the world, groups have mixed in such a way that
many individuals have relatively recent ancestors from widely separated
regions. Although genetic analyses of large numbers of loci can produce
estimates of the percentage of a person's ancestors coming from various
continental populations (Shriver et al. 2003; Bamshad et al.
2004), these estimates may assume a false distinctiveness of the
parental populations, since human groups have exchanged mates from local
to continental scales throughout history (Cavalli-Sforza et al.
1994; Hoerder 2002). Even with large numbers of markers, information for
estimating admixture proportions of individuals or groups is limited,
and estimates typically will have wide confidence intervals (Pfaff et al. 2004).
Genetic data can be used to infer population structure and assign
individuals to groups that often correspond with their self-identified
geographical ancestry. Jorde and Wooding (2004) argued that "Analysis of
many loci now yields reasonably accurate estimates of genetic
similarity among individuals, rather than populations. Clustering of
individuals is correlated with geographic origin or ancestry."
However, identification by geographic origin may quickly break down
when considering historical ancestry shared between individuals back in
time.
An analysis of autosomal SNP data from the International HapMap Project (Phase II) and CEPH Human Genome Diversity Panel samples was published in 2009.
The study of 53 populations taken from the HapMap and CEPH data (1138 unrelated individuals) suggested that natural selection
may shape the human genome much more slowly than previously thought,
with factors such as migration within and among continents more heavily
influencing the distribution of genetic variations.
A similar study published in 2010 found strong genome-wide evidence for
selection due to changes in ecoregion, diet, and subsistence
particularly in connection with polar ecoregions, with foraging, and
with a diet rich in roots and tubers. In a 2016 study, principal component analysis
of genome-wide data was capable of recovering previously-known targets
for positive selection (without prior definition of populations) as well
as a number of new candidate genes.
Forensic anthropology
Forensic anthropologists
can assess the ancestry of skeletal remains by analyzing skeletal
morphology as well as using genetic and chemical markers, when possible.
While these assessments are never certain, the accuracy of skeletal
morphology analyses in determining true ancestry has been estimated at
90%.
Ternary plot
showing average admixture of five North American ethnic groups.
Individuals that self-identify with each group can be found at many
locations on the map, but on average groups tend to cluster differently.
Gene flow between two populations reduces the average genetic
distance between them. Only totally isolated human
populations experience no gene flow; most populations have continuous
gene flow with neighboring populations, which creates the clinal
distribution observed for most genetic variation. When gene flow takes
place between well-differentiated genetic populations the result is
referred to as "genetic admixture".
Admixture mapping is a technique used to study how genetic variants cause differences in disease rates between populations.
Recently admixed populations that trace their ancestry to multiple
continents are well suited for identifying genes for traits and diseases
that differ in prevalence between parental populations.
African-American populations have been the focus of numerous population
genetic and admixture mapping studies, including studies of complex
genetic traits such as white cell count, body-mass index, prostate
cancer and renal disease.
An analysis of phenotypic and genetic variation including skin
color and socio-economic status was carried out in the population of
Cape Verde which has a well documented history of contact between
Europeans and Africans. The studies showed that the pattern of admixture in
this population has been sex-biased (involving mostly matings between
European men and African women) and there is a significant interaction
between socioeconomic status and skin color, independent of ancestry. Another study shows an increased risk of graft-versus-host disease complications after transplantation due to genetic variants in human leukocyte antigen (HLA) and non-HLA proteins.
Given that each individual has millions of genetic variants (compared to the reference genome),
it is an important question what impact these variants have on human
health or gene function. Most genetic variants have only small to
moderate effects, if any. Frequently cited examples of common diseases whose prevalence differs between groups include hypertension (Douglas et al. 1996), diabetes, obesity (Fernandez et al. 2003), and prostate cancer (Platz et al. 2000). However, the role of genetic factors in generating these differences remains uncertain.
Effect on protein function
The human genome encodes about 20,000 protein-coding genes with about 550 amino acids each. Hence, human proteins span about 11 million amino acids (22 million per diploid genome). The median number of missense mutations
in individual human genomes is about 8600, that is, two individuals
differ by 1 in about 2600 amino acids or in about 20% of their proteins.
The average individual has about 137 (predicted) loss of function
mutations, including 71 frameshift and 148 in-frame deletions or insertions.
Mutations at 32.2% and 9.5% of all possible genomic positions,
respectively, can lead to missense and stop-gained variants (i.e.,
truncated proteins).
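The arithmetic behind these figures can be checked with a short Python sketch. The constant names are illustrative, and the inputs are the approximate values quoted in the text rather than measured data.

```python
# Back-of-the-envelope check of the figures quoted in the text (all approximate).
PROTEIN_CODING_GENES = 20_000
AMINO_ACIDS_PER_PROTEIN = 550
MISSENSE_PER_GENOME = 8_600  # median missense variants per individual

haploid_residues = PROTEIN_CODING_GENES * AMINO_ACIDS_PER_PROTEIN  # about 11 million
diploid_residues = 2 * haploid_residues                            # about 22 million

# Average spacing between missense differences, in amino acids.
residues_per_missense = diploid_residues / MISSENSE_PER_GENOME

print(f"haploid residues: {haploid_residues:,}")
print(f"diploid residues: {diploid_residues:,}")
print(f"about 1 missense variant per {residues_per_missense:,.0f} amino acids")
```

The last figure comes out slightly below the rounded "1 in about 2600" given above.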
In a sample of almost 1 million people, almost 5000 genes were
identified that had loss-of-function variants in both alleles of the
same individual. That is, if these 5000 genes can tolerate homozygous loss of function mutations, they are unlikely to be essential.
Monogenic diseases
Differences in allele frequencies contribute to group differences in the incidence of some monogenic diseases, and they may contribute to differences in the incidence of some common diseases.
For the monogenic diseases, the frequency of causative alleles usually
correlates best with ancestry, whether familial (for example, Ellis–Van Creveld syndrome among the Pennsylvania Amish), ethnic (Tay–Sachs disease among Ashkenazi Jewish
populations), or geographical (hemoglobinopathies among people with
ancestors who lived in malarial regions). To the extent that ancestry
corresponds with racial or ethnic groups or subgroups, the incidence of
monogenic diseases can differ between groups categorized by race or
ethnicity, and health-care professionals typically take these patterns
into account in making diagnoses.
Beneficial variants
Some
variants, on the other hand, are beneficial to humans, as they
prevent certain diseases or improve the chance of adapting to the
environment. One example is a mutation in the CCR5 gene that protects against AIDS. Because of the mutation, the CCR5 protein is absent from the surface of the cell. Without CCR5 on the surface, there is nothing for HIV
to grab onto and bind to. The mutation in the CCR5 gene therefore
decreases an individual's risk of AIDS. The mutation in
CCR5 is also quite common in certain areas, with more than 14% of the
population carrying the mutation in Europe and about 6–10% in Asia and North Africa.
HIV attachment
Many genetic variants may have aided humans in ancient times but
plague us today. For example, genes that allow humans to more
efficiently process food also make people susceptible to obesity and
diabetes today.
There are numerous related projects that deal with genetic
variation (or variation in the encoded proteins), e.g. organized by the
following organizations:
HUman Genome Organisation (HUGO) -- organizes activities around human genome sequencing, including variants
Figure 1: Genetic distance map by Cavalli-Sforza et al. (1994)
Genetic distance is a measure of the genetic divergence between species or between populations within a species, whether the distance measures time from common ancestor or degree of differentiation. Populations with many similar alleles have small genetic distances. This indicates that they are closely related and have a recent common ancestor.
Genetic distance is useful for reconstructing the history of populations, such as the multiple human expansions out of Africa. It is also used for understanding the origin of biodiversity.
For example, the genetic distances between different breeds of
domesticated animals are often investigated in order to determine which
breeds should be protected to maintain genetic diversity.
Biological foundation
Life on earth began with very simple unicellular organisms, which evolved into highly complex multicellular organisms over the course of more than three billion years. Creating a comprehensive tree of life
that represents all the organisms that have ever lived on earth is
important for understanding how life evolved in response to the
challenges organisms have faced, and how it might deal with similar challenges in
the future. Evolutionary biologists have attempted to create evolutionary or
phylogenetic trees encompassing as many organisms as possible based on the available resources. Fossil dating and the molecular clock
are the two means of reconstructing the evolutionary history of living
organisms. The fossil record is random and incomplete and does not provide a
continuous chain of events, just as a movie with missing frames cannot tell
the whole plot.
Molecular clocks, on the other hand, are specific sequences of DNA, RNA or proteins
(amino acids) that are used to determine, at the molecular level, the
similarities and differences among species, to estimate the timeline of
divergence, and to trace back the common ancestor of species based on the mutation rates and sequence changes accumulated in those sequences.
The primary driver of evolution is mutation, or change in genes, and
accounting for those changes over time gives the approximate
genetic distance between species. These molecular clocks are
fairly conserved
across a range of species, mutate at a roughly constant rate, like a
clock, and are calibrated against evolutionary events (fossil records).
For example, the gene for alpha-globin (a constituent of hemoglobin) mutates
at a rate of 0.56 per base pair per billion years. The molecular clock can fill the gaps left by missing fossil records.
In the genome of an organism, each gene is located at a specific place called the locus
for that gene. Allelic variations at these loci cause phenotypic
variation within species (e.g. hair colour, eye colour). However, most
alleles do not have an observable impact on the phenotype. Within a
population new alleles generated by mutation either die out or spread
throughout the population. When a population is split into different
isolated populations (by either geographical or ecological factors),
mutations that occur after the split will be present only in the
isolated population. Random fluctuation of allele frequencies also
produces genetic differentiation between populations. This process is
known as genetic drift. By examining the differences between allele frequencies between the populations and computing genetic distance, we can estimate how long ago the two populations were separated.
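How random fluctuation alone can pull two isolated populations apart can be illustrated with a small simulation. This is a minimal sketch using the Wright-Fisher model, which is an assumption introduced here and not a method named in the text. The function name `wright_fisher`, the population size and the number of generations are illustrative.

```python
import random

def wright_fisher(p0, pop_size, generations, rng):
    """Track one allele's frequency under pure drift
    (no mutation, selection or migration)."""
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Each of the 2N gene copies in the next generation is drawn
        # at random from the current allele frequency.
        copies = sum(1 for _ in range(2 * pop_size) if rng.random() < p)
        p = copies / (2 * pop_size)
        freqs.append(p)
    return freqs

rng = random.Random(1)  # fixed seed so the run is repeatable
# Two populations split from the same ancestral frequency of 0.5.
pop_a = wright_fisher(0.5, 50, 100, rng)
pop_b = wright_fisher(0.5, 50, 100, rng)
print("final frequency in A:", pop_a[-1])
print("final frequency in B:", pop_b[-1])
```

Because the two runs are independent, their final frequencies generally differ even though both started from the same value, which is the differentiation that genetic distance measures try to quantify.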
Suppose a sequence of DNA, or a hypothetical gene, has a mutation rate of one base
per 10 million years. Using this sequence, the divergence of two
species, or the genetic distance between them, can
be determined by counting the number of base pair differences between
them. For example, in Figure 2 a difference of 4 bases in the
hypothetical sequence between the two species corresponds to 40 million years
of divergence (4 mutations at one per 10 million years). If the changes accumulated along both lineages, their common ancestor would have
lived at least 20 million years ago. Based on
the molecular clock, the equation below can be used to calculate the time
since divergence.
Number of mutation ÷ Mutation per year (rate of mutation) = time since divergence
Figure 2: Divergence timeline between two hypothetical species.
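The molecular clock equation (number of mutations ÷ mutation rate = time since divergence) can be written as a short Python sketch. The function name `time_since_divergence` is hypothetical, and the inputs are the hypothetical values from the example above (4 differences, one base per 10 million years).

```python
def time_since_divergence(num_mutations, mutations_per_year):
    """Number of mutations divided by the mutation rate per year."""
    if mutations_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return num_mutations / mutations_per_year

rate = 1 / 10_000_000  # one base per 10 million years
years = time_since_divergence(4, rate)
print(f"about {years:,.0f} years")
```

The sketch assumes a constant rate and does not separate the changes accumulated along each of the two lineages.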
Process of determining genetic distance
Recent advances in sequencing technology, together with the availability of comprehensive genomic databases and bioinformatics tools
capable of storing and processing the colossal amounts of data
generated by that technology, have tremendously
improved evolutionary studies and the understanding of evolutionary relationships among species.
Markers for genetic distance
Different biomolecular markers such as DNA, RNA and amino acid sequences (protein) can be used for determining the genetic distance.
The selection criteria for an appropriate biomarker for genetic distance entail the following three steps:
The choice of variability depends on the intended outcome. For example, a very high level of variability is recommended for demographic studies and parentage analyses,
medium to high variability for comparing distinct populations, and
moderate to very low variability for phylogenetic
studies. The genomic localization and ploidy of the marker are also important factors. For example, the gene copy number is inversely proportional to the robustness, with a haploid genome (mitochondrial DNA) more prone to genetic drift than a diploid genome (nuclear DNA).
The choice and examples of molecular markers for evolutionary biology studies.
Phylogenetics:
Exploring the genetic distance among species can help in establishing
evolutionary relationships among them, estimating the time of divergence between
them, and creating a comprehensive phylogenetic tree that connects them to
their common ancestors.
Accuracy of genomic prediction: Genetic distance can be used to predict unobserved phenotypes, which has implications for medical diagnostics and for the breeding of plants and animals.
Population Genetics: Genetic distance can help in studying population genetics, understanding intra and inter-population genetic diversity.
Taxonomy and Species Delimitation: Determining genetic distance through DNA barcoding is an effective tool for delimiting species especially identifying cryptic species.
An optimized percentage threshold genetic distance is recommended based
on the data and species being studied to improve and enhance the
reliability and applicability of delimitation that can delineate species boundaries and identify cryptic species that look similar but are genetically distinct.
Evolutionary forces affecting genetic distance
Evolutionary forces such as mutation, genetic drift, natural selection, and gene flow
drive the process of evolution and genetic diversity. All these forces
play a significant role in genetic distance within and among species.
Measures
Figure 3: Image depicts speciation stemming from geographic isolation, where a
starting population is separated. Over vast amounts of time, isolated
groups of a particular taxon may diverge into distinct species.
Different statistical measures exist that aim to quantify genetic
deviation between populations or species. By utilizing assumptions
gained from experimental analysis of evolutionary forces, a model that
more accurately suits a given experiment can be selected to study a
genetic group. Additionally, comparing how well different metrics model
certain population features such as isolation can identify metrics that
are more suited for understanding newly studied groups. The most commonly used genetic distance metrics are Nei's genetic distance, the Cavalli-Sforza and Edwards measure, and Reynolds, Weir and Cockerham's genetic distance.
Jukes-Cantor Distance
One of the most basic and straightforward distance measures is Jukes-Cantor distance.
This measure is constructed based on the assumption that no insertions
or deletions occurred, all substitutions are independent, and that each
nucleotide change is equally likely. With these presumptions, we can
obtain the following equation:
$$d_{AB} = -\frac{3}{4} \ln\left(1 - \frac{4}{3} p_{AB}\right)$$
where $d_{AB}$ is the Jukes-Cantor distance between two sequences A and B, and $p_{AB}$ is the dissimilarity between the two sequences.
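As an illustration, the Jukes-Cantor distance can be computed in a few lines of Python, assuming the standard form d = -(3/4) ln(1 - (4/3) p), where p is the proportion of sites that differ. The function names and the two short sequences are made up for the example, and the sequences are assumed to be aligned, of equal length and free of gaps.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)

def jukes_cantor(p):
    """Jukes-Cantor distance from the dissimilarity p (defined for 0 <= p < 0.75)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)

p = p_distance("ACGTACGTAC", "ACGTTCGTAA")  # 2 differences in 10 sites
d = jukes_cantor(p)
print(f"p = {p:.2f}, Jukes-Cantor distance = {d:.4f}")
```

The corrected distance is larger than the raw dissimilarity because it allows for multiple substitutions at the same site.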
Nei's standard genetic distance
In 1972, Masatoshi Nei
published what came to be known as Nei's standard genetic distance.
This distance has the nice property that if the rate of genetic change
(amino acid substitution) is constant per year or generation then Nei's
standard genetic distance (D) increases in proportion to divergence time. This measure assumes that genetic differences are caused by mutation and genetic drift.
This distance can also be expressed in terms of the arithmetic mean of gene identity. Let $j_X$ be the probability for two members of population $X$ having the same allele at a particular locus and $j_Y$ be the corresponding probability in population $Y$. Also, let $j_{XY}$ be the probability for a member of $X$ and a member of $Y$ having the same allele. Now let $J_X$, $J_Y$ and $J_{XY}$ represent the arithmetic mean of $j_X$, $j_Y$ and $j_{XY}$ over all loci, respectively. In other words,
$$J_X = \sum_\ell \frac{j_X}{L}, \quad J_Y = \sum_\ell \frac{j_Y}{L}, \quad J_{XY} = \sum_\ell \frac{j_{XY}}{L}$$
where $L$ is the total number of loci examined.
Nei's standard distance can then be written as
$$D = -\ln \frac{J_{XY}}{\sqrt{J_X J_Y}}$$
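A minimal Python sketch of this calculation follows, assuming the form D = -ln(J_XY / sqrt(J_X J_Y)) with the gene identities computed from allele frequencies (the probability that two randomly drawn alleles match is the sum of products of their frequencies). The function names and the example frequencies are hypothetical.

```python
import math

def mean_identity(pop_a, pop_b):
    """Mean over loci of sum_u a_u * b_u, the probability that one allele drawn
    from each population is identical. Each population is a list of loci,
    and each locus is a list of allele frequencies in the same allele order."""
    total = sum(sum(a * b for a, b in zip(la, lb)) for la, lb in zip(pop_a, pop_b))
    return total / len(pop_a)

def nei_standard_distance(pop_x, pop_y):
    j_x = mean_identity(pop_x, pop_x)
    j_y = mean_identity(pop_y, pop_y)
    j_xy = mean_identity(pop_x, pop_y)
    return -math.log(j_xy / math.sqrt(j_x * j_y))

# Two loci with two alleles each (made-up frequencies).
pop_x = [[0.5, 0.5], [1.0, 0.0]]
pop_y = [[0.5, 0.5], [0.0, 1.0]]
print(nei_standard_distance(pop_x, pop_y))
```

The distance is undefined when the two populations share no alleles at any locus, since J_XY is then zero.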
Cavalli-Sforza chord distance
In 1967 Luigi Luca Cavalli-Sforza and A. W. F. Edwards published this measure. It assumes that genetic differences arise due to genetic drift
only. One major advantage of this measure is that the populations are
represented in a hypersphere, the scale of which is one unit per gene
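substitution. The chord distance in the hyperdimensional sphere is given by
$$D_{CH} = \frac{2}{\pi} \sqrt{2\left(1 - \sum_u \sqrt{X_u Y_u}\right)}$$
where $X_u$ and $Y_u$ are the frequencies of allele $u$ at a locus in populations $X$ and $Y$, respectively.
Some authors drop the factor $\frac{2}{\pi}$ to simplify the formula at the cost of losing the property that the scale is one unit per gene substitution.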
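The chord distance can be sketched in Python as follows, assuming the per-locus form (2/π) sqrt(2(1 - Σ sqrt(X_u Y_u))). Averaging the per-locus values over several loci is an assumption made for this example, and the function names are hypothetical.

```python
import math

def chord_distance_locus(x, y):
    """Chord distance at one locus from two lists of allele frequencies."""
    cos_theta = sum(math.sqrt(xu * yu) for xu, yu in zip(x, y))
    # Guard against tiny negative values caused by floating-point rounding.
    return (2 / math.pi) * math.sqrt(max(0.0, 2 * (1 - cos_theta)))

def chord_distance(pop_x, pop_y):
    """Average of the per-locus chord distances (an assumed way to combine loci)."""
    per_locus = [chord_distance_locus(x, y) for x, y in zip(pop_x, pop_y)]
    return sum(per_locus) / len(per_locus)

print(chord_distance([[0.3, 0.7], [1.0, 0.0]], [[0.5, 0.5], [0.8, 0.2]]))
```

Dropping the factor 2/π, as some authors do, would only rescale every value by the same constant.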
Reynolds, Weir, and Cockerham's genetic distance
In 1983, this measure was published by John Reynolds, Bruce Weir and C. Clark Cockerham.
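This measure assumes that genetic differentiation occurs only by genetic drift without mutations. It estimates the coancestry coefficient $\Theta$, which provides a measure of the genetic divergence, by:
$$\Theta_w = \sqrt{\frac{\sum_\ell \sum_u (X_u - Y_u)^2}{2 \sum_\ell \left(1 - \sum_u X_u Y_u\right)}}$$
where $X_u$ and $Y_u$ are the frequencies of allele $u$ in populations $X$ and $Y$, and $\ell$ runs over the loci examined.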
Kimura 2 Parameter distance
Figure
4: A diagram showing the relationship between DNA base-pairs and the
type of mutation needed to convert each base to another based on the
Kimura 2 parameter substitution model.
The Kimura two parameter model (K2P) was developed in 1980 by Japanese biologist Motoo Kimura. It is compatible with the neutral theory
of evolution, which was also developed by the same author. As depicted
in Figure 4, this measure of genetic distance accounts for the type of
mutation occurring, namely whether it is a transition (i.e. purine to purine or pyrimidine to pyrimidine) or a transversion (i.e. purine to pyrimidine or vice versa). With this information, the following formula can be derived:
$$K = -\frac{1}{2} \ln\left((1 - 2P - Q)\sqrt{1 - 2Q}\right)$$
where $P$ is $\frac{n_1}{n}$ and $Q$ is $\frac{n_2}{n}$, with $n_1$ being the number of transition type conversions, $n_2$ being the number of transversion type conversions, and $n$ being the number of nucleotide sites compared.
It is worth noting that when every transition and transversion type substitution has an equal chance of occurring, so that $Q$ is expected to equal $2P$ (each base has one possible transition and two possible transversions), the above formula reduces to the Jukes-Cantor model. In practice, however, transitions are typically more frequent than transversions.
It has been shown that while K2P works well in classifying
distantly-related species, it is not always the best choice for
comparing closely-related species. In these cases, it may be better to
use p-distance instead.
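The K2P calculation can be illustrated with a short Python sketch, assuming the form K = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)), with P and Q the proportions of sites showing transitions and transversions. The function names and the two sequences are made up, and the sequences are assumed to be aligned, gap-free and written in upper-case A, C, G, T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_substitutions(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1    # purine <-> pyrimidine
    return transitions, transversions

def k2p_from_proportions(P, Q):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))

def k2p(seq_a, seq_b):
    n = len(seq_a)
    ts, tv = count_substitutions(seq_a, seq_b)
    return k2p_from_proportions(ts / n, tv / n)

a, b = "AGCTAGCTAG", "GGCTAGCTAC"  # one transition, one transversion
print(f"p-distance = {sum(x != y for x, y in zip(a, b)) / len(a):.2f}")
print(f"K2P distance = {k2p(a, b):.4f}")
```

Setting Q = 2P in `k2p_from_proportions` gives the same value as the Jukes-Cantor formula with p = P + Q, which is one way to check the implementation.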
Kimura 3 Parameter distance
Figure
5: A diagram showing the relationship between DNA base-pairs and the
type of mutation needed to convert each base to another based on the
Kimura 3 parameter substitution model.
The Kimura three parameter (K3P) model was first published in 1981.
This measure assumes three rates of substitution when nucleotides
mutate, which can be seen in Figure 5. There is one rate for transition type mutations, one rate for transversion type mutations to corresponding bases (e.g. G to C; transversion type 1 in the figure), and one rate for transversion type mutations to non-corresponding bases (e.g. G to T; transversion type 2 in the figure).
With these rates of substitution, the following formula can be derived:
$$K = -\frac{1}{4} \ln\left[(1 - 2P - 2Q)(1 - 2P - 2R)(1 - 2Q - 2R)\right]$$
where $P$ is the probability of a transition type mutation, $Q$ is the probability of a transversion type mutation to a corresponding base, and $R$ is the probability of a transversion type mutation to a non-corresponding base. When $Q$ and $R$ are assumed to be equal, this reduces down to the Kimura 2 parameter distance.
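A brief Python sketch of the K3P distance follows, assuming the form K = -(1/4) ln[(1 - 2P - 2Q)(1 - 2P - 2R)(1 - 2Q - 2R)]. The function names are hypothetical, and a two-parameter function is included only to check the reduction when the two transversion proportions are equal.

```python
import math

def k3p(P, Q, R):
    """Kimura three-parameter distance.
    P: proportion of transitions; Q and R: proportions of the two transversion types."""
    product = (1 - 2 * P - 2 * Q) * (1 - 2 * P - 2 * R) * (1 - 2 * Q - 2 * R)
    return -0.25 * math.log(product)

def k2p(P, Q_total):
    """Kimura two-parameter distance, with Q_total the proportion of all transversions."""
    return -0.5 * math.log((1 - 2 * P - Q_total) * math.sqrt(1 - 2 * Q_total))

# With Q == R the three-parameter value matches K2P with Q_total = Q + R.
print(k3p(0.10, 0.05, 0.05), k2p(0.10, 0.10))
print(k3p(0.10, 0.07, 0.03))
```

Here the total transversion proportion in the two-parameter model is split into the two classes Q and R of the three-parameter model.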
Other measures
Many other measures of genetic distance have been proposed with varying success.
Nei's DA distance 1983
Nei's DA
distance was created by Masatoshi Nei, a Japanese-American biologist, in
1983. This distance assumes that genetic differences arise due to mutation and genetic drift,
and it is known to give more reliable population
trees than other distances, particularly for microsatellite DNA data.
This method is not ideal in cases where natural selection plays a
significant role in a population's genetics.
$$D_A = 1 - \sum_\ell \sum_u \frac{\sqrt{X_u Y_u}}{L}$$
$D_A$: Nei's DA distance, the genetic distance between populations X and Y
$\ell$: A locus or gene studied, with $\sum_\ell$ being the sum over loci or genes
$X_u$ and $Y_u$: The frequencies of allele u in populations X and Y, respectively
L: The total number of loci examined
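Nei's DA distance can be sketched in Python as below, assuming the form D_A = 1 - (1/L) Σ_loci Σ_alleles sqrt(X_u Y_u). The function name `nei_da` and the example frequencies are hypothetical.

```python
import math

def nei_da(pop_x, pop_y):
    """Nei's DA distance. Each population is a list of loci, and each locus
    is a list of allele frequencies given in the same allele order."""
    n_loci = len(pop_x)
    total = sum(
        math.sqrt(xu * yu)
        for x, y in zip(pop_x, pop_y)
        for xu, yu in zip(x, y)
    )
    return 1 - total / n_loci

pop_x = [[0.5, 0.5], [0.2, 0.8]]
pop_y = [[0.9, 0.1], [0.4, 0.6]]
print(nei_da(pop_x, pop_y))
```

The value is 0 for identical allele frequencies and 1 when the populations share no alleles at any locus.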
Euclidean distance
Figure 6: Euclidean genetic distance between 51 worldwide human populations, calculated using 289,160 SNPs. Dark red is the most similar pair and dark blue is the most distant pair.
Euclidean distance is a formula derived from Euclid's Elements,
a 13-book set detailing the foundation of all Euclidean mathematics.
The foundational principles outlined in these works are used not only in
Euclidean spaces but were also expanded upon by Isaac Newton and Gottfried Leibniz
in independent pursuits to create calculus. The Euclidean distance formula is
used to convey, as simply as possible, the genetic dissimilarity
between populations, with a larger distance indicating greater
dissimilarity.
As seen in Figure 6, this method can be visualized in a graphical
manner. This is due to the work of René Descartes, who created the
fundamental principle of analytic geometry, or the Cartesian coordinate
system. In an interesting example of historical repetition, René
Descartes was not the only one who discovered the fundamental principle
of analytic geometry; this principle was also discovered independently
by Pierre de Fermat, who left his work unpublished.
$$D_{EU} = \sqrt{\sum_u (X_u - Y_u)^2}$$
$D_{EU}$: Euclidean genetic distance between populations X and Y
$X_u$ and $Y_u$: Allele frequencies at locus u in populations X and Y, respectively
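A minimal Python sketch of the Euclidean genetic distance, assuming the form sqrt(Σ (X_u - Y_u)^2) over allele frequencies; the function name and the frequencies are made up for illustration.

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two lists of allele frequencies."""
    return math.sqrt(sum((xu - yu) ** 2 for xu, yu in zip(x, y)))

x = [0.6, 0.3, 0.1]
y = [0.2, 0.3, 0.5]
print(euclidean_distance(x, y))
```

A larger value indicates greater dissimilarity between the two populations.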
Goldstein distance 1995
The Goldstein distance was specifically developed for microsatellite markers and is based on the stepwise-mutation model
(SMM). The Goldstein distance formula is modeled in such a way that its
expected value increases linearly with time; this property is
maintained even when the assumptions of single-step mutations and
a symmetrical mutation rate are violated. Goldstein distance is derived
from the average square distance model, to which Goldstein was also a
contributor.
$$(\delta\mu)^2 = \sum_\ell \frac{(\mu_X - \mu_Y)^2}{L}$$
$(\delta\mu)^2$: Goldstein genetic distance between populations X and Y
$\mu_X$ and $\mu_Y$: Mean allele sizes in populations X and Y
L: Total number of microsatellite loci examined
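The Goldstein distance can be sketched in Python as follows, assuming the form (δμ)^2 = (1/L) Σ_loci (μ_X - μ_Y)^2 with μ the mean allele size (repeat count) at a locus. The data layout, a dictionary from allele size to frequency per locus, and the function names are assumptions made for this example.

```python
def mean_allele_size(size_freqs):
    """Mean allele size at one locus; size_freqs maps allele size -> frequency."""
    return sum(size * freq for size, freq in size_freqs.items())

def goldstein_distance(pop_x, pop_y):
    """Average over loci of the squared difference in mean allele size."""
    n_loci = len(pop_x)
    return sum(
        (mean_allele_size(x) - mean_allele_size(y)) ** 2
        for x, y in zip(pop_x, pop_y)
    ) / n_loci

# Two microsatellite loci with made-up repeat counts and frequencies.
pop_x = [{10: 0.5, 12: 0.5}, {20: 1.0}]
pop_y = [{12: 0.5, 14: 0.5}, {20: 1.0}]
print(goldstein_distance(pop_x, pop_y))
```

In this example the mean sizes differ by 2 repeats at the first locus and not at all at the second.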
Nei's minimum genetic distance 1973
This calculation represents the minimum number of codon differences per locus. The measurement is based on the assumption that genetic differences arise due to mutation and genetic drift.
$$D_m = \frac{J_X + J_Y}{2} - J_{XY}$$
$D_m$: Minimum number of codon differences per locus
$J_X$ and $J_Y$: Average probability of two members of the X population, and of two members of the Y population, having the same allele
$J_{XY}$: Average probability of members of the X and Y populations having the same allele
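A short Python sketch of Nei's minimum distance, assuming the form D_m = (J_X + J_Y)/2 - J_XY, with each J computed from allele frequencies as a mean over loci. The function names and the example frequencies are hypothetical.

```python
def mean_identity(pop_a, pop_b):
    """Mean over loci of the probability that one allele drawn from each
    population is the same (sum of products of allele frequencies)."""
    total = sum(sum(a * b for a, b in zip(la, lb)) for la, lb in zip(pop_a, pop_b))
    return total / len(pop_a)

def nei_minimum_distance(pop_x, pop_y):
    j_x = mean_identity(pop_x, pop_x)
    j_y = mean_identity(pop_y, pop_y)
    j_xy = mean_identity(pop_x, pop_y)
    return (j_x + j_y) / 2 - j_xy

pop_x = [[0.5, 0.5], [1.0, 0.0]]
pop_y = [[0.5, 0.5], [0.0, 1.0]]
print(nei_minimum_distance(pop_x, pop_y))
```

For these frequencies J_X and J_Y are both 0.75 and J_XY is 0.25, so the distance is 0.5.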
Czekanowski (Manhattan) Distance
Figure 7: Representation of path between points that is calculated for the Czekanowski (Manhattan) distance formula.
Similar to Euclidean distance, Czekanowski distance involves
calculating the distance between points of allele frequency that are
graphed on axes created by $X_u$ and $Y_u$. However, Czekanowski assumes a direct
path is not available and sums the sides of the triangle formed by the
data points instead of finding the hypotenuse. This formula is nicknamed
the Manhattan distance because its methodology is similar to the layout
of the New York City borough. Manhattan is mainly built on a grid system
requiring residents to make only 90-degree turns during travel, which
parallels the thinking of the formula.
$$D_{Cz} = \sum_u |X_u - Y_u|$$
which for two points $(A_x, A_y)$ and $(B_x, B_y)$ is $|A_x - B_x| + |A_y - B_y|$.
$X_u$ and $Y_u$: Allele frequencies at locus u in populations X and Y, respectively
$A_x$ and $B_x$: X-axis value of the frequency of an allele for X and Y populations
$A_y$ and $B_y$: Y-axis value of the frequency of an allele for X and Y populations
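The idea of summing the sides rather than taking the hypotenuse can be sketched in Python as a sum of absolute differences. This is the plain Manhattan form; whether a normalizing factor is applied is not stated in the text, so none is used here, and the function name is hypothetical.

```python
def manhattan_distance(x, y):
    """Sum of absolute differences between two lists of allele frequencies."""
    return sum(abs(xu - yu) for xu, yu in zip(x, y))

x = [0.5, 0.5]
y = [0.25, 0.75]
print(manhattan_distance(x, y))
```

For the same pair of points this value is never smaller than the Euclidean distance, since the two sides of a right triangle are together at least as long as its hypotenuse.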
Rogers' Distance 1972
Figure 8: Representation of path between points that is calculated for the Rogers' distance formula.
Similar to Czekanowski distance, Rogers' distance involves
calculating the distance between points of allele frequency. However,
this method takes the direct distance between the points.
$$D_R = \frac{1}{L} \sum_\ell \sqrt{\frac{\sum_u (X_u - Y_u)^2}{2}}$$
$X_u$ and $Y_u$: Allele frequencies at locus u in populations X and Y, respectively
$L$: Total number of microsatellite loci examined
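A minimal Python sketch of Rogers' distance, assuming the commonly cited form D_R = (1/L) Σ_loci sqrt(Σ_alleles (X_u - Y_u)^2 / 2). The function name and example frequencies are hypothetical.

```python
import math

def rogers_distance(pop_x, pop_y):
    """Rogers' distance. Each population is a list of loci, and each locus
    is a list of allele frequencies given in the same allele order."""
    n_loci = len(pop_x)
    total = sum(
        math.sqrt(sum((xu - yu) ** 2 for xu, yu in zip(x, y)) / 2)
        for x, y in zip(pop_x, pop_y)
    )
    return total / n_loci

pop_x = [[1.0, 0.0], [0.5, 0.5]]
pop_y = [[0.0, 1.0], [0.5, 0.5]]
print(rogers_distance(pop_x, pop_y))
```

The division by 2 inside the square root keeps each per-locus term between 0 and 1.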
Limitations of Simple Distance Formulas
While
these formulas are easy and quick to calculate, the information
they provide is limited. The results of these
formulas do not account for the potential effects of the number of codon
changes between populations, or the separation time between populations.
A commonly used measure of genetic distance is the fixation index (FST),
which varies between 0 and 1. A value of 0 indicates that two
populations are genetically identical (minimal or no genetic differentiation
between the two populations) whereas a value of 1 indicates that two
populations are genetically different (maximum genetic differentiation between
the two populations). No mutation is assumed. Large populations between
which there is much migration, for example, tend to be little
differentiated whereas small populations between which there is little
migration tend to be greatly differentiated. FST is a convenient measure of this differentiation, and as a result FST and related statistics are among the most widely used descriptive statistics in population and evolutionary genetics. But FST is more than a descriptive statistic and measure of genetic differentiation. FST is directly related to the variance in allele frequency among populations and conversely to the degree of resemblance among individuals within populations. If FST
is small, it means that allele frequencies in the different populations are
very similar; if it is large, it means that allele frequencies are very
different.
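The link between FST and the variance in allele frequency can be illustrated with a small Python sketch. It assumes one common definition for a single biallelic locus, FST = Var(p) / (p̄(1 - p̄)), with equally weighted populations and the population (not sample) variance. The function name `fst_biallelic` is hypothetical, and other estimators exist.

```python
def fst_biallelic(freqs):
    """FST for one biallelic locus from the frequency of one allele in each
    population, assuming equally weighted populations."""
    n = len(freqs)
    p_bar = sum(freqs) / n
    if p_bar in (0.0, 1.0):
        return 0.0  # allele absent or fixed everywhere: no differentiation
    variance = sum((p - p_bar) ** 2 for p in freqs) / n  # population variance
    return variance / (p_bar * (1 - p_bar))

print(fst_biallelic([0.5, 0.5]))  # identical frequencies
print(fst_biallelic([0.4, 0.6]))  # slightly different frequencies
print(fst_biallelic([0.0, 1.0]))  # fixed for different alleles
```

Identical frequencies give 0, and populations fixed for different alleles give 1, matching the two extremes described above.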