From Wikipedia, the free encyclopedia
The history of molecular evolution starts in the early 20th century with "comparative biochemistry", but the field of molecular evolution came into its own in the 1960s and 1970s, following the rise of molecular biology. The advent of protein sequencing
allowed molecular biologists to create phylogenies based on sequence
comparison, and to use the differences between homologous sequences as a
molecular clock to estimate the time since the last common ancestor. In the late 1960s, the neutral theory of molecular evolution
provided a theoretical basis for the molecular clock, though both the
clock and the neutral theory were controversial, since most evolutionary
biologists held strongly to panselectionism, with natural selection
as the only important cause of evolutionary change. After the 1970s,
nucleic acid sequencing allowed molecular evolution to reach beyond
proteins to highly conserved ribosomal RNA sequences, the foundation of a reconceptualization of the early history of life.
Early history
Before the rise of molecular biology
in the 1950s and 1960s, a small number of biologists had explored the
possibilities of using biochemical differences between species to study evolution.
Alfred Sturtevant predicted the existence of chromosomal inversions in 1921 and with Dobzhansky
constructed one of the first molecular phylogenies on 17 Drosophila
pseudoobscura strains from the accumulation of chromosomal inversions
observed from the hybridization of polytene chromosomes.
Ernest Baldwin worked extensively on comparative biochemistry beginning in the 1930s, and Marcel Florkin pioneered techniques for constructing phylogenies
based on molecular and biochemical characters in the 1940s. However, it
was not until the 1950s that biologists developed techniques for
producing biochemical data for the quantitative study of molecular evolution.
The first molecular systematics research was based on immunological assays and protein "fingerprinting" methods. Alan Boyden—building on immunological methods of George Nuttall—developed new techniques beginning in 1954, and in the early 1960s Curtis Williams and Morris Goodman used immunological comparisons to study primate phylogeny. Others, such as Linus Pauling and his students, applied newly developed combinations of electrophoresis and paper chromatography to proteins subject to partial digestion by digestive enzymes to create unique two-dimensional patterns, allowing fine-grained comparisons of homologous proteins.
Beginning in the 1950s, a few naturalists also experimented with molecular approaches—notably Ernst Mayr and Charles Sibley.
While Mayr quickly soured on paper chromatography, Sibley successfully
applied electrophoresis to egg-white proteins to sort out problems in
bird taxonomy, and soon supplemented that with DNA hybridization techniques, the beginning of a long career built on molecular systematics.
While such early biochemical techniques found grudging acceptance
in the biology community, for the most part they did not impact the
main theoretical problems of evolution and population genetics. This
would change as molecular biology shed more light on the physical and
chemical nature of genes.
Genetic load, the classical/balance controversy, and the measurement of heterozygosity
At the time that molecular biology was coming into its own in the 1950s,
there was a long-running debate—the classical/balance controversy—over
the causes of heterosis, the increase in fitness observed when inbred lines are outcrossed. In 1950, James F. Crow offered two different explanations (later dubbed the classical and balance positions) based on the paradox first articulated by J. B. S. Haldane
in 1937: the effect of deleterious mutations on the average fitness of a
population depends only on the rate of mutations (not the degree of
harm caused by each mutation) because more-harmful mutations are
eliminated more quickly by natural selection, while less-harmful
mutations remain in the population longer. H. J. Muller dubbed this "genetic load".
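Haldane's argument can be checked numerically. The following is a minimal sketch, assuming the textbook case of a fully recessive deleterious allele at mutation-selection balance (a simplification not spelled out above). Under that assumption the equilibrium allele frequency is approximately sqrt(u/s) and the resulting load is s times that frequency squared, which reduces to the mutation rate u whatever the value of s. The function names and the example mutation rate are illustrative only.

```python
import math


def recessive_equilibrium_frequency(u, s):
    """Approximate equilibrium frequency of a fully recessive deleterious
    allele with mutation rate u and selection coefficient s (0 < u < s <= 1)."""
    return math.sqrt(u / s)


def recessive_load(u, s):
    """Reduction in mean fitness at equilibrium: s * q^2, which is roughly u."""
    q = recessive_equilibrium_frequency(u, s)
    return s * q * q


u = 1e-5  # illustrative mutation rate per locus per generation
for s in (0.01, 0.1, 1.0):
    q = recessive_equilibrium_frequency(u, s)
    print(f"s={s}: equilibrium frequency={q:.5f}, load={recessive_load(u, s):.2e}")
```

A milder mutation (smaller s) settles at a higher frequency, but the product s * q^2 stays at about u, which is the sense in which the load depends only on the mutation rate.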
Muller, motivated by his concern about the effects of radiation
on human populations, argued that heterosis is primarily the result of
deleterious homozygous recessive alleles, the effects of which are
masked when separate lines are crossed—this was the dominance hypothesis, part of what Dobzhansky labeled the classical position.
Thus, ionizing radiation and the resulting mutations produce
considerable genetic load even if death or disease does not occur in the
exposed generation, and in the absence of mutation natural selection
will gradually increase the level of homozygosity. Bruce Wallace, working with J. C. King, used the overdominance hypothesis to develop the balance position, which left a larger place for overdominance
(where the heterozygous state of a gene is more fit than the homozygous
states). In that case, heterosis is simply the result of the increased
expression of heterozygote advantage.
If overdominant loci are common, then a high level of heterozygosity
would result from natural selection, and mutation-inducing radiation may
in fact facilitate an increase in fitness due to overdominance. (This
was also the view of Dobzhansky.)
Debate continued through the 1950s, gradually becoming a central focus of population genetics. A 1958 study of Drosophila by Wallace suggested that radiation-induced mutations increased
the viability of previously homozygous flies, providing evidence for
heterozygote advantage and the balance position; Wallace estimated that
50% of loci in natural Drosophila populations were heterozygous. Motoo Kimura's
subsequent mathematical analyses reinforced what Crow had suggested in
1950: that even if overdominant loci are rare, they could be responsible
for a disproportionate amount of genetic variability. Accordingly,
Kimura and his mentor Crow came down on the side of the classical
position. Further collaboration between Crow and Kimura led to the infinite alleles model,
which could be used to calculate the number of different alleles
expected in a population, based on population size, mutation rate, and
whether the mutant alleles were neutral, overdominant, or deleterious.
Thus, the infinite alleles model offered a potential way to decide
between the classical and balance positions, if accurate values for the
level of heterozygosity could be found.
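For the neutral case, the infinite alleles model gives a simple closed form that shows how such a comparison could work. The sketch below assumes the standard result for strictly neutral alleles in a diploid population, where expected heterozygosity is theta / (1 + theta) with theta = 4 * N_e * u. The overdominant and deleterious cases need different formulas and are not shown. The function names and example values are illustrative assumptions.

```python
def expected_heterozygosity(effective_population_size, mutation_rate):
    """Expected equilibrium heterozygosity for strictly neutral alleles
    under the infinite alleles model: theta / (1 + theta)."""
    theta = 4.0 * effective_population_size * mutation_rate
    return theta / (1.0 + theta)


def theta_from_heterozygosity(heterozygosity):
    """Invert the relation above: theta = H / (1 - H), for 0 <= H < 1."""
    if not 0.0 <= heterozygosity < 1.0:
        raise ValueError("heterozygosity must be in [0, 1)")
    return heterozygosity / (1.0 - heterozygosity)


# Illustrative values only: N_e = 10,000 and u = 2.5e-5 give theta = 1.
print(expected_heterozygosity(1e4, 2.5e-5))
# The value of theta that a measured heterozygosity would imply under neutrality.
print(theta_from_heterozygosity(0.12))
```

Because only the product of population size and mutation rate enters, a measured heterozygosity constrains theta but not the two parameters separately.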
By the mid-1960s, the techniques of biochemistry and molecular biology—in particular protein electrophoresis—provided
a way to measure the level of heterozygosity in natural populations: a
possible means to resolve the classical/balance controversy. In 1963, Jack L. Hubby published an electrophoresis study of protein variation in Drosophila; soon after, Hubby began collaborating with Richard Lewontin
to apply Hubby's method to the classical/balance controversy by
measuring the proportion of heterozygous loci in natural populations.
Their two landmark papers, published in 1966, established a significant
level of heterozygosity for Drosophila (12%, on average).
However, these findings proved difficult to interpret. Most population
geneticists (including Hubby and Lewontin) rejected the possibility of
widespread neutral mutations; explanations that did not involve
selection were anathema to mainstream evolutionary biology. Hubby and
Lewontin also ruled out heterozygote advantage as the main cause because
of the segregation load it would entail, though critics argued that the findings actually fit well with the overdominance hypothesis.
Protein sequences and the molecular clock
While evolutionary biologists were tentatively branching out into molecular
biology, molecular biologists were rapidly turning their attention
toward evolution.
After developing the fundamentals of protein sequencing with insulin between 1951 and 1955, Frederick Sanger and his colleagues had published a limited interspecies comparison of the insulin sequence in 1956. Francis Crick, Charles Sibley
and others recognized the potential for using biological sequences to
construct phylogenies, though few such sequences were yet available. By
the early 1960s, techniques for protein sequencing had advanced to the point that direct comparison of homologous amino acid sequences was feasible. In 1961, Emanuel Margoliash and his collaborators completed the sequence for horse cytochrome c (a longer and more widely distributed protein than insulin), followed in short order by a number of other species.
In 1962, Linus Pauling and Emile Zuckerkandl proposed using the number of differences between homologous protein sequences to estimate the time since divergence, an idea Zuckerkandl had conceived around 1960 or 1961. This began with Pauling's long-time research focus, hemoglobin, which was being sequenced by Walter Schroeder;
the sequences not only supported the accepted vertebrate phylogeny, but
also the hypothesis (first proposed in 1957) that the different globin
chains within a single organism could also be traced to a common
ancestral protein. Between 1962 and 1965, Pauling and Zuckerkandl refined and elaborated this idea, which they dubbed the molecular clock, and Emil L. Smith
and Emanuel Margoliash expanded the analysis to cytochrome c. Early
molecular clock calculations agreed fairly well with established
divergence times based on paleontological evidence. However, the
essential idea of the molecular clock—that individual proteins evolve at
a regular rate independent of a species' morphological evolution—was extremely provocative (as Pauling and Zuckerkandl intended it to be).
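The molecular clock calculation can be sketched in a few lines. This is a minimal illustration in a common textbook form, not necessarily the procedure Pauling and Zuckerkandl used: count the proportion of differing sites between two aligned homologous sequences, apply a Poisson correction for multiple substitutions at the same site, and divide by twice the per-lineage rate. The sequences and the rate below are made up for illustration, and all function names are assumptions.

```python
import math


def proportion_different(seq_a, seq_b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def poisson_distance(p):
    """Poisson-corrected substitutions per site for an observed proportion p < 1."""
    return -math.log(1.0 - p)


def divergence_time(distance, rate_per_site_per_year):
    """Time since the common ancestor; both lineages accumulate changes,
    so the distance is divided by twice the per-lineage rate."""
    return distance / (2.0 * rate_per_site_per_year)


# Hypothetical ten-residue fragments, not real protein sequences.
p = proportion_different("MKTAYIAKQR", "MKTAYLAKQK")
d = poisson_distance(p)
print(p, d, divergence_time(d, 1e-9))  # rate of 1e-9 per site per year is assumed
```

The key assumption is the constant rate: if the rate varies between lineages or over time, the estimated divergence time is biased accordingly.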
The "molecular wars"
From the early 1960s, molecular biology was increasingly seen as a threat to
the traditional core of evolutionary biology. Established evolutionary
biologists—particularly Ernst Mayr, Theodosius Dobzhansky and G. G. Simpson, three of the founders of the modern evolutionary synthesis
of the 1930s and 1940s—were extremely skeptical of molecular
approaches, especially when it came to the connection (or lack thereof)
to natural selection.
Molecular evolution in general—and the molecular clock in
particular—offered little basis for exploring evolutionary causation.
According to the molecular clock hypothesis, proteins evolved
essentially independently of the environmentally determined forces of
selection; this was sharply at odds with the panselectionism
prevalent at the time. Moreover, Pauling, Zuckerkandl, and other
molecular biologists were increasingly bold in asserting the
significance of "informational macromolecules" (DNA, RNA and proteins)
for all biological processes, including evolution.
The struggle between evolutionary biologists and molecular
biologists—with each group holding up their discipline as the center of
biology as a whole—was later dubbed the "molecular wars" by Edward O. Wilson,
who experienced firsthand the domination of his biology department by
young molecular biologists in the late 1950s and the 1960s.
In 1961, Mayr began arguing for a clear distinction between functional biology (which considered proximate causes and asked "how" questions) and evolutionary biology (which considered ultimate causes and asked "why" questions). He argued that both disciplines and individual scientists could be classified on either the functional or evolutionary
side, and that the two approaches to biology were complementary. Mayr,
Dobzhansky, Simpson and others used this distinction to argue for the
continued relevance of organismal biology, which was rapidly losing
ground to molecular biology and related disciplines in the competition
for funding and university support. It was in that context that Dobzhansky first published his famous statement, "nothing in biology makes sense except in the light of evolution",
in a 1964 paper affirming the importance of organismal biology in the
face of the molecular threat; Dobzhansky characterized the molecular
disciplines as "Cartesian" (reductionist) and organismal disciplines as "Darwinian".
Mayr and Simpson attended many of the early conferences where
molecular evolution was discussed, critiquing what they saw as the
overly simplistic approaches of the molecular clock. The molecular
clock, based on uniform rates of genetic change driven by random
mutations and drift, seemed incompatible with the varying rates of
evolution and environmentally driven adaptive processes (such as adaptive radiation)
that were among the key developments of the evolutionary synthesis. At
the 1962 Wenner-Gren conference, the 1964 Colloquium on the Evolution of
Blood Proteins in Bruges, Belgium, and the 1964 Conference on Evolving Genes and Proteins at Rutgers University,
they engaged directly with the molecular biologists and biochemists,
hoping to maintain the central place of Darwinian explanations in
evolution as its study spread to new fields.
Gene-centered view of evolution
Though not directly related to molecular evolution, the mid-1960s also saw the rise of the gene-centered view of evolution, spurred by George C. Williams's Adaptation and Natural Selection (1966). Debate over units of selection, particularly the controversy over group selection,
led to increased focus on individual genes (rather than whole organisms
or populations) as the theoretical basis for evolution. However, the
increased focus on genes did not mean a focus on molecular evolution; in
fact, the adaptationism
promoted by Williams and other evolutionary theorists further
marginalized the apparently non-adaptive changes studied by molecular
evolutionists.
The neutral theory of molecular evolution
The intellectual threat of molecular evolution became more explicit in 1968, when Motoo Kimura introduced the neutral theory of molecular evolution.
Based on the available molecular clock studies (of hemoglobin from a
wide variety of mammals, cytochrome c from mammals and birds, and triosephosphate dehydrogenase from rabbits and cows), Kimura (assisted by Tomoko Ohta) calculated an average rate of DNA substitution of one base pair
change per 300 base pairs (encoding 100 amino acids) per 28 million
years. For mammal genomes, this indicated a substitution rate of one
every 1.8 years, which would produce an unsustainably high substitution load
unless the preponderance of substitutions was selectively neutral.
Kimura argued that neutral mutations occur very frequently, a conclusion
compatible with the results of the electrophoretic studies of protein
heterozygosity. Kimura also applied his earlier mathematical work on
genetic drift to explain how neutral mutations could come to fixation,
even in the absence of natural selection; he soon convinced James F.
Crow of the potential power of neutral alleles and genetic drift as
well.
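The step from the per-protein rate to a genome-wide figure of roughly one substitution every 1.8 years can be reproduced with simple arithmetic. The sketch below uses the 28-million-year and 300-base-pair figures from the text. The genome size of about 4 x 10^9 base pairs and the factor of 1.2 base changes per amino acid change (to allow for synonymous substitutions) are not stated above; they are assumptions of this sketch, chosen because they recover the quoted figure.

```python
YEARS_PER_CHAIN_SUBSTITUTION = 28e6  # from the text: one change per 100 amino acids
BASE_PAIRS_PER_CHAIN = 300           # from the text: 300 base pairs encode 100 amino acids
GENOME_BASE_PAIRS = 4e9              # assumption: approximate mammalian genome size
BASE_CHANGES_PER_AA_CHANGE = 1.2     # assumption: allowance for synonymous changes


def years_per_genome_substitution(
    years_per_chain_substitution=YEARS_PER_CHAIN_SUBSTITUTION,
    base_pairs_per_chain=BASE_PAIRS_PER_CHAIN,
    genome_base_pairs=GENOME_BASE_PAIRS,
    base_changes_per_aa_change=BASE_CHANGES_PER_AA_CHANGE,
):
    """Average number of years between base-pair substitutions anywhere in the genome."""
    chain_equivalents = genome_base_pairs / base_pairs_per_chain
    return years_per_chain_substitution / chain_equivalents / base_changes_per_aa_change


print(years_per_genome_substitution())
```

With these inputs the result is 1.75 years, which rounds to the 1.8 quoted above; without the synonymous-change factor it would be 2.1 years.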
Kimura's theory, described only briefly in a letter to Nature, was followed shortly after by a more substantial analysis by Jack L. King and Thomas H. Jukes, who titled their first paper on the subject "non-Darwinian evolution".
Though King and Jukes produced much lower estimates of substitution
rates and the resulting genetic load in the case of non-neutral changes,
they agreed that neutral mutations driven by genetic drift were both
real and significant. The fairly constant rates of evolution observed
for individual proteins were not easily explained without invoking
neutral substitutions (though G. G. Simpson and Emil Smith had tried).
King and Jukes also found a strong correlation between the frequency of
amino acids and the number of different codons encoding each amino acid.
This pointed to substitutions in protein sequences as being largely the
product of random genetic drift.
King and Jukes' paper, especially with the provocative title, was
seen as a direct challenge to mainstream neo-Darwinism, and it brought
molecular evolution and the neutral theory to the center of evolutionary
biology. It provided a mechanism for the molecular clock and a
theoretical basis for exploring deeper issues of molecular evolution,
such as the relationship between rate of evolution and functional
importance. The rise of the neutral theory marked a synthesis of
evolutionary biology and molecular biology—though an incomplete one.
With their work on firmer theoretical footing, in 1971 Emile Zuckerkandl and other molecular evolutionists founded the Journal of Molecular Evolution.
The neutralist-selectionist debate and near-neutrality
The critical responses to the neutral theory that soon appeared marked the beginning of the neutralist-selectionist debate.
In short, selectionists viewed natural selection as the primary or
only cause of evolution, even at the molecular level, while neutralists
held that neutral mutations were widespread and that genetic drift was a
crucial factor in the evolution of proteins. Kimura became the most
prominent defender of the neutral theory—which would be his main focus
for the rest of his career. With Ohta, he refocused his arguments on
the rate at which drift could fix new mutations in finite populations,
the significance of constant protein evolution rates, and the functional
constraints on protein evolution that biochemists and molecular
biologists had described. Though Kimura had initially developed the
neutral theory partly as an outgrowth of the classical position
within the classical/balance controversy (predicting high genetic load
as a consequence of non-neutral mutations), he gradually deemphasized
his original argument that the substitution load would be impossibly high
without neutral mutations (which many selectionists, and even fellow
neutralists King and Jukes, rejected).
From the 1970s through the early 1980s, both selectionists and
neutralists could explain the observed high levels of heterozygosity in
natural populations, by assuming different values for unknown
parameters. Early in the debate, Kimura's student Tomoko Ohta
focused on the interaction between natural selection and genetic drift,
which was significant for mutations that were not strictly neutral, but
nearly so. In such cases, selection would compete with drift: most
slightly deleterious mutations would be eliminated by natural selection
or chance; some would move to fixation through drift. The behavior of
this type of mutation, described by an equation that combined the
mathematics of the neutral theory with classical models, became the
basis of Ohta's nearly neutral theory of molecular evolution.
In 1973, Ohta published a short letter in Nature
suggesting that a wide variety of molecular evidence supported the
theory that most mutation events at the molecular level are slightly
deleterious rather than strictly neutral. Molecular evolutionists were
finding that while rates of protein evolution (consistent with the molecular clock) were fairly independent of generation time, rates of noncoding DNA
divergence were inversely proportional to generation time. Noting that
population size is generally inversely proportional to generation time,
Tomoko Ohta proposed that most amino acid substitutions are slightly
deleterious while noncoding DNA substitutions are more neutral. In this
case, the faster rate of neutral evolution in proteins expected in
small populations (due to genetic drift) is offset by longer generation
times (and vice versa), but in large populations with short generation
times, noncoding DNA evolves faster while protein evolution is retarded
by selection (which is more significant than drift for large
populations).
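The competition between selection and drift described above can be illustrated with a fixation probability. The sketch below assumes Kimura's diffusion approximation for a new mutation with selection coefficient s in a diploid population whose effective size equals its census size N. It is an illustration of the general idea, not a reconstruction of Ohta's own equations, and the function names and parameter values are assumptions.

```python
import math


def fixation_probability(population_size, s):
    """Probability that a single new mutation with selection coefficient s
    eventually fixes: (1 - exp(-2s)) / (1 - exp(-4Ns)), or 1/(2N) when s = 0."""
    if s == 0:
        return 1.0 / (2 * population_size)
    try:
        return math.expm1(-2.0 * s) / math.expm1(-4.0 * population_size * s)
    except OverflowError:
        # Strongly deleterious in a very large population: effectively never fixes.
        return 0.0


def relative_to_neutral(population_size, s):
    """Fixation probability divided by the neutral expectation 1/(2N)."""
    return fixation_probability(population_size, s) * 2 * population_size


s = -1e-4  # a slightly deleterious mutation (illustrative value)
for n in (100, 10_000, 1_000_000):
    print(n, relative_to_neutral(n, s))
```

In the small population the slightly deleterious mutation fixes almost as often as a neutral one, while in the large population selection removes it almost every time, which is the pattern the nearly neutral theory relies on.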
Between then and the early 1990s, many studies of molecular
evolution used a "shift model" in which the negative effect on the
fitness of a population due to deleterious mutations shifts back to an
original value when a mutation reaches fixation. In the early 1990s,
Ohta developed a "fixed model" that included both beneficial and
deleterious mutations, so that no artificial "shift" of overall
population fitness was necessary.
According to Ohta, however, the nearly neutral theory largely fell out
of favor in the late 1980s, because the mathematically simpler
neutral theory was preferred for the widespread molecular systematics research that flourished after the advent of rapid DNA sequencing.
As more detailed systematics studies started to compare the evolution
of genome regions subject to strong selection versus weaker selection in
the 1990s, the nearly neutral theory and the interaction between
selection and drift have once again become an important focus of
research.
Microbial phylogeny
While early work in molecular evolution focused on readily sequenced proteins
and relatively recent evolutionary history, by the late 1960s some
molecular biologists were pushing further toward the base of the tree of
life by studying highly conserved nucleic acid sequences. Carl Woese, a molecular biologist whose earlier work was on the genetic code and its origin, began using small subunit ribosomal RNA
to reclassify bacteria by genetic (rather than morphological)
similarity. Work proceeded slowly at first, but accelerated as new
sequencing methods were developed in the 1970s and 1980s. By 1977,
Woese and George Fox announced that some bacteria, such as methanogens,
lacked the rRNA units that Woese's phylogenetic studies were based on;
they argued that these organisms were actually distinct enough from
conventional bacteria and the so-called higher organisms to form their own kingdom, which they called archaebacteria. Though controversial at first (and challenged again in the late 1990s), Woese's work became the basis of the modern three-domain system of Archaea, Bacteria, and Eukarya (replacing the five-kingdom system that had emerged in the 1960s).
Work on microbial phylogeny also brought molecular evolution closer to cell biology and origin of life
research. The differences between archaea pointed to the importance of
RNA in the early history of life. In his work with the genetic code,
Woese had suggested RNA-based life had preceded the current forms of
DNA-based life, as had several others before him—an idea that Walter Gilbert would later call the "RNA world".
In many cases, genomics research in the 1990s produced phylogenies
contradicting the rRNA-based results, leading to the recognition of
widespread lateral gene transfer across distinct taxa. Combined with the probable endosymbiotic origin of organelle-filled
eukarya, this pointed to a far more complex picture of the origin and
early history of life, one which might not be describable in the
traditional terms of common ancestry.