Archaeogenetics is the study of ancient DNA using various molecular genetic methods and DNA resources. This form of genetic analysis can be applied to human, animal, and plant specimens. Ancient DNA can be extracted from various fossilized specimens including bones, eggshells, and artificially preserved tissues in human and animal specimens. In plants, ancient DNA can be extracted from seeds and tissue, and in some cases from feces. Archaeogenetics provides genetic evidence of ancient population group migrations,[1] domestication events, and plant and animal evolution.[2] Cross-referencing ancient DNA with the DNA of related modern populations allows researchers to run comparison studies that provide a more complete analysis when the ancient DNA is compromised.[3]
Archaeogenetics receives its name from the Greek word arkhaios meaning ancient, and the term genetics, meaning the study of heredity.[4] The term archaeogenetics was conceived by archaeologist Colin Renfrew.[5]
Early work
Ludwik Hirszfeld (1884–1954)
Ludwik Hirszfeld was a Polish microbiologist and serologist who was the President of the Blood Group Section of the Second International Congress of Blood Transfusion. Together with Erich von Dungern, he established the inheritance of blood groups in 1910, and he contributed to the field throughout his life.[6] He studied ABO blood groups. In a 1919 study, Hirszfeld documented the ABO blood groups and hair color of people at the Macedonian front, which led to his discovery that hair color and blood type are not correlated. He also observed that the frequency of blood group A decreased from western Europe to India, while that of blood group B increased. He hypothesized that this east-to-west gradient stemmed from two populations, one consisting mainly of group A and the other mainly of group B, each having mutated from blood group O, that mixed through migration or intermingling. Much of his work investigated the links between blood types and sex, disease, climate, age, social class, and race. He found that peptic ulcers were more prevalent in people with blood group O, and that mothers with blood type AB had a high male-to-female birth ratio.[7]
Arthur Mourant (1904–1994)
Arthur Mourant was a British hematologist and chemist. He received many awards, most notably Fellowship of the Royal Society. His work included organizing the existing data on blood group gene frequencies, and largely contributing to the genetic map of the world through his investigation of blood groups in many populations. Mourant discovered the new blood group antigens of the Lewis, Henshaw, Kell, and Rhesus systems, and analyzed the association of blood groups and various other diseases. He also focused on the biological significance of polymorphisms. His work provided the foundation for archaeogenetics because it facilitated the separation of genetic evidence for biological relationships between people. This genetic evidence was previously used for that purpose. It also provided material that could be used to appraise the theories of population genetics.[8]
William Boyd (1903–1983)
William Boyd was an American immunochemist and biochemist who became famous for his research on the genetics of race in the 1950s.[9] During the 1940s, Boyd and Karl O. Renkonen independently discovered that lectins react differently to various blood types, after finding that crude extracts of the lima bean and tufted vetch agglutinated red blood cells of blood type A but not of blood types B or O. This ultimately led to the identification of thousands of plants that contain these proteins.[10] To examine racial differences and the distribution and migration patterns of various racial groups, Boyd systematically collected and classified blood samples from around the world, leading to his discovery that blood groups are inherited and are not influenced by the environment. In his book Genetics and the Races of Man (1950), Boyd categorized the world population into 13 distinct races, based on their different blood type profiles and his idea that human races are populations with differing alleles.[11][12] The study of blood groups remains one of the most abundant sources of information on inheritable traits linked to race.[12]
Methods
Fossil DNA preservation
Fossil retrieval starts with selecting an excavation site. Potential excavation sites are usually identified from the mineralogy of the location and visual detection of bones in the area. Excavation zones can also be discovered using technology such as field-portable X-ray fluorescence[13] and dense stereo reconstruction.[14] Tools used include knives, brushes, and pointed trowels, which assist in the removal of fossils from the earth.[15]
To avoid contaminating the ancient DNA, specimens are handled with gloves and stored at -20 °C immediately after being unearthed. Analyzing the fossil sample in a lab that has not been used for other DNA analysis can also help prevent contamination.[15][16] Bones are milled to a powder and treated with a solution before the polymerase chain reaction (PCR) process.[16] Samples for DNA amplification need not be fossil bones. Preserved skin, whether salt-preserved or air-dried, can also be used in certain situations.[17]
DNA preservation is difficult because bone degrades during fossilisation and DNA is chemically modified, usually by bacteria and fungi in the soil. The best time to extract DNA from a fossil is when it is freshly out of the ground, as it then contains six times as much DNA as stored bones. The temperature of the extraction site also affects the amount of obtainable DNA, as shown by a lower success rate for DNA amplification when the fossil is found in warmer regions. A drastic change in a fossil's environment also affects DNA preservation. Since excavation causes an abrupt change in the fossil's environment, it may lead to physicochemical changes in the DNA molecule. DNA preservation is also affected by other factors such as the treatment of the unearthed fossil (e.g. washing, brushing and sun drying), pH, irradiation, the chemical composition of bone and soil, and hydrology. There are three diagenetic phases of preservation. The first phase is bacterial putrefaction, which is estimated to cause a 15-fold degradation of DNA. The second phase is the chemical degradation of bone, mostly by depurination. The third diagenetic phase occurs after the fossil is excavated and stored, and it is in this phase that bone DNA degradation occurs most rapidly.[16]
Methods of DNA extraction
Once a specimen is collected from an archaeological site, DNA can be extracted through a series of processes.[18] One of the more common methods utilizes silica and takes advantage of polymerase chain reactions in order to collect ancient DNA from bone samples.[19]
Several challenges make it difficult to extract ancient DNA from fossils and prepare it for analysis. DNA is continuously being broken into fragments. While the organism is alive these breaks are repaired; once an organism has died, however, the DNA begins to deteriorate without repair. As a result, samples contain strands of DNA measuring around 100 base pairs in length. Contamination is another significant challenge at multiple steps throughout the process. Often other DNA, such as bacterial DNA, will be present in the original sample. To avoid contamination it is necessary to take many precautions, such as separate ventilation systems and workspaces for ancient DNA extraction work.[20] The best samples to use are fresh fossils, as careless washing can lead to mold growth.[18] DNA from fossils also occasionally contains a compound that inhibits DNA replication.[21] Reaching a consensus on which methods best mitigate these challenges is also difficult because the uniqueness of specimens limits repeatability.[20]
Silica-based DNA extraction is a purification method used to extract DNA from archaeological bone artifacts and yield DNA that can be amplified using polymerase chain reaction (PCR) techniques.[21] The process uses silica to bind DNA and separate it from other components of the fossil that inhibit PCR amplification. However, silica itself is also a strong PCR inhibitor, so care must be taken to ensure that silica is removed from the DNA after extraction.[22] The general process for extracting DNA using the silica-based method is as follows:[19]
- Bone specimen is cleaned and the outer layer is scraped off
- Sample is collected from preferably compact section
- Sample is ground to fine powder and added to an extraction solution to release DNA
- Silica solution is added and centrifuged to facilitate DNA binding
- Binding solution is removed and a buffer is added to the solution to release the DNA from the silica
Polymerase chain reaction is a process that can amplify segments of DNA and is often used on extracted ancient DNA. It has three main steps: denaturation, annealing, and extension. Denaturation splits the DNA into two single strands at high temperatures. Annealing involves attaching primer strands of DNA to the single strands, which allows Taq polymerase to attach to the DNA. Extension occurs when Taq polymerase is added to the sample and matches base pairs to turn the two single strands into two complete double strands.[18] This cycle is repeated many times, and usually more times when used with ancient DNA.[23] One issue with PCR is that it requires overlapping primer pairs for ancient DNA because the sequences are short. There can also be “jumping PCR”, which causes recombination during the PCR process and can make analyzing the DNA more difficult in inhomogeneous samples.
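Because each cycle can at most double the number of template molecules, the effect of repeating the cycle can be sketched with a simple exponential model. The function name `pcr_copies` and the per-cycle `efficiency` parameter below are illustrative assumptions, not values taken from the cited sources; real ancient DNA reactions typically fall short of perfect doubling.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of template copies after a number of PCR cycles.

    efficiency is the assumed fraction of templates duplicated in each
    cycle: 1.0 means perfect doubling, 0.0 means no amplification.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare an idealized reaction with a less efficient one,
    # starting from a hypothetical 10 template molecules.
    for cycles in (30, 40):
        for efficiency in (1.0, 0.8):
            copies = pcr_copies(10, cycles, efficiency)
            print(f"{cycles} cycles at efficiency {efficiency}: {copies:.3e} copies")
```

Under this model, a lower per-cycle efficiency has to be compensated by additional cycles to reach the same yield, which is consistent with the higher cycle counts used for ancient DNA.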
Methods of DNA analysis
DNA extracted from fossil remains is primarily sequenced using massively parallel sequencing,[24] which allows simultaneous amplification and sequencing of all DNA segments in a sample, even when it is highly fragmented and of low concentration.[23] It involves attaching a generic sequence to every single strand that generic primers can bind to, so that all of the DNA present is amplified. Although this is generally more costly and time-intensive than PCR, the difficulties involved in ancient DNA amplification make it cheaper and more efficient for such samples.[23] One method of massively parallel sequencing, developed by Margulies et al., employs bead-based emulsion PCR and pyrosequencing,[25] and was found to be powerful in analyses of aDNA because it avoids potential loss of sample, substrate competition for templates, and error propagation in replication.[26]
The most common way to analyze an aDNA sequence is to compare it with known sequences from other sources, which can be done in different ways for different purposes.
The identity of fossil remains can be uncovered by comparing their DNA sequence with those of known species using software such as BLASTN.[26] This archaeogenetic approach is especially helpful when the morphology of the fossil is ambiguous.[27] Species identification can also be done by finding specific genetic markers in an aDNA sequence. For example, the American indigenous population is characterized by specific mitochondrial RFLPs and deletions defined by Wallace et al.[28]
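The idea of identifying a specimen by sequence comparison can be illustrated with a toy example. This is only a sketch: it is not how BLASTN works internally, it assumes the sequences have already been aligned to equal length, and the reference names and sequences are hypothetical placeholders.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching bases between two pre-aligned sequences.

    Positions where either sequence has something other than A, C, G or T
    (for example N for an unreadable base) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x == y:
                matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def best_match(query: str, references: dict) -> tuple:
    """Return (name, percent identity) of the most similar reference."""
    scored = ((name, percent_identity(query, seq)) for name, seq in references.items())
    return max(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical reference fragments; real work would use curated databases.
    references = {"species_a": "ACGTACGTAC", "species_b": "ACGTTCGTAA"}
    query = "ACGTACGNAC"  # N marks a damaged, unreadable position
    print(best_match(query, references))
```

Skipping ambiguous positions rather than counting them as mismatches is a design choice made here because damaged bases are common in ancient samples.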
aDNA comparison studies can also reveal the evolutionary relationship between two species. The number of base differences between the DNA of an ancient species and that of a closely related extant species can be used to estimate the divergence time of those two species from their last common ancestor.[24] The phylogeny of some extinct species, such as Australian marsupial wolves and American ground sloths, has been constructed by this method.[24] Mitochondrial DNA in animals and chloroplast DNA in plants are usually used for this purpose because they have hundreds of copies per cell and thus are more easily accessible in ancient fossils.[24]
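A rough sketch of how base differences translate into a divergence time is given below. It assumes a strict molecular clock and uses the Jukes-Cantor correction for multiple substitutions at the same site; both are standard textbook simplifications chosen here for illustration, not the specific method of the cited studies, and the substitution rate in the example is a hypothetical value.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences / len(a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance in substitutions per site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance: float, rate_per_site_per_year: float) -> float:
    """Years since the last common ancestor under a strict clock.

    The distance accumulates along both lineages, hence the factor of two.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    ancient = "ACGTACGTACGTACGTACGT"
    extant = "ACGTACGTACGAACGTACGT"  # one difference in 20 sites
    d = jukes_cantor(p_distance(ancient, extant))
    # 1e-8 substitutions per site per year is an assumed, illustrative rate.
    print(f"corrected distance: {d:.4f}")
    print(f"estimated divergence: {divergence_time(d, 1e-8):.0f} years")
```

In practice the age of the ancient sample itself, rate variation between lineages, and post-mortem damage all need to be accounted for, so estimates of this kind are only a first approximation.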
Another method of investigating the relationship between two species is DNA hybridization. Single-stranded DNA segments of both species are allowed to form complementary base pairs with each other. More closely related species have a more similar genetic makeup, and thus produce a stronger hybridization signal. Scholz et al. conducted Southern blot hybridization on Neanderthal aDNA (extracted from the fossil remains W-NW and Krapina). The results showed weak ancient human-Neanderthal hybridization and strong ancient human-modern human hybridization. The human-chimpanzee and Neanderthal-chimpanzee hybridizations were of similarly weak strength. This suggests that humans and Neanderthals are not as closely related as two individuals of the same species are, but that they are more closely related to each other than to chimpanzees.[16]
There have also been some attempts to decipher aDNA to obtain phenotypic information about ancient species. This is always done by mapping the aDNA sequence onto the karyotype of a well-studied, closely related species that shares many similar phenotypic traits.[26] For example, Green et al. compared the aDNA sequence from the Neanderthal Vi-80 fossil with modern human X and Y chromosome sequences, and they found a similarity in 2.18 and 1.62 bases per 10,000 respectively, suggesting that the Vi-80 sample was from a male individual.[26] Other similar studies include the finding of a mutation associated with dwarfism in Arabidopsis in ancient Nubian cotton,[27] and an investigation of the bitter taste perception locus in Neanderthals.[29]
Applications
Human archaeology
Africa
Modern humans arose in Africa approximately 200 kya (thousand years ago).[30] Examination of mitochondrial DNA (mtDNA), Y-chromosome DNA, and X-chromosome DNA indicates that the earliest population to leave Africa consisted of approximately 1500 males and females.[30] Various studies have suggested that populations were geographically “structured” to some degree prior to the expansion out of Africa; this is suggested by the antiquity of shared mtDNA lineages.[30] One study of 121 populations from various places throughout the continent found 14 genetic and linguistic “clusters,” suggesting an ancient geographic structure to African populations.[30] In general, genotypic and phenotypic analyses have shown African populations to have been “large and subdivided throughout much of their evolutionary history.”[30]
Genetic analysis has supported archaeological hypotheses of a large-scale migration of Bantu speakers into Southern Africa approximately 5 kya.[30] Microsatellite DNA, single nucleotide polymorphisms (SNPs), and insertion/deletion polymorphisms (INDELs) have shown that Nilo-Saharan-speaking populations originate from Sudan.[30] Furthermore, there is genetic evidence that Chadic-speaking descendants of Nilo-Saharan speakers migrated from Sudan to Lake Chad about 8 kya.[30] Genetic evidence has also indicated that non-African populations made significant contributions to the African gene pool.[30] For example, the Saharan African Beja people have high levels of Middle Eastern as well as East African Cushitic DNA.[30]
Europe
Analysis of mtDNA shows that Eurasia was occupied in a single migratory event between 60 and 70 kya.[1] Genetic evidence shows that occupation of the Near East and Europe happened no earlier than 50 kya.[1] Studying haplogroup U has shown separate dispersals from the Near East both into Europe and into North Africa.[1]
Much of the work done in archaeogenetics focuses on the Neolithic transition in Europe.[31] Cavalli-Sforza’s analysis of genetic-geographic patterns led him to conclude that there was a massive influx of Near Eastern populations into Europe at the start of the Neolithic.[31] This view led him “to strongly emphasize the expanding early farmers at the expense of the indigenous Mesolithic foraging populations.”[31] mtDNA analysis in the 1990s, however, contradicted this view. M.B. Richards estimated that merely 10-22% of extant European mtDNAs had come from Near Eastern populations during the Neolithic.[31] Most mtDNAs were “already established” among existing Mesolithic and Paleolithic groups.[31] Most “control-region lineages” of modern European mtDNA are traced to a founder event of reoccupying northern Europe towards the end of the Last Glacial Maximum (LGM).[1] One study of extant European mtDNAs suggests this reoccupation occurred after the end of the LGM, although another suggests it occurred before.[1][31] Analysis of haplogroups V, H, and U5 supports a “pioneer colonization” model of European occupation, with incorporation of foraging populations into arriving Neolithic populations.[31] Furthermore, analysis of ancient DNA, not just extant DNA, is shedding light on some issues. For instance, comparison of Neolithic and Mesolithic DNA has indicated that the development of dairying preceded widespread lactose tolerance.[31]
South Asia
Studies of mtDNA lineage M suggest that the first occupants of India were Austro-Asiatic speakers who entered about 45-60 kya.[32] The Indian gene pool has contributions from an African source population, as well as West Asian and Central Asian populations from migrations no earlier than 8 kya.[32] The lack of variation in mtDNA lineages compared to the Y-chromosome lineages indicates that primarily males took part in these migrations.[32] The discovery of two subbranches, U2i and U2e, of the U mtDNA lineage, which arose in Central Asia, has “modulated” views of a large migration from Central Asia into India, as the two branches diverged 50 kya.[32] Furthermore, U2e is found in large percentages in Europe but not India, and vice versa for U2i, implying U2i is native to India.[32]
East Asia
Analysis of mtDNA and NRY (non-recombining region of Y chromosome) sequences has indicated that the first major dispersal out of Africa went through Saudi Arabia and the Indian coast 50-100 kya, and a second major dispersal occurred 15-50 kya north of the Himalayas.[33]
Much work has been done to discover the extent of north-to-south and south-to-north migrations within Eastern Asia.[33] Comparing the genetic diversity of northeastern groups with southeastern groups has allowed archaeologists to conclude that many of the northeast Asian groups came from the southeast.[33] The Pan-Asian SNP (single nucleotide polymorphism) study found “a strong and highly significant correlation between haplotype diversity and latitude,” which, when coupled with demographic analysis, supports the case for a primarily south-to-north occupation of East Asia.[33] Archaeogenetics has also been used to study hunter-gatherer populations in the region, such as the Ainu from Japan and Negrito groups in the Philippines.[33] For example, the Pan-Asian SNP study found that Negrito populations in Malaysia and the Negrito populations in the Philippines were more closely related to non-Negrito local populations than to each other, suggesting Negrito and non-Negrito populations are linked by one entry event into East Asia.[33]
Americas
Archaeogenetics has been used to better understand the populating of the Americas from Asia.[34] Native American mtDNA haplogroups have been estimated to be between 15 and 20 thousand years old, although there is some variation in these estimates.[34] Genetic data have been used to propose various theories regarding how the Americas were colonized.[34] Although the most widely held theory suggests “three waves” of migration after the LGM through the Bering Strait, genetic data have given rise to alternative hypotheses.[34] For example, one hypothesis proposes a migration from Siberia to South America 20-15 kya and a second migration that occurred after glacial recession.[34] Y-chromosome data have led some to hold that there was a single migration starting from the Altai Mountains of Siberia between 17.2 and 10.1 kya, after the LGM.[34] Analysis of both mtDNA and Y-chromosome DNA reveals evidence of “small, founding populations.”[34] Studying haplogroups has led some scientists to conclude that a southern migration into the Americas from one small population was impossible, although separate analysis has found that such a model is feasible if such a migration happened along the coasts.[34]
Australia and New Guinea
Finally, archaeogenetics has been used to study the occupation of Australia and New Guinea.[35] The aboriginal populations of Australia and New Guinea are phenotypically very similar, but mtDNA has shown that this is due to convergence from living in similar conditions.[35] Non-coding regions of mtDNA have shown “no similarities” between the aboriginal populations of Australia and New Guinea.[35] Furthermore, no major NRY lineages are shared between the two populations. The high frequency of a single NRY lineage unique to Australia, coupled with “low diversity of lineage-associated Y-chromosomal short tandem repeat (Y-STR) haplotypes”, provides evidence for a “recent founder or bottleneck” event in Australia.[35] However, there is relatively large variation in mtDNA, which would imply that the bottleneck effect impacted males primarily.[35] Together, NRY and mtDNA studies show that the splitting event between the two groups was over 50 kya, casting doubt on recent common ancestry between the two.[35]
Plants and animals
Archaeogenetics has been used to understand the development of domestication of plants and animals.
Domestication of plants
The combination of genetics and archaeological findings has been used to trace the earliest signs of plant domestication around the world. However, since the nuclear, mitochondrial, and chloroplast genomes used to trace domestication’s moment of origin have evolved at different rates, their use to trace genealogy has been somewhat problematic.[36] Nuclear DNA in particular is used over mitochondrial and chloroplast DNA because of its faster mutation rate as well as its intraspecific variation due to a higher consistency of polymorphic genetic markers.[36] Findings in crop ‘domestication genes’ (traits that were specifically selected for or against) include:
- tb1 (teosinte branched1) - affecting the apical dominance in maize[36]
- tga1 (teosinte glume architecture1) - making maize kernels more convenient for human use[36]
- te1 (Terminal ear1) - affecting the weight of kernels[36]
- fw2.2 - affecting the weight in tomatoes[36]
- BoCal - inflorescence of broccoli and cauliflower[36]
Domestication of animals
Archaeogenetics has been used to study the domestication of animals.[37] By analyzing genetic diversity in domesticated animal populations, researchers can search for genetic markers in DNA that give valuable insight into possible traits of progenitor species.[37] These traits are then used to help distinguish archaeological remains of wild specimens from those of domesticated ones.[37] Genetic studies can also lead to the identification of the ancestors of domesticated animals.[37] The information gained from genetic studies of current populations helps guide archaeologists’ search for evidence of these ancestors.[37]
Archaeogenetics has been used to trace the domestication of pigs throughout the Old World.[38] These studies also reveal details about early farmers.[38] Methods of archaeogenetics have also been used to further understand the domestication of dogs.[39] Genetic studies have shown that all dogs are descended from the gray wolf; however, it is currently unknown when, where, and how many times dogs were domesticated.[39] Some genetic studies have indicated multiple domestications while others have not.[39] Archaeological findings help clarify this complicated past by providing solid evidence about the progression of the domestication of dogs.[39] As early humans domesticated dogs, the archaeological remains of buried dogs became increasingly abundant.[39] Not only does this provide more opportunities for archaeologists to study the remains, it also provides clues about early human culture.[39]