
Friday, December 11, 2020

Molecular paleontology

From Wikipedia, the free encyclopedia

Molecular paleontology refers to the recovery and analysis of DNA, proteins, carbohydrates, or lipids, and their diagenetic products from ancient human, animal, and plant remains. The field of molecular paleontology has yielded important insights into evolutionary events, species' diasporas, and the discovery and characterization of extinct species. By applying molecular analytical techniques to DNA in fossils, one can quantify the level of relatedness between any two organisms for which DNA has been recovered.

Advancements in the field of molecular paleontology have allowed scientists to pursue evolutionary questions on a genetic level rather than relying on phenotypic variation alone. Using biotechnological techniques such as DNA isolation, amplification, and sequencing, scientists have been able to gain new insights into the divergence and evolutionary history of countless organisms.

History

The study of molecular paleontology is said to have begun with the discovery by Abelson of 360-million-year-old amino acids preserved in fossil shells. However, Svante Pääbo is often considered the founder of the field of molecular paleontology.

The field of molecular paleontology has had several major advances since the 1950s and is a continuously growing field. Below is a timeline showing notable contributions that have been made.

A visual graphic of the events listed in the timeline section.
A timeline demonstrating important dates in molecular paleontology. All of these dates are listed and specifically sourced in the History section under Timeline.

mid-1950s: Abelson found preserved amino acids in fossil shells that were about 360 million years old. This produced the idea of comparing fossil amino acid sequences with those of existing organisms so that molecular evolution could be studied.

1970s: Fossil peptides are studied by amino acid analysis. Researchers start to use whole peptides and immunological methods.

Late 1970s: Palaeobotanists (can also be spelled as Paleobotanists) studied molecules from well-preserved fossil plants.

1984: The first successful DNA sequencing of an extinct species, the quagga, a zebra-like species.

1991: An article is published on the successful extraction of proteins from the fossil bone of a dinosaur, specifically Seismosaurus.

2005: Scientists resurrect extinct 1918 influenza virus.

2006: Segments of the Neanderthal nuclear DNA sequence begin to be analyzed and published.

2007: Scientists synthesize entire extinct human endogenous retrovirus (HERV-K) from scratch.

2010: A new species of early hominid, the Denisovans, discovered from mitochondrial and nuclear genomes recovered from bone found in a cave in Siberia. Analysis showed that the Denisovan specimen lived approximately 41,000 years ago, and shared a common ancestor with both modern humans and Neanderthals approximately 1 million years ago in Africa.

2013: The first entire Neanderthal genome is successfully sequenced. More information can be found at the Neanderthal genome project.

2013: Remnant mitochondrial DNA from a 400,000-year-old specimen is sequenced, and the specimen, later named Homo heidelbergensis, is found to be a common ancestor to Neanderthals and Denisovans.

2015: A 110,000-year-old fossil tooth containing DNA from Denisovans was reported.

The quagga

The first successful DNA sequencing of an extinct species was in 1984, from a 150-year-old museum specimen of the quagga, a zebra-like species. Mitochondrial DNA (also known as mtDNA) was sequenced from desiccated muscle of the quagga, and was found to differ by 12 base substitutions from the mitochondrial DNA of a mountain zebra. It was concluded that these two species had a common ancestor 3-4 million years ago, which is consistent with known fossil evidence of the species.

Denisovans

The Denisovans of Eurasia, a hominid species related to Neanderthals and humans, were discovered as a direct result of DNA sequencing of a 41,000-year-old specimen recovered in 2008. Analysis of the mitochondrial DNA from a retrieved finger bone showed the specimen to be genetically distinct from both humans and Neanderthals. Two teeth and a toe bone were later found to belong to different individuals within the same population. Analysis suggests that both the Neanderthals and Denisovans were already present throughout Eurasia when modern humans arrived. In November 2015, scientists reported finding a fossil tooth containing DNA from Denisovans, and estimated its age at 110,000 years.

Mitochondrial DNA analysis

A photo of Neanderthal DNA extraction in process
Neanderthal DNA extraction. Working in a clean room, researchers at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, took extensive precautions to avoid contaminating Neanderthal DNA samples - extracted from bones like this one - with DNA from any other source, including modern humans. NHGRI researchers are part of the international team that sequenced the genome of the Neanderthal, Homo neanderthalensis.

The mtDNA from the Denisovan finger bone differs from that of modern humans by 385 bases (nucleotides) in the mtDNA strand out of approximately 16,500, whereas the difference between modern humans and Neanderthals is around 202 bases. In contrast, the difference between chimpanzees and modern humans is approximately 1,462 mtDNA base pairs. This suggested a divergence time around one million years ago. The mtDNA from a tooth bore a high similarity to that of the finger bone, indicating they belonged to the same population. From a second tooth, an mtDNA sequence was recovered that showed an unexpectedly large number of genetic differences compared to that found in the other tooth and the finger, suggesting a high degree of mtDNA diversity. These two individuals from the same cave showed more diversity than seen among sampled Neanderthals from all of Eurasia, and were as different as modern-day humans from different continents.
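The base counts quoted above can be put on a common scale by dividing by the length of the mtDNA strand. The following is a minimal sketch, assuming the sequences have already been aligned; the function names are illustrative and not taken from any published pipeline, and the only data used are the figures given in the text.

```python
def differing_sites(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def proportion_different(n_differences: int, n_sites: int) -> float:
    """Fraction of compared sites that differ."""
    return n_differences / n_sites


# Figures quoted in the text: differences out of roughly 16,500 mtDNA bases.
MTDNA_LENGTH = 16_500
comparisons = {
    "Denisovan vs. modern human": 385,
    "Neanderthal vs. modern human": 202,
    "chimpanzee vs. modern human": 1_462,
}

for label, n_diff in comparisons.items():
    share = proportion_different(n_diff, MTDNA_LENGTH)
    print(f"{label}: {share:.2%} of sites differ")
```

On real data, `differing_sites` would be applied to aligned sequences first; the toy call `differing_sites("ACGT", "ACTT")` counts a single difference.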

Nuclear genome analysis

Isolation and sequencing of nuclear DNA has also been accomplished from the Denisova finger bone. This specimen showed an unusual degree of DNA preservation and a low level of contamination. Researchers were able to achieve near-complete genomic sequencing, allowing a detailed comparison with Neanderthals and modern humans. From this analysis, they concluded that, in spite of the apparent divergence of their mitochondrial sequence, the Denisova population along with Neanderthals shared a common branch from the lineage leading to modern African humans. The estimated average time of divergence between Denisovan and Neanderthal sequences is 640,000 years ago, and the time between both of these and the sequences of modern Africans is 804,000 years ago. They suggest the divergence of the Denisova mtDNA results either from the persistence of a lineage purged from the other branches of humanity through genetic drift or else from an introgression from an older hominin lineage.

Homo heidelbergensis

A photo of the Homo heidelbergensis cranium found at Sima de los Huesos
Homo heidelbergensis Cranium 5 is one of the most important discoveries in the Sima de los Huesos, Atapuerca (Spain). The mandible of this cranium appeared, nearly intact, some years after its find, close to the same location.

Homo heidelbergensis was first discovered in 1907 near Heidelberg, Germany and later also found elsewhere in Europe, Africa, and Asia. However, it was not until 2013 that a specimen with retrievable DNA was found, in a ~400,000-year-old femur from the Sima de los Huesos Cave in Spain. The femur was found to contain both mtDNA and nuclear DNA. Improvements in DNA extraction and library preparation techniques allowed mtDNA to be successfully isolated and sequenced, but the nuclear DNA was found to be too degraded in the observed specimen, and was also contaminated with DNA from an ancient cave bear (Ursus deningeri) present in the cave. The mtDNA analysis found a surprising link between the specimen and the Denisovans, and this finding raised many questions. Several scenarios were proposed in a January 2014 paper titled "A mitochondrial genome sequence of a hominin from Sima de los Huesos", illustrating the lack of consensus in the scientific community on how Homo heidelbergensis is related to other known hominin groups. One plausible scenario that the authors proposed was that H. heidelbergensis was an ancestor to both Denisovans and Neanderthals. Completely sequenced nuclear genomes from both Denisovans and Neanderthals suggest a common ancestor approximately 700,000 years ago, and one leading researcher in the field, Svante Pääbo, suggests that perhaps this new hominin group is that early ancestor.

Applications

Discovery and characterization of new species

Molecular paleontology techniques applied to fossils have contributed to the discovery and characterization of several new species, including the Denisovans and Homo heidelbergensis. We have been able to better understand the path that humans took as they populated the earth, and what species were present during this diaspora.

De-extinction

An artist's color drawing of the Pyrenean ibex
The Pyrenean ibex was temporarily brought back from extinction in 2003.

It is now possible to revive extinct species using molecular paleontology techniques. This was first accomplished via cloning in 2003 with the Pyrenean ibex, a type of wild goat that became extinct in 2000. Nuclei from the Pyrenean ibex's cells were injected into goat eggs emptied of their own DNA, and implanted into surrogate goat mothers. The offspring lived only seven minutes after birth, due to defects in its lungs. Other cloned animals have been observed to have similar lung defects.

There are many species that have gone extinct as a direct result of human activity. Some examples include the dodo, the great auk, the Tasmanian tiger, the Chinese river dolphin, and the passenger pigeon. An extinct species can be revived by using allelic replacement of a closely related species that is still living. By only having to replace a few genes within an organism, instead of having to build the extinct species' genome from scratch, it could be possible to bring back several species in this way, even Neanderthals.

The ethics surrounding the re-introduction of extinct species are very controversial. Critics of bringing extinct species back to life contend that it would divert limited money and resources from protecting the world's current biodiversity. With current extinction rates approximated to be 100 to 1,000 times the background extinction rate, it is feared that a de-extinction program might lessen public concern over the current mass extinction crisis, if it is believed that these species can simply be brought back to life. As the editors of a Scientific American article on de-extinction ask: Should we bring back the woolly mammoth only to let elephants become extinct in the meantime? The main driving factor for the extinction of most species in this era (post 10,000 BC) is the loss of habitat, and temporarily bringing back an extinct species will not recreate the environment it once inhabited.

Proponents of de-extinction, such as George Church, speak of many potential benefits. Reintroducing an extinct keystone species, such as the woolly mammoth, could help re-balance the ecosystems that once depended on them. Some extinct species could create broad benefits for the environments they once inhabited, if returned. For example, woolly mammoths may be able to slow the melting of the Russian and Arctic tundra in several ways such as eating dead grass so that new grass can grow and take root, and periodically breaking up the snow, subjecting the ground below to the arctic air. These techniques could also be used to reintroduce genetic diversity in a threatened species, or even introduce new genes and traits to allow the animals to compete better in a changing environment.

Research and technology

When a new potential specimen is found, scientists normally first analyze for cell and tissue preservation using histological techniques, and test the conditions for the survivability of DNA. They will then attempt to isolate a DNA sample using the technique described below, and conduct a PCR amplification of the DNA to increase the amount of DNA available for testing. This amplified DNA is then sequenced. Care is taken to verify that the sequence matches the phylogenetic traits of the organism. After an organism dies, a technique called amino acid dating can be used to estimate its age. It inspects the degree of racemization of aspartic acid, leucine, and alanine within the tissue. As time passes, the D/L ratio (where "D" and "L" are mirror images of each other) increases from 0 to 1. In samples where the D/L ratio of aspartic acid is greater than 0.08, ancient DNA sequences cannot be retrieved (as of 1996).
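The aspartic acid screening rule described above can be written as a small check. This is only a sketch: the 0.08 limit is the figure quoted in the text, while the function names and the idea of passing measured D and L amounts in arbitrary but equal units are assumptions made for illustration.

```python
# Limit quoted in the text for the D/L ratio of aspartic acid.
ASPARTIC_ACID_DL_LIMIT = 0.08


def dl_ratio(d_amount: float, l_amount: float) -> float:
    """Ratio of the D form to the L form of an amino acid in a sample."""
    if l_amount <= 0:
        raise ValueError("the amount of the L form must be positive")
    return d_amount / l_amount


def dna_retrieval_plausible(d_amount: float, l_amount: float,
                            limit: float = ASPARTIC_ACID_DL_LIMIT) -> bool:
    """True when the aspartic acid D/L ratio does not exceed the limit."""
    return dl_ratio(d_amount, l_amount) <= limit


# Hypothetical measurements, not real data.
print(dna_retrieval_plausible(d_amount=0.05, l_amount=1.0))
print(dna_retrieval_plausible(d_amount=0.20, l_amount=1.0))
```

A ratio at or below the limit only means that retrieval is not ruled out by this criterion; it does not guarantee that DNA survives.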

Mitochondrial DNA vs. nuclear DNA

An infographic contrasting inheritance of mitochondrial and nuclear DNA
Unlike nuclear DNA (left), mitochondrial DNA is only inherited from the maternal lineage (right).

Mitochondrial DNA (mtDNA) is separate from one's nuclear DNA. It is present in organelles called mitochondria in each cell. Unlike nuclear DNA, which is inherited from both parents and rearranged every generation, an exact copy of mitochondrial DNA gets passed down from a mother to her sons and daughters. A benefit of performing DNA analysis with mitochondrial DNA is that, because it is passed down without this rearrangement, tracking lineages on the scale of tens of thousands of years is much easier. Knowing the base mutation rate for mtDNA (in humans this rate is also known as the human mitochondrial molecular clock), one can determine the amount of time any two lineages have been separated. Another advantage of mtDNA is that thousands of copies of it exist in every cell, whereas only two copies of nuclear DNA exist in each cell. All eukaryotes, a group which includes all plants, animals, and fungi, have mtDNA. A disadvantage of mtDNA is that only the maternal line is represented. For example, a child will inherit 1/8 of its DNA from each of its eight great-grandparents, but it will inherit an exact clone of its maternal great-grandmother's mtDNA. This is analogous to a child inheriting only his paternal great-grandfather's last name, and not a mix of all of the eight surnames.
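The molecular clock reasoning can be sketched numerically. The example below assumes a strict clock, in which two lineages that split t years ago each accumulate substitutions at a constant per-site rate, so the expected per-site divergence is 2 × rate × t. The rate used in the usage line is a placeholder chosen only to make the arithmetic easy to follow, not a published estimate, and the function name is illustrative.

```python
def separation_time_years(n_differences: int, n_sites: int,
                          rate_per_site_per_year: float) -> float:
    """Estimate years since two lineages split under a strict molecular clock.

    Both lineages accumulate changes independently, so the observed
    per-site divergence is divided by twice the per-lineage rate.
    """
    if n_sites <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("n_sites and the rate must be positive")
    per_site_divergence = n_differences / n_sites
    return per_site_divergence / (2 * rate_per_site_per_year)


# Placeholder values for illustration only.
years = separation_time_years(n_differences=10, n_sites=1_000,
                              rate_per_site_per_year=1e-8)
print(f"estimated separation: {years:,.0f} years")
```

Real analyses calibrate the rate against dated samples and correct for multiple substitutions at the same site, which this sketch ignores.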

Isolation

There are many things to consider when isolating a substance. First, depending upon what it is and where it is located, there are protocols that must be carried out in order to avoid contamination and further degradation of the sample. Then, handling of the materials is usually done in a physically isolated work area and under specific conditions (e.g. specific temperature and moisture), also to avoid contamination and further loss of sample.

Once the material has been obtained, depending on what it is, there are different ways to isolate and purify it. DNA extraction from fossils is one of the more popular practices and there are different steps that can be taken to get the desired sample. DNA extracted from amber-entombed fossils can be taken from small samples and mixed with different substances, centrifuged, incubated, and centrifuged again. On the other hand, DNA extraction from insects can be done by grinding the sample, mixing it with buffer, and undergoing purification through glass fiber columns. In the end, regardless of how the sample was isolated for these fossils, the DNA isolated must be able to undergo amplification.

Amplification

An infographic showing the replication process of PCR
Polymerase chain reaction

The field of molecular paleontology benefited greatly from the invention of the polymerase chain reaction (PCR), which allows one to make billions of copies of a DNA fragment from just a single preserved copy of the DNA. One of the biggest challenges up until this point was the extreme scarcity of recovered DNA because of degradation of the DNA over time.
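The scale of this amplification follows from repeated doubling. A minimal sketch, assuming an idealised reaction in which every template is copied once per cycle; the `efficiency` parameter and the function name are illustrative additions for reactions that fall short of perfect doubling.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of PCR cycles.

    efficiency = 1.0 means perfect doubling in every cycle; lower values
    model cycles in which only a fraction of templates is copied.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Starting from a single preserved molecule.
for n_cycles in (10, 20, 30):
    print(n_cycles, "cycles:", f"{pcr_copies(1, n_cycles):,.0f}", "copies")
```

Under perfect doubling, 30 cycles already yield 2^30 copies, which is more than a billion.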

Sequencing

DNA sequencing is done to determine the order of nucleotides and genes. There are many different materials from which DNA can be extracted. In animals, the mitochondrial chromosome can be used for molecular study. Chloroplasts can be studied in plants as a primary source of sequence data.

An evolutionary tree of mammals

In the end, the sequences generated are used to build evolutionary trees. Methods to match data sets include maximum likelihood; minimum evolution (also known as neighbor-joining), which searches for the tree with the shortest overall length; and the maximum parsimony method, which finds the tree requiring the fewest character-state changes. The groups of species defined within a tree can also be later evaluated by statistical tests, such as the bootstrap method, to see if they are indeed significant.
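To make the parsimony criterion concrete, the sketch below counts the minimum number of character-state changes a given tree needs for a single aligned site, using Fitch's small-parsimony algorithm. The nested-tuple tree encoding, the taxon labels and the site data are hypothetical choices made for the example; a real analysis would sum the score over all sites and search over many candidate trees.

```python
def fitch(tree, states):
    """Return (possible_states, changes) for one site on a rooted binary tree.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    states -- mapping from leaf name to the nucleotide observed at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: one change is required on this branch pair.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical site: taxa A and B carry G, taxa C and D carry T.
site = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_one = (("A", "B"), ("C", "D"))
tree_two = (("A", "C"), ("B", "D"))

print("tree one needs", fitch(tree_one, site)[1], "change(s)")
print("tree two needs", fitch(tree_two, site)[1], "change(s)")
```

For this site the first tree is the more parsimonious of the two, since grouping the taxa that share a state requires fewer changes.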

Limitations and challenges

Ideal environmental conditions for preserving DNA, in which the organism was desiccated and uncovered, are difficult to come by, and it is also difficult to maintain specimens in that condition until analysis. Nuclear DNA normally degrades rapidly after death through endogenous hydrolytic processes, UV radiation, and other environmental stressors.

Also, interactions with the organic breakdown products of surrounding soil have been found to help preserve biomolecular materials. However, they also create the additional challenge of separating the various components so that the proper analysis can be conducted on them. Some of these breakdown products have also been found to interfere with the action of some of the enzymes used during PCR.

Finally, one of the largest challenges in extracting ancient DNA, particularly ancient human DNA, is contamination during PCR. Small amounts of human DNA can contaminate the reagents used for extraction and PCR of ancient DNA. These problems can be overcome by rigorous care in the handling of all solutions as well as the glassware and other tools used in the process. It can also help if only one person performs the extractions, to minimize the different types of DNA present.

Thursday, December 10, 2020

Ancient pathogen genomics

From Wikipedia, the free encyclopedia

Ancient pathogen genomics is a scientific field related to the study of pathogen genomes recovered from ancient human, plant or animal remains. Ancient pathogens are microorganisms, now extinct, that in past centuries caused several epidemics and deaths worldwide. Their genome, referred to as ancient DNA (aDNA), is isolated from the buried remains (bones and teeth) of victims of the pandemics caused by these pathogens.

The analysis of the genomic features of ancient pathogen genomes allows researchers to understand the evolution of modern microbial strains that can hypothetically generate new pandemics or outbreaks. The analysis of aDNA is carried out by bioinformatic tools and molecular biology techniques to compare ancient pathogens with the modern descendants. The comparison also provides phylogenetic information of these strains.

Reconstructing ancient pathogen genomes through NGS technologies

Pathogen DNA detection in ancient remains can be achieved with laboratory or computational methods. In both cases, the procedure starts with the extraction of DNA from ancient specimens. The laboratory methods are based on the construction of NGS libraries and the subsequent capture-based screening. Computational tools are used to map the reads obtained by NGS against a single- or multi-genome reference (targeted approach); alternatively, metagenomic profiling or taxonomic assignment of shotgun NGS reads can be applied (broad approach).

Isolating ancient DNA

The limited preservation and thus low abundance, the highly fragmented and damaged state, and the presence of modern DNA contamination and environmental DNA background make the retrieval of ancient DNA (aDNA) a challenging procedure.

In order to efficiently recover aDNA, DNA is generally isolated from tissues that contain a high quantity of aDNA, like bone and teeth, which are abundant in the archaeological record. The preservation of pathogens across different anatomical elements varies widely according to the type of pathogen and its tissue tropism, its route of entry into the body and the resulting disease. Pathogens that cause chronic infections in their hosts typically produce diagnostic bone changes, as opposed to acute blood-borne infections. Therefore, for infections that caused the death of the host in the acute phase, the preferred sampling material is the inner chamber of the teeth, since this is a tissue that is highly vascularized during life.

aDNA is characterised by damage that accumulates over the course of time: the evaluation of the DNA 'damage pattern' through computational tools is useful to authenticate ancient pathogen DNA, since the same pattern is not found in modern contaminants.

The most common chemical damage that affects DNA post-mortem is the hydrolytic deamination of cytosines, converting them into uracils, which are then read as thymines. Due to this reaction, ancient DNA contains an unexpectedly high proportion of cytosine to thymine transitions, in particular at the ends of the molecules. Other common DNA modifications, besides the deamination of cytosine into thymine (this occurs when cytosines were methylated), are the presence of abasic sites and single-strand breaks.
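The excess of cytosine to thymine transitions near fragment ends can be tabulated directly from aligned reads. The sketch below is a simplified illustration of that idea, not the algorithm of any particular tool: it assumes each read has already been aligned without gaps to its reference segment, both written 5' to 3', and the function name and toy sequences are hypothetical.

```python
def c_to_t_by_position(aligned_pairs, n_positions=5):
    """Frequency of C->T mismatches at each of the first positions of reads.

    aligned_pairs -- iterable of (reference_segment, read) strings of equal
                     length, gap-free and oriented 5' to 3'
    Returns one frequency per position, counted from the 5' end.
    """
    reference_c = [0] * n_positions   # reference cytosines seen per position
    read_t = [0] * n_positions        # of those, how many are read as T
    for reference, read in aligned_pairs:
        for i in range(min(n_positions, len(reference), len(read))):
            if reference[i] == "C":
                reference_c[i] += 1
                if read[i] == "T":
                    read_t[i] += 1
    return [t / c if c else 0.0 for t, c in zip(read_t, reference_c)]


# Toy data: the first read shows a C->T change at its 5' end.
pairs = [("CAGT", "TAGT"), ("CCGA", "CCGA")]
print(c_to_t_by_position(pairs, n_positions=2))
```

In authentic ancient DNA the frequencies are expected to be highest at the first positions and to fall off towards the interior of the fragments, whereas modern contaminants should show a flat, low profile.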

aDNA is extensively fragmented (most of the fragments are less than 100 base pairs long): this tendency can be used as a quantitative measure of authenticity, as modern contaminant molecules are expected to be longer. To exploit this characteristic feature of ancient DNA, improved silica-based extraction protocols with modified volume and composition of the DNA-binding buffer were introduced.
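Fragment length can be summarised in the same spirit. A minimal sketch, assuming a list of fragment lengths in base pairs is already available from sequencing; the 100 bp threshold is the figure quoted above, while the function name and the example lengths are hypothetical.

```python
def fraction_shorter_than(lengths, threshold=100):
    """Share of fragments whose length is below the threshold (in base pairs)."""
    if not lengths:
        raise ValueError("at least one fragment length is required")
    return sum(1 for n in lengths if n < threshold) / len(lengths)


# Hypothetical fragment lengths, not real data.
fragment_lengths = [40, 60, 150, 80]
print(f"{fraction_shorter_than(fragment_lengths):.0%} of fragments are under 100 bp")
```

A library dominated by long fragments would, by this measure, deserve closer scrutiny for modern contamination.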

Construction of DNA libraries

In order to be sequenced with second generation sequencing methods, template molecules have to be modified through ligation of adaptors. Both the steps of library construction and the PCR amplification that follows are subject to errors. In particular, adaptor binding biases can occur and the relative efficacy of PCR enzymes in amplifying the construct can be variable.

Three types of aDNA libraries are most common. The double-stranded DNA library uses double-stranded DNA templates and first requires a step for the repair of the ends of aDNA fragments. Then, fragments are ligated to double-stranded adaptors and the resultant nicks are filled in. This method has some limitations, like the presence of a fraction of constructs that do not contain both of the different adaptors and the possible formation of adaptor dimers.

To overcome this latter problem, a method for the construction of an A-tailed library was developed. In this method, aDNA is end-repaired and then an adenine residue is added to the 3' ends of the strands, which can facilitate the ligation of the template with adaptors that contain a thymine tail. Furthermore, the use of these T-tailed adaptors prevents the formation of adaptor dimers. The type of adaptor that is typically used is double-stranded and has a Y shape, which means that it has a region located at the T-tailed end where it is complementary and a region at the other end where it is non-complementary. The use of this type of adaptor makes it possible to generate a template of aDNA flanked by different non-complementary adaptor sequences at each end, which are useful for unidirectional sequencing.

Another strategy is based on the use of single-stranded DNA libraries. In this method, DNA is first heat-denatured to generate single strands and then ligated to a single-stranded biotinylated adaptor. The DNA strand is then used as a template by a DNA polymerase, which produces the complementary strand. Subsequently, a second adaptor is ligated at the 3' end of the complementary strand and the full construct is amplified through PCR and then sequenced. The purification step is performed using streptavidin-coated paramagnetic beads, which minimise the DNA loss during this phase of the procedure.

Enriching libraries for aDNA

Different methods (called enrichment methods) have been developed to improve accessibility to endogenous DNA in ancient remains. These approaches can mainly be divided into three types: those used during library construction, by preferentially incorporating aDNA fragments characterised by the high level of damage, those applied after library construction, by separating exogenous and endogenous fractions through annealing to pre-defined sets of probes (in solution or on microarrays), or those based on targeted digestion of environmental microbial DNA using restriction enzymes and primer extension capture (PEC).

Selective uracil enrichment

During the construction of the library, the ssDNA fragments are bound through a biotinylated adaptor to streptavidin-coated beads. In the polymerase extension step, the DNA strand complementary to the original template is generated. In this kind of enrichment, the constructs undergo phosphorylation at the 5' end, to enable the ligation of a non-phosphorylated adaptor (ligation between the 3' end of the adaptor and the 5' end of the newly synthesized strand). DNA is then treated with uracil DNA glycosylase (UDG) and endonuclease VIII (USER mix): UDG generates abasic sites at cytosines that were deaminated into uracils post-mortem, and endo VIII cuts at the resulting abasic sites. This cleavage generates new 3' termini, which are then dephosphorylated, resulting in 3'OH ends that can be used as starting points for a new step of extension. This results in the elongation of the damaged strand, from the damaged region towards the bound bead: while the new DNA molecule is synthesised, the original fragment is displaced. As a result, the newly formed dsDNA molecules no longer contain the adaptor bound to the beads, leaving in the supernatant a dsDNA library of the strands that originally harboured deaminated cytosines, available for further amplification and sequencing. The undamaged DNA template fraction remains attached to the paramagnetic beads.

Extension-free target enrichment in solution

This approach is based on in-solution target-probe hybridization to screen for only a single microorganism, after the construction of the library. It is a species-specific assay that requires heat denaturation of DNA libraries and the construction of a probe DNA library, either using long-range PCR if fresh DNA material from closely related species is available, or through custom design and synthesis of oligonucleotides. This method is useful when the microorganism to target is known, for example, when a hypothesis exists for the causative agent of an epidemic or when skeletal lesions are present in the studied individuals.

Solid-phase target enrichment

Another enriching strategy applied after constructing the library is the direct application of microarrays. They are applied for a wide laboratory-based pathogen screening that searches simultaneously for various pathogenic microorganisms. This kind of approach is favourable for those pathogens that leave no physical skeletal evidence and whose presence cannot be easily hypothesized a priori. The probes are designed to represent conserved or unique regions from a range of pathogenic viruses, parasites or bacteria.

Since microarrays contain sequences derived from modern strains of ancient pathogens, the limits of this method are the poor detection of the most divergent genomic regions and the omission of regions with important genomic rearrangements or unknown additional plasmids.

Whole-genome enrichment

Whole-genome in-solution capture (WISC) allows the characterization of the entire genome sequence of ancient individuals. This technique is based on the use of a genome-wide biotinylated RNA probe library generated through in vitro transcription of fresh modern DNA extracts from species closely related to the target aDNA sample. The heat-denatured aDNA library is then annealed to the RNA probes. To improve stringency and reduce enrichment for highly repetitive regions, low-complexity DNA and adaptor-blocking RNA oligonucleotides are added. The library fraction of interest is then recovered through elution from streptavidin-coated paramagnetic beads (to which the RNA probes are bound).

Computational analysis

The analysis of sequence data obtained by NGS relies on the same computational approaches used for modern DNA, with some peculiarities. A widely used tool to align reads from aDNA against reference genomes is the PALEOMIX package, which can quantify DNA damage levels through mapDamage2 and perform phylogenomic and metagenomic analyses. It is important to consider that the alignment will always exhibit a substantial fraction of mismatched nucleotides that result not from sequencing errors or polymorphisms but from the presence of damaged bases. For this reason, the acceptance threshold for read-to-reference edit distance should be chosen according to the phylogenetic distance to the reference genome. Probabilistic aligners that take into account the damage pattern of aDNA have been developed to improve alignments.
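The role of the edit-distance threshold can be illustrated with a small filter. The sketch below computes the Levenshtein edit distance with a standard dynamic-programming recurrence and accepts a read when its distance to the matching reference segment stays within a chosen fraction of the read length. It does not reproduce how PALEOMIX or any specific aligner applies its thresholds; the function names, the `max_fraction` parameter and the sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion from a
                current[j - 1] + 1,                   # insertion into a
                previous[j - 1] + (char_a != char_b), # match or substitution
            ))
        previous = current
    return previous[-1]


def accept_read(read: str, reference_segment: str, max_fraction: float) -> bool:
    """Keep a read if its edit distance is within max_fraction of its length."""
    return edit_distance(read, reference_segment) <= max_fraction * len(read)


read = "ACGTACGTAC"
segment = "ACGTACGTAT"   # differs from the read at the last base
print(accept_read(read, segment, max_fraction=0.10))
print(accept_read(read, segment, max_fraction=0.05))
```

A more distant reference genome, or heavier damage, argues for a looser `max_fraction`, at the cost of admitting more spurious alignments.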

MALT

Studies of the ancient DNA of pathogens are restricted to skeletal collections that change their appearance as a result of infections. A pathogen linked to a known epidemiological context is identified through screening without prior knowledge of its presence. Methods include broad-spectrum molecular approaches focused on pathogen detection via fluorescence hybridization-based microarray technology, identification via DNA enrichment of certain microbial regions, or computational screening of non-enriched sequence data against human microbiome data sets. These approaches offer improvements but remain biased in the bacterial taxa used for species-level assignments.

MEGAN alignment tool (MALT) is a new program for fast alignment and taxonomic assignment, used for the identification of ancient DNA. MALT is similar to BLAST as it computes local alignments between highly conserved sequences and references. MALT can also calculate semi-global alignments where reads are aligned end-to-end. The references, all complete bacterial genomes, are contained in the National Center for Biotechnology Information (NCBI) RefSeq database. MALT consists of two programs: malt-build and malt-run. Malt-build is used to construct an index for the given database of reference sequences. Malt-run, in turn, is used to align a set of query sequences against the reference database. The program then computes the bit-score and the expected value (E-value) of the alignment and decides whether to keep or discard the alignment depending on user-specified thresholds for the bit-score, the E-value or the per cent identity. The bit-score is the required size of a sequence database in which the current match could be found just by chance. The higher the bit-score, the better the sequence similarity. The E-value is the number of expected hits of similar quality (score) that could be found just by chance. The smaller the E-value, the better the match.
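The keep-or-discard decision can be sketched as a simple filter. The example below uses the relationship from BLAST statistics in which the E-value equals the query length times the database length times 2 raised to minus the bit-score; whether MALT computes its E-values in exactly this way is an assumption, and the `Alignment` class, function names and threshold values are hypothetical and do not correspond to MALT's actual options. All three thresholds are applied together here, which is one possible reading of the rule described above.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    bit_score: float
    percent_identity: float
    query_length: int


def expected_value(bit_score: float, query_length: int, database_length: int) -> float:
    """E-value from a bit-score: E = m * n * 2^(-S)."""
    return query_length * database_length * 2.0 ** (-bit_score)


def keep_alignment(alignment: Alignment, database_length: int,
                   min_bit_score: float, max_e_value: float,
                   min_percent_identity: float) -> bool:
    """Apply user-specified thresholds to decide whether to keep an alignment."""
    e_value = expected_value(alignment.bit_score, alignment.query_length,
                             database_length)
    return (alignment.bit_score >= min_bit_score
            and e_value <= max_e_value
            and alignment.percent_identity >= min_percent_identity)


# Hypothetical alignments and thresholds.
strong = Alignment(bit_score=60.0, percent_identity=95.0, query_length=100)
weak = Alignment(bit_score=20.0, percent_identity=95.0, query_length=100)
for aln in (strong, weak):
    print(keep_alignment(aln, database_length=10**6, min_bit_score=50.0,
                         max_e_value=1e-5, min_percent_identity=90.0))
```

Because the E-value grows with database size, the same bit-score is less convincing against a large reference collection than against a small one.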

MALT allows the screening of non-enriched sequence data in the search for unknown candidate bacterial pathogens that are involved in past disease outbreaks and for the exclusion of the environmental bacterial background. MALT is very important because it offers the advantage of genome-level screening without selection of a particular target organism, avoiding errors that are common to other screening approaches. To authenticate the candidate taxonomic assignments, complete alignments are needed, but the target DNA is often present in a low amount, so a small number of marker regions may not be sufficient for identification. This approach can detect only bacterial DNA and viral DNA, so it is not possible to identify other infectious agents that may be present in a population. This method is useful for studies dealing with the identification of pathogens responsible for ancient and modern disease, especially in cases for which candidate organisms are not known a priori.

Applications

Ancient pathogen genomics as a tool against future epidemics

One interesting application of the sequencing techniques now available is the investigation of historical disease outbreaks, which can answer important and long-standing questions in epidemiology, pathogen evolution and human history.

Much effort is therefore spent on gathering information about the aetiology of infectious diseases of historical importance, such as plague and the cocoliztli epidemic, on describing the geographic spread of pathogens, and on defining the pathogenic mechanisms of these infectious agents, which are active elements of the evolutionary process. Today Y. pestis and S. enterica may seem remote and no longer dangerous, but scientists are still interested in the long-term tracing of the genetic adaptation of these bacteria and in the accurate quantification of their rates of evolutionary change, because this knowledge of the past can inform strategies against future epidemics.

Bacteria and viruses are among the most variable elements in nature, prone to unlimited mutational events, and it is impossible to control all the external factors that can influence the development of a pathogen. The aim is therefore not to defeat a possible new outbreak of plague or of any other infective agent of the past, but to define a strategy, a "guideline", for being better prepared when a new dangerous pathogen appears. The contribution of the environment to infections remains to be defined; factors such as human migration, climate change, overcrowding in cities and animal domestication are some of the major causes of the emergence and spread of disease. These factors are unpredictable, which is one reason why researchers look to the past for information that can be useful today and tomorrow. While they continue to develop strategies against emerging threats using diagnostic, molecular and other advanced tools, they also look back at how ancient pathogens evolved and adapted through historical events. The more that is known about the genomic basis of virulence in historical diseases, the better the emergence and re-emergence of infectious diseases can be understood today and in the future.

Ancient infections and human evolution

Analyses of the phylogenetic relationships between the human host and viral pathogens suggest that many diseases have been coevolving with humans for millennia, since the very start of human history in Africa.

In particular, long-term interaction with pathogens is considered a potentially very strong selective pressure, since not all individuals could survive contact with every infectious agent they encountered over the years: natural selection by pathogens is implicated in the evolution of species. This interaction has already been used to track human population movements and to reconstruct human migration flows within and out of Africa.

A fairly new application of this idea is the use of aDNA to better understand human evolution. Many tropical infections probably played a significant role in the human evolutionary process. The relationship between humans and viruses can be seen as a "fight" that has continued for millennia and that neither side has yet won: as viruses changed their features in order to remain infective, humans had to find strategies to increase their fitness and survive the changes.

Alongside the infectious diseases and other illnesses afflicting modern human society, cancer remains one of the most enigmatic ailments. Scientists are investigating whether neoplastic diseases are restricted to postindustrial human society or whether their origins lie further back in time, perhaps in prehistory. The difficulty is that cancer, when rapidly lethal, leaves very few traces in the skeleton, and extraskeletal tumours leave none at all. Knowledge of the aetiology of cancer is also incomplete, and infection by microorganisms plays a part: past migrations could have carried viruses with them, providing a possible reservoir of tropical disease as well as a predisposition to cancer. For this reason, molecular analytical techniques are applied to archaeological remains not only to study hominin evolution but also to improve understanding of the epidemiology and aetiology of tumours. Information derived from aDNA can be used to anchor pathogen mutations in time and to reconstruct the evolutionary process from the presence of microorganisms; it may also be useful for developing new vaccines or identifying possible future pathogenic threats.

Past pandemics are much more than just ancient history

What happened in the past is not merely history: agents that came into contact with humankind hundreds of years ago can still drive human genetic diversity and natural selection, and can still have an impact on global human health. Since epidemics are among the most frequent phenomena to have affected, and potentially devastated, human populations, it is important to detect, prevent and control potential infective agents. Archaeologists, geneticists and medical scientists are all concerned with exploring the influence of pathogens that can threaten or improve human health and longevity.

Evolution and phylogenesis of Yersinia pestis

Yersinia pestis is a gram-negative bacterium belonging to the family Enterobacteriaceae. Its closest relatives are Yersinia pseudotuberculosis and Yersinia enterocolitica, which are environmental species.

Y. pestis bacillus.

They all possess the plasmid pCD1, which encodes a type III secretion system. Among chromosomal protein-coding genes, their nucleotide identity is 97%. They differ in their virulence potential and transmission mechanisms.

Y. pestis is not a human-adapted bacterium. Its main reservoirs are rodents (such as marmots, mice, great gerbils, voles and prairie dogs) and it is transmitted to humans by fleas. One of the most studied vectors of this pathogen is Xenopsylla cheopis.

After the bite of an infected flea, the bacteria enter the host organism and travel to the closest lymph node, where they replicate, causing the large swellings called buboes. The bacteria can also disseminate into the bloodstream (causing septicaemia) and to the lungs (causing pneumonia). The pulmonary form of the disease is transmitted directly from human to human.

It has been determined that Y. pestis became so dangerous through the acquisition of ymt (Yersinia murine toxin). This gene, located on the pMT1 plasmid, allowed the bacterium to survive in the flea vector and facilitated colonization of the arthropod's midgut, giving rise to the large-scale pandemics of the past millennium.

Early evolution and divergence from Yersinia pseudotuberculosis

Y. pestis is distinguishable from the other two species by its pathogenicity and transmission mechanism. These differences are due to two plasmids: pPCP1, which confers on the bacterium its invasive properties in humans, and pMT1, which is involved in flea colonisation (along with the loss of function of some bacterial chromosomal genes).

Samples dated to the Late Neolithic and Bronze Age made it possible to identify a first genetic divergence between the ancestors of Y. pseudotuberculosis and Y. pestis. The characteristics that confer virulence on Y. pestis were absent in these strains: they lacked ymt, a gene necessary for colonization of the vector; they also carried active forms of genes required for biofilm formation (inactive in pathogenic Y. pestis) and an active flagellin gene, an inducer of the immune response (a pseudogene in Y. pestis).

The comparison of a draft of the genome and the two plasmids (pCD1 and pMT1) with samples from Black Death victims (1348-1349) in the East Smithfield burial ground revealed very high genetic conservation of the sequence: only 97 single-nucleotide differences over 660 years.
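As a rough illustration of how such a figure is obtained, the following Python sketch counts single-nucleotide differences between two sequences that are assumed to be already aligned to the same coordinates. The function name count_snp_differences and the toy sequences are hypothetical; real comparisons work on whole-genome alignments and apply quality filters that are not shown here.

```python
def count_snp_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions with gaps or ambiguous bases (anything other than A, C, G, T)
    in either sequence are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    unambiguous = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in unambiguous and b in unambiguous and a != b
    )


# Figures quoted in the text: 97 differences accumulated over 660 years.
differences_per_year = 97 / 660

if __name__ == "__main__":
    ancient = "ACGTN-A"  # toy sequence, not real data
    modern = "ACCTAAT"   # toy sequence, not real data
    print(count_snp_differences(ancient, modern))
    print(round(differences_per_year, 3))
```

With the numbers given above, 97 differences over 660 years corresponds to roughly one fixed difference every seven years across the compared sequence.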

Y. pestis microevolution

The strain from London individual 6330 carries mutations absent from other isolates of the same period (1348-1350): the reason may be either the presence of multiple strains circulating in Europe at the same time or the microevolution of a single strain during the pandemic.

Three major outbreaks of plague

There have been three pandemic outbreaks of Y. pestis:

  1. The first, known as the Plague of Justinian, first occurred in Egypt in 541-543 and then spread to Constantinople and neighbouring regions. Outbreaks continued in Europe until 750 CE. Phylogenetic analysis showed that both genomes analysed belong to a lineage that is extinct today and is closely related to strains from modern-day China, which suggests a possible East Asian origin of the first pandemic.
  2. The second pandemic is known as the Black Death or the Great Pestilence. It occurred in Europe in 1346-1352 and was followed by many resurgences of plague, continuing until the 18th century. It is possible that two different strains of Y. pestis entered the continent in separate pulses during this pandemic.
  3. The third pandemic started in China in 1860. It spread rapidly to other countries through the use of railways and steamships.

The strains associated with the Justinian Plague appear on a novel branch that is phylogenetically distinct from those of the second and third plague pandemics. The first strain of Y. pestis found during the second outbreak survived and gave rise to the modern Branch 1 strains associated with the third pandemic.

The strains of the first pandemic and those of the second and third pandemics share a common ancestor.

Linkage between 2nd and 3rd pandemics

In a recent study, genomes of Y. pestis were sequenced from three samples exhumed in Barcelona (deceased between 1300 and 1420), Ellwangen (between 1486 and 1627) and Bolgar City (between 1298 and 1388). The dates of death of the individuals analysed were determined by radiocarbon dating; the last was confirmed by the presence of a coin produced only after the year 1362. Of 223 samples from 178 individuals, only one from each site had a suitable amount of DNA and was finally selected for whole-genome sequencing of the bacillus (through a genome-capture assay, using as a reference the Y. pseudotuberculosis genome and pMT1 and pCD1 from Y. pestis).

Placement of these genomes on a Y. pestis phylogenetic tree built from previously known ancient genomes revealed greater genetic diversity outside China than was previously thought. All three new genomes map to Branch 1 and possess two SNPs associated with the Black Death (all Y. pestis genomes dated to the Black Death map to Branch 1). The Barcelona strain shows no differences from the London strain; the two individuals from whom the genomes were obtained died of plague some months apart (spring and autumn 1348), pointing to a single wave of plague with low genetic diversity in Europe. The Ellwangen strain maps to a sub-branch of Branch 1 and is ancestral to a previously sequenced strain (L'Observance). It descends from the strain circulating in London and Barcelona during the Black Death but also has additional mutations. It is therefore considered a lineage that diverged from Branch 1 before the 16th century (the Ellwangen outbreak) and has no known modern descendants.

In comparison with isolates from the Black Death, the Bolgar city strain presents:

  • p3 and p4, shared with "London individual 6330";
  • p6, shared with all modern Branch 1 strains;
  • p7, unique to this strain.

The Bolgar City strain possesses SNPs associated with the Black Death and may be evidence of a movement of plague towards the east. These findings lend credit to one of the models that try to explain the link between the 2nd and 3rd pandemics: in this scenario, there was a single entry of plague into Europe (causing the Black Death) that, after a radiation event, travelled eastward to establish itself in the former Soviet Union and Asia, from which it spread in the 18th century to give rise to the 3rd pandemic.

Another hypothesis is that the 3rd pandemic's lineage arose from pre-existing genetic variability among Y. pestis strains in China: this hypothesis is supported by the correlation between successive waves of the pandemic in Europe and climatic fluctuations that would have allowed its spread into the continent. This model cannot explain the genetic diversity of the Black Death (at least four different lineages, which would have required the introduction of four different strains from Asia).

There are likewise two models that try to explain the multiple plague outbreaks in Europe following the Black Death:

  • Repeated introduction of plague from Asia. This scenario is compatible with the second hypothesis discussed above, which posits genetic variability of Y. pestis in China;
  • Presence in Europe of a reservoir (now extinct) that caused continued outbreaks until the 18th century.

Both models may be valid, and at present neither can be demonstrated over the other. However, the Ellwangen strain genome sequenced in this study may be considered evidence for the second hypothesis, because the geographical position of the city tends to exclude an introduction of plague from the east.

Modern Y. pestis strains

Sequencing of Y. pestis genomes revealed a diversification event preceding the Black Death that gave rise to many of the strains that circulate today.

Salmonella enterica genomes analysis

During the 16th century an epidemic occurred in Mexico that caused high mortality in the Indigenous populations of the Americas. This high mortality contributed to the demographic collapse of many indigenous populations. The native Aztecs called the epidemic "cocoliztli"; its symptoms included, in particular, high fever and bleeding.

This pestilence is considered one of the worst epidemics in the history of Mexico, and its cause remained a mystery for about 500 years.

A group of scientists from Harvard and the Max Planck Institute published a study in the journal Nature Ecology & Evolution suggesting Salmonella enterica as a strong candidate for the cause of the epidemic in Mexico during the 16th century. Many studies suggest that this bacterium was introduced into the Indigenous populations by Europeans.

The scientists analyzed aDNA extracted from the teeth of 24 skeletons buried in a cemetery in the city of Teposcolula-Yucundaa and found traces of Salmonella enterica aDNA in 10 of the 24 skeletons. To test whether the bacterium was introduced to Mexico by Europeans, they also analyzed five individuals buried before the arrival of Europeans. The results revealed no evidence of Salmonella enterica in the pre-contact era.

Analysis of Salmonella enterica genomes

The scientists extracted aDNA from the teeth of the remains of 24 indigenous individuals from the contact-era epidemic cemetery and of 5 individuals buried in the pre-contact-era cemetery. The extraction followed an established protocol for aDNA extraction. In parallel, the researchers also examined a soil sample from the cemeteries to get an overview of the environmental microorganisms that could have penetrated the samples.

After the extraction, the DNA was sequenced using an Illumina genome analyzer. Then, using the bioinformatic tool MALT, the researchers analyzed the metagenomic sequence data. This program lets researchers align the extracted sequences with a reference database without specifying a precise target. The researchers ran MALT twice: once using the complete bacterial genomes available through NCBI (National Center for Biotechnology Information) RefSeq as a reference, and once using the full NCBI Nucleotide database to screen for viral DNA.

The screening was positive for Salmonella enterica DNA in 10 of the 24 samples collected, and three tooth samples had a high number of reads assigned to S. enterica. In particular, the major S. enterica strain present in the samples was S. Paratyphi C, which causes enteric fever in humans. No evidence of S. enterica was found in the pre-contact-era samples, supporting the hypothesis that S. enterica was not a local bacterium.

A further analysis was carried out to identify the classical aDNA damage pattern in the three positive tooth samples, by mapping the data sets to the S. Paratyphi C reference genome. The results were positive and supported the hypothesis that S. enterica was the cause of cocoliztli.

To confirm this hypothesis, the researchers conducted further experiments and computational analyses. They performed whole-genome targeted array and in-solution hybridization capture using probes that cover the diversity of modern S. enterica genomes, with S. Paratyphi C as the reference. The hybridization was successful for the ten positive samples, while the other samples tested negative for the ancient DNA.

 

Ancient DNA

From Wikipedia, the free encyclopedia
 
Cross-linked DNA extracted from the 4,000-year-old liver of the ancient Egyptian priest Nekht-Ankh.

Ancient DNA (aDNA) is DNA isolated from ancient specimens. Due to degradation processes (including cross-linking, deamination and fragmentation) ancient DNA is more degraded in comparison with contemporary genetic material. Even under the best preservation conditions, there is an upper boundary of 0.4–1.5 million years for a sample to contain sufficient DNA for sequencing technologies. 

Genetic material has been recovered from paleo/archaeological and historical skeletal material, mummified tissues, archival collections of non-frozen medical specimens, preserved plant remains, ice and permafrost cores, marine and lake sediments, and excavation dirt.

History of ancient DNA studies

1980s

The first study of what would come to be called aDNA was conducted in 1984, when Russ Higuchi and colleagues at the University of California, Berkeley reported that traces of DNA from a museum specimen of the Quagga not only remained in the specimen over 150 years after the death of the individual, but could be extracted and sequenced. Over the next two years, through investigations into natural and artificially mummified specimens, Svante Pääbo confirmed that this phenomenon was not limited to relatively recent museum specimens but could apparently be replicated in a range of mummified human samples that dated as far back as several thousand years.

The laborious processes that were required at that time to sequence such DNA (through bacterial cloning) were an effective brake on the development of the field of ancient DNA (aDNA). However, with the development of the Polymerase Chain Reaction (PCR) in the late 1980s, the field began to progress rapidly. Double-primer PCR amplification of aDNA (jumping-PCR) can produce highly skewed and non-authentic sequence artifacts. A multiple-primer, nested PCR strategy was used to overcome those shortcomings.

1990s

The post-PCR era heralded a wave of publications as numerous research groups claimed success in isolating aDNA. Soon a series of incredible findings had been published, claiming authentic DNA could be extracted from specimens that were millions of years old, into the realms of what Lindahl (1993b) has labelled Antediluvian DNA. The majority of such claims were based on the retrieval of DNA from organisms preserved in amber. Insects such as stingless bees, termites, and wood gnats, as well as plant and bacterial sequences, were said to have been extracted from Dominican amber dating to the Oligocene epoch. Still older Lebanese amber-encased weevils, dating to the Cretaceous period, reportedly also yielded authentic DNA. Claims of DNA retrieval were not limited to amber.

Reports of several sediment-preserved plant remains dating to the Miocene were published. Then in 1994, Woodward et al. reported what at the time was called the most exciting results to date: mitochondrial cytochrome b sequences that had apparently been extracted from dinosaur bones dating to more than 80 million years ago. When in 1995 two further studies reported dinosaur DNA sequences extracted from a Cretaceous egg, it seemed that the field would revolutionize knowledge of the Earth's evolutionary past. Even these extraordinary ages were topped by the claimed retrieval of 250-million-year-old halobacterial sequences from halite.

As the field developed a better understanding of the kinetics of DNA preservation, the risks of sample contamination and other complicating factors, along with the failure of attempts to replicate many of the findings, all of the decade's claims of multi-million year old aDNA would come to be dismissed as inauthentic results.

2000s

Single primer extension amplification was introduced in 2007 to address postmortem DNA modification damage. Since 2009 the field of aDNA studies has been revolutionized with the introduction of much cheaper research-techniques, leading to new insights in human migrations. The use of high-throughput Next Generation Sequencing (NGS) techniques in the field of ancient DNA research has been essential for reconstructing the genomes of ancient or extinct organisms. A single-stranded DNA (ssDNA) library preparation method has sparked great interest among ancient DNA (aDNA) researchers.

In addition to these technical innovations, the start of the decade saw the field begin to develop better standards and criteria for evaluating DNA results, as well as a better understanding of the potential pitfalls.

Problems and errors

Degradation processes

Due to degradation processes (including cross-linking, deamination and fragmentation) ancient DNA is of lower quality in comparison with modern genetic material. The damage characteristics and ability of aDNA to survive through time restrict possible analyses and place an upper limit on the age of successful samples. There is a theoretical correlation between time and DNA degradation, although differences in environmental conditions complicate matters. Samples subjected to different conditions are unlikely to align predictably to a uniform age-degradation relationship. Environmental effects may even matter after excavation, as DNA decay rates may increase, particularly under fluctuating storage conditions. Even under the best preservation conditions, there is an upper boundary of 0.4–1.5 million years for a sample to contain sufficient DNA for contemporary sequencing technologies.

Research into the decay of mitochondrial and nuclear DNA in moa bones has modelled mitochondrial DNA degradation to an average length of 1 base pair after 6,830,000 years at −5 °C. The decay kinetics have been measured by accelerated aging experiments, which further demonstrate the strong influence of storage temperature and humidity on DNA decay. Nuclear DNA degrades at least twice as fast as mtDNA. As such, early studies that reported recovery of much older DNA, for example from Cretaceous dinosaur remains, may have stemmed from contamination of the sample.
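The figure quoted above can be turned into a simple back-of-the-envelope model. The Python sketch below assumes random strand breakage at a constant per-site rate k, so that the mean fragment length after t years is approximately 1/(k × t); this simplification, the constant names and the function mean_fragment_length are assumptions made for illustration, and k is calibrated only from the statement that the mean length reaches 1 base pair after 6,830,000 years at −5 °C.

```python
# Calibration taken from the text: mean length of 1 bp after 6,830,000 years at -5 °C.
YEARS_TO_ONE_BASE_PAIR = 6_830_000

# Assumed model: breaks accumulate at a constant rate k per site per year.
BREAK_RATE_PER_SITE_PER_YEAR = 1.0 / YEARS_TO_ONE_BASE_PAIR


def mean_fragment_length(age_years: float) -> float:
    """Approximate mean mtDNA fragment length (bp) under the 1/(k*t) model."""
    if age_years <= 0:
        raise ValueError("age_years must be positive")
    return 1.0 / (BREAK_RATE_PER_SITE_PER_YEAR * age_years)


if __name__ == "__main__":
    for age in (10_000, 100_000, 1_000_000, 6_830_000):
        print(f"{age:>9} years: ~{mean_fragment_length(age):.1f} bp")
```

Under these assumptions the mean fragment length falls below 10 base pairs before one million years, which is far too short to sequence and map reliably. The model applies only to the −5 °C scenario; warmer conditions would shorten fragments much faster.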

Age limit

A critical review of ancient DNA literature through the development of the field highlights that few studies after about 2002 have succeeded in amplifying DNA from remains older than several hundred thousand years. A greater appreciation for the risks of environmental contamination and studies on the chemical stability of DNA have resulted in concerns being raised over previously reported results. The alleged dinosaur DNA was later revealed to be human Y-chromosome DNA, while the DNA reported from encapsulated halobacteria has been criticized based on its similarity to modern bacteria, which hints at contamination. A 2007 study also suggests that these bacterial DNA samples may not have survived from ancient times, but may instead be the product of long-term, low-level metabolic activity.

aDNA may contain a large number of postmortem mutations, increasing with time. Some regions of the polynucleotide are more susceptible to this degradation, so erroneous sequence data can bypass the statistical filters used to check the validity of data. Because of these sequencing errors, great caution should be applied to the interpretation of population size. Substitutions resulting from the deamination of cytosine residues are vastly over-represented in ancient DNA sequences. Miscoding of C to T and G to A accounts for the majority of errors.
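The over-representation of C to T and G to A miscoding can be checked by tallying mismatch types between a read and the reference it maps to. The Python sketch below is a minimal illustration on already-aligned, gap-free sequences; the function names misincorporation_counts and deamination_fraction and the toy sequences are hypothetical, and dedicated damage-profiling tools additionally report these frequencies by position along the read, which is not done here.

```python
from collections import Counter


def misincorporation_counts(reference: str, read: str) -> Counter:
    """Tally mismatch types (e.g. 'C>T') between an aligned reference and read."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be aligned to the same length")
    counts: Counter = Counter()
    bases = set("ACGT")
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        # Skip matches and any ambiguous or gap characters.
        if ref_base in bases and read_base in bases and ref_base != read_base:
            counts[f"{ref_base}>{read_base}"] += 1
    return counts


def deamination_fraction(counts: Counter) -> float:
    """Fraction of all mismatches that are C>T or G>A."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["C>T"] + counts["G>A"]) / total


if __name__ == "__main__":
    reference = "CCGGATCA"  # toy sequence, not real data
    read = "TCGAATTG"       # toy sequence, not real data
    counts = misincorporation_counts(reference, read)
    print(dict(counts), deamination_fraction(counts))
```

A fraction close to 1 is what the text above leads one to expect for authentic ancient reads, whereas modern contaminant DNA would be expected to show a more even spread of mismatch types.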

Contamination

Another problem with ancient DNA samples is contamination by modern human DNA and by microbial DNA (most of which is also ancient). New methods have emerged in recent years to prevent possible contamination of aDNA samples, including conducting extractions under extremely sterile conditions, using special adapters to identify endogenous molecules of the sample (as opposed to ones that may have been introduced during analysis), and applying bioinformatics to the resulting sequences, based on known reads, in order to approximate rates of contamination.

Non-human aDNA

Despite the problems associated with 'antediluvian' DNA, a wide and ever-increasing range of aDNA sequences have now been published from a range of animal and plant taxa. Tissues examined include artificially or naturally mummified animal remains, bone, paleofaeces, alcohol preserved specimens, rodent middens, dried plant remains, and recently, extractions of animal and plant DNA directly from soil samples.

In June 2013, a group of researchers including Eske Willerslev, Marcus Thomas Pius Gilbert and Ludovic Orlando of the Centre for Geogenetics, Natural History Museum of Denmark at the University of Copenhagen, announced that they had sequenced the DNA of a 560–780 thousand-year-old horse, using material extracted from a leg bone found buried in permafrost in Canada's Yukon territory.

In 2013, a German team reconstructed the mitochondrial genome of an Ursus deningeri more than 300,000 years old, proving that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost.

Researchers in 2016 measured chloroplast DNA in marine sediment cores and found diatom DNA dating back 1.4 million years. This DNA had a half-life of up to 15,000 years, significantly longer than found in previous research. The team, led by Kirkpatrick, also found that DNA decayed at a half-life rate only until about 100 thousand years, after which it followed a slower, power-law decay rate.
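To give a sense of what a 15,000-year half-life implies, the following Python sketch computes the fraction of DNA remaining under simple exponential decay. It is only an illustration: the function name fraction_remaining is hypothetical, and the slower power-law phase reported beyond about 100 thousand years is not modelled, because the text gives no parameters for it.

```python
def fraction_remaining(age_years: float, half_life_years: float = 15_000.0) -> float:
    """Fraction of the original DNA left after age_years of exponential decay."""
    if age_years < 0 or half_life_years <= 0:
        raise ValueError("age must be non-negative and half-life positive")
    return 0.5 ** (age_years / half_life_years)


if __name__ == "__main__":
    # Only ages within the exponential phase described in the text are shown.
    for age in (15_000, 30_000, 60_000, 100_000):
        print(f"{age:>7} years: {fraction_remaining(age):.4f} of the original DNA")
```

Under this model less than 1% of the original DNA is left after 100 thousand years, so DNA surviving to 1.4 million years is only plausible if decay slows afterwards, as the reported power-law phase suggests.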

Human aDNA

Due to the considerable anthropological, archaeological, and public interest directed toward human remains, they have received considerable attention from the DNA community. There are also more profound contamination issues, since the specimens belong to the same species as the researchers collecting and evaluating the samples.

Sources

Due to the morphological preservation in mummies, many studies from the 1990s and 2000s used mummified tissue as a source of ancient human DNA. Examples include naturally preserved specimens, such as those preserved in ice (Ötzi the Iceman) or through rapid desiccation (high-altitude mummies from the Andes), as well as various sources of artificially preserved tissue (such as the chemically treated mummies of ancient Egypt). However, mummified remains are a limited resource. The majority of human aDNA studies have focused on extracting DNA from two sources that are much more common in the archaeological record: bone and teeth. The bone most often used for DNA extraction is the petrous bone, since its dense structure provides good conditions for DNA preservation. Several other sources have also yielded DNA, including paleofaeces and hair. Contamination remains a major problem when working on ancient human material.

Ancient pathogen DNA has been successfully retrieved from samples dating to more than 5,000 years old in humans and as long as 17,000 years ago in other species. In addition to the usual sources of mummified tissue, bones and teeth, such studies have also examined a range of other tissue samples, including calcified pleura, tissue embedded in paraffin, and formalin-fixed tissue. Efficient computational tools have been developed for pathogen and microorganism aDNA analyses at small (QIIME) and large (FALCON) scales.

Results

Taking preventative measures against such contamination in its procedure, a 2012 study analyzed bone samples of a Neanderthal group from the El Sidrón cave and drew new insights on potential kinship and genetic diversity from the aDNA. In November 2015, scientists reported finding a 110,000-year-old tooth containing DNA from the Denisovan hominin, an extinct species of human in the genus Homo.

The research has added new complexity to the peopling of Eurasia. It has also revealed new information about links between the ancestors of Central Asians and the indigenous peoples of the Americas. In Africa, older DNA degrades quickly due to the warmer tropical climate, although in September 2017 ancient DNA samples as old as 8,100 years were reported.

Moreover, ancient DNA has helped researchers to estimate modern human divergence. By sequencing African genomes from three Stone Age hunter-gatherers (2000 years old) and four Iron Age farmers (300 to 500 years old), Schlebusch and colleagues were able to push back the date of the earliest divergence between human populations to 350,000 to 260,000 years ago.

Archaeogenetics

From Wikipedia, the free encyclopedia

Archaeogenetics is the study of ancient DNA using various molecular genetic methods and DNA resources. This form of genetic analysis can be applied to human, animal, and plant specimens. Ancient DNA can be extracted from various fossilized specimens including bones, eggshells, and artificially preserved tissues in human and animal specimens. In plants, ancient DNA can be extracted from seeds, tissue, and in some cases, feces. Archaeogenetics provides genetic evidence of ancient population group migrations, domestication events, and plant and animal evolution. Cross-referencing ancient DNA with the DNA of related modern populations allows researchers to run comparison studies that provide a more complete analysis when the ancient DNA is compromised.

Archaeogenetics receives its name from the Greek word arkhaios, meaning "ancient", and the term genetics, meaning "the study of heredity". The term archaeogenetics was conceived by archaeologist Colin Renfrew.

Early work

Ludwik Hirszfeld (1884–1954)

Ludwik Hirszfeld was a Polish microbiologist and serologist who was the President of the Blood Group Section of the Second International Congress of Blood Transfusion. He established the inheritance of blood groups with Erich von Dungern in 1910, and contributed to the field greatly throughout his life. He studied ABO blood groups. In one of his studies in 1919, Hirszfeld documented the ABO blood groups and hair color of people at the Macedonian front, leading to his discovery that hair color and blood type were not correlated. He also observed that the frequency of blood group A decreased from western Europe to India, while that of blood group B increased. He hypothesized that the east-to-west blood group ratio stemmed from two populations, consisting mainly of group A or group B, that arose by mutation from blood group O and mixed through migration or intermingling. Much of his work investigated the links between blood types and sex, disease, climate, age, social class, and race. His work led him to discover that peptic ulcer was more common in blood group O, and that mothers with AB blood type had a high male-to-female birth ratio.

Arthur Mourant (1904–1994)

Arthur Mourant was a British hematologist and chemist. He received many awards, most notably Fellowship of the Royal Society. His work included organizing the existing data on blood group gene frequencies, and largely contributing to the genetic map of the world through his investigation of blood groups in many populations. Mourant discovered the new blood group antigens of the Lewis, Henshaw, Kell, and Rhesus systems, and analyzed the association of blood groups with various diseases. He also focused on the biological significance of polymorphisms. His work provided the foundation for archaeogenetics because it facilitated the separation of genetic evidence for biological relationships between people. This genetic evidence was previously used for that purpose. It also provided material that could be used to appraise the theories of population genetics.

William Boyd (1903–1983)

William Boyd was an American immunochemist and biochemist who became famous for his research on the genetics of race in the 1950s. During the 1940s, Boyd and Karl O. Renkonen independently discovered that lectins react differently to various blood types, after finding that the crude extracts of the lima bean and tufted vetch agglutinated the red blood cells from blood type A but not blood types B or O. This ultimately led to the identification of thousands of plants that contained these proteins. In order to examine racial differences and the distribution and migration patterns of various racial groups, Boyd systematically collected and classified blood samples from around the world, leading to his discovery that blood groups are inherited and are not influenced by the environment. In his book Genetics and the Races of Man (1950), Boyd categorized the world population into 13 distinct races, based on their different blood type profiles and his idea that human races are populations with differing alleles. The study of blood groups remains one of the most abundant sources of information regarding inheritable traits linked to race.

Methods

Fossil DNA preservation

Fossil retrieval starts with selecting an excavation site. Potential excavation sites are usually identified from the mineralogy of the location and visual detection of bones in the area. However, excavation zones can also be discovered using technology such as field-portable X-ray fluorescence and Dense Stereo Reconstruction. Tools used include knives, brushes, and pointed trowels, which assist in the removal of fossils from the earth.

To avoid contaminating the ancient DNA, specimens are handled with gloves and stored at -20 °C immediately after being unearthed. Analyzing the fossil sample in a lab that has not been used for other DNA analysis can also help prevent contamination. Bones are milled to a powder and treated with a solution before the polymerase chain reaction (PCR) process. Samples for DNA amplification need not be fossil bones; preserved skin, whether salt-preserved or air-dried, can also be used in certain situations.

DNA preservation is difficult because bone degrades during fossilisation and the DNA is chemically modified, usually by bacteria and fungi in the soil. The best time to extract DNA from a fossil is when it is freshly out of the ground, as it then contains six times as much DNA as stored bones. The temperature of the extraction site also affects the amount of obtainable DNA, as shown by a lower success rate for DNA amplification when the fossil is found in warmer regions. A drastic change in a fossil's environment also affects DNA preservation. Since excavation causes an abrupt change in the fossil's environment, it may lead to physicochemical changes in the DNA molecule. Moreover, DNA preservation is affected by other factors such as the treatment of the unearthed fossil (e.g. washing, brushing and sun drying), pH, irradiation, the chemical composition of bone and soil, and hydrology. There are three diagenetic phases of preservation. The first phase is bacterial putrefaction, which is estimated to cause a 15-fold degradation of DNA. The second phase is when bone chemically degrades, mostly by depurination. The third phase occurs after the fossil is excavated and stored, and is the phase in which bone DNA degradation occurs most rapidly.

Methods of DNA extraction

Once a specimen is collected from an archaeological site, DNA can be extracted through a series of processes. One of the more common methods utilizes silica and takes advantage of polymerase chain reactions in order to collect ancient DNA from bone samples.

Several challenges make it difficult to extract ancient DNA from fossils and prepare it for analysis. DNA is continuously being broken. While the organism is alive these breaks are repaired; however, once an organism has died, the DNA begins to deteriorate without repair. This results in samples having strands of DNA measuring around 100 base pairs in length. Contamination is another significant challenge at multiple steps throughout the process. Often other DNA, such as bacterial DNA, will be present in the original sample. To avoid contamination it is necessary to take many precautions, such as separate ventilation systems and workspaces for ancient DNA extraction work. The best samples to use are fresh fossils, as careless washing can lead to mold growth. DNA coming from fossils also occasionally contains a compound that inhibits DNA replication. Coming to a consensus on which methods are best at mitigating these challenges is also difficult because the uniqueness of specimens limits repeatability.

Silica-based DNA extraction is a method used as a purification step to extract DNA from archaeological bone artifacts and yield DNA that can be amplified using polymerase chain reaction (PCR) techniques. This process works by using silica to bind DNA and separate it from other components of the fossil that inhibit PCR amplification. However, silica itself is also a strong PCR inhibitor, so careful measures must be taken to ensure that silica is removed from the DNA after extraction. The general process for extracting DNA using the silica-based method is outlined as follows:

  1. Bone specimen is cleaned and the outer layer is scraped off
  2. Sample is collected from preferably compact section
  3. Sample is ground to fine powder and added to an extraction solution to release DNA
  4. Silica solution is added and centrifuged to facilitate DNA binding
  5. Binding solution is removed and a buffer is added to the solution to release the DNA from the silica

One of the main advantages of silica-based DNA extraction is that it is relatively quick and efficient, requiring only a basic laboratory setup and chemicals. It is also independent of sample size, as the process can be scaled to accommodate larger or smaller quantities. Another benefit is that the process can be executed at room temperature. However, this method does have some drawbacks. Mainly, silica-based DNA extraction can only be applied to bone and teeth samples; it cannot be used on soft tissue. While it works well with a variety of different fossils, it may be less effective in fossils that are not fresh (e.g. fossils treated for museums). Also, contamination poses a risk for all DNA replication in general, and this method may give misleading results if applied to contaminated material.

Polymerase chain reaction is a process that can amplify segments of DNA and is often used on extracted ancient DNA. It has three main steps: denaturation, annealing, and extension. Denaturation splits the DNA into two single strands at high temperatures. Annealing involves attaching primer strands of DNA to the single strands, which allows Taq polymerase to attach to the DNA. Extension occurs when Taq polymerase is added to the sample and matches base pairs to turn the two single strands into two complete double strands. This process is repeated many times, and is usually repeated a higher number of times when used with ancient DNA. One issue with PCR is that it requires overlapping primer pairs for ancient DNA because of the short sequences. There can also be “jumping PCR”, which causes recombination during the PCR process and can make analyzing the DNA more difficult in inhomogeneous samples.
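
The effect of repeating the denaturation, annealing, and extension cycle can be illustrated with simple arithmetic. The following is a minimal sketch rather than a laboratory model: the function name pcr_copies and the efficiency parameter (the fraction of templates successfully copied in each cycle) are assumptions introduced here for illustration only.

```python
def pcr_copies(initial_templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealised number of DNA copies after a number of PCR cycles.

    Each cycle multiplies the number of copies by (1 + efficiency);
    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    This ignores reagent depletion, inhibitors, and template damage.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare perfect doubling with a hypothetical lower efficiency.
    for n in (10, 20, 30, 40):
        print(n, pcr_copies(10, n), pcr_copies(10, n, efficiency=0.7))
```

Under these assumptions, a sample that starts with few intact templates or amplifies inefficiently needs more cycles to reach the same yield, which is consistent with the higher cycle counts used for ancient DNA.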

Methods of DNA analysis

DNA extracted from fossil remains is primarily sequenced using massive parallel sequencing, which allows simultaneous amplification and sequencing of all DNA segments in a sample, even when it is highly fragmented and of low concentration. It involves attaching a generic sequence to every single strand that generic primers can bond to, so that all of the DNA present is amplified. Although this is generally more costly and time intensive than PCR, the difficulties involved in ancient DNA amplification make it the cheaper and more efficient option for aDNA. One method of massive parallel sequencing, developed by Margulies et al., employs bead-based emulsion PCR and pyrosequencing, and was found to be powerful in analyses of aDNA because it avoids potential loss of sample, substrate competition for templates, and error propagation in replication.

The most common way to analyze an aDNA sequence is to compare it with a known sequence from other sources, and this can be done in different ways for different purposes.

The identity of fossil remains can be uncovered by comparing their DNA sequence with those of known species using software such as BLASTN. This archaeogenetic approach is especially helpful when the morphology of the fossil is ambiguous. Apart from that, species identification can also be done by finding specific genetic markers in an aDNA sequence. For example, the American indigenous population is characterized by specific mitochondrial RFLPs and deletions defined by Wallace et al.
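
The idea of identifying a sample by comparing it against known sequences can be sketched in a few lines. This is a toy illustration and not how BLASTN works internally: it ranks reference sequences by the proportion of short subsequences (k-mers) they share with the query. The function names, the choice of k = 4, and the sequences in the usage example are all hypothetical.

```python
def kmers(seq: str, k: int = 4) -> set:
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq_a: str, seq_b: str, k: int = 4) -> float:
    """Shared k-mers divided by all distinct k-mers of the two sequences."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(query: str, references: dict, k: int = 4) -> str:
    """Name of the reference sequence most similar to the query."""
    return max(references, key=lambda name: jaccard(query, references[name], k))


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    references = {
        "species_a": "ACGTACGTTAGC",
        "species_b": "TTTTGGGGCCCC",
    }
    print(best_match("ACGTACGTTAGG", references))
```

A real analysis would rely on an alignment tool and curated reference databases, and would also have to account for the damage and contamination typical of ancient samples.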

aDNA comparison studies can also reveal the evolutionary relationship between two species. The number of base differences between the DNA of an ancient species and that of a closely related extant species can be used to estimate the divergence time of those two species from their last common ancestor. The phylogeny of some extinct species, such as Australian marsupial wolves and American ground sloths, has been constructed by this method. Mitochondrial DNA in animals and chloroplast DNA in plants are usually used for this purpose because they have hundreds of copies per cell and thus are more easily accessible in ancient fossils.
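
The link between base differences and divergence time can be sketched with a simple molecular clock. The sketch below assumes two already aligned sequences of equal length and a constant substitution rate, and it ignores both multiple substitutions at the same site and the age of the ancient sample; the function names and the rate used in the example are illustrative assumptions, not values from any study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an unknown base ('N')
    are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def divergence_time(distance: float, rate_per_site_per_year: float) -> float:
    """Years since the last common ancestor under a strict molecular clock.

    Both lineages accumulate changes after the split, hence the factor 2.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    d = p_distance("ACGTACGTAC", "ACGTTCGTAA")  # toy sequences
    print(d, divergence_time(d, rate_per_site_per_year=1e-8))  # assumed rate
```

Published estimates generally use substitution models that correct for repeated changes at a site and calibrate the rate against dated samples, so this sketch conveys only the underlying reasoning.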

Another method to investigate the relationship between two species is DNA hybridization. Single-stranded DNA segments of both species are allowed to form complementary base pairs with each other. More closely related species have a more similar genetic makeup, and thus a stronger hybridization signal. Scholz et al. conducted Southern blot hybridization on Neanderthal aDNA (extracted from fossil remains W-NW and Krapina). The results showed weak ancient human-Neanderthal hybridization and strong ancient human-modern human hybridization. The human-chimpanzee and Neanderthal-chimpanzee hybridizations are of similarly weak strength. This suggests that humans and Neanderthals are not as closely related as two individuals of the same species are, but they are more related to each other than to chimpanzees.

There have also been some attempts to decipher aDNA to provide valuable phenotypic information about ancient species. This is always done by mapping the aDNA sequence onto the karyotype of a well-studied, closely related species that shares many similar phenotypic traits. For example, Green et al. compared the aDNA sequence from the Neanderthal Vi-80 fossil with modern human X and Y chromosome sequences, and they found a similarity in 2.18 and 1.62 bases per 10,000 respectively, suggesting the Vi-80 sample was from a male individual. Other similar studies include the finding of a mutation associated with dwarfism in Arabidopsis in ancient Nubian cotton, and an investigation of the bitter taste perception locus in Neanderthals.

Applications

Human archaeology

Africa

Modern humans are thought to have evolved in Africa at least 200 kya (thousand years ago), with some evidence suggesting a date of over 300 kya. Examination of mitochondrial DNA (mtDNA), Y-chromosome DNA, and X-chromosome DNA indicates that the earliest population to leave Africa consisted of approximately 1500 males and females. Various studies have suggested that populations were geographically “structured” to some degree prior to the expansion out of Africa; this is suggested by the antiquity of shared mtDNA lineages. One study of 121 populations from various places throughout the continent found 14 genetic and linguistic “clusters,” suggesting an ancient geographic structure to African populations. In general, genotypic and phenotypic analyses have shown African populations to be “large and subdivided throughout much of their evolutionary history.”

Genetic analysis has supported archaeological hypotheses of large-scale migrations of Bantu speakers into Southern Africa approximately 5 kya. Microsatellite DNA, single nucleotide polymorphisms (SNPs), and insertion/deletion polymorphisms (INDELS) have shown that Nilo-Saharan speaking populations originate from Sudan. Furthermore, there is genetic evidence that Chad-speaking descendants of Nilo-Saharan speakers migrated from Sudan to Lake Chad about 8 kya. Genetic evidence has also indicated that non-African populations made significant contributions to the African gene pool. For example, the Saharan African Beja people have high levels of Middle-Eastern as well as East African Cushitic DNA.

Europe

Analysis of mtDNA shows that Eurasia was occupied in a single migratory event between 60 and 70 kya. Genetic evidence shows that occupation of the Near East and Europe happened no earlier than 50 kya. Studying haplogroup U has shown separate dispersals from the Near East both into Europe and into North Africa.

Much of the work done in archaeogenetics focuses on the Neolithic transition in Europe. Cavalli-Sforza's analysis of genetic-geographic patterns led him to conclude that there was a massive influx of Near Eastern populations into Europe at the start of the Neolithic. This view led him “to strongly emphasize the expanding early farmers at the expense of the indigenous Mesolithic foraging populations.” mtDNA analysis in the 1990s, however, contradicted this view. M.B. Richards estimated that 10–22% of extant European mtDNAs had come from Near Eastern populations during the Neolithic. Most mtDNAs were “already established” among existing Mesolithic and Paleolithic groups. Most “control-region lineages” of modern European mtDNA are traced to a founder event of reoccupying northern Europe towards the end of the Last Glacial Maximum (LGM). One study of extant European mtDNAs suggests this reoccupation occurred after the end of the LGM, although another suggests it occurred before. Analysis of haplogroups V, H, and U5 supports a “pioneer colonization” model of European occupation, with incorporation of foraging populations into arriving Neolithic populations. Furthermore, analysis of ancient DNA, not just extant DNA, is shedding light on some issues. For instance, comparison of Neolithic and Mesolithic DNA has indicated that the development of dairying preceded widespread lactose tolerance.

South Asia

South Asia has served as the major early corridor for the geographical dispersal of modern humans out of Africa. Based on studies of mtDNA line M, some have suggested that the first occupants of India were Austro-Asiatic speakers who entered about 45–60 kya. The Indian gene pool has contributions from the earliest settlers, as well as West Asian and Central Asian populations from migrations no earlier than 8 kya. The lack of variation in mtDNA lineages compared to the Y-chromosome lineages indicates that primarily males partook in these migrations. The discovery of two subbranches, U2i and U2e, of the U mtDNA lineage, which arose in Central Asia, has “modulated” views of a large migration from Central Asia into India, as the two branches diverged 50 kya. Furthermore, U2e is found in large percentages in Europe but not India, and vice versa for U2i, implying U2i is native to India.

East Asia

Analysis of mtDNA and NRY (non-recombining region of Y chromosome) sequences has indicated that the first major dispersal out of Africa went through Saudi Arabia and the Indian coast 50–100 kya, and a second major dispersal occurred 15–50 kya north of the Himalayas.

Much work has been done to discover the extent of north-to-south and south-to-north migrations within Eastern Asia. Comparing the genetic diversity of northeastern groups with southeastern groups has allowed archaeologists to conclude that many of the northeast Asian groups came from the southeast. The Pan-Asian SNP (single nucleotide polymorphism) study found “a strong and highly significant correlation between haplotype diversity and latitude,” which, when coupled with demographic analysis, supports the case for a primarily south-to-north occupation of East Asia. Archaeogenetics has also been used to study hunter-gatherer populations in the region, such as the Ainu from Japan and Negrito groups in the Philippines. For example, the Pan-Asian SNP study found that Negrito populations in Malaysia and the Negrito populations in the Philippines were more closely related to non-Negrito local populations than to each other, suggesting Negrito and non-Negrito populations are linked by one entry event into East Asia, although other Negrito groups do share affinities, including with Australian Aborigines. A possible explanation of this is a recent admixture of some Negrito groups with their local populations.

Americas

Archaeogenetics has been used to better understand the populating of the Americas from Asia. Native American mtDNA haplogroups have been estimated to be between 15 and 20 thousand years old, although there is some variation in these estimates. Genetic data have been used to propose various theories regarding how the Americas were colonized. Although the most widely held theory suggests “three waves” of migration after the LGM through the Bering Strait, genetic data have given rise to alternative hypotheses. For example, one hypothesis proposes a migration from Siberia to South America 20–15 kya and a second migration that occurred after glacial recession. Y-chromosome data have led some to hold that there was a single migration starting from the Altai Mountains of Siberia between 17.2 and 10.1 kya, after the LGM. Analysis of both mtDNA and Y-chromosome DNA reveals evidence of “small, founding populations.” Studying haplogroups has led some scientists to conclude that a southern migration into the Americas from one small population was impossible, although separate analysis has found that such a model is feasible if such a migration happened along the coasts.

Australia and New Guinea

Finally, archaeogenetics has been used to study the occupation of Australia and New Guinea. The aborigines of Australia and New Guinea are phenotypically very similar, but mtDNA has shown that this is due to convergence from living in similar conditions. Non-coding regions of mtDNA have shown “no similarities” between the aboriginal populations of Australia and New Guinea. Furthermore, no major NRY lineages are shared between the two populations. The high frequency of a single NRY lineage unique to Australia, coupled with “low diversity of lineage-associated Y-chromosomal short tandem repeat (Y-STR) haplotypes”, provides evidence for a “recent founder or bottleneck” event in Australia. However, there is relatively large variation in mtDNA, which would imply that the bottleneck effect impacted males primarily. Together, NRY and mtDNA studies show that the splitting event between the two groups was over 50 kya, casting doubt on recent common ancestry between the two.

Plants and animals

Archaeogenetics has been used to understand the development of domestication of plants and animals.

Domestication of plants

The combination of genetics and archeological findings has been used to trace the earliest signs of plant domestication around the world. However, since the nuclear, mitochondrial, and chloroplast genomes used to trace domestication's moment of origin have evolved at different rates, their use to trace genealogy has been somewhat problematic. Nuclear DNA in particular is used over mitochondrial and chloroplast DNA because of its faster mutation rate as well as its intraspecific variation due to a higher consistency of polymorphic genetic markers. Findings in crop ‘domestication genes’ (traits that were specifically selected for or against) include:

  • tb1 (teosinte branched1) – affecting the apical dominance in maize
  • tga1 (teosinte glume architecture1) – making maize kernels compatible for the convenience of humans 
  • te1 (Terminal ear1) – affecting the weight of kernels
  • fw2.2 – affecting the weight in tomatoes
  • BoCal – inflorescence of broccoli and cauliflower

Through the study of archaeogenetics in plant domestication, signs of the first global economy can also be uncovered. The geographical distribution of new crops highly selected in one region but found in another, where they would not originally have been introduced, serves as evidence of a trading network for the production and consumption of readily available resources.

Domestication of animals

Archaeogenetics has been used to study the domestication of animals. By analyzing genetic diversity in domesticated animal populations, researchers can search for genetic markers in DNA that give valuable insight into possible traits of progenitor species. These traits are then used to help distinguish wild from domesticated specimens among archaeological remains. The genetic studies can also lead to the identification of ancestors of domesticated animals. The information gained from genetic studies on current populations helps guide the archaeologist's search to document these ancestors.

Archaeogenetics has been used to trace the domestication of pigs throughout the Old World. These studies also reveal evidence about the details of early farmers. Methods of archaeogenetics have also been used to further understand the development of the domestication of dogs. Genetic studies have shown that all dogs are descendants of the gray wolf; however, it is currently unknown when, where, and how many times dogs were domesticated. Some genetic studies have indicated multiple domestications while others have not. Archaeological findings help to clarify this complicated past by providing solid evidence about the progression of the domestication of dogs. As early humans domesticated dogs, the archaeological remains of buried dogs became increasingly abundant. Not only does this provide more opportunities for archaeologists to study the remains, it also provides clues about early human culture.

Operator (computer programming)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Operator_(computer_programmin...