
Saturday, March 7, 2020

Genomics

From Wikipedia, the free encyclopedia

Genomics is an interdisciplinary field of biology focusing on the structure, function, evolution, mapping, and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of all of an organism's genes, their interrelations and influence on the organism. Genes may direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues as well as control chemical reactions and carry signals between cells. Genomics also involves the sequencing and analysis of genomes through the use of high-throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes. Advances in genomics have triggered a revolution in discovery-based research and systems biology to facilitate understanding of even the most complex biological systems such as the brain.

The field also includes studies of intragenomic (within the genome) phenomena such as epistasis (effect of one gene on another), pleiotropy (one gene affecting more than one trait), heterosis (hybrid vigour), and other interactions between loci and alleles within the genome.

History

Etymology

From the Greek ΓΕΝ gen, "gene" (gamma, epsilon, nu), meaning "become, create, creation, birth", and subsequent variants: genealogy, genesis, genetics, genic, genomere, genotype, genus etc. While the word genome (from the German Genom, attributed to Hans Winkler) was in use in English as early as 1926, the term genomics was coined by Tom Roderick, a geneticist at the Jackson Laboratory (Bar Harbor, Maine), over beer at a meeting held in Maryland on the mapping of the human genome in 1986.

Early sequencing efforts

Following Rosalind Franklin's confirmation of the helical structure of DNA, James D. Watson and Francis Crick's publication of the structure of DNA in 1953 and Fred Sanger's publication of the amino acid sequence of insulin in 1955, nucleic acid sequencing became a major target of early molecular biologists. In 1964, Robert W. Holley and colleagues published the first nucleic acid sequence ever determined, the ribonucleotide sequence of alanine transfer RNA. Extending this work, Marshall Nirenberg and Philip Leder revealed the triplet nature of the genetic code and were able to determine the sequences of 54 out of 64 codons in their experiments. In 1972, Walter Fiers and his team at the Laboratory of Molecular Biology of the University of Ghent (Ghent, Belgium) were the first to determine the sequence of a gene: the gene for bacteriophage MS2 coat protein. Fiers' group expanded on their MS2 coat protein work, determining the complete nucleotide sequence of bacteriophage MS2 RNA (whose genome encodes just four genes in 3,569 base pairs [bp]) and of Simian virus 40 in 1976 and 1978, respectively.

DNA-sequencing technology developed

Frederick Sanger and Walter Gilbert shared half of the 1980 Nobel Prize in Chemistry for independently developing methods for the sequencing of DNA.

In addition to his seminal work on the amino acid sequence of insulin, Frederick Sanger and his colleagues played a key role in the development of DNA sequencing techniques that enabled the establishment of comprehensive genome sequencing projects. In 1975, he and Alan Coulson published a sequencing procedure using DNA polymerase with radiolabelled nucleotides that he called the Plus and Minus technique. This involved two closely related methods that generated short oligonucleotides with defined 3' termini. These could be fractionated by electrophoresis on a polyacrylamide gel (called polyacrylamide gel electrophoresis) and visualised using autoradiography. The procedure could sequence up to 80 nucleotides in one go and was a big improvement, but was still very laborious. Nevertheless, in 1977 his group was able to sequence most of the 5,386 nucleotides of the single-stranded bacteriophage φX174, completing the first fully sequenced DNA-based genome. The refinement of the Plus and Minus method resulted in the chain-termination, or Sanger, method, which formed the basis of the techniques of DNA sequencing, genome mapping, data storage, and bioinformatic analysis most widely used in the following quarter-century of research. In the same year Walter Gilbert and Allan Maxam of Harvard University independently developed the Maxam-Gilbert method (also known as the chemical method) of DNA sequencing, involving the preferential cleavage of DNA at known bases, a less efficient method. For their groundbreaking work in the sequencing of nucleic acids, Gilbert and Sanger shared half of the 1980 Nobel Prize in Chemistry; the other half went to Paul Berg for his work on recombinant DNA.

Complete genomes

The advent of these technologies resulted in a rapid intensification in the scope and speed of completion of genome sequencing projects. The first complete genome sequence of a eukaryotic organelle, the human mitochondrion (16,568 bp, about 16.6 kb [kilobase]), was reported in 1981, and the first chloroplast genomes followed in 1986. In 1992, the first eukaryotic chromosome, chromosome III of the brewer's yeast Saccharomyces cerevisiae (315 kb), was sequenced. The first free-living organism to have its genome sequenced was Haemophilus influenzae (1.8 Mb [megabase]) in 1995. The following year a consortium of researchers from laboratories across North America, Europe, and Japan announced the completion of the first complete genome sequence of a eukaryote, S. cerevisiae (12.1 Mb), and since then genomes have continued to be sequenced at an exponentially growing pace. As of October 2011, complete sequences were available for 2,719 viruses, 1,115 archaea and bacteria, and 36 eukaryotes, of which about half are fungi.

"Hockey stick" graph showing the exponential growth of public sequence databases.
The number of genome projects has increased as technological improvements continue to lower the cost of sequencing. (A) Exponential growth of genome sequence databases since 1995. (B) The cost in US Dollars (USD) to sequence one million bases. (C) The cost in USD to sequence a 3,000 Mb (human-sized) genome on a log-transformed scale.

Most of the microorganisms whose genomes have been completely sequenced are problematic pathogens, such as Haemophilus influenzae, which has resulted in a pronounced bias in their phylogenetic distribution compared to the breadth of microbial diversity. Of the other sequenced species, most were chosen because they were well-studied model organisms or promised to become good models. Yeast (Saccharomyces cerevisiae) has long been an important model organism for the eukaryotic cell, while the fruit fly Drosophila melanogaster has been a very important tool (notably in early pre-molecular genetics). The worm Caenorhabditis elegans is an often used simple model for multicellular organisms. The zebrafish Brachydanio rerio is used for many developmental studies on the molecular level, and the plant Arabidopsis thaliana is a model organism for flowering plants. The Japanese pufferfish (Takifugu rubripes) and the spotted green pufferfish (Tetraodon nigroviridis) are interesting because of their small and compact genomes, which contain very little noncoding DNA compared to most species. The mammals dog (Canis familiaris), brown rat (Rattus norvegicus), mouse (Mus musculus), and chimpanzee (Pan troglodytes) are all important model animals in medical research.

A rough draft of the human genome was completed by the Human Genome Project in early 2001, creating much fanfare. This project, completed in 2003, sequenced the entire genome for one specific person, and by 2007 this sequence was declared "finished" (less than one error in 20,000 bases and all chromosomes assembled). In the years since then, the genomes of many other individuals have been sequenced, partly under the auspices of the 1000 Genomes Project, which announced the sequencing of 1,092 genomes in October 2012. Completion of this project was made possible by the development of dramatically more efficient sequencing technologies and required the commitment of significant bioinformatics resources from a large international collaboration. The continued analysis of human genomic data has profound political and social repercussions for human societies.

The "omics" revolution

General schema showing the relationships of the genome, transcriptome, proteome, and metabolome (lipidome).

The English-language neologism omics informally refers to a field of study in biology ending in -omics, such as genomics, proteomics or metabolomics. The related suffix -ome is used to address the objects of study of such fields, such as the genome, proteome or metabolome respectively. The suffix -ome as used in molecular biology refers to a totality of some sort; similarly, omics has come to refer generally to the study of large, comprehensive biological data sets. While the growth in the use of the term has led some scientists (Jonathan Eisen, among others) to claim that it has been oversold, it reflects the change in orientation towards the quantitative analysis of the complete or near-complete assortment of all the constituents of a system. In the study of symbioses, for example, researchers who were once limited to the study of a single gene product can now simultaneously compare the total complement of several types of biological molecules.

Genome analysis

After an organism has been selected, genome projects involve three components: the sequencing of DNA, the assembly of that sequence to create a representation of the original chromosome, and the annotation and analysis of that representation.

Overview of a genome project. First, the genome must be selected, which involves several factors including cost and relevance. Second, the sequence is generated and assembled at a given sequencing center (such as BGI or DOE JGI). Third, the genome sequence is annotated at several levels: DNA, protein, gene pathways, or comparatively.

Sequencing

Historically, sequencing was done in sequencing centers, centralized facilities (ranging from large independent institutions such as the Joint Genome Institute, which sequences dozens of terabases a year, to local molecular biology core facilities) that house research laboratories with the costly instrumentation and technical support necessary. As sequencing technology continues to improve, however, a new generation of effective, fast-turnaround benchtop sequencers has come within reach of the average academic laboratory. On the whole, genome sequencing approaches fall into two broad categories, shotgun and high-throughput (or next-generation) sequencing.

Shotgun sequencing

An ABI PRISM 3100 Genetic Analyzer. Such capillary sequencers automated early large-scale genome sequencing efforts.

Shotgun sequencing is a sequencing method designed for analysis of DNA sequences longer than 1000 base pairs, up to and including entire chromosomes. It is named by analogy with the rapidly expanding, quasi-random firing pattern of a shotgun. Since gel electrophoresis sequencing can only be used for fairly short sequences (100 to 1000 base pairs), longer DNA sequences must be broken into random small segments which are then sequenced to obtain reads. Multiple overlapping reads for the target DNA are obtained by performing several rounds of this fragmentation and sequencing. Computer programs then use the overlapping ends of different reads to assemble them into a continuous sequence. Shotgun sequencing is a random sampling process, requiring over-sampling to ensure a given nucleotide is represented in the reconstructed sequence; the average number of reads by which a genome is over-sampled is referred to as coverage.
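
To make the over-sampling idea concrete, here is a minimal sketch (not from the article; the read counts and genome size are hypothetical) of the standard coverage calculation, in which coverage is simply the total number of sequenced bases divided by the genome length.

```python
# Minimal sketch of shotgun over-sampling: coverage is the expected number of
# reads spanning any given base, i.e. (reads x read length) / genome length.
def coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Average depth at which each base of the genome is sampled."""
    return num_reads * read_length / genome_length

# Hypothetical example: 60,000 reads of 500 bp over a 1.8 Mb bacterial genome.
print(round(coverage(60_000, 500, 1_800_000), 1))  # ~16.7x coverage
```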

For much of its history, the technology underlying shotgun sequencing was the classical chain-termination method or 'Sanger method', which is based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication. Recently, shotgun sequencing has been supplanted by high-throughput sequencing methods, especially for large-scale, automated genome analyses. However, the Sanger method remains in wide use, primarily for smaller-scale projects and for obtaining especially long contiguous DNA sequence reads (>500 nucleotides). Chain-termination methods require a single-stranded DNA template, a DNA primer, a DNA polymerase, normal deoxynucleoside triphosphates (dNTPs), and modified nucleotides (dideoxynucleotide triphosphates, or ddNTPs) that terminate DNA strand elongation. These chain-terminating nucleotides lack a 3'-OH group required for the formation of a phosphodiester bond between two nucleotides, causing DNA polymerase to cease extension of DNA when a ddNTP is incorporated. The ddNTPs may be radioactively or fluorescently labelled for detection in DNA sequencers. Typically, these machines can sequence up to 96 DNA samples in a single batch (run) in up to 48 runs a day.
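
As a toy illustration of the chain-termination idea (not the actual laboratory protocol, and ignoring that the synthesized strand is the complement of the template), the sketch below generates the fragment lengths produced by each ddNTP and then "reads the gel" from shortest fragment to longest.

```python
# Toy model of chain termination: a ddNTP for base B produces one truncated
# fragment for every position where B occurs, and ordering all fragments by
# length recovers the sequence, just as reading up a sequencing gel does.
sequence = "ATGGCTTACG"  # made-up sequence being synthesized

fragments = {base: [i + 1 for i, b in enumerate(sequence) if b == base]
             for base in "ACGT"}            # ddNTP base -> fragment lengths

gel_order = sorted((length, base)
                   for base, lengths in fragments.items()
                   for length in lengths)
print("".join(base for _, base in gel_order))  # ATGGCTTACG
```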

High-throughput sequencing

The high demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that parallelize the sequencing process, producing thousands or millions of sequences at once. High-throughput sequencing is intended to lower the cost of DNA sequencing beyond what is possible with standard dye-terminator methods. In ultra-high-throughput sequencing, as many as 500,000 sequencing-by-synthesis operations may be run in parallel.

Illumina Genome Analyzer II System. Illumina technologies have set the standard for high-throughput massively parallel sequencing.
 
The Illumina dye sequencing method is based on reversible dye-terminators and was developed in 1996 at the Geneva Biomedical Research Institute by Pascal Mayer and Laurent Farinelli. In this method, DNA molecules and primers are first attached on a slide and amplified with polymerase so that local clonal colonies, initially coined "DNA colonies", are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. Unlike pyrosequencing, the DNA chains are extended one nucleotide at a time and image acquisition can be performed at a delayed moment, allowing for very large arrays of DNA colonies to be captured by sequential images taken from a single camera. Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity; with an optimal configuration, the ultimate throughput of the instrument depends only on the A/D conversion rate of the camera. The camera takes images of the fluorescently labeled nucleotides, then the dye along with the terminal 3' blocker is chemically removed from the DNA, allowing the next cycle.
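
The cycle structure described above (one reversible-terminator base incorporated per cluster per cycle, then a single image read across all clusters) can be caricatured in a few lines. This is only a sketch with made-up cluster sequences, not Illumina's base-calling software.

```python
# One base per cluster per cycle; each "image" captures that base for every
# cluster in parallel, so the read length equals the number of cycles.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
clusters = ["GATTACA", "GATCGGA", "TTTACGA"]   # hypothetical clonal templates

reads = ["" for _ in clusters]
for cycle in range(len(clusters[0])):
    image = [COMPLEMENT[template[cycle]] for template in clusters]  # one photo per cycle
    for i, base in enumerate(image):
        reads[i] += base                                            # base call per cluster

print(reads)  # the complementary read of every cluster, built cycle by cycle
```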

An alternative approach, ion semiconductor sequencing, is based on standard DNA replication chemistry. This technology measures the release of a hydrogen ion each time a base is incorporated. A microwell containing template DNA is flooded with a single nucleotide; if the nucleotide is complementary to the template strand, it is incorporated and a hydrogen ion is released. This release triggers an ISFET ion sensor. If a homopolymer is present in the template sequence, multiple nucleotides will be incorporated in a single flood cycle, and the detected electrical signal will be proportionally higher.
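
A small sketch of the flow cycle described above (hypothetical, error-free chemistry): the signal for a flooded nucleotide is simply the length of the complementary run it extends, which is why homopolymers give proportionally larger signals.

```python
# Signal model for one flow: count how many consecutive template bases pair
# with the flooded nucleotide; each incorporation releases one hydrogen ion.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def flow_signal(template: str, position: int, flowed_base: str) -> int:
    count = 0
    while (position + count < len(template)
           and COMPLEMENT[template[position + count]] == flowed_base):
        count += 1
    return count

# Flooding with "A" at the start of template "TTTAC" extends across the TTT run.
print(flow_signal("TTTAC", 0, "A"))  # 3 -> a three-fold signal for the homopolymer
```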

Assembly

Overlapping reads form contigs; contigs and gaps of known length form scaffolds.
 
Paired end reads of next generation sequencing data mapped to a reference genome.
Multiple, fragmented sequence reads must be assembled together on the basis of their overlapping areas.
 
Sequence assembly refers to aligning and merging fragments of a much longer DNA sequence in order to reconstruct the original sequence. This is needed because current DNA sequencing technology cannot read whole genomes as a continuous sequence, but rather reads small pieces of between 20 and 1,000 bases, depending on the technology used. Third-generation sequencing technologies such as PacBio or Oxford Nanopore routinely generate sequencing reads more than 10 kb in length; however, they have a high error rate of approximately 15 percent. Typically the short fragments, called reads, result from shotgun sequencing of genomic DNA or gene transcripts (ESTs).
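
The basic assembly operation, merging two reads where the end of one overlaps the start of another, can be sketched as follows. This is a simplified illustration assuming exact, error-free overlaps, not any particular assembler.

```python
# Merge read_b onto read_a using their longest exact suffix/prefix overlap;
# repeating this across many reads is how contigs are built up.
def merge(read_a, read_b, min_overlap=3):
    for length in range(min(len(read_a), len(read_b)), min_overlap - 1, -1):
        if read_a.endswith(read_b[:length]):
            return read_a + read_b[length:]
    return None  # no sufficiently long overlap found

print(merge("AGGTCGTA", "CGTATTGC"))  # AGGTCGTATTGC (4-base overlap "CGTA")
```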

Assembly approaches

Assembly can be broadly categorized into two approaches: de novo assembly, for genomes which are not similar to any sequenced in the past, and comparative assembly, which uses the existing sequence of a closely related organism as a reference during assembly. Relative to comparative assembly, de novo assembly is computationally difficult (NP-hard), making it less favourable for short-read NGS technologies. Within the de novo assembly paradigm there are two primary strategies: Eulerian path strategies and overlap-layout-consensus (OLC) strategies. OLC strategies ultimately try to create a Hamiltonian path through an overlap graph, which is an NP-hard problem. Eulerian path strategies are computationally more tractable because they try to find an Eulerian path through a de Bruijn graph.
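
To show what the Eulerian approach works with, here is a minimal sketch (assuming short, error-free reads and a toy value of k) of building a de Bruijn graph: its nodes are (k-1)-mers, its edges are the k-mers observed in the reads, and assembly then amounts to finding an Eulerian path through the graph.

```python
# Toy de Bruijn graph: split each read into k-mers and connect each k-mer's
# (k-1)-prefix to its (k-1)-suffix; an Eulerian walk over the edges spells
# out the underlying sequence.
from collections import defaultdict

def de_bruijn(reads, k=4):
    graph = defaultdict(list)   # (k-1)-mer -> list of successor (k-1)-mers
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

for node, successors in de_bruijn(["AGGTCGT", "GTCGTAT"]).items():
    print(node, "->", successors)
```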

Finishing

Finished genomes are defined as having a single contiguous sequence with no ambiguities representing each replicon.

Annotation

The DNA sequence assembly alone is of little value without additional analysis. Genome annotation is the process of attaching biological information to sequences, and consists of three main steps:
  1. identifying portions of the genome that do not code for proteins
  2. identifying elements on the genome, a process called gene prediction, and
  3. attaching biological information to these elements.
Automatic annotation tools try to perform these steps in silico, as opposed to manual annotation (a.k.a. curation) which involves human expertise and potential experimental verification. Ideally, these approaches co-exist and complement each other in the same annotation pipeline (also see below).

Traditionally, the basic level of annotation has been to use BLAST to find similarities and then to annotate genomes on the basis of homologues. More recently, additional information is added to the annotation platform. The additional information allows manual annotators to deconvolute discrepancies between genes that are given the same annotation. Some databases use genome context information, similarity scores, experimental data, and integrations of other resources to provide genome annotations through their Subsystems approach. Other databases (e.g. Ensembl) rely on both curated data sources as well as a range of software tools in their automated genome annotation pipeline.[66] Structural annotation consists of the identification of genomic elements, primarily ORFs and their localisation, or gene structure. Functional annotation consists of attaching biological information to genomic elements.
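
As a simplified illustration of structural annotation (only a sketch; real gene predictors also use splice sites, codon bias, homology and experimental evidence), the snippet below scans the three forward reading frames of a sequence for open reading frames running from a start codon to a stop codon.

```python
# Naive ORF finder: in each forward frame, report stretches from ATG to the
# next in-frame stop codon that are at least min_codons codons long.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 3):
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOPS and start is not None:
                if (i - start) // 3 >= min_codons:
                    orfs.append(seq[start:i + 3])
                start = None
    return orfs

print(find_orfs("CCATGGCGTTTTAACC"))  # ['ATGGCGTTTTAA']
```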

Sequencing pipelines and databases

The need for reproducibility and efficient management of the large amount of data associated with genome projects means that computational pipelines have important applications in genomics.

Research areas

Functional genomics

Functional genomics is a field of molecular biology that attempts to make use of the vast wealth of data produced by genomic projects (such as genome sequencing projects) to describe gene (and protein) functions and interactions. Functional genomics focuses on the dynamic aspects such as gene transcription, translation, and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures. Functional genomics attempts to answer questions about the function of DNA at the levels of genes, RNA transcripts, and protein products. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional “gene-by-gene” approach. 

A major branch of genomics is still concerned with sequencing the genomes of various organisms, but the knowledge of full genomes has created the possibility for the field of functional genomics, mainly concerned with patterns of gene expression during various conditions. The most important tools here are microarrays and bioinformatics.

Structural genomics

An example of a protein structure determined by the Midwest Center for Structural Genomics.
 
Structural genomics seeks to describe the 3-dimensional structure of every protein encoded by a given genome. This genome-based approach allows for a high-throughput method of structure determination by a combination of experimental and modeling approaches. The principal difference between structural genomics and traditional structural prediction is that structural genomics attempts to determine the structure of every protein encoded by the genome, rather than focusing on one particular protein. With full-genome sequences available, structure prediction can be done more quickly through a combination of experimental and modeling approaches, especially because the availability of large numbers of sequenced genomes and previously solved protein structures allow scientists to model protein structure on the structures of previously solved homologs. Structural genomics involves taking a large number of approaches to structure determination, including experimental methods using genomic sequences or modeling-based approaches based on sequence or structural homology to a protein of known structure or based on chemical and physical principles for a protein with no homology to any known structure. As opposed to traditional structural biology, the determination of a protein structure through a structural genomics effort often (but not always) comes before anything is known regarding the protein function. This raises new challenges in structural bioinformatics, i.e. determining protein function from its 3D structure.

Epigenomics

Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence (Russell 2010 p. 475). Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as in differentiation/development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.

Metagenomics

Environmental Shotgun Sequencing (ESS) is a key technique in metagenomics. (A) Sampling from habitat; (B) filtering particles, typically by size; (C) Lysis and DNA extraction; (D) cloning and library construction; (E) sequencing the clones; (F) sequence assembly into contigs and scaffolds.

Metagenomics is the study of metagenomes, genetic material recovered directly from environmental samples. The broad field may also be referred to as environmental genomics, ecogenomics or community genomics. While traditional microbiology and microbial genome sequencing rely upon cultivated clonal cultures, early environmental gene sequencing cloned specific genes (often the 16S rRNA gene) to produce a profile of diversity in a natural sample. Such work revealed that the vast majority of microbial biodiversity had been missed by cultivation-based methods. Recent studies use "shotgun" Sanger sequencing or massively parallel pyrosequencing to get largely unbiased samples of all genes from all the members of the sampled communities. Because of its power to reveal the previously hidden diversity of microscopic life, metagenomics offers a powerful lens for viewing the microbial world that has the potential to revolutionize understanding of the entire living world.

Model systems

Viruses and bacteriophages

Bacteriophages have played and continue to play a key role in bacterial genetics and molecular biology. Historically, they were used to define gene structure and gene regulation. Also, the first genome to be sequenced was that of a bacteriophage. However, bacteriophage research did not lead the genomics revolution, which is clearly dominated by bacterial genomics. Only very recently has the study of bacteriophage genomes become prominent, thereby enabling researchers to understand the mechanisms underlying phage evolution. Bacteriophage genome sequences can be obtained through direct sequencing of isolated bacteriophages, but can also be derived as part of microbial genomes. Analysis of bacterial genomes has shown that a substantial amount of microbial DNA consists of prophage sequences and prophage-like elements. A detailed database mining of these sequences offers insights into the role of prophages in shaping the bacterial genome: overall, this method verified many known bacteriophage groups, making this a useful tool for predicting the relationships of prophages from bacterial genomes.

Cyanobacteria

At present there are 24 cyanobacteria for which a total genome sequence is available. Fifteen of these cyanobacteria come from the marine environment: six Prochlorococcus strains, seven marine Synechococcus strains, Trichodesmium erythraeum IMS101 and Crocosphaera watsonii WH8501. Several studies have demonstrated how these sequences can be used very successfully to infer important ecological and physiological characteristics of marine cyanobacteria. Many more genome projects are currently in progress, among them further Prochlorococcus and marine Synechococcus isolates, Acaryochloris and Prochloron, the N2-fixing filamentous cyanobacteria Nodularia spumigena, Lyngbya aestuarii and Lyngbya majuscula, as well as bacteriophages infecting marine cyanobacteria. Thus, the growing body of genome information can also be tapped in a more general way to address global problems by applying a comparative approach. Some new and exciting examples of progress in this field are the identification of genes for regulatory RNAs, insights into the evolutionary origin of photosynthesis, and estimation of the contribution of horizontal gene transfer to the genomes that have been analyzed.

Applications of genomics

Genomics has provided applications in many fields, including medicine, biotechnology, anthropology and other social sciences.

Genomic medicine

Next-generation genomic technologies allow clinicians and biomedical researchers to drastically increase the amount of genomic data collected on large study populations. When combined with new informatics approaches that integrate many kinds of data with genomic data in disease research, this allows researchers to better understand the genetic bases of drug response and disease. For example, the All of Us research program aims to collect genome sequence data from 1 million participants to become a critical component of the precision medicine research platform.

Synthetic biology and bioengineering

The growth of genomic knowledge has enabled increasingly sophisticated applications of synthetic biology. In 2010 researchers at the J. Craig Venter Institute announced the creation of a partially synthetic species of bacterium, Mycoplasma laboratorium, derived from the genome of Mycoplasma genitalium.

Conservation genomics

Conservationists can use the information gathered by genomic sequencing to better evaluate genetic factors key to species conservation, such as the genetic diversity of a population or whether an individual is heterozygous for a recessive inherited genetic disorder. By using genomic data to evaluate the effects of evolutionary processes and to detect patterns in variation throughout a given population, conservationists can formulate plans to aid a given species with fewer unknown variables than standard genetic approaches leave unaddressed.

Ecosystem

From Wikipedia, the free encyclopedia

Top: Coral reef ecosystems are highly productive marine systems. Bottom: Temperate rainforest on the Olympic Peninsula in Washington state.
 
An ecosystem is a community of living organisms in conjunction with the nonliving components of their environment, interacting as a system. These biotic and abiotic components are linked together through nutrient cycles and energy flows. Energy enters the system through photosynthesis and is incorporated into plant tissue. By feeding on plants and on one another, animals play an important role in the movement of matter and energy through the system. They also influence the quantity of plant and microbial biomass present. By breaking down dead organic matter, decomposers release carbon back to the atmosphere and facilitate nutrient cycling by converting nutrients stored in dead biomass back to a form that can be readily used by plants and microbes.

Ecosystems are controlled by external and internal factors. External factors, such as climate, the parent material which forms the soil, and topography, control the overall structure of an ecosystem but are not themselves influenced by the ecosystem. Unlike external factors, internal factors are controlled by the ecosystem itself; examples include decomposition, root competition, shading, disturbance, succession, and the types of species present.

Ecosystems are dynamic entities—they are subject to periodic disturbances and are in the process of recovering from some past disturbance. Ecosystems in similar environments that are located in different parts of the world can end up doing things very differently simply because they have different pools of species present. Internal factors not only control ecosystem processes but are also controlled by them and are often subject to feedback loops.

Resource inputs are generally controlled by external processes like climate and parent material. Resource availability within the ecosystem is controlled by internal factors like decomposition, root competition or shading. Although humans operate within ecosystems, their cumulative effects are large enough to influence external factors like climate.

Biodiversity affects ecosystem functioning, as do the processes of disturbance and succession. Ecosystems provide a variety of goods and services upon which people depend.

History

The term ecosystem was first used in 1935 in a publication by British ecologist Arthur Tansley. Tansley devised the concept to draw attention to the importance of transfers of materials between organisms and their environment. He later refined the term, describing it as "The whole system, ... including not only the organism-complex, but also the whole complex of physical factors forming what we call the environment". Tansley regarded ecosystems not simply as natural units, but as "mental isolates". Tansley later defined the spatial extent of ecosystems using the term ecotope.

G. Evelyn Hutchinson, a limnologist who was a contemporary of Tansley's, combined Charles Elton's ideas about trophic ecology with those of Russian geochemist Vladimir Vernadsky. As a result, he suggested that mineral nutrient availability in a lake limited algal production. This would, in turn, limit the abundance of animals that feed on algae. Raymond Lindeman took these ideas further to suggest that the flow of energy through a lake was the primary driver of the ecosystem. Hutchinson's students, brothers Howard T. Odum and Eugene P. Odum, further developed a "systems approach" to the study of ecosystems. This allowed them to study the flow of energy and material through ecological systems.

Processes

Rainforest ecosystems are rich in biodiversity. This is the Gambia River in Senegal's Niokolo-Koba National Park.
 
Biomes of the world
 
Ecosystems are controlled both by external and internal factors. External factors, also called state factors, control the overall structure of an ecosystem and the way things work within it, but are not themselves influenced by the ecosystem. The most important of these is climate. Climate determines the biome in which the ecosystem is embedded. Rainfall patterns and seasonal temperatures influence photosynthesis and thereby determine the amount of water and energy available to the ecosystem.

Parent material determines the nature of the soil in an ecosystem, and influences the supply of mineral nutrients. Topography also controls ecosystem processes by affecting things like microclimate, soil development and the movement of water through a system. For example, an ecosystem situated in a small depression on the landscape can be quite different from one present on an adjacent steep hillside.

Other external factors that play an important role in ecosystem functioning include time and potential biota: the set of organisms that could potentially be present in an area can significantly affect the ecosystem that develops there. Ecosystems in similar environments that are located in different parts of the world can end up doing things very differently simply because they have different pools of species present. The introduction of non-native species can cause substantial shifts in ecosystem function.

Unlike external factors, internal factors in ecosystems not only control ecosystem processes but are also controlled by them. Consequently, they are often subject to feedback loops. While the resource inputs are generally controlled by external processes like climate and parent material, the availability of these resources within the ecosystem is controlled by internal factors like decomposition, root competition or shading. Other factors like disturbance, succession or the types of species present are also internal factors.

Primary production

Global oceanic and terrestrial phototroph abundance, from September 1997 to August 2000. As an estimate of autotroph biomass, it is only a rough indicator of primary production potential and not an actual estimate of it.

Primary production is the production of organic matter from inorganic carbon sources. This mainly occurs through photosynthesis. The energy incorporated through this process supports life on earth, while the carbon makes up much of the organic matter in living and dead biomass, soil carbon and fossil fuels. It also drives the carbon cycle, which influences global climate via the greenhouse effect.

Through the process of photosynthesis, plants capture energy from light and use it to combine carbon dioxide and water to produce carbohydrates and oxygen. The photosynthesis carried out by all the plants in an ecosystem is called the gross primary production (GPP). About half of the GPP is consumed in plant respiration. The remainder, that portion of GPP that is not used up by respiration, is known as the net primary production (NPP). Total photosynthesis is limited by a range of environmental factors. These include the amount of light available, the amount of leaf area a plant has to capture light (shading by other plants is a major limitation of photosynthesis), rate at which carbon dioxide can be supplied to the chloroplasts to support photosynthesis, the availability of water, and the availability of suitable temperatures for carrying out photosynthesis.
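
In other words, net primary production is simply what remains of gross primary production after plant respiration. The sketch below only illustrates that bookkeeping; the numbers are hypothetical, not measurements from any particular ecosystem.

```python
# NPP = GPP - autotrophic (plant) respiration, all in the same units.
def net_primary_production(gpp: float, plant_respiration: float) -> float:
    """Carbon fixed by photosynthesis that remains after plant respiration."""
    return gpp - plant_respiration

# Hypothetical forest fixing 2,000 g C/m^2/yr and respiring roughly half of it:
print(net_primary_production(2000.0, 1000.0))  # 1000.0 g C/m^2/yr of NPP
```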

Energy flow

Energy and carbon enter ecosystems through photosynthesis, are incorporated into living tissue, transferred to other organisms that feed on the living and dead plant matter, and eventually released through respiration.

The carbon and energy incorporated into plant tissues (net primary production) is either consumed by animals while the plant is alive, or it remains uneaten when the plant tissue dies and becomes detritus. In terrestrial ecosystems, roughly 90% of the net primary production ends up being broken down by decomposers. The remainder is either consumed by animals while still alive and enters the plant-based trophic system, or it is consumed after it has died, and enters the detritus-based trophic system.
In aquatic systems, the proportion of plant biomass that gets consumed by herbivores is much higher. In trophic systems, photosynthetic organisms are the primary producers. The organisms that consume their tissues, called herbivores, are primary consumers or secondary producers. Organisms which feed on microbes (bacteria and fungi) are termed microbivores. Animals that feed on primary consumers, the carnivores, are secondary consumers. Each of these constitutes a trophic level.

The sequence of consumption, from plant to herbivore to carnivore, forms a food chain. Real systems are much more complex than this: organisms will generally feed on more than one form of food, and may feed at more than one trophic level. Carnivores may capture some prey which is part of a plant-based trophic system and others that are part of a detritus-based trophic system (a bird that feeds both on herbivorous grasshoppers and on earthworms, which consume detritus). Real systems, with all these complexities, form food webs rather than food chains. The food chain usually consists of five levels: producers, primary consumers, secondary consumers, tertiary consumers, and decomposers.

Decomposition

The carbon and nutrients in dead organic matter are broken down by a group of processes known as decomposition. This releases nutrients that can then be re-used for plant and microbial production and returns carbon dioxide to the atmosphere (or water) where it can be used for photosynthesis. In the absence of decomposition, the dead organic matter would accumulate in an ecosystem, and nutrients and atmospheric carbon dioxide would be depleted. Approximately 90% of terrestrial net primary production goes directly from plant to decomposer.

Decomposition processes can be separated into three categories—leaching, fragmentation and chemical alteration of dead material. As water moves through dead organic matter, it dissolves and carries with it the water-soluble components. These are then taken up by organisms in the soil, react with mineral soil, or are transported beyond the confines of the ecosystem (and are considered lost to it). Newly shed leaves and newly dead animals have high concentrations of water-soluble components and include sugars, amino acids and mineral nutrients. Leaching is more important in wet environments and much less important in dry ones.

Fragmentation processes break organic material into smaller pieces, exposing new surfaces for colonization by microbes. Freshly shed leaf litter may be inaccessible due to an outer layer of cuticle or bark, and cell contents are protected by a cell wall. Newly dead animals may be covered by an exoskeleton. Fragmentation processes, which break through these protective layers, accelerate the rate of microbial decomposition. Animals fragment detritus as they hunt for food, as does passage through the gut. Freeze-thaw cycles and cycles of wetting and drying also fragment dead material.

The chemical alteration of the dead organic matter is primarily achieved through bacterial and fungal action. Fungal hyphae produce enzymes that can break through the tough outer structures surrounding dead plant material. They also produce enzymes which break down lignin, which allows them access to both cell contents and the nitrogen in the lignin. Fungi can transfer carbon and nitrogen through their hyphal networks and thus, unlike bacteria, are not dependent solely on locally available resources.

Decomposition rates vary among ecosystems. The rate of decomposition is governed by three sets of factors: the physical environment (temperature, moisture, and soil properties), the quantity and quality of the dead material available to decomposers, and the nature of the microbial community itself. Temperature controls the rate of microbial respiration; the higher the temperature, the faster microbial decomposition occurs. Temperature also affects soil moisture; when soils become too dry, microbial growth slows and leaching is reduced. Freeze-thaw cycles also affect decomposition: freezing temperatures kill soil microorganisms, which allows leaching to play a more important role in moving nutrients around. This can be especially important as the soil thaws in the spring, creating a pulse of nutrients which become available.

Decomposition rates are low under very wet or very dry conditions. Decomposition rates are highest in warm, moist conditions with adequate levels of oxygen. Wet soils tend to become deficient in oxygen (this is especially true in wetlands), which slows microbial growth. In dry soils, decomposition slows as well, but bacteria continue to grow (albeit at a slower rate) even after soils become too dry to support plant growth.

Nutrient cycling

Biological nitrogen cycling

Ecosystems continually exchange energy and carbon with the wider environment. Mineral nutrients, on the other hand, are mostly cycled back and forth between plants, animals, microbes and the soil. Most nitrogen enters ecosystems through biological nitrogen fixation, is deposited through precipitation, dust and gases, or is applied as fertilizer.

Since most terrestrial ecosystems are nitrogen-limited, nitrogen cycling is an important control on ecosystem production.

Until modern times, nitrogen fixation was the major source of nitrogen for ecosystems. Nitrogen-fixing bacteria either live symbiotically with plants or live freely in the soil. The energetic cost is high for plants that support nitrogen-fixing symbionts—as much as 25% of gross primary production when measured in controlled conditions. Many members of the legume plant family support nitrogen-fixing symbionts. Some cyanobacteria are also capable of nitrogen fixation. These are phototrophs, which carry out photosynthesis. Like other nitrogen-fixing bacteria, they can either be free-living or have symbiotic relationships with plants. Other sources of nitrogen include acid deposition produced through the combustion of fossil fuels, ammonia gas which evaporates from agricultural fields which have had fertilizers applied to them, and dust. Anthropogenic nitrogen inputs account for about 80% of all nitrogen fluxes in ecosystems.

When plant tissues are shed or are eaten, the nitrogen in those tissues becomes available to animals and microbes. Microbial decomposition releases nitrogen compounds from dead organic matter in the soil, where plants, fungi, and bacteria compete for it. Some soil bacteria use organic nitrogen-containing compounds as a source of carbon, and release ammonium ions into the soil. This process is known as nitrogen mineralization. Others convert ammonium to nitrite and nitrate ions, a process known as nitrification. Nitric oxide and nitrous oxide are also produced during nitrification. Under nitrogen-rich and oxygen-poor conditions, nitrates and nitrites are converted to nitrogen gas, a process known as denitrification.

Other important nutrients include phosphorus, sulfur, calcium, potassium, magnesium and manganese. Phosphorus enters ecosystems through weathering. As ecosystems age this supply diminishes, making phosphorus-limitation more common in older landscapes (especially in the tropics). Calcium and sulfur are also produced by weathering, but acid deposition is an important source of sulfur in many ecosystems. Although magnesium and manganese are produced by weathering, exchanges between soil organic matter and living cells account for a significant portion of ecosystem fluxes. Potassium is primarily cycled between living cells and soil organic matter.

Function and biodiversity

Loch Lomond in Scotland forms a relatively isolated ecosystem. The fish community of this lake remained stable over a long period until a number of introductions in the 1970s restructured its food web.
 
Spiny forest at Ifaty, Madagascar, featuring various Adansonia (baobab) species, Alluaudia procera (Madagascar ocotillo) and other vegetation.
 
Biodiversity plays an important role in ecosystem functioning. The reason for this is that ecosystem processes are driven by the number of species in an ecosystem, the exact nature of each individual species, and the relative abundance of organisms within these species. Ecosystem processes are broad generalizations that actually take place through the actions of individual organisms. The nature of the organisms, that is, the species, functional groups and trophic levels to which they belong, dictates the sorts of actions these individuals are capable of carrying out and the relative efficiency with which they do so.

Ecological theory suggests that in order to coexist, species must have some level of limiting similarity—they must be different from one another in some fundamental way, otherwise one species would competitively exclude the other. Despite this, the cumulative effect of additional species in an ecosystem is not linear—additional species may enhance nitrogen retention, for example, but beyond some level of species richness, additional species may have little additive effect.

The addition (or loss) of species that are ecologically similar to those already present in an ecosystem tends to only have a small effect on ecosystem function. Ecologically distinct species, on the other hand, have a much larger effect. Similarly, dominant species have a large effect on ecosystem function, while rare species tend to have a small effect. Keystone species tend to have an effect on ecosystem function that is disproportionate to their abundance in an ecosystem. Similarly, an ecosystem engineer is any organism that creates, significantly modifies, maintains or destroys a habitat.

Dynamics

Ecosystems are dynamic entities. They are subject to periodic disturbances and are in the process of recovering from some past disturbance. When a perturbation occurs, an ecosystem responds by moving away from its initial state. The tendency of an ecosystem to remain close to its equilibrium state, despite that disturbance, is termed its resistance. The speed with which it returns to its initial state after disturbance, on the other hand, is called its resilience. Time plays a role in the development of soil from bare rock and the recovery of a community from disturbance.

From one year to another, ecosystems experience variation in their biotic and abiotic environments. A drought, a colder than usual winter, and a pest outbreak are all examples of short-term variability in environmental conditions. Animal populations vary from year to year, building up during resource-rich periods and crashing as they overshoot their food supply. These changes play out in changes in net primary production, decomposition rates, and other ecosystem processes. Longer-term changes also shape ecosystem processes: the forests of eastern North America still show legacies of cultivation which ceased 200 years ago, while methane production in eastern Siberian lakes is controlled by organic matter which accumulated during the Pleistocene.

Disturbance also plays an important role in ecological processes. F. Stuart Chapin and coauthors define disturbance as "a relatively discrete event in time and space that alters the structure of populations, communities, and ecosystems and causes changes in resource availability or the physical environment". This can range from tree falls and insect outbreaks to hurricanes and wildfires to volcanic eruptions. Such disturbances can cause large changes in plant, animal and microbe populations, as well as in soil organic matter content. Disturbance is followed by succession, a "directional change in ecosystem structure and functioning resulting from biotically driven changes in resource supply."

The frequency and severity of disturbance determine the way it affects ecosystem function. A major disturbance like a volcanic eruption or glacial advance and retreat leaves behind soils that lack plants, animals or organic matter. Ecosystems that experience such disturbances undergo primary succession. A less severe disturbance like a forest fire, hurricane or cultivation results in secondary succession and a faster recovery. More severe and more frequent disturbance results in longer recovery times.

Clear boundaries make lakes convenient to study using an ecosystem approach.

Ecosystem ecology

A hydrothermal vent is an ecosystem on the ocean floor. (The scale bar is 1 m.)
 
Ecosystem ecology studies the processes and dynamics of ecosystems, and the way the flow of matter and energy through them structures natural systems. The study of ecosystems can cover 10 orders of magnitude, from the surface layers of rocks to the surface of the planet.

There is no single definition of what constitutes an ecosystem. German ecologist Ernst-Detlef Schulze and coauthors defined an ecosystem as an area which is "uniform regarding the biological turnover, and contains all the fluxes above and below the ground area under consideration." They explicitly reject Gene Likens' use of entire river catchments as "too wide a demarcation" to be a single ecosystem, given the level of heterogeneity within such an area. Other authors have suggested that an ecosystem can encompass a much larger area, even the whole planet. Schulze and coauthors also rejected the idea that a single rotting log could be studied as an ecosystem because the size of the flows between the log and its surroundings is too large relative to the proportion cycled within the log. Philosopher of science Mark Sagoff considers the failure to define "the kind of object it studies" to be an obstacle to the development of theory in ecosystem ecology.

Ecosystems can be studied through a variety of approaches—theoretical studies, studies monitoring specific ecosystems over long periods of time, those that look at differences between ecosystems to elucidate how they work and direct manipulative experimentation. Studies can be carried out at a variety of scales, ranging from whole-ecosystem studies to studying microcosms or mesocosms (simplified representations of ecosystems). American ecologist Stephen R. Carpenter has argued that microcosm experiments can be "irrelevant and diversionary" if they are not carried out in conjunction with field studies done at the ecosystem scale. Microcosm experiments often fail to accurately predict ecosystem-level dynamics.

The Hubbard Brook Ecosystem Study started in 1963 to study the White Mountains in New Hampshire. It was the first successful attempt to study an entire watershed as an ecosystem. The study used stream chemistry as a means of monitoring ecosystem properties, and developed a detailed biogeochemical model of the ecosystem. Long-term research at the site led to the discovery of acid rain in North America in 1972. Researchers documented the depletion of soil cations (especially calcium) over the next several decades.

Human activities

Human activities are important in almost all ecosystems. Although humans exist and operate within ecosystems, their cumulative effects are large enough to influence external factors like climate.

Ecosystem goods and services

The High Peaks Wilderness Area in the 6,000,000-acre (2,400,000 ha) Adirondack Park is an example of a diverse ecosystem.

Ecosystems provide a variety of goods and services upon which people depend. Ecosystem goods include the "tangible, material products" of ecosystem processes such as food, construction materials and medicinal plants. They also include less tangible items like tourism and recreation, and genes from wild plants and animals that can be used to improve domestic species.

Ecosystem services, on the other hand, are generally "improvements in the condition or location of things of value". These include things like the maintenance of hydrological cycles, cleaning air and water, the maintenance of oxygen in the atmosphere, crop pollination and even things like beauty, inspiration and opportunities for research. While material from the ecosystem had traditionally been recognized as being the basis for things of economic value, ecosystem services tend to be taken for granted.

Ecosystem management

When natural resource management is applied to whole ecosystems, rather than single species, it is termed ecosystem management. Although definitions of ecosystem management abound, there is a common set of principles which underlie these definitions. A fundamental principle is the long-term sustainability of the production of goods and services by the ecosystem; "intergenerational sustainability [is] a precondition for management, not an afterthought".

While ecosystem management can be used as part of a plan for wilderness conservation, it can also be used in intensively managed ecosystems.

Threats caused by humans

As human population and per capita consumption grow, so do the resource demands imposed on ecosystems and the effects of the human ecological footprint. Natural resources are vulnerable and limited. The environmental impacts of anthropogenic actions are becoming more apparent. Problems for all ecosystems include environmental pollution, climate change and biodiversity loss. For terrestrial ecosystems, further threats include air pollution, soil degradation, and deforestation. For aquatic ecosystems, threats also include the unsustainable exploitation of marine resources (for example overfishing of certain species), marine pollution, microplastics pollution, water pollution, the warming of oceans, and building on coastal areas.

Society is increasingly becoming aware that ecosystem services are not only limited but also that they are threatened by human activities. The need to better consider long-term ecosystem health and its role in enabling human habitation and economic activity is urgent. To help inform decision-makers, many ecosystem services are being assigned economic values, often based on the cost of replacement with anthropogenic alternatives. The ongoing challenge of prescribing economic value to nature, for example through biodiversity banking, is prompting transdisciplinary shifts in how we recognize and manage the environment, social responsibility, business opportunities, and our future as a species.

Classical radicalism

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Cla...