
Saturday, March 7, 2020

Pathogenomics

From Wikipedia, the free encyclopedia
Pathogenomics is a field which uses high-throughput screening technology and bioinformatics to study encoded microbe resistance, as well as virulence factors (VFs), which enable a microorganism to infect a host and possibly cause disease. This includes studying genomes of pathogens which cannot be cultured outside of a host. In the past, researchers and medical professionals found it difficult to study and understand pathogenic traits of infectious organisms. With newer technology, pathogen genomes can be identified and sequenced in a much shorter time and at a lower cost, thus improving the ability to diagnose, treat, and even predict and prevent pathogenic infections and disease. It has also allowed researchers to better understand genome evolution events - gene loss, gain, duplication, rearrangement - and how those events impact pathogen resistance and ability to cause disease. This influx of information has created a need for making the vast amounts of data accessible to researchers in the form of databases, and it has raised ethical questions about the wisdom of reconstructing previously extinct and deadly pathogens in order to better understand virulence.

Reviewing high-throughput screening results

History

In the early days of genomics, scientists found it challenging to sequence genetic information. The field began to take off in 1977 when Fred Sanger and his colleagues sequenced the DNA genome of a bacteriophage (φX174) using a method now known as the Sanger method. The Sanger method for sequencing DNA greatly advanced molecular biology and directly led to the ability to sequence the genomes of other organisms, including the complete human genome.

The Haemophilus influenzae genome, sequenced in 1995 by J. Craig Venter and Hamilton Smith using whole-genome shotgun sequencing, was one of the first organism genomes to be sequenced. Since then, newer and more efficient high-throughput sequencing methods, such as next-generation sequencing (NGS) and single-cell genomic sequencing, have been developed. While the Sanger method can sequence only one DNA fragment at a time, NGS technology can sequence thousands of fragments in parallel. With the ability to rapidly sequence DNA, new insights emerged, such as the realization that, because prokaryotic genomes are more diverse than originally thought, it is necessary to sequence multiple strains of a species rather than only a few. E. coli is an example of why this is important: genes encoding virulence factors in two strains of the species differ by at least thirty percent. Such knowledge, along with more thorough study of genome gain, loss, and change, is giving researchers valuable insight into how pathogens interact in host environments and how they are able to infect hosts and cause disease.

With this high influx of new information, there has arisen a greater demand for bioinformatics so that scientists can properly analyze the new data. In response, software and other tools have been developed for this purpose. Also, as of 2008, the amount of stored sequence data was doubling every 18 months, creating an urgent need for better ways to organize data and aid research. In response, thousands of publicly accessible databases and other resources have been created, including the Virulence Factor Database (VFDB) of pathogenic bacteria, which was established in 2004 to aid pathogenomics research.

Microbe analysis

Pathogens may be prokaryotic (archaea or bacteria), single-celled eukaryotes, or viruses. Prokaryotic genomes have typically been easier to sequence because of their smaller genome size compared with eukaryotes, which has produced a bias toward reporting on pathogenic bacterial behavior. Regardless of this bias, many of the dynamic genomic events are similar across all types of pathogenic organisms. Genomic evolution occurs via gene gain, gene loss, and genome rearrangement, and these "events" are observed in multiple pathogen genomes, with some bacterial pathogens experiencing all three. Pathogenomics does not focus exclusively on understanding pathogen-host interactions, however. Insight into individual or cooperative pathogen behavior provides knowledge of how pathogen virulence factors develop or are inherited. Through a deeper understanding of the small sub-units that cause infection, it may be possible to develop novel therapeutics that are efficient and cost-effective.

Cause and analysis of genomic diversity

Dynamic genomes with high plasticity are necessary to allow pathogens, especially bacteria, to survive in changing environments. With the assistance of high throughput sequencing methods and in silico technologies, it is possible to detect, compare and catalogue many of these dynamic genomic events. Genomic diversity is important when detecting and treating a pathogen since these events can change the function and structure of the pathogen. There is a need to analyze more than a single genome sequence of a pathogen species to understand pathogen mechanisms. Comparative genomics is a methodology which allows scientists to compare the genomes of different species and strains. There are several examples of successful comparative genomics studies, among them the analysis of Listeria and Escherichia coli. Some studies have attempted to address the difference between pathogenic and non-pathogenic microbes. This inquiry proves to be difficult, however, since a single bacterial species can have many strains, and the genomic content of each of these strains varies.  
Evolutionary dynamics
Variation among microbial strains and their genomic content is driven by different forces, including three specific evolutionary events that affect pathogen resistance and the ability to cause disease: gene gain, gene loss, and genome rearrangement.
Gene loss and genome decay
Gene loss occurs when genes are deleted. The reason why this occurs is still not fully understood, though it most likely involves adaptation to a new environment or ecological niche. Some researchers believe gene loss may actually increase fitness and survival among pathogens. In a new environment, some genes may become unnecessary for survival, and so mutations are eventually "allowed" in those genes until they become inactive "pseudogenes." These pseudogenes are observed in organisms such as Shigella flexneri, Salmonella enterica, and Yersinia pestis. Over time, the pseudogenes are deleted, and the organisms become fully dependent on their host as either endosymbionts or obligate intracellular pathogens, as is seen in Buchnera, Mycobacterium leprae, and Chlamydia trachomatis. These deleted genes are also called anti-virulence genes (AVGs), since it is thought they may have prevented the organism from becoming pathogenic. In order to become more virulent, infect a host, and survive, the pathogen had to shed those AVGs. The reverse process can happen as well: analysis of Listeria strains showed that a reduction in genome size gave rise to a non-pathogenic Listeria strain from a pathogenic one.[26] Systems have been developed to detect these pseudogenes/AVGs in a genome sequence.
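Pseudogene detection ultimately comes down to spotting coding sequences that can no longer produce a functional protein. The sketch below is only a toy illustration of one signal such systems look for, an in-frame premature stop codon; real tools also compare candidate genes against intact homologs and handle frameshifts, and the example sequences here are hypothetical.

```python
# Minimal sketch: flag candidate pseudogenes by looking for an in-frame stop
# codon before the end of an annotated coding sequence.

STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_premature_stop(cds: str) -> bool:
    """Return True if an in-frame stop codon occurs before the final codon."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    # Ignore the terminal codon, which is expected to be a stop.
    return any(codon in STOP_CODONS for codon in codons[:-1])

# Example with hypothetical sequences: one intact, one disrupted by a nonsense mutation.
intact  = "ATGGCTGAAACCGGTTGA"          # ATG ... TGA (normal)
decayed = "ATGGCTTAAACCGGTTGA"          # premature TAA after two codons
print(has_premature_stop(intact))   # False
print(has_premature_stop(decayed))  # True
```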

Summary of dynamic genomics events
Gene gain and duplication
One of the key forces driving gene gain is thought to be horizontal (lateral) gene transfer (LGT). It is of particular interest in microbial studies because these mobile genetic elements may introduce virulence factors into a new genome. A comparative study conducted by Gill et al. in 2005 postulated that LGT may have been the cause of pathogen variation between Staphylococcus epidermidis and Staphylococcus aureus. There still remains skepticism, however, about the frequency of LGT, its identification, and its impact. New and improved methodologies have been employed, especially in the study of phylogenetics, to validate the presence and effect of LGT. Gene gain and gene duplication events are balanced by gene loss, such that despite their dynamic nature, the genome of a bacterial species remains approximately the same size.
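Besides phylogenetic methods, a common first-pass ("parametric") way to flag candidate horizontally acquired genes is to look for genes whose composition deviates from the rest of the genome. The sketch below illustrates this idea using GC content alone; the gene names, sequences, genome-wide GC value, and threshold are hypothetical, and real analyses use richer signals such as codon usage and explicit phylogenies.

```python
# Toy parametric screen for horizontal gene transfer candidates:
# flag genes whose GC content deviates strongly from the genome-wide average.

def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def flag_lgt_candidates(genes: dict, genome_gc: float, threshold: float = 0.10) -> list:
    """Return gene names whose GC content differs from genome_gc by more than threshold."""
    return [name for name, seq in genes.items()
            if abs(gc_content(seq) - genome_gc) > threshold]

genome_gc = 0.70                                  # hypothetical genome-wide GC content
genes = {
    "housekeeping_1": "ATGGCGCTGGCGGCGATG",       # close to the genome average
    "housekeeping_2": "ATGGCCCTGGCGGCGTTG",
    "island_gene":    "ATGAAATTTAAAATTTAA",       # AT-rich outlier, possible recent acquisition
}
print(flag_lgt_candidates(genes, genome_gc))      # ['island_gene'] for these toy inputs
```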
Genome rearrangement
Mobile genetic insertion sequences can play a role in genome rearrangement activities. Pathogens that do not live in an isolated environment have been found to contain a large number of insertion sequence elements and various repetitive segments of DNA. The combination of these two genetic elements is thought to help mediate homologous recombination. Pathogens such as Burkholderia mallei and Burkholderia pseudomallei have been shown to exhibit genome-wide rearrangements due to insertion sequences and repetitive DNA segments. At this time, no studies demonstrate genome-wide rearrangement events directly giving rise to pathogenic behavior in a microbe. This does not mean it is not possible. Genome-wide rearrangements do, however, contribute to the plasticity of the bacterial genome, which may prime the conditions for other factors to introduce, or lose, virulence factors.
Single-nucleotide polymorphisms
Single-nucleotide polymorphisms, or SNPs, account for a wide array of genetic variation among humans as well as pathogens. They allow researchers to estimate a variety of factors: the effects of environmental toxins, how different treatment methods affect the body, and what causes a person's predisposition to illness. SNPs play a key role in understanding how and why mutations occur. SNPs also allow scientists to map genomes and analyze genetic information.
Pan and core genomes
Pan-genome overview

The most recent definition of a bacterial species comes from the pre-genomic era. In 1987, it was proposed that bacterial strains showing >70% DNA·DNA re-association and sharing characteristic phenotypic traits should be considered strains of the same species. The diversity within pathogen genomes makes it difficult to identify the total number of genes that are associated with all strains of a pathogen species. It has been thought that the total number of genes associated with a single pathogen species may be unlimited, although some groups are attempting to derive a more empirical value. For this reason, it was necessary to introduce the concepts of the pan-genome and the core genome. Pan-genome and core genome literature also tends to have a bias towards reporting on prokaryotic pathogenic organisms. Caution may need to be exercised when extending the definition of a pan-genome or a core genome to other pathogenic organisms because there is no formal evidence of the properties of these pan-genomes.

A core genome is the set of genes found across all strains of a pathogen species. A pan-genome is the entire gene pool for that pathogen species, and includes genes that are not shared by all strains. Pan-genomes may be open or closed depending on whether comparative analysis of multiple strains reveals no new genes (closed) or many new genes (open) compared to the core genome for that pathogen species. In an open pan-genome, genes may be further characterized as dispensable or strain-specific. Dispensable genes are those found in more than one strain, but not in all strains, of a pathogen species. Strain-specific genes are those found in only one strain of a pathogen species. The differences in pan-genomes reflect the lifestyle of the organism. For example, Streptococcus agalactiae, which exists in diverse biological niches, has a broader pan-genome when compared with the more environmentally isolated Bacillus anthracis. Comparative genomics approaches are also being used to understand more about the pan-genome. Recent discoveries show that the number of new species continues to grow. With an estimated 10^31 bacteriophages on the planet, infecting 10^24 bacteria per second, the continuous flow of genetic material being exchanged is difficult to imagine.
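Given gene presence/absence calls for a set of strains, the core genome, pan-genome, dispensable genes, and strain-specific genes follow directly from set operations, as in the minimal sketch below. The strain and gene names are hypothetical; a real analysis would first cluster genes into orthologous families before this bookkeeping step.

```python
# Minimal sketch of the core-genome / pan-genome distinction using set
# operations on per-strain gene lists (hypothetical names throughout).

strains = {
    "strain_A": {"geneA", "geneB", "geneC", "geneD"},
    "strain_B": {"geneA", "geneB", "geneC", "geneE"},
    "strain_C": {"geneA", "geneB", "geneF"},
}

core_genome = set.intersection(*strains.values())    # genes present in every strain
pan_genome = set.union(*strains.values())            # every gene seen in any strain
dispensable = {g for g in pan_genome - core_genome
               if sum(g in genes for genes in strains.values()) > 1}
strain_specific = pan_genome - core_genome - dispensable

print(sorted(core_genome))       # ['geneA', 'geneB']
print(sorted(pan_genome))        # ['geneA', ..., 'geneF']
print(sorted(dispensable))       # ['geneC']  (in strains A and B, but not C)
print(sorted(strain_specific))   # ['geneD', 'geneE', 'geneF']
```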

Virulence factors

Multiple genetic elements of human-affecting pathogens contribute to the transfer of virulence factors: plasmids, pathogenicity islands, prophages, bacteriophages, transposons, and integrative and conjugative elements. Pathogenicity islands and their detection are the focus of several bioinformatics efforts involved in pathogenomics. It is a common belief that "environmental" bacterial strains lack the capacity to harm humans. However, recent studies show that pathogenic strains have evolved from bacteria in aquatic environments. This gives the bacteria a wider range of genetic traits and poses a potential threat to humans, including greater resistance to antibiotics.

Microbe-microbe interactions

Staphylococcus aureus biofilm

Microbe-host interactions tend to overshadow consideration of microbe-microbe interactions. Microbe-microbe interactions, though, can lead to chronic states of illness that are difficult to understand and treat.

Biofilms

Biofilms are an example of microbe-microbe interactions and are thought to be associated with up to 80% of human infections. Recently it has been shown that specific genes and cell-surface proteins are involved in the formation of biofilms. These genes and surface proteins may be characterized through in silico methods to form an expression profile of biofilm-forming bacteria. This expression profile may be used in subsequent analysis of other microbes to predict biofilm microbe behaviour, or to understand how to dismantle biofilm formation.

Host microbe analysis

Pathogens have the ability to adapt and manipulate host cells, taking full advantage of a host cell's cellular processes and mechanisms.

A microbe may be influenced by hosts to either adapt to its new environment or learn to evade it. Insight into these behaviours will inform the development of potential therapeutics. The most detailed outline of host-microbe interaction initiatives is provided by the Pathogenomics European Research Agenda. Its report emphasizes the following features:

Summary of host-microbe project goals in the Pathogenomics European Research Agenda
  • Microarray analysis of host and microbe gene expression during infection. This is important for identifying the expression of virulence factors that allow a pathogen to survive a host's defense mechanisms. Pathogens tend to undergo an assortment of changes in order to subvert the host's immune system, in some cases favoring a hypervariable genome state. The genomic expression studies will be complemented with protein-protein interaction network studies.
  • Using RNA interference (RNAi) to identify host cell functions in response to infection. Infection depends on the balance between the characteristics of the host cell and the pathogen cell. In some cases, there can be an overactive host response to infection, such as in meningitis, which can overwhelm the host's body. Using RNAi, it will be possible to more clearly identify how a host cell defends itself during times of acute or chronic infection. This approach has also been applied successfully in Drosophila.
  • Not all microbe interactions in the host environment are malicious. Commensal flora, which exist in various environments in animals and humans, may actually help combat microbial infections. The human flora, such as that of the gut, is home to a myriad of microbes.
The diverse community within the gut has been heralded as vital for human health. There are a number of projects under way to better understand the ecosystems of the gut. The sequence of commensal Escherichia coli strain SE11, for example, has already been determined from the faecal matter of a healthy human and promises to be the first of many such studies. Through genomic analysis and subsequent protein analysis, the beneficial properties of commensal flora will be investigated in hopes of understanding how to build better therapeutics.

Eco-evo perspective

The "eco-evo" perspective on pathogen-host interactions emphasizes the influences ecology and the environment on pathogen evolution. The dynamic genomic factors such as gene loss, gene gain and genome rearrangement, are all strongly influenced by changes in the ecological niche where a particular microbial strain resides. Microbes may switch from being pathogenic and non-pathogenic due to changing environments. This was demonstrated during studies of the plague, Yersinia pestis, which apparently evolved from a mild gastrointestinal pathogen to a very highly pathogenic microbe through dynamic genomic events. In order for colonization to occur, there must be changes in biochemical makeup to aid survival in a variety of environments. This is most likely due to a mechanism allowing the cell to sense changes within the environment, thus influencing change in gene expression. Understanding how these strain changes occur from being low or non-pathogenic to being highly pathogenic and vice versa may aid in developing novel therapeutics for microbial infections.

Applications

Baby Receiving Immunizations

Human health has greatly improved and the mortality rate has declined substantially since the Second World War because of improved hygiene driven by changing public health regulations, as well as more readily available vaccines and antibiotics. Pathogenomics will allow scientists to expand what they know about pathogenic and non-pathogenic microbes, thus allowing for new and improved vaccines. Pathogenomics also has wider implications, including countering bioterrorism.

Reverse vaccinology

Reverse vaccinology is relatively new. While research is still being conducted, there have been breakthroughs with pathogens such as Streptococcus and meningococcus. Traditional methods of vaccine production, such as biochemical and serological approaches, are laborious and unreliable; they also require the pathogens to be cultured in vitro. New advances in genomics help predict nearly all variations of a pathogen, thus advancing vaccine development. Protein-based vaccines are being developed to combat resistant pathogens such as Staphylococcus and Chlamydia.

Countering bioterrorism

In 2005, the sequencing of the 1918 Spanish influenza virus was completed. Together with phylogenetic analysis, it was possible to supply a detailed account of the virus's evolution and behavior, in particular its adaptation to humans. Following the sequencing, the pathogen was also reconstructed; when introduced into mice, it proved to be incredibly deadly. The 2001 anthrax attacks showed that bioterrorism is a real rather than an imagined threat. Bioterrorism was also anticipated in the Iraq war, with soldiers being inoculated against a smallpox attack. Using technologies and insight gained from the reconstruction of the Spanish influenza virus, it may be possible to prevent future deadly, deliberately planted outbreaks of disease. There is a strong ethical concern, however, as to whether the resurrection of old viruses is necessary and whether it does more harm than good. The best avenue for countering such threats is coordinating with organizations which provide immunizations; increased awareness and participation would greatly decrease the effectiveness of a potential epidemic. An additional measure would be to monitor natural water reservoirs as a basis for preventing an attack or outbreak. Overall, communication between labs and large organizations, such as the Global Outbreak Alert and Response Network (GOARN), can lead to early detection and prevent outbreaks.

Chromatin

From Wikipedia, the free encyclopedia
  
The major structures in DNA compaction: DNA, the nucleosome, the 10 nm "beads-on-a-string" fibre, the 30 nm chromatin fibre and the metaphase chromosome.

Chromatin is a complex of DNA and protein found in eukaryotic cells. Its primary function is packaging long DNA molecules into more compact, denser structures. This prevents the strands from becoming tangled and also plays important roles in reinforcing the DNA during cell division, preventing DNA damage, and regulating gene expression and DNA replication. During mitosis and meiosis, chromatin facilitates proper segregation of the chromosomes in anaphase; the characteristic shapes of chromosomes visible during this stage are the result of DNA being coiled into highly condensed chromatin.

The primary protein components of chromatin are histones, which bind to DNA and function as "anchors" around which the strands are wound. In general, there are three levels of chromatin organization:
  1. DNA wraps around histone proteins, forming nucleosomes and the so-called "beads on a string" structure (euchromatin).
  2. Multiple histones wrap into a 30-nanometer fibre consisting of nucleosome arrays in their most compact form (heterochromatin).
  3. Higher-level DNA supercoiling of the 30-nm fiber produces the metaphase chromosome (during mitosis and meiosis).
Many organisms, however, do not follow this organization scheme. For example, spermatozoa and avian red blood cells have more tightly packed chromatin than most eukaryotic cells, and trypanosomatid protozoa do not condense their chromatin into visible chromosomes at all. Prokaryotic cells have entirely different structures for organizing their DNA (the prokaryotic chromosome equivalent is called a genophore and is localized within the nucleoid region).

The overall structure of the chromatin network further depends on the stage of the cell cycle. During interphase, the chromatin is structurally loose to allow access to RNA and DNA polymerases that transcribe and replicate the DNA. The local structure of chromatin during interphase depends on the specific genes present in the DNA. Regions of DNA containing genes which are actively transcribed ("turned on") are less tightly compacted and closely associated with RNA polymerases in a structure known as euchromatin, while regions containing inactive genes ("turned off") are generally more condensed and associated with structural proteins in heterochromatin. Epigenetic modification of the structural proteins in chromatin via methylation and acetylation also alters local chromatin structure and therefore gene expression. The structure of chromatin networks is currently poorly understood and remains an active area of research in molecular biology.

Dynamic chromatin structure and hierarchy

Chromatin undergoes various structural changes during the cell cycle. Histone proteins are the basic packers and arrangers of chromatin and can be modified by various post-translational modifications to alter chromatin packing (histone modification). Most of the modifications occur on the histone tail. The consequences in terms of chromatin accessibility and compaction depend on both the amino acid that is modified and the type of modification. For example, histone acetylation results in loosening and increased accessibility of chromatin for replication and transcription. Lysine tri-methylation can be correlated either with transcriptional activity (tri-methylation of histone H3 lysine 4) or with transcriptional repression and chromatin compaction (tri-methylation of histone H3 lysine 9 or 27). Several studies have suggested that different modifications can occur simultaneously. For example, it was proposed that a bivalent structure (with tri-methylation of both lysine 4 and 27 on histone H3) is involved in mammalian early development.

Polycomb-group proteins play a role in regulating genes through modulation of chromatin structure.

DNA structure

The structures of A-, B-, and Z-DNA.

In nature, DNA can form three structures, A-, B-, and Z-DNA. A- and B-DNA are very similar, forming right-handed helices, whereas Z-DNA is a left-handed helix with a zig-zag phosphate backbone. Z-DNA is thought to play a specific role in chromatin structure and transcription because of the properties of the junction between B- and Z-DNA.

At the junction of B- and Z-DNA, one pair of bases is flipped out from normal bonding. The junction plays a dual role: it is a site of recognition by many proteins and a sink for torsional stress from RNA polymerase or nucleosome binding.

Nucleosomes and beads-on-a-string

A cartoon representation of the nucleosome structure. From PDB: 1KX5​.

The basic repeat element of chromatin is the nucleosome, interconnected by sections of linker DNA, a far shorter arrangement than pure DNA in solution.

In addition to the core histones, there is the linker histone, H1, which contacts the exit/entry of the DNA strand on the nucleosome. The nucleosome core particle, together with histone H1, is known as a chromatosome. Nucleosomes, with about 20 to 60 base pairs of linker DNA, can form, under non-physiological conditions, an approximately 10 nm "beads-on-a-string" fibre.

The nucleosomes bind DNA non-specifically, as required by their function in general DNA packaging. There are, however, large DNA sequence preferences that govern nucleosome positioning. This is due primarily to the varying physical properties of different DNA sequences: for instance, adenine and thymine are more favorably compressed into the inner minor groove. This means nucleosomes can bind preferentially at one position approximately every 10 base pairs (the helical repeat of DNA), where the DNA is rotated to maximise the number of A and T bases that will lie in the inner minor groove.
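The roughly 10 bp periodicity of this preference can be illustrated with a toy scoring scheme that counts A/T bases at every helical repeat along a 147 bp window. This is only an illustrative sketch; real nucleosome-positioning models are fit to measured dinucleotide preferences, and the input sequence is assumed to come from elsewhere (e.g. a FASTA file).

```python
# Illustrative sketch only: score candidate nucleosome positions by counting
# A/T bases at ~10 bp intervals, where the minor groove is assumed to face the
# histone core.

HELICAL_REPEAT = 10          # approximate base pairs per turn of B-DNA
NUCLEOSOME_SPAN = 147        # base pairs wrapped around the histone octamer

def at_periodicity_score(seq: str, start: int) -> int:
    """Count A/T bases at every ~10th position of a 147 bp window."""
    window = seq[start:start + NUCLEOSOME_SPAN].upper()
    return sum(1 for i in range(0, len(window), HELICAL_REPEAT) if window[i] in "AT")

def best_position(seq: str) -> int:
    """Return the start offset whose window scores highest."""
    candidates = range(0, len(seq) - NUCLEOSOME_SPAN + 1)
    return max(candidates, key=lambda s: at_periodicity_score(seq, s))

# Usage with a hypothetical genomic sequence (e.g. read from a FASTA file):
# pos = best_position(genomic_sequence)
```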

30-nanometer chromatin fiber

Two proposed structures of the 30 nm chromatin filament.
Left: 1 start helix "solenoid" structure.
Right: 2 start loose helix structure.
Note: the histones are omitted in this diagram - only the DNA is shown.

With addition of H1, the beads-on-a-string structure in turn coils into a 30 nm diameter helical structure known as the 30 nm fibre or filament. The precise structure of the chromatin fiber in the cell is not known in detail, and there is still some debate over this.

This level of chromatin structure is thought to be the form of heterochromatin, which contains mostly transcriptionally silent genes. Electron microscopy (EM) studies have demonstrated that the 30 nm fibre is highly dynamic, unfolding into a 10 nm "beads-on-a-string" fibre when traversed by an RNA polymerase engaged in transcription.

Four proposed structures of the 30 nm chromatin filament for DNA repeat lengths per nucleosome ranging from 177 to 207 bp.
Linker DNA in yellow and nucleosomal DNA in pink.

The existing models commonly accept that the nucleosomes lie perpendicular to the axis of the fibre, with linker histones arranged internally. A stable 30 nm fibre relies on the regular positioning of nucleosomes along DNA. Linker DNA is relatively resistant to bending and rotation. This makes the length of linker DNA critical to the stability of the fibre, requiring nucleosomes to be separated by lengths that permit rotation and folding into the required orientation without excessive stress to the DNA. In this view, different lengths of linker DNA should produce different folding topologies of the chromatin fibre. Recent theoretical work, based on electron-microscopy images of reconstituted fibres, supports this view.

Spatial organization of chromatin in the cell nucleus

The spatial arrangement of the chromatin within the nucleus is not random - specific regions of the chromatin can be found in certain territories. Territories are, for example, the lamina-associated domains (LADs), and the topologically associating domains (TADs), which are bound together by protein complexes. Currently, polymer models such as the Strings & Binders Switch (SBS) model and the Dynamic Loop (DL) model are used to describe the folding of chromatin within the nucleus.

Cell-cycle dependent structural organization

  • Interphase: The structure of chromatin during interphase is optimized to allow simple access of transcription and DNA repair factors to the DNA while compacting the DNA into the nucleus. The structure varies depending on the access required to the DNA. Genes that require regular access by RNA polymerase require the looser structure provided by euchromatin.
    Karyogram of human male using Giemsa staining, showing the classic metaphase chromatin structure.
  • Metaphase: The metaphase structure of chromatin differs vastly from that of interphase. It is optimised for physical strength and manageability, forming the classic chromosome structure seen in karyotypes. The structure of the condensed chromatin is thought to be loops of 30 nm fibre attached to a central scaffold of proteins; it is, however, not well characterised. The physical strength of chromatin is vital for this stage of division to prevent shear damage to the DNA as the daughter chromosomes are separated. To maximise strength, the composition of the chromatin changes as it approaches the centromere, primarily through alternative histone H1 analogues. During mitosis, although most of the chromatin is tightly compacted, there are small regions that are not as tightly compacted. These regions often correspond to promoter regions of genes that were active in that cell type prior to entry into mitosis. The lack of compaction of these regions is called bookmarking, an epigenetic mechanism believed to be important for transmitting to daughter cells the "memory" of which genes were active prior to entry into mitosis. This bookmarking mechanism is needed because transcription ceases during mitosis.

    Chromatin and bursts of transcription

    Chromatin and its interaction with enzymes has been researched, and the conclusion is that it is a relevant and important factor in gene expression. Vincent G. Allfrey, a professor at Rockefeller University, stated that RNA synthesis is related to histone acetylation. The lysine residues at the ends of the histone tails are positively charged; acetylation of these tails makes the chromatin ends neutral, allowing DNA access.

    When the chromatin decondenses, the DNA is open to entry of molecular machinery. Fluctuations between open and closed chromatin may contribute to the discontinuity of transcription, or transcriptional bursting. Other factors are probably involved, such as the association and dissociation of transcription factor complexes with chromatin. The phenomenon, as opposed to simple probabilistic models of transcription, can account for the high variability in gene expression occurring between cells in isogenic populations.
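    As an illustration of this idea, the sketch below simulates a minimal two-state ("telegraph") model in which a promoter switches between an accessible (on) and an inaccessible (off) chromatin state and transcribes only while on. All rate constants are arbitrary, hypothetical values; the point is simply that such switching yields bursty output with much higher cell-to-cell variability (a Fano factor well above 1) than a constant-rate model would.

```python
# Toy two-state ("telegraph") transcription model: the promoter switches between
# an "on" (open chromatin) and "off" (closed chromatin) state; transcripts are
# produced only while it is on. Rates are arbitrary illustrative values.

import random

def simulate_bursting(t_max=200.0, k_on=0.01, k_off=0.1, k_tx=1.0, dt=0.01):
    """Return the number of transcripts made during one simulated cell history."""
    active, transcripts, t = False, 0, 0.0
    while t < t_max:
        if active:
            if random.random() < k_off * dt:      # promoter closes
                active = False
            elif random.random() < k_tx * dt:     # transcript produced while open
                transcripts += 1
        elif random.random() < k_on * dt:         # promoter opens
            active = True
        t += dt
    return transcripts

counts = [simulate_bursting() for _ in range(100)]       # 100 simulated "cells"
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(f"mean transcripts = {mean:.1f}, Fano factor = {var / mean:.1f}")  # Fano >> 1
```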

    Alternative chromatin organizations

    During metazoan spermiogenesis, the spermatid's chromatin is remodeled into a more spaced-packaged, widened, almost crystal-like structure. This process is associated with the cessation of transcription and involves nuclear protein exchange. The histones are mostly displaced and replaced by protamines (small, arginine-rich proteins). It is proposed that in yeast, regions devoid of histones become very fragile after transcription; HMO1, an HMG-box protein, helps stabilize nucleosome-free chromatin.

    Chromatin and DNA repair

    The packaging of eukaryotic DNA into chromatin presents a barrier to all DNA-based processes that require recruitment of enzymes to their sites of action. To allow the critical cellular process of DNA repair, the chromatin must be remodeled. In eukaryotes, ATP dependent chromatin remodeling complexes and histone-modifying enzymes are two predominant factors employed to accomplish this remodeling process.

    Chromatin relaxation occurs rapidly at the site of DNA damage. This process is initiated by the PARP1 protein, which starts to appear at the damage site in less than a second, with half-maximum accumulation within 1.6 seconds after the damage occurs. Next, the chromatin remodeler Alc1 quickly attaches to the product of PARP1 and completes its arrival at the DNA damage within 10 seconds of the damage. About half of the maximum chromatin relaxation, presumably due to the action of Alc1, occurs by 10 seconds. This then allows recruitment of the DNA repair enzyme MRE11, to initiate DNA repair, within 13 seconds.

    γH2AX, the phosphorylated form of H2AX is also involved in the early steps leading to chromatin decondensation after DNA damage occurrence. The histone variant H2AX constitutes about 10% of the H2A histones in human chromatin. γH2AX (H2AX phosphorylated on serine 139) can be detected as soon as 20 seconds after irradiation of cells (with DNA double-strand break formation), and half maximum accumulation of γH2AX occurs in one minute. The extent of chromatin with phosphorylated γH2AX is about two million base pairs at the site of a DNA double-strand break. γH2AX does not, itself, cause chromatin decondensation, but within 30 seconds of irradiation, RNF8 protein can be detected in association with γH2AX. RNF8 mediates extensive chromatin decondensation, through its subsequent interaction with CHD4, a component of the nucleosome remodeling and deacetylase complex NuRD.

    After undergoing relaxation subsequent to DNA damage, followed by DNA repair, chromatin recovers to a compaction state close to its pre-damage level after about 20 min.

    Methods to investigate chromatin

    1. ChIP-seq (Chromatin immunoprecipitation sequencing), aimed against different histone modifications, can be used to identify chromatin states throughout the genome. Different modifications have been linked to various states of chromatin.
    2. DNase-seq (DNase I hypersensitive sites Sequencing) uses the sensitivity of accessible regions in the genome to the DNase I enzyme to map open or accessible regions in the genome.
    3. FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements sequencing) uses the chemical properties of protein-bound DNA in a two-phase separation method to extract nucleosome depleted regions from the genome.
    4. ATAC-seq (Assay for Transposase-Accessible Chromatin sequencing) uses the Tn5 transposase to integrate (synthetic) transposons into accessible regions of the genome, consequently highlighting the localisation of nucleosomes and transcription factors across the genome.
    5. DNA footprinting is a method aimed at identifying protein-bound DNA. It uses labeling and fragmentation coupled to gel electrophoresis to identify areas of the genome that have been bound by proteins.
    6. MNase-seq (Micrococcal Nuclease sequencing) uses the micrococcal nuclease enzyme to identify nucleosome positioning throughout the genome.
    7. Chromosome conformation capture determines the spatial organization of chromatin in the nucleus, by inferring genomic locations that physically interact.
    8. MACC profiling (Micrococcal nuclease ACCessibility profiling) uses titration series of chromatin digests with micrococcal nuclease to identify chromatin accessibility as well as to map nucleosomes and non-histone DNA-binding proteins in both open and closed regions of the genome.

    Chromatin and knots

    It has been a puzzle how decondensed interphase chromosomes remain essentially unknotted. The natural expectation is that in the presence of type II DNA topoisomerases that permit passages of double-stranded DNA regions through each other, all chromosomes should reach the state of topological equilibrium. The topological equilibrium in highly crowded interphase chromosomes forming chromosome territories would result in formation of highly knotted chromatin fibres. However, Chromosome Conformation Capture (3C) methods revealed that the decay of contacts with the genomic distance in interphase chromosomes is practically the same as in the crumpled globule state that is formed when long polymers condense without formation of any knots. To remove knots from highly crowded chromatin, one would need an active process that should not only provide the energy to move the system from the state of topological equilibrium but also guide topoisomerase-mediated passages in such a way that knots would be efficiently unknotted instead of making the knots even more complex. It has been shown that the process of chromatin-loop extrusion is ideally suited to actively unknot chromatin fibres in interphase chromosomes.
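    The contact-decay measurement referred to above can be computed directly from a binned contact matrix by averaging each diagonal, as in the sketch below. The matrix here is random filler purely to make the code runnable; real Hi-C/3C data and the crumpled-globule comparison require actual experimental contact maps.

```python
# Sketch: average contact frequency P(s) as a function of genomic separation s,
# computed from a (hypothetical) symmetric, binned contact matrix.

import numpy as np

def contact_decay(matrix: np.ndarray) -> list:
    """Return (separation_in_bins, mean_contact_frequency) pairs."""
    n = matrix.shape[0]
    return [(s, float(np.mean(np.diagonal(matrix, offset=s)))) for s in range(1, n)]

rng = np.random.default_rng(0)
toy = rng.random((50, 50))
toy = (toy + toy.T) / 2                      # contact maps are symmetric
for s, p in contact_decay(toy)[:5]:
    print(f"s = {s} bins, P(s) = {p:.3f}")
```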

    Chromatin: alternative definitions

    The term, introduced by Walther Flemming, has multiple meanings:
    1. Simple and concise definition: Chromatin is a macromolecular complex of a DNA macromolecule and protein macromolecules (and RNA). The proteins package and arrange the DNA and control its functions within the cell nucleus.
    2. A biochemists’ operational definition: Chromatin is the DNA/protein/RNA complex extracted from lysed eukaryotic interphase nuclei. Just which of the multitudinous substances present in a nucleus will constitute a part of the extracted material partly depends on the technique each researcher uses. Furthermore, the composition and properties of chromatin vary from one cell type to another, during development of a specific cell type, and at different stages in the cell cycle.
    3. The DNA + histone = chromatin definition: The DNA double helix in the cell nucleus is packaged by special proteins termed histones. The formed protein/DNA complex is called chromatin. The basic structural unit of chromatin is the nucleosome.

    Nobel Prizes

    The following scientists were recognized for their contributions to chromatin research with Nobel Prizes:

    Year Who Award
    1910 Albrecht Kossel (University of Heidelberg) Nobel Prize in Physiology or Medicine for his discovery of the five nuclear bases: adenine, cytosine, guanine, thymine, and uracil.
    1933 Thomas Hunt Morgan (California Institute of Technology) Nobel Prize in Physiology or Medicine for his discoveries of the role played by the gene and chromosome in heredity, based on his studies of the white-eyed mutation in the fruit fly Drosophila.
    1962 Francis Crick, James Watson and Maurice Wilkins (MRC Laboratory of Molecular Biology, Harvard University and London University respectively) Nobel Prize in Physiology or Medicine for their discoveries of the double helix structure of DNA and its significance for information transfer in living material.
    1982 Aaron Klug (MRC Laboratory of Molecular Biology) Nobel Prize in Chemistry "for his development of crystallographic electron microscopy and his structural elucidation of biologically important nucleic acid-protein complexes"
    1993 Richard J. Roberts and Phillip A. Sharp Nobel Prize in Physiology or Medicine "for their independent discoveries of split genes," in which DNA sections called exons express proteins, and are interrupted by DNA sections called introns, which do not express proteins.
    2006 Roger Kornberg (Stanford University) Nobel Prize in Chemistry for his discovery of the mechanism by which DNA is transcribed into messenger RNA.

    Metabolomics

    From Wikipedia, the free encyclopedia
     
    The central dogma of biology showing the flow of information from DNA to the phenotype. Associated with each stage is the corresponding systems biology tool, from genomics to metabolomics.

    Metabolomics is the scientific study of chemical processes involving metabolites, the small molecule substrates, intermediates and products of metabolism. Specifically, metabolomics is the "systematic study of the unique chemical fingerprints that specific cellular processes leave behind", the study of their small-molecule metabolite profiles. The metabolome represents the complete set of metabolites in a biological cell, tissue, organ or organism, which are the end products of cellular processes. mRNA gene expression data and proteomic analyses reveal the set of gene products being produced in the cell, data that represents one aspect of cellular function. Conversely, metabolic profiling can give an instantaneous snapshot of the physiology of that cell, and thus, metabolomics provides a direct "functional readout of the physiological state" of an organism. One of the challenges of systems biology and functional genomics is to integrate genomics, transcriptomic, proteomic, and metabolomic information to provide a better understanding of cellular biology.

    History

    The concept that individuals might have a "metabolic profile" that could be reflected in the makeup of their biological fluids was introduced by Roger Williams in the late 1940s, who used paper chromatography to suggest characteristic metabolic patterns in urine and saliva were associated with diseases such as schizophrenia. However, it was only through technological advancements in the 1960s and 1970s that it became feasible to quantitatively (as opposed to qualitatively) measure metabolic profiles. The term "metabolic profile" was introduced by Horning, et al. in 1971 after they demonstrated that gas chromatography-mass spectrometry (GC-MS) could be used to measure compounds present in human urine and tissue extracts. The Horning group, along with that of Linus Pauling and Arthur B. Robinson led the development of GC-MS methods to monitor the metabolites present in urine through the 1970s.

    Concurrently, NMR spectroscopy, which was discovered in the 1940s, was also undergoing rapid advances. In 1974, Seeley et al. demonstrated the utility of using NMR to detect metabolites in unmodified biological samples. This first study on muscle highlighted the value of NMR in that it was determined that 90% of cellular ATP is complexed with magnesium. As sensitivity has improved with the evolution of higher magnetic field strengths and magic angle spinning, NMR continues to be a leading analytical tool to investigate metabolism. Recent efforts to utilize NMR for metabolomics have been largely driven by the laboratory of Jeremy K. Nicholson at Birkbeck College, University of London and later at Imperial College London. In 1984, Nicholson showed 1H NMR spectroscopy could potentially be used to diagnose diabetes mellitus, and later pioneered the application of pattern recognition methods to NMR spectroscopic data.

    In 1995, liquid chromatography mass spectrometry metabolomics experiments were performed by Gary Siuzdak while working with Richard Lerner (then president of The Scripps Research Institute) and Benjamin Cravatt, to analyze the cerebrospinal fluid of sleep-deprived animals. One molecule of particular interest, oleamide, was observed and later shown to have sleep-inducing properties. This work is one of the earliest such experiments combining liquid chromatography and mass spectrometry in metabolomics.

    In 2005, the first metabolomics tandem mass spectrometry database, METLIN, for characterizing human metabolites was developed in the Siuzdak laboratory at The Scripps Research Institute. METLIN has since grown and as of July 1, 2019, METLIN contains over 450,000 metabolites and other chemical entities, each compound having experimental tandem mass spectrometry data generated from molecular standards at multiple collision energies and in positive and negative ionization modes. METLIN is the largest repository of tandem mass spectrometry data of its kind.

    In 2005, the Siuzdak lab was engaged in identifying metabolites associated with sepsis. In an effort to address the issue of statistically identifying the most relevant dysregulated metabolites across hundreds of LC/MS datasets, the first algorithm was developed to allow for the nonlinear alignment of mass spectrometry metabolomics data. Called XCMS, where the "X" constitutes any chromatographic technology, it has since (2012) been developed as an online tool, and as of 2019 it (with METLIN) has over 30,000 registered users.

    On 23 January 2007, the Human Metabolome Project, led by David Wishart of the University of Alberta, Canada, completed the first draft of the human metabolome, consisting of a database of approximately 2500 metabolites, 1200 drugs and 3500 food components. Similar projects have been underway in several plant species, most notably Medicago truncatula and Arabidopsis thaliana for several years.

    As late as mid-2010, metabolomics was still considered an "emerging field". Further, it was noted that further progress in the field depended in large part on addressing otherwise "irresolvable technical challenges" through the technical evolution of mass spectrometry instrumentation.

    In 2015, real-time metabolome profiling was demonstrated for the first time.

    Metabolome

    Human metabolome project

    Metabolome refers to the complete set of small-molecule (<1.5 kDa) metabolites (such as metabolic intermediates, hormones and other signaling molecules, and secondary metabolites) to be found within a biological sample, such as a single organism. The word was coined in analogy with transcriptomics and proteomics; like the transcriptome and the proteome, the metabolome is dynamic, changing from second to second. Although the metabolome can be defined readily enough, it is not currently possible to analyse the entire range of metabolites by a single analytical method.

    The first metabolite database (called METLIN) for searching fragmentation data from tandem mass spectrometry experiments was developed by the Siuzdak lab at The Scripps Research Institute in 2005. METLIN contains over 450,000 metabolites and other chemical entities, each compound having experimental tandem mass spectrometry data. In 2006, the Siuzdak lab also developed the first algorithm to allow for the nonlinear alignment of mass spectrometry metabolomics data. Called XCMS, where the "X" constitutes any chromatographic technology, it has since (2012) been developed as an online tool and as of 2019 (with METLIN) has over 30,000 registered users.

    In January 2007, scientists at the University of Alberta and the University of Calgary completed the first draft of the human metabolome. The Human Metabolome Database (HMDB) is perhaps the most extensive public metabolomic spectral database to date. The HMDB stores more than 40,000 different metabolite entries. They catalogued approximately 2500 metabolites, 1200 drugs and 3500 food components that can be found in the human body, as reported in the literature. This information, available at the Human Metabolome Database (www.hmdb.ca) and based on analysis of information available in the current scientific literature, is far from complete. In contrast, much more is known about the metabolomes of other organisms. For example, over 50,000 metabolites have been characterized from the plant kingdom, and many thousands of metabolites have been identified and/or characterized from single plants.

    Each type of cell and tissue has a unique metabolic ‘fingerprint’ that can elucidate organ- or tissue-specific information. Bio-specimens used for metabolomics analysis include, but are not limited to, plasma, serum, urine, saliva, feces, muscle, sweat, exhaled breath, and gastrointestinal fluid. The ease of collection facilitates high temporal resolution, and because these specimens are always in dynamic equilibrium with the body, they can describe the host as a whole. The genome can tell what could happen, the transcriptome can tell what appears to be happening, the proteome can tell what makes it happen, and the metabolome can tell what has happened and what is happening.

    Metabolites

    Metabolites are the substrates, intermediates and products of metabolism. Within the context of metabolomics, a metabolite is usually defined as any molecule less than 1.5 kDa in size. However, there are exceptions to this depending on the sample and detection method. For example, macromolecules such as lipoproteins and albumin are reliably detected in NMR-based metabolomics studies of blood plasma. In plant-based metabolomics, it is common to refer to "primary" and "secondary" metabolites. A primary metabolite is directly involved in normal growth, development, and reproduction. A secondary metabolite is not directly involved in those processes, but usually has an important ecological function. Examples include antibiotics and pigments. By contrast, in human-based metabolomics, it is more common to describe metabolites as being either endogenous (produced by the host organism) or exogenous. Metabolites of foreign substances such as drugs are termed xenometabolites.

    The metabolome forms a large network of metabolic reactions, where outputs from one enzymatic chemical reaction are inputs to other chemical reactions. Such systems have been described as hypercycles.

    Metabonomics

    Metabonomics is defined as "the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification". The word origin is from the Greek μεταβολή meaning change and nomos meaning a rule set or set of laws. This approach was pioneered by Jeremy Nicholson at Murdoch University and has been used in toxicology, disease diagnosis and a number of other fields. Historically, the metabonomics approach was one of the first methods to apply the scope of systems biology to studies of metabolism.

    There has been some disagreement over the exact differences between 'metabolomics' and 'metabonomics'. The difference between the two terms is not related to choice of analytical platform: although metabonomics is more associated with NMR spectroscopy and metabolomics with mass spectrometry-based techniques, this is simply because of usages amongst different groups that have popularized the different terms. While there is still no absolute agreement, there is a growing consensus that 'metabolomics' places a greater emphasis on metabolic profiling at a cellular or organ level and is primarily concerned with normal endogenous metabolism. 'Metabonomics' extends metabolic profiling to include information about perturbations of metabolism caused by environmental factors (including diet and toxins), disease processes, and the involvement of extragenomic influences, such as gut microflora. This is not a trivial difference; metabolomic studies should, by definition, exclude metabolic contributions from extragenomic sources, because these are external to the system being studied. However, in practice, within the field of human disease research there is still a large degree of overlap in the way both terms are used, and they are often in effect synonymous.

    Exometabolomics

    Exometabolomics, or "metabolic footprinting", is the study of extracellular metabolites. It uses many techniques from other subfields of metabolomics, and has applications in biofuel development, bioprocessing, determining drugs' mechanism of action, and studying intercellular interactions.

    Analytical technologies

    Key stages of a metabolomics study

    The typical workflow of metabolomics studies is shown in the figure. First, samples are collected from tissue, plasma, urine, saliva, cells, etc. Next, metabolites are extracted, often with the addition of internal standards and derivatization. During sample analysis, metabolites are quantified (by LC or GC coupled with MS, and/or by NMR spectroscopy). The raw output data can be used for metabolite feature extraction and further processed before statistical analysis (such as PCA). Many bioinformatic tools and software packages are available to identify associations with disease states and outcomes, determine significant correlations, and characterize metabolic signatures with existing biological knowledge.

    Separation methods

    Initially, analytes in a metabolomic sample comprise a highly complex mixture. This complex mixture can be simplified prior to detection by separating some analytes from others. Separation achieves various goals: analytes which cannot be resolved by the detector may be separated in this step; in MS analysis ion suppression is reduced; the retention time of the analyte serves as information regarding its identity. This separation step is not mandatory and is often omitted in NMR and "shotgun" based approaches such as shotgun lipidomics.

    Gas chromatography (GC), especially when interfaced with mass spectrometry (GC-MS), is a widely used separation technique for metabolomic analysis. GC offers very high chromatographic resolution, and can be used in conjunction with a flame ionization detector (GC/FID) or a mass spectrometer (GC-MS). The method is especially useful for identification and quantification of small and volatile molecules. However, a practical limitation of GC is the requirement of chemical derivatization for many biomolecules as only volatile chemicals can be analysed without derivatization. In cases where greater resolving power is required, two-dimensional chromatography (GCxGC) can be applied.

    High performance liquid chromatography (HPLC) has emerged as the most common separation technique for metabolomic analysis. With the advent of electrospray ionization, HPLC was coupled to MS. In contrast with GC, HPLC has lower chromatographic resolution, but requires no derivatization for polar molecules, and separates molecules in the liquid phase. Additionally HPLC has the advantage that a much wider range of analytes can be measured with a higher sensitivity than GC methods.

    Capillary electrophoresis (CE) has a higher theoretical separation efficiency than HPLC (although requiring much more time per separation), and is suitable for use with a wider range of metabolite classes than is GC. As for all electrophoretic techniques, it is most appropriate for charged analytes.

    Detection methods

    Comparison of the most commonly used metabolomics methods

    Mass spectrometry (MS) is used to identify and to quantify metabolites after optional separation by GC, HPLC (LC-MS), or CE. GC-MS was the first hyphenated technique to be developed. Identification leverages the distinct patterns in which analytes fragment which can be thought of as a mass spectral fingerprint; libraries exist that allow identification of a metabolite according to this fragmentation pattern. MS is both sensitive and can be very specific. There are also a number of techniques which use MS as a stand-alone technology: the sample is infused directly into the mass spectrometer with no prior separation, and the MS provides sufficient selectivity to both separate and to detect metabolites.

    For analysis by mass spectrometry, the analytes must be imparted with a charge and transferred to the gas phase. Electron ionization (EI) is the most common ionization technique applied to GC separations, as it is amenable to low pressures. EI also produces fragmentation of the analyte, which provides structural information while increasing the complexity of the data and possibly obscuring the molecular ion. Atmospheric-pressure chemical ionization (APCI) is an atmospheric-pressure technique that can be applied to all the above separation techniques. APCI is a gas-phase ionization method that is slightly more aggressive than ESI and is suitable for less polar compounds. Electrospray ionization (ESI) is the most common ionization technique applied in LC/MS. This soft ionization is most successful for polar molecules with ionizable functional groups. Another commonly used soft ionization technique is secondary electrospray ionization (SESI).

    Surface-based mass analysis has seen a resurgence in the past decade, with new MS technologies focused on increasing sensitivity, minimizing background, and reducing sample preparation. The ability to analyze metabolites directly from biofluids and tissues continues to challenge current MS technology, largely because of the limits imposed by the complexity of these samples, which contain thousands to tens of thousands of metabolites. Among the technologies being developed to address this challenge is Nanostructure-Initiator MS (NIMS), a desorption/ionization approach that does not require the application of matrix and thereby facilitates small-molecule (i.e., metabolite) identification. MALDI is also used; however, the application of a MALDI matrix can add significant background at <1000 Da that complicates analysis of the low-mass range (i.e., metabolites). In addition, the size of the resulting matrix crystals limits the spatial resolution that can be achieved in tissue imaging. Because of these limitations, several other matrix-free desorption/ionization approaches have been applied to the analysis of biofluids and tissues.

    Secondary ion mass spectrometry (SIMS) was one of the first matrix-free desorption/ionization approaches used to analyze metabolites from biological samples. SIMS uses a high-energy primary ion beam to desorb and generate secondary ions from a surface. The primary advantage of SIMS is its high spatial resolution (as small as 50 nm), a powerful characteristic for tissue imaging with MS. However, SIMS has yet to be readily applied to the analysis of biofluids and tissues because of its limited sensitivity at >500 Da and analyte fragmentation generated by the high-energy primary ion beam. Desorption electrospray ionization (DESI) is a matrix-free technique for analyzing biological samples that uses a charged solvent spray to desorb ions from a surface. Advantages of DESI are that no special surface is required and the analysis is performed at ambient pressure with full access to the sample during acquisition. A limitation of DESI is spatial resolution because "focusing" the charged solvent spray is difficult. However, a recent development termed laser ablation ESI (LAESI) is a promising approach to circumvent this limitation. Most recently, ion trap techniques such as orbitrap mass spectrometry are also applied to metabolomics research.

    Nuclear magnetic resonance (NMR) spectroscopy is the only detection technique which does not rely on separation of the analytes, and the sample can thus be recovered for further analyses. All kinds of small-molecule metabolites can be measured simultaneously - in this sense, NMR is close to being a universal detector. The main advantages of NMR are high analytical reproducibility and simplicity of sample preparation. Practically, however, it is relatively insensitive compared to mass spectrometry-based techniques. A comparison of the most commonly used metabolomics methods is shown in the table.

    Although NMR and MS are the most widely used modern techniques, other methods of detection have also been used. These include Fourier-transform ion cyclotron resonance, ion-mobility spectrometry, electrochemical detection (coupled to HPLC), Raman spectroscopy, and radiolabelling (when combined with thin-layer chromatography).

    Statistical methods

    Software List for Metabolomic Analysis

    The data generated in metabolomics usually consist of measurements performed on subjects under various conditions. These measurements may be digitized spectra or a list of metabolite features. In its simplest form this generates a matrix with rows corresponding to subjects and columns corresponding to metabolite features (or vice versa). Several statistical programs are currently available for analysis of both NMR and mass spectrometry data. A great number of free software packages for the analysis of metabolomics data are already available, as shown in the table. Some statistical tools listed in the table were designed for NMR data analysis but are also useful for MS data. For mass spectrometry data, software is available that identifies molecules that vary between subject groups on the basis of mass-to-charge value and, depending on the experimental design, retention time.
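    As a minimal sketch of this layout, the Python example below (using pandas) builds a small subjects-by-metabolite-features matrix; the subject labels, metabolite names, and intensity values are invented purely for illustration.

        # Rows = subjects, columns = metabolite features (e.g. relative intensities).
        # All names and numbers here are illustrative, not from a real study.
        import pandas as pd

        data = pd.DataFrame(
            {
                "glucose": [5.1, 4.8, 7.9, 8.2],
                "lactate": [1.2, 1.0, 2.5, 2.7],
                "alanine": [0.4, 0.5, 0.9, 1.1],
            },
            index=["control_1", "control_2", "case_1", "case_2"],
        )

        print(data.shape)  # (4 subjects, 3 metabolite features)
        print(data)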

    Once the metabolite data matrix is determined, unsupervised data reduction techniques (e.g. PCA) can be used to elucidate patterns and connections. In many studies, including those evaluating drug toxicity and some disease models, the metabolites of interest are not known a priori. This makes unsupervised methods, those with no prior assumptions of class membership, a popular first choice. The most common of these methods is principal component analysis (PCA), which can efficiently reduce the dimensions of a dataset to a few that explain the greatest variation. When analyzed in the lower-dimensional PCA space, clustering of samples with similar metabolic fingerprints can be detected. PCA algorithms aim to replace all correlated variables with a much smaller number of uncorrelated variables (referred to as principal components (PCs)) while retaining most of the information in the original dataset.[59] This clustering can elucidate patterns and assist in the determination of disease biomarkers - metabolites that correlate most with class membership.
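    A minimal sketch of this unsupervised step, assuming a samples-by-features matrix and using scikit-learn's PCA on simulated data (the group structure, feature counts, and scaling choice are assumptions made for the example):

        # Minimal PCA sketch on a simulated metabolite data matrix.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)
        n_samples, n_features = 40, 200

        # Simulate two groups whose fingerprints differ in a subset of metabolites.
        X = rng.normal(size=(n_samples, n_features))
        X[:20, :10] += 2.0  # first 20 samples shifted in 10 "discriminating" features

        # Autoscaling (mean-centre, unit variance) is a common pre-treatment before PCA.
        X_scaled = StandardScaler().fit_transform(X)

        pca = PCA(n_components=2)
        scores = pca.fit_transform(X_scaled)

        print("Variance explained by PC1, PC2:", pca.explained_variance_ratio_)
        # The two groups should separate along PC1 in a scores plot.
        print("Mean PC1 score, group 1 vs group 2:",
              scores[:20, 0].mean(), scores[20:, 0].mean())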

    Linear models are commonly used for metabolomics data but are affected by multicollinearity. On the other hand, multivariate statistics are well suited to high-dimensional, correlated metabolomics data; the most popular of these is projection to latent structures (PLS) regression and its classification version, PLS-DA. Other data mining methods, such as random forests, support-vector machines, and k-nearest neighbors, have received increasing attention for untargeted metabolomics data analysis. In the case of univariate methods, variables are analyzed one by one using classical statistics tools (such as Student's t-test, ANOVA, or mixed models) and only those with sufficiently small p-values are considered relevant. However, correction strategies should be used to reduce false discoveries when multiple comparisons are conducted. For multivariate analysis, models should always be validated to ensure that the results can be generalized.
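    A minimal sketch of the univariate route with multiple-comparison correction, using SciPy's independent-samples t-test and the Benjamini-Hochberg procedure from statsmodels on simulated data (group sizes, feature counts, and effect sizes are invented for the example):

        # One t-test per metabolite feature, then FDR correction of the p-values.
        import numpy as np
        from scipy import stats
        from statsmodels.stats.multitest import multipletests

        rng = np.random.default_rng(1)
        n_per_group, n_features = 20, 200

        control = rng.normal(size=(n_per_group, n_features))
        treated = rng.normal(size=(n_per_group, n_features))
        treated[:, :10] += 1.5  # 10 features with a genuine group difference

        # Student's t-test for each feature (column).
        _, p_values = stats.ttest_ind(control, treated, axis=0)

        # Benjamini-Hochberg correction controls the false discovery rate.
        reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
        print("Features flagged after FDR correction:", int(reject.sum()))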

    Key applications

    Toxicity assessment/toxicology by metabolic profiling (especially of urine or blood plasma samples) detects the physiological changes caused by toxic insult of a chemical (or mixture of chemicals). In many cases, the observed changes can be related to specific syndromes, e.g. a specific lesion in liver or kidney. This is of particular relevance to pharmaceutical companies wanting to test the toxicity of potential drug candidates: if a compound can be eliminated before it reaches clinical trials on the grounds of adverse toxicity, it saves the enormous expense of the trials.

    For functional genomics, metabolomics can be an excellent tool for determining the phenotype caused by a genetic manipulation, such as gene deletion or insertion. Sometimes this can be a sufficient goal in itself—for instance, to detect any phenotypic changes in a genetically modified plant intended for human or animal consumption. More exciting is the prospect of predicting the function of unknown genes by comparison with the metabolic perturbations caused by deletion/insertion of known genes. Such advances are most likely to come from model organisms such as Saccharomyces cerevisiae and Arabidopsis thaliana. The Cravatt laboratory at The Scripps Research Institute has recently applied this technology to mammalian systems, identifying the N-acyltaurines as previously uncharacterized endogenous substrates for the enzyme fatty acid amide hydrolase (FAAH) and the monoalkylglycerol ethers (MAGEs) as endogenous substrates for the uncharacterized hydrolase KIAA1363.

    Metabologenomics is a novel approach to integrating metabolomics and genomics data by correlating microbe-exported metabolites with predicted biosynthetic genes. This bioinformatics-based pairing method enables natural product discovery at a larger scale by refining non-targeted metabolomic analyses to identify small molecules with related biosynthesis and to focus on those whose structures may not be previously known.

    Fluxomics is a further development of metabolomics. The disadvantage of metabolomics is that it only provides the user with steady-state level information, while fluxomics determines the reaction rates of metabolic reactions and can trace metabolites in a biological system over time.

    Nutrigenomics is a generalised term which links genomics, transcriptomics, proteomics and metabolomics to human nutrition. In general a metabolome in a given body fluid is influenced by endogenous factors such as age, sex, body composition and genetics, as well as underlying pathologies. The large bowel microflora are also a very significant potential confounder of metabolic profiles and could be classified as either an endogenous or exogenous factor. The main exogenous factors are diet and drugs. Diet can then be broken down into nutrients and non-nutrients. Metabolomics is one means to determine a biological endpoint, or metabolic fingerprint, which reflects the balance of all these forces on an individual's metabolism.
