Saturday, March 7, 2020

Nucleosome

From Wikipedia, the free encyclopedia

A nucleosome is the basic structural unit of DNA packaging in eukaryotes. The structure of a nucleosome consists of a segment of DNA wound around eight histone proteins and resembles thread wrapped around a spool.

DNA must be compacted into nucleosomes to fit within the cell nucleus. In addition to nucleosome wrapping, eukaryotic chromatin is further compacted by being folded into a series of more complex structures, eventually forming a chromosome.

Nucleosomes are thought to carry epigenetically inherited information in the form of covalent modifications of their core histones. Nucleosome positions in the genome are not random, and it is important to know where each nucleosome is located because this determines the accessibility of the DNA to regulatory proteins.

Nucleosomes were first observed as particles in the electron microscope by Don and Ada Olins in 1974, and their existence and structure (as histone octamers surrounded by approximately 200 base pairs of DNA) were proposed by Roger Kornberg. The role of the nucleosome as a general gene repressor was demonstrated by Lorch et al. in vitro, and by Han and Grunstein in vivo in 1987 and 1988, respectively.

The nucleosome core particle consists of approximately 146 base pairs (bp) of DNA wrapped in 1.67 left-handed superhelical turns around a histone octamer, consisting of 2 copies each of the core histones H2A, H2B, H3, and H4. Core particles are connected by stretches of linker DNA, which can be up to about 80 bp long. Technically, a nucleosome is defined as the core particle plus one of these linker regions; however, the word is often used synonymously with the core particle. Genome-wide nucleosome positioning maps are now available for many model organisms and tissues, including mouse liver and brain.

Linker histones such as H1 and its isoforms are involved in chromatin compaction and sit at the base of the nucleosome near the DNA entry and exit, binding to the linker region of the DNA. Non-condensed nucleosomes without the linker histone resemble "beads on a string of DNA" under an electron microscope.

In contrast to most eukaryotic cells, mature sperm cells largely use protamines to package their genomic DNA, most likely to achieve an even higher packaging ratio. Histone equivalents and a simplified chromatin structure have also been found in Archaea, suggesting that eukaryotes are not the only organisms that use nucleosomes.

Structure

Structure of the core particle

The crystal structure of the nucleosome core particle consisting of H2A , H2B , H3 and H4 core histones, and DNA. The view is from the top through the superhelical axis.

Overview

Pioneering structural studies in the 1980s by Aaron Klug's group provided the first evidence that an octamer of histone proteins wraps DNA around itself in about 1.7 turns of a left-handed superhelix. In 1997 the first near atomic resolution crystal structure of the nucleosome was solved by the Richmond group, showing the most important details of the particle. The human alpha-satellite palindromic DNA critical to achieving the 1997 nucleosome crystal structure was developed by the Bunick group at Oak Ridge National Laboratory in Tennessee. The structures of over 20 different nucleosome core particles have been solved to date, including those containing histone variants and histones from different species. The structure of the nucleosome core particle is remarkably conserved, and even a change of over 100 residues between frog and yeast histones results in electron density maps with an overall root mean square deviation of only 1.6Å.

The nucleosome core particle (NCP)

The nucleosome core particle (shown in the figure) consists of about 146 base pairs of DNA wrapped in 1.67 left-handed superhelical turns around the histone octamer, consisting of 2 copies each of the core histones H2A, H2B, H3, and H4. Adjacent nucleosomes are joined by a stretch of free DNA termed linker DNA (which varies from 10 to 80 bp in length depending on species and tissue type). The whole structure generates a cylinder with a diameter of 11 nm and a height of 5.5 nm.
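The dimensions quoted above allow a rough, back-of-the-envelope estimate of how much the core particle shortens its DNA. The sketch below is only illustrative: it assumes the canonical B-form rise of 0.34 nm per base pair, a value that is not given in the text, and it ignores linker DNA entirely.

```python
# Assumption: canonical B-form DNA rise of 0.34 nm per base pair
# (a textbook value, not stated in the article).
RISE_NM_PER_BP = 0.34

# Values taken from the description of the core particle.
CORE_DNA_BP = 146
SUPERHELICAL_TURNS = 1.67
PARTICLE_DIAMETER_NM = 11.0


def contour_length_nm(bp, rise_nm_per_bp=RISE_NM_PER_BP):
    """Length of the DNA if it were stretched out as a straight B-form helix."""
    return bp * rise_nm_per_bp


core_length_nm = contour_length_nm(CORE_DNA_BP)
bp_per_superhelical_turn = CORE_DNA_BP / SUPERHELICAL_TURNS
# Crude linear compaction: stretched-out length divided by particle diameter.
compaction = core_length_nm / PARTICLE_DIAMETER_NM

print(f"contour length of core DNA: {core_length_nm:.1f} nm")
print(f"bp per superhelical turn:   {bp_per_superhelical_turn:.1f}")
print(f"approximate compaction:     {compaction:.1f}-fold")
```

Under these assumptions roughly 50 nm of DNA is stored in an 11 nm particle, a compaction of about four- to five-fold for the core particle alone.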

Apoptotic DNA laddering. Digested chromatin is in the first lane; the second contains DNA standard to compare lengths.
 
Scheme of nucleosome organization.
 
The crystal structure of the nucleosome core particle (PDB: 1EQZ)

Nucleosome core particles are observed when chromatin in interphase is treated to cause the chromatin to unfold partially. The resulting image, viewed with an electron microscope, resembles "beads on a string". The string is the DNA, while each bead is a nucleosome core particle. The nucleosome core particle is composed of DNA and histone proteins.

Partial DNase digestion of chromatin reveals its nucleosome structure. Because the DNA portions of nucleosome core particles are less accessible to DNase than the linking sections, DNA is digested into fragments whose lengths are multiples of the distance between nucleosomes (180, 360, 540 base pairs, etc.). Hence a very characteristic pattern similar to a ladder is visible during gel electrophoresis of that DNA. Such digestion can also occur under natural conditions during apoptosis ("cell suicide" or programmed cell death), because the destruction of the cell's own DNA is typically part of that process.
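The ladder pattern described above follows from simple arithmetic. As a minimal sketch, assuming the 180 bp repeat implied by the 180/360/540 series (real repeat lengths vary with species and tissue), the expected band sizes and the number of nucleosomes spanned by a fragment can be estimated as follows; the function names are illustrative only.

```python
# Assumption: a fixed nucleosome repeat length of 180 bp, as implied by the
# 180/360/540 series in the text. Real repeat lengths vary.
REPEAT_BP = 180


def ladder_band_sizes(n_bands, repeat_bp=REPEAT_BP):
    """Expected fragment sizes (bp) for runs of 1..n_bands nucleosomes."""
    return [k * repeat_bp for k in range(1, n_bands + 1)]


def nucleosomes_in_fragment(fragment_bp, repeat_bp=REPEAT_BP):
    """Estimate how many nucleosomes a fragment spans (nearest multiple)."""
    return max(1, round(fragment_bp / repeat_bp))


print(ladder_band_sizes(4))          # [180, 360, 540, 720]
print(nucleosomes_in_fragment(550))  # 3
```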

Protein interactions within the nucleosome

The core histone proteins contain a characteristic structural motif termed the "histone fold", which consists of three alpha-helices (α1-3) separated by two loops (L1-2). In solution, the histones form H2A-H2B heterodimers and H3-H4 heterotetramers. Histones dimerise about their long α2 helices in an anti-parallel orientation, and, in the case of H3 and H4, two such dimers form a 4-helix bundle stabilised by extensive H3-H3’ interaction. The H2A/H2B dimer binds onto the H3/H4 tetramer due to interactions between H4 and H2B, which include the formation of a hydrophobic cluster. The histone octamer is formed by a central H3/H4 tetramer sandwiched between two H2A/H2B dimers. Due to the highly basic charge of all four core histones, the histone octamer is stable only in the presence of DNA or very high salt concentrations.

Histone-DNA interactions

The nucleosome contains over 120 direct protein-DNA interactions and several hundred water-mediated ones. Direct protein-DNA interactions are not spread evenly about the octamer surface but rather located at discrete sites. These are due to the formation of two types of DNA binding sites within the octamer: the α1α1 site, which uses the α1 helix from two adjacent histones, and the L1L2 site formed by the L1 and L2 loops. Salt links and hydrogen bonding between both side-chain basic and hydroxyl groups and main-chain amides with the DNA backbone phosphates form the bulk of interactions with the DNA. This is important, given that the ubiquitous distribution of nucleosomes along genomes requires the histone octamer to be a non-sequence-specific DNA-binding factor. Although nucleosomes tend to prefer some DNA sequences over others, they are capable of binding practically any sequence, which is thought to be due to the flexibility in the formation of these water-mediated interactions. In addition, non-polar interactions are made between protein side-chains and the deoxyribose groups, and an arginine side-chain intercalates into the DNA minor groove at all 14 sites where it faces the octamer surface. The distribution and strength of DNA-binding sites about the octamer surface distort the DNA within the nucleosome core. The DNA is non-uniformly bent and also contains twist defects. The twist of free B-form DNA in solution is 10.5 bp per turn. However, the overall twist of nucleosomal DNA is only 10.2 bp per turn, varying from a value of 9.4 to 10.9 bp per turn.
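To put the twist values above in perspective, the short sketch below compares how many helical turns the same stretch of DNA makes at 10.5 and at 10.2 bp per turn. It assumes the 146 bp core length quoted elsewhere in the article and treats the twist as uniform, which the text itself notes is a simplification.

```python
# Assumption: 146 bp of core DNA with uniform twist along its length.
CORE_DNA_BP = 146
FREE_BP_PER_TURN = 10.5         # free B-form DNA in solution (from the text)
NUCLEOSOMAL_BP_PER_TURN = 10.2  # average for nucleosomal DNA (from the text)


def helical_turns(bp, bp_per_turn):
    """Number of double-helical turns made by a DNA segment."""
    return bp / bp_per_turn


free_turns = helical_turns(CORE_DNA_BP, FREE_BP_PER_TURN)
bound_turns = helical_turns(CORE_DNA_BP, NUCLEOSOMAL_BP_PER_TURN)
extra_turns = bound_turns - free_turns

print(f"free DNA:        {free_turns:.2f} turns")
print(f"nucleosomal DNA: {bound_turns:.2f} turns")
print(f"difference:      {extra_turns:.2f} turns")
```

On these numbers the nucleosomal DNA is overtwisted by roughly 0.4 of a helical turn over the length of the core particle, relative to the same sequence free in solution.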

Histone tail domains

The histone tail extensions constitute up to 30% by mass of histones, but are not visible in the crystal structures of nucleosomes due to their high intrinsic flexibility, and have been thought to be largely unstructured. The N-terminal tails of histones H3 and H2B pass through a channel formed by the minor grooves of the two DNA strands, protruding from the DNA every 20 bp. The N-terminal tail of histone H4, on the other hand, has a region of highly basic amino acids (16-25), which, in the crystal structure, forms an interaction with the highly acidic surface region of a H2A-H2B dimer of another nucleosome, being potentially relevant for the higher-order structure of nucleosomes. This interaction is thought to occur under physiological conditions also, and suggests that acetylation of the H4 tail distorts the higher-order structure of chromatin.

Higher order structure

The current chromatin compaction model.
 
The organization of the DNA that is achieved by the nucleosome cannot fully explain the packaging of DNA observed in the cell nucleus. Further compaction of chromatin into the cell nucleus is necessary, but is not yet well understood. The current understanding is that repeating nucleosomes with intervening "linker" DNA form a 10-nm-fiber, described as "beads on a string", and have a packing ratio of about five to ten. A chain of nucleosomes can be arranged in a 30 nm fiber, a compacted structure with a packing ratio of ~50 and whose formation is dependent on the presence of the H1 histone.

A crystal structure of a tetranucleosome has been presented and used to build up a proposed structure of the 30 nm fiber as a two-start helix. There is still a certain amount of contention regarding this model, as it is incompatible with recent electron microscopy data. Beyond this, the structure of chromatin is poorly understood, but it is classically suggested that the 30 nm fiber is arranged into loops along a central protein scaffold to form transcriptionally active euchromatin. Further compaction leads to transcriptionally inactive heterochromatin.

Dynamics

Although the nucleosome is a very stable protein-DNA complex, it is not static and has been shown to undergo a number of different structural re-arrangements including nucleosome sliding and DNA site exposure. Depending on the context, nucleosomes can inhibit or facilitate transcription factor binding. Nucleosome positions are controlled by three major contributions: First, the intrinsic binding affinity of the histone octamer depends on the DNA sequence. Second, the nucleosome can be displaced or recruited by the competitive or cooperative binding of other protein factors. Third, the nucleosome may be actively translocated by ATP-dependent remodeling complexes.

Nucleosome sliding

Work performed in the Bradbury laboratory showed that nucleosomes reconstituted onto the 5S DNA positioning sequence were able to reposition themselves translationally onto adjacent sequences when incubated thermally. Later work showed that this repositioning did not require disruption of the histone octamer but was consistent with nucleosomes being able to "slide" along the DNA in cis. In 2008, it was further revealed that CTCF binding sites act as nucleosome positioning anchors so that, when used to align various genomic signals, multiple flanking nucleosomes can be readily identified. Although nucleosomes are intrinsically mobile, eukaryotes have evolved a large family of ATP-dependent chromatin remodelling enzymes to alter chromatin structure, many of which do so via nucleosome sliding. In 2012, Beena Pillai's laboratory demonstrated that nucleosome sliding is one of the possible mechanisms for large-scale tissue-specific expression of genes. The work shows that the transcription start sites of genes expressed in a particular tissue are nucleosome-depleted, while the same set of genes in other tissues where they are not expressed are nucleosome-bound.

DNA site exposure

Work from the Widom laboratory has shown that nucleosomal DNA is in equilibrium between a wrapped and unwrapped state. Measurements of these rates using time-resolved FRET revealed that DNA within the nucleosome remains fully wrapped for only 250 ms before it is unwrapped for 10-50 ms and then rapidly rewrapped. This implies that DNA does not need to be actively dissociated from the nucleosome but that there is a significant fraction of time during which it is fully accessible. Indeed, this can be extended to the observation that introducing a DNA-binding sequence within the nucleosome increases the accessibility of adjacent regions of DNA when bound. This propensity for DNA within the nucleosome to “breathe” has important functional consequences for all DNA-binding proteins that operate in a chromatin environment. In particular, the dynamic breathing of nucleosomes plays an important role in restricting the advancement of RNA polymerase II during transcription elongation.
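The dwell times quoted above can be turned into a rough estimate of how much of the time nucleosomal DNA is accessible. The sketch assumes a simple two-state picture in which the DNA alternates between a wrapped state lasting 250 ms and an unwrapped state lasting 10 to 50 ms; this simplification is an assumption made here for illustration, not a model taken from the original measurements.

```python
def unwrapped_fraction(wrapped_ms, unwrapped_ms):
    """Fraction of time spent unwrapped in a simple two-state cycle."""
    return unwrapped_ms / (wrapped_ms + unwrapped_ms)


WRAPPED_MS = 250                      # from the text
UNWRAPPED_MS_RANGE = (10, 50)         # from the text

low = unwrapped_fraction(WRAPPED_MS, UNWRAPPED_MS_RANGE[0])
high = unwrapped_fraction(WRAPPED_MS, UNWRAPPED_MS_RANGE[1])

print(f"unwrapped between {low:.1%} and {high:.1%} of the time")
```

Under this assumption the DNA is exposed for roughly 4% to 17% of the time, which is consistent with the statement that a significant fraction of time is available for binding without active dissociation.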

Nucleosome free region

Promoters of active genes have nucleosome free regions (NFR). This allows for promoter DNA accessibility to various proteins, such as transcription factors. The nucleosome free region typically spans about 200 nucleotides in S. cerevisiae. Well-positioned nucleosomes form the boundaries of the NFR. These nucleosomes are called the +1-nucleosome and the −1-nucleosome and are located at canonical distances downstream and upstream, respectively, from the transcription start site. The +1-nucleosome and several downstream nucleosomes also tend to incorporate the H2A.Z histone variant.
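As an illustration of the terminology, the sketch below picks out the −1 and +1 nucleosomes from a list of nucleosome centre (dyad) positions and estimates the width of the gap between them. Several things here are assumptions for the sake of the example: the function name, the use of dyad coordinates relative to the transcription start site (TSS), the 146 bp footprint per nucleosome, and the toy positions, which are invented and not taken from any dataset.

```python
# Assumption: each nucleosome protects about 146 bp centred on its dyad.
FOOTPRINT_BP = 146


def describe_nfr(dyad_positions, tss=0, footprint_bp=FOOTPRINT_BP):
    """Locate the -1 and +1 nucleosomes and the gap between them.

    dyad_positions: nucleosome centre coordinates (bp) for a gene read
    left to right, in the same coordinate system as tss.
    """
    upstream = [d for d in dyad_positions if d < tss]
    downstream = [d for d in dyad_positions if d >= tss]
    if not upstream or not downstream:
        raise ValueError("need at least one nucleosome on each side of the TSS")
    minus1 = max(upstream)    # closest nucleosome upstream of the TSS
    plus1 = min(downstream)   # closest nucleosome downstream of the TSS
    half = footprint_bp // 2
    nfr_width = max(0, (plus1 - half) - (minus1 + half))
    return {"minus1": minus1, "plus1": plus1, "nfr_width": nfr_width}


# Hypothetical dyad positions, chosen only to illustrate the calculation.
toy_dyads = [-560, -395, -230, 116, 281, 446]
result = describe_nfr(toy_dyads)
print(result)  # {'minus1': -230, 'plus1': 116, 'nfr_width': 200}
```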

Modulating nucleosome structure

Eukaryotic genomes are ubiquitously associated into chromatin; however, cells must spatially and temporally regulate specific loci independently of bulk chromatin. In order to achieve the high level of control required to co-ordinate nuclear processes such as DNA replication, repair, and transcription, cells have developed a variety of means to locally and specifically modulate chromatin structure and function. This can involve covalent modification of histones, the incorporation of histone variants, and non-covalent remodelling by ATP-dependent remodeling enzymes.

Histone post-translational modifications

Since they were discovered in the mid-1960s, histone modifications have been predicted to affect transcription. The fact that most of the early post-translational modifications found were concentrated within the tail extensions that protrude from the nucleosome core led to two main theories regarding the mechanism of histone modification. The first of the theories suggested that they may affect electrostatic interactions between the histone tails and DNA to “loosen” chromatin structure. Later it was proposed that combinations of these modifications may create binding epitopes with which to recruit other proteins. Recently, given that more modifications have been found in the structured regions of histones, it has been put forward that these modifications may affect histone-DNA and histone-histone interactions within the nucleosome core. Modifications (such as acetylation or phosphorylation) that lower the charge of the globular histone core are predicted to "loosen" core-DNA association; the strength of the effect depends on the location of the modification within the core. Some modifications have been shown to be correlated with gene silencing; others seem to be correlated with gene activation. Common modifications include acetylation, methylation, or ubiquitination of lysine; methylation of arginine; and phosphorylation of serine. The information stored in this way is considered epigenetic, since it is not encoded in the DNA but is still inherited by daughter cells. The maintenance of a repressed or activated status of a gene is often necessary for cellular differentiation.

Histone variants

Although histones are remarkably conserved throughout evolution, several variant forms have been identified. This diversification of histone function is restricted to H2A and H3, with H2B and H4 being mostly invariant. H2A can be replaced by H2A.Z (which leads to reduced nucleosome stability) or H2A.X (which is associated with DNA repair and T cell differentiation), whereas the inactive X chromosomes in mammals are enriched in macroH2A. H3 can be replaced by H3.3 (which correlates with active genes and regulatory elements), and in centromeres H3 is replaced by CENP-A.

ATP-dependent nucleosome remodeling

A number of distinct reactions are associated with the term ATP-dependent chromatin remodeling. Remodeling enzymes have been shown to slide nucleosomes along DNA, disrupt histone-DNA contacts to the extent of destabilizing the H2A/H2B dimer and to generate negative superhelical torsion in DNA and chromatin. Recently, the Swr1 remodeling enzyme has been shown to introduce the variant histone H2A.Z into nucleosomes. At present, it is not clear if all of these represent distinct reactions or merely alternative outcomes of a common mechanism. What is shared between all, and indeed the hallmark of ATP-dependent chromatin remodeling, is that they all result in altered DNA accessibility. 

Studies looking at gene activation in vivo and, more astonishingly, remodeling in vitro have revealed that chromatin remodeling events and transcription-factor binding are cyclical and periodic in nature. While the consequences of this for the reaction mechanism of chromatin remodeling are not known, the dynamic nature of the system may allow it to respond faster to external stimuli. A recent study indicates that nucleosome positions change significantly during mouse embryonic stem cell development, and these changes are related to binding of developmental transcription factors.

Dynamic nucleosome remodelling across the Yeast genome

Studies in 2007 catalogued nucleosome positions in yeast and showed that nucleosomes are depleted in promoter regions and origins of replication. About 80% of the yeast genome appears to be covered by nucleosomes, and the pattern of nucleosome positioning clearly relates to DNA regions that regulate transcription, regions that are transcribed and regions that initiate DNA replication. Most recently, a new study examined dynamic changes in nucleosome repositioning during a global transcriptional reprogramming event to elucidate the effects on nucleosome displacement during genome-wide transcriptional changes in yeast (Saccharomyces cerevisiae). The results suggested that nucleosomes that were localized to promoter regions are displaced in response to stress (like heat shock). In addition, the removal of nucleosomes usually corresponded to transcriptional activation and the replacement of nucleosomes usually corresponded to transcriptional repression, presumably because transcription factor binding sites became more or less accessible, respectively. In general, only one or two nucleosomes were repositioned at the promoter to effect these transcriptional changes. However, even in chromosomal regions that were not associated with transcriptional changes, nucleosome repositioning was observed, suggesting that the covering and uncovering of transcriptional DNA does not necessarily produce a transcriptional event. After transcription, the rDNA region has to be protected from damage; it has been suggested that HMGB proteins play a major role in protecting the nucleosome free region.

Nucleosome assembly in vitro

Diagram of nucleosome assembly.

Nucleosomes can be assembled in vitro using either purified native or recombinant histones. One standard technique of loading the DNA around the histones involves the use of salt dialysis. A reaction consisting of the histone octamers and a naked DNA template can be incubated together at a salt concentration of 2 M. As the salt concentration is steadily decreased, the DNA equilibrates to a position where it is wrapped around the histone octamers, forming nucleosomes. In appropriate conditions, this reconstitution process allows the nucleosome positioning affinity of a given sequence to be mapped experimentally.

Disulfide crosslinked nucleosome core particles

A recent advance in the production of nucleosome core particles with enhanced stability involves site-specific disulfide crosslinks. Two different crosslinks can be introduced into the nucleosome core particle. The first crosslinks the two copies of H2A via an introduced cysteine (N38C), resulting in a histone octamer that is stable against H2A/H2B dimer loss during nucleosome reconstitution. A second crosslink can be introduced between the H3 N-terminal histone tail and the nucleosome DNA ends via an incorporated convertible nucleotide. The DNA-histone octamer crosslink stabilizes the nucleosome core particle against DNA dissociation at very low particle concentrations and at elevated salt concentrations.

Nucleosome assembly in vivo

Nucleosomes are the basic packing unit of DNA built from histone proteins around which DNA is coiled. They serve as a scaffold for formation of higher order chromatin structure as well as for a layer of regulatory control of gene expression. Nucleosomes are quickly assembled onto newly synthesized DNA behind the replication fork.

H3 and H4

Histones H3 and H4 from disassembled old nucleosomes are kept in the vicinity and randomly distributed on the newly synthesized DNA. They are assembled by the chromatin assembly factor-1 (CAF-1) complex, which consists of three subunits (p150, p60, and p48). Newly synthesized H3 and H4 are assembled by the replication coupling assembly factor (RCAF). RCAF contains the subunit Asf1, which binds to newly synthesized H3 and H4 proteins. The old H3 and H4 proteins retain their chemical modifications which contributes to the passing down of the epigenetic signature. The newly synthesized H3 and H4 proteins are gradually acetylated at different lysine residues as part of the chromatin maturation process. It is also thought that the old H3 and H4 proteins in the new nucleosomes recruit histone modifying enzymes that mark the new histones, contributing to epigenetic memory.

H2A and H2B

In contrast to old H3 and H4, the old H2A and H2B histone proteins are released and degraded; therefore, newly assembled H2A and H2B proteins are incorporated into new nucleosomes. H2A and H2B are assembled into dimers, which are then loaded onto nucleosomes by the nucleosome assembly protein-1 (NAP-1), which also assists with nucleosome sliding. The nucleosomes are also spaced by ATP-dependent nucleosome-remodeling complexes containing enzymes such as Isw1, Ino80, and Chd1, and subsequently assembled into higher order structure.

Pathogenomics

From Wikipedia, the free encyclopedia
Pathogenomics is a field which uses high-throughput screening technology and bioinformatics to study encoded microbe resistance, as well as virulence factors (VFs), which enable a microorganism to infect a host and possibly cause disease. This includes studying genomes of pathogens which cannot be cultured outside of a host. In the past, researchers and medical professionals found it difficult to study and understand pathogenic traits of infectious organisms. With newer technology, pathogen genomes can be identified and sequenced in a much shorter time and at a lower cost, thus improving the ability to diagnose, treat, and even predict and prevent pathogenic infections and disease. It has also allowed researchers to better understand genome evolution events - gene loss, gain, duplication, rearrangement - and how those events impact pathogen resistance and ability to cause disease. This influx of information has created a need for making the vast amounts of data accessible to researchers in the form of databases, and it has raised ethical questions about the wisdom of reconstructing previously extinct and deadly pathogens in order to better understand virulence.

Reviewing high-throughput screening results

History

During the earlier times when genomics was being studied, scientists found it challenging to sequence genetic information. The field began to explode in 1977 when Fred Sanger, PhD, along with his colleagues, sequenced the DNA-based genome of a bacteriophage, using a method now known as the Sanger Method. The Sanger Method for sequencing DNA exponentially advanced molecular biology and directly led to the ability to sequence genomes of other organisms, including the complete human genome.

The Haemophilus influenzae genome was one of the first genomes of an organism to be sequenced, in 1995, by J. Craig Venter and Hamilton Smith using whole genome shotgun sequencing. Since then, newer and more efficient high-throughput sequencing methods, such as Next Generation Genomic Sequencing (NGS) and Single-Cell Genomic Sequencing, have been developed. While the Sanger method is able to sequence one DNA fragment at a time, NGS technology can sequence thousands of sequences at a time. With the ability to rapidly sequence DNA, new insights developed, such as the realization that, because prokaryotic genomes are more diverse than originally thought, it is necessary to sequence many strains of a species rather than only a few. E. coli was an example of why this is important, with genes encoding virulence factors in two strains of the species differing by at least thirty percent. Such knowledge, along with more thorough study of genome gain, loss, and change, is giving researchers valuable insight into how pathogens interact in host environments and how they are able to infect hosts and cause disease.

With this high influx of new information, there has arisen a higher demand for bioinformatics so scientists can properly analyze the new data. In response, software and other tools have been developed for this purpose. Also, as of 2008, the amount of stored sequences was doubling every 18 months, making urgent the need for better ways to organize data and aid research. In response, thousands of publicly accessible databases and other resources have been created, including the Virulence Factor Database (VFDB) of pathogenic bacteria, which was established in 2004 and was created to aid in pathogenomics research.
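The 18-month doubling time mentioned above implies exponential growth. As a minimal sketch, assuming the doubling rate stayed constant (an extrapolation that the text does not claim), the growth factor after a given number of months can be computed as follows.

```python
# From the text: as of 2008, stored sequences doubled every 18 months.
DOUBLING_MONTHS = 18


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Factor by which the stored data grows after `months`,
    assuming a constant doubling time (an illustrative assumption)."""
    return 2 ** (months / doubling_months)


for months in (18, 36, 120):
    print(f"after {months:3d} months: x{growth_factor(months):.1f}")
```

At this rate the amount of stored sequence would grow about a hundredfold in ten years, which illustrates why organized, publicly accessible databases became necessary.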

Microbe analysis

Pathogens may be prokaryotic (archaea or bacteria), single-celled eukarya or viruses. Prokaryotic genomes have typically been easier to sequence due to their smaller genome size compared to Eukarya. Because of this, there is a bias towards reporting pathogenic bacterial behavior. Regardless of this bias in reporting, many of the dynamic genomic events are similar across all the types of pathogen organisms. Genomic evolution occurs via gene gain, gene loss, and genome rearrangement, and these "events" are observed in multiple pathogen genomes, with some bacterial pathogens experiencing all three. Pathogenomics does not focus exclusively on understanding pathogen-host interactions, however. Insight into individual or cooperative pathogen behavior provides knowledge about the development or inheritance of pathogen virulence factors. Through a deeper understanding of the small sub-units that cause infection, it may be possible to develop novel therapeutics that are efficient and cost-effective.

Cause and analysis of genomic diversity

Dynamic genomes with high plasticity are necessary to allow pathogens, especially bacteria, to survive in changing environments. With the assistance of high throughput sequencing methods and in silico technologies, it is possible to detect, compare and catalogue many of these dynamic genomic events. Genomic diversity is important when detecting and treating a pathogen since these events can change the function and structure of the pathogen. There is a need to analyze more than a single genome sequence of a pathogen species to understand pathogen mechanisms. Comparative genomics is a methodology which allows scientists to compare the genomes of different species and strains. There are several examples of successful comparative genomics studies, among them the analysis of Listeria and Escherichia coli. Some studies have attempted to address the difference between pathogenic and non-pathogenic microbes. This inquiry proves to be difficult, however, since a single bacterial species can have many strains, and the genomic content of each of these strains varies.  

Evolutionary dynamics

Varying microbe strains and genomic content are caused by different forces, including three specific evolutionary events that affect pathogen resistance and the ability to cause disease: gene gain, gene loss, and genome rearrangement.

Gene loss and genome decay

Gene loss occurs when genes are deleted. The reason why this occurs is still not fully understood, though it most likely involves adaptation to a new environment or ecological niche. Some researchers believe gene loss may actually increase fitness and survival among pathogens. In a new environment, some genes may become unnecessary for survival, and so mutations are eventually "allowed" on those genes until they become inactive "pseudogenes." These pseudogenes are observed in organisms such as Shigella flexneri, Salmonella enterica, and Yersinia pestis. Over time, the pseudogenes are deleted, and the organisms become fully dependent on their host as either endosymbionts or obligate intracellular pathogens, as is seen in Buchnera, Mycobacterium leprae, and Chlamydia trachomatis. These deleted genes are also called anti-virulence genes (AVGs), since it is thought they may have prevented the organism from becoming pathogenic. In order to be more virulent, infect a host and remain alive, the pathogen had to get rid of those AVGs. The reverse process can happen as well, as was seen during analysis of Listeria strains, which showed that a reduced genome size led to a non-pathogenic Listeria strain from a pathogenic strain.[26] Systems have been developed to detect these pseudogenes/AVGs in a genome sequence.

Summary of dynamic genomics events

Gene gain and duplication

One of the key forces driving gene gain is thought to be horizontal (lateral) gene transfer (LGT). It is of particular interest in microbial studies because these mobile genetic elements may introduce virulence factors into a new genome. A comparative study conducted by Gill et al. in 2005 postulated that LGT may have been the cause for pathogen variations between Staphylococcus epidermidis and Staphylococcus aureus. There still, however, remains skepticism about the frequency of LGT, its identification, and its impact. New and improved methodologies have been engaged, especially in the study of phylogenetics, to validate the presence and effect of LGT. Gene gain and gene duplication events are balanced by gene loss, such that despite their dynamic nature, the genome of a bacterial species remains approximately the same size.

Genome rearrangement

Mobile genetic insertion sequences can play a role in genome rearrangement activities. Pathogens that do not live in an isolated environment have been found to contain a large number of insertion sequence elements and various repetitive segments of DNA. The combination of these two genetic elements is thought to help mediate homologous recombination. There are pathogens, such as Burkholderia mallei and Burkholderia pseudomallei, which have been shown to exhibit genome-wide rearrangements due to insertion sequences and repetitive DNA segments. At this time, no studies demonstrate genome-wide rearrangement events directly giving rise to pathogenic behavior in a microbe. This does not mean it is not possible. Genome-wide rearrangements do, however, contribute to the plasticity of the bacterial genome, which may prime the conditions for other factors to introduce, or lose, virulence factors.

Single-nucleotide polymorphisms

Single Nucleotide Polymorphisms, or SNPs, allow for a wide array of genetic variation among humans as well as pathogens. They allow researchers to estimate a variety of factors: the effects of environmental toxins, how different treatment methods affect the body, and what causes someone's predisposition to illnesses. SNPs play a key role in understanding how and why mutations occur. SNPs also allow scientists to map genomes and analyze genetic information.
Pan and core genomes
Pan-genome overview

The most recent definition of a bacterial species comes from the pre-genomic era. In 1987, it was proposed that bacterial strains showing >70% DNA·DNA re-association and sharing characteristic phenotypic traits should be considered to be strains of the same species. The diversity within pathogen genomes makes it difficult to identify the total number of genes that are associated with all strains of a pathogen species. It has been thought that the total number of genes associated with a single pathogen species may be unlimited, although some groups are attempting to derive a more empirical value. For this reason, it was necessary to introduce the concept of pan-genomes and core genomes. Pan-genome and core genome literature also tends to have a bias towards reporting on prokaryotic pathogenic organisms. Caution may need to be exercised when extending the definition of a pan-genome or a core genome to other pathogenic organisms because there is no formal evidence of the properties of these pan-genomes.

A core genome is the set of genes found across all strains of a pathogen species. A pan-genome is the entire gene pool for that pathogen species, and includes genes that are not shared by all strains. Pan-genomes may be open or closed depending on whether comparative analysis of multiple strains reveals no new genes (closed) or many new genes (open) compared to the core genome for that pathogen species. In the open pan-genome, genes may be further characterized as dispensable or strain-specific. Dispensable genes are those found in more than one strain, but not in all strains, of a pathogen species. Strain-specific genes are those found only in one strain of a pathogen species. The differences in pan-genomes are reflections of the lifestyle of the organism. For example, Streptococcus agalactiae, which exists in diverse biological niches, has a broader pan-genome when compared with the more environmentally isolated Bacillus anthracis. Comparative genomics approaches are also being used to understand more about the pan-genome. Recent discoveries show that the number of new species continues to grow: with an estimated 10^31 bacteriophages on the planet, infecting 10^24 others per second, the continuous flow of genetic material being exchanged is difficult to imagine.
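The definitions above map directly onto set operations. The following is a minimal sketch, assuming each strain is represented simply as a set of gene identifiers; the function name, strain names and gene names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def partition_pan_genome(strains):
    """Split a species' gene pool into core, dispensable and strain-specific genes.

    `strains` maps a strain name to the set of gene identifiers it carries.
    """
    # Count in how many strains each gene occurs.
    counts = Counter(gene for genes in strains.values() for gene in set(genes))
    n_strains = len(strains)

    pan = set(counts)  # every gene seen in any strain
    core = {g for g, c in counts.items() if c == n_strains}  # present in all strains
    strain_specific = {g for g, c in counts.items() if c == 1}  # present in exactly one
    dispensable = pan - core - strain_specific  # more than one strain, but not all
    return {
        "pan": pan,
        "core": core,
        "dispensable": dispensable,
        "strain_specific": strain_specific,
    }


# Hypothetical gene content for three strains (illustrative names only).
example_strains = {
    "strain_A": {"gyrA", "recA", "toxX", "capB"},
    "strain_B": {"gyrA", "recA", "capB"},
    "strain_C": {"gyrA", "recA", "phgZ"},
}
result = partition_pan_genome(example_strains)
```

With only one strain, every gene is trivially both core and strain-specific, so the partition is only meaningful when several strains are compared. In practice the hard part is deciding when two genes in different strains count as "the same gene", which this sketch leaves out.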

Virulence factors

Multiple genetic elements of human-affecting pathogens contribute to the transfer of virulence factors: plasmids, pathogenicity islands, prophages, bacteriophages, transposons, and integrative and conjugative elements. Pathogenicity islands and their detection are the focus of several bioinformatics efforts involved in pathogenomics. It is a common belief that "environmental bacterial strains" lack the capacity to harm or do damage to humans. However, recent studies show that bacteria from aquatic environments have acquired pathogenic strains through evolution. This gives the bacteria a wider range of genetic traits and can pose a potential threat to humans, since such strains show more resistance towards antibiotics.

Microbe-microbe interactions

Staphylococcus aureus biofilm

Microbe-host interactions tend to overshadow the consideration of microbe-microbe interactions. Microbe-microbe interactions, though, can lead to chronic states of infirmity that are difficult to understand and treat.

Biofilms

Biofilms are an example of microbe-microbe interactions and are thought to be associated with up to 80% of human infections. Recently it has been shown that there are specific genes and cell surface proteins involved in the formation of biofilm. These genes and also surface proteins may be characterized through in silico methods to form an expression profile of biofilm-interacting bacteria. This expression profile may be used in subsequent analysis of other microbes to predict biofilm microbe behaviour, or to understand how to dismantle biofilm formation.

Host microbe analysis

Pathogens have the ability to adapt and manipulate host cells, taking full advantage of a host cell's cellular processes and mechanisms.

A microbe may be influenced by hosts to either adapt to its new environment or learn to evade it. Insight into these behaviours will be valuable for potential therapeutics. The most detailed outline of host-microbe interaction initiatives is given by the Pathogenomics European Research Agenda. Its report emphasizes the following features:

Summary of host-microbe project goals in the Pathogenomics European Research Agenda
  • Microarray analysis of host and microbe gene expression during infection. This is important for identifying the expression of virulence factors that allow a pathogen to survive a host's defense mechanism. Pathogens tend to undergo an assortment of changes in order to subvert the host's immune system, in some cases favoring a hypervariable genome state. The genomic expression studies will be complemented with studies of protein-protein interaction networks.
  • Using RNA interference (RNAi) to identify host cell functions in response to infections. Infection depends on the balance between the characteristics of the host cell and the pathogen cell. In some cases, there can be an overactive host response to infection, such as in meningitis, which can overwhelm the host's body. Using RNAi, it will be possible to more clearly identify how a host cell defends itself during times of acute or chronic infection. This has also been applied successfully in Drosophila.
  • Not all microbe interactions in the host environment are malicious. Commensal flora, which exists in various environments in animals and humans, may actually help combat microbial infections. The human flora, such as that of the gut, is home to a myriad of microbes.
The diverse community within the gut has been heralded as vital for human health. There are a number of projects under way to better understand the ecosystems of the gut. The sequence of commensal Escherichia coli strain SE11, for example, has already been determined from the faecal matter of a healthy human and promises to be the first of many studies. Through genomic analysis and subsequent protein analysis, the beneficial properties of commensal flora will be investigated in hopes of understanding how to build a better therapeutic.

Eco-evo perspective

The "eco-evo" perspective on pathogen-host interactions emphasizes the influence of ecology and the environment on pathogen evolution. The dynamic genomic factors, such as gene loss, gene gain and genome rearrangement, are all strongly influenced by changes in the ecological niche where a particular microbial strain resides. Microbes may switch between being pathogenic and non-pathogenic due to changing environments. This was demonstrated during studies of the plague bacterium, Yersinia pestis, which apparently evolved from a mild gastrointestinal pathogen to a very highly pathogenic microbe through dynamic genomic events. In order for colonization to occur, there must be changes in biochemical makeup to aid survival in a variety of environments. This is most likely due to a mechanism allowing the cell to sense changes within the environment, thus influencing change in gene expression. Understanding how these strain changes occur from being low or non-pathogenic to being highly pathogenic and vice versa may aid in developing novel therapeutics for microbial infections.

Applications

Baby Receiving Immunizations

Human health has greatly improved and the mortality rate has declined substantially since the Second World War because of improved hygiene due to changing public health regulations, as well as more readily available vaccines and antibiotics. Pathogenomics will allow scientists to expand what they know about pathogenic and non-pathogenic microbes, thus allowing for new and improved vaccines. Pathogenomics also has wider implications, including preventing bioterrorism.

Reverse vaccinology

Reverse vaccinology is relatively new. While research is still being conducted, there have been breakthroughs with pathogens such as Streptococcus and Meningitis. Methods of vaccine production, such as biochemical and serological methods, are laborious and unreliable. They require the pathogens to be in vitro to be effective. New advances in genomic development help predict nearly all variations of pathogens, thus making advances for vaccines. Protein-based vaccines are being developed to combat resistant pathogens such as Staphylococcus and Chlamydia.

Countering bioterrorism

In 2005, the sequence of the 1918 Spanish influenza was completed. Accompanied by phylogenetic analysis, it was possible to supply a detailed account of the virus' evolution and behavior, in particular its adaptation to humans. Following the sequencing of the Spanish influenza, the pathogen was also reconstructed. When inserted into mice, the pathogen proved to be incredibly deadly. The 2001 anthrax attacks shed light on the possibility of bioterrorism as a real rather than imagined threat. Bioterrorism was anticipated in the Iraq war, with soldiers being inoculated for a smallpox attack. Using technologies and insight gained from reconstruction of the Spanish influenza, it may be possible to prevent future deadly planted outbreaks of disease. There is a strong ethical concern, however, as to whether the resurrection of old viruses is necessary and whether it does more harm than good. The best avenue for countering such threats is coordinating with organizations which provide immunizations. The increased awareness and participation would greatly decrease the effectiveness of a potential epidemic. An addition to this measure would be to monitor natural water reservoirs as a basis to prevent an attack or outbreak. Overall, communication between labs and large organizations, such as the Global Outbreak Alert and Response Network (GOARN), can lead to early detection and prevent outbreaks.

Chromatin

From Wikipedia, the free encyclopedia
  
The major structures in DNA compaction: DNA, the nucleosome, the 10 nm "beads-on-a-string" fibre, the 30 nm chromatin fibre and the metaphase chromosome.

Chromatin is a complex of DNA and protein found in eukaryotic cells. Its primary function is packaging long DNA molecules into more compact, denser structures. This prevents the strands from becoming tangled and also plays important roles in reinforcing the DNA during cell division, preventing DNA damage, and regulating gene expression and DNA replication. During mitosis and meiosis, chromatin facilitates proper segregation of the chromosomes in anaphase; the characteristic shapes of chromosomes visible during this stage are the result of DNA being coiled into highly condensed chromatin.

The primary protein components of chromatin are histones, which bind to DNA and function as "anchors" around which the strands are wound. In general, there are three levels of chromatin organization:
  1. DNA wraps around histone proteins, forming nucleosomes and the so-called "beads on a string" structure (euchromatin).
  2. Multiple histones wrap into a 30-nanometer fibre consisting of nucleosome arrays in their most compact form (heterochromatin).
  3. Higher-level DNA supercoiling of the 30-nm fiber produces the metaphase chromosome (during mitosis and meiosis).
Many organisms, however, do not follow this organization scheme. For example, spermatozoa and avian red blood cells have more tightly packed chromatin than most eukaryotic cells, and trypanosomatid protozoa do not condense their chromatin into visible chromosomes at all. Prokaryotic cells have entirely different structures for organizing their DNA (the prokaryotic chromosome equivalent is called a genophore and is localized within the nucleoid region).

The overall structure of the chromatin network further depends on the stage of the cell cycle. During interphase, the chromatin is structurally loose to allow access to RNA and DNA polymerases that transcribe and replicate the DNA. The local structure of chromatin during interphase depends on the specific genes present in the DNA. Regions of DNA containing genes which are actively transcribed ("turned on") are less tightly compacted and closely associated with RNA polymerases in a structure known as euchromatin, while regions containing inactive genes ("turned off") are generally more condensed and associated with structural proteins in heterochromatin. Epigenetic modification of the structural proteins in chromatin via methylation and acetylation also alters local chromatin structure and therefore gene expression. The structure of chromatin networks is currently poorly understood and remains an active area of research in molecular biology.

Dynamic chromatin structure and hierarchy

Chromatin undergoes various structural changes during a cell cycle. Histone proteins are the basic packers and arrangers of chromatin and can be modified by various post-translational modifications (histone modifications) to alter chromatin packing. Most of the modifications occur on the histone tail. The consequences in terms of chromatin accessibility and compaction depend both on the amino acid that is modified and the type of modification. For example, histone acetylation results in loosening and increased accessibility of chromatin for replication and transcription. Lysine tri-methylation can be correlated either with transcriptional activity (tri-methylation of histone H3 lysine 4) or with transcriptional repression and chromatin compaction (tri-methylation of histone H3 lysine 9 or 27). Several studies suggested that different modifications could occur simultaneously. For example, it was proposed that a bivalent structure (with tri-methylation of both lysine 4 and lysine 27 on histone H3) was involved in mammalian early development.
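The correlations described above can be written as a small lookup. This is only an illustrative sketch: the mark names use the common shorthand (H3K4me3 for tri-methylation of histone H3 lysine 4, and so on), and the function name and the "mixed" and "unmarked" labels are assumptions added here, not established terminology from the text.

```python
def classify_h3_marks(marks):
    """Return a coarse label for a set of histone H3 tri-methylation marks."""
    marks = set(marks)
    k4 = "H3K4me3" in marks    # correlated with transcriptional activity
    k9 = "H3K9me3" in marks    # correlated with repression and compaction
    k27 = "H3K27me3" in marks  # correlated with repression and compaction

    if k4 and k27:
        return "bivalent"      # both K4 and K27 tri-methylated
    if k4 and not k9:
        return "active"
    if (k9 or k27) and not k4:
        return "repressed"
    if k4 and k9:
        return "mixed"         # assumption: not a category named in the text
    return "unmarked"


labels = {
    "promoter_1": classify_h3_marks({"H3K4me3"}),
    "promoter_2": classify_h3_marks({"H3K27me3"}),
    "promoter_3": classify_h3_marks({"H3K4me3", "H3K27me3"}),
}
```

Real chromatin-state calls are made from quantitative signal over many marks rather than from presence or absence of three, so this shows only the logic of the paragraph, not an analysis method.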

Polycomb-group proteins play a role in regulating genes through modulation of chromatin structure.

DNA structure

The structures of A-, B-, and Z-DNA.

In nature, DNA can form three structures, A-, B-, and Z-DNA. A- and B-DNA are very similar, forming right-handed helices, whereas Z-DNA is a left-handed helix with a zig-zag phosphate backbone. Z-DNA is thought to play a specific role in chromatin structure and transcription because of the properties of the junction between B- and Z-DNA.

At the junction of B- and Z-DNA, one pair of bases is flipped out from normal bonding. These flipped-out bases play a dual role: as a site of recognition by many proteins and as a sink for torsional stress from RNA polymerase or nucleosome binding.

Nucleosomes and beads-on-a-string

A cartoon representation of the nucleosome structure. From PDB: 1KX5.

The basic repeat element of chromatin is the nucleosome, interconnected by sections of linker DNA, a far shorter arrangement than pure DNA in solution.

In addition to the core histones, there is the linker histone, H1, which contacts the exit/entry of the DNA strand on the nucleosome. The nucleosome core particle, together with histone H1, is known as a chromatosome. Nucleosomes, with about 20 to 60 base pairs of linker DNA, can form, under non-physiological conditions, an approximately 10 nm "beads-on-a-string" fibre (Fig. 1-2).

Nucleosomes bind DNA non-specifically, as required by their function in general DNA packaging. There are, however, strong DNA sequence preferences that govern nucleosome positioning. This is due primarily to the varying physical properties of different DNA sequences: for instance, adenine and thymine are more favorably compressed into the inner minor grooves. This means nucleosomes can bind preferentially at one position approximately every 10 base pairs (the helical repeat of DNA), where the DNA is rotated to maximise the number of A and T bases that will lie in the inner minor groove.
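One simple way to look for the roughly 10 bp preference described above is to ask how often A/T-only dinucleotides recur at each spacing along a sequence. The sketch below is a toy illustration under that assumption; the function names and the synthetic test sequence are hypothetical, and real positioning analyses use far richer models.

```python
def at_dinucleotide_positions(seq):
    """Return start indices of dinucleotides made only of A and T (AA, AT, TA, TT)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i] in "AT" and seq[i + 1] in "AT"]


def spacing_counts(positions, max_lag):
    """Count, for each lag, how many A/T dinucleotides have a partner `lag` bp downstream."""
    occupied = set(positions)
    return {
        lag: sum(1 for p in positions if p + lag in occupied)
        for lag in range(1, max_lag + 1)
    }


# Synthetic sequence (an assumption for illustration): an AA step every 10 bp
# in an otherwise G/C-only background, 140 bp in total.
synthetic = "AAGCGCGCGC" * 14
counts = spacing_counts(at_dinucleotide_positions(synthetic), max_lag=15)
best_lag = max(counts, key=counts.get)
```

On this artificial sequence the count is concentrated at a lag of 10, as constructed. On genomic sequence the signal is much weaker and is usually seen only after averaging over many aligned nucleosomal fragments.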

30-nanometer chromatin fiber

Two proposed structures of the 30 nm chromatin filament.
Left: 1 start helix "solenoid" structure.
Right: 2 start loose helix structure.
Note: the histones are omitted in this diagram - only the DNA is shown.

With addition of H1, the beads-on-a-string structure in turn coils into a 30 nm diameter helical structure known as the 30 nm fibre or filament. The precise structure of the chromatin fiber in the cell is not known in detail, and there is still some debate over this.

This level of chromatin structure is thought to be the form of heterochromatin, which contains mostly transcriptionally silent genes. Electron microscopy (EM) studies have demonstrated that the 30 nm fiber is highly dynamic, such that it unfolds into a 10 nm fiber ("beads-on-a-string") structure when traversed by an RNA polymerase engaged in transcription.

Four proposed structures of the 30 nm chromatin filament for DNA repeat lengths per nucleosome ranging from 177 to 207 bp.
Linker DNA in yellow and nucleosomal DNA in pink.

The existing models commonly accept that the nucleosomes lie perpendicular to the axis of the fibre, with linker histones arranged internally. A stable 30 nm fibre relies on the regular positioning of nucleosomes along DNA. Linker DNA is relatively resistant to bending and rotation. This makes the length of linker DNA critical to the stability of the fibre, requiring nucleosomes to be separated by lengths that permit rotation and folding into the required orientation without excessive stress to the DNA. In this view, different lengths of the linker DNA should produce different folding topologies of the chromatin fiber. Recent theoretical work, based on electron-microscopy images of reconstituted fibers, supports this view.
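The dependence on linker length can be made concrete with some rough arithmetic. The sketch below assumes a core particle of 146 bp, as quoted for the nucleosome core particle, and a DNA helical repeat of 10.5 bp per turn, a commonly quoted figure that is an assumption here; the repeat lengths are sampled in 10 bp steps across the 177 to 207 bp range mentioned in the figure caption, which is also an illustrative choice. The function names are hypothetical.

```python
CORE_BP = 146          # core particle DNA length used in this sketch
HELICAL_REPEAT = 10.5  # assumed bp per helical turn of linker DNA


def linker_length(repeat_length_bp, core_bp=CORE_BP):
    """Linker DNA length implied by a given nucleosome repeat length."""
    return repeat_length_bp - core_bp


def relative_rotation_deg(linker_bp, helical_repeat=HELICAL_REPEAT):
    """Twist accumulated along the linker, reduced to the range [0, 360) degrees."""
    return (linker_bp / helical_repeat * 360.0) % 360.0


for repeat in (177, 187, 197, 207):
    linker = linker_length(repeat)
    angle = relative_rotation_deg(linker)
    print(f"repeat {repeat} bp -> linker {linker} bp, rotation {angle:.1f} degrees")
```

Because the linker resists twisting, a change of even a few base pairs alters the relative orientation of neighbouring nucleosomes, which is one way to see why different linker lengths would favour different folding topologies. The sketch ignores linker bending, histone H1 and any deviation of the real helical repeat from the assumed value.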

Spatial organization of chromatin in the cell nucleus

The spatial arrangement of the chromatin within the nucleus is not random - specific regions of the chromatin can be found in certain territories. Territories are, for example, the lamina-associated domains (LADs), and the topologically associating domains (TADs), which are bound together by protein complexes. Currently, polymer models such as the Strings & Binders Switch (SBS) model and the Dynamic Loop (DL) model are used to describe the folding of chromatin within the nucleus.

Cell-cycle dependent structural organization

  • Interphase: The structure of chromatin during interphase is optimized to allow simple access of transcription and DNA repair factors to the DNA while compacting the DNA into the nucleus. The structure varies depending on the access required to the DNA. Genes that require regular access by RNA polymerase require the looser structure provided by euchromatin.
  • Metaphase: The metaphase structure of chromatin differs vastly from that of interphase. It is optimised for physical strength and manageability, forming the classic chromosome structure seen in karyotypes. The structure of the condensed chromatin is thought to be loops of 30 nm fibre attached to a central scaffold of proteins. It is, however, not well characterised. The physical strength of chromatin is vital for this stage of division to prevent shear damage to the DNA as the daughter chromosomes are separated. To maximise strength, the composition of the chromatin changes as it approaches the centromere, primarily through alternative histone H1 analogues. During mitosis, although most of the chromatin is tightly compacted, there are small regions that are not as tightly compacted. These regions often correspond to promoter regions of genes that were active in that cell type prior to entry into mitosis. The lack of compaction of these regions is called bookmarking, which is an epigenetic mechanism believed to be important for transmitting to daughter cells the "memory" of which genes were active prior to entry into mitosis. This bookmarking mechanism is needed to help transmit this memory because transcription ceases during mitosis.

Karyogram of human male using Giemsa staining, showing the classic metaphase chromatin structure.

    Chromatin and bursts of transcription

    The interaction of chromatin with enzymes has been studied, and the emerging conclusion is that it is an important factor in gene expression. Vincent G. Allfrey, a professor at Rockefeller University, stated that RNA synthesis is related to histone acetylation. The lysine residues at the ends of the histone tails are positively charged. Acetylation of these tails would neutralize that charge, allowing access to the DNA.

    When the chromatin decondenses, the DNA is open to entry of molecular machinery. Fluctuations between open and closed chromatin may contribute to the discontinuity of transcription, or transcriptional bursting. Other factors are probably involved, such as the association and dissociation of transcription factor complexes with chromatin. The phenomenon, as opposed to simple probabilistic models of transcription, can account for the high variability in gene expression occurring between cells in isogenic populations.
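The idea that switching between closed and open chromatin produces bursts can be illustrated with a two-state ("telegraph") model simulated by a Gillespie-style algorithm. This is a minimal sketch, not a model taken from the text: the function name, the rate parameters and their values are all assumptions chosen for illustration.

```python
import random


def simulate_telegraph(k_open, k_close, k_tx, k_deg, t_end, seed=0):
    """Simulate chromatin opening/closing with transcription only in the open state.

    Returns a list of (time, is_open, mrna_count) tuples, one per event.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    t, is_open, mrna = 0.0, False, 0
    trajectory = [(t, is_open, mrna)]
    while True:
        rates = [
            k_close if is_open else k_open,  # chromatin state switch
            k_tx if is_open else 0.0,        # transcription needs open chromatin
            k_deg * mrna,                    # first-order mRNA degradation
        ]
        total = sum(rates)
        if total == 0.0:
            break                            # nothing can happen any more
        t += rng.expovariate(total)          # waiting time to the next event
        if t > t_end:
            break
        pick = rng.random() * total          # choose which event fires
        if pick < rates[0]:
            is_open = not is_open
        elif pick < rates[0] + rates[1]:
            mrna += 1
        elif mrna > 0:
            mrna -= 1
        trajectory.append((t, is_open, mrna))
    return trajectory


# Illustrative rates: rare opening, fast closing, strong transcription while open.
trajectory = simulate_telegraph(k_open=0.1, k_close=0.5, k_tx=20.0, k_deg=1.0, t_end=50.0, seed=1)
```

With slow switching and fast transcription, mRNA appears in clusters separated by silent periods, and independent runs with different seeds end at quite different counts. That is the sense in which such fluctuations could contribute to cell-to-cell variability in isogenic populations.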

    Alternative chromatin organizations

    During metazoan spermiogenesis, the spermatid's chromatin is remodeled into a more spaced-packaged, widened, almost crystal-like structure. This process is associated with the cessation of transcription and involves nuclear protein exchange. The histones are mostly displaced and replaced by protamines (small, arginine-rich proteins). It is proposed that in yeast, regions devoid of histones become very fragile after transcription; HMO1, an HMG-box protein, helps in stabilizing nucleosome-free chromatin.

    Chromatin and DNA repair

    The packaging of eukaryotic DNA into chromatin presents a barrier to all DNA-based processes that require recruitment of enzymes to their sites of action. To allow the critical cellular process of DNA repair, the chromatin must be remodeled. In eukaryotes, ATP-dependent chromatin remodeling complexes and histone-modifying enzymes are two predominant factors employed to accomplish this remodeling process.

    Chromatin relaxation occurs rapidly at a site of DNA damage. This process is initiated by the PARP1 protein, which starts to appear at the damage site in less than a second, with half-maximum accumulation within 1.6 seconds after the damage occurs. Next, the chromatin remodeler Alc1 quickly attaches to the product of PARP1 and completes its arrival at the DNA damage within 10 seconds of the damage. About half of the maximum chromatin relaxation, presumably due to the action of Alc1, occurs by 10 seconds. This then allows recruitment of the DNA repair enzyme MRE11, to initiate DNA repair, within 13 seconds.

    γH2AX, the phosphorylated form of H2AX, is also involved in the early steps leading to chromatin decondensation after DNA damage occurs. The histone variant H2AX constitutes about 10% of the H2A histones in human chromatin. γH2AX (H2AX phosphorylated on serine 139) can be detected as soon as 20 seconds after irradiation of cells (with DNA double-strand break formation), and half-maximum accumulation of γH2AX occurs in one minute. The extent of chromatin containing γH2AX is about two million base pairs at the site of a DNA double-strand break. γH2AX does not, itself, cause chromatin decondensation, but within 30 seconds of irradiation, RNF8 protein can be detected in association with γH2AX. RNF8 mediates extensive chromatin decondensation through its subsequent interaction with CHD4, a component of the nucleosome remodeling and deacetylase complex NuRD.

    After undergoing relaxation subsequent to DNA damage, followed by DNA repair, chromatin recovers to a compaction state close to its pre-damage level after about 20 min.

    Methods to investigate chromatin

    1. ChIP-seq (Chromatin immunoprecipitation sequencing), aimed against different histone modifications, can be used to identify chromatin states throughout the genome. Different modifications have been linked to various states of chromatin.
    2. DNase-seq (DNase I hypersensitive sites Sequencing) uses the sensitivity of accessible regions in the genome to the DNase I enzyme to map open or accessible regions in the genome.
    3. FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements sequencing) uses the chemical properties of protein-bound DNA in a two-phase separation method to extract nucleosome depleted regions from the genome.
    4. ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) uses the Tn5 transposase to integrate (synthetic) transposons into accessible regions of the genome, consequently highlighting the localisation of nucleosomes and transcription factors across the genome.
    5. DNA footprinting is a method aimed at identifying protein-bound DNA. It uses labeling and fragmentation coupled to gel electrophoresis to identify areas of the genome that have been bound by proteins.
    6. MNase-seq (Micrococcal Nuclease sequencing) uses the micrococcal nuclease enzyme to identify nucleosome positioning throughout the genome.
    7. Chromosome conformation capture determines the spatial organization of chromatin in the nucleus, by inferring genomic locations that physically interact.
    8. MACC profiling (Micrococcal nuclease ACCessibility profiling) uses a titration series of chromatin digests with micrococcal nuclease to identify chromatin accessibility as well as to map nucleosomes and non-histone DNA-binding proteins in both open and closed regions of the genome.
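To give a flavour of how reads from one of these assays become a positioning map, the sketch below turns MNase-seq style fragments into counts of inferred nucleosome centres. It is an illustration under stated assumptions, not a published pipeline: fragments are given as half-open (start, end) coordinates on a single chromosome, only fragments of roughly mononucleosome size are kept, and the size window of 137 to 157 bp, the function name and the example coordinates are all hypothetical.

```python
from collections import Counter


def dyad_counts(fragments, min_len=137, max_len=157):
    """Count inferred nucleosome centres (fragment midpoints) per genomic position.

    `fragments` is an iterable of (start, end) pairs with `end` exclusive.
    Fragments outside the assumed mononucleosome size window are ignored.
    """
    counts = Counter()
    for start, end in fragments:
        length = end - start
        if min_len <= length <= max_len:
            counts[start + length // 2] += 1  # midpoint taken as the nucleosome centre
    return counts


# Hypothetical fragments: three near position 100, one short fragment, one far away.
example_fragments = [(100, 247), (100, 247), (102, 249), (500, 560), (1000, 1147)]
centres = dyad_counts(example_fragments)
```

Real analyses typically smooth these counts and correct for the sequence bias of the nuclease before calling nucleosome positions; both steps are omitted here.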

    Chromatin and knots

    It has been a puzzle how decondensed interphase chromosomes remain essentially unknotted. The natural expectation is that in the presence of type II DNA topoisomerases that permit passages of double-stranded DNA regions through each other, all chromosomes should reach the state of topological equilibrium. The topological equilibrium in highly crowded interphase chromosomes forming chromosome territories would result in formation of highly knotted chromatin fibres. However, Chromosome Conformation Capture (3C) methods revealed that the decay of contacts with the genomic distance in interphase chromosomes is practically the same as in the crumpled globule state that is formed when long polymers condense without formation of any knots. To remove knots from highly crowded chromatin, one would need an active process that should not only provide the energy to move the system from the state of topological equilibrium but also guide topoisomerase-mediated passages in such a way that knots would be efficiently unknotted instead of making the knots even more complex. It has been shown that the process of chromatin-loop extrusion is ideally suited to actively unknot chromatin fibres in interphase chromosomes.
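The "decay of contacts with the genomic distance" is usually summarised by fitting a power law to contact frequency as a function of separation. The sketch below fits the exponent by least squares on log-transformed values. The function name and the synthetic data are assumptions for illustration. As background from the polymer-physics literature rather than from the text above, a crumpled globule is commonly associated with an exponent near -1 and an equilibrium globule with an exponent near -1.5.

```python
import math


def fit_power_law_exponent(distances, contact_freqs):
    """Least-squares slope of log(contact frequency) against log(genomic distance)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(c) for c in contact_freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data (an assumption): contact frequency falling off exactly as 1/distance.
distances = [10_000 * 2 ** i for i in range(8)]
contacts = [1.0 / d for d in distances]
exponent = fit_power_law_exponent(distances, contacts)
```

On real contact maps the exponent depends on the range of distances chosen for the fit, so the value should be reported together with that range.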

    Chromatin: alternative definitions

    The term, introduced by Walther Flemming, has multiple meanings:
    1. Simple and concise definition: Chromatin is a macromolecular complex of a DNA macromolecule and protein macromolecules (and RNA). The proteins package and arrange the DNA and control its functions within the cell nucleus.
    2. A biochemist's operational definition: Chromatin is the DNA/protein/RNA complex extracted from eukaryotic lysed interphase nuclei. Just which of the multitudinous substances present in a nucleus will constitute a part of the extracted material partly depends on the technique each researcher uses. Furthermore, the composition and properties of chromatin vary from one cell type to another, during development of a specific cell type, and at different stages in the cell cycle.
    3. The DNA + histone = chromatin definition: The DNA double helix in the cell nucleus is packaged by special proteins termed histones. The formed protein/DNA complex is called chromatin. The basic structural unit of chromatin is the nucleosome.

    Nobel Prizes

    The following scientists were recognized for their contributions to chromatin research with Nobel Prizes:

    Year | Who | Award
    1910 | Albrecht Kossel (University of Heidelberg) | Nobel Prize in Physiology or Medicine for his discovery of the five nuclear bases: adenine, cytosine, guanine, thymine, and uracil.
    1933 | Thomas Hunt Morgan (California Institute of Technology) | Nobel Prize in Physiology or Medicine for his discoveries of the role played by the gene and chromosome in heredity, based on his studies of the white-eyed mutation in the fruit fly Drosophila.
    1962 | Francis Crick, James Watson and Maurice Wilkins (MRC Laboratory of Molecular Biology, Harvard University and London University respectively) | Nobel Prize in Physiology or Medicine for their discoveries of the double helix structure of DNA and its significance for information transfer in living material.
    1982 | Aaron Klug (MRC Laboratory of Molecular Biology) | Nobel Prize in Chemistry "for his development of crystallographic electron microscopy and his structural elucidation of biologically important nucleic acid-protein complexes".
    1993 | Richard J. Roberts and Phillip A. Sharp | Nobel Prize in Physiology or Medicine "for their independent discoveries of split genes," in which DNA sections called exons express proteins, and are interrupted by DNA sections called introns, which do not express proteins.
    2006 | Roger Kornberg (Stanford University) | Nobel Prize in Chemistry for his discovery of the mechanism by which DNA is transcribed into messenger RNA.

    Inequality (mathematics)

    From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Inequality...