From Wikipedia, the free encyclopedia
Representation of a
DNA molecule that is methylated. The two white spheres represent
methyl groups. They are bound to two
cytosine nucleotide molecules that make up the DNA sequence.
DNA methylation is a process by which
methyl groups
are added to the DNA molecule. Methylation can change the activity of a
DNA segment without changing the sequence. When located in a gene
promoter, DNA methylation typically acts to repress gene
transcription. DNA methylation is essential for normal development and is associated with a number of key processes including
genomic imprinting,
X-chromosome inactivation, repression of
transposable elements,
aging and
carcinogenesis.
Two of DNA's four bases,
cytosine and
adenine,
can be methylated. Cytosine methylation is widespread in both
eukaryotes and prokaryotes, even though the rate of cytosine DNA
methylation can differ greatly between species: 14% of cytosines are
methylated in
Arabidopsis thaliana, 8% in
Physarum,
[1] 4% in
Mus musculus, 2.3% in
Escherichia coli, 0.03% in
Drosophila, 0.006% in
Dictyostelium[2] and virtually none (< 0.0002%) in
Caenorhabditis[3] or
yeast species such as S. cerevisiae and S. pombe (but not N. crassa).
[4][5] Adenine methylation has been observed in bacterial, plant and recently in mammalian DNA,
[6][7] but has received considerably less attention.
Methylation of cytosine to form
5-methylcytosine occurs at the same 5 position on the
pyrimidine ring where the DNA base
thymine's methyl group is located; the same position distinguishes thymine from the analogous RNA base
uracil, which has no methyl group. Spontaneous
deamination of
5-methylcytosine
converts it to thymine. This results in a T:G mismatch. Repair
mechanisms then correct it back to the original C:G pair; alternatively,
they may replace the G with an A, turning the original C:G pair into a T:A
pair, effectively changing a base and introducing a mutation. This
misincorporated base will not be corrected during DNA replication as
thymine is a DNA base. If the mismatch is not repaired and the cell
enters the cell cycle the strand carrying the T will be complemented by
an A in one of the daughter cells, such that the mutation becomes
permanent. The near-universal
replacement of uracil by thymine
in DNA, but not RNA, may have evolved as an error-control mechanism, to
facilitate the removal of uracils generated by the spontaneous
deamination of cytosine.
[8]
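The two fates of the T:G mismatch described above can be traced with a minimal Python sketch. The function names `repair_outcomes` and `replicate` are hypothetical and introduced only for illustration; the model tracks a single base pair and ignores strand orientation and repair-enzyme specificity.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def repair_outcomes(mismatch):
    """Return both possible repair results for a mismatched pair (top, bottom)."""
    top, bottom = mismatch
    return {
        # Replace the top base so it pairs with the bottom base (restores C:G)
        "restore": (COMPLEMENT[bottom], bottom),
        # Replace the bottom base so it pairs with the top base (fixes T:A)
        "mutate": (top, COMPLEMENT[top]),
    }

def replicate(pair):
    """Replicate an unrepaired pair: each parental base templates its complement."""
    top, bottom = pair
    return [(top, COMPLEMENT[top]), (COMPLEMENT[bottom], bottom)]

# Deamination of 5-methylcytosine in a C:G pair leaves a T:G mismatch
mismatch = ("T", "G")
print(repair_outcomes(mismatch))
print(replicate(mismatch))
```

In this toy model, replication without repair gives one daughter with a T:A pair and one with the original C:G pair, matching the description in the text.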
DNA methylation, as well as many of its contemporary DNA
methyltransferases, is thought to have evolved from primitive early-world
RNA methylation activity, a view supported by several lines of evidence.
[9]
In plants and other organisms, DNA methylation is found in three different sequence contexts: CG (or
CpG),
CHG or CHH (where H corresponds to A, T or C). In mammals, however, DNA
methylation is almost exclusively found in CpG dinucleotides, with the
cytosines on both strands being usually methylated. Non-CpG methylation
can however be observed in embryonic
stem cells,
[10][11][12] and has also been indicated in
neural development.
[13] Furthermore, non-CpG methylation has also been observed in
hematopoietic progenitor cells, and it occurred mainly in a CpApC sequence context.
[14]
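As an illustration of the three sequence contexts, the following Python sketch labels every cytosine on one strand as CG, CHG or CHH. The function name `cytosine_contexts` is hypothetical, the input is assumed to contain only A, C, G and T, and cytosines too close to the 3' end to be classified are reported as "unknown".

```python
def cytosine_contexts(seq):
    """Return (position, context) for every cytosine on the given strand.

    Context is CG, CHG or CHH, where H is A, C or T.
    """
    seq = seq.upper()
    calls = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1:i + 3]  # up to two bases 3' of the cytosine
        if len(downstream) >= 1 and downstream[0] == "G":
            context = "CG"
        elif len(downstream) == 2 and downstream[1] == "G":
            context = "CHG"
        elif len(downstream) == 2:
            context = "CHH"
        else:
            context = "unknown"  # not enough downstream sequence
        calls.append((i, context))
    return calls

example = cytosine_contexts("ACGTCAGCTTC")
print(example)
```

Only the given strand is scanned; cytosines on the opposite strand would be found by running the same function on the reverse complement.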
Conserved function of DNA methylation
Typical DNA methylation landscape in mammals
The DNA methylation landscape of vertebrates is very particular
compared to other organisms. In vertebrates, around 60–80% of CpG are
methylated in somatic cells
[15] and DNA methylation appears as a default state that has to be specifically excluded from defined locations.
[16][17]
By contrast, the genomes of most plants, invertebrates, fungi or
protists show “mosaic” methylation patterns, where only specific genomic
elements are targeted, and they are characterized by the alternation of
methylated and unmethylated domains.
[4][18]
High CpG methylation in mammalian genomes has an evolutionary cost
because it increases the frequency of spontaneous mutations. Loss of
amino-groups occurs with a high frequency for cytosines, with different
consequences depending on their methylation. Methylated C residues
spontaneously deaminate to form T residues over time; hence CpG
dinucleotides steadily deaminate to TpG dinucleotides, which is
evidenced by the under-representation of CpG dinucleotides in the human
genome (they occur at only 21% of the expected frequency).
[19]
(On the other hand, spontaneous deamination of unmethylated C residues
gives rise to U residues, a change that is quickly recognized and
repaired by the cell.)
CpG islands
In
mammals, the only exception to this global CpG depletion resides in a
specific category of GC- and CpG-rich sequences termed CpG islands that
are generally unmethylated and therefore retain the expected CpG
content.
[20]
CpG islands are usually defined as regions with 1) a length greater
than 200bp, 2) a G+C content greater than 50%, 3) a ratio of observed to
expected CpG greater than 0.6, although other definitions are sometimes
used.
[21] Excluding repeated sequences, there are around 25,000 CpG islands in the human genome, 75% of which are less than 850 bp long.
[22] They are major regulatory units and around 50% of CpG islands are
located in gene promoter regions, while another 25% lie in gene bodies,
often serving as alternative promoters. Reciprocally, around 60-70% of
human genes have a CpG island in their promoter region.
[23][24] The majority of CpG islands are constitutively unmethylated and enriched for permissive
chromatin modification
such as H3K4 methylation. In somatic tissues, only 10% of CpG islands
are methylated, the majority of them being located in intergenic and
intragenic regions.
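The three-part definition of a CpG island given above can be sketched in Python. The text does not say how the expected CpG count is calculated; this sketch assumes one common formulation, (number of C × number of G) / sequence length. The function names `cpg_stats` and `looks_like_cpg_island` are hypothetical, and the whole input is treated as a single window with no sliding-window scan.

```python
def cpg_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed formulation: expected CpG count if C and G occurred independently
    expected = (c * g) / n if n else 0.0
    ratio = observed / expected if expected else 0.0
    return n, gc_fraction, ratio

def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three thresholds quoted in the text to one sequence."""
    n, gc_fraction, ratio = cpg_stats(seq)
    return n > min_len and gc_fraction > min_gc and ratio > min_ratio

print(cpg_stats("CG" * 150))
print(looks_like_cpg_island("CG" * 150), looks_like_cpg_island("AT" * 150))
```

The thresholds are exposed as parameters because, as noted above, other definitions are sometimes used.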
Repression of CpG-dense promoters
DNA
methylation was probably present to some extent in very early eukaryote
ancestors. In virtually every organism analyzed, methylation in
promoter regions correlates negatively with gene expression.
[4][25]
CpG-dense promoters of actively transcribed genes are never methylated,
but reciprocally transcriptionally silent genes do not necessarily
carry a methylated promoter. In mouse and human, around 60–70% of genes
have a CpG island in their promoter region and most of these CpG islands
remain unmethylated independently of the transcriptional activity of
the gene, in both differentiated and undifferentiated cell types.
[26][27]
Of note, whereas DNA methylation of CpG islands is unambiguously linked
with transcriptional repression, the function of DNA methylation in
CG-poor promoters remains unclear, although there is little evidence that
it could be functionally relevant.
[28]
DNA methylation may affect the transcription of genes in two ways.
First, the methylation of DNA itself may physically impede the binding
of
transcriptional proteins to the gene,
[29] and second, and likely more important, methylated DNA may be bound by proteins known as
methyl-CpG-binding domain proteins (MBDs).
MBD proteins then recruit additional proteins to the locus, such as
histone deacetylases and other
chromatin remodeling proteins that can modify
histones, thereby forming compact, inactive chromatin, termed
heterochromatin. This link between DNA methylation and chromatin structure is very important. In particular, loss of
methyl-CpG-binding protein 2 (MeCP2) has been implicated in
Rett syndrome; and
methyl-CpG-binding domain protein 2 (MBD2) mediates the transcriptional silencing of hypermethylated genes in cancer.
Repression of transposable elements
DNA
methylation is a powerful transcriptional repressor, at least in
CpG-dense contexts. Transcriptional repression of protein-coding genes
appears essentially limited to very specific classes of genes that need
to be silent permanently and in almost all tissues. While DNA
methylation does not have the flexibility required for the fine-tuning
of gene regulation, its stability is perfect to ensure the permanent
silencing of
transposable elements.
Transposon control is one of the most ancient functions of DNA methylation
that is shared by animals, plants and multiple protists.
[30] It is even suggested that DNA methylation evolved precisely for this purpose.
[31]
Methylation of the gene body of highly transcribed genes
A
function that appears even more conserved than transposon silencing is
positively correlated with gene expression. In almost all species where
DNA methylation is present, DNA methylation is especially enriched in
the body of highly transcribed genes.
[4][25] The function of gene body methylation is not well understood. A body of evidence suggests that it could regulate splicing
[32] and suppress the activity of intragenic transcriptional units (cryptic promoters or transposable elements).
[33]
Gene-body methylation appears closely tied to H3K36 methylation. In
yeast and mammals, H3K36 methylation is highly enriched in the body of
highly transcribed genes. In yeast at least, H3K36me3 recruits enzymes
such as histone deacetylases to condense chromatin and prevent the
activation of cryptic start sites.
[34]
In mammals, the PWWP domains of DNMT3a and DNMT3b bind to H3K36me3 and the two
enzymes are recruited to the body of actively transcribed genes.
In mammals
Dynamics of DNA methylation during mouse embryonic development. E3.5-E6,
etc., refer to days after fertilization. PGC: primordial germ cells
During embryonic development
DNA methylation patterns are largely erased and then re-established
between generations in mammals. Almost all of the methylations from the
parents are erased, first during
gametogenesis, and again in early
embryogenesis,
with demethylation and remethylation occurring each time. Demethylation
in early embryogenesis occurs in the preimplantation period in two
stages – initially in the
zygote, then during the first few embryonic replication cycles of
morula and
blastula.
A wave of methylation then takes place during the implantation stage of
the embryo, with CpG islands protected from methylation. This results
in global repression and allows housekeeping genes to be expressed in
all cells. In the post-implantation stage, methylation patterns are
stage- and tissue-specific, with changes that would define each
individual cell type lasting stably over a long period.
[35]
Whereas DNA methylation is not necessary
per se for
transcriptional silencing, it is thought nonetheless to represent a
“locked” state that definitively inactivates transcription. In particular,
DNA methylation appears critical for the maintenance of mono-allelic
silencing in the context of
genomic imprinting and
X chromosome inactivation.
[36][37]
In these cases, expressed and silent alleles differ by their
methylation status, and loss of DNA methylation results in loss of
imprinting and re-expression of Xist in somatic cells. During embryonic
development, few genes change their methylation status, with the important
exception of many genes specifically expressed in the germline.
[38] DNA methylation appears absolutely required in
differentiated cells,
as knockout of any of the three competent DNA methyltransferases results
in embryonic or post-partum lethality. By contrast, DNA methylation is
dispensable in undifferentiated cell types, such as the inner cell mass
of the blastocyst, primordial germ cells or embryonic stem cells. Since
DNA methylation appears to directly regulate only a limited number of
genes, precisely how the absence of DNA methylation causes the death of
differentiated cells remains an open question.
Due to the phenomenon of
genomic imprinting, maternal and paternal genomes are differentially marked and must be properly
reprogrammed every time they pass through the germline. Therefore, during
gametogenesis,
primordial germ cells must have their original biparental DNA
methylation patterns erased and re-established based on the sex of the
transmitting parent. After fertilization the paternal and maternal
genomes are once again demethylated and remethylated (except for
differentially methylated regions associated with imprinted genes). This
reprogramming is likely required for totipotency of the newly formed
embryo and erasure of acquired epigenetic changes.
[39]
In cancer
In many disease processes, such as
cancer, gene promoter
CpG islands acquire abnormal hypermethylation, which results in
transcriptional silencing
that can be inherited by daughter cells following cell division.
Alterations of DNA methylation have been recognized as an important
component of cancer development. Hypomethylation, in general, arises
earlier and is linked to chromosomal instability and loss of imprinting,
whereas hypermethylation is associated with promoters and can arise
secondary to gene (oncogene suppressor) silencing, but might be a target
for
epigenetic therapy.
[40]
Global hypomethylation has also been implicated in the development and progression of cancer through different mechanisms.
[41] Typically, there is hypermethylation of
tumor suppressor genes and hypomethylation of
oncogenes.
[42]
Generally, in progression to cancer, hundreds of genes are
silenced or activated.
Although silencing of some genes in cancers occurs by mutation, a large
proportion of carcinogenic gene silencing is a result of altered DNA
methylation. DNA methylation causing silencing in cancer typically occurs at multiple
CpG sites in the
CpG islands that are present in the
promoters of protein coding genes.
Altered expressions of
microRNAs also silence or activate many genes in progression to cancer (see
microRNAs in cancer). Altered microRNA expression occurs through
hyper/hypo-methylation of
CpG sites in
CpG islands in promoters controlling transcription of the
microRNAs.
Silencing of DNA repair genes through methylation of CpG islands in
their promoters appears to be especially important in progression to
cancer.
In atherosclerosis
Epigenetic modifications such as DNA methylation have been implicated in cardiovascular disease, including
atherosclerosis.
In animal models of atherosclerosis, vascular tissue as well as blood
cells such as mononuclear blood cells exhibit global hypomethylation
with gene-specific areas of hypermethylation. DNA methylation
polymorphisms may be used as an early biomarker of atherosclerosis since
they are present before lesions are observed, which may provide an
early tool for detection and risk prevention.
[43]
Two of the cell types targeted for DNA methylation polymorphisms are
monocytes and lymphocytes, which experience an overall hypomethylation.
One proposed mechanism behind this global hypomethylation is elevated
homocysteine levels causing
hyperhomocysteinemia,
a known risk factor for cardiovascular disease. High plasma levels of
homocysteine inhibit DNA methyltransferases, which causes
hypomethylation. Hypomethylation of DNA affects genes that alter smooth
muscle cell proliferation, cause endothelial cell dysfunction, and
increase inflammatory mediators, all of which are critical in forming
atherosclerotic lesions.
[44] High levels of homocysteine also result in hypermethylation of CpG islands in the promoter region of the
estrogen receptor alpha (ERα) gene, causing its downregulation.
[45]
ERα protects against atherosclerosis due to its action as a growth
suppressor, causing the smooth muscle cells to remain in a quiescent
state.
[46]
Hypermethylation of the ERα promoter thus allows intimal smooth muscle
cells to proliferate excessively and contribute to the development of
the atherosclerotic lesion.
[47]
Another gene that experiences a change in methylation status in atherosclerosis is the
monocarboxylate transporter
(MCT3), which produces a protein responsible for the transport of
lactate and ketone bodies out of many cell types, including
vascular smooth muscle cells. In atherosclerosis patients, there is an
increase in methylation of the CpG islands in exon 2, which decreases
MCT3 protein expression. The downregulation of MCT3 impairs lactate
transport, and significantly increases smooth muscle cell proliferation,
which further contributes to the atherosclerotic lesion. An ex vivo
experiment using the demethylating agent
Decitabine
(5-aza-2'-deoxycytidine) was shown to induce MCT3 expression in a
dose-dependent manner, as all hypermethylated sites in the exon 2 CpG island
became demethylated after treatment. Decitabine may thus serve as a novel
therapeutic agent to treat atherosclerosis, although no human studies
have been conducted thus far.
[48]
In aging
In
humans and other mammals, DNA methylation levels can be used to
accurately estimate the age of tissues and cell types, forming an
epigenetic clock.
[49]
A
longitudinal study of
twin
children showed that, between the ages of 5 and 10, there was
divergence of methylation patterns due to environmental rather than
genetic influences.
[50] There is a global loss of DNA methylation during aging.
[42]
In a study that analyzed the complete DNA methylomes of CD4+ T cells
in a newborn, a 26-year-old individual and a 103-year-old individual,
it was observed that the loss of methylation is proportional to age.
Hypomethylated CpGs observed in the centenarian DNAs compared with the
neonates covered all genomic compartments (promoters, intergenic,
intronic and exonic regions).
[51] However, some genes become hypermethylated with age, including genes for the
estrogen receptor,
p16, and
insulin-like growth factor 2.
[42]
In exercise
High intensity exercise has been shown to result in reduced DNA methylation in skeletal muscle.
[52] Promoter methylation of
PGC-1α and
PDK4 was reduced immediately after high intensity exercise, whereas
PPAR-γ methylation was not reduced until three hours after exercise.
[52] By contrast, six months of exercise in previously sedentary middle-aged men resulted in increased methylation in
adipose tissue.
[53] One study showed a possible increase in global genomic DNA methylation of
white blood cells with more physical activity in non-Hispanics.
[54]
In B-cell differentiation
A study that investigated the methylome of
B cells along their differentiation cycle, using whole-genome
bisulfite sequencing
(WGBS), showed that there is a hypomethylation from the earliest stages
to the most differentiated stages. The largest methylation difference
is between the stages of germinal center B cells and memory B cells.
Furthermore, this study showed that there is a similarity between B cell
tumors and long-lived B cells in their DNA methylation signatures.
[14]
In the brain
Research has suggested that long-term memory storage in humans may be regulated by DNA methylation.
[55][56]
DNA methyltransferases (in mammals)
In mammalian cells, DNA methylation occurs mainly at the C5 position
of CpG dinucleotides and is carried out by two general classes of
enzymatic activities – maintenance methylation and
de novo methylation.
[57]
Maintenance methylation activity is necessary to preserve DNA
methylation after every cellular DNA replication cycle. Without the
DNA methyltransferase
(DNMT), the replication machinery itself would produce daughter strands
that are unmethylated and, over time, would lead to passive
demethylation. DNMT1 is the proposed maintenance methyltransferase that
is responsible for copying DNA methylation patterns to the daughter
strands during DNA replication. Mouse models with both copies of DNMT1
deleted are embryonic lethal at approximately day 9, due to the
requirement of DNMT1 activity for development in mammalian cells.
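The passive demethylation described above can be illustrated with a deliberately simple Python model of a single CpG site. The function name `methylated_strand_fraction` and the parameter `maintenance_efficiency` are hypothetical; the model assumes that every replication pairs each old strand with a new unmethylated strand, and that a maintenance enzyme then methylates the new strand opposite a methylated template with the given probability. De novo methylation and active demethylation are ignored.

```python
def methylated_strand_fraction(cycles, maintenance_efficiency=0.0):
    """Expected fraction of strands methylated at one site after replication cycles.

    maintenance_efficiency is the assumed probability that the new strand is
    methylated when its template strand carries the mark (0.0 = no maintenance).
    """
    fraction = 1.0  # start from a fully methylated site (both strands marked)
    for _ in range(cycles):
        # New strands are marked only opposite a marked template, and only
        # when maintenance succeeds
        new_strands = fraction * maintenance_efficiency
        # After replication, half of all strands are old and half are new
        fraction = (fraction + new_strands) / 2
    return fraction

for n in range(4):
    print(n, methylated_strand_fraction(n), methylated_strand_fraction(n, 1.0))
```

With no maintenance activity the mark is diluted by half at every cycle, whereas perfect maintenance preserves it indefinitely in this model.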
It is thought that DNMT3a and DNMT3b are the
de novo
methyltransferases that set up DNA methylation patterns early in
development. DNMT3L is a protein that is homologous to the other DNMT3s
but has no catalytic activity. Instead, DNMT3L assists the
de novo methyltransferases by increasing their ability to bind to DNA and stimulating their activity. Finally,
DNMT2 (TRDMT1)
has been identified as a DNA methyltransferase homolog, containing all
10 sequence motifs common to all DNA methyltransferases; however, DNMT2
(TRDMT1) does not methylate DNA but instead methylates cytosine-38 in
the anticodon loop of aspartic acid transfer RNA.
[58]
Since many tumor suppressor genes are silenced by DNA methylation during
carcinogenesis, there have been attempts to re-express these genes by inhibiting the DNMTs. 5-Aza-2'-deoxycytidine (
decitabine) is a
nucleoside analog
that inhibits DNMTs by trapping them in a covalent complex on DNA by
preventing the β-elimination step of catalysis, thus resulting in the
enzymes' degradation. However, for decitabine to be active, it must be
incorporated into the
genome
of the cell, which can cause mutations in the daughter cells if the
cell does not die. In addition, decitabine is toxic to the bone marrow,
which limits the size of its therapeutic window. These pitfalls have led
to the development of antisense RNA therapies that target the DNMTs by
degrading their
mRNAs and preventing their
translation.
However, it is currently unclear whether targeting DNMT1 alone is
sufficient to reactivate tumor suppressor genes silenced by DNA
methylation.
In plants
Significant progress has been made in understanding DNA methylation in the model plant
Arabidopsis thaliana.
DNA methylation in plants differs from that of mammals: while DNA
methylation in mammals mainly occurs on the cytosine nucleotide in a
CpG site,
in plants the cytosine can be methylated at CpG, CpHpG, and CpHpH
sites, where H represents any nucleotide except guanine. Overall,
Arabidopsis DNA is highly methylated;
mass spectrometry analysis estimated 14% of cytosines to be modified.
[5]
The principal
Arabidopsis DNA methyltransferase enzymes, which
transfer and covalently attach methyl groups onto DNA, are DRM2, MET1,
and CMT3. Both the DRM2 and MET1 proteins share significant homology to
the mammalian methyltransferases DNMT3 and DNMT1, respectively, whereas
the CMT3 protein is unique to the plant kingdom. There are currently two
classes of DNA methyltransferases: 1) the
de novo class, or
enzymes that create new methylation marks on the DNA; and 2) a
maintenance class that recognizes the methylation marks on the parental
strand of DNA and transfers new methylation to the daughter strands
after DNA replication. DRM2 is the only enzyme that has been implicated
as a
de novo DNA methyltransferase. DRM2 has also been shown,
along with MET1 and CMT3, to be involved in maintaining methylation marks
through DNA replication.
[59] Other DNA methyltransferases are expressed in plants but have no known function.
It is not clear how the cell determines the locations of
de novo DNA methylation, but evidence suggests that, for many (though not all) locations,
RNA-directed DNA methylation
(RdDM) is involved. In RdDM, specific RNA transcripts are produced from
a genomic DNA template, and this RNA forms secondary structures called
double-stranded RNA molecules.
[60] The double-stranded RNAs, through either the small interfering RNA (
siRNA) or microRNA (
miRNA) pathways, direct de novo DNA methylation of the original genomic location that produced the RNA.
[60] This sort of mechanism is thought to be important in cellular defense against
RNA viruses and/or
transposons,
both of which often form a double-stranded RNA that can be mutagenic to
the host genome. By methylating their genomic locations, through an as
yet poorly understood mechanism, they are shut off and are no longer
active in the cell, protecting the genome from their mutagenic effect.
DNA methylation has been described as the main determinant of
embryogenic culture formation from explants in woody plants, and it is
regarded as the main mechanism explaining the poor response of mature
explants to somatic embryogenesis in these plants (Isah 2016).
In insects
Functional DNA methylation has been discovered in honey bees.
[61][62]
DNA methylation marks are mainly on the gene body, and the current view
is that DNA methylation regulates genes via alternative
splicing.
[63]
DNA methylation levels in Drosophila melanogaster are nearly undetectable.
[65] Sensitive methods applied to Drosophila DNA suggest levels in the range of 0.1–0.3% of total cytosine.
[65] This low level of methylation
[66]
appears to reside in genomic sequence patterns that are very different
from patterns seen in humans, or in other animal or plant species to
date. Genomic methylation in D. melanogaster was found at specific short
motifs (concentrated in specific 5-base sequence motifs that are CA-
and CT-rich but depleted of guanine) and is independent of DNMT2
activity. Further, highly sensitive mass spectrometry approaches,
[67]
have now demonstrated the presence of low (0.07%) but significant
levels of adenine methylation during the earliest stages of Drosophila
embryogenesis.
In fungi
Many
fungi have low levels (0.1 to 0.5%) of cytosine methylation, whereas other fungi have as much as 5% of the genome methylated.
[68] This value seems to vary both among species and among isolates of the same species.
[69] There is also evidence that DNA methylation may be involved in state-specific control of
gene expression in fungi.
[citation needed] However, ultra-high-sensitivity
mass spectrometry with a detection limit of 250 attomoles did not confirm DNA methylation in single-celled yeast species such as
Saccharomyces cerevisiae or
Schizosaccharomyces pombe, indicating that yeasts do not possess this DNA modification.
[5]
Although brewers' yeast (
Saccharomyces), fission yeast (
Schizosaccharomyces), and
Aspergillus flavus[70] have no detectable DNA methylation, the model filamentous fungus
Neurospora crassa has a well-characterized methylation system.
[71] Several genes control methylation in
Neurospora and mutation of the DNA methyltransferase,
dim-2, eliminates all DNA methylation but does not affect growth or sexual reproduction. While the
Neurospora genome has very little repeated DNA, half of the methylation occurs in repeated DNA including
transposon
relics and centromeric DNA. The ability to evaluate other important
phenomena in a DNA methylase-deficient genetic background makes
Neurospora an important system in which to study DNA methylation.
In lower eukaryotes
DNA methylation is largely absent from Dictyostelium discoideum
[72] where it appears to occur at about 0.006% of cytosines.
[2] In contrast, DNA methylation is widely distributed in Physarum polycephalum
[73] where 5-methylcytosine makes up as much as 8% of total cytosine.
[1]
In bacteria
Adenine or
cytosine methylation is part of the
restriction modification system of many
bacteria, in which specific DNA sequences are methylated periodically throughout the genome. A
methylase
is the enzyme that recognizes a specific sequence and methylates one of
the bases in or near that sequence. Foreign DNAs (which are not
methylated in this manner) that are introduced into the cell are
cleaved by sequence-specific
restriction enzymes
and degraded. Bacterial genomic DNA is not recognized by these
restriction enzymes. The methylation of native DNA acts as a sort of
primitive immune system, allowing the bacteria to protect themselves
from infection by
bacteriophage.
E. coli DNA adenine methyltransferase (Dam) is an enzyme of ~32 kDa that does not belong to a restriction/modification system. The target recognition sequence for
E. coli
Dam is GATC, as the methylation occurs at the N6 position of the
adenine in this sequence (GmeATC). The three base pairs flanking each
side of this site also influence DNA–Dam binding. Dam plays several key
roles in bacterial processes, including mismatch repair, the timing of
DNA replication, and gene expression. As a result of DNA replication,
the status of GATC sites in the
E. coli genome changes from fully
methylated to hemimethylated. This is because adenine introduced into
the new DNA strand is unmethylated. Re-methylation occurs within two to
four seconds, during which time replication errors in the new strand are
repaired. Methylation, or its absence, is the marker that allows the
repair apparatus of the cell to differentiate between the template and
nascent strands. It has been shown that altering Dam activity in
bacteria results in increased spontaneous mutation rate. Bacterial
viability is compromised in dam mutants that also lack certain other DNA
repair enzymes, providing further evidence for the role of Dam in DNA
repair.
One region of the DNA that keeps its hemimethylated status for longer is the
origin of replication,
which has an abundance of GATC sites. This is central to the bacterial
mechanism for timing DNA replication. SeqA binds to the origin of
replication, sequestering it and thus preventing methylation. Because
hemimethylated origins of replication are inactive, this mechanism
limits DNA replication to once per cell cycle.
Expression of certain genes, for example those coding for
pilus expression in
E. coli,
is regulated by the methylation of GATC sites in the promoter region of
the gene operon. The cells' environmental conditions just after DNA
replication determine whether Dam is blocked from methylating a region
proximal to or distal from the promoter region. Once the pattern of
methylation has been created, the pilus gene transcription is locked in
the on or off position until the DNA is again replicated. In
E. coli, these pilus
operons have important roles in virulence in urinary tract infections. It has been proposed
[by whom?] that inhibitors of Dam may function as antibiotics.
On the other hand, DNA cytosine methylase (Dcm) targets CCAGG and CCTGG
sites to methylate cytosine at the C5 position (CmeC(A/T)GG). The
other methylase enzyme, EcoKI, causes methylation of adenines in the
sequences AAC(N6)GTGC and GCAC(N6)GTT.
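The recognition sequences named in this section (GATC for Dam, CCAGG and CCTGG for Dcm, and the two EcoKI sequences with six unspecified bases) can be located in a DNA string with regular expressions. The sketch below is illustrative only: the `SITES` table and the `find_sites` function are hypothetical names, only one strand is scanned, and the input is assumed to contain only A, C, G and T.

```python
import re

# Recognition sequences taken from the text, written as regular expressions
SITES = {
    "Dam": "GATC",
    "Dcm": "CC[AT]GG",
    "EcoKI": "AAC[ACGT]{6}GTGC|GCAC[ACGT]{6}GTT",
}

def find_sites(seq):
    """Return 0-based start positions of each methylase target site in seq."""
    seq = seq.upper()
    hits = {}
    for name, pattern in SITES.items():
        # A lookahead reports overlapping occurrences as well
        hits[name] = [m.start() for m in re.finditer(f"(?=(?:{pattern}))", seq)]
    return hits

print(find_sites("TTGATCAACCAGGTTAACGTACGTGTGCAA"))
```

The positions mark where each recognition sequence begins, not the methylated base within it.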
Molecular cloning
Most strains used by molecular biologists are derivatives of
E. coli
K-12, and possess both Dam and Dcm, but there are commercially
available strains that are dam-/dcm- (lacking the activity of both
methylases). In fact, it is possible to unmethylate the DNA extracted
from dam+/dcm+ strains by transforming it into dam-/dcm- strains. This
helps digest sequences that methylation-sensitive restriction enzymes
would otherwise fail to cut.
[74][75]
The
restriction enzyme
DpnI can recognize 5'-GmeATC-3' sites and digest the methylated DNA.
Being such a short motif, it occurs frequently in sequences by chance,
and as such its primary use for researchers is to degrade template DNA
following
PCRs
(PCR products lack methylation, as no methylases are present in the
reaction). Similarly, some commercially available restriction enzymes
are sensitive to methylation at their cognate restriction sites, and
must, as mentioned previously, be used on DNA passed through a dam-/dcm-
strain to allow cutting.
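The use of DpnI to remove template DNA after PCR can be pictured with a toy Python model. The function name `dpni_digest` is hypothetical. The sketch treats the DNA as a linear molecule whose GATC sites are either all methylated or all unmethylated, and it places the cut between the A and the T of the site, which is an assumption of this example and is not stated in the text.

```python
def dpni_digest(seq, dam_methylated):
    """Return the fragments left after DpnI treatment of a linear molecule.

    Methylated DNA is cut at every GATC site; unmethylated DNA is left intact.
    """
    seq = seq.upper()
    if not dam_methylated:
        return [seq]
    fragments = []
    start = 0
    pos = seq.find("GATC")
    while pos != -1:
        cut = pos + 2  # assumed cut position: GA^TC
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find("GATC", pos + 1)
    fragments.append(seq[start:])
    return fragments

template = "AAGATCTTGATCCC"
print(dpni_digest(template, dam_methylated=True))   # plasmid from a dam+ strain
print(dpni_digest(template, dam_methylated=False))  # PCR product
```

In this model the methylated template is fragmented while the unmethylated PCR product of identical sequence survives, which is the selectivity that makes the enzyme useful after PCR.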
Detection
DNA methylation can be detected by the following assays currently used in scientific research:
[76]
- Mass spectrometry
is a very sensitive and reliable analytical method to detect DNA
methylation. MS in general is however not informative about the sequence
context of the methylation, thus limited in studying the function of
this DNA modification.
- Methylation-Specific PCR (MSP),
which is based on a chemical reaction of sodium bisulfite with DNA that
converts unmethylated cytosines of CpG dinucleotides to uracil or UpG,
followed by traditional PCR.[77]
However, methylated cytosines will not be converted in this process,
and primers are designed to overlap the CpG site of interest, which
allows one to determine methylation status as methylated or
unmethylated.
- Whole genome bisulfite sequencing,
also known as BS-Seq, which is a high-throughput genome-wide analysis
of DNA methylation. It is based on the aforementioned sodium bisulfite
conversion of genomic DNA, which is then sequenced on a next-generation sequencing platform.
The sequences obtained are then re-aligned to the reference genome to
determine methylation states of CpG dinucleotides based on mismatches
resulting from the conversion of unmethylated cytosines into uracil.
- Reduced representation bisulfite sequencing,
also known as RRBS, for which several working protocols exist. The first
protocol, simply called RRBS, targets around 10% of the methylome and
requires a reference genome. Later protocols sequence a smaller portion of
the genome and allow higher sample multiplexing. EpiGBS was the first
protocol in which 96 samples could be multiplexed in one lane of Illumina
sequencing and in which a reference genome was no longer needed. De novo
reference construction from the Watson and Crick reads made simultaneous
population screening of SNPs and SMPs possible.
- The HELP assay, which is based on restriction enzymes' differential ability to recognize and cleave methylated and unmethylated CpG DNA sites.
- GLAD-PCR assay, which is based on a new type of enzyme: site-specific methyl-directed DNA endonucleases, which hydrolyze only methylated DNA.
- ChIP-on-chip
assays, which are based on the ability of commercially prepared
antibodies to bind to DNA methylation-associated proteins like MeCP2.
- Restriction landmark genomic scanning,
a complicated and now rarely used assay based upon restriction enzymes'
differential recognition of methylated and unmethylated CpG sites; the
assay is similar in concept to the HELP assay.
- Methylated DNA immunoprecipitation (MeDIP), which is analogous to chromatin immunoprecipitation: immunoprecipitation is used to isolate methylated DNA fragments for input into DNA detection methods such as DNA microarrays (MeDIP-chip) or DNA sequencing (MeDIP-seq).
- Pyrosequencing
of bisulfite-treated DNA. The gene of choice is amplified by PCR with a
normal forward primer and a biotinylated reverse primer, and the
resulting amplicon is sequenced. The pyrosequencer analyses the sample by
denaturing the DNA and adding one nucleotide at a time to the mix, in an
order given by the user. If there is a mismatch, it is recorded, and
the percentage of DNA in which the mismatch is present is noted. This
gives the user a percentage methylation for each CpG site in the amplicon.
- Molecular break light assay for DNA adenine methyltransferase
activity – an assay that relies on the specificity of the restriction
enzyme DpnI for fully methylated (adenine methylation) GATC sites in an
oligonucleotide labeled with a fluorophore and quencher. The adenine
methyltransferase methylates the oligonucleotide making it a substrate
for DpnI. Cutting of the oligonucleotide by DpnI gives rise to a
fluorescence increase.[78][79]
- Methyl-sensitive Southern blotting is similar to the HELP assay
but uses Southern blotting techniques to probe gene-specific
differences in methylation using restriction digests. This technique is
used to evaluate local methylation near the binding site for the probe.
- Methyl-CpG binding proteins (MBPs) and fusion proteins containing
just the methyl-binding domain (MBD) are used to separate native DNA
into methylated and unmethylated fractions. The percentage methylation
of individual CpG islands can be determined by quantifying the amount of
the target in each fraction.[80] Extremely sensitive detection can be achieved in FFPE tissues with abscription-based detection.
- High Resolution Melt Analysis (HRM or HRMA) is a post-PCR
analytical technique. The target DNA is treated with sodium bisulfite,
which chemically converts unmethylated cytosines into uracils, while
methylated cytosines are preserved. PCR amplification is then carried
out with primers designed to amplify both methylated and unmethylated
templates. After this amplification, highly methylated DNA sequences
contain a higher number of CpG sites compared to unmethylated templates,
which results in a different melting temperature that can be used in
quantitative methylation detection.[81][82]
- Ancient DNA methylation reconstruction, a method to reconstruct
high-resolution DNA methylation from ancient DNA samples. The method is
based on the natural degradation processes that occur in ancient DNA:
with time, methylated cytosines are degraded into thymines, whereas
unmethylated cytosines are degraded into uracils. This asymmetry in
degradation signals was used to reconstruct the full methylation maps of
the Neanderthal and the Denisovan.[83]
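Several of the assays listed above (MSP, bisulfite sequencing, pyrosequencing and HRM) rest on the same readout: after bisulfite treatment and amplification, an unmethylated cytosine is read as T while a methylated cytosine is still read as C. A minimal sketch of that logic follows. The function names and the toy reference are hypothetical, and the sketch assumes single-strand reads that are already aligned, span the whole reference, and contain no sequencing errors or incomplete conversion.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Simulate bisulfite treatment plus PCR on one strand.

    Unmethylated C is read as T (C -> U -> T after amplification);
    C at a position listed in `methylated` stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def cpg_methylation(reference: str, reads: list[str]) -> dict[int, float]:
    """Fraction of reads showing C at each CpG cytosine of the reference."""
    reference = reference.upper()
    cpg_positions = [
        i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"
    ]
    levels = {}
    for pos in cpg_positions:
        # Only C (methylated) and T (converted) calls are informative.
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


reference = "ACGTTCGACCGA"  # toy reference with CpG cytosines at 1, 5 and 9
# Four simulated molecules with different methylation patterns.
reads = [
    bisulfite_convert(reference, {1, 5}),
    bisulfite_convert(reference, {1}),
    bisulfite_convert(reference, {1}),
    bisulfite_convert(reference, set()),
]
print(reads[0])                            # ACGTTCGATTGA
print(cpg_methylation(reference, reads))   # {1: 0.75, 5: 0.25, 9: 0.0}
```

Real pipelines must additionally handle both strands, alignment against a converted reference, and incomplete conversion, none of which is modelled here.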
Differentially methylated regions (DMRs)
Differentially methylated regions (DMRs)
are genomic regions with different methylation statuses among multiple
samples (tissues, cells, individuals or others) and are regarded as
possible functional regions involved in gene transcriptional regulation.
The identification of DMRs among multiple tissues (T-DMRs) provides a
comprehensive survey of epigenetic differences among human tissues.
[84]
For example, methylated regions that are unique to a particular
tissue make it possible to differentiate between tissue types, such as
semen and vaginal fluid. Research by Lee et al.
showed that DACT1 and USP49 positively identified semen by examining T-DMRs.
[85] DMRs between cancer and normal samples (C-DMRs) demonstrate the aberrant methylation in cancers.
[86] It is well known that DNA methylation is associated with cell differentiation and proliferation.
[87] Many DMRs have been found across developmental stages (D-DMRs)
[88] and during reprogramming (R-DMRs).
[89]
In addition, there are intra-individual DMRs (Intra-DMRs), which
reflect longitudinal changes in global DNA methylation as a given
individual ages.
[90] There are also inter-individual DMRs (Inter-DMRs) with different methylation patterns among multiple individuals.
[91]
QDMR (Quantitative Differentially Methylated Regions) is a
quantitative approach that adapts Shannon entropy to quantify
methylation differences and identify DMRs from genome-wide methylation
profiles (http://bioinfo.hrbmu.edu.cn/qdmr).
Its platform-free and species-free nature makes it potentially
applicable to various methylation data, and it provides an effective
tool for the high-throughput identification of functional regions
involved in epigenetic regulation across multiple samples.
[92]
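The idea of using Shannon entropy to score methylation differences can be sketched as follows. This is an illustrative simplification, not the published QDMR formula: the function name is hypothetical, and the rescaling of methylation levels into a probability distribution is an assumption made for the example. A region methylated equally in all samples gets the maximal entropy, while a region methylated in only one sample gets an entropy near zero and is a DMR candidate.

```python
import math


def methylation_entropy(levels: list[float]) -> float:
    """Shannon entropy (in bits) of one region's methylation levels across samples.

    The levels are rescaled to sum to 1 and treated as a probability
    distribution over samples (an assumption of this sketch).
    """
    if not levels:
        raise ValueError("at least one sample is required")
    total = sum(levels)
    if total == 0:
        # No methylation in any sample: treat the region as uniform.
        return math.log2(len(levels))
    probs = [x / total for x in levels]
    return -sum(p * math.log2(p) for p in probs if p > 0)


uniform = [0.8, 0.8, 0.8, 0.8]    # same level in all four samples
specific = [0.9, 0.0, 0.0, 0.0]   # methylated in one sample only

# Uniform regions reach the maximum, log2(4) = 2 bits (up to rounding);
# sample-specific regions score close to 0.
print(methylation_entropy(uniform))
print(methylation_entropy(specific))
```

In this simplified form the score ignores the absolute methylation level, so regions uniformly at 10% and uniformly at 90% score the same; a practical method needs to account for that.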
Gene-set analysis (a.k.a. pathway analysis; usually performed with tools
such as DAVID, GoSeq or GSEA) has been shown to be severely biased when
applied to high-throughput methylation data (e.g. MeDIP-seq, MeDIP-chip,
HELP-seq etc.), and a wide range of studies have thus mistakenly
reported hyper-methylation of genes related to development and
differentiation; it has been suggested that this can be corrected using
sample label permutations or using a statistical model to control for
differences in the number of CpG probes / CpG sites that target each
gene.
[93]
DNA methylation marks
DNA methylation marks
are genomic regions with a specific methylation pattern in a specific
biological state (such as a tissue, cell type or individual) and are regarded as
possible functional regions involved in gene transcriptional
regulation. Although various human cell types may have the same genome,
these cells have different methylomes. The systematic identification and
characterization of methylation marks across cell types are crucial to
understanding the complex regulatory network for cell fate
determination. Hongbo Liu et al. proposed an entropy-based framework
termed SMART to integrate the whole genome bisulfite sequencing
methylomes across 42 human tissues/cells and identified 757,887 genome
segments.
[94]
Nearly 75% of the segments showed uniform methylation across all cell
types. From the remaining 25% of the segments, they used a statistical
approach to identify cell type-specific hypo/hypermethylation marks,
that is, segments hypo- or hypermethylated in only a minority of cell
types, and presented an atlas of human methylation marks. Further
analysis revealed that the cell type-specific hypomethylation marks were
enriched for H3K27ac and transcription factor binding sites in a cell
type-specific manner. In particular, they observed that the cell
type-specific hypomethylation marks are associated with the cell
type-specific super-enhancers that drive the expression of cell identity
genes. This framework provides a complementary, functional annotation of
the human genome and helps to elucidate the critical features and
functions of cell type-specific hypomethylation.
The entropy-based Specific Methylation Analysis and Report Tool,
termed "SMART", focuses on integrating a large number of DNA
methylomes for the de novo identification of cell type-specific
methylation marks. The latest version of SMART has three main
functions: de novo identification of differentially methylated
regions (DMRs) by genome segmentation, identification of DMRs from
predefined regions of interest, and identification of differentially
methylated CpG sites. SMART is available at
http://fame.edbc.org/smart/.
Computational prediction
DNA methylation can also be predicted by computational models. Such
models can facilitate the global profiling of DNA methylation across
chromosomes, and they are often faster and cheaper than biological
assays. Examples include the models of Bhasin et al.,[95] Bock et al.,[96]
and Zheng et al.[97][98] Together with biological assays, these methods
greatly facilitate DNA methylation analysis.