A Medley of Potpourri

Wednesday, April 3, 2019

Cellular differentiation

From Wikipedia, the free encyclopedia

Cell-count distribution featuring cellular differentiation for three types of cells (progenitor z, osteoblast y, and chondrocyte x) exposed to pro-osteoblast stimulus.

Cellular differentiation is the process where a cell changes from one cell type to another. Usually, the cell changes to a more specialized type. Differentiation occurs numerous times during the development of a multicellular organism as it changes from a simple zygote to a complex system of tissues and cell types. Differentiation continues in adulthood as adult stem cells divide and create fully differentiated daughter cells during tissue repair and during normal cell turnover. Some differentiation occurs in response to antigen exposure. Differentiation dramatically changes a cell's size, shape, membrane potential, metabolic activity, and responsiveness to signals. These changes are largely due to highly controlled modifications in gene expression and are the study of epigenetics. With a few exceptions, cellular differentiation almost never involves a change in the DNA sequence itself. Thus, different cells can have very different physical characteristics despite having the same genome.

A specialized type of differentiation, known as 'terminal differentiation', is of importance in some tissues, for example the vertebrate nervous system, striated muscle, epidermis and gut. During terminal differentiation, a precursor cell formerly capable of cell division permanently leaves the cell cycle, dismantles the cell cycle machinery and often expresses a range of genes characteristic of the cell's final function (e.g. myosin and actin for a muscle cell). Differentiation may continue to occur after terminal differentiation if the capacity and functions of the cell undergo further changes.

Among dividing cells, there are multiple levels of cell potency, the cell's ability to differentiate into other cell types. A greater potency indicates a larger number of cell types that can be derived. A cell that can differentiate into all cell types, including the placental tissue, is known as totipotent. In mammals, only the zygote and subsequent blastomeres are totipotent, while in plants, many differentiated cells can become totipotent with simple laboratory techniques. A cell that can differentiate into all cell types of the adult organism is known as pluripotent. Such cells are called meristematic cells in higher plants and embryonic stem cells in animals, though some groups report the presence of adult pluripotent cells. Virally induced expression of four transcription factors, Oct4, Sox2, c-Myc, and Klf4 (the Yamanaka factors), is sufficient to create induced pluripotent stem (iPS) cells from adult fibroblasts. A multipotent cell is one that can differentiate into multiple different but closely related cell types. Oligopotent cells are more restricted than multipotent cells, but can still differentiate into a few closely related cell types. Finally, unipotent cells can differentiate into only one cell type, but are capable of self-renewal. In cytopathology, the level of cellular differentiation is used as a measure of cancer progression. "Grade" is a marker of how differentiated a cell in a tumor is.

Mammalian cell types

Three basic categories of cells make up the mammalian body: germ cells, somatic cells, and stem cells. Each of the approximately 37.2 trillion (3.72x10¹³) cells in an adult human has its own copy or copies of the genome except certain cell types, such as red blood cells, that lack nuclei in their fully differentiated state. Most cells are diploid; they have two copies of each chromosome. Such cells, called somatic cells, make up most of the human body, such as skin and muscle cells. Cells differentiate to specialize for different functions.

Germ line cells are any line of cells that give rise to gametes—eggs and sperm—and thus are continuous through the generations. Stem cells, on the other hand, have the ability to divide for indefinite periods and to give rise to specialized cells. They are best described in the context of normal human development.

Development begins when a sperm fertilizes an egg and creates a single cell that has the potential to form an entire organism. In the first hours after fertilization, this cell divides into identical cells. In humans, approximately four days after fertilization and after several cycles of cell division, these cells begin to specialize, forming a hollow sphere of cells, called a blastocyst. The blastocyst has an outer layer of cells, and inside this hollow sphere, there is a cluster of cells called the inner cell mass. The cells of the inner cell mass go on to form virtually all of the tissues of the human body. Although the cells of the inner cell mass can form virtually every type of cell found in the human body, they cannot form an organism. These cells are referred to as pluripotent.

Pluripotent stem cells undergo further specialization into multipotent progenitor cells that then give rise to functional cells. Examples of stem and progenitor cells include:

Radial glial cells (embryonic neural stem cells) that give rise to excitatory neurons in the fetal brain through the process of neurogenesis.
Hematopoietic stem cells (adult stem cells) from the bone marrow that give rise to red blood cells, white blood cells, and platelets
Mesenchymal stem cells (adult stem cells) from the bone marrow that give rise to stromal cells, fat cells, and types of bone cells
Epithelial stem cells (progenitor cells) that give rise to the various types of skin cells
Muscle satellite cells (progenitor cells) that contribute to differentiated muscle tissue.

A pathway that is guided by the cell adhesion molecules consisting of four amino acids, arginine, glycine, aspartate, and serine, is created as the cellular blastomere differentiates from the single-layered blastula to the three primary germ layers in mammals, namely the ectoderm, mesoderm and endoderm (listed from most distal (exterior) to proximal (interior)). The ectoderm ends up forming the skin and the nervous system, the mesoderm forms the bones and muscular tissue, and the endoderm forms the internal organ tissues.

Dedifferentiation

Micrograph of a liposarcoma with some dedifferentiation, that is not identifiable as a liposarcoma, (left edge of image) and a differentiated component (with lipoblasts and increased vascularity (right of image)). Fully differentiated (morphologically benign) adipose tissue (center of the image) has few blood vessels. H&E stain.

Dedifferentiation, or integration, is a cellular process often seen in more basal life forms such as worms and amphibians, in which a partially or terminally differentiated cell reverts to an earlier developmental stage, usually as part of a regenerative process. Dedifferentiation also occurs in plants. Cells in cell culture can lose properties they originally had, such as protein expression, or change shape. This process is also termed dedifferentiation.

Some believe dedifferentiation is an aberration of the normal development cycle that results in cancer, whereas others believe it to be a natural part of the immune response lost by humans at some point as a result of evolution.

A small molecule dubbed reversine, a purine analog, has been shown to induce dedifferentiation in myotubes. These dedifferentiated cells could then redifferentiate into osteoblasts and adipocytes.

Diagram exposing several methods used to revert adult somatic cells to totipotency or pluripotency.

Mechanisms

Mechanisms of cellular differentiation.

Each specialized cell type in an organism expresses a subset of all the genes that constitute the genome of that species. Each cell type is defined by its particular pattern of regulated gene expression. Cell differentiation is thus a transition of a cell from one cell type to another and it involves a switch from one pattern of gene expression to another. Cellular differentiation during development can be understood as the result of a gene regulatory network. A regulatory gene and its cis-regulatory modules are nodes in a gene regulatory network; they receive input and create output elsewhere in the network. The systems biology approach to developmental biology emphasizes the importance of investigating how developmental mechanisms interact to produce predictable patterns (morphogenesis). (However, an alternative view has been proposed recently. Based on stochastic gene expression, cellular differentiation is the result of a Darwinian selective process occurring among cells. In this frame, protein and gene networks are the result of cellular processes and not their cause.)

An overview of major signal transduction pathways.

A few evolutionarily conserved types of molecular processes are often involved in the cellular mechanisms that control these switches. The major types of molecular processes that control cellular differentiation involve cell signaling. Many of the signal molecules that convey information from cell to cell during the control of cellular differentiation are called growth factors. Although the details of specific signal transduction pathways vary, these pathways often share the following general steps. A ligand produced by one cell binds to a receptor in the extracellular region of another cell, inducing a conformational change in the receptor. The shape of the cytoplasmic domain of the receptor changes, and the receptor acquires enzymatic activity. The receptor then catalyzes reactions that phosphorylate other proteins, activating them. A cascade of phosphorylation reactions eventually activates a dormant transcription factor or cytoskeletal protein, thus contributing to the differentiation process in the target cell. Cells and tissues can vary in competence, their ability to respond to external signals.

Signal induction refers to cascades of signaling events, during which a cell or tissue signals to another cell or tissue to influence its developmental fate. Yamamoto and Jeffery investigated the role of the lens in eye formation in cave- and surface-dwelling fish, a striking example of induction. Through reciprocal transplants, Yamamoto and Jeffery found that the lens vesicle of surface fish can induce other parts of the eye to develop in cave- and surface-dwelling fish, while the lens vesicle of the cave-dwelling fish cannot.

Other important mechanisms fall under the category of asymmetric cell divisions, divisions that give rise to daughter cells with distinct developmental fates. Asymmetric cell divisions can occur because of asymmetrically expressed maternal cytoplasmic determinants or because of signaling. In the former mechanism, distinct daughter cells are created during cytokinesis because of an uneven distribution of regulatory molecules in the parent cell; the distinct cytoplasm that each daughter cell inherits results in a distinct pattern of differentiation for each daughter cell. A well-studied example of pattern formation by asymmetric divisions is body axis patterning in Drosophila. RNA molecules are an important type of intracellular differentiation control signal. The molecular and genetic basis of asymmetric cell divisions has also been studied in green algae of the genus Volvox, a model system for studying how unicellular organisms can evolve into multicellular organisms. In Volvox carteri, the 16 cells in the anterior hemisphere of a 32-cell embryo divide asymmetrically, each producing one large and one small daughter cell. The size of the cell at the end of all cell divisions determines whether it becomes a specialized germ or somatic cell.

Epigenetic control

Since each cell, regardless of cell type, possesses the same genome, determination of cell type must occur at the level of gene expression. While the regulation of gene expression can occur through cis- and trans-regulatory elements including a gene's promoter and enhancers, the problem arises as to how this expression pattern is maintained over numerous generations of cell division. As it turns out, epigenetic processes play a crucial role in regulating the decision to adopt a stem, progenitor, or mature cell fate. This section will focus primarily on mammalian stem cells.

In systems biology and mathematical modeling of gene regulatory networks, cell-fate determination is predicted to exhibit certain dynamics, such as convergence to an attractor (which can be an equilibrium point, limit cycle or strange attractor) or oscillation.
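
As a minimal sketch of attractor convergence, the following Python snippet integrates a two-gene mutual-repression circuit (a "toggle switch"). The circuit itself, the parameter values (alpha, n), the Euler step size and the function names are all illustrative assumptions and do not come from the text above. Two different starting states settle into two different stable expression patterns, which is the sense in which an equilibrium-point attractor can stand for a cell fate.

```python
def toggle_step(x, y, alpha=4.0, n=2, dt=0.01):
    """One Euler step of a mutual-repression circuit (hypothetical parameters)."""
    dx = alpha / (1.0 + y ** n) - x   # gene X is repressed by Y and decays
    dy = alpha / (1.0 + x ** n) - y   # gene Y is repressed by X and decays
    return x + dt * dx, y + dt * dy


def settle(x, y, steps=20000):
    """Integrate long enough for the state to approach an attractor."""
    for _ in range(steps):
        x, y = toggle_step(x, y)
    return x, y


high_x = settle(2.0, 0.1)   # starts with X dominant
high_y = settle(0.1, 2.0)   # starts with Y dominant
print(high_x, high_y)
```

With these assumed parameters the first run ends with X high and Y low, and the second run ends in the mirror-image state, even though both follow the same equations.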

Importance of epigenetic control

The first question that can be asked is the extent and complexity of the role of epigenetic processes in the determination of cell fate. A clear answer to this question can be seen in the 2011 paper by Lister R, et al. on aberrant epigenomic programming in human induced pluripotent stem cells. As induced pluripotent stem cells (iPSCs) are thought to mimic embryonic stem cells in their pluripotent properties, few epigenetic differences should exist between them. To test this prediction, the authors conducted whole-genome profiling of DNA methylation patterns in several human embryonic stem cell (ESC), iPSC, and progenitor cell lines.

Female adipose cells, lung fibroblasts, and foreskin fibroblasts were reprogrammed into an induced pluripotent state with the OCT4, SOX2, KLF4, and MYC genes. Patterns of DNA methylation in ESCs, iPSCs, and somatic cells were compared. Lister R, et al. observed significant resemblance in methylation levels between embryonic and induced pluripotent cells. Around 80% of CG dinucleotides in ESCs and iPSCs were methylated, whereas the same was true of only 60% of CG dinucleotides in somatic cells. In addition, somatic cells possessed minimal levels of cytosine methylation in non-CG dinucleotides, while induced pluripotent cells possessed levels of methylation similar to those of embryonic stem cells, between 0.5 and 1.5%. Thus, consistent with their respective transcriptional activities, DNA methylation patterns, at least on the genomic level, are similar between ESCs and iPSCs.

However, upon examining methylation patterns more closely, the authors discovered 1175 regions in which CG dinucleotide methylation differed in at least one ES or iPS cell line. When these regions of differential methylation were compared with regions of cytosine methylation in the original somatic cells, 44-49% of them reflected methylation patterns of the respective progenitor somatic cells, while 51-56% were dissimilar to both the progenitor and embryonic cell lines. In vitro-induced differentiation of iPSC lines saw transmission of 88% of hypermethylated and 46% of hypomethylated differentially methylated regions.

Two conclusions are readily apparent from this study. First, epigenetic processes are heavily involved in cell fate determination, as seen from the similar levels of cytosine methylation between induced pluripotent and embryonic stem cells, consistent with their respective patterns of transcription. Second, the mechanisms of de-differentiation (and by extension, differentiation) are very complex and cannot be easily duplicated, as seen by the significant number of differentially methylated regions between ES and iPS cell lines. Now that these two points have been established, we can examine some of the epigenetic mechanisms that are thought to regulate cellular differentiation.

Mechanisms of epigenetic regulation

Pioneer factors (Oct4, Sox2, Nanog)

Three transcription factors, OCT4, SOX2, and NANOG – the first two of which are used in induced pluripotent stem cell (iPSC) reprogramming, along with Klf4 and c-Myc – are highly expressed in undifferentiated embryonic stem cells and are necessary for the maintenance of their pluripotency. It is thought that they achieve this through alterations in chromatin structure, such as histone modification and DNA methylation, to restrict or permit the transcription of target genes. While highly expressed, their levels require a precise balance to maintain pluripotency; perturbing this balance promotes differentiation towards different lineages, depending on how the gene expression levels change. Differential regulation of Oct4 and Sox2 levels has been shown to precede germ layer fate selection. Increased levels of Oct4 and decreased levels of Sox2 promote a mesendodermal fate, with Oct4 actively suppressing genes associated with a neural ectodermal fate. Similarly, increased levels of Sox2 and decreased levels of Oct4 promote differentiation towards a neural ectodermal fate, with Sox2 inhibiting differentiation towards a mesendodermal fate. Regardless of the lineage cells differentiate down, suppression of NANOG has been identified as a necessary prerequisite for differentiation.

Polycomb repressive complex (PRC2)

In the realm of gene silencing, Polycomb repressive complex 2, one of two classes of the Polycomb group (PcG) family of proteins, catalyzes the di- and tri-methylation of histone H3 lysine 27 (H3K27me2/me3). By binding to the H3K27me2/3-tagged nucleosome, PRC1 (also a complex of PcG family proteins) catalyzes the mono-ubiquitinylation of histone H2A at lysine 119 (H2AK119Ub1), blocking RNA polymerase II activity and resulting in transcriptional suppression. PcG knockout ES cells do not differentiate efficiently into the three germ layers, and deletion of the PRC1 and PRC2 genes leads to increased expression of lineage-affiliated genes and unscheduled differentiation. Presumably, PcG complexes are responsible for transcriptionally repressing differentiation and development-promoting genes.

Trithorax group proteins (TrxG)

Alternatively, upon receiving differentiation signals, PcG proteins are recruited to promoters of pluripotency transcription factors. PcG-deficient ES cells can begin differentiation but cannot maintain the differentiated phenotype. Simultaneously, differentiation and development-promoting genes are activated by Trithorax group (TrxG) chromatin regulators and lose their repression. TrxG proteins are recruited at regions of high transcriptional activity, where they catalyze the trimethylation of histone H3 lysine 4 (H3K4me3) and promote gene activation through histone acetylation. PcG and TrxG complexes engage in direct competition and are thought to be functionally antagonistic, creating at differentiation and development-promoting loci what is termed a "bivalent domain" and rendering these genes sensitive to rapid induction or repression.
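
The relationship between the two marks can be written down as a small lookup. This is only a sketch under simplifying assumptions: the function name and the four state labels are hypothetical, and a real locus is characterized by quantitative signal rather than by the mere presence or absence of a mark.

```python
def promoter_state(marks):
    """Classify a promoter from the set of histone marks found on it."""
    active = "H3K4me3" in marks       # deposited by TrxG, associated with activation
    repressive = "H3K27me3" in marks  # deposited by PRC2, associated with repression
    if active and repressive:
        return "bivalent"             # both marks present: poised for induction or repression
    if active:
        return "active"
    if repressive:
        return "repressed"
    return "unmarked"


print(promoter_state({"H3K4me3", "H3K27me3"}))
```

A promoter carrying both marks is reported as bivalent, matching the description of loci that remain sensitive to rapid induction or repression.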

DNA methylation

Regulation of gene expression is further achieved through DNA methylation, in which the DNA methyltransferase-mediated methylation of cytosine residues in CpG dinucleotides maintains heritable repression by controlling DNA accessibility. The majority of CpG sites in embryonic stem cells are unmethylated and appear to be associated with H3K4me3-carrying nucleosomes. Upon differentiation, a small number of genes, including OCT4 and NANOG, are methylated and their promoters repressed to prevent their further expression. Consistently, DNA methylation-deficient embryonic stem cells rapidly enter apoptosis upon in vitro differentiation.

Nucleosome positioning

While the DNA sequence of most cells of an organism is the same, the binding patterns of transcription factors and the corresponding gene expression patterns are different. To a large extent, differences in transcription factor binding are determined by the chromatin accessibility of their binding sites through histone modification and/or pioneer factors. In particular, it is important to know whether a nucleosome is covering a given genomic binding site or not. This can be determined using a chromatin immunoprecipitation (ChIP) assay.

Histone acetylation and methylation

DNA-nucleosome interactions are characterized by two states: either tightly bound by nucleosomes and transcriptionally inactive, called heterochromatin, or loosely bound and usually, but not always, transcriptionally active, called euchromatin. The epigenetic processes of histone methylation and acetylation, and their inverses demethylation and deacetylation, primarily account for these changes. The effects of acetylation and deacetylation are more predictable. An acetyl group is either added to or removed from the positively charged lysine residues in histones by enzymes called histone acetyltransferases or histone deacetylases, respectively. The acetyl group prevents lysine's association with the negatively charged DNA backbone. Methylation is not as straightforward, as neither methylation nor demethylation consistently correlates with either gene activation or repression. However, certain methylations have been repeatedly shown to either activate or repress genes. The trimethylation of lysine 4 on histone 3 (H3K4me3) is associated with gene activation, whereas trimethylation of lysine 27 on histone 3 (H3K27me3) represses genes.

In stem cells

During differentiation, stem cells change their gene expression profiles. Recent studies have implicated a role for nucleosome positioning and histone modifications during this process. There are two components of this process: turning off the expression of embryonic stem cell (ESC) genes, and the activation of cell fate genes. Lysine specific demethylase 1 (KDM1A) is thought to prevent the use of enhancer regions of pluripotency genes, thereby inhibiting their transcription. It interacts with the Mi-2/NuRD (nucleosome remodelling and histone deacetylase) complex, giving an instance where methylation and acetylation are not discrete and mutually exclusive, but intertwined processes.

Role of signaling in epigenetic control

A final question to ask concerns the role of cell signaling in influencing the epigenetic processes governing differentiation. Such a role should exist, as it would be reasonable to think that extrinsic signaling can lead to epigenetic remodeling, just as it can lead to changes in gene expression through the activation or repression of different transcription factors. Little direct data is available concerning the specific signals that influence the epigenome, and the majority of current knowledge about the subject consists of speculations on plausible candidate regulators of epigenetic remodeling. We will first discuss several major candidates thought to be involved in the induction and maintenance of both embryonic stem cells and their differentiated progeny, and then turn to one example of specific signaling pathways in which more direct evidence exists for its role in epigenetic change.

The first major candidate is the Wnt signaling pathway. The Wnt pathway is involved in all stages of differentiation, and the ligand Wnt3a can substitute for the overexpression of c-Myc in the generation of induced pluripotent stem cells. On the other hand, disruption of β-catenin, a component of the Wnt signaling pathway, leads to decreased proliferation of neural progenitors.

Growth factors comprise the second major set of candidates of epigenetic regulators of cellular differentiation. These morphogens are crucial for development, and include bone morphogenetic proteins, transforming growth factors (TGFs), and fibroblast growth factors (FGFs). TGFs and FGFs have been shown to sustain expression of OCT4, SOX2, and NANOG by downstream signaling to Smad proteins. Depletion of growth factors promotes the differentiation of ESCs, while genes with bivalent chromatin can become either more restrictive or permissive in their transcription.

Several other signaling pathways are also considered to be primary candidates. The cytokine leukemia inhibitory factor is associated with the maintenance of mouse ESCs in an undifferentiated state. This is achieved through its activation of the Jak-STAT3 pathway, which has been shown to be necessary and sufficient for maintaining mouse ESC pluripotency. Retinoic acid can induce differentiation of human and mouse ESCs, and Notch signaling is involved in the proliferation and self-renewal of stem cells. Finally, Sonic hedgehog, in addition to its role as a morphogen, promotes embryonic stem cell differentiation and the self-renewal of somatic stem cells.

The problem, of course, is that the candidacy of these signaling pathways was inferred primarily on the basis of their role in development and cellular differentiation. While epigenetic regulation is necessary for driving cellular differentiation, it is certainly not sufficient for this process. Direct modulation of gene expression through modification of transcription factors plays a key role that must be distinguished from heritable epigenetic changes that can persist even in the absence of the original environmental signals. Only a few examples of signaling pathways leading to epigenetic changes that alter cell fate currently exist, and we will focus on one of them.

Expression of Shh (Sonic hedgehog) upregulates the production of BMI1, a component of the PcG complex that recognizes H3K27me3. This occurs in a Gli-dependent manner, as Gli1 and Gli2 are downstream effectors of the Hedgehog signaling pathway. In culture, Bmi1 mediates the Hedgehog pathway's ability to promote human mammary stem cell self-renewal. In both humans and mice, researchers showed Bmi1 to be highly expressed in proliferating immature cerebellar granule cell precursors. When Bmi1 was knocked out in mice, impaired cerebellar development resulted, leading to significant reductions in postnatal brain mass along with abnormalities in motor control and behavior. A separate study showed a significant decrease in neural stem cell proliferation along with increased astrocyte proliferation in Bmi null mice.

An alternative model of cellular differentiation during embryogenesis is that positional information is based on mechanical signalling by the cytoskeleton using embryonic differentiation waves. The mechanical signal is then epigenetically transduced via signal transduction systems (of which specific molecules such as Wnt are part) to result in differential gene expression.

In summary, the role of signaling in the epigenetic control of cell fate in mammals is largely unknown, but distinct examples exist that indicate the likely existence of further such mechanisms.

Effect of matrix elasticity

To regenerate a variety of tissues, adult stem cells are known to migrate from their niches, adhere to new extracellular matrices (ECM) and differentiate. The stiffness of these microenvironments is unique to each tissue type: the ECM surrounding brain, muscle and bone tissue ranges from soft to stiff. The differentiation of stem cells into these cell types is not directed solely by chemokine cues and cell-to-cell signaling. The elasticity of the microenvironment can also affect the differentiation of mesenchymal stem cells (MSCs), which originate in bone marrow. When MSCs are placed on substrates of the same stiffness as brain, muscle and bone ECM, the MSCs take on properties of those respective cell types. Matrix sensing requires the cell to pull against the matrix at focal adhesions, which triggers a cellular mechano-transducer to generate a signal indicating what force is needed to deform the matrix. To determine the key players in matrix-elasticity-driven lineage specification in MSCs, different matrix microenvironments were mimicked. From these experiments, it was concluded that focal adhesions of the MSCs were the cellular mechano-transducer sensing the differences in matrix elasticity. The non-muscle myosin IIa-c isoforms generate the forces in the cell that lead to signaling of early commitment markers. Non-muscle myosin IIa generates the least force, increasing to non-muscle myosin IIc. Non-muscle myosin II can be inhibited by agents such as blebbistatin, which makes the cell effectively blind to the surrounding matrix. Researchers have obtained some success in inducing stem cell-like properties in HEK 293 cells by providing a soft matrix without the use of diffusing factors. The stem-cell properties appear to be linked to tension in the cells' actin network.

One identified mechanism for matrix-induced differentiation is tension-induced proteins, which remodel chromatin in response to mechanical stretch. The RhoA pathway is also implicated in this process.

DNA polymerase

From Wikipedia, the free encyclopedia

DNA-directed DNA polymerase
3D structure of the DNA-binding helix-turn-helix motifs in human DNA polymerase beta (based on PDB file 7ICG)
Identifiers
EC number	2.7.7.7
CAS number	9012-90-2

DNA polymerase is an enzyme that synthesizes DNA molecules from deoxyribonucleotides, the building blocks of DNA. These enzymes are essential for DNA replication and usually work in pairs to create two identical DNA strands from a single original DNA molecule. During this process, DNA polymerase "reads" the existing DNA strands to create two new strands that match the existing ones.

These enzymes catalyze the following chemical reaction:

deoxynucleoside triphosphate + DNAₙ ⇌ diphosphate + DNAₙ₊₁

DNA polymerase adds nucleotides to the three prime (3')-end of a DNA strand, one nucleotide at a time.

Every time a cell divides, DNA polymerases are required to help duplicate the cell's DNA, so that a copy of the original DNA molecule can be passed to each daughter cell. In this way, genetic information is passed down from generation to generation.

Before replication can take place, an enzyme called helicase unwinds the DNA molecule from its tightly woven form, in the process breaking the hydrogen bonds between the nucleotide bases. This opens up or "unzips" the double-stranded DNA to give two single strands of DNA that can be used as templates for replication.

History

In 1956, Arthur Kornberg and colleagues discovered DNA polymerase I (Pol I) in Escherichia coli. They described the DNA replication process by which DNA polymerase copies the base sequence of a template DNA strand. Kornberg was awarded the Nobel Prize in Physiology or Medicine in 1959 for this work. DNA polymerase II was discovered by Thomas Kornberg (the son of Arthur Kornberg) and Malcolm E. Gefter in 1970 while further elucidating the role of Pol I in E. coli DNA replication.

Function

DNA polymerase moves along the old strand in the 3'–5' direction, creating a new strand having a 5'–3' direction.

DNA polymerase with proofreading ability

The main function of DNA polymerase is to synthesize DNA from deoxyribonucleotides, the building blocks of DNA. The DNA copies are created by the pairing of nucleotides to bases present on each strand of the original DNA molecule. This pairing always occurs in specific combinations: cytosine pairs with guanine, and thymine pairs with adenine. By contrast, RNA polymerases synthesize RNA from ribonucleotides, using either RNA or DNA as a template.

When synthesizing new DNA, DNA polymerase can add free nucleotides only to the 3' end of the newly forming strand. This results in elongation of the newly forming strand in a 5'–3' direction. No known DNA polymerase is able to begin a new chain (de novo); it can only add a nucleotide onto a pre-existing 3'-OH group, and therefore needs a primer at which it can add the first nucleotide. Primers consist of RNA or DNA bases (or both). In DNA replication, the first two bases are always RNA, and are synthesized by another enzyme called primase. Helicase and topoisomerase II are required to unwind DNA from a double-strand structure to a single-strand structure to facilitate replication of each strand consistent with the semiconservative model of DNA replication.

It is important to note that the directionality of the newly forming strand (the daughter strand) is opposite to the direction in which DNA polymerase moves along the template strand. Since DNA polymerase requires a free 3' OH group for initiation of synthesis, it can synthesize in only one direction by extending the 3' end of the preexisting nucleotide chain. Hence, DNA polymerase moves along the template strand in a 3'–5' direction, and the daughter strand is formed in a 5'–3' direction. This difference enables the resultant double-strand DNA formed to be composed of two DNA strands that are antiparallel to each other.
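The direction rule described above can be illustrated with a short sketch. This is a toy model only: the function name `synthesize_daughter`, the optional primer argument, and the convention that the template string is written 3'→5' are assumptions made for this example, not part of any real library.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize_daughter(template_3_to_5: str, primer: str = "") -> str:
    """Return the daughter strand, written 5'->3'.

    The template is read from its 3' end toward its 5' end, and each new
    nucleotide is appended to the 3' end of the growing strand. The primer
    (hypothetical input, written 5'->3') must already pair with the start
    of the template.
    """
    for i, base in enumerate(primer):
        if COMPLEMENT[template_3_to_5[i]] != base:
            raise ValueError("primer does not pair with the template")
    daughter = primer
    for base in template_3_to_5[len(primer):]:
        daughter += COMPLEMENT[base]  # extend the free 3' end
    return daughter


template = "TACGGA"  # written 3'->5'
print(synthesize_daughter(template, primer="AT"))  # ATGCCT, read 5'->3'
```

Because the template is consumed 3'→5' while the daughter grows 5'→3', the two strands of the product are antiparallel, as the paragraph above states.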

The function of DNA polymerase is not quite perfect, with the enzyme making about one mistake for every billion base pairs copied. Error correction is a property of some, but not all, DNA polymerases. This process corrects mistakes in newly synthesized DNA. When an incorrect base pair is recognized, DNA polymerase moves backwards by one base pair of DNA. The 3'–5' exonuclease activity of the enzyme allows the incorrect base pair to be excised (this activity is known as proofreading). Following base excision, the polymerase can re-insert the correct base and replication can continue forwards. This preserves the integrity of the original DNA strand that is passed on to the daughter cells.
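The back-up, excise, and re-insert cycle can be sketched as a small deterministic loop. This is an illustrative model under stated assumptions: the function `replicate_with_proofreading` and the `offered` list of incoming nucleotides are hypothetical, and real proofreading is a kinetic process rather than an exact check.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def replicate_with_proofreading(template_3_to_5, offered):
    """Extend a daughter strand (5'->3') from a stream of offered nucleotides.

    `offered` is a hypothetical list of nucleotides arriving at the active
    site. A nucleotide that mispairs is first added, then excised from the
    3' end (the exonuclease step), and the next offered nucleotide is tried
    at the same template position.
    """
    daughter = []
    excised = 0
    supply = iter(offered)
    for template_base in template_3_to_5:
        while True:
            daughter.append(next(supply))  # polymerase adds a base
            if daughter[-1] == COMPLEMENT[template_base]:
                break  # correct pair: move on to the next position
            daughter.pop()  # back up one position and excise
            excised += 1
    return "".join(daughter), excised


strand, corrections = replicate_with_proofreading("TACG", list("ATTGC"))
print(strand, corrections)  # ATGC 1
```

In this run the third offered nucleotide (T opposite C) is a mismatch, so it is removed and replaced by the next one offered (G). The function raises `StopIteration` if the supply runs out before the template is fully copied.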

Fidelity is very important in DNA replication. Mismatches in DNA base pairing can potentially result in dysfunctional proteins and could lead to cancer. Many DNA polymerases contain an exonuclease domain, which detects base pair mismatches and removes the incorrect nucleotide so that it can be replaced by the correct one. The shape and the interactions accommodating the Watson and Crick base pair are what primarily contribute to the detection of error. Hydrogen bonds play a key role in base pair binding and interaction. The loss of an interaction, which occurs at a mismatch, is said to trigger a shift in the balance of template-primer binding from the polymerase domain to the exonuclease domain. In addition, incorporation of a wrong nucleotide slows DNA polymerization. This delay gives time for the DNA to be switched from the polymerase site to the exonuclease site. Different conformational changes and loss of interaction occur at different mismatches. In a purine:pyrimidine mismatch there is a displacement of the pyrimidine towards the major groove and the purine towards the minor groove. Relative to the shape of DNA polymerase's binding pocket, steric clashes occur between the purine and residues in the minor groove, and important van der Waals and electrostatic interactions are lost by the pyrimidine. Pyrimidine:pyrimidine and purine:purine mismatches present less notable changes since the bases are displaced towards the major groove, and less steric hindrance is experienced. However, although the different mismatches result in different steric properties, DNA polymerase is still able to detect and differentiate them uniformly and maintain fidelity in DNA replication. DNA polymerization is also critical for many mutagenesis processes and is widely employed in biotechnologies.

Structure

The known DNA polymerases have highly conserved structure, which means that their overall catalytic subunits vary very little from species to species, independent of their domain structures. Conserved structures usually indicate important, irreplaceable functions of the cell, the maintenance of which provides evolutionary advantages. The shape can be described as resembling a right hand with thumb, finger, and palm domains. The palm domain appears to function in catalyzing the transfer of phosphoryl groups in the phosphoryl transfer reaction. This reaction is believed to be catalyzed by a two-metal-ion mechanism. DNA is bound to the palm when the enzyme is active. The finger domain functions to bind the nucleoside triphosphates with the template base. The thumb domain plays a potential role in the processivity, translocation, and positioning of the DNA.

Processivity

DNA polymerase's rapid catalysis is due to its processive nature. Processivity is a characteristic of enzymes that function on polymeric substrates. In the case of DNA polymerase, the degree of processivity refers to the average number of nucleotides added each time the enzyme binds a template. The average DNA polymerase requires about one second to locate and bind a primer/template junction. Once it is bound, a nonprocessive DNA polymerase adds nucleotides at a rate of one nucleotide per second. Processive DNA polymerases, however, add multiple nucleotides per second, drastically increasing the rate of DNA synthesis. The degree of processivity is directly proportional to the rate of DNA synthesis. The rate of DNA synthesis in a living cell was first determined as the rate of phage T4 DNA elongation in phage-infected E. coli. During the period of exponential DNA increase at 37 °C, the rate was 749 nucleotides per second.
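A rough way to see why processivity matters is to charge the enzyme the binding time every time it has to find a primer/template junction again. The sketch below is a simplified model, not a measured kinetic scheme: the function `synthesis_time` and its parameters are assumptions for illustration, and the numbers in the usage lines are illustrative only.

```python
import math


def synthesis_time(total_nt, processivity, rate_nt_per_s, binding_s=1.0):
    """Seconds needed to add `total_nt` nucleotides in a toy model.

    processivity  : nucleotides added per binding event
    rate_nt_per_s : elongation rate while the enzyme is bound
    binding_s     : time to locate and bind a primer/template junction
    """
    binding_events = math.ceil(total_nt / processivity)
    return binding_events * binding_s + total_nt / rate_nt_per_s


# Same elongation rate, different processivity (illustrative numbers).
print(synthesis_time(1000, processivity=1, rate_nt_per_s=1))     # 2000.0
print(synthesis_time(1000, processivity=1000, rate_nt_per_s=1))  # 1001.0
```

In this model an enzyme that falls off after every nucleotide spends half its time re-binding, while a highly processive enzyme pays the binding cost only once.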

DNA polymerase's ability to slide along the DNA template allows increased processivity. There is a dramatic increase in processivity at the replication fork. This increase is facilitated by the DNA polymerase's association with proteins known as the sliding DNA clamp. The clamps are multiple protein subunits associated in the shape of a ring. Using the hydrolysis of ATP, a class of proteins known as the sliding clamp loading proteins open up the ring structure of the sliding DNA clamps allowing binding to and release from the DNA strand. Protein-protein interaction with the clamp prevents DNA polymerase from diffusing from the DNA template, thereby ensuring that the enzyme binds the same primer/template junction and continues replication. DNA polymerase changes conformation, increasing affinity to the clamp when associated with it and decreasing affinity when it completes the replication of a stretch of DNA to allow release from the clamp.

Variation across species

DNA polymerase family A
c:o6-methyl-guanine pair in the polymerase-2 basepair position
Identifiers
Symbol	DNA_pol_A
Pfam	PF00476
InterPro	IPR001098
SMART	-
PROSITE	PDOC00412
SCOP	1dpi
SUPERFAMILY	1dpi

DNA polymerase family B
crystal structure of rb69 gp43 in complex with dna containing thymine glycol
Identifiers
Symbol	DNA_pol_B
Pfam	PF00136
Pfam clan	CL0194
InterPro	IPR006134
PROSITE	PDOC00107
SCOP	1noy
SUPERFAMILY	1noy

DNA polymerase type B, organellar and viral
phi29 dna polymerase, orthorhombic crystal form, ssdna complex
Identifiers
Symbol	DNA_pol_B_2
Pfam	PF03175
Pfam clan	CL0194
InterPro	IPR004868

Based on sequence homology, DNA polymerases can be further subdivided into seven different families: A, B, C, D, X, Y, and RT.

Some viruses also encode special DNA polymerases, such as Hepatitis B virus DNA polymerase. These may selectively replicate viral DNA through a variety of mechanisms. Retroviruses encode an unusual DNA polymerase called reverse transcriptase, which is an RNA-dependent DNA polymerase (RdDp). It polymerizes DNA from a template of RNA.

Family	Types of DNA polymerase	Species	Examples	Feature
A	Replicative and Repair Polymerases	Eukaryotic and Prokaryotic	T7 DNA polymerase, Pol I, Pol γ, θ, and ν	Two exonuclease domains (3'-5' and 5'-3')
B	Replicative and Repair Polymerases	Eukaryotic and Prokaryotic	Pol II, Pol B, Pol ζ, Pol α, δ, and ε	3'-5' exonuclease (proofreading); viral ones use protein primer
C	Replicative Polymerases	Prokaryotic	Pol III	3'-5' exonuclease (proofreading)
D	Replicative Polymerases	Euryarchaeota	PolD (DP1/DP2 heterodimer)	No "hand" feature, RNA polymerase-like; 3'-5' exonuclease (proofreading)
X	Replicative and Repair Polymerases	Eukaryotic	Pol β, Pol σ, Pol λ, Pol μ, and Terminal deoxynucleotidyl transferase	template-independent; 5' phosphatase (only Pol β)
Y	Replicative and Repair Polymerases	Eukaryotic and Prokaryotic	Pol ι, Pol κ, Pol η, Pol IV, and Pol V
RT	Replicative and Repair Polymerases	Viruses, Retroviruses, and Eukaryotic	Telomerase, Hepatitis B virus	RNA-dependent

Prokaryotic polymerase

Prokaryotic polymerases exist in two forms: core polymerase and holoenzyme. Core polymerase synthesizes DNA from the DNA template, but it cannot initiate synthesis alone or accurately. Holoenzyme accurately initiates synthesis.

Pol I

Prokaryotic family A polymerases include the DNA polymerase I (Pol I) enzyme, which is encoded by the polA gene and ubiquitous among prokaryotes. This repair polymerase is involved in excision repair, with both 3'–5' and 5'–3' exonuclease activity, and in processing of Okazaki fragments generated during lagging strand synthesis. Pol I is the most abundant polymerase, accounting for more than 95% of polymerase activity in E. coli; yet cells lacking Pol I have been found, suggesting that Pol I activity can be replaced by the other four polymerases. Pol I adds ~15–20 nucleotides per second, thus showing poor processivity. Instead, Pol I starts adding nucleotides at the RNA primer:template junction known as the origin of replication (ori). Approximately 400 bp downstream from the origin, the Pol III holoenzyme is assembled and takes over replication at a highly processive speed and nature.

Taq polymerase is a heat-stable enzyme of this family that lacks proofreading ability.

Pol II

DNA polymerase II, a family B polymerase, is a polB gene product also known as DinA. Pol II has 3'–5' exonuclease activity and participates in DNA repair and in replication restart to bypass lesions; its abundance can jump from ~30–50 copies per cell to ~200–300 during SOS induction. Pol II is also thought to be a backup to Pol III, as it can interact with holoenzyme proteins and assume a high level of processivity. The main role of Pol II is thought to be the ability to direct polymerase activity at the replication fork and help stalled Pol III bypass terminal mismatches.

Pfu DNA polymerase is a heat-stable enzyme of this family found in the hyperthermophilic archaeon Pyrococcus furiosus. Detailed classification divides family B in archaea into B1, B2, and B3, in which B2 is a group of pseudoenzymes. Pfu belongs to family B3. Other PolBs found in archaea are part of "Casposons", Cas1-dependent transposons. Some viruses (including Φ29 DNA polymerase) and mitochondrial plasmids carry polB as well.

Pol III

DNA polymerase III holoenzyme is the primary enzyme involved in DNA replication in E. coli and belongs to family C polymerases. It consists of three assemblies: the pol III core, the beta sliding clamp processivity factor, and the clamp-loading complex. The core consists of three subunits: α, the polymerase activity hub, ɛ, exonucleolytic proofreader, and θ, which may act as a stabilizer for ɛ. The holoenzyme contains two cores, one for each strand, the lagging and leading. The beta sliding clamp processivity factor is also present in duplicate, one for each core, to create a clamp that encloses DNA allowing for high processivity. The third assembly is a seven-subunit (τ2γδδ′χψ) clamp loader complex. Recent research has classified Family C polymerases as a subcategory of Family X with no eukaryotic equivalents.

Pol IV

In E. coli, DNA polymerase IV (Pol IV) is an error-prone DNA polymerase involved in non-targeted mutagenesis. Pol IV is a Family Y polymerase expressed by the dinB gene that is switched on via SOS induction caused by stalled polymerases at the replication fork. During SOS induction, Pol IV production is increased tenfold, and one of its functions during this time is to interfere with Pol III holoenzyme processivity. This creates a checkpoint, stops replication, and allows time to repair DNA lesions via the appropriate repair pathway. Another function of Pol IV is to perform translesion synthesis at the stalled replication fork, for example bypassing N2-deoxyguanine adducts at a faster rate than traversing undamaged DNA. Cells lacking the dinB gene have a higher rate of mutagenesis caused by DNA damaging agents.

Pol V

DNA polymerase V (Pol V) is a Y-family DNA polymerase that is involved in SOS response and translesion synthesis DNA repair mechanisms. Transcription of the umuDC genes, which encode Pol V, is highly regulated so that Pol V is produced only when damaged DNA is present in the cell, generating an SOS response. Stalled polymerases cause RecA to bind to the ssDNA, which causes the LexA protein to autodigest. LexA then loses its ability to repress the transcription of the umuDC operon. The same RecA-ssDNA nucleoprotein posttranslationally modifies the UmuD protein into UmuD' protein. UmuD and UmuD' form a heterodimer that interacts with UmuC, which in turn activates UmuC's polymerase catalytic activity on damaged DNA. In E. coli, a polymerase “tool belt” model for switching Pol III with Pol IV at a stalled replication fork, where both polymerases bind simultaneously to the β-clamp, has been proposed. However, the involvement of more than one TLS polymerase working in succession to bypass a lesion has not yet been shown in E. coli. Moreover, Pol IV can catalyze both insertion and extension with high efficiency, whereas Pol V is considered the major SOS TLS polymerase. One example is the bypass of an intra-strand guanine-thymine cross-link, where it was shown, on the basis of the difference in the mutational signatures of the two polymerases, that Pol IV and Pol V compete for TLS of the intra-strand cross-link.

Family D

In 1998, the family D of DNA polymerase was discovered in Pyrococcus furiosus and Methanococcus jannaschii. The PolD complex is a heterodimer of two chains, encoded by DP1 (small, proofreading) and DP2 (large, catalytic). Unlike other DNA polymerases, the structure and mechanism of the catalytic core resemble those of multi-subunit RNA polymerases. The DP1-DP2 interface resembles that of the Eukaryotic Class B polymerase zinc finger and its small subunit. DP1, a Mre11-like exonuclease, is likely the precursor of the small subunit of Pol α and ε, providing proofreading capabilities now lost in Eukaryotes. Its N-terminal HSH domain is similar in structure to AAA proteins, especially Pol III subunit δ and RuvB. DP2 has a Class II KH domain. Pyrococcus abyssi PolD is more heat-stable and more accurate than Taq polymerase, but has not yet been commercialized.

Eukaryotic DNA polymerase

Polymerases β, λ, σ and μ (beta, lambda, sigma, and mu)

Family X polymerases contain the well-known eukaryotic polymerase pol β (beta), as well as other eukaryotic polymerases such as Pol σ (sigma), Pol λ (lambda), Pol μ (mu), and Terminal deoxynucleotidyl transferase (TdT). Family X polymerases are found mainly in vertebrates, and a few are found in plants and fungi. These polymerases have highly conserved regions that include two helix-hairpin-helix motifs that are imperative in the DNA-polymerase interactions. One motif is located in the 8 kDa domain that interacts with downstream DNA and one motif is located in the thumb domain that interacts with the primer strand. Pol β, encoded by POLB gene, is required for short-patch base excision repair, a DNA repair pathway that is essential for repairing alkylated or oxidized bases as well as abasic sites. Pol λ and Pol μ, encoded by the POLL and POLM genes respectively, are involved in non-homologous end-joining, a mechanism for rejoining DNA double-strand breaks due to hydrogen peroxide and ionizing radiation, respectively. TdT is expressed only in lymphoid tissue, and adds "n nucleotides" to double-strand breaks formed during V(D)J recombination to promote immunological diversity.

Polymerases α, δ and ε (alpha, delta, and epsilon)

Pol α (alpha), Pol δ (delta), and Pol ε (epsilon) are members of Family B Polymerases and are the main polymerases involved with nuclear DNA replication. Pol α complex (pol α-DNA primase complex) consists of four subunits: the catalytic subunit POLA1, the regulatory subunit POLA2, and the small and the large primase subunits PRIM1 and PRIM2 respectively. Once primase has created the RNA primer, Pol α starts replication, elongating the primer by ~20 nucleotides. Due to its high processivity, Pol δ takes over the leading and lagging strand synthesis from Pol α. Pol δ is expressed by the genes POLD1, which creates the catalytic subunit, and POLD2, POLD3, and POLD4, which create the other subunits that interact with Proliferating Cell Nuclear Antigen (PCNA), a DNA clamp that allows Pol δ to possess processivity. Pol ε is encoded by the POLE1 (catalytic subunit), POLE2, and POLE3 genes. It has been reported that the function of Pol ε is to extend the leading strand during replication, while Pol δ primarily replicates the lagging strand; however, recent evidence suggested that Pol δ might have a role in replicating the leading strand of DNA as well. Pol ε's C-terminus "polymerase relic" region, despite being unnecessary for polymerase activity, is thought to be essential to cell vitality. The C-terminus region is thought to provide a checkpoint before entering anaphase, provide stability to the holoenzyme, and add proteins to the holoenzyme necessary for initiation of replication. Pol ε has a larger "palm" domain that provides high processivity independently of PCNA.

Compared to other Family B polymerases, the DEDD exonuclease family responsible for proofreading is inactivated in Pol α. Pol ε is unique in that it has two zinc finger domains and an inactive copy of another family B polymerase in its C-terminal. The presence of this zinc finger has implications in the origins of Eukaryota, which in this case is placed into the Asgard group with archaeal B3 polymerase.

Polymerases η, ι and κ (eta, iota, and kappa)

Pol η (eta), Pol ι (iota), and Pol κ (kappa) are Family Y DNA polymerases involved in DNA repair by translesion synthesis and encoded by genes POLH, POLI, and POLK respectively. Members of Family Y have five common motifs to aid in binding the substrate and primer terminus, and they all include the typical right hand thumb, palm and finger domains with added domains like little finger (LF), polymerase-associated domain (PAD), or wrist. The active site, however, differs between family members due to the different lesions being repaired. Polymerases in Family Y are low-fidelity polymerases, but have been proven to do more good than harm, as mutations that affect the polymerase can cause various diseases, such as skin cancer and Xeroderma Pigmentosum Variant (XPV). The importance of these polymerases is evidenced by the fact that the gene encoding DNA polymerase η is referred to as XPV, because loss of this gene results in the disease Xeroderma Pigmentosum Variant. Pol η is particularly important for allowing accurate translesion synthesis of DNA damage resulting from ultraviolet radiation. The functionality of Pol κ is not completely understood, but researchers have found two probable functions. Pol κ is thought to act as an extender or an inserter of a specific base at certain DNA lesions. All three translesion synthesis polymerases, along with Rev1, are recruited to damaged lesions via stalled replicative DNA polymerases. There are two pathways of damage repair, leading researchers to conclude that the chosen pathway depends on which strand contains the damage, the leading or lagging strand.

Polymerases Rev1 and ζ (zeta)

Pol ζ, another B family polymerase, is made of two subunits: Rev3, the catalytic subunit, and Rev7 (MAD2L2), which increases the catalytic function of the polymerase. It is involved in translesion synthesis. Pol ζ lacks 3' to 5' exonuclease activity and is unique in that it can extend primers with terminal mismatches. Rev1 has three regions of interest (the BRCT domain, the ubiquitin-binding domain, and the C-terminal domain) and has dCMP transferase ability, which adds deoxycytidine opposite lesions that would stall replicative polymerases Pol δ and Pol ε. These stalled polymerases activate ubiquitin complexes that in turn disassociate replication polymerases and recruit Pol ζ and Rev1. Together Pol ζ and Rev1 add deoxycytidine and Pol ζ extends past the lesion. Through a yet undetermined process, Pol ζ disassociates and replication polymerases reassociate and continue replication. Pol ζ and Rev1 are not required for replication, but loss of the REV3 gene in budding yeast can cause increased sensitivity to DNA-damaging agents due to collapse of replication forks where replication polymerases have stalled.

Telomerase

Telomerase is a ribonucleoprotein recruited to replicate the ends of linear chromosomes, the telomeres, because normal DNA polymerase cannot replicate them. The single-strand 3' overhang of the double-strand chromosome with the sequence 5'-TTAGGG-3' recruits telomerase. Telomerase acts like other DNA polymerases by extending the 3' end, but, unlike other DNA polymerases, telomerase does not require a separate template strand, because it carries its own RNA template. The TERT subunit, an example of a reverse transcriptase, uses the RNA subunit to form the primer–template junction that allows telomerase to extend the 3' end of chromosome ends. The gradual decrease in size of telomeres as the result of many replications over a lifetime is thought to be associated with the effects of aging.
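The repeat-addition idea can be sketched in a few lines. This is a toy illustration, not the real enzyme's mechanism in detail: the function `extend_telomere` is hypothetical, and the six-base RNA template `AAUCCC` (written 3'→5') is an assumption derived here only by base pairing with the 5'-TTAGGG-3' repeat given above.

```python
# Pairing used when DNA is copied from an RNA template (reverse transcription).
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}


def extend_telomere(overhang_5_to_3, rna_template_3_to_5="AAUCCC", rounds=2):
    """Append `rounds` telomeric repeats to the 3' end of the overhang.

    Each round copies the RNA template (read 3'->5') into DNA (written
    5'->3'), standing in for one cycle of synthesis and repositioning.
    """
    repeat = "".join(RNA_TO_DNA[base] for base in rna_template_3_to_5)
    return overhang_5_to_3 + repeat * rounds


print(extend_telomere("TTAGGG", rounds=2))  # TTAGGGTTAGGGTTAGGG
```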

Polymerases γ, θ and ν (gamma, theta and nu)

Pol γ (gamma), Pol θ (theta), and Pol ν (nu) are Family A polymerases. Pol γ, encoded by the POLG gene, is the only mtDNA polymerase and therefore replicates, repairs, and has proofreading 3'–5' exonuclease and 5' dRP lyase activities. Any mutation that leads to limited or non-functioning Pol γ has a significant effect on mtDNA and is the most common cause of autosomal inherited mitochondrial disorders. Pol γ contains a C-terminus polymerase domain and an N-terminus 3'–5' exonuclease domain that are connected via the linker region, which binds the accessory subunit. The accessory subunit binds DNA and is required for processivity of Pol γ. Point mutation A467T in the linker region is responsible for more than one-third of all Pol γ-associated mitochondrial disorders. While many homologs of Pol θ, encoded by the POLQ gene, are found in eukaryotes, its function is not clearly understood. The sequence of amino acids in the C-terminus is what classifies Pol θ as Family A polymerase, although the error rate for Pol θ is more closely related to Family Y polymerases. Pol θ extends mismatched primer termini and can bypass abasic sites by adding a nucleotide. It also has Deoxyribophosphodiesterase (dRPase) activity in the polymerase domain and can show ATPase activity in close proximity to ssDNA. Pol ν (nu) is considered to be the least effective of the polymerase enzymes. However, DNA polymerase nu plays an active role in homology repair during cellular responses to crosslinks, fulfilling its role in a complex with helicase.

Plants use two Family A polymerases to copy both the mitochondrial and plastid genomes. They are more similar to bacterial Pol I than they are to mammalian Pol γ.

Reverse transcriptase

Retroviruses encode an unusual DNA polymerase called reverse transcriptase, which is an RNA-dependent DNA polymerase (RdDp) that synthesizes DNA from a template of RNA. The reverse transcriptase family contains both DNA polymerase functionality and RNase H functionality, which degrades RNA base-paired to DNA. An example of a retrovirus is HIV.
