A Medley of Potpourri

Tuesday, February 7, 2023

Transcription factor

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Transcription_factor

Transcription factor glossary
gene expression – the process by which information from a gene is used in the synthesis of a functional gene product such as a protein
transcription – the process of making messenger RNA (mRNA) from a DNA template by RNA polymerase
transcription factor – a protein that binds to DNA and regulates gene expression by promoting or suppressing transcription
transcriptional regulation – controlling the rate of gene transcription, for example by helping or hindering RNA polymerase binding to DNA
upregulation, activation, or promotion – increase the rate of gene transcription
downregulation, repression, or suppression – decrease the rate of gene transcription
coactivator – a protein (or a small molecule) that works with transcription factors to increase the rate of gene transcription
corepressor – a protein (or a small molecule) that works with transcription factors to decrease the rate of gene transcription
response element – a specific sequence of DNA that a transcription factor binds to

Illustration of an activator

In molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. The function of TFs is to regulate—turn on and off—genes in order to make sure that they are expressed in the desired cells at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division, cell growth, and cell death throughout life; cell migration and organization (body plan) during embryonic development; and intermittently in response to signals from outside the cell, such as a hormone. There are 1500-1600 TFs in the human genome. Transcription factors are members of the proteome as well as regulome.

TFs work alone or with other proteins in a complex, by promoting (as an activator), or blocking (as a repressor) the recruitment of RNA polymerase (the enzyme that performs the transcription of genetic information from DNA to RNA) to specific genes.

A defining feature of TFs is that they contain at least one DNA-binding domain (DBD), which attaches to a specific sequence of DNA adjacent to the genes that they regulate. TFs are grouped into classes based on their DBDs. Other proteins such as coactivators, chromatin remodelers, histone acetyltransferases, histone deacetylases, kinases, and methylases are also essential to gene regulation, but lack DNA-binding domains, and therefore are not TFs.

TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can be potentially targeted toward them.

Number

Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene.

There are approximately 2800 proteins in the human genome that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors, though other studies indicate it to be a smaller number. Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 1600 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development.
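The combinatorial argument above can be checked with a little arithmetic. The sketch below counts how many distinct unordered sets of transcription factors exist for small set sizes, taking the figure of 1600 transcription factors from the text. The gene count of about 20,000 is an assumed round number that does not appear in the text, and the function name is illustrative only.

```python
import math

N_TFS = 1600       # presumed human transcription factors (from the text)
N_GENES = 20_000   # ASSUMPTION: rough number of protein-coding genes

def tf_combinations(n_tfs: int, k: int) -> int:
    """Number of distinct unordered sets of k transcription factors."""
    return math.comb(n_tfs, k)

for k in (1, 2, 3):
    combos = tf_combinations(N_TFS, k)
    # Compare the number of possible TF sets with the number of genes.
    print(f"sets of {k}: {combos:,} (exceeds gene count: {combos > N_GENES})")
```

Under these assumptions, pairs of factors alone already outnumber genes, which is the sense in which a modest number of factors can give each gene a distinct regulatory combination. The count ignores binding-site order, spacing, and factor concentration, all of which would add further combinations.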

Mechanism

Transcription factors bind to either enhancer or promoter regions of DNA adjacent to the genes that they regulate. Depending on the transcription factor, the transcription of the adjacent gene is either up- or down-regulated. Transcription factors use a variety of mechanisms for the regulation of gene expression. These mechanisms include:

stabilize or block the binding of RNA polymerase to DNA
catalyze the acetylation or deacetylation of histone proteins. The transcription factor can either do this directly or recruit other proteins with this catalytic activity. Many transcription factors use one or the other of two opposing mechanisms to regulate transcription:
- histone acetyltransferase (HAT) activity – acetylates histone proteins, which weakens the association of DNA with histones, making the DNA more accessible to transcription and thereby up-regulating transcription
- histone deacetylase (HDAC) activity – deacetylates histone proteins, which strengthens the association of DNA with histones, making the DNA less accessible to transcription and thereby down-regulating transcription
recruit coactivator or corepressor proteins to the transcription factor-DNA complex

Function

Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:

Basal transcriptional regulation

In eukaryotes, an important class of transcription factors called general transcription factors (GTFs) is necessary for transcription to occur. Many of these GTFs do not actually bind DNA, but rather are part of the large transcription preinitiation complex that interacts with RNA polymerase directly. The most common GTFs are TFIIA, TFIIB, TFIID (see also TATA binding protein), TFIIE, TFIIF, and TFIIH. The preinitiation complex binds to promoter regions of DNA upstream of the gene that it regulates.

Differential enhancement of transcription

Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.

Development

Many transcription factors in multicellular organisms are involved in development. Responding to stimuli, these transcription factors turn on or off the transcription of the appropriate genes, which, in turn, allows for changes in cell morphology or activities needed for cell fate determination and cellular differentiation. The Hox transcription factor family, for example, is important for proper body pattern formation in organisms as diverse as fruit flies and humans. Another example is the transcription factor encoded by the sex-determining region Y (SRY) gene, which plays a major role in determining sex in humans.

Response to intercellular signals

Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade. Estrogen signaling is an example of a fairly short signaling cascade that involves the estrogen receptor transcription factor: Estrogen is secreted by tissues such as the ovaries and placenta, crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm. The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites, changing the transcriptional regulation of the associated genes.

Response to environment

Not only do transcription factors act downstream of signaling cascades related to biological stimuli but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures, hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments, and sterol regulatory element binding protein (SREBP), which helps maintain proper lipid levels in the cell.

Cell cycle control

Many transcription factors, especially some that are proto-oncogenes or tumor suppressors, help regulate the cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells. One example is the Myc oncogene, which has important roles in cell growth and apoptosis.

Pathogenesis

Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example is the transcription activator-like effectors (TAL effectors) secreted by Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. TAL effectors contain a central repeat region in which there is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector's target site. This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell.

Regulation

It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:

Synthesis

Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: If the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell.
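The negative feedback loop described above can be illustrated with a toy simulation. The sketch below is not taken from the source: it assumes a simple rate equation in which the transcription factor is produced at rate beta, degraded at rate alpha times its level, and (when feedback is on) represses its own production through a Hill-type term. All parameter names and values are hypothetical.

```python
def simulate_tf_level(beta, alpha, k_repress=None, n=2, dt=0.01, steps=5000):
    """Euler-integrate dx/dt = production(x) - alpha * x, starting from x = 0.

    With k_repress=None production is constant (no feedback). Otherwise the
    factor represses its own gene via an ASSUMED Hill term:
    production = beta / (1 + (x / k_repress) ** n).
    """
    x = 0.0
    for _ in range(steps):
        if k_repress is None:
            production = beta
        else:
            production = beta / (1.0 + (x / k_repress) ** n)
        x += dt * (production - alpha * x)
    return x

unregulated = simulate_tf_level(beta=10.0, alpha=1.0)
autorepressed = simulate_tf_level(beta=10.0, alpha=1.0, k_repress=1.0)
print(f"no feedback: {unregulated:.2f}, negative feedback: {autorepressed:.2f}")
```

With these illustrative parameters the unregulated level settles near 10, while the self-repressed level settles near 2, which is the sense in which autorepression keeps a transcription factor at a low level.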

Nuclear localization

In eukaryotes, transcription factors (like most proteins) are transcribed in the nucleus but are then translated in the cell's cytoplasm. Many proteins that are active in the nucleus contain nuclear localization signals that direct them to the nucleus. But, for many transcription factors, this is a key point in their regulation. Important classes of transcription factors such as some nuclear receptors must first bind a ligand while in the cytoplasm before they can relocate to the nucleus.

Activation

Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:

ligand binding – Not only is ligand binding able to influence where a transcription factor is located within a cell but ligand binding can also affect whether the transcription factor is in an active state and capable of binding DNA or other cofactors (see, for example, nuclear receptors).
phosphorylation – Many transcription factors such as STAT proteins must be phosphorylated before they can bind DNA.
interaction with other transcription factors (e.g., homo- or hetero-dimerization) or coregulatory proteins

Accessibility of DNA-binding site

In eukaryotes, DNA is organized with the help of histones into compact particles called nucleosomes, where sequences of about 147 DNA base pairs make ~1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called pioneer factors, are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as chromatin remodelers. Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same gene.

Availability of other cofactors/transcription factors

Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of the preinitiation complex and RNA polymerase. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary. Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB, which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues.

Interaction with methylated cytosine

Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine is followed by guanine in the 5' to 3' DNA sequence, a CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription, while methylation of CpGs in the body of a gene increases expression. TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene.
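As a small illustration of the CpG definition above (a cytosine followed by a guanine when read 5' to 3'), the following sketch lists the positions of CpG sites in a DNA string. The function name and the example sequence are made up for illustration.

```python
from typing import List

def cpg_positions(seq: str) -> List[int]:
    """Return 0-based indices of each C that is immediately followed by G.

    The sequence is assumed to be written 5' to 3' on one strand.
    """
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

example = "ACGTCGCG"  # hypothetical sequence
print(cpg_positions(example))  # → [1, 4, 6]
```

Whether a given CpG is methylated cannot be read from the sequence alone; that requires experimental data such as bisulfite sequencing.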

The DNA binding sites of 519 transcription factors were evaluated. Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained a methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located.
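The counts in the preceding paragraph can be tallied to confirm that the five categories cover all 519 factors and to reproduce the quoted percentages. The category labels below are shorthand invented for this sketch; the counts are those given in the text.

```python
# Counts quoted in the text; the dictionary keys are illustrative shorthand.
counts = {
    "no CpG in binding site": 169,
    "CpG present, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
    "binding enhanced by methylated CpG": 175,
    "effect depends on CpG position": 25,
}

total = sum(counts.values())
percentages = {label: round(100 * n / total) for label, n in counts.items()}

print(total)  # → 519
for label, pct in percentages.items():
    print(f"{pct:>3}%  {label}")
```

The rounded percentages add up to 101 rather than 100, which is only a rounding effect.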

TET enzymes do not specifically bind to methylcytosine except when recruited (see DNA demethylation). Multiple transcription factors important in cell differentiation and lineage specification, including NANOG, SALL4A, WT1, EBF1, PU.1, and E2A, have been shown to recruit TET enzymes to specific genomic loci (primarily enhancers) to act on methylcytosine (mC) and convert it to hydroxymethylcytosine (hmC), in most cases marking it for subsequent complete demethylation to cytosine. TET-mediated conversion of mC to hmC appears to disrupt the binding of 5mC-binding proteins including MECP2 and MBD (Methyl-CpG-binding domain) proteins, facilitating nucleosome remodeling and the binding of transcription factors, thereby activating transcription of those genes. EGR1 is an important transcription factor in memory formation. It has an essential role in brain neuron epigenetic reprogramming. The transcription factor EGR1 recruits the TET1 protein that initiates a pathway of DNA demethylation. EGR1, together with TET1, is employed in programming the distribution of methylation sites on brain DNA during brain development and in learning (see Epigenetics in learning and memory).

Structure

Schematic diagram of the amino acid sequence (amino terminus to the left and carboxylic acid terminus to the right) of a prototypical transcription factor that contains (1) a DNA-binding domain (DBD), (2) a signal-sensing domain (SSD), and (3) an activation domain (AD). The order of placement and the number of domains may differ in various types of transcription factors. In addition, the transactivation and signal-sensing functions are frequently contained within the same domain.

Domain architecture example: Lactose Repressor (LacI). The N-terminal DNA binding domain (labeled) of the lac repressor binds its target DNA sequence (gold) in the major groove using a helix-turn-helix motif. Effector molecule binding (green) occurs in the regulatory domain (labeled). This triggers an allosteric response mediated by the linker region (labeled).

Transcription factors are modular in structure and contain the following domains:

DNA-binding domain (DBD), which attaches to specific sequences of DNA (enhancer or promoter sequences) adjacent to regulated genes. DNA sequences that bind transcription factors are often referred to as response elements.
Activation domain (AD), which contains binding sites for other proteins such as transcription coregulators. These binding sites are frequently referred to as activation functions (AFs). The domain is also called a transactivation domain or trans-activating domain (TAD), not to be confused with a topologically associating domain, which shares the abbreviation TAD.
An optional signal-sensing domain (SSD) (e.g., a ligand-binding domain), which senses external signals and, in response, transmits these signals to the rest of the transcription complex, resulting in up- or down-regulation of gene expression. Also, the DBD and signal-sensing domains may reside on separate proteins that associate within the transcription complex to regulate gene expression.

DNA-binding domain

DNA contacts of different types of DNA-binding domains of transcription factors

The portion (domain) of the transcription factor that binds DNA is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors:

Family	InterPro	Pfam	SCOP
basic helix-loop-helix	InterPro: IPR001092	Pfam PF00010	SCOP 47460
basic-leucine zipper (bZIP)	InterPro: IPR004827	Pfam PF00170	SCOP 57959
C-terminal effector domain of the bipartite response regulators	InterPro: IPR001789	Pfam PF00072	SCOP 46894
AP2/ERF/GCC box	InterPro: IPR001471	Pfam PF00847	SCOP 54176
helix-turn-helix
homeodomain proteins, which are encoded by homeobox genes, are transcription factors. Homeodomain proteins play critical roles in the regulation of development.	InterPro: IPR009057	Pfam PF00046	SCOP 46689
lambda repressor-like	InterPro: IPR010982		SCOP 47413
srf-like (serum response factor)	InterPro: IPR002100	Pfam PF00319	SCOP 55455
paired box
winged helix	InterPro: IPR013196	Pfam PF08279	SCOP 46785
zinc fingers
* multi-domain Cys₂His₂ zinc fingers	InterPro: IPR007087	Pfam PF00096	SCOP 57667
* Zn₂/Cys₆			SCOP 57701
* Zn₂/Cys₈ nuclear receptor zinc finger	InterPro: IPR001628	Pfam PF00105	SCOP 57716

Response elements

The DNA sequence that a transcription factor binds to is called a transcription factor-binding site or response element.

Transcription factors interact with their binding sites using a combination of electrostatic (of which hydrogen bonds are a special case) and van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence-specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction.

For example, although the consensus binding site for the TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA.
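One crude way to express "similar to the consensus" is to count mismatched positions. The sketch below compares candidate sites with the TBP consensus from the text and scans a sequence for windows within a mismatch budget. Mismatch counting is a simplification chosen for this example, since real binding models usually weight each position differently (for instance with a position weight matrix). The function names and the scanned sequence are hypothetical.

```python
from typing import List, Tuple

CONSENSUS = "TATAAAA"  # TBP consensus binding site quoted in the text

def mismatches(site: str, consensus: str = CONSENSUS) -> int:
    """Count positions at which an equal-length site differs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a != b for a, b in zip(site.upper(), consensus))

def find_sites(seq: str, max_mismatches: int = 1,
               consensus: str = CONSENSUS) -> List[Tuple[int, int]]:
    """Return (start index, mismatch count) for each window within the budget."""
    k = len(consensus)
    hits = []
    for start in range(len(seq) - k + 1):
        d = mismatches(seq[start:start + k], consensus)
        if d <= max_mismatches:
            hits.append((start, d))
    return hits

print(mismatches("TATATAA"))      # → 1
print(mismatches("TATATAT"))      # → 2
print(find_sites("GGTATAAAAGG"))  # → [(2, 0)]
```

Under this measure the two alternative sites named in the text sit one and two mismatches from the consensus, although the mismatch count says nothing reliable about their actual binding strength.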

Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell. Other constraints, such as DNA accessibility in the cell or availability of cofactors may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence, it is still difficult to predict where a transcription factor will actually bind in a living cell.
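The claim that short binding sites occur by chance can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a random sequence in which the four bases are equally likely and independent, which real genomes are not, and a genome length of about three billion base pairs, a round figure not stated in the text. The function name is illustrative.

```python
def expected_chance_sites(genome_length: int, motif_length: int) -> float:
    """Expected exact matches of one fixed motif on one strand of random DNA.

    ASSUMPTION: bases are independent and each occurs with probability 1/4.
    """
    windows = genome_length - motif_length + 1  # possible start positions
    return windows * 0.25 ** motif_length

GENOME_LENGTH = 3_000_000_000  # ASSUMPTION: approximate human genome size in bp

for k in (7, 10, 15):
    print(f"{k}-mer: about {expected_chance_sites(GENOME_LENGTH, k):,.0f} chance matches")
```

Under these assumptions a 7-base site such as the TATA consensus would be expected well over a hundred thousand times by chance, which is consistent with the point that sequence alone cannot predict where a factor binds in a living cell.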

Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example tandem DBDs in the same transcription factor or through dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.

Clinical significance

Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.

Disorders

Due to their important roles in development, intercellular signaling, and cell cycle, some human diseases have been associated with mutations in transcription factors.

Many transcription factors are either tumor suppressors or oncogenes, and, thus, mutations or aberrant regulation of them is associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the NF-kappaB and AP-1 families, (2) the STAT family and (3) the steroid receptors.

Below are a few of the better-studied examples:

Condition	Description	Locus
Rett syndrome	Mutations in the MECP2 transcription factor are associated with Rett syndrome, a neurodevelopmental disorder.	Xq28
Diabetes	A rare form of diabetes called MODY (Maturity onset diabetes of the young) can be caused by mutations in hepatocyte nuclear factors (HNFs) or insulin promoter factor-1 (IPF1/Pdx1).	multiple
Developmental verbal dyspraxia	Mutations in the FOXP2 transcription factor are associated with developmental verbal dyspraxia, a disease in which individuals are unable to produce the finely coordinated movements required for speech.	7q31
Autoimmune diseases	Mutations in the FOXP3 transcription factor cause a rare form of autoimmune disease called IPEX.	Xp11.23-q13.3
Li-Fraumeni syndrome	Caused by mutations in the tumor suppressor p53.	17p13.1
Breast cancer	The STAT family is relevant to breast cancer.	multiple
Multiple cancers	The HOX family are involved in a variety of cancers.	multiple
Osteoarthritis	Mutation or reduced activity of SOX9

Potential drug targets

Approximately 10% of currently prescribed drugs directly target the nuclear receptor class of transcription factors. Examples include tamoxifen and bicalutamide for the treatment of breast and prostate cancer, respectively, and various types of anti-inflammatory and anabolic steroids. In addition, transcription factors are often indirectly modulated by drugs through signaling cascades. It might be possible to directly target other less-explored transcription factors such as NF-κB with drugs. Transcription factors outside the nuclear receptor family are thought to be more difficult to target with small molecule therapeutics since it is not clear that they are "druggable", but progress has been made on Pax2 and the Notch pathway.

Role in evolution

Gene duplications have played a crucial role in the evolution of species. This applies particularly to transcription factors. Once a transcription factor gene exists as duplicates, mutations can accumulate in one copy without negatively affecting the regulation of downstream targets. However, changes in the DNA-binding specificity of the single-copy Leafy transcription factor, which occurs in most land plants, have recently been elucidated. They show that a single-copy transcription factor can also undergo a change of specificity, through a promiscuous intermediate, without losing function. Similar mechanisms have been proposed in the context of all alternative phylogenetic hypotheses, and for the role of transcription factors in the evolution of all species.

Role in biocontrol activity

Transcription factors have a role in the stress resistance that is important for successful biocontrol activity. In Papiliotrema terrestris LS28, resistance to oxidative stress and alkaline pH sensing depend on the transcription factors Yap1 and Rim101. Studying them with molecular tools has improved the understanding of the genetic mechanisms underlying biocontrol activity, which supports disease management programs based on biological and integrated control.

Analysis

There are different technologies available to analyze transcription factors. On the genomic level, DNA-sequencing and database research are commonly used. The protein version of the transcription factor is detectable by using specific antibodies. The sample is detected on a western blot. By using electrophoretic mobility shift assay (EMSA), the activation profile of transcription factors can be detected. A multiplex approach for activation profiling is a TF chip system where several different transcription factors can be detected in parallel.

The most commonly used method for identifying transcription factor binding sites is chromatin immunoprecipitation (ChIP). This technique relies on chemical fixation of chromatin with formaldehyde, followed by co-precipitation of DNA and the transcription factor of interest using an antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing (ChIP-seq) to determine transcription factor binding sites. If no antibody is available for the protein of interest, DamID may be a convenient alternative.

Monday, February 6, 2023

Cis-regulatory element

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Cis-regulatory_element

Cis-regulatory elements (CREs) or Cis-regulatory modules (CRMs) are regions of non-coding DNA which regulate the transcription of neighboring genes. CREs are vital components of genetic regulatory networks, which in turn control morphogenesis, the development of anatomy, and other aspects of embryonic development, studied in evolutionary developmental biology.

CREs are found in the vicinity of the genes that they regulate. CREs typically regulate gene transcription by binding to transcription factors. A single transcription factor may bind to many CREs, and hence control the expression of many genes (pleiotropy). The Latin prefix cis means "on this side", i.e. on the same molecule of DNA as the gene(s) to be transcribed.

CRMs are stretches of DNA, usually 100–1000 DNA base pairs in length, where a number of transcription factors can bind and regulate the expression and transcription rates of nearby genes. They are labeled as cis because they are typically located on the same DNA strand as the genes they control, as opposed to trans, which refers to effects on genes not located on the same strand or farther away, such as transcription factors. One cis-regulatory element can regulate several genes, and conversely, one gene can have several cis-regulatory modules. Cis-regulatory modules carry out their function by integrating the active transcription factors and the associated co-factors at a specific time and place in the cell where this information is read and an output is given.

CREs are often, but not always, upstream of the transcription start site. CREs contrast with trans-regulatory elements (TREs). TREs code for transcription factors.

Overview

Diagram showing at which stages in the DNA-mRNA-protein pathway expression can be controlled

The genome of an organism contains anywhere from a few hundred to thousands of different genes, each encoding one or more products. For numerous reasons, including organizational maintenance, energy conservation, and generating phenotypic variance, it is important that genes are only expressed when they are needed. The most efficient way for an organism to regulate gene expression is at the transcriptional level. CREs function to control transcription by acting nearby or within a gene. The best-characterized types of CREs are enhancers and promoters. Both of these sequence elements are structural regions of DNA that serve as transcriptional regulators.

Cis-regulatory modules are one of several types of functional regulatory elements. Regulatory elements are binding sites for transcription factors, which are involved in gene regulation. Cis-regulatory modules perform a large amount of developmental information processing. Cis-regulatory modules are non-random clusters at their specified target site that contain transcription factor binding sites.

The original definition presented cis-regulatory modules as enhancers of cis-acting DNA, which increased the rate of transcription from a linked promoter. However, this definition has changed to define cis-regulatory modules as DNA sequences with transcription factor binding sites that are clustered into modular structures, including, but not limited to, locus control regions, promoters, enhancers, silencers, boundary control elements and other modulators.

Cis-regulatory modules can be divided into three classes: enhancers, which regulate gene expression positively; insulators, which work indirectly by interacting with other nearby cis-regulatory modules; and silencers, which turn off expression of genes.

The design of cis-regulatory modules is such that transcription factors and epigenetic modifications serve as inputs, and the output of the module is the command given to the transcription machinery, which in turn determines the rate of gene transcription or whether it is turned on or off. There are two types of transcription factor inputs: those that determine when the target gene is to be expressed and those that serve as functional drivers, which come into play only during specific situations during development. These inputs can come from different time points, can represent different signal ligands, or can come from different domains or lineages of cells. However, a lot still remains unknown.

Additionally, the regulation of chromatin structure and nuclear organization also plays a role in determining and controlling the function of cis-regulatory modules. Thus gene-regulation functions (GRFs) provide a unique characteristic of a cis-regulatory module (CRM), relating the concentrations of transcription factors (input) to the promoter activities (output). The challenge is to predict GRFs, and it remains unsolved. In general, gene-regulation functions do not use Boolean logic, although in some cases the Boolean approximation is still very useful.
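To make the idea of a gene-regulation function concrete, the sketch below maps two transcription factor concentrations to a promoter activity between 0 and 1. The Hill-type form and the multiplication of the two terms are modelling assumptions introduced here for illustration; they are not taken from the source, and measured GRFs can look quite different. All names and parameter values are hypothetical.

```python
def hill_activation(conc: float, k_half: float = 1.0, n: float = 2.0) -> float:
    """Fractional activation by one activator (ASSUMED Hill form).

    k_half is the concentration giving half-maximal activation and
    n controls how steep the response is.
    """
    return conc ** n / (k_half ** n + conc ** n)

def grf_two_activators(a: float, b: float) -> float:
    """Hypothetical GRF: promoter activity needs both activators (AND-like)."""
    return hill_activation(a) * hill_activation(b)

for a, b in [(0.0, 5.0), (1.0, 1.0), (5.0, 5.0)]:
    print(f"[A]={a}, [B]={b} -> activity {grf_two_activators(a, b):.3f}")
```

The output is graded: intermediate concentrations give intermediate activities rather than a strict on or off value, which is why a Boolean description is at best an approximation of such a function.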

The Boolean logic assumption

Within the assumption of Boolean logic, the principles guiding the operation of these modules include the design of the module, which determines the regulatory function. In relation to development, these modules can generate both positive and negative outputs. The output of each module is a product of the various operations performed on it. Common operations include the OR gate, in which an output is given when either input is present, and the AND gate, in which two different regulatory factors are both necessary for a positive output. "Toggle switches" are a further design: when the signal ligand is absent while the transcription factor is present, the transcription factor acts as a dominant repressor. However, once the signal ligand is present, the transcription factor's role as repressor is eliminated and transcription can occur.
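Under the Boolean assumption, the three designs just described can be written as small truth functions. This is only a sketch: inputs are reduced to present or absent, and the behaviour of the toggle switch when the transcription factor itself is absent is not specified in the text, so the choice made below (no repressor, therefore transcription on) is an assumption.

```python
def or_gate(factor_a: bool, factor_b: bool) -> bool:
    """Output is given when either regulatory input is present."""
    return factor_a or factor_b

def and_gate(factor_a: bool, factor_b: bool) -> bool:
    """Output is given only when both regulatory inputs are present."""
    return factor_a and factor_b

def toggle_switch(tf_present: bool, ligand_present: bool) -> bool:
    """The factor represses unless the signal ligand is present.

    ASSUMPTION: with no factor bound there is no repression, so the output is on.
    """
    return not (tf_present and not ligand_present)

for a in (False, True):
    for b in (False, True):
        print(a, b, or_gate(a, b), and_gate(a, b), toggle_switch(a, b))
```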

Other Boolean logic operations can occur as well, such as sequence-specific transcriptional repressors, which, when they bind to the cis-regulatory module, lead to an output of zero. Additionally, besides influence from the different logic operations, the output of a cis-regulatory module will also be influenced by prior events. Cis-regulatory modules must also interact with other regulatory elements. For the most part, even with the presence of functional overlap between cis-regulatory modules of a gene, the modules' inputs and outputs tend not to be the same.

While the assumption of Boolean logic is important for systems biology, detailed studies show that in general the logic of gene regulation is not Boolean. This means, for example, that in the case of a cis-regulatory module regulated by two transcription factors, experimentally determined gene-regulation functions cannot be described by any of the 16 possible Boolean functions of two variables. Non-Boolean extensions of the gene-regulatory logic have been proposed to correct for this issue.
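As a concrete illustration of this count, the sketch below enumerates all 16 Boolean functions of two binary inputs as truth tables and checks a graded response against them. The function names and the saturating form of `graded_grf` (including its parameter `k` and the weight 0.5 on the second input) are hypothetical choices made for illustration, not a measured gene-regulation function.

```python
from itertools import product

# The four possible on/off combinations of two transcription factors.
INPUTS = list(product([0, 1], repeat=2))  # (0,0), (0,1), (1,0), (1,1)

def boolean_functions():
    """Return every truth table that maps the four input pairs to 0 or 1."""
    return [dict(zip(INPUTS, outputs)) for outputs in product([0, 1], repeat=4)]

def graded_grf(a, b, k=0.5):
    """Hypothetical non-Boolean GRF: saturating promoter activity in [0, 1)."""
    x = a + 0.5 * b
    return x / (k + x)

tables = boolean_functions()
and_gate = {(a, b): a & b for a, b in INPUTS}
or_gate = {(a, b): a | b for a, b in INPUTS}
graded = {(a, b): graded_grf(a, b) for a, b in INPUTS}

print(len(tables))          # number of Boolean functions of two variables
print(and_gate in tables)   # AND is one of them
print(graded in tables)     # the graded response is not
```

AND and OR both appear among the 16 tables, whereas the graded response takes intermediate values for some input combinations and therefore matches none of them.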

Classification

Cis-regulatory modules can be characterized by the information processing that they encode and the organization of their transcription factor binding sites. Additionally, cis-regulatory modules are also characterized by the way they affect the probability, proportion, and rate of transcription. Highly cooperative and coordinated cis-regulatory modules are classified as enhanceosomes. The architecture and the arrangement of the transcription factor binding sites are critical because disruption of the arrangement could cancel out the function. Functionally flexible cis-regulatory modules are called billboards. Their transcriptional output is the summed effect of the bound transcription factors. Enhancers affect the probability of a gene being activated, but have little or no effect on rate. The binary response model acts like an on/off switch for transcription. This model will increase or decrease the number of cells that transcribe a gene, but it does not affect the rate of transcription. The rheostatic response model describes cis-regulatory modules as regulators of the initiation rate of transcription of their associated gene.
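The difference between the binary and rheostatic response models can be sketched with a toy cell population. In this hypothetical simulation, `p_on` (the probability that a cell transcribes the gene at all) and `rate` (the initiation rate in transcribing cells) are assumed parameters: a binary response changes only `p_on`, while a rheostatic response changes only `rate`.

```python
import random

def simulate_population(n_cells, p_on, rate, seed=0):
    """Return one transcription rate per cell (0.0 for silent cells)."""
    rng = random.Random(seed)
    return [rate if rng.random() < p_on else 0.0 for _ in range(n_cells)]

def fraction_on(cells):
    """Fraction of cells that transcribe the gene at all."""
    return sum(1 for c in cells if c > 0) / len(cells)

def rate_in_active_cells(cells):
    """Mean transcription rate among transcribing cells only."""
    active = [c for c in cells if c > 0]
    return sum(active) / len(active)

baseline = simulate_population(10_000, p_on=0.2, rate=1.0)
binary = simulate_population(10_000, p_on=0.4, rate=1.0)      # more cells switch on
rheostatic = simulate_population(10_000, p_on=0.2, rate=2.0)  # same cells, higher rate

for name, cells in [("baseline", baseline), ("binary", binary), ("rheostatic", rheostatic)]:
    print(name, fraction_on(cells), rate_in_active_cells(cells))
```

Both changes raise the population-average expression, but only measurements on individual cells can tell the two models apart.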

Promoter

Promoters are CREs consisting of relatively short sequences of DNA which include the site where transcription is initiated and the region approximately 35 base pairs (bp) upstream or downstream from the initiation site. In eukaryotes, promoters usually have the following four components: the TATA box, a TFIIB recognition site, an initiator, and the downstream core promoter element. It has been found that a single gene can contain multiple promoter sites. In order to initiate transcription of the downstream gene, a host of DNA-binding proteins called transcription factors (TFs) must bind sequentially to this region. Only once this region has been bound with the appropriate set of TFs, and in the proper order, can RNA polymerase bind and begin transcribing the gene.

Enhancers

Enhancers are CREs that influence (enhance) the transcription of genes on the same molecule of DNA and can be found upstream, downstream, within the introns, or even relatively far away from the gene they regulate. Multiple enhancers can act in a coordinated fashion to regulate transcription of one gene. A number of genome-wide sequencing projects have revealed that enhancers are often transcribed to long non-coding RNA (lncRNA) or enhancer RNA (eRNA), whose changes in levels frequently correlate with those of the target gene mRNA.

Silencers

Silencers are CREs that can bind transcription regulation factors (proteins) called repressors, thereby preventing transcription of a gene. The term "silencer" can also refer to a region in the 3' untranslated region of messenger RNA, that binds proteins which suppress translation of that mRNA molecule, but this usage is distinct from its use in describing a CRE.

Operators

Operators are CREs in prokaryotes and some eukaryotes that exist within operons, where they can bind proteins called repressors to affect transcription.

Evolutionary role

CREs have an important evolutionary role. The coding regions of genes are often well conserved among organisms; yet different organisms display marked phenotypic diversity. It has been found that polymorphisms occurring within non-coding sequences have a profound effect on phenotype by altering gene expression. Mutations arising within a CRE can generate expression variance by changing the way TFs bind. Tighter or looser binding of regulatory proteins will lead to up- or down-regulated transcription.

Cis-regulatory module in gene regulatory network

The function of a gene regulatory network depends on the architecture of the nodes, whose function is dependent on the multiple cis-regulatory modules. The layout of cis-regulatory modules can provide enough information to generate spatial and temporal patterns of gene expression. During development, each domain of gene expression, where each domain represents a different spatial region of the embryo, will be under the control of different cis-regulatory modules. The design of regulatory modules helps in producing feedback, feed-forward, and cross-regulatory loops.

Mode of action

Cis-regulatory modules can regulate their target genes over large distances. Several models have been proposed to describe the way that these modules may communicate with their target gene promoter. These include the DNA scanning model, the DNA sequence looping model and the facilitated tracking model. In the DNA scanning model, the transcription factor and cofactor complex forms at the cis-regulatory module and then continues to move along the DNA sequence until it finds the target gene promoter. In the looping model, the transcription factor binds to the cis-regulatory module, which then causes the looping of the DNA sequence and allows for the interaction with the target gene promoter. The transcription factor-cis-regulatory module complex causes the looping of the DNA sequence slowly towards the target promoter and forms a stable looped configuration. The facilitated tracking model combines parts of the two previous models.

Identification and computational prediction

Besides experimentally determining CRMs, there are various bioinformatics algorithms for predicting them. Most algorithms try to search for significant combinations of transcription factor binding sites (DNA binding sites) in promoter sequences of co-expressed genes. More advanced methods combine the search for significant motifs with correlation in gene expression datasets between transcription factors and target genes. Both methods have been implemented, for example, in the ModuleMaster. Other programs created for the identification and prediction of cis-regulatory modules include:

INSECT 2.0 is a web server that allows users to search for cis-regulatory modules in a genome-wide manner. The program relies on the definition of strict restrictions among the transcription factor binding sites (TFBSs) that compose the module in order to decrease the false-positive rate. INSECT is designed to be user-friendly: it allows automatic retrieval of sequences and offers several visualizations and links to third-party tools to help users find the instances that are most likely to be true regulatory sites. The INSECT 2.0 algorithm and the theory behind it have been published previously.

Stubb uses hidden Markov models to identify statistically significant clusters of transcription factor combinations. It also uses a second related genome to improve the prediction accuracy of the model.

Bayesian Networks use an algorithm that combines site predictions and tissue-specific expression data for transcription factors and target genes of interest. This model also uses regression trees to depict the relationship between the identified cis-regulatory module and the possible binding set of transcription factors.

CRÈME examines clusters of target sites for transcription factors of interest. This program uses a database of confirmed transcription factor binding sites that were annotated across the human genome. A search algorithm is applied to the data set to identify possible combinations of transcription factors which have binding sites that are close to the promoter of the gene set of interest. The possible cis-regulatory modules are then statistically analyzed and the significant combinations are graphically represented.

Active cis-regulatory modules in a genomic sequence have been difficult to identify. Problems arise because scientists often have only a small set of known transcription factors to work with, which makes it harder to identify statistically significant clusters of transcription factor binding sites. Additionally, high costs limit the use of large whole-genome tiling arrays.
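The core idea shared by many of these predictors, searching for clusters of binding sites for several transcription factors, can be illustrated with a deliberately naive sketch. It uses exact string matching and no significance test, unlike the tools named above; the motif names, motif strings, test sequence and the `window` parameter are all toy assumptions.

```python
def find_sites(sequence, motif):
    """Return start positions of exact matches of motif (overlaps allowed)."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits

def candidate_modules(sequence, motifs, window):
    """Return (start, end) spans in which every motif has a site that starts
    fewer than `window` bp after the first site of the span."""
    sites = sorted(
        (pos, name)
        for name, motif in motifs.items()
        for pos in find_sites(sequence, motif)
    )
    modules = []
    for i, (start, _) in enumerate(sites):
        seen = set()
        for pos, name in sites[i:]:
            if pos - start >= window:
                break
            seen.add(name)
            if len(seen) == len(motifs):
                modules.append((start, pos + len(motifs[name])))
                break
    return modules

# Toy motifs and a toy sequence, for illustration only.
toy_motifs = {"TF_A": "TGACTCA", "TF_B": "GATAAG"}
toy_sequence = "AAAA" + "TGACTCA" + "CC" + "GATAAG" + "A" * 30 + "TGACTCA"
print(candidate_modules(toy_sequence, toy_motifs, window=20))
```

A realistic predictor would replace exact matching with position weight matrices and would score each cluster against a background model, which is where the statistical difficulties described above arise.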

Binding sites of gene regulatory factors. Transcription factors binding DNA, RNA-binding proteins and microRNAs binding RNA

Examples

An example of a cis-acting regulatory sequence is the operator in the lac operon. This DNA sequence is bound by the lac repressor, which, in turn, prevents transcription of the adjacent genes on the same DNA molecule. The lac operator is, thus, considered to "act in cis" on the regulation of the nearby genes. The operator itself does not code for any protein or RNA.

In contrast, trans-regulatory elements are diffusible factors, usually proteins, that may modify the expression of genes distant from the gene that was originally transcribed to create them. For example, a transcription factor that regulates a gene on chromosome 6 might itself have been transcribed from a gene on chromosome 11. The term trans-regulatory is constructed from the Latin root trans, which means "across from".

There are cis-regulatory and trans-regulatory elements. Cis-regulatory elements are often binding sites for one or more trans-acting factors.

To summarize, cis-regulatory elements are present on the same molecule of DNA as the gene they regulate whereas trans-regulatory elements can regulate genes distant from the gene from which they were transcribed.

Gene regulatory network

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Gene_regulatory_network

Structure of a gene regulatory network

Control process of a gene regulatory network

A gene (or genetic) regulatory network (GRN) is a collection of molecular regulators that interact with each other and with other substances in the cell to govern the gene expression levels of mRNA and proteins which, in turn, determine the function of the cell. GRNs also play a central role in morphogenesis, the creation of body structures, which in turn is central to evolutionary developmental biology (evo-devo).

The regulator can be DNA, RNA, protein or any combination of two or more of these three that form a complex, such as a specific sequence of DNA and a transcription factor to activate that sequence. The interaction can be direct or indirect (through transcribed RNA or translated protein). In general, each mRNA molecule goes on to make a specific protein (or set of proteins). In some cases this protein will be structural, and will accumulate at the cell membrane or within the cell to give it particular structural properties. In other cases the protein will be an enzyme, i.e., a micro-machine that catalyses a certain reaction, such as the breakdown of a food source or toxin. Some proteins though serve only to activate other genes, and these are the transcription factors that are the main players in regulatory networks or cascades. By binding to the promoter region at the start of other genes they turn them on, initiating the production of another protein, and so on. Some transcription factors are inhibitory.

In single-celled organisms, regulatory networks respond to the external environment, optimising the cell at a given time for survival in this environment. Thus a yeast cell, finding itself in a sugar solution, will turn on genes to make enzymes that process the sugar to alcohol. This process, which we associate with wine-making, is how the yeast cell makes its living, gaining energy to multiply, which under normal circumstances would enhance its survival prospects.

In multicellular animals the same principle has been put in the service of gene cascades that control body-shape. Each time a cell divides, two cells result which, although they contain the same genome in full, can differ in which genes are turned on and making proteins. Sometimes a 'self-sustaining feedback loop' ensures that a cell maintains its identity and passes it on. Less understood is the mechanism of epigenetics by which chromatin modification may provide cellular memory by blocking or allowing transcription. A major feature of multicellular animals is the use of morphogen gradients, which in effect provide a positioning system that tells a cell where in the body it is, and hence what sort of cell to become. A gene that is turned on in one cell may make a product that leaves the cell and diffuses through adjacent cells, entering them and turning on genes only when it is present above a certain threshold level. These cells are thus induced into a new fate, and may even generate other morphogens that signal back to the original cell. Over longer distances morphogens may use the active process of signal transduction. Such signalling controls embryogenesis, the building of a body plan from scratch through a series of sequential steps. They also control and maintain adult bodies through feedback processes, and the loss of such feedback because of a mutation can be responsible for the cell proliferation that is seen in cancer. In parallel with this process of building structure, the gene cascade turns on genes that make structural proteins that give each cell the physical properties it needs.

Overview

At one level, biological cells can be thought of as "partially mixed bags" of biological chemicals – in the discussion of gene regulatory networks, these chemicals are mostly the messenger RNAs (mRNAs) and proteins that arise from gene expression. These mRNA and proteins interact with each other with various degrees of specificity. Some diffuse around the cell. Others are bound to cell membranes, interacting with molecules in the environment. Still others pass through cell membranes and mediate long range signals to other cells in a multi-cellular organism. These molecules and their interactions comprise a gene regulatory network. A typical gene regulatory network looks something like this:

Example of a regulatory network

The nodes of this network can represent genes, proteins, mRNAs, protein/protein complexes or cellular processes. Nodes that are depicted as lying along vertical lines are associated with the cell/environment interfaces, while the others are free-floating and can diffuse. Edges between nodes represent interactions between the nodes, that can correspond to individual molecular reactions between DNA, mRNA, miRNA, proteins or molecular processes through which the products of one gene affect those of another, though the lack of experimentally obtained information often implies that some reactions are not modeled at such a fine level of detail. These interactions can be inductive (usually represented by arrowheads or the + sign), with an increase in the concentration of one leading to an increase in the other, inhibitory (represented with filled circles, blunt arrows or the minus sign), with an increase in one leading to a decrease in the other, or dual, when depending on the circumstances the regulator can activate or inhibit the target node. The nodes can regulate themselves directly or indirectly, creating feedback loops, which form cyclic chains of dependencies in the topological network. The network structure is an abstraction of the system's molecular or chemical dynamics, describing the manifold ways in which one substance affects all the others to which it is connected. In practice, such GRNs are inferred from the biological literature on a given system and represent a distillation of the collective knowledge about a set of related biochemical reactions. To speed up the manual curation of GRNs, some recent efforts try to use text mining, curated databases, network inference from massive data, model checking and other information extraction technologies for this purpose.
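The network abstraction described above can be sketched as a signed directed graph, where each edge carries +1 for an inductive interaction and -1 for an inhibitory one. The example below, with hypothetical node names, finds the nodes that take part in feedback loops, that is, nodes that can reach themselves along directed edges.

```python
def feedback_nodes(edges):
    """Return the nodes that lie on at least one directed cycle."""
    graph = {}
    for source, target, _sign in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())

    def reachable_from(start):
        # Depth-first search over the successors of `start`.
        seen, stack = set(), list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        return seen

    return {node for node in graph if node in reachable_from(node)}

# (regulator, target, sign): +1 = activation, -1 = inhibition.
edges = [
    ("A", "B", +1),
    ("B", "C", +1),
    ("C", "A", -1),  # closes an indirect negative feedback loop A -> B -> C -| A
    ("C", "D", +1),
    ("E", "E", -1),  # direct negative autoregulation
]
loop_members = feedback_nodes(edges)
print(sorted(loop_members))
```

Node D is regulated by the loop but does not feed back into it, so it is not reported.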

Genes can be viewed as nodes in the network, with input being proteins such as transcription factors, and outputs being the level of gene expression. The value of the node depends on a function which depends on the value of its regulators in previous time steps (in the Boolean network described below these are Boolean functions, typically AND, OR, and NOT). These functions have been interpreted as performing a kind of information processing within the cell, which determines cellular behavior. The basic drivers within cells are concentrations of some proteins, which determine both spatial (location within the cell or tissue) and temporal (cell cycle or developmental stage) coordinates of the cell, as a kind of "cellular memory". The gene networks are only beginning to be understood, and it is a next step for biology to attempt to deduce the functions for each gene "node", to help understand the behavior of the system in increasing levels of complexity, from gene to signaling pathway, cell or tissue level.

Mathematical models of GRNs have been developed to capture the behavior of the system being modeled, and in some cases generate predictions corresponding with experimental observations. In other cases, models have made accurate novel predictions that can be tested experimentally, thus suggesting new approaches to explore that would not otherwise be considered when designing an experimental protocol. Modeling techniques include ordinary differential equations (ODEs), Boolean networks, Petri nets, Bayesian networks, graphical Gaussian network models, stochastic models, and process calculi. Conversely, techniques have been proposed for generating models of GRNs that best explain a set of time series observations. Recently it has been shown that ChIP-seq signals of histone modifications are more strongly correlated with transcription factor motifs at promoters than RNA levels are. Hence it is proposed that time-series histone modification ChIP-seq could provide more reliable inference of gene regulatory networks than methods based on expression levels.

Structure and evolution

Global feature

Gene regulatory networks are generally thought to be made up of a few highly connected nodes (hubs) and many poorly connected nodes nested within a hierarchical regulatory regime. Thus gene regulatory networks approximate a hierarchical scale free network topology. This is consistent with the view that most genes have limited pleiotropy and operate within regulatory modules. This structure is thought to evolve due to the preferential attachment of duplicated genes to more highly connected genes. Recent work has also shown that natural selection tends to favor networks with sparse connectivity.

There are primarily two ways that networks can evolve, both of which can occur simultaneously. The first is that network topology can be changed by the addition or subtraction of nodes (genes), or parts of the network (modules) may be expressed in different contexts. The Drosophila Hippo signaling pathway provides a good example. The Hippo signaling pathway controls both mitotic growth and post-mitotic cellular differentiation. Recently it was found that the network the Hippo signaling pathway operates in differs between these two functions, which in turn changes the behavior of the Hippo signaling pathway. This suggests that the Hippo signaling pathway operates as a conserved regulatory module that can be used for multiple functions depending on context. Thus, changing network topology can allow a conserved module to serve multiple functions and alter the final output of the network. The second way networks can evolve is by changing the strength of interactions between nodes, such as how strongly a transcription factor may bind to a cis-regulatory element. Such variation in the strength of network edges has been shown to underlie between-species variation in vulva cell fate patterning of Caenorhabditis worms.

Local feature

Feed-forward loop

Another widely cited characteristic of gene regulatory networks is their abundance of certain repetitive sub-networks known as network motifs. Network motifs can be regarded as repetitive topological patterns when dividing a big network into small blocks. Previous analysis found several types of motifs that appeared more often in gene regulatory networks than in randomly generated networks. One such motif is the feed-forward loop, which consists of three nodes. This motif is the most abundant among all possible motifs made up of three nodes, as is shown in the gene regulatory networks of fly, nematode, and human.
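Counting this motif in a given network is straightforward. The sketch below assumes the usual definition of a feed-forward loop as three distinct nodes X, Y and Z with edges X→Y, Y→Z and X→Z; the function name and the toy edge lists are illustrative only. Motif enrichment studies compare such a count with the counts obtained in randomized networks, which is not shown here.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )

ffl_network = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
cycle_network = [("A", "B"), ("B", "C"), ("C", "A")]

print(count_feed_forward_loops(ffl_network))    # one feed-forward loop
print(count_feed_forward_loops(cycle_network))  # a 3-cycle is not a feed-forward loop
```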

The enriched motifs have been proposed to follow convergent evolution, suggesting they are "optimal designs" for certain regulatory purposes. For example, modeling shows that feed-forward loops are able to coordinate the change in node A (in terms of concentration and activity) and the expression dynamics of node C, creating different input-output behaviors. The galactose utilization system of E. coli contains a feed-forward loop which accelerates the activation of the galactose utilization operon galETK, potentially facilitating the metabolic transition to galactose when glucose is depleted. The feed-forward loop in the arabinose utilization system of E. coli delays the activation of the arabinose catabolism operon and transporters, potentially avoiding unnecessary metabolic transitions due to temporary fluctuations in upstream signaling pathways. Similarly, in the Wnt signaling pathway of Xenopus, the feed-forward loop acts as a fold-change detector that responds to the fold change, rather than the absolute change, in the level of β-catenin, potentially increasing the resistance to fluctuations in β-catenin levels. Following the convergent evolution hypothesis, the enrichment of feed-forward loops would be an adaptation for fast response and noise resistance. A recent study found that yeast grown in an environment of constant glucose developed mutations in glucose signaling pathways and the growth regulation pathway, suggesting that regulatory components responding to environmental changes are dispensable in a constant environment.

On the other hand, some researchers hypothesize that the enrichment of network motifs is non-adaptive. In other words, gene regulatory networks can evolve to a similar structure without specific selection on the proposed input-output behavior. Support for this hypothesis often comes from computational simulations. For example, fluctuations in the abundance of feed-forward loops in a model that simulates the evolution of gene regulatory networks by randomly rewiring nodes may suggest that the enrichment of feed-forward loops is a side-effect of evolution. In another model of gene regulatory network evolution, the ratio of the frequencies of gene duplication and gene deletion shows a great influence on network topology: certain ratios lead to the enrichment of feed-forward loops and create networks that show features of hierarchical scale-free networks. De novo evolution of coherent type 1 feed-forward loops has been demonstrated computationally in response to selection for their hypothesized function of filtering out a short spurious signal, supporting adaptive evolution, but for non-idealized noise, a dynamics-based system of feed-forward regulation with a different topology was instead favored.
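The hypothesized filtering function of a coherent type 1 feed-forward loop can be illustrated with a minimal simulation. The sketch assumes AND logic at the output node Z (Z is produced only while the input X is on and the intermediate Y exceeds a threshold `k_yz`), step-like activation, simple Euler integration, and arbitrary parameter values; none of these choices is taken from the studies mentioned above.

```python
def simulate_ffl(pulse_length, t_end=20.0, dt=0.01, beta=1.0, alpha=1.0, k_yz=0.5):
    """Return the peak level of Z after an input pulse of X of the given length.

    X -> Y, X -> Z and Y -> Z, with Z requiring both X and Y (AND logic).
    beta is the production rate, alpha the degradation/dilution rate.
    """
    y = z = 0.0
    z_max = 0.0
    for step in range(int(t_end / dt)):
        t = step * dt
        x_on = t < pulse_length           # input signal is a single pulse
        y_on = y > k_yz                   # Y must accumulate past its threshold
        dy = (beta if x_on else 0.0) - alpha * y
        dz = (beta if (x_on and y_on) else 0.0) - alpha * z
        y += dy * dt
        z += dz * dt
        z_max = max(z_max, z)
    return z_max

print(simulate_ffl(pulse_length=0.3))  # short pulse: Y never crosses the threshold
print(simulate_ffl(pulse_length=5.0))  # sustained signal: Z is switched on
```

Because Y needs time to accumulate, a brief pulse of X ends before Z can be produced, while a sustained signal passes through after a delay.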

Bacterial regulatory networks

Regulatory networks allow bacteria to adapt to almost every environmental niche on earth. A network of interactions among diverse types of molecules including DNA, RNA, proteins and metabolites, is utilised by the bacteria to achieve regulation of gene expression. In bacteria, the principal function of regulatory networks is to control the response to environmental changes, for example nutritional status and environmental stress. A complex organization of networks permits the microorganism to coordinate and integrate multiple environmental signals.

One example of stress is when the environment suddenly becomes poor in nutrients. This triggers a complex adaptation process in bacteria, such as E. coli. After this environmental change, thousands of genes change expression level. However, these changes are predictable from the topology and logic of the gene network that is reported in RegulonDB. Specifically, on average, the response strength of a gene was predictable from the difference between the numbers of activating and repressing input transcription factors of that gene.
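The predictor described here is simple enough to write down directly. In the sketch below, the gene and transcription factor names are hypothetical and the regulatory signs are made up; in practice they would be taken from a curated resource such as RegulonDB.

```python
def net_regulatory_input(regulators):
    """Number of activating inputs minus number of repressing inputs."""
    activators = sum(1 for sign in regulators.values() if sign > 0)
    repressors = sum(1 for sign in regulators.values() if sign < 0)
    return activators - repressors

# gene -> {transcription factor: +1 (activator) or -1 (repressor)}
toy_network = {
    "geneA": {"tf1": +1, "tf2": +1, "tf3": -1},
    "geneB": {"tf1": -1, "tf4": -1},
    "geneC": {"tf2": +1},
}
scores = {gene: net_regulatory_input(regs) for gene, regs in toy_network.items()}
print(scores)
```

This score is only the topological quantity named in the text; relating it to measured response strengths requires expression data that are not modeled here.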

Modelling

Coupled ordinary differential equations

It is common to model such a network with a set of coupled ordinary differential equations (ODEs) or stochastic differential equations (SDEs), describing the reaction kinetics of the constituent parts. Suppose that our regulatory network has $N$ nodes, and let $S_{1}(t), S_{2}(t), \dots, S_{N}(t)$ represent the concentrations of the $N$ corresponding substances at time $t$. Then the temporal evolution of the system can be described approximately by

\frac{d S_{j}}{d t} = f_{j} (S_{1}, S_{2}, \dots, S_{N})

where the functions $f_{j}$ express the dependence of $S_{j}$ on the concentrations of other substances present in the cell. The functions $f_{j}$ are ultimately derived from basic principles of chemical kinetics or simple expressions derived from these, e.g. Michaelis–Menten enzymatic kinetics. Hence, the functional forms of the $f_{j}$ are usually chosen as low-order polynomials or Hill functions that serve as an ansatz for the real molecular dynamics. Such models are then studied using the mathematics of nonlinear dynamics. System-specific information, like reaction rate constants and sensitivities, is encoded as constant parameters.
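As a minimal sketch of such a model, consider a hypothetical two-gene network ($N = 2$) in which each product represses the other through a Hill function and is degraded at unit rate. The parameter values (`alpha`, `k`, `n`), the forward-Euler integrator and the step size are assumptions chosen for illustration.

```python
def hill_repression(x, k=1.0, n=2):
    """Decreasing Hill function: 1 when x = 0, 1/2 when x = k."""
    return 1.0 / (1.0 + (x / k) ** n)

def simulate(s1, s2, alpha=4.0, t_end=50.0, dt=0.01):
    """Integrate dS1/dt = alpha*h(S2) - S1 and dS2/dt = alpha*h(S1) - S2."""
    for _ in range(int(t_end / dt)):
        ds1 = alpha * hill_repression(s2) - s1
        ds2 = alpha * hill_repression(s1) - s2
        s1, s2 = s1 + ds1 * dt, s2 + ds2 * dt
    return s1, s2

# Two different initial conditions settle into two different states.
print(simulate(2.0, 0.5))
print(simulate(0.5, 2.0))
```

With these parameters the system has two stable states, one with $S_{1}$ high and $S_{2}$ low and one with the roles reversed, and the initial condition decides which one is reached.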

By solving for the fixed point of the system:

\frac{d S_{j}}{d t} = 0

for all $j$ , one obtains (possibly several) concentration profiles of proteins and mRNAs that are theoretically sustainable (though not necessarily stable). Steady states of kinetic equations thus correspond to potential cell types, and oscillatory solutions to the above equation to naturally cyclic cell types. Mathematical stability of these attractors can usually be characterized by the sign of higher derivatives at critical points, and then correspond to biochemical stability of the concentration profile. Critical points and bifurcations in the equations correspond to critical cell states in which small state or parameter perturbations could switch the system between one of several stable differentiation fates. Trajectories correspond to the unfolding of biological pathways and transients of the equations to short-term biological events. For a more mathematical discussion, see the articles on nonlinearity, dynamical systems, bifurcation theory, and chaos theory.
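Fixed points and their stability can be worked out explicitly for a small example. The sketch below uses a hypothetical two-gene mutual-repression model, $dS_{1}/dt = \alpha/(1+S_{2}^{2}) - S_{1}$ and $dS_{2}/dt = \alpha/(1+S_{1}^{2}) - S_{2}$ with $\alpha = 4$, and applies linear stability analysis through the Jacobian, which is one standard criterion and is an assumption of this example. For this particular model the asymmetric fixed points satisfy $S_{1} S_{2} = 1$ and $S_{1} + S_{2} = \alpha$, and the symmetric one solves $s^{3} + s - \alpha = 0$.

```python
import math

ALPHA = 4.0  # hypothetical maximal synthesis rate

def f(s1, s2):
    """Right-hand side of the two-gene mutual-repression model."""
    return (ALPHA / (1 + s2 ** 2) - s1, ALPHA / (1 + s1 ** 2) - s2)

def jacobian(s1, s2):
    """Analytic matrix of partial derivatives of f."""
    return (
        (-1.0, -2 * ALPHA * s2 / (1 + s2 ** 2) ** 2),
        (-2 * ALPHA * s1 / (1 + s1 ** 2) ** 2, -1.0),
    )

def is_stable(s1, s2):
    """For a 2x2 system, both eigenvalues have negative real part
    exactly when the trace is negative and the determinant is positive."""
    (a, b), (c, d) = jacobian(s1, s2)
    return (a + d) < 0 and (a * d - b * c) > 0

def symmetric_root(lo=0.0, hi=ALPHA, tol=1e-12):
    """Bisection on s**3 + s - ALPHA = 0 (the fixed point with S1 = S2)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** 3 + mid - ALPHA > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

s_sym = symmetric_root()
root = math.sqrt(ALPHA ** 2 / 4 - 1)  # from x**2 - ALPHA*x + 1 = 0
fixed_points = [(ALPHA / 2 + root, ALPHA / 2 - root),
                (s_sym, s_sym),
                (ALPHA / 2 - root, ALPHA / 2 + root)]

for point in fixed_points:
    print(point, "stable" if is_stable(*point) else "unstable")
```

The two asymmetric fixed points are stable and would correspond to alternative cell states in the sense described above, while the symmetric fixed point is sustainable in theory but unstable.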

Boolean network

The following example illustrates how a Boolean network can model a GRN together with its gene products (the outputs) and the substances from the environment that affect it (the inputs). Stuart Kauffman was amongst the first biologists to use the metaphor of Boolean networks to model genetic regulatory networks.

Each gene, each input, and each output is represented by a node in a directed graph in which there is an arrow from one node to another if and only if there is a causal link between the two nodes.
Each node in the graph can be in one of two states: on or off.
For a gene, "on" corresponds to the gene being expressed; for inputs and outputs, "on" corresponds to the substance being present.
Time is viewed as proceeding in discrete steps. At each step, the new state of a node is a Boolean function of the prior states of the nodes with arrows pointing towards it.
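These rules translate almost directly into code. The following sketch uses a hypothetical three-node network (an environmental input `signal` and two genes `A` and `B`) with synchronous updates, and follows a trajectory until it revisits a state in order to find the attractor it ends up in. The node names and update rules are invented for illustration.

```python
def step(state, rules):
    """Synchronous update: every node reads the previous state."""
    return {node: rule(state) for node, rule in rules.items()}

def find_attractor(state, rules):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, rules)
    return seen[seen.index(state):]

# Hypothetical rules: A is on when the signal is present and B is off;
# B copies A with a delay of one time step; the input stays as it is.
rules = {
    "signal": lambda s: s["signal"],
    "A": lambda s: s["signal"] and not s["B"],
    "B": lambda s: s["A"],
}

with_signal = find_attractor({"signal": True, "A": False, "B": False}, rules)
without_signal = find_attractor({"signal": False, "A": True, "B": True}, rules)
print(len(with_signal))   # length of the cycle reached with the signal on
print(without_signal)     # fixed point reached with the signal off
```

With the signal present, the negative feedback from B onto A produces an oscillation; without it, both genes switch off and stay off.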

The validity of the model can be tested by comparing simulation results with time series observations. A partial validation of a Boolean network model can also come from testing the predicted existence of a yet unknown regulatory connection between two particular transcription factors that each are nodes of the model.

Continuous networks

Continuous network models of GRNs are an extension of the Boolean networks described above. Nodes still represent genes, and connections between them represent regulatory influences on gene expression. Genes in biological systems display a continuous range of activity levels, and it has been argued that using a continuous representation captures several properties of gene regulatory networks not present in the Boolean model. Formally, most of these approaches are similar to an artificial neural network, as inputs to a node are summed up and the result serves as input to a sigmoid function, but proteins often control gene expression in a synergistic, i.e. non-linear, way. However, there is now a continuous network model that allows grouping of inputs to a node, thus realizing another level of regulation. This model is formally closer to a higher order recurrent neural network. The same model has also been used to mimic the evolution of cellular differentiation and even multicellular morphogenesis.
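The basic summed-input form can be sketched in a few lines. In this hypothetical two-gene example, each gene's activity relaxes towards a sigmoid of the weighted sum of all activities; the weights, biases and the relaxation `rate` are arbitrary illustrative values, and the grouped-input extension mentioned above is not shown.

```python
import math

def sigmoid(x):
    """Logistic function mapping any real input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def update(levels, weights, bias, rate=0.1):
    """One time step: each gene moves towards sigmoid(weighted input sum)."""
    new_levels = []
    for i, level in enumerate(levels):
        drive = sum(w * x for w, x in zip(weights[i], levels)) + bias[i]
        new_levels.append(level + rate * (sigmoid(drive) - level))
    return new_levels

# weights[i][j] is the influence of gene j on gene i:
# gene 0 activates gene 1, and gene 1 represses gene 0.
weights = [[0.0, -4.0],
           [4.0, 0.0]]
bias = [2.0, -2.0]

levels = [0.5, 0.5]
history = [levels]
for _ in range(200):
    levels = update(levels, weights, bias)
    history.append(levels)
print(levels)
```

Unlike the Boolean case, activities here take any value between 0 and 1, so partial activation and graded responses can be represented.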

Stochastic gene networks

Recent experimental results have demonstrated that gene expression is a stochastic process. Thus, many authors are now using the stochastic formalism, after the work by Arkin et al. Works on single gene expression and small synthetic genetic networks, such as the genetic toggle switch of Tim Gardner and Jim Collins, provided additional experimental data on the phenotypic variability and the stochastic nature of gene expression. The first versions of stochastic models of gene expression involved only instantaneous reactions and were driven by the Gillespie algorithm.
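A minimal sketch of the Gillespie algorithm for instantaneous reactions is shown below for the simplest possible case: one gene whose mRNA is produced at a constant rate `k_tx` and degraded at rate `k_deg` per molecule. This birth-death model and its parameter values are assumptions chosen for illustration; they are not the models used in the works cited above.

```python
import random

def gillespie_birth_death(k_tx, k_deg, t_end, seed=0):
    """Simulate mRNA copy number n with production (n -> n+1, propensity k_tx)
    and degradation (n -> n-1, propensity k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    trajectory = [(t, n)]
    while True:
        a_tx = k_tx
        a_deg = k_deg * n
        a_total = a_tx + a_deg
        t += rng.expovariate(a_total)        # exponential waiting time
        if t > t_end:
            break
        if rng.random() * a_total < a_tx:    # choose which reaction fires
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory

def time_average(trajectory, t_end):
    """Average copy number, weighting each state by how long it lasted."""
    total = 0.0
    for (t0, n), (t1, _) in zip(trajectory, trajectory[1:] + [(t_end, None)]):
        total += n * (t1 - t0)
    return total / t_end

traj = gillespie_birth_death(k_tx=10.0, k_deg=0.5, t_end=500.0)
print(time_average(traj, 500.0))  # fluctuates around k_tx / k_deg
```

Individual runs fluctuate around the deterministic steady state `k_tx / k_deg`, which is the kind of variability that deterministic ODE models cannot reproduce.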

Since some processes, such as gene transcription, involve many reactions and could not be correctly modeled as an instantaneous reaction in a single step, it was proposed to model these reactions as single step multiple delayed reactions in order to account for the time it takes for the entire process to be complete.

From here, a set of reactions was proposed that allows GRNs to be generated. These are then simulated using a modified version of the Gillespie algorithm that can simulate multiple time-delayed reactions (chemical reactions where each of the products is given a time delay that determines when it will be released into the system as a "finished product").

For example, basic transcription of a gene can be represented by the following single-step reaction (RNAP is the RNA polymerase, RBS is the RNA ribosome binding site, and Pro_i is the promoter region of gene i):

\mathrm{RNAP} + \mathrm{Pro}_{i} \overset{k_{i,\mathrm{bas}}}{\longrightarrow} \mathrm{Pro}_{i}(\tau_{i}^{1}) + \mathrm{RBS}_{i}(\tau_{i}^{1}) + \mathrm{RNAP}(\tau_{i}^{2})
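How such a delayed reaction can be simulated is sketched below for a single gene. The sketch keeps a waiting list of delayed products: when the reaction fires, RNAP and the promoter are consumed immediately, and the promoter, the RBS and the RNAP are scheduled for release after delays `tau1`, `tau1` and `tau2`. This is a simplified illustration with fixed delays and made-up parameter values, not the implementation of any published simulator.

```python
import heapq
import random

def delayed_ssa(k_bas, tau1, tau2, t_end, n_rnap=5, seed=0):
    """Simulate RNAP + Pro -> Pro(tau1) + RBS(tau1) + RNAP(tau2)."""
    rng = random.Random(seed)
    t = 0.0
    counts = {"RNAP": n_rnap, "Pro": 1, "RBS": 0}
    pending = []  # min-heap of (release_time, species)
    while True:
        propensity = k_bas * counts["RNAP"] * counts["Pro"]
        t_react = t + rng.expovariate(propensity) if propensity > 0 else float("inf")
        t_release = pending[0][0] if pending else float("inf")
        if min(t_react, t_release) > t_end:
            break
        if t_release <= t_react:
            # A delayed product is released first; the propensity changes,
            # so the next reaction time is drawn again on the next pass.
            t, species = heapq.heappop(pending)
            counts[species] += 1
            continue
        # The reaction fires: reactants are consumed now, products come later.
        t = t_react
        counts["RNAP"] -= 1
        counts["Pro"] -= 1
        heapq.heappush(pending, (t + tau1, "Pro"))
        heapq.heappush(pending, (t + tau1, "RBS"))
        heapq.heappush(pending, (t + tau2, "RNAP"))
    return counts, pending

final_counts, still_pending = delayed_ssa(k_bas=1.0, tau1=2.0, tau2=5.0, t_end=100.0)
print(final_counts, len(still_pending))
```

Because the single promoter is occupied for `tau1` after each initiation, successive transcription events are at least `tau1` apart, an effect that an instantaneous one-step reaction would not capture.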

Furthermore, there seems to be a trade-off between the noise in gene expression, the speed with which genes can switch, and the metabolic cost associated with their functioning. More specifically, for any given level of metabolic cost, there is an optimal trade-off between noise and processing speed, and increasing the metabolic cost leads to better speed-noise trade-offs.

One proposed simulator, SGNSim (Stochastic Gene Networks Simulator), can model GRNs in which transcription and translation are modeled as multiple time-delayed events and whose dynamics are driven by a stochastic simulation algorithm (SSA) able to deal with multiple time-delayed events. The time delays can be drawn from several distributions and the reaction rates from complex functions or from physical parameters. SGNSim can generate ensembles of GRNs within a set of user-defined parameters, such as topology. It can also be used to model specific GRNs and systems of chemical reactions. Genetic perturbations such as gene deletions, gene over-expression, insertions, and frame-shift mutations can also be modeled.

The GRN is created from a graph with the desired topology, imposing in-degree and out-degree distributions. Gene promoter activities are affected by the expression products of other genes, which act as inputs in the form of monomers or combined into multimers, and which are set as direct or indirect. Next, each direct input is assigned to an operator site, and different transcription factors can be allowed, or not, to compete for the same operator site, while indirect inputs are given a target. Finally, a function is assigned to each gene, defining the gene's response to a combination of transcription factors (promoter state). The transfer functions (that is, how genes respond to a combination of inputs) can be assigned to each combination of promoter states as desired.
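The construction steps above can be sketched in a few lines of Python. This is a hypothetical, heavily simplified sketch and not SGNSim's interface: the function name `random_grn`, the uniform in-degree distribution, and the three candidate transcription rates are assumptions, only the in-degree is imposed, and multimers, operator-site competition, and indirect inputs are left out. Each gene gets a set of input genes and a table that maps every promoter state (which inputs are bound) to a transcription rate.

```python
import itertools
import random


def random_grn(n_genes=5, max_inputs=2, seed=0):
    """Build a toy GRN: random inputs per gene plus a transfer table.

    Hypothetical sketch; names and values are illustrative assumptions.
    """
    rng = random.Random(seed)
    network = {}
    for gene in range(n_genes):
        # In-degree drawn uniformly from 0..max_inputs (an assumption).
        k = rng.randint(0, max_inputs)
        # Distinct regulator genes acting as direct inputs.
        inputs = rng.sample(range(n_genes), k)
        # Transfer function: one transcription rate per promoter state,
        # where a state records which inputs are bound (1) or unbound (0).
        transfer = {
            promoter_state: rng.choice([0.0, 0.1, 1.0])
            for promoter_state in itertools.product((0, 1), repeat=k)
        }
        network[gene] = {"inputs": inputs, "transfer": transfer}
    return network


if __name__ == "__main__":
    for gene, spec in random_grn().items():
        print(gene, spec["inputs"], spec["transfer"])
```

A gene with k inputs has 2^k promoter states, so the table grows quickly with in-degree; this is one reason the in-degree distribution is a useful parameter to control.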

In other recent work, multiscale models of gene regulatory networks have been developed that focus on synthetic biology applications. Simulations have been used that model all biomolecular interactions in transcription, translation, regulation, and induction of gene regulatory networks, guiding the design of synthetic systems.

Prediction

Other work has focused on predicting the gene expression levels in a gene regulatory network. The approaches used to model gene regulatory networks have been constrained to be interpretable and, as a result, are generally simplified versions of the network. For example, Boolean networks have been used because of their simplicity and ability to handle noisy data, but they lose information by representing each gene as a binary value. Similarly, artificial neural networks have been used without a hidden layer so that they remain interpretable, at the cost of losing the ability to model higher-order correlations in the data. A model that is not constrained to be interpretable can be more accurate. Being able to predict gene expression more accurately provides a way to explore how drugs affect a system of genes and to find which genes are interrelated in a process. This has been encouraged by the DREAM competition, which promotes the development of the best prediction algorithms. Some other recent work has used artificial neural networks with a hidden layer.
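A small Python example may help show what is lost without a hidden layer. It uses an assumed toy case, not data from the cited work: a target gene that is on only when exactly one of two regulators is on (an XOR-like rule). A brute-force search over a grid of weights finds no single threshold unit that reproduces all four cases, whereas a hand-built network with two hidden units does. The weights and the grid are illustrative choices.

```python
import itertools

# Toy data (assumed): target is on when exactly one regulator is on.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def no_hidden(x, w1, w2, b):
    """Single threshold unit: inputs connect directly to the output."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0


def with_hidden(x):
    """Two hidden ReLU units followed by a linear output (hand-set weights)."""
    s = x[0] + x[1]
    h1 = max(0, s)       # active when at least one regulator is on
    h2 = max(0, s - 1)   # active only when both regulators are on
    return h1 - 2 * h2


def best_no_hidden_score(grid):
    """Largest number of cases any grid setting of the unit gets right."""
    best = 0
    for w1, w2, b in itertools.product(grid, repeat=3):
        correct = sum(no_hidden(x, w1, w2, b) == y for x, y in DATA)
        best = max(best, correct)
    return best


if __name__ == "__main__":
    grid = [i / 2 for i in range(-8, 9)]
    print("best without hidden layer:", best_no_hidden_score(grid), "of 4")
    print("with hidden layer:", [with_hidden(x) for x, _ in DATA])
```

The search is limited to a finite grid, but the result reflects a general fact: the XOR rule is not linearly separable, so no choice of weights for a single unit can match all four cases.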

Applications

Multiple sclerosis

There are three classes of multiple sclerosis: relapsing-remitting (RRMS), primary progressive (PPMS) and secondary progressive (SPMS). Gene regulatory networks (GRNs) play a vital role in understanding the disease mechanism across these three classes.
