Tuesday, December 18, 2018

Protein moonlighting

From Wikipedia, the free encyclopedia

Crystallographic structure of cytochrome P450 from the bacterium S. coelicolor (rainbow colored cartoon, N-terminus = blue, C-terminus = red) complexed with heme cofactor (magenta spheres) and two molecules of its endogenous substrate epi-isozizaene, shown as orange and cyan spheres respectively. The orange-colored substrate resides in the monooxygenase site while the cyan-colored substrate occupies the substrate entrance site. An unoccupied moonlighting terpene synthase site is designated by the orange arrow.

Protein moonlighting (or gene sharing) is a phenomenon by which a protein can perform more than one function. Ancestral moonlighting proteins originally possessed a single function but, through evolution, acquired additional functions. Many proteins that moonlight are enzymes; others are receptors, ion channels or chaperones. The most common primary function of moonlighting proteins is enzymatic catalysis, but these enzymes have acquired secondary non-enzymatic roles. Examples of moonlighting functions secondary to catalysis include signal transduction, transcriptional regulation, apoptosis, motility, and structural roles.

Protein moonlighting may occur widely in nature. Protein moonlighting through gene sharing differs from the use of a single gene to generate different proteins by alternative RNA splicing, DNA rearrangement, or post-translational processing. It is also different from multifunctionality of the protein, in which the protein has multiple domains, each serving a different function. Protein moonlighting by gene sharing means that a gene may acquire and maintain a second function without gene duplication and without loss of the primary function. Such genes are under two or more entirely different selective constraints.

Various techniques have been used to reveal moonlighting functions in proteins. The detection of a protein in unexpected locations within cells, cell types, or tissues may suggest that a protein has a moonlighting function. Furthermore, sequence or structure homology of a protein may be used to infer both primary function as well as secondary moonlighting functions of a protein. 

The most well-studied examples of gene sharing are crystallins. These proteins, when expressed at low levels in many tissues function as enzymes, but when expressed at high levels in eye tissue, become densely packed and thus form lenses. While the recognition of gene sharing is relatively recent—the term was coined in 1988, after crystallins in chickens and ducks were found to be identical to separately identified enzymes—recent studies have found many examples throughout the living world. Joram Piatigorsky has suggested that many or all proteins exhibit gene sharing to some extent, and that gene sharing is a key aspect of molecular evolution. The genes encoding crystallins must maintain sequences for catalytic function and transparency maintenance function.

Inappropriate moonlighting is a contributing factor in some genetic diseases, and moonlighting provides a possible mechanism by which bacteria may become resistant to antibiotics.

Discovery

The first observation of a moonlighting protein was made in the late 1980s by Joram Piatigorsky and Graeme Wistow during their research on crystallin enzymes. Piatigorsky determined that lens crystallin conservation and variance is due to other moonlighting functions outside of the lens. Originally Piatigorsky called these proteins "gene sharing" proteins, but the colloquial description moonlighting was subsequently applied to proteins by Constance Jeffery in 1999 to draw a similarity between multitasking proteins and people who work two jobs. The phrase "gene sharing" is ambiguous since it is also used to describe horizontal gene transfer, hence the phrase "protein moonlighting" has become the preferred description for proteins with more than one function.

Evolution

It is believed that moonlighting proteins came about by means of evolution through which uni-functional proteins gained the ability to perform multiple functions. With alterations, much of the protein's unused space can provide new functions. Many moonlighting proteins are the result of gene fusion of two single function genes. Alternatively a single gene can acquire a second function since the active site of the encoded protein typically is small compared to the overall size of the protein leaving considerable room to accommodate a second functional site. In yet a third alternative, the same active site can acquire a second function through mutations of the active site. 

The development of moonlighting proteins may be evolutionarily favorable to the organism, since a single protein can do the job of multiple proteins, conserving the amino acids and energy required to synthesize them. However, there is no universally agreed upon theory that explains why proteins with multiple roles evolved. While using one protein to perform multiple roles seems advantageous because it keeps the genome small, this is probably not the reason for moonlighting, given the large amount of noncoding DNA that genomes carry.

Functions

Many proteins catalyze a chemical reaction. Other proteins fulfill structural, transport, or signaling roles. Furthermore, numerous proteins have the ability to aggregate into supramolecular assemblies. For example, a ribosome is made up of 90 proteins and RNA.

A number of the currently known moonlighting proteins are evolutionarily derived from highly conserved enzymes, also called ancient enzymes. These enzymes are frequently speculated to have evolved moonlighting functions. Since highly conserved proteins are present in many different organisms, this increases the chance that they would develop secondary moonlighting functions. A high fraction of enzymes involved in glycolysis, an ancient universal metabolic pathway, exhibit moonlighting behavior. Furthermore, it has been suggested that as many as 7 out of 10 proteins in glycolysis and 7 out of 8 enzymes of the tricarboxylic acid cycle exhibit moonlighting behavior.

An example of a moonlighting enzyme is pyruvate carboxylase. This enzyme catalyzes the carboxylation of pyruvate into oxaloacetate, thereby replenishing the tricarboxylic acid cycle. Surprisingly, in yeast species such as H. polymorpha and P. pastoris, pyruvate carboxylase is also essential for proper targeting and assembly of the peroxisomal protein alcohol oxidase (AO). AO, the first enzyme of methanol metabolism, is a homo-octameric flavoenzyme. In wild type cells, this enzyme is present as enzymatically active AO octamers in the peroxisomal matrix. However, in cells lacking pyruvate carboxylase, AO monomers accumulate in the cytosol, indicating that pyruvate carboxylase has a second, fully unrelated function in assembly and import. The function in AO import and assembly is independent of the enzyme activity of pyruvate carboxylase, because amino acid substitutions can be introduced that fully inactivate the enzyme activity without affecting its function in AO assembly and import. Conversely, mutations are known that block the function of this enzyme in import and assembly of AO, but have no effect on the enzymatic activity of the protein.

The E. coli anti-oxidant thioredoxin protein is another example of a moonlighting protein. Upon infection with the bacteriophage T7, E. coli thioredoxin forms a complex with T7 DNA polymerase, which results in enhanced T7 DNA replication, a crucial step for successful T7 infection. Thioredoxin binds to a loop in T7 DNA polymerase, which allows the polymerase to bind more strongly to the DNA. The anti-oxidant function of thioredoxin is fully autonomous and fully independent of T7 DNA replication, in which the protein most likely fulfills the functional role.

ADT2 and ADT5 are further examples of moonlighting proteins, found in plants. Like all other ADTs, both of these proteins have roles in phenylalanine biosynthesis. However, ADT2, together with FtsZ, is necessary for chloroplast division, and ADT5 is transported by stromules into the nucleus.

Examples of moonlighting proteins

| Kingdom | Protein | Organism | Primary function | Moonlighting function |
|---|---|---|---|---|
| Animal | Aconitase | H. sapiens | TCA cycle enzyme | Iron homeostasis |
| Animal | ATF2 | H. sapiens | Transcription factor | DNA damage response |
| Animal | Crystallins | Various | Lens structural protein | Various enzymes |
| Animal | Cytochrome c | Various | Energy metabolism | Apoptosis |
| Animal | DLD | H. sapiens | Energy metabolism | Protease |
| Animal | ERK2 | H. sapiens | MAP kinase | Transcriptional repressor |
| Animal | ESCRT-II complex | D. melanogaster | Endosomal protein sorting | Bicoid mRNA localization |
| Animal | STAT3 | M. musculus | Transcription factor | Electron transport chain |
| Plant | Hexokinase | A. thaliana | Glucose metabolism | Glucose signaling / cell death control |
| Plant | Presenilin | P. patens | γ-secretase | Cytoskeletal function |
| Fungus | Aconitase | S. cerevisiae | TCA cycle enzyme | mtDNA stability |
| Fungus | Aldolase | S. cerevisiae | Glycolytic enzyme | V-ATPase assembly |
| Fungus | Arg5,6 | S. cerevisiae | Arginine biosynthesis | Transcriptional control |
| Fungus | Enolase | S. cerevisiae | Glycolytic enzyme | Homotypic vacuole fusion; mitochondrial tRNA import |
| Fungus | Galactokinase | K. lactis | Galactose catabolism enzyme | Induction of galactose genes |
| Fungus | Hal3 | S. cerevisiae | Halotolerance determinant | Coenzyme A biosynthesis |
| Fungus | HSP60 | S. cerevisiae | Mitochondrial chaperone | Stabilization of active DNA ori's |
| Fungus | Phosphofructokinase | P. pastoris | Glycolytic enzyme | Autophagy of peroxisomes |
| Fungus | Pyruvate carboxylase | H. polymorpha | Anaplerotic enzyme | Assembly of alcohol oxidase |
| Fungus | Vhs3 | S. cerevisiae | Halotolerance determinant | Coenzyme A biosynthesis |
| Prokaryotes | Aconitase | M. tuberculosis | TCA cycle enzyme | Iron-responsive protein |
| Prokaryotes | CYP170A1 | S. coelicolor | Albaflavenone synthase | Terpene synthase |
| Prokaryotes | Enolase | S. pneumoniae | Glycolytic enzyme | Plasminogen binding |
| Prokaryotes | GroEL | E. aerogenes | Chaperone | Insect toxin |
| Prokaryotes | Glutamate racemase (MurI) | E. coli | Cell wall biosynthesis | Gyrase inhibition |
| Prokaryotes | Thioredoxin | E. coli | Anti-oxidant | T7 DNA polymerase subunit |
| Protist | Aldolase | P. vivax | Glycolytic enzyme | Host-cell invasion |

Mechanisms

Crystallographic structure of aconitase

In many cases, the functionality of a protein depends not only on its structure but also on its location. For example, a single protein may have one function when found in the cytoplasm of a cell, a different function when interacting with a membrane, and yet a third function if excreted from the cell. This property of moonlighting proteins is known as "differential localization". Function can also depend on conditions: at higher temperatures DegP (HtrA) functions as a protease by the directed degradation of proteins, and at lower temperatures as a chaperone by assisting the non-covalent folding or unfolding and the assembly or disassembly of other macromolecular structures. Furthermore, moonlighting proteins may exhibit different behaviors not only as a result of their location within a cell, but also as a result of the type of cell in which they are expressed. Multifunctionality can also be a consequence of differential post-translational modifications (PTMs). In the case of the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (GAPDH), alterations in the PTMs have been shown to be associated with higher order multifunctionality.

Other methods through which proteins may moonlight are by changing their oligomeric state, altering concentrations of the protein's ligand or substrate, use of alternative binding sites, or finally through phosphorylation. An example of a protein that displays different functions in different oligomeric states is pyruvate kinase, which exhibits metabolic activity as a tetramer and thyroid hormone–binding activity as a monomer. Changes in the concentrations of ligands or substrates may cause a switch in a protein's function. For example, in the presence of high iron concentrations, aconitase functions as an enzyme, while at low iron concentrations it functions as an iron-responsive element-binding protein (IREBP). Proteins may also perform separate functions through the use of alternative binding sites that perform different tasks. An example of this is ceruloplasmin, a protein that functions as an oxidase in copper metabolism and moonlights as a copper-independent glutathione peroxidase. Lastly, phosphorylation may sometimes cause a switch in the function of a moonlighting protein. For example, phosphorylation of phosphoglucose isomerase (PGI) at Ser-185 by protein kinase CK2 causes it to stop functioning as an enzyme, while retaining its function as an autocrine motility factor. Hence, when a mutation inactivates one function of a moonlighting protein, the other function(s) are not necessarily affected.

The crystal structures of several moonlighting proteins, such as I-AniI homing endonuclease / maturase and the PutA proline dehydrogenase / transcription factor, have been determined. An analysis of these crystal structures has demonstrated that moonlighting proteins can either perform both functions at the same time, or through conformational changes, alternate between two states, each of which is able to perform a separate function. For example, the protein DegP plays a role in proteolysis at higher temperatures and is involved in refolding functions at lower temperatures. Lastly, these crystal structures have shown that the second function may negatively affect the first function in some moonlighting proteins. As seen in η-crystallin, the second function of a protein can alter the structure, decreasing its flexibility, which in turn can somewhat impair enzymatic activity.

Identification methods

Moonlighting proteins have usually been identified by chance because there is no clear procedure to identify secondary moonlighting functions. Despite such difficulties, the number of moonlighting proteins that have been discovered is rapidly increasing. Furthermore, moonlighting proteins appear to be abundant in all kingdoms of life.

Various methods have been employed to determine a protein's function including secondary moonlighting functions. For example, the tissue, cellular, or subcellular distribution of a protein may provide hints as to the function. Real-time PCR is used to quantify mRNA and hence infer the presence or absence of a particular protein which is encoded by the mRNA within different cell types. Alternatively immunohistochemistry or mass spectrometry can be used to directly detect the presence of proteins and determine in which subcellular locations, cell types, and tissues a particular protein is expressed. 

Mass spectrometry may be used to detect proteins based on their mass-to-charge ratio. Because of alternative splicing and posttranslational modification, identification of proteins based on the mass of the parent ion alone is very difficult. However tandem mass spectrometry in which each of the parent peaks is in turn fragmented can be used to unambiguously identify proteins. Hence tandem mass spectrometry is one of the tools used in proteomics to identify the presence of proteins in different cell types or subcellular locations. While the presence of a moonlighting protein in an unexpected location may complicate routine analyses, at the same time, the detection of a protein in unexpected multiprotein complexes or locations suggests that protein may have a moonlighting function. Furthermore, mass spectrometry may be used to determine if a protein has high expression levels that do not correlate to the enzyme's measured metabolic activity. These expression levels may signify that the protein is performing a different function than previously known.
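The mass-to-charge relationship mentioned above can be illustrated with a minimal sketch. It assumes positive-mode ionization in which each charge comes from one added proton; the function names and the 1000 Da example mass are hypothetical and chosen only for illustration. It shows why a single parent mass appears at several m/z values, one per charge state.

```python
# Minimal sketch, assuming ions of the form [M + zH]^z+ (one proton per charge).
# Function names and the example mass are illustrative, not from the article.

PROTON_MASS = 1.007276  # approximate mass of a proton in daltons


def mass_to_charge(neutral_mass: float, charge: int) -> float:
    """Return m/z for a molecule of `neutral_mass` carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Invert mass_to_charge: recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    hypothetical_mass = 1000.0  # daltons, illustrative only
    for z in (1, 2, 3):
        print(f"z={z}: m/z = {mass_to_charge(hypothetical_mass, z):.4f}")
```

Because the same m/z can arise from different mass and charge combinations, and because splicing and modification shift the mass itself, the parent ion alone rarely pins down a protein, which is the motivation for fragmenting it in tandem mass spectrometry.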

The structure of a protein can also help determine its functions. Protein structure in turn may be elucidated with various techniques including X-ray crystallography or NMR. Dual polarization interferometry may be used to measure changes in protein structure, which may also give hints to the protein's function. Finally, the application of systems biology approaches such as interactomics gives clues to a protein's function based on what it interacts with.

Higher order multifunctionality

In the case of the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (GAPDH), in addition to the large number of alternate functions it has also been observed that it can be involved in the same function by multiple means (multifunctionality within multifunctionality). For example, in its role in maintenance of cellular iron homeostasis GAPDH can function to import or extrude iron from cells. Moreover, in case of its iron import activities it can traffic into cells holo-transferrin as well as the related molecule lactoferrin by multiple pathways.

Example

Crystallins

A crystallin from ducks that exhibits argininosuccinate lyase activity and is a key structural component in eye lenses, an example of gene sharing

In the case of crystallins, the genes must maintain sequences for catalytic function and transparency maintenance function. The abundant lens crystallins have been generally viewed as static proteins serving a strictly structural role in transparency and cataract. However, recent studies have shown that the lens crystallins are much more diverse than previously recognized and that many are related or identical to metabolic enzymes and stress proteins found in numerous tissues. Unlike other proteins performing highly specialized tasks, such as globin or rhodopsin, the crystallins are very diverse and show numerous species differences. Essentially all vertebrate lenses contain representatives of the α and β/γ crystallins, the "ubiquitous crystallins", which are themselves heterogeneous, and only a few species or selected taxonomic groups use entirely different proteins as lens crystallins. This paradox of crystallins being highly conserved in sequence while extremely diverse in number and distribution shows that many crystallins have vital functions outside the lens and cornea, and this multifunctionality of the crystallins is achieved by gene sharing.

Gene regulation

Crystallin recruitment may occur by changes in gene regulation that lead to high lens expression. One such example is glutathione S-transferase/S11-crystallin, which was specialized for lens expression by a change in gene regulation and by gene duplication. The fact that similar transcription factors, such as Pax-6 and retinoic acid receptors, regulate different crystallin genes suggests that lens-specific expression has played a crucial role in recruiting multifunctional proteins as crystallins. Crystallin recruitment has occurred both with and without gene duplication, and tandem gene duplication has taken place among some of the crystallins, with one of the duplicates specializing for lens expression. Ubiquitous α-crystallins and bird δ-crystallins are two examples.

Alpha crystallins

The α-crystallins, which contributed to the discovery of crystallins as borrowed proteins, have continually supported the theory of gene sharing, and have helped delineate the mechanisms used for gene sharing as well. There are two α-crystallin genes (αA and αB), which are about 55% identical in amino acid sequence. Expression studies in non-lens cells showed that αB-crystallin, besides being a functional lens protein, is a functional small heat shock protein. αB-crystallin is induced by heat and other physiological stresses, and it can protect cells from elevated temperatures and hypertonic stress. αB-crystallin is also over-expressed in many pathologies, including neurodegenerative diseases, fibroblasts of patients with Werner's disease showing premature senescence, and growth abnormalities. In addition to being over-expressed under abnormal conditions, αB-crystallin is constitutively expressed in heart, skeletal muscle, kidney, lung and many other tissues. In contrast to αB-crystallin, except for low-level expression in the thymus, spleen and retina, αA-crystallin is highly specialized for expression in the lens and is not stress-inducible. However, like αB-crystallin, it can also function as a molecular chaperone and protect against thermal stress.
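A figure such as "about 55% identical in amino acid sequence" is obtained by comparing two aligned sequences position by position. The sketch below is one simple way to compute it, assuming the two sequences have already been aligned to equal length with "-" marking gaps and that gapped columns are excluded from the count; other definitions of percent identity use a different denominator. The function name and the toy sequences are hypothetical and are not real crystallin sequences.

```python
# Minimal sketch of percent identity over a pre-computed pairwise alignment.
# Assumption: gap columns ("-") are skipped; other conventions exist.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of ungapped alignment columns holding the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(percent_identity("ACDEFGHIKL", "ACDQFGHLKL"))
```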

Beta/gamma-crystallins

β/γ-crystallins are different from α-crystallins in that they are a large multigene family. Other proteins, such as a bacterial spore coat protein, a slime mold cyst protein, and an epidermis differentiation-specific protein, contain the same Greek key motifs and are placed in the β/γ-crystallin superfamily. This relationship supports the idea that β/γ-crystallins have been recruited by a gene-sharing mechanism. However, except for a few reports, a non-refractive function of the β/γ-crystallins is yet to be found.

Corneal crystallins

Similar to lens, cornea is a transparent, avascular tissue derived from the ectoderm that is responsible for focusing light onto the retina. However, unlike lens, cornea depends on the air-cell interface and its curvature for refraction. Early immunology studies have shown that BCP 54 comprises 20–40% of the total soluble protein in bovine cornea. Subsequent studies have indicated that BCP 54 is ALDH3, a tumor and xenobiotic-inducible cytosolic enzyme, found in human, rat, and other mammals.

Non refractive roles of crystallins in lens and cornea

While it is evident that gene sharing resulted in many of the lens crystallins being multifunctional proteins, it is still uncertain to what extent the crystallins use their non-refractive properties in the lens, or on what basis they were selected. The α-crystallins provide a convincing case for a lens crystallin using its non-refractive ability within the lens to prevent protein aggregation under a variety of environmental stresses and to protect against enzyme inactivation by post-translational modifications such as glycation. The α-crystallins may also play a functional role in the stability and remodeling of the cytoskeleton during fiber cell differentiation in the lens. In the cornea, ALDH3 is also suggested to be responsible for absorbing UV-B light.

Co-evolution of lens and cornea through gene sharing

Based on the similarities between lens and cornea, such as abundant water-soluble enzymes and derivation from ectoderm, the lens and cornea are thought to have co-evolved as a "refraction unit." Gene sharing would maximize light transmission and refraction to the retina by this refraction unit. Studies have shown that many water-soluble enzymes/proteins expressed by the cornea are identical to taxon-specific lens crystallins, such as ALDH1A1/η-crystallin, α-enolase/τ-crystallin, and lactic dehydrogenase/ε-crystallin. Also, the anuran corneal epithelium, which can transdifferentiate to regenerate the lens, abundantly expresses the ubiquitous lens crystallins α, β and γ, in addition to the taxon-specific crystallin α-enolase/τ-crystallin. Overall, the similarity in expression of these proteins in the cornea and lens, both in abundance and taxon-specificity, supports the idea of co-evolution of lens and cornea through gene sharing.

Relationship to similar concepts

Gene sharing is related to, but distinct from, several concepts in genetics, evolution, and molecular biology. Gene sharing entails multiple effects from the same gene, but unlike pleiotropy, it necessarily involves separate functions at the molecular level. A gene could exhibit pleiotropy when single enzyme function affects multiple phenotypic traits; mutations of a shared gene could potentially affect only a single trait. Gene duplication followed by differential mutation is another phenomenon thought to be a key element in the evolution of protein function, but in gene sharing, there is no divergence of gene sequence when proteins take on new functions; the single polypeptide takes on new roles while retaining old ones. Alternative splicing can result in the production of multiple polypeptides (with multiple functions) from a single gene, but by definition, gene sharing involves multiple functions of a single polypeptide.

Clinical significance

The multiple roles of moonlighting proteins complicate the determination of phenotype from genotype, hampering the study of inherited metabolic disorders.

The complex phenotypes of several disorders are suspected to be caused by the involvement of moonlighting proteins. The protein GAPDH has at least 11 documented functions, one of which includes apoptosis. Excessive apoptosis is involved in many neurodegenerative diseases, such as Huntington's, Alzheimer's, and Parkinson's as well as in brain ischemia. In one case, GAPDH was found in the degenerated neurons of individuals who had Alzheimer's disease.

Although there is insufficient evidence for definite conclusions, there are well documented examples of moonlighting proteins that play a role in disease. One such disease is tuberculosis. One moonlighting protein in the bacterium M. tuberculosis has a function which counteracts the effects of antibiotics. Specifically, M. tuberculosis gains resistance to ciprofloxacin from overexpression of glutamate racemase in vivo. GAPDH localized to the surface of pathogenic mycobacteria has been shown to capture and traffic the mammalian iron carrier protein transferrin into cells, resulting in iron acquisition by the pathogen.

Molecular cloning

From Wikipedia, the free encyclopedia
 
Diagram of molecular cloning using bacteria and plasmids.

Molecular cloning is a set of experimental methods in molecular biology that are used to assemble recombinant DNA molecules and to direct their replication within host organisms. The use of the word cloning refers to the fact that the method involves the replication of one molecule to produce a population of cells with identical DNA molecules. Molecular cloning generally uses DNA sequences from two different organisms: the species that is the source of the DNA to be cloned, and the species that will serve as the living host for replication of the recombinant DNA. Molecular cloning methods are central to many contemporary areas of modern biology and medicine.

In a conventional molecular cloning experiment, the DNA to be cloned is obtained from an organism of interest, then treated with enzymes in the test tube to generate smaller DNA fragments. These fragments are then combined with vector DNA to generate recombinant DNA molecules. The recombinant DNA is then introduced into a host organism (typically an easy-to-grow, benign, laboratory strain of E. coli bacteria). This generates a population of organisms in which recombinant DNA molecules are replicated along with the host DNA. Because they contain foreign DNA fragments, these are transgenic or genetically modified microorganisms (GMO). This process takes advantage of the fact that a single bacterial cell can be induced to take up and replicate a single recombinant DNA molecule. This single cell can then be expanded exponentially to generate a large number of bacteria, each of which contains copies of the original recombinant molecule. Thus, both the resulting bacterial population and the recombinant DNA molecule are commonly referred to as "clones". Strictly speaking, recombinant DNA refers to DNA molecules, while molecular cloning refers to the experimental methods used to assemble them. The idea arose that different DNA sequences could be inserted into a plasmid and that these foreign sequences would be carried into bacteria and replicated as part of the plasmid. That is, these plasmids could serve as cloning vectors to carry genes.

Virtually any DNA sequence can be cloned and amplified, but some factors may limit the success of the process. Examples of DNA sequences that are difficult to clone are inverted repeats, origins of replication, centromeres and telomeres. Another characteristic that limits the chances of success is the large size of a DNA sequence. Inserts larger than 10 kbp have very limited success, but bacteriophages such as bacteriophage λ can be modified to successfully insert a sequence of up to 40 kbp.

History

Prior to the 1970s, our understanding of genetics and molecular biology was severely hampered by an inability to isolate and study individual genes from complex organisms. This changed dramatically with the advent of molecular cloning methods. Microbiologists, seeking to understand the molecular mechanisms through which bacteria restricted the growth of bacteriophage, isolated restriction endonucleases, enzymes that could cleave DNA molecules only when specific DNA sequences were encountered. They showed that restriction enzymes cleaved chromosome-length DNA molecules at specific locations, and that specific sections of the larger molecule could be purified by size fractionation. Using a second enzyme, DNA ligase, fragments generated by restriction enzymes could be joined in new combinations, termed recombinant DNA. By recombining DNA segments of interest with vector DNA, such as bacteriophage or plasmids, which naturally replicate inside bacteria, large quantities of purified recombinant DNA molecules could be produced in bacterial cultures. The first recombinant DNA molecules were generated and studied in 1972.

Overview

Molecular cloning takes advantage of the fact that the chemical structure of DNA is fundamentally the same in all living organisms. Therefore, if any segment of DNA from any organism is inserted into a DNA segment containing the molecular sequences required for DNA replication, and the resulting recombinant DNA is introduced into the organism from which the replication sequences were obtained, then the foreign DNA will be replicated along with the host cell's DNA in the transgenic organism. 

Molecular cloning is similar to polymerase chain reaction (PCR) in that it permits the replication of DNA sequence. The fundamental difference between the two methods is that molecular cloning involves replication of the DNA in a living microorganism, while PCR replicates DNA in an in vitro solution, free of living cells.

Steps

The overall goal of molecular cloning is to take a gene of interest from one plasmid and insert it into another plasmid. This is done by performing PCR, a restriction digest, a ligation reaction, and transformation.

In standard molecular cloning experiments, the cloning of any DNA fragment essentially involves seven steps: (1) Choice of host organism and cloning vector, (2) Preparation of vector DNA, (3) Preparation of DNA to be cloned, (4) Creation of recombinant DNA, (5) Introduction of recombinant DNA into host organism, (6) Selection of organisms containing recombinant DNA, (7) Screening for clones with desired DNA inserts and biological properties. 

Although the detailed planning of the cloning can be done in any text editor, together with online utilities for tasks such as PCR primer design, dedicated software exists for the purpose. Examples include ApE (open source), DNAStrider (open source), Serial Cloner (gratis) and Collagene (open source).

Notably, the growing capacity and fidelity of DNA synthesis platforms allow for increasingly intricate designs in molecular engineering. These projects may include very long strands of novel DNA sequence and/or test entire libraries simultaneously, as opposed to individual sequences. These shifts introduce complexity that requires design to move away from the flat nucleotide-based representation and towards a higher level of abstraction. Examples of such tools are GenoCAD, Teselagen (free for academia) and GeneticConstructor (free for academics).

Choice of host organism and cloning vector

Diagram of a commonly used cloning plasmid; pBR322. It's a circular piece of DNA 4361 bases long. Two antibiotic resistance genes are present, conferring resistance to ampicillin and tetracycline, and an origin of replication that the host uses to replicate the DNA.

Although a very large number of host organisms and molecular cloning vectors are in use, the great majority of molecular cloning experiments begin with a laboratory strain of the bacterium E. coli (Escherichia coli) and a plasmid cloning vector. E. coli and plasmid vectors are in common use because they are technically sophisticated, versatile, widely available, and offer rapid growth of recombinant organisms with minimal equipment. If the DNA to be cloned is exceptionally large (hundreds of thousands to millions of base pairs), then a bacterial artificial chromosome or yeast artificial chromosome vector is often chosen. 

Specialized applications may call for specialized host-vector systems. For example, if the experimentalists wish to harvest a particular protein from the recombinant organism, then an expression vector is chosen that contains appropriate signals for transcription and translation in the desired host organism. Alternatively, if replication of the DNA in different species is desired (for example, transfer of DNA from bacteria to plants), then a multiple host range vector (also termed shuttle vector) may be selected. In practice, however, specialized molecular cloning experiments usually begin with cloning into a bacterial plasmid, followed by subcloning into a specialized vector.

Whatever combination of host and vector is used, the vector almost always contains four DNA segments that are critically important to its function and experimental utility:
  1. a DNA replication origin, which is necessary for the vector (and its linked recombinant sequences) to replicate inside the host organism
  2. one or more unique restriction endonuclease recognition sites to serve as sites where foreign DNA may be introduced
  3. a selectable genetic marker gene that can be used to enable the survival of cells that have taken up vector sequences
  4. a tag gene that can be used to screen for cells containing the foreign DNA

Cleavage of a DNA sequence containing the BamHI restriction site. The DNA is cleaved at the palindromic sequence to produce 'sticky ends'.
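The cleavage described in the caption can be illustrated with a minimal Python sketch. It assumes the well-known BamHI recognition sequence GGATCC, cut after the first G on each strand; the function names and the toy input sequence are hypothetical and chosen only for illustration. Only the top strand is modelled.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence (palindromic on double-stranded DNA)
CUT_OFFSET = 1         # BamHI cuts between the first G and the following GATCC


def digest_linear(seq, site=BAMHI_SITE, offset=CUT_OFFSET):
    """Return the top-strand fragments of a linear sequence cut at every site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def sticky_overhang(site=BAMHI_SITE, offset=CUT_OFFSET):
    """Single-stranded 5' overhang left on each end by a staggered cut."""
    return site[offset:len(site) - offset]


# Toy sequence (an assumption, not taken from any real plasmid)
fragments = digest_linear("AAAGGATCCTTT")
overhang = sticky_overhang()
print(fragments, overhang)
```

Because the site reads the same on both strands, the two ends produced by a cut carry the same overhang and can pair with each other or with any other end cut by the same enzyme.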

Preparation of vector DNA

The cloning vector is treated with a restriction endonuclease to cleave the DNA at the site where foreign DNA will be inserted. The restriction enzyme is chosen to generate a configuration at the cleavage site that is compatible with the ends of the foreign DNA (see DNA end). Typically, this is done by cleaving the vector DNA and foreign DNA with the same restriction enzyme, for example EcoRI. Most modern vectors contain a variety of convenient cleavage sites that are unique within the vector molecule (so that the vector can only be cleaved at a single site) and are located within a gene (frequently beta-galactosidase) whose inactivation can be used to distinguish recombinant from non-recombinant organisms at a later step in the process. To improve the ratio of recombinant to non-recombinant organisms, the cleaved vector may be treated with an enzyme (alkaline phosphatase) that dephosphorylates the vector ends. Vector molecules with dephosphorylated ends cannot be rejoined to themselves by DNA ligase, so a circular molecule capable of replicating in the host can form only if foreign DNA, which retains its 5' phosphate groups, is integrated into the cleavage site.
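The requirement that a cloning site be unique within the vector can be checked computationally. The following is a rough sketch, assuming the standard recognition sequences for EcoRI, BamHI and HindIII; the plasmid sequence is a made-up toy example, and the helper names are hypothetical. Because a plasmid is circular, the search also has to count a site that spans the arbitrary start of the sequence.

```python
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_sites_circular(plasmid, site):
    """Count occurrences of a site in a circular sequence."""
    # Append the first len(site) - 1 bases so that a site spanning
    # the origin of the written sequence is also found.
    extended = plasmid + plasmid[:len(site) - 1]
    count = 0
    pos = extended.find(site)
    while pos != -1:
        count += 1
        pos = extended.find(site, pos + 1)
    return count


def unique_cutters(plasmid, sites=SITES):
    """Names of enzymes whose site occurs exactly once in the plasmid."""
    return sorted(name for name, site in sites.items()
                  if count_sites_circular(plasmid, site) == 1)


# Toy circular sequence (assumption): one EcoRI site spanning the origin,
# two BamHI sites, one HindIII site.
toy_plasmid = "TTCACGGATCCACGTGGATCCACAAGCTTACGAA"
print(unique_cutters(toy_plasmid))
```

In this toy sequence BamHI would cut twice and so would not be a suitable single-site cloning enzyme, whereas EcoRI and HindIII each cut once.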

Preparation of DNA to be cloned

DNA for cloning is most commonly produced using PCR. Template DNA is mixed with nucleotides (the building blocks of DNA), primers (short pieces of complementary single-stranded DNA) and a DNA polymerase enzyme that builds the DNA chain. The mix goes through cycles of heating and cooling to produce large quantities of copied DNA.
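As an illustration of how two primers define the copied region, the sketch below performs a simplified "in silico PCR" on a single written strand. It assumes perfect primer matches and idealized doubling in every cycle, which real reactions do not achieve; the template, primers and function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def in_silico_pcr(template, fwd_primer, rev_primer):
    """Return the amplified region of the top strand, or None if a primer fails to match."""
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand where its reverse complement occurs.
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def copies_after(cycles, starting_copies=1):
    """Idealized copy number, assuming perfect doubling in each cycle."""
    return starting_copies * 2 ** cycles


# Toy template and primers (assumptions for illustration only)
template = "TTTTATGCCGAAACCCGGGTTTACGTTAGTTTT"
amplicon = in_silico_pcr(template, "ATGCCG", "CTAACGT")
print(amplicon, copies_after(30))
```

The exponential term in `copies_after` is why a few dozen heating and cooling cycles are enough to produce large quantities of DNA from very little template.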

For cloning of genomic DNA, the DNA to be cloned is extracted from the organism of interest. Virtually any tissue source can be used (even tissues from extinct animals), as long as the DNA is not extensively degraded. The DNA is then purified using simple methods to remove contaminating proteins (extraction with phenol), RNA (ribonuclease) and smaller molecules (precipitation and/or chromatography). Polymerase chain reaction (PCR) methods are often used for amplification of specific DNA or RNA (RT-PCR) sequences prior to molecular cloning. 

DNA for cloning experiments may also be obtained from RNA using reverse transcriptase (complementary DNA or cDNA cloning), or in the form of synthetic DNA (artificial gene synthesis). cDNA cloning is usually used to obtain clones representative of the mRNA population of the cells of interest, while synthetic DNA is used to obtain any precise sequence defined by the designer.

The purified DNA is then treated with a restriction enzyme to generate fragments with ends capable of being linked to those of the vector. If necessary, short double-stranded segments of DNA (linkers) containing desired restriction sites may be added to create end structures that are compatible with the vector.

Creation of recombinant DNA with DNA ligase

The creation of recombinant DNA is in many ways the simplest step of the molecular cloning process. DNA prepared from the vector and DNA from the foreign source are simply mixed together at appropriate concentrations and exposed to an enzyme (DNA ligase) that covalently links the ends together. This joining reaction is often termed ligation. The resulting DNA mixture containing randomly joined ends is then ready for introduction into the host organism.

DNA ligase only recognizes and acts on the ends of linear DNA molecules, usually resulting in a complex mixture of DNA molecules with randomly joined ends. The desired products (vector DNA covalently linked to foreign DNA) will be present, but other sequences (e.g. foreign DNA linked to itself, vector DNA linked to itself and higher-order combinations of vector and foreign DNA) are also usually present. This complex mixture is sorted out in subsequent steps of the cloning process, after the DNA mixture is introduced into cells.
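The mixture of ligation products can be sketched by enumerating which fragment types end up joined together. This is a deliberately simplified model, assuming all ends are mutually compatible and ignoring orientation and concentration effects; the labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement


def ligation_products(fragment_types, max_pieces=2):
    """Enumerate unordered combinations of fragments joined into one molecule."""
    products = []
    for n in range(1, max_pieces + 1):
        products.extend(combinations_with_replacement(fragment_types, n))
    return products


def classify(product):
    """Label a product by how many vector and insert pieces it contains."""
    vectors = product.count("vector")
    inserts = product.count("insert")
    if vectors == 1 and inserts == 1:
        return "desired recombinant"
    if inserts == 0:
        return "vector only"
    if vectors == 0:
        return "insert only"
    return "higher-order combination"


products = ligation_products(["vector", "insert"])
for product in products:
    print(product, "->", classify(product))
```

Even in this two-piece model only one of the five product types is the desired recombinant, which is why the later selection and screening steps are needed.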

Introduction of recombinant DNA into host organism

The DNA mixture, previously manipulated in vitro, is moved back into a living cell, referred to as the host organism. The methods used to get DNA into cells are varied, and the name applied to this step in the molecular cloning process will often depend upon the experimental method that is chosen (e.g. transformation, transduction, transfection, electroporation).

When microorganisms are able to take up and replicate DNA from their local environment, the process is termed transformation, and cells that are in a physiological state such that they can take up DNA are said to be competent. In mammalian cell culture, the analogous process of introducing DNA into cells is commonly termed transfection. Both transformation and transfection usually require preparation of the cells through a special growth regime and chemical treatment process that will vary with the specific species and cell types that are used.

Electroporation uses high voltage electrical pulses to translocate DNA across the cell membrane (and cell wall, if present). In contrast, transduction involves the packaging of DNA into virus-derived particles, and using these virus-like particles to introduce the encapsulated DNA into the cell through a process resembling viral infection. Although electroporation and transduction are highly specialized methods, they may be the most efficient methods to move DNA into cells.

Selection of organisms containing vector sequences

Whichever method is used, the introduction of recombinant DNA into the chosen host organism is usually a low efficiency process; that is, only a small fraction of the cells will actually take up DNA. Experimental scientists deal with this issue through a step of artificial genetic selection, in which cells that have not taken up DNA are selectively killed, and only those cells that can actively replicate DNA containing the selectable marker gene encoded by the vector are able to survive.

When bacterial cells are used as host organisms, the selectable marker is usually a gene that confers resistance to an antibiotic that would otherwise kill the cells, typically ampicillin. Cells harboring the plasmid will survive when exposed to the antibiotic, while those that have failed to take up plasmid sequences will die. When mammalian cells (e.g. human or mouse cells) are used, the strategy is similar, except that the marker gene (typically the neo gene; the related kanMX cassette is used in yeast) confers resistance to the antibiotic Geneticin.

Screening for clones with desired DNA inserts and biological properties

Modern bacterial cloning vectors (e.g. pUC19 and later derivatives including the pGEM vectors) use the blue-white screening system to distinguish colonies (clones) of transgenic cells from those that contain the parental vector (i.e. vector DNA with no recombinant sequence inserted). In these vectors, foreign DNA is inserted into a sequence that encodes an essential part of beta-galactosidase, an enzyme whose activity results in formation of a blue-colored colony on the culture medium that is used for this work. Insertion of the foreign DNA into the beta-galactosidase coding sequence disables the function of the enzyme, so that colonies containing transformed DNA remain colorless (white). Therefore, experimentalists are easily able to identify and conduct further studies on transgenic bacterial clones, while ignoring those that do not contain recombinant DNA.
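The combined logic of antibiotic selection and blue-white screening can be summarized as a small decision function. This is only a schematic sketch of the outcomes described above; the function name and its boolean parameters are hypothetical.

```python
def colony_phenotype(has_plasmid, has_insert):
    """Expected outcome for a cell plated on antibiotic medium with a blue-white indicator."""
    if not has_plasmid:
        return "no growth"  # no resistance marker, so the antibiotic kills the cell
    if has_insert:
        return "white"      # insert disrupts the beta-galactosidase coding sequence
    return "blue"           # parental vector: beta-galactosidase is still functional


for has_plasmid, has_insert in [(False, False), (True, False), (True, True)]:
    print(has_plasmid, has_insert, "->", colony_phenotype(has_plasmid, has_insert))
```

Selection removes the first class entirely, and screening separates the remaining two by color.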

The total population of individual clones obtained in a molecular cloning experiment is often termed a DNA library. Libraries may be highly complex (as when cloning complete genomic DNA from an organism) or relatively simple (as when moving a previously cloned DNA fragment into a different plasmid), but it is almost always necessary to examine a number of different clones to be sure that the desired DNA construct is obtained. This may be accomplished through a very wide range of experimental methods, including the use of nucleic acid hybridizations, antibody probes, polymerase chain reaction, restriction fragment analysis and/or DNA sequencing.

Applications

Molecular cloning provides scientists with an essentially unlimited quantity of any individual DNA segment derived from any genome. This material can be used for a wide range of purposes, including those in both basic and applied biological science. A few of the more important applications are summarized here.

Genome organization and gene expression

Molecular cloning has led directly to the elucidation of the complete DNA sequence of the genomes of a very large number of species and to an exploration of genetic diversity within individual species, work that has been done mostly by determining the DNA sequence of large numbers of randomly cloned fragments of the genome, and assembling the overlapping sequences.
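The idea of assembling overlapping sequences from randomly cloned fragments can be sketched with a toy greedy algorithm that repeatedly merges the pair of reads with the longest suffix-prefix overlap. This is an assumption-laden illustration, not how production genome assemblers work: it ignores sequencing errors, repeats and reverse-complement reads, and the reads and function names are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b, or 0 if below min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise by largest overlap until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # remaining reads do not overlap; leave them as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy reads (assumption) sampled from the sequence ATGCGTACGTTAGC
contigs = greedy_assemble(["GTACGTTA", "GTTAGC", "ATGCGTAC"])
print(contigs)
```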

At the level of individual genes, molecular clones are used to generate probes that are used for examining how genes are expressed, and how that expression is related to other processes in biology, including the metabolic environment, extracellular signals, development, learning, senescence and cell death. Cloned genes can also provide tools to examine the biological function and importance of individual genes, by allowing investigators to inactivate the genes, or make more subtle mutations using regional mutagenesis or site-directed mutagenesis. Genes cloned into expression vectors for functional cloning provide a means to screen for genes on the basis of the expressed protein's function.

Production of recombinant proteins

Obtaining the molecular clone of a gene can lead to the development of organisms that produce the protein product of the cloned genes, termed a recombinant protein. In practice, it is frequently more difficult to develop an organism that produces an active form of the recombinant protein in desirable quantities than it is to clone the gene. This is because the molecular signals for gene expression are complex and variable, and because protein folding, stability and transport can be very challenging. 

Many useful proteins are currently available as recombinant products. These include: (1) medically useful proteins whose administration can correct a defective or poorly expressed gene (e.g. recombinant factor VIII, a blood-clotting factor deficient in some forms of hemophilia, and recombinant insulin, used to treat some forms of diabetes), (2) proteins that can be administered to assist in a life-threatening emergency (e.g. tissue plasminogen activator, used to treat strokes), (3) recombinant subunit vaccines, in which a purified protein can be used to immunize patients against infectious diseases, without exposing them to the infectious agent itself (e.g. hepatitis B vaccine), and (4) recombinant proteins as standard material for diagnostic laboratory tests.

Transgenic organisms

Once characterized and manipulated to provide signals for appropriate expression, cloned genes may be inserted into organisms, generating transgenic organisms, also termed genetically modified organisms (GMOs). Although most GMOs are generated for purposes of basic biological research, a number of GMOs have been developed for commercial use, ranging from animals and plants that produce pharmaceuticals or other compounds (pharming) to herbicide-resistant crop plants and fluorescent tropical fish (GloFish) for home entertainment.

Gene therapy

Gene therapy involves supplying a functional gene to cells lacking that function, with the aim of correcting a genetic disorder or acquired disease. Gene therapy can be broadly divided into two categories. The first is alteration of germ cells, that is, sperm or eggs, which results in a permanent genetic change for the whole organism and subsequent generations. This “germ line gene therapy” is considered by many to be unethical in human beings. The second type of gene therapy, “somatic cell gene therapy”, is analogous to an organ transplant. In this case, one or more specific tissues are targeted by direct treatment or by removal of the tissue, addition of the therapeutic gene or genes in the laboratory, and return of the treated cells to the patient. Clinical trials of somatic cell gene therapy began in the late 1990s, mostly for the treatment of cancers and blood, liver, and lung disorders.

Despite a great deal of publicity and promises, the history of human gene therapy has been characterized by relatively limited success. The effect of introducing a gene into cells often promotes only partial and/or transient relief from the symptoms of the disease being treated. Some gene therapy trial patients have suffered adverse consequences of the treatment itself, including deaths. In some cases, the adverse effects result from disruption of essential genes within the patient's genome by insertional inactivation. In others, viral vectors used for gene therapy have been contaminated with infectious virus. Nevertheless, gene therapy is still held to be a promising future area of medicine, and is an area where there is a significant level of research and development activity.

Operator (computer programming)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Operator_(computer_programmin...