Genetic engineering can be accomplished using multiple techniques. There are a number of steps that are followed before a genetically modified organism (GMO) is created. Genetic engineers must first choose what gene they wish to insert, modify, or delete. The gene must then be isolated and incorporated, along with other genetic elements, into a suitable vector. This vector is then used to insert the gene into the host genome, creating a transgenic or edited organism. The ability to genetically engineer organisms is built on years of research and discovery on how genes function and how we can manipulate them. Important advances included the discovery of restriction enzymes and DNA ligases and the development of polymerase chain reaction and sequencing.
This allowed the gene of interest to be isolated and then incorporated into a vector. Often a promoter and terminator region were added as well as a selectable marker gene. The gene may be modified further at this point to make it express more efficiently. This vector is then inserted into the host organism's genome. For animals, the gene is typically inserted into embryonic stem cells, while in plants it can be inserted into any tissue that can be cultured into a fully developed plant. Common techniques include microinjection, virus-mediated and Agrobacterium-mediated transfer, and biolistics. Further tests are carried out on the resulting organism to ensure stable integration, inheritance and expression. First generation offspring are heterozygous, requiring them to be inbred to create the homozygous pattern necessary for stable inheritance. Homozygosity must be confirmed in second generation specimens.
Traditional techniques inserted the genes randomly into the host's genome. Advances have allowed genes to be inserted at specific locations within a genome, which reduces the unintended side effects of random insertion. Early targeting systems relied on meganucleases and zinc finger nucleases. Since 2009, systems that are more accurate and easier to implement have been developed. Transcription activator-like effector nucleases (TALENs) and the Cas9-guideRNA system (adapted from CRISPR) are the two most commonly used. They may potentially be useful in gene therapy and other procedures that require accurate or high throughput targeting.
History
Many different discoveries and advancements led to the development of genetic engineering. Human-directed genetic manipulation began with the domestication of plants and animals through artificial selection in about 12,000 BC. Various techniques were developed to aid in breeding and selection. Hybridization was one way rapid changes in an organism's genetic makeup could be introduced. Crop hybridization most likely first occurred when humans began growing genetically distinct individuals of related species in close proximity. Some plants could be propagated by vegetative cloning.
Genetic inheritance was first discovered by Gregor Mendel in 1865, following experiments crossing peas. In 1928 Frederick Griffith proved the existence of a "transforming principle" involved in inheritance, which was identified as DNA in 1944 by Oswald Avery, Colin MacLeod, and Maclyn McCarty. Frederick Sanger developed a method for sequencing DNA in 1977, greatly increasing the genetic information available to researchers.
Once the existence and properties of DNA had been discovered, tools had to be developed that allowed it to be manipulated. In 1970 Hamilton Smith's lab discovered restriction enzymes, enabling scientists to isolate genes from an organism's genome. DNA ligases, which join broken DNA together, were discovered earlier in 1967. By combining the two enzymes it became possible to "cut and paste" DNA sequences to create recombinant DNA. Plasmids, discovered in 1952, became important tools for transferring information between cells and replicating DNA sequences. Polymerase chain reaction (PCR), developed by Kary Mullis in 1983, allowed small sections of DNA to be amplified (replicated) and aided identification and isolation of genetic material.
As well as manipulating DNA, techniques had to be developed for its insertion into an organism's genome. Griffith's experiment had already shown that some bacteria had the ability to naturally take up and express foreign DNA. Artificial competence was induced in Escherichia coli in 1970 by treating them with calcium chloride solution (CaCl2). Transformation using electroporation was developed in the late 1980s, increasing the efficiency and bacterial range. In 1907 a bacterium that caused plant tumors, Agrobacterium tumefaciens, had been discovered. In the early 1970s it was found that this bacterium inserted its DNA into plants using a Ti plasmid. By removing the genes in the plasmid that caused the tumor and adding in novel genes, researchers were able to infect plants with A. tumefaciens and let the bacteria insert their chosen DNA into the genomes of the plants.
Choosing target genes
The first step is to identify the target gene or genes to insert into the host organism. This is driven by the goal for the resultant organism. In some cases only one or two genes are affected. For more complex objectives, entire biosynthetic pathways involving multiple genes may be required. Once found, genes and other genetic information from a wide range of organisms can be inserted into bacteria for storage and modification, creating genetically modified bacteria in the process. Bacteria are cheap, easy to grow, clonal, multiply quickly, relatively easy to transform and can be stored at -80 °C almost indefinitely. Once a gene is isolated it can be stored inside the bacteria, providing an unlimited supply for research.
Genetic screens can be carried out to determine potential genes, followed by other tests to identify the best candidates. A simple screen involves randomly mutating DNA with chemicals or radiation and then selecting those organisms that display the desired trait. For organisms where mutation is not practical, scientists instead look for individuals among the population who present the characteristic through naturally occurring mutations. Processes that look at a phenotype and then try to identify the gene responsible are called forward genetics. The gene then needs to be mapped by comparing the inheritance of the phenotype with known genetic markers. Genes that are close together are likely to be inherited together.
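The mapping step can be illustrated with a small calculation. The following is a minimal sketch, not a standard tool: it assumes offspring have already been scored as parental or recombinant with respect to each marker, and the function names and counts are hypothetical. The marker with the lowest recombination frequency is the one most likely to lie close to the gene.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of offspring in which marker and phenotype were separated."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring counted")
    return recombinant / total


def closest_marker(counts: dict[str, tuple[int, int]]) -> str:
    """Return the marker that is most often co-inherited with the phenotype."""
    return min(counts, key=lambda m: recombination_frequency(*counts[m]))


# Hypothetical offspring counts: (parental, recombinant) for each marker.
example_counts = {
    "marker_A": (60, 40),
    "marker_B": (92, 8),
    "marker_C": (75, 25),
}

best = closest_marker(example_counts)
print(best, recombination_frequency(*example_counts[best]))
```

In this made-up data set marker_B is separated from the phenotype in only 8 of 100 offspring, so it would be the best starting point for narrowing down the gene's position.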
Another option is reverse genetics. This approach involves targeting a specific gene with a mutation and then observing what phenotype develops. The mutation can be designed to inactivate the gene or allow it to become active only under certain conditions. Conditional mutations are useful for identifying genes that are normally lethal if non-functional. As genes with similar functions share similar sequences (homologous) it is possible to predict the likely function of a gene by comparing its sequence to that of well-studied genes from model organisms. The development of microarrays, transcriptomes and genome sequencing has made it much easier to find desirable genes.
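Sequence comparison of this kind can be sketched very simply. The example below assumes the two sequences have already been aligned to the same length and just counts matching positions; real homology searches use alignment algorithms and scoring matrices, which are not shown. The sequences and the function name are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("empty sequences")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical fragments: an uncharacterised gene and a well-studied gene.
query = "ATGGCCATTGTAATGGGCCGC"
known = "ATGGCCATTGTTATGGGCCGA"

print(f"{percent_identity(query, known):.1f}% identical")
```

A high identity to a gene of known function is only a hint about the likely function, which still has to be tested experimentally.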
The bacterium Bacillus thuringiensis was first discovered in 1901 as the causative agent in the death of silkworms. Due to these insecticidal properties, the bacterium was used as a biological insecticide, developed commercially in 1938. The cry proteins were discovered to provide the insecticidal activity in 1956, and by the 1980s, scientists had successfully cloned the gene that encodes this protein and expressed it in plants. The gene that provides resistance to the herbicide glyphosate was found after seven years of searching in bacteria living in the outflow pipe of a Monsanto RoundUp manufacturing facility. In animals, the majority of genes used are growth hormone genes.
Gene manipulation
All
genetic engineering processes involve the modification of DNA.
Traditionally DNA was isolated from the cells of organisms. Later, genes
came to be cloned from a DNA segment after the creation of a DNA library or artificially synthesised.
Once isolated, additional genetic elements are added to the gene to
allow it to be expressed in the host organism and to aid selection.
Extraction from cells
First the cell must be gently opened,
exposing the DNA without causing too much damage to it. The methods
used vary depending on the type of cell. Once open, the DNA must be
separated from the other cellular components. A ruptured cell contains
proteins and other cell debris. By mixing with phenol and/or chloroform, followed by centrifuging, the nucleic acids can be separated from this debris into an upper aqueous phase.
This aqueous phase can be removed and further purified if necessary by
repeating the phenol-chloroform steps. The nucleic acids can then be precipitated from the aqueous solution using ethanol or isopropanol. Any RNA can be removed by adding a ribonuclease that will degrade it. Many companies now sell kits that simplify the process.
Gene isolation
The
gene of interest must be separated from the extracted DNA. If the
sequence is not known then a common method is to break the DNA up with a
random digestion method. This is usually accomplished using restriction enzymes (enzymes that cut DNA). A partial restriction digest
cuts only some of the restriction sites, resulting in overlapping DNA fragments. The DNA fragments are put into individual plasmid vectors and grown inside bacteria. Once in the bacteria the plasmid is copied as the bacteria divide. To determine if a useful gene is present on a particular fragment the DNA library is screened for the desired phenotype. If the phenotype is detected then it is possible that the bacteria contain the target gene.
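The difference between a complete and a partial digest can be shown with a short sketch. It uses the EcoRI recognition sequence GAATTC, which is cut after the first G, and treats the DNA as a single strand for simplicity. The example sequence and the function names are hypothetical.

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1   # EcoRI cuts after the first G (G^AATTC)


def cut_positions(dna: str, site: str = ECORI_SITE,
                  offset: int = ECORI_CUT_OFFSET) -> list[int]:
    """Return the index of every cut the enzyme could make on this strand."""
    pattern = f"(?={re.escape(site)})"  # lookahead also finds overlapping sites
    return [m.start() + offset for m in re.finditer(pattern, dna)]


def digest(dna: str, cuts: list[int]) -> list[str]:
    """Cut the sequence at the given positions and return the fragments in order."""
    bounds = [0, *sorted(cuts), len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical 21-base sequence containing two EcoRI sites.
dna = "AAAGAATTCCCCGAATTCTTT"
sites = cut_positions(dna)

print(digest(dna, sites))      # complete digest: every site is cut
print(digest(dna, sites[:1]))  # partial digest: only the first site is cut
print(digest(dna, sites[1:]))  # partial digest: only the second site is cut
```

Because different molecules are cut at different subsets of sites, the fragments from the two partial digests overlap, which is what allows neighbouring fragments in a library to be related to one another.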
If the gene does not have a detectable phenotype or a DNA library does not contain the correct gene, other methods must be used to isolate it. If the position of the gene can be determined using molecular markers then chromosome walking is one way to isolate the correct DNA fragment. If the gene shows close homology to a known gene in another species, then it could be isolated by searching for genes in the library that closely match the known gene.
For known DNA sequences, restriction enzymes that cut the DNA on either side of the gene can be used. Gel electrophoresis then sorts the fragments according to length. Some gels can separate sequences that differ by a single base-pair. The DNA can be visualised by staining it with ethidium bromide and photographing under UV light. A marker
with fragments of known lengths can be laid alongside the DNA to
estimate the size of each band. The DNA band at the correct size should
contain the gene, and can be excised from the gel. Another technique to isolate genes of known sequences involves polymerase chain reaction (PCR).
PCR is a powerful tool that can amplify a given sequence, which can
then be isolated through gel electrophoresis. Its effectiveness drops
with larger genes and it has the potential to introduce errors into the
sequence.
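How PCR picks out a known sequence can be sketched in a few lines. This is an idealised illustration, assuming exact primer matches on a single template strand and perfect doubling in every cycle. The template, the primers and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the region bounded by the two primers, or None if either is absent."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is what appears on the template strand.
    downstream_site = reverse_complement(reverse_primer)
    end = template.find(downstream_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(downstream_site)]


def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Idealised copy number; efficiency 1.0 means perfect doubling per cycle."""
    return starting_copies * (1 + efficiency) ** cycles


# Hypothetical template and primers.
template = "GGGGATGCATGCAAAACCCCTTGACTGAGGGG"
product = amplicon(template, "ATGCATGC", "TCAGTCAA")
print(product, len(product))
print(f"{copies_after(30):.0f} copies after 30 ideal cycles")
```

Real reactions fall short of perfect doubling, and the sketch does not model the polymerase errors mentioned above.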
It is possible to artificially synthesise genes. Some synthetic sequences are available commercially, forgoing many of these early steps.
Modification
The
gene to be inserted must be combined with other genetic elements in
order for it to work properly. The gene can be modified at this stage
for better expression or effectiveness. As well as the gene to be
inserted, most constructs contain a promoter and terminator region as well as a selectable marker gene. The promoter region initiates transcription
of the gene and can be used to control the location and level of gene
expression, while the terminator region ends transcription. A selectable
marker, which in most cases confers antibiotic resistance
to the organism it is expressed in, is used to determine which cells
are transformed with the new gene. The constructs are made using recombinant DNA techniques, such as restriction digests, ligations and molecular cloning.
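The layout of such a construct can be modelled as an ordered list of elements. The sketch below is a deliberate simplification: it checks only that a promoter, gene, terminator and selectable marker are present and that the first three appear in that order, whereas in practice a marker gene usually carries its own regulatory elements. All names and the short sequences are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    role: str       # "promoter", "gene", "terminator" or "selectable_marker"
    sequence: str


def validate_construct(elements: list[Element]) -> None:
    """Raise ValueError if a required element is missing or out of order."""
    roles = [e.role for e in elements]
    for needed in ("promoter", "gene", "terminator", "selectable_marker"):
        if needed not in roles:
            raise ValueError(f"construct is missing a {needed}")
    if not roles.index("promoter") < roles.index("gene") < roles.index("terminator"):
        raise ValueError("expected promoter, then gene, then terminator")


def assemble(elements: list[Element]) -> str:
    """Validate the construct and join the element sequences in order."""
    validate_construct(elements)
    return "".join(e.sequence for e in elements)


# Placeholder sequences, far shorter than any real element.
construct = [
    Element("example promoter", "promoter", "TATAAT"),
    Element("gene of interest", "gene", "ATGGCTTAA"),
    Element("example terminator", "terminator", "AATAAA"),
    Element("resistance marker", "selectable_marker", "ATGCCCTGA"),
]

print(assemble(construct))
```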
Inserting DNA into the host genome
Once the gene is constructed it must be stably integrated into the target organism's genome or exist as extrachromosomal DNA.
There are a number of techniques available for inserting the gene into
the host genome and they vary depending on the type of organism
targeted. In multicellular eukaryotes, if the transgene is incorporated into the host's germline cells, the resulting host cell can pass the transgene to its progeny. If the transgene is incorporated into somatic cells, the transgene cannot be inherited.
Transformation
Transformation is the direct alteration of a cell's genetic components by passing the genetic material through the cell membrane. About 1% of bacteria are naturally able to take up foreign DNA, but this ability can be induced in other bacteria. Stressing the bacteria with a heat shock or electroporation can make the cell membrane
permeable to DNA that may then be incorporated into the genome or exist
as extrachromosomal DNA. Typically the cells are incubated in a
solution containing divalent cations (often calcium chloride)
under cold conditions, before being exposed to a heat pulse (heat
shock). Calcium chloride partially disrupts the cell membrane, which
allows the recombinant DNA to enter the host cell. It is suggested that
exposing the cells to divalent cations under cold conditions may change or
weaken the cell surface structure, making it more permeable to DNA. The
heat-pulse is thought to create a thermal imbalance across the cell
membrane, which forces the DNA to enter the cells through either cell
pores or the damaged cell wall. Electroporation is another method of promoting competence. In this method the cells are briefly shocked with an electric field of 10-20 kV/cm,
which is thought to create holes in the cell membrane through which the
plasmid DNA may enter. After the electric shock, the holes are rapidly
closed by the cell's membrane-repair mechanisms. DNA that has been taken up can
either integrate with the bacterial genome or, more commonly, exist as extrachromosomal DNA.
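The field strength quoted for electroporation is the applied voltage divided by the gap between the electrodes. The following sketch shows the arithmetic. The voltage and cuvette gap values are hypothetical settings chosen for illustration, and the function names are not from any instrument's software.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Electric field strength for a given voltage and electrode gap."""
    if gap_cm <= 0:
        raise ValueError("electrode gap must be positive")
    return voltage_kv / gap_cm


def within_range(field: float, low: float = 10.0, high: float = 20.0) -> bool:
    """Check a field strength against the 10-20 kV/cm range given in the text."""
    return low <= field <= high


# Hypothetical settings: the same voltage applied across two cuvette gaps.
for voltage_kv, gap_cm in [(2.5, 0.2), (2.5, 0.1)]:
    field = field_strength_kv_per_cm(voltage_kv, gap_cm)
    print(f"{voltage_kv} kV over {gap_cm} cm = {field:.1f} kV/cm, "
          f"in range: {within_range(field)}")
```

Halving the gap doubles the field, so the same voltage can fall inside or outside the quoted range depending on the cuvette used.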
In plants the DNA is often inserted using Agrobacterium-mediated recombination, taking advantage of the Agrobacterium's T-DNA sequence that allows natural insertion of genetic material into plant cells. Plant tissue is cut into small pieces and soaked in a fluid containing suspended Agrobacterium. The bacteria will attach to many of the plant cells exposed by the cuts. The bacteria use conjugation to transfer a DNA segment called T-DNA from their plasmid into the plant. The transferred DNA is piloted to the plant cell nucleus and integrated into the host plant's genomic DNA. The plasmid T-DNA is integrated semi-randomly into the genome of the host cell.
By modifying the plasmid to express the gene of interest, researchers can insert their chosen gene stably into the plant's genome.
The only essential parts of the T-DNA are its two small (25 base pair)
border repeats, at least one of which is needed for plant
transformation. The genes to be introduced into the plant are cloned into a plant transformation vector that contains the T-DNA region of the plasmid. An alternative method is agroinfiltration.
Another method used to transform plant cells is biolistics, where particles of gold or tungsten are coated with DNA and then shot into young plant cells or plant embryos. Some genetic material enters the cells and transforms them. This method can be used on plants that are not susceptible to Agrobacterium infection and also allows transformation of plant plastids.
Plant cells can also be transformed using electroporation, which uses
an electric shock to make the cell membrane permeable to plasmid DNA.
Due to the damage caused to the cells and DNA the transformation
efficiency of biolistics and electroporation is lower than agrobacterial
transformation.
Transfection
Transformation has a different meaning
in relation to animals, indicating progression to a cancerous state, so
the process used to insert foreign DNA into animal cells is usually
called transfection. There are many ways to directly introduce DNA into animal cells in vitro. Often these cells are stem cells that are used for gene therapy. Chemical-based methods use natural or synthetic compounds to form particles that facilitate the transfer of genes into cells. These synthetic vectors have the ability to bind DNA and accommodate large genetic transfers. One of the simplest methods involves using calcium phosphate to bind the DNA and then exposing it to cultured cells. The solution, along with the DNA, is encapsulated by the cells and a small amount of DNA can be integrated into the genome. Liposomes and polymers can be used as vectors to deliver DNA into cultured animal cells. Positively charged liposomes bind with DNA, while polymers can be designed that interact with DNA. They form lipoplexes and polyplexes respectively, which are then taken up by the cells. Other techniques include using electroporation and biolistics.
To create transgenic animals the DNA must be inserted into viable embryos or eggs. This is usually accomplished using microinjection, where DNA is injected through the cell's nuclear envelope directly into the nucleus. Superovulated fertilised eggs are collected at the single cell stage and cultured in vitro. When the pronuclei from the sperm head and egg are visible through the protoplasm the genetic material is injected into one of them. The oocyte is then implanted in the oviduct of a pseudopregnant animal. Another method is embryonic stem cell-mediated gene transfer. The gene is transfected into embryonic stem cells, which are then inserted into mouse blastocysts that are implanted into foster mothers. The resulting offspring are chimeric, and further mating can produce mice fully transgenic with the gene of interest.
Transduction
Transduction is the process by which foreign DNA is introduced into a cell by a virus or viral vector. Genetically modified viruses can be used as viral vectors to transfer target genes to another organism in gene therapy.
First the virulent genes are removed from the virus and the target
genes are inserted instead. The sequences that allow the virus to insert
the genes into the host organism must be left intact. Popular virus
vectors are developed from retroviruses or adenoviruses. Other viruses used as vectors include lentiviruses, pox viruses and herpes viruses. The type of virus used will depend on the cells targeted and whether the DNA is to be altered permanently or temporarily.
Regeneration
As often only a single cell is transformed with genetic material, the organism must be regenerated from that single cell. In plants this is accomplished through the use of tissue culture.
Each plant species has different requirements for successful
regeneration. If successful, the technique produces an adult plant that
contains the transgene in every cell. In animals it is necessary to ensure that the inserted DNA is present in the embryonic stem cells. Offspring can be screened for the gene. All offspring from the first generation are heterozygous for the inserted gene and must be inbred to produce a homozygous specimen. Bacteria consist of a single cell and reproduce clonally so regeneration is not necessary. Selectable markers are used to easily differentiate transformed from untransformed cells.
Cells that have been successfully transformed with the DNA
contain the marker gene, while those not transformed will not. By
growing the cells in the presence of an antibiotic or chemical that selects
or marks the cells expressing that gene, it is possible to separate
modified from unmodified cells. Another screening method involves a DNA probe
that sticks only to the inserted gene. These markers are usually
present in the transgenic organism, although a number of strategies have
been developed that can remove the selectable marker from the mature
transgenic plant.
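The need to inbreed heterozygous first-generation offspring, mentioned earlier in this section, can be illustrated with a simple cross. This is a minimal sketch that assumes a single insertion inherited in a Mendelian fashion. The allele symbols "T" (transgene present) and "t" (no insert) and the function name are hypothetical.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[str, float]:
    """Expected genotype fractions among offspring of two diploid parents.

    Each parent is a two-character string of alleles, e.g. "Tt".
    """
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


# Mating two heterozygous first-generation animals or plants.
print(cross("Tt", "Tt"))
```

Under these assumptions only about a quarter of the second generation is expected to be homozygous for the insert, which is why the offspring have to be screened rather than assumed to carry two copies.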
Confirmation
Finding
that a recombinant organism contains the inserted genes is not usually
sufficient to ensure that they will be appropriately expressed in the
intended tissues. Further testing using PCR, Southern hybridization, and DNA sequencing is conducted to confirm that an organism contains the new gene.
These tests can also confirm the chromosomal location and copy number
of the inserted gene. Once the gene's presence is confirmed, methods that look for and measure
the gene products (RNA and protein) are also used to assess gene
expression, transcription, RNA processing patterns and expression and
localization of protein product(s). These include northern hybridisation, quantitative RT-PCR, Western blot, immunofluorescence, ELISA and phenotypic analysis. When appropriate, the organism's offspring are studied to confirm that the transgene and associated phenotype are stably inherited.
Gene targeting
Traditional methods of genetic engineering generally insert the new
genetic material randomly within the host genome. This can impair or
alter other genes within the organism. Methods were developed that
inserted the new genetic material into specific sites within an organism's genome. Early methods that targeted genes at certain sites within a genome relied on homologous recombination (HR).
By creating DNA constructs that contain a template that matches the
targeted genome sequence, it is possible that the HR processes within
the cell will insert the construct at the desired location. Using this
method on embryonic stem cells led to the development of transgenic mice with targeted genes knocked out. It has also been possible to knock in genes or alter gene expression patterns.
If a vital gene is knocked out it can prove lethal to the organism. In order to study the function of these genes, site specific recombinases (SSR) were used. The two most common types are the Cre-LoxP and Flp-FRT systems. Cre recombinase
is an enzyme that removes DNA by homologous recombination between
binding sequences known as LoxP sites. The Flp-FRT system operates in a
similar way, with the Flp recombinase recognizing FRT sequences. By
crossing an organism containing the recombinase sites flanking the gene
of interest with an organism that expresses the SSR under control of tissue specific promoters,
it is possible to knock out or switch on genes only in certain cells.
This has also been used to remove marker genes from transgenic animals.
Further modifications of these systems allowed researchers to induce
recombination only under certain conditions, allowing genes to be
knocked out or expressed at desired times or stages of development.
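The tissue-specific knockout described above can be sketched as a string operation. The example assumes two LoxP sites in the same orientation flanking the gene, uses the commonly cited 34 bp LoxP sequence, and models only the outcome of excision (the flanked DNA is removed and one LoxP site remains). The allele, the tissue names and the function names are hypothetical.

```python
# Commonly cited 34 bp LoxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(dna: str, site: str = LOXP) -> str:
    """Remove the DNA between the first two sites, leaving a single site behind."""
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna  # a single site cannot be recombined in this simple model
    return dna[:first + len(site)] + dna[second + len(site):]


def allele_in_tissue(floxed_allele: str, tissue: str, cre_tissues: set[str]) -> str:
    """Excision happens only where the tissue-specific promoter drives Cre."""
    return cre_excise(floxed_allele) if tissue in cre_tissues else floxed_allele


# Hypothetical allele: a short stand-in "gene" flanked by LoxP sites.
floxed = "GGG" + LOXP + "ATGAAATGA" + LOXP + "CCC"

for tissue in ("liver", "brain"):
    result = allele_in_tissue(floxed, tissue, cre_tissues={"liver"})
    print(tissue, "gene present:", "ATGAAATGA" in result)
```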
Genome editing uses artificially engineered nucleases that create specific double-stranded breaks at desired locations in the genome. The breaks are subject to cellular DNA repair
processes that can be exploited for targeted gene knock-out, correction
or insertion at high frequencies. If a donor DNA containing the
appropriate sequence (homologies) is present, then new genetic material
containing the transgene will be integrated at the targeted site with
high efficiency by homologous recombination. There are four families of engineered nucleases: meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the CRISPR/Cas system (clustered regularly interspaced short palindromic repeat/CRISPR-associated protein, e.g. CRISPR/Cas9). Among the four types, TALENs and CRISPR/Cas are the two most commonly used.
Recent advances have looked at combining multiple systems to exploit
the best features of both (e.g. megaTAL that are a fusion of a TALE DNA
binding domain and a meganuclease).
Recent research has also focused on developing strategies to create
gene knock-out or corrections without creating double stranded breaks
(base editors).
Meganucleases and Zinc finger nucleases
Meganucleases were first used in 1988 in mammalian cells. Meganucleases are endodeoxyribonucleases
that function as restriction enzymes with long recognition sites,
making them more specific to their target site than other restriction enzymes.
This specificity reduces their toxicity, as they
will not target as many sites within a genome. The most studied
meganucleases are the LAGLIDADG family.
While meganucleases are still quite susceptible to off-target binding,
which makes them less attractive than other gene editing tools, their
smaller size still makes them attractive, particularly for delivery by
viral vectors.
Zinc-finger nucleases (ZFNs), used for the first time in 1996,
are typically created through the fusion of zinc-finger domains and the FokI nuclease domain. ZFNs thus have the ability to cleave DNA at target sites.
By engineering the zinc finger domain to target a specific site within
the genome, it is possible to edit the genomic sequence at the desired location.
ZFNs have a greater specificity, but still hold the potential to bind
to non-specific sequences. While a certain amount of off-target
cleavage is acceptable for creating transgenic model organisms, they
might not be optimal for all human gene therapy treatments.
TALEN and CRISPR
Access
to the code governing the DNA recognition by transcription
activator-like effectors (TALE) in 2009 opened the way to the
development of a new class of efficient TAL-based gene editing tools.
TALEs, proteins secreted by the Xanthomonas plant pathogen, bind with
great specificity to genes within the plant host and initiate transcription of the genes helping infection. Engineering TALEs by fusing the DNA binding core to the FokI nuclease catalytic domain allowed creation of a new class of designer nucleases, the TALE nucleases (TALENs).
They have one of the greatest specificities of all the current
engineered nucleases. Due to the presence of repeat sequences, they are
difficult to construct through standard molecular biology procedures and
rely on more complicated methods such as Golden Gate cloning.
In 2011, another major breakthrough technology was developed
based on CRISPR/Cas (clustered regularly interspaced short palindromic
repeat / CRISPR associated protein) systems that function as an adaptive
immune system in bacteria and archaea.
The CRISPR/Cas system allows bacteria and archaea to fight against
invading viruses by cleaving viral DNA and inserting pieces of that DNA
into their own genome. The organism then transcribes this DNA into RNA and combines this RNA with Cas9
proteins to make double-stranded breaks in the invading viral DNA. The
RNA serves as a guide RNA to direct the Cas9 enzyme to the correct spot
in the virus DNA. By pairing Cas proteins with a designed guide RNA,
CRISPR/Cas9 can be used to induce double-stranded breaks at specific
points within DNA sequences. The break gets repaired by cellular DNA
repair enzymes, creating a small insertion/deletion type mutation in
most cases. Targeted DNA repair is possible by providing a donor DNA
template that represents the desired change and that is (sometimes) used
for double-strand break repair by homologous recombination. It was
later demonstrated that CRISPR/Cas9 can edit human cells in a dish.
Although the early generation lacks the specificity of TALEN, the major
advantage of this technology is the simplicity of the design. It also
allows multiple sites to be targeted simultaneously, allowing the
editing of multiple genes at once. CRISPR/Cpf1
is a more recently discovered system that, compared with CRISPR/Cas9,
requires a different guide RNA and creates double-stranded breaks that
leave overhangs when cleaving the DNA.
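Designing a guide RNA starts with finding candidate target sites. The sketch below scans one strand of a sequence for 20-nucleotide stretches followed by an NGG motif, which is the protospacer adjacent motif (PAM) required by the widely used Streptococcus pyogenes Cas9, and reports the expected cut position three bases upstream of the PAM. It ignores the opposite strand and off-target scoring. The example sequence and the function name are hypothetical.

```python
import re


def find_cas9_targets(dna: str, guide_length: int = 20):
    """List (start, protospacer, pam, cut_index) for NGG PAMs on the given strand."""
    dna = dna.upper()
    targets = []
    # The lookahead lets overlapping PAM candidates be found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        start = pam_start - guide_length
        if start < 0:
            continue  # not enough sequence upstream for a full protospacer
        protospacer = dna[start:pam_start]
        cut_index = pam_start - 3  # the break falls between cut_index-1 and cut_index
        targets.append((start, protospacer, match.group(1), cut_index))
    return targets


# Hypothetical sequence with a single usable site.
sequence = "TT" + "GATTACAGATTACAGATTAC" + "AGG" + "TTT"
for start, protospacer, pam, cut_index in find_cas9_targets(sequence):
    print(start, protospacer, pam, cut_index)
```

Targeting several genes at once, as described above, amounts to supplying several guide sequences chosen in this way.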
CRISPR/Cas9 is efficient at gene disruption. The creation of
HIV-resistant babies by Chinese researcher He Jiankui is perhaps the
most famous example of gene disruption using this method.
It is far less effective at gene correction. Methods of base editing
are under development in which a “nuclease-dead” Cas9 endonuclease or a
related enzyme is used for gene targeting while a linked deaminase
enzyme makes a targeted base change in the DNA.
The most recent refinement of CRISPR-Cas9 is called Prime Editing. This
method links a reverse transcriptase to an RNA-guided engineered
nuclease that only makes single-strand cuts but no double-strand breaks.
It replaces the portion of DNA next to the cut by the successive action
of nuclease and reverse transcriptase, introducing the desired change
from an RNA template.
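The base-editing idea described above can be sketched for the case of a cytosine deaminase, which converts C to T. The editing window used here (positions 4 to 8 of the targeted 20-nucleotide stretch) is an assumption chosen for illustration, as real editors differ in their windows and efficiencies. The sequence and the function name are hypothetical.

```python
def cytosine_base_edit(protospacer: str, window: tuple[int, int] = (4, 8)) -> str:
    """Convert every C inside the 1-based, inclusive window to T."""
    low, high = window
    return "".join(
        "T" if base == "C" and low <= position <= high else base
        for position, base in enumerate(protospacer.upper(), start=1)
    )


# Hypothetical target with cytosines at positions 4, 5, 8, 12 and 19.
target = "GATCCAGCTTACAGATTACA"
edited = cytosine_base_edit(target)
print(target)
print(edited)
```

In this simplified model every cytosine inside the window is converted and those outside it are left alone, which shows why neighbouring cytosines can be edited together with the intended base.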