Tuesday, June 9, 2020

Genome editing

From Wikipedia, the free encyclopedia
 
The different generations of nucleases used for genome editing
and the DNA repair pathways used to modify target DNA.

Genome editing, or genome engineering, or gene editing, is a type of genetic engineering in which DNA is inserted, deleted, modified or replaced in the genome of a living organism. Unlike early genetic engineering techniques that randomly insert genetic material into a host genome, genome editing targets the insertions to site-specific locations.

History

Genome editing with engineered nucleases was selected by Nature Methods as the 2011 Method of the Year; the selection covered all three major classes of these enzymes: zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and engineered meganucleases. The CRISPR-Cas system was selected by Science as 2015 Breakthrough of the Year.

As of 2015 four families of engineered nucleases were used: meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector-based nucleases (TALEN), and the clustered regularly interspaced short palindromic repeats (CRISPR/Cas9) system. Nine genome editors were available as of 2017.

In 2018, the common methods for such editing used engineered nucleases, or "molecular scissors". These nucleases create site-specific double-strand breaks (DSBs) at desired locations in the genome. The induced double-strand breaks are repaired through nonhomologous end-joining (NHEJ) or homologous recombination (HR), resulting in targeted mutations ('edits').

In May 2019, lawyers in China reported that, in light of the purported creation by Chinese scientist He Jiankui of the first gene-edited humans (see Lulu and Nana controversy), regulations were being drafted under which anyone manipulating the human genome by gene-editing techniques such as CRISPR would be held responsible for any related adverse consequences. A cautionary perspective on the possible blind spots and risks of CRISPR and related biotechnologies has also been discussed, focusing on the stochastic nature of cellular control processes.

In February 2020, a US trial reported that CRISPR gene editing had been used safely in three cancer patients.

Background

Genetic engineering as a method of introducing new genetic elements into organisms has been around since the 1970s. One drawback of this technology has been the random nature with which the DNA is inserted into the host's genome. This can impair or alter other genes within the organism. Methods were therefore sought that targeted the inserted genes to specific sites within an organism's genome. As well as reducing off-target effects, this also enabled the editing of specific sequences within a genome. This could be used for research purposes, by targeting mutations to specific genes, and in gene therapy. By inserting a functional gene into an organism and targeting it to replace the defective one, it could be possible to cure certain genetic diseases.

Gene targeting

Homologous recombination

Early methods to target genes to certain sites within a genome of an organism (called gene targeting) relied on homologous recombination (HR). By creating DNA constructs that contain a template that matches the targeted genome sequence, it is possible that the HR processes within the cell will insert the construct at the desired location. Using this method on embryonic stem cells led to the development of transgenic mice with targeted genes knocked out. It has also been possible to knock in genes or alter gene expression patterns. In recognition of their discovery of how homologous recombination can be used to introduce genetic modifications in mice through embryonic stem cells, Mario Capecchi, Martin Evans and Oliver Smithies were awarded the 2007 Nobel Prize in Physiology or Medicine.

Conditional targeting

If a vital gene is knocked out, it can prove lethal to the organism. In order to study the function of these genes, site-specific recombinases (SSRs) were used. The two most common types are the Cre-loxP and Flp-FRT systems. Cre recombinase is an enzyme that removes DNA by recombination between binding sequences known as loxP sites. The Flp-FRT system operates in a similar way, with the Flp recombinase recognising FRT sequences. By crossing an organism containing the recombinase sites flanking the gene of interest with an organism that expresses the SSR under control of tissue-specific promoters, it is possible to knock out or switch on genes only in certain cells. These techniques were also used to remove marker genes from transgenic animals. Further modifications of these systems allowed researchers to induce recombination only under certain conditions, allowing genes to be knocked out or expressed at desired times or stages of development.
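The excision step described above can be illustrated with a toy string model. This is only a sketch: the function name `excise_between_sites` and the placeholder site `NNSITENN` are hypothetical, the placeholder is not the real loxP or FRT sequence, and the model covers only two sites in the same orientation.

```python
def excise_between_sites(sequence: str, site: str) -> str:
    """Remove the DNA between the first two same-orientation copies of `site`.

    One copy of the site is left behind, mirroring the outcome usually
    described for recombinase-mediated excision.
    """
    if not site:
        raise ValueError("site must be a non-empty string")
    first = sequence.find(site)
    if first == -1:
        return sequence  # no recombinase site present
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence  # a single site cannot be recombined
    # Keep everything up to the first site, then resume at the second site.
    return sequence[:first] + sequence[second:]


# "NNSITENN" is a made-up placeholder, not a real recombinase site.
SITE = "NNSITENN"
floxed = "ACGT" + SITE + "GENEOFINTEREST" + SITE + "TTAA"
knocked_out = excise_between_sites(floxed, SITE)
```

In this toy model the flanked segment disappears only when the function is called, which loosely mirrors the idea that the gene is lost only in cells where the recombinase is expressed.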

Process

Double strand break repair

Genome editing relies on DNA double-strand break (DSB) repair mechanisms. There are two major pathways that repair DSBs: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ uses a variety of enzymes to directly join the DNA ends, while the more accurate HDR uses a homologous sequence as a template for regeneration of missing DNA sequences at the break point. This can be exploited by creating a vector with the desired genetic elements within a sequence that is homologous to the flanking sequences of a DSB. This will result in the desired change being inserted at the site of the DSB. While HDR-based gene editing is similar to homologous recombination-based gene targeting, the rate of recombination is increased by at least three orders of magnitude.
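The donor design described above (desired elements placed between sequences homologous to the regions flanking the break) can be sketched in a few lines of Python. This is an idealized illustration, not a laboratory protocol: the function names, the toy sequence, the insert and the arm length are all assumptions chosen for readability.

```python
def build_donor(genome: str, cut_index: int, insert: str, arm_length: int) -> str:
    """Return a donor template: left homology arm + insert + right homology arm.

    The break is assumed to lie between genome[cut_index - 1] and
    genome[cut_index].
    """
    if arm_length <= 0:
        raise ValueError("arm_length must be positive")
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("homology arms would run off the end of the sequence")
    left_arm = genome[cut_index - arm_length:cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    return left_arm + insert + right_arm


def apply_hdr(genome: str, cut_index: int, insert: str) -> str:
    """Idealized HDR outcome: the insert ends up exactly at the break site."""
    return genome[:cut_index] + insert + genome[cut_index:]


# Toy data, not a real locus.
genome = "ATGGCGTACCTTGACGGATCCAAGTTCGA"
donor = build_donor(genome, cut_index=14, insert="GGGCCC", arm_length=8)
edited = apply_hdr(genome, cut_index=14, insert="GGGCCC")
```

The edited sequence contains the full donor as a substring, which is the sense in which the homology arms direct the insert to the break site in this simplified picture.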

Engineered nucleases

Groups of engineered nucleases. Matching colors signify DNA recognition patterns
 
The key to genome editing is creating a DSB at a specific point within the genome. Commonly used restriction enzymes are effective at cutting DNA, but generally recognize and cut at multiple sites. To overcome this challenge and create site-specific DSBs, four distinct classes of nucleases have been discovered and bioengineered to date. These are the zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), meganucleases and the clustered regularly interspaced short palindromic repeats (CRISPR/Cas9) system.

Meganucleases

Meganucleases, discovered in the late 1980s, are enzymes in the endonuclease family which are characterized by their capacity to recognize and cut large DNA sequences (from 14 to 40 base pairs). The most widespread and best known meganucleases are the proteins in the LAGLIDADG family, which owe their name to a conserved amino acid sequence.




Meganucleases, found commonly in microbial species, have the unique property of having very long recognition sequences (>14 bp), thus making them naturally very specific. However, there is virtually no chance of finding the exact meganuclease required to act on a chosen specific DNA sequence. To overcome this challenge, mutagenesis and high-throughput screening methods have been used to create meganuclease variants that recognize unique sequences. Others have been able to fuse various meganucleases and create hybrid enzymes that recognize a new sequence. Yet others have attempted to alter the DNA-interacting amino acids of the meganuclease to design sequence-specific meganucleases, in a method named rationally designed meganuclease. Another approach involves using computer models to try to predict as accurately as possible the activity of the modified meganucleases and the specificity of the recognized nucleic sequence.


A large bank containing several tens of thousands of protein units has been created. These units can be combined to obtain chimeric meganucleases that recognize the target site, thereby providing research and development tools that meet a wide range of needs (fundamental research, health, agriculture, industry, energy, etc.). These include the industrial-scale production of two meganucleases able to cleave the human XPC gene; mutations in this gene result in xeroderma pigmentosum, a severe monogenic disorder that predisposes patients to skin cancer and burns whenever their skin is exposed to UV rays.

Meganucleases have the benefit of causing less toxicity in cells than methods such as zinc finger nucleases (ZFNs), likely because of more stringent DNA sequence recognition; however, the construction of sequence-specific enzymes for all possible sequences is costly and time-consuming, as one does not benefit from the combinatorial possibilities that methods such as ZFNs and TALEN-based fusions utilize.

Zinc finger nucleases

As opposed to meganucleases, the concept behind ZFNs and TALEN technology is based on a non-specific DNA cutting catalytic domain, which can then be linked to specific DNA sequence recognizing peptides such as zinc fingers and transcription activator-like effectors (TALEs). The first step to this was to find an endonuclease whose DNA recognition site and cleaving site were separate from each other, a situation that is not the most common among restriction enzymes. Once this enzyme was found, its cleaving portion could be separated which would be very non-specific as it would have no recognition ability. This portion could then be linked to sequence recognizing peptides that could lead to very high specificity. 

Zinc finger motifs occur in several transcription factors. The zinc ion, found in 8% of all human proteins, plays an important role in the organization of their three-dimensional structure. In transcription factors, it is most often located at the protein-DNA interaction sites, where it stabilizes the motif. The C-terminal part of each finger is responsible for the specific recognition of the DNA sequence.

The recognized sequences are short, made up of around 3 base pairs, but by combining 6 to 8 zinc fingers whose recognition sites have been characterized, it is possible to obtain specific proteins for sequences of around 20 base pairs. It is therefore possible to control the expression of a specific gene. It has been demonstrated that this strategy can be used to promote a process of angiogenesis in animals. It is also possible to fuse a protein constructed in this way with the catalytic domain of an endonuclease in order to induce a targeted DNA break, and therefore to use these proteins as genome engineering tools.

The method generally adopted for this involves associating two DNA-binding proteins, each containing 3 to 6 specifically chosen zinc fingers, with the catalytic domain of the FokI endonuclease. The two proteins recognize two DNA sequences that are a few nucleotides apart. Binding of the two zinc finger proteins to their respective sequences brings the two FokI domains closer together. FokI requires dimerization to have nuclease activity, which means the specificity increases dramatically, as each nuclease partner recognizes a unique DNA sequence. To enhance this effect, FokI nucleases have been engineered that can only function as heterodimers.
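The gain in specificity from pairing two zinc finger proteins can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a uniformly random genome with equal base frequencies and a rough human haploid genome size of 3.2 billion base pairs; real genomes are not random, so the numbers are illustrative only, and the function names are hypothetical.

```python
def composite_site_bp(fingers_per_protein: int, proteins: int = 2,
                      bp_per_finger: int = 3) -> int:
    """Total recognition length when each finger reads about 3 base pairs."""
    return fingers_per_protein * proteins * bp_per_finger


def expected_random_matches(site_bp: int, genome_bp: int) -> float:
    """Expected exact matches of one site in a random genome, both strands."""
    return 2 * genome_bp / 4 ** site_bp


# Assumption: approximate human haploid genome size.
HUMAN_GENOME_BP = 3_200_000_000

for fingers in (3, 4, 6):
    site = composite_site_bp(fingers)
    matches = expected_random_matches(site, HUMAN_GENOME_BP)
    print(f"{fingers} fingers per protein: {site} bp site, "
          f"about {matches:.3g} expected chance matches")
```

Under these assumptions a single three-finger protein (9 bp) would be expected to match tens of thousands of sites by chance, whereas a pair of such proteins (18 bp) would be expected to match well under one, which is the intuition behind requiring two binding events for cleavage.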

Several approaches are used to design specific zinc finger nucleases for the chosen sequences. The most widespread involves combining zinc-finger units with known specificities (modular assembly). Various selection techniques, using bacteria, yeast or mammal cells have been developed to identify the combinations that offer the best specificity and the best cell tolerance. Although the direct genome-wide characterization of zinc finger nuclease activity has not been reported, an assay that measures the total number of double-strand DNA breaks in cells found that only one to two such breaks occur above background in cells treated with zinc finger nucleases with a 24 bp composite recognition site and obligate heterodimer FokI nuclease domains.

The heterodimer-functioning nucleases avoid the possibility of unwanted homodimer activity and thus increase the specificity of the DSB. Although the nuclease portions of both ZFNs and TALEN constructs have similar properties, the difference between these engineered nucleases is in their DNA recognition peptide. ZFNs rely on Cys2-His2 zinc fingers and TALEN constructs on TALEs. Both of these DNA-recognizing peptide domains have the characteristic that they are naturally found in combinations in their proteins. Cys2-His2 zinc fingers typically occur in repeats that are 3 bp apart and are found in diverse combinations in a variety of nucleic acid-interacting proteins such as transcription factors. The fingers of a zinc finger domain are not completely independent: the binding capacity of one finger is affected by its neighbor. TALEs, on the other hand, are found in repeats with a one-to-one recognition ratio between the amino acid repeats and the recognized nucleotide pairs. Because both zinc fingers and TALEs occur in repeated patterns, different combinations can be tried to create a wide variety of sequence specificities. Zinc fingers are more established in these terms, and approaches such as modular assembly (where zinc fingers correlated with a triplet sequence are attached in a row to cover the required sequence), OPEN (low-stringency selection of peptide domains vs. triplet nucleotides followed by high-stringency selections of peptide combination vs. the final target in bacterial systems), and bacterial one-hybrid screening of zinc finger libraries, among other methods, have been used to make site-specific nucleases.

Zinc finger nucleases are research and development tools that have already been used to modify a range of genomes, in particular by the laboratories in the Zinc Finger Consortium. The US company Sangamo BioSciences uses zinc finger nucleases to carry out research into the genetic engineering of stem cells and the modification of immune cells for therapeutic purposes. Modified T lymphocytes are currently undergoing phase I clinical trials to treat a type of brain tumor (glioblastoma) and in the fight against AIDS.

TALEN

General overview of the TALEN process

Transcription activator-like effector nucleases (TALENs) are specific DNA-binding proteins that feature an array of 33- or 34-amino-acid repeats. TALENs are artificial restriction enzymes designed by fusing the DNA-cutting domain of a nuclease to TALE domains, which can be tailored to specifically recognize a unique DNA sequence. These fusion proteins serve as readily targetable "DNA scissors" for gene editing applications, enabling targeted genome modifications such as sequence insertion, deletion, repair and replacement in living cells. The DNA-binding domains, which can be designed to bind any desired DNA sequence, come from TAL effectors, DNA-binding proteins secreted by plant-pathogenic Xanthomonas spp. TAL effectors consist of repeated domains, each of which contains a highly conserved sequence of 34 amino acids and recognizes a single DNA nucleotide within the target site. The nuclease can create double-strand breaks at the target site that can be repaired by error-prone non-homologous end-joining (NHEJ), resulting in gene disruptions through the introduction of small insertions or deletions. Each repeat is conserved, with the exception of the so-called repeat variable di-residues (RVDs) at amino acid positions 12 and 13. The RVDs determine the DNA sequence to which the TALE will bind. This simple one-to-one correspondence between the TALE repeats and the corresponding DNA sequence makes the process of assembling repeat arrays to recognize novel DNA sequences straightforward. These TALEs can be fused to the catalytic domain from a DNA nuclease, FokI, to generate a transcription activator-like effector nuclease (TALEN). The resultant TALEN constructs combine specificity and activity, effectively generating engineered sequence-specific nucleases that bind and cleave DNA sequences only at pre-selected sites. The TALEN target recognition system is based on an easy-to-predict code.
TAL nucleases are specific to their target due in part to the length of their binding site, which exceeds 30 base pairs. TALENs can be targeted within a 6-base-pair range of any single nucleotide in the entire genome.
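The one-to-one correspondence between TALE repeats and nucleotides lends itself to a simple lookup. The sketch below uses the RVD assignments most commonly cited in the literature (NI for A, HD for C, NN for G, NG for T); these assignments are not stated in the text above, NN is also reported to bind A to some extent, and the function names are hypothetical.

```python
# Commonly cited RVD-to-base assignments; an assumption, not from the text.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
BASE_FOR_RVD = {rvd: base for base, rvd in RVD_FOR_BASE.items()}


def design_rvd_array(target: str) -> list[str]:
    """Return one RVD per nucleotide of the target, in 5' to 3' order."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported characters in target: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


def read_rvd_array(rvds: list[str]) -> str:
    """Predict the DNA sequence that a given RVD array would recognize."""
    return "".join(BASE_FOR_RVD[rvd] for rvd in rvds)


rvds = design_rvd_array("GATTACA")
predicted_site = read_rvd_array(rvds)
```

Because each repeat is chosen independently of its neighbors in this model, designing an array for a new site is a per-base lookup, in contrast to the context-dependent behavior described for zinc fingers.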

TALEN constructs are used in a similar way to designed zinc finger nucleases, and have three advantages in targeted mutagenesis: (1) DNA binding specificity is higher, (2) off-target effects are lower, and (3) construction of DNA-binding domains is easier.

CRISPR

CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) are genetic elements that bacteria use as a kind of acquired immunity to protect against viruses. They consist of short sequences that originate from viral genomes and have been incorporated into the bacterial genome. Cas (CRISPR associated proteins) process these sequences and cut matching viral DNA sequences. By introducing plasmids containing Cas genes and specifically constructed CRISPRs into eukaryotic cells, the eukaryotic genome can be cut at any desired position. Several companies, including Cellectis and Editas, have been working to monetize the CRISPR method while developing gene-specific therapies.

Precision and efficiency of engineered nucleases

The meganuclease method of gene editing is the least efficient of the methods mentioned above. Due to the nature of its DNA-binding element and its cleaving element, it is limited to recognizing one potential target every 1,000 nucleotides. ZFNs were developed to overcome the limitations of meganucleases. The number of possible targets a ZFN can recognize was increased to one in every 140 nucleotides. However, both methods are unpredictable because their DNA-binding elements can affect each other. As a result, high degrees of expertise and lengthy and costly validation processes are required.

TALE nucleases, being the most precise and specific method, yield a higher efficiency than the previous two methods. They achieve such efficiency because the DNA-binding element consists of an array of TALE subunits, each of which can recognize a specific DNA nucleotide independently of the others, resulting in a higher number of target sites with high precision. New TALE nucleases take about one week and a few hundred dollars to create, given specific expertise in molecular biology and protein engineering.

CRISPR nucleases have a slightly lower precision when compared to the TALE nucleases. This is caused by the need to have a specific nucleotide at one end in order to produce the guide RNA that CRISPR uses to target the double-strand break it induces. It has been shown to be the quickest and cheapest method, costing less than two hundred dollars and a few days of time. CRISPR also requires the least amount of expertise in molecular biology, as the design lies in the guide RNA instead of the proteins. One major advantage that CRISPR has over the ZFN and TALEN methods is that it can be directed to target different DNA sequences using its ~80 nt CRISPR sgRNAs, while both ZFN and TALEN methods require construction and testing of the proteins created for targeting each DNA sequence.
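Because the design work lies in the guide RNA, candidate targets can be listed with a simple sequence scan. The sketch below assumes the widely described rule for Streptococcus pyogenes Cas9, in which a 20-nucleotide target is followed immediately by an NGG motif (the protospacer adjacent motif, or PAM); that rule is not stated in the text above, other Cas proteins use other motifs, and the function names are hypothetical.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq: str, spacer_len: int = 20) -> list[tuple[str, int, str]]:
    """List (strand, start, spacer) for every spacer followed by an NGG motif.

    For the "-" strand, `start` is an index into the reverse complement,
    not into the original sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping candidates are all reported.
        pattern = rf"(?=([ACGT]{{{spacer_len}}})[ACGT]GG)"
        for match in re.finditer(pattern, strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Toy sequence, not a real gene.
demo = "ACGTACGTACGTACGTACGT" + "AGG"
candidates = find_protospacers(demo)
```

Each reported spacer is what would be placed at the targeting end of the guide RNA in this simplified picture; in practice candidates would also be screened for similar sequences elsewhere in the genome to limit off-target cutting.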

Because off-target activity of an active nuclease would have potentially dangerous consequences at the genetic and organismal levels, the precision of meganucleases, ZFNs, CRISPR, and TALEN-based fusions has been an active area of research. While variable figures have been reported, ZFNs tend to have more cytotoxicity than TALEN methods or RNA-guided nucleases, while TALEN and RNA-guided approaches tend to have the greatest efficiency and fewer off-target effects. Based on the maximum theoretical distance between DNA binding and nuclease activity, TALEN approaches result in the greatest precision.

Multiplex Automated Genomic Engineering (MAGE)

Synthetic DNA is repeatedly introduced at multiple targeted areas of the chromosome and/or loci and then is replicated producing cells with/without mutations.
 
Methods for studying genomic diversity and all possible associated phenotypes used to be very slow, expensive, and inefficient. Researchers had to perform single-gene manipulations, tweaking the genome one small section at a time, observing the phenotype, and then starting the process over with a different single-gene manipulation. Researchers at the Wyss Institute at Harvard University therefore designed MAGE, a powerful technology that improves the process of in vivo genome editing. It allows for quick and efficient manipulations of a genome, all happening in a machine small enough to put on top of a small kitchen table. Those mutations combine with the variation that naturally occurs during cell division, creating billions of cellular mutations.

Chemically combined, synthetic single-stranded DNA (ssDNA) and a pool of oligonucleotides are introduced at targeted areas of the cell, thereby creating genetic modifications. The cyclical process involves transformation of ssDNA (by electroporation) followed by outgrowth, during which bacteriophage homologous recombination proteins mediate annealing of ssDNAs to their genomic targets. Experiments targeting selective phenotypic markers are screened and identified by plating the cells on differential media. Each cycle ultimately takes 2.5 hours to process, with additional time required to grow isogenic cultures and characterize mutations. By iteratively introducing libraries of mutagenic ssDNAs targeting multiple sites, MAGE can generate combinatorial genetic diversity in a cell population. There can be up to 50 genome edits, ranging from single nucleotide base pairs to whole genomes or gene networks, made simultaneously, with results in a matter of days.
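The scale of the combinatorial diversity and the cycling time can be estimated with simple arithmetic. The sketch below is an idealized model, not a description of the published method: it assumes every site varies independently, that each cycle converts a fixed fraction of the still-unedited copies of a site, and it uses the 2.5-hour cycle time quoted above; the function names and example numbers are hypothetical.

```python
import math


def genotype_count(variants_per_site: list[int]) -> int:
    """Distinct genotypes when each site is either wild type or one variant."""
    return math.prod(variants + 1 for variants in variants_per_site)


def edited_fraction(per_cycle_efficiency: float, cycles: int) -> float:
    """Fraction of cells edited at one site after a number of cycles.

    Assumes each cycle independently converts the same fraction of the
    remaining unedited cells (a simplifying assumption).
    """
    if not 0.0 <= per_cycle_efficiency <= 1.0:
        raise ValueError("per_cycle_efficiency must be between 0 and 1")
    return 1.0 - (1.0 - per_cycle_efficiency) ** cycles


def cycling_hours(cycles: int, hours_per_cycle: float = 2.5) -> float:
    """Time spent cycling, excluding culture growth and characterization."""
    return cycles * hours_per_cycle


# Example: 20 targeted sites with one designed variant each.
possible_genotypes = genotype_count([1] * 20)
hours_for_15_cycles = cycling_hours(15)
```

Even with a single variant per site, the number of possible genotypes doubles with every additional site, which is why iterating over many sites quickly produces a highly diverse population in this simplified view.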

MAGE experiments can be divided into three classes, characterized by varying degrees of scale and complexity: (i) many target sites, single genetic mutations; (ii) single target site, many genetic mutations; and (iii) many target sites, many genetic mutations. An example of class three came in 2009, when Church and colleagues were able to program Escherichia coli to produce five times the normal amount of lycopene, an antioxidant normally found in tomato seeds and linked to anti-cancer properties. They applied MAGE to optimize the 1-deoxy-d-xylulose-5-phosphate (DXP) metabolic pathway in Escherichia coli to overproduce the isoprenoid lycopene. It took them about 3 days and just over $1,000 in materials. The ease, speed, and cost efficiency with which MAGE can alter genomes can transform how industries approach the manufacturing and production of important compounds in the bioengineering, bioenergy, biomedical engineering, synthetic biology, pharmaceutical, agricultural, and chemical industries.

Applications

Plants, animals and human genes that are successfully targeted using ZFN, which demonstrates the generality of this approach

As of 2012 efficient genome editing had been developed for a wide range of experimental systems ranging from plants to animals, often beyond clinical interest, and was becoming a standard experimental strategy in research labs. The recent generation of rat, zebrafish, maize and tobacco ZFN-mediated mutants and the improvements in TALEN-based approaches testify to the significance of the methods, and the list is expanding rapidly. Genome editing with engineered nucleases will likely contribute to many fields of life sciences from studying gene functions in plants and animals to gene therapy in humans. For instance, the field of synthetic biology which aims to engineer cells and organisms to perform novel functions, is likely to benefit from the ability of engineered nuclease to add or remove genomic elements and therefore create complex systems. In addition, gene functions can be studied using stem cells with engineered nucleases.

Listed below are some specific tasks this method can carry out:

Targeted gene modification in animals

The combination of recent discoveries in genetic engineering, particularly gene editing, and the latest improvements in bovine reproduction technologies (e.g. in vitro embryo culture) allows for genome editing directly in fertilised oocytes using synthetic, highly specific endonucleases. RNA-guided endonucleases based on clustered regularly interspaced short palindromic repeats and the associated Cas9 protein (CRISPR/Cas9) are a new tool, further increasing the range of methods available. In particular, CRISPR/Cas9 engineered endonucleases allow the use of multiple guide RNAs for simultaneous knockouts (KO) in one step by cytoplasmic direct injection (CDI) into mammalian zygotes.

Thanks to the parallel development of single-cell transcriptomics, genome editing and new stem cell models we are now entering a scientifically exciting period where functional genetics is no longer restricted to animal models but can be performed directly in human samples. Single-cell gene expression analysis has resolved a transcriptional road-map of human development from which key candidate genes are being identified for functional studies. Using global transcriptomics data to guide experimentation, the CRISPR based genome editing tool has made it feasible to disrupt or remove key genes in order to elucidate function in a human setting.

Targeted gene modification in plants

Overview of GEEN workflow and editing possibilities

Genome editing using meganucleases, ZFNs, and TALENs provides a new strategy for genetic manipulation in plants and is likely to assist in the engineering of desired plant traits by modifying endogenous genes. For instance, site-specific gene addition in major crop species can be used for 'trait stacking', whereby several desired traits are physically linked to ensure their co-segregation during the breeding processes. Progress in such cases has been reported in Arabidopsis thaliana and Zea mays. In Arabidopsis thaliana, using ZFN-assisted gene targeting, two herbicide-resistant genes (tobacco acetolactate synthase SuRA and SuRB) were introduced to SuR loci, with as many as 2% of transformed cells carrying mutations. In Zea mays, disruption of the target locus was achieved by ZFN-induced DSBs and the resulting NHEJ. ZFNs were also used to drive a herbicide-tolerance gene expression cassette (PAT) into the targeted endogenous locus IPK1 in this case. Such genome modification observed in the regenerated plants has been shown to be inheritable and was transmitted to the next generation. A potentially successful example of the application of genome editing techniques in crop improvement can be found in banana, where scientists used CRISPR/Cas9 editing to inactivate the endogenous banana streak virus in the B genome of banana (Musa spp.) to overcome a major challenge in banana breeding.

In addition, TALEN-based genome engineering has been extensively tested and optimized for use in plants. TALEN fusions have also been used by a U.S. food ingredient company, Calyxt, to improve the quality of soybean oil products and to increase the storage potential of potatoes.

Several optimizations need to be made in order to improve editing plant genomes using ZFN-mediated targeting. There is a need for reliable design and subsequent test of the nucleases, the absence of toxicity of the nucleases, the appropriate choice of the plant tissue for targeting, the routes of induction of enzyme activity, the lack of off-target mutagenesis, and a reliable detection of mutated cases.

A common delivery method for CRISPR/Cas9 in plants is Agrobacterium-based transformation. T-DNA is introduced directly into the plant genome by a type IV secretion system (T4SS) mechanism. Cas9 and gRNA-based expression cassettes are cloned into Ti plasmids, which are transformed into Agrobacterium for plant application. To improve Cas9 delivery in live plants, viruses are being used for more effective transgene delivery.

Research

Gene therapy

The ideal gene therapy practice is one that replaces the defective gene with a normal allele at its natural location. This is advantageous over a virally delivered gene, as there is no need to include the full coding sequences and regulatory sequences when only a small proportion of the gene needs to be altered, as is often the case. The expression of the partially replaced genes is also more consistent with normal cell biology than that of full genes carried by viral vectors.

The first clinical use of TALEN-based genome editing was in the treatment of CD19+ acute lymphoblastic leukemia in an 11-month-old child in 2015. Modified donor T cells were engineered to attack the leukemia cells, to be resistant to alemtuzumab, and to evade detection by the host immune system after introduction.

Extensive research has been done in cells and animals using CRISPR-Cas9 to attempt to correct genetic mutations which cause genetic diseases such as Down syndrome, spina bifida, anencephaly, and Turner and Klinefelter syndromes.

In February 2019, medical scientists working with Sangamo Therapeutics, headquartered in Richmond, California, announced the first ever "in body" human gene editing therapy to permanently alter DNA, in a patient with Hunter syndrome. Clinical trials by Sangamo involving gene editing using zinc finger nucleases (ZFNs) are ongoing.

Eradicating diseases

Researchers have used CRISPR-Cas9 gene drives to modify genes associated with sterility in A. gambiae, the vector for malaria. This technique has further implications in eradicating other vector borne diseases such as yellow fever, dengue, and Zika.

The CRISPR-Cas9 system can be programmed to modulate the population of any bacterial species by targeting clinical genotypes or epidemiological isolates. It can selectively favor beneficial bacterial species over harmful ones by eliminating pathogens, which gives it an advantage over broad-spectrum antibiotics.

Antiviral applications for therapies targeting human viruses such as HIV, herpes, and hepatitis B virus are under research. CRISPR can be used to target the virus or the host to disrupt genes encoding the virus cell-surface receptor proteins. In November 2018, He Jiankui announced that he had edited two human embryos, to attempt to disable the gene for CCR5, which codes for a receptor that HIV uses to enter cells. He said that twin girls, Lulu and Nana, had been born a few weeks earlier. He said that the girls still carried functional copies of CCR5 along with disabled CCR5 (mosaicism) and were still vulnerable to HIV. The work was widely condemned as unethical, dangerous, and premature.

In January 2019, scientists in China reported the creation of five identical cloned gene-edited monkeys, using the same cloning technique that was used with Zhong Zhong and Hua Hua (the first ever cloned monkeys) and Dolly the sheep, and the same CRISPR-Cas9 gene-editing technique allegedly used by He Jiankui in creating the first ever gene-modified human babies, Lulu and Nana. The monkey clones were made in order to study several medical diseases.

In the future, CRISPR-based systems may also make it possible to eradicate diseases and conditions to which humans are predisposed. With this technology, scientists could in principle take the genes of a human sperm cell and egg and replace those that activate cancer or other abnormal or unwanted defects. This could relieve prospective parents of the worry that their child would be unable to live a normal life. Proponents suggest that, after such a process, later generations might no longer have to worry about deformities or predisposed conditions.

Prospects and limitations

In the future, an important goal of research into genome editing with engineered nucleases must be the improvement of the safety and specificity of the nucleases. For example, improving the ability to detect off-target events can improve our ability to learn about ways of preventing them. In addition, zinc-fingers used in ZFNs are seldom completely specific, and some may cause a toxic reaction. However, the toxicity has been reported to be reduced by modifications done on the cleavage domain of the ZFN.

In addition, research by Dana Carroll into modifying the genome with engineered nucleases has shown the need for better understanding of the basic recombination and repair machinery of DNA. In the future, a possible method to identify secondary targets would be to capture broken ends from cells expressing the ZFNs and to sequence the flanking DNA using high-throughput sequencing.

Because of the ease of use and cost-efficiency of CRISPR, extensive research is currently being done on it. There are now more publications on CRISPR than on ZFN and TALEN, despite how recently CRISPR was discovered. Both CRISPR and TALEN are favored for large-scale production because of their precision and efficiency.

Genome editing occurs also as a natural process without artificial genetic engineering. The agents that are competent to edit genetic codes are viruses or subviral RNA-agents.

Although genome editing with engineered nucleases (GEEN) has higher efficiency than many other methods in reverse genetics, it is still not highly efficient; in many cases less than half of the treated populations obtain the desired changes. For example, when one is planning to use the cell's NHEJ to create a mutation, the cell's homology-directed repair (HDR) systems will also be at work correcting the DSB with lower mutational rates.

Traditionally, mice have been the most common choice for researchers as a host of a disease model. CRISPR can help bridge the gap between this model and human clinical trials by creating transgenic disease models in larger animals such as pigs, dogs, and non-human primates. Using the CRISPR-Cas9 system, the programmed Cas9 protein and the sgRNA can be directly introduced into fertilized zygotes to achieve the desired gene modifications when creating transgenic models in rodents. This allows bypassing of the usual cell targeting stage in generating transgenic lines, and as a result, it reduces generation time by 90%.

One potential that CRISPR brings with its effectiveness is the application of xenotransplantation. In previous research trials, CRISPR demonstrated the ability to target and eliminate endogenous retroviruses, which reduces the risk of transmitting diseases and reduces immune barriers. Eliminating these problems improves donor organ function, which brings this application closer to a reality. 

In plants, genome editing is seen as a viable solution to the conservation of biodiversity. Gene drives are a potential tool to alter the reproductive rate of invasive species, although there are significant associated risks.

Human enhancement

Many transhumanists see genome editing as a potential tool for human enhancement. Australian biologist and Professor of Genetics David Andrew Sinclair notes that "the new technologies with genome editing will allow it to be used on individuals (...) to have (...) healthier children" – designer babies. According to a September 2016 report by the Nuffield Council on Bioethics, in the future it may be possible to enhance people with genes from other organisms or wholly synthetic genes to, for example, improve night vision and sense of smell.

The American National Academy of Sciences and National Academy of Medicine issued a report in February 2017 giving qualified support to human genome editing. They recommended that clinical trials for genome editing might one day be permitted once answers have been found to safety and efficiency problems "but only for serious conditions under stringent oversight."

Risks

In the 2016 Worldwide Threat Assessment of the US Intelligence Community statement, United States Director of National Intelligence James R. Clapper named genome editing as a potential weapon of mass destruction, stating that genome editing conducted by countries with regulatory or ethical standards "different from Western countries" probably increases the risk of the creation of harmful biological agents or products. According to the statement, given the broad distribution, low cost, and accelerated pace of development of this technology, its deliberate or unintentional misuse might lead to far-reaching economic and national security implications. For instance, technologies such as CRISPR could be used to make "killer mosquitoes" that cause plagues that wipe out staple crops.

According to a September 2016 report by the Nuffield Council on Bioethics, the simplicity and low cost of tools to edit the genetic code will allow amateurs – or "biohackers" – to perform their own experiments, posing a potential risk from the release of genetically modified bugs. The review also found that the risks and benefits of modifying a person's genome – and having those changes pass on to future generations – are so complex that they demand urgent ethical scrutiny. Such modifications might have unintended consequences which could harm not only the child, but also their future children, as the altered gene would be in their sperm or eggs. In 2001, Australian researchers Ronald Jackson and Ian Ramshaw were criticized for publishing a paper in the Journal of Virology that explored the potential control of mice, a major pest in Australia, by infecting them with an altered mousepox virus that would cause infertility. Critics argued that the sensitive information provided could lead to the manufacture of biological weapons by potential bioterrorists, who might use the knowledge to create vaccine-resistant strains of other pox viruses, such as smallpox, that could affect humans. Furthermore, there are additional concerns about the ecological risks of releasing gene drives into wild populations.

Adeno-associated virus

From Wikipedia, the free encyclopedia
 

Adeno-associated virus
Adeno-associated virus serotype 2 structure from 1LP3. One fivefold axis shown center.
Scientific classification
(unranked): Virus
Realm: Monodnaviria
Kingdom: Shotokuvirae
Phylum: Cossaviricota
Class: Quintoviricetes
Order: Piccovirales
Family: Parvoviridae
Subfamily: Parvovirinae
Genus: Dependoparvovirus
Viruses included:
  • Adeno-associated dependoparvovirus A
  • Adeno-associated dependoparvovirus B
Adeno-associated viruses (AAV) are small viruses that infect humans and some other primate species. They belong to the genus Dependoparvovirus, which in turn belongs to the family Parvoviridae. They are small (20 nm) replication-defective, nonenveloped viruses.

AAV are not currently known to cause disease. The viruses cause a very mild immune response. Several additional features make AAV an attractive candidate for creating viral vectors for gene therapy, and for the creation of isogenic human disease models. Gene therapy vectors using AAV can infect both dividing and quiescent cells and persist in an extrachromosomal state without integrating into the genome of the host cell, although in the native virus integration of virally carried genes into the host genome does occur. Integration can be important for certain applications, but can also have unwanted consequences. Recent human clinical trials using AAV for gene therapy in the retina have shown promise.

History

The adeno-associated virus (AAV), previously thought to be a contaminant in adenovirus preparations, was first identified as a dependoparvovirus in the 1960s in the laboratories of Bob Atchison at Pittsburgh and Wallace Rowe at NIH. Serological studies in humans subsequently indicated that, despite being present in people infected by helper viruses such as adenovirus or herpes virus, AAV itself did not cause any disease.

Use in gene therapy

Advantages and drawbacks

Wild-type AAV has attracted considerable interest from gene therapy researchers due to a number of features. Chief amongst these is the virus's apparent lack of pathogenicity. It can also infect non-dividing cells and has the ability to stably integrate into the host cell genome at a specific site (designated AAVS1) in human chromosome 19. This feature makes it somewhat more predictable than retroviruses, which present the threat of random insertion and of mutagenesis, which is sometimes followed by development of a cancer. The AAV genome integrates most frequently into the site mentioned, while random incorporations into the genome take place with a negligible frequency. Development of AAVs as gene therapy vectors, however, has eliminated this integrative capacity by removal of the rep and cap genes from the DNA of the vector. The desired gene, together with a promoter to drive its transcription, is inserted between the inverted terminal repeats (ITRs), which aid in concatemer formation in the nucleus after the single-stranded vector DNA is converted by host cell DNA polymerase complexes into double-stranded DNA. AAV-based gene therapy vectors form episomal concatemers in the host cell nucleus. In non-dividing cells, these concatemers remain intact for the life of the host cell. In dividing cells, AAV DNA is lost through cell division, since the episomal DNA is not replicated along with the host cell DNA. Random integration of AAV DNA into the host genome is detectable but occurs at very low frequency. AAVs also present very low immunogenicity, seemingly restricted to generation of neutralizing antibodies, while they induce no clearly defined cytotoxic response. This feature, along with the ability to infect quiescent cells, gives them an advantage over adenoviruses as vectors for human gene therapy.

Use of the virus does present some disadvantages. The cloning capacity of the vector is relatively limited and most therapeutic genes require the complete replacement of the virus's 4.8 kilobase genome. Large genes are, therefore, not suitable for use in a standard AAV vector. Options are currently being explored to overcome the limited coding capacity. The AAV ITRs of two genomes can anneal to form head-to-tail concatemers, almost doubling the capacity of the vector. Insertion of splice sites allows for the removal of the ITRs from the transcript. 

Because of AAV's specialized gene therapy advantages, researchers have created an altered version of AAV termed self-complementary adeno-associated virus (scAAV). Whereas AAV packages a single strand of DNA and must wait for its second strand to be synthesized, scAAV packages two shorter strands that are complementary to each other. By avoiding second-strand synthesis, scAAV can express more quickly, although as a caveat, scAAV can only encode half of the already limited capacity of AAV. Recent reports suggest that scAAV vectors are more immunogenic than single-stranded AAV vectors, inducing a stronger activation of cytotoxic T lymphocytes.
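The capacity constraints described above can be illustrated with a rough calculation. The following Python sketch is only an approximation under stated assumptions: it treats the packaging limit as roughly the wild-type genome length (about 4.7 kilobases, including two 145-base ITRs) and gives scAAV half of that limit. The function name and the example cassette sizes are hypothetical, and real packaging limits depend on the vector design.

```python
# Rough packaging check for AAV vectors (illustrative assumptions only).
GENOME_LIMIT_NT = 4700  # assumed limit: roughly the wild-type genome length
ITR_NT = 145            # length of each inverted terminal repeat


def fits_in_vector(cassette_nt: int, self_complementary: bool = False) -> bool:
    """Return True if a promoter-plus-gene cassette fits between two ITRs.

    For scAAV the usable length is assumed to be half of the
    single-stranded limit, as a simplification.
    """
    limit = GENOME_LIMIT_NT // 2 if self_complementary else GENOME_LIMIT_NT
    return cassette_nt + 2 * ITR_NT <= limit


if __name__ == "__main__":
    for size in (2000, 4000, 4500):
        print(size, fits_in_vector(size), fits_in_vector(size, True))
```

Under these assumptions a 4,000-nucleotide cassette fits a standard vector but not an scAAV vector, while a 2,000-nucleotide cassette fits both.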

Humoral immunity instigated by infection with the wild type is thought to be common. The associated neutralising activity limits the usefulness of the most commonly used serotype AAV2 in certain applications. Accordingly, the majority of clinical trials under way involve delivery of AAV2 into the brain, a relatively immunologically privileged organ. In the brain, AAV2 is strongly neuron-specific.

Clinical trials

To date, AAV vectors have been used in over 117 clinical trials worldwide, approximately 5.6% of virus-vectored gene-therapy trials. Recently, promising results have been obtained from Phase 1 and Phase 2 trials for a number of diseases, including Leber's congenital amaurosis, hemophilia, congestive heart failure, spinal muscular atrophy, lipoprotein lipase deficiency, and Parkinson's disease.

Selected clinical trials using AAV-based vectors
Indication | Gene | Route of administration | Phase | Subject number | Status
Cystic fibrosis | CFTR | Lung, via aerosol | I | 12 | Complete
Cystic fibrosis | CFTR | Lung, via aerosol | II | 38 | Complete
Cystic fibrosis | CFTR | Lung, via aerosol | II | 100 | Complete
Hemophilia B | FIX | Intramuscular | I | 9 | Complete
Hemophilia B | FIX | Hepatic artery | I | 6 | Ended
Arthritis | TNFR:Fc | Intraarticular | I | 1 | Ongoing
Hereditary emphysema | AAT | Intramuscular | I | 12 | Ongoing
Leber's congenital amaurosis | RPE65 | Subretinal | I–II | Multiple | Several ongoing and complete
Age-related macular degeneration | sFlt-1 | Subretinal | I–II | 24 | Ongoing
Duchenne muscular dystrophy | SGCA | Intramuscular | I | 10 | Ongoing
Parkinson's disease | GAD65, GAD67 | Intracranial | I | 12 | Complete
Canavan disease | AAC | Intracranial | I | 21 | Ongoing
Batten disease | CLN2 | Intracranial | I | 10 | Ongoing
Alzheimer's disease | NGF | Intracranial | I | 6 | Ongoing
Spinal muscular atrophy | SMN1 | Intravenous and Intrathecal | I–III | 15 | Several ongoing and complete
Congestive heart failure | SERCA2a | Intra-coronary | IIb | 250 | Ongoing

Structure

Two adenovirus particles surrounded by numerous, smaller adeno-associated viruses (negative-staining electron microscopy, magnification approximately 200,000×)

Genome, transcriptome and proteome

The AAV genome is built of single-stranded deoxyribonucleic acid (ssDNA), either positive- or negative-sensed, which is about 4.7 kilobase long. The genome comprises ITRs at both ends of the DNA strand, and two open reading frames (ORFs): rep and cap. The former is composed of four overlapping genes encoding Rep proteins required for the AAV life cycle, and the latter contains overlapping nucleotide sequences of capsid proteins: VP1, VP2 and VP3, which interact to form a capsid with icosahedral symmetry.

ITR sequences

The inverted terminal repeat (ITR) sequences comprise 145 bases each. They were named so because of their symmetry, which was shown to be required for efficient multiplication of the AAV genome. The feature of these sequences that gives them this property is their ability to form a hairpin, which contributes to so-called self-priming that allows primase-independent synthesis of the second DNA strand. The ITRs were also shown to be required for both integration of the AAV DNA into the host cell genome (19th chromosome in humans) and rescue from it, as well as for efficient encapsidation of the AAV DNA combined with generation of fully assembled, deoxyribonuclease-resistant AAV particles.

With regard to gene therapy, ITRs seem to be the only sequences required in cis next to the therapeutic gene: structural (cap) and packaging (rep) proteins can be delivered in trans. With this assumption many methods were established for efficient production of recombinant AAV (rAAV) vectors containing a reporter or therapeutic gene. However, it was also published that the ITRs are not the only elements required in cis for the effective replication and encapsidation. A few research groups have identified a sequence designated cis-acting Rep-dependent element (CARE) inside the coding sequence of the rep gene. CARE was shown to augment the replication and encapsidation when present in cis.

rep gene and Rep proteins

On the "left side" of the genome there are two promoters called p5 and p19, from which two overlapping messenger ribonucleic acids (mRNAs) of different length can be produced. Each of these contains an intron which can be either spliced out or not. Given these possibilities, four different mRNAs, and consequently four different Rep proteins with overlapping sequences, can be synthesized. Their names depict their sizes in kilodaltons (kDa): Rep78, Rep68, Rep52 and Rep40. Rep78 and Rep68 can specifically bind the hairpin formed by the ITR in the self-priming act and cleave at a specific region, designated the terminal resolution site, within the hairpin. They were also shown to be necessary for the AAVS1-specific integration of the AAV genome. All four Rep proteins were shown to bind ATP and to possess helicase activity. It was also shown that they upregulate the transcription from the p40 promoter (mentioned below), but downregulate both p5 and p19 promoters.
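The combinatorics described above (two promoters, each transcript either spliced or unspliced) can be sketched in a few lines of Python. The assignment of each protein to a promoter and splicing state is not spelled out in the text; the mapping below follows the commonly described arrangement (Rep78 and Rep68 from p5, Rep52 and Rep40 from p19, with the smaller protein of each pair coming from the spliced transcript) and should be read as an illustrative assumption.

```python
from itertools import product

# (promoter, intron spliced out?) -> Rep protein.
# This mapping is an assumption based on the commonly described arrangement.
REP_PROTEINS = {
    ("p5", False): "Rep78",
    ("p5", True): "Rep68",
    ("p19", False): "Rep52",
    ("p19", True): "Rep40",
}


def rep_products() -> list[str]:
    """Enumerate the Rep proteins from every promoter/splicing combination."""
    combos = product(("p5", "p19"), (False, True))
    return [REP_PROTEINS[combo] for combo in combos]


if __name__ == "__main__":
    print(rep_products())
```

Two promoters times two splicing states give the four Rep proteins named in the text.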

cap gene and VP proteins

The right side of a positive-sensed AAV genome encodes overlapping sequences of three capsid proteins, VP1, VP2 and VP3, which start from one promoter, designated p40. The molecular weights of these proteins are 87, 72 and 62 kilodaltons, respectively. The AAV capsid is composed of a mixture of VP1, VP2, and VP3 totaling 60 monomers arranged in icosahedral symmetry in a ratio of 1:1:10, with an estimated size of 3.9 megadaltons. The crystal structure of the VP3 protein was determined by Xie, Bue, et al.
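The capsid figures quoted above are mutually consistent, as a short calculation shows. The Python sketch below uses only the numbers given in the text (60 monomers, a 1:1:10 ratio, and molecular weights of 87, 72 and 62 kilodaltons); the function names are hypothetical.

```python
MONOMERS = 60                               # total capsid subunits
RATIO = {"VP1": 1, "VP2": 1, "VP3": 10}     # VP1:VP2:VP3
MASS_KDA = {"VP1": 87, "VP2": 72, "VP3": 62}


def capsid_composition() -> dict[str, int]:
    """Split the 60 monomers according to the 1:1:10 ratio."""
    total_parts = sum(RATIO.values())  # 12 parts in total
    return {vp: MONOMERS * parts // total_parts for vp, parts in RATIO.items()}


def capsid_mass_kda() -> int:
    """Sum the masses of all monomers in kilodaltons."""
    counts = capsid_composition()
    return sum(counts[vp] * MASS_KDA[vp] for vp in counts)


if __name__ == "__main__":
    print(capsid_composition())
    print(capsid_mass_kda())
```

The ratio corresponds to 5 copies of VP1, 5 of VP2 and 50 of VP3, and the summed mass is 3,895 kilodaltons, in agreement with the estimate of about 3.9 megadaltons.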

AAV2 capsid, shown as a ribbon diagram, with the back half hidden for clarity. One fivefold symmetry axis is shown center.
 
The cap gene produces an additional, non-structural protein called the Assembly-Activating Protein (AAP). This protein is produced from ORF2 and is essential for the capsid-assembly process. The exact function of this protein in the assembly process and its structure have not been solved to date.

All three VPs are translated from one mRNA. After this mRNA is synthesized, it can be spliced in two different manners: either a longer or shorter intron can be excised resulting in the formation of two pools of mRNAs: a 2.3 kb- and a 2.6 kb-long mRNA pool. Usually, especially in the presence of adenovirus, the longer intron is preferred, so the 2.3-kb-long mRNA represents the so-called "major splice". In this form the first AUG codon, from which the synthesis of VP1 protein starts, is cut out, resulting in a reduced overall level of VP1 protein synthesis. The first AUG codon that remains in the major splice is the initiation codon for VP3 protein. However, upstream of that codon in the same open reading frame lies an ACG sequence (encoding threonine) which is surrounded by an optimal Kozak context. This contributes to a low level of synthesis of VP2 protein, which is actually VP3 protein with additional N terminal residues, as is VP1.

Since the bigger intron is preferred to be spliced out, and since in the major splice the ACG codon is a much weaker translation initiation signal, the ratio at which the AAV structural proteins are synthesized in vivo is about 1:1:10, which is the same as in the mature virus particle. The unique fragment at the N terminus of VP1 protein was shown to possess phospholipase A2 (PLA2) activity, which is probably required for the release of AAV particles from late endosomes. Muralidhar et al. reported that VP2 and VP3 are crucial for correct virion assembly. More recently, however, Warrington et al. showed VP2 to be unnecessary for complete virus particle formation and efficient infectivity, and also showed that VP2 can tolerate large insertions in its N terminus, while VP1 cannot, probably because of the presence of the PLA2 domain.

Classification, serotypes, receptors and native tropism

Two species of AAV were recognised by the International Committee on Taxonomy of Viruses in 2013: adeno-associated dependoparvovirus A (formerly AAV-1, -2, -3 and -4) and adeno-associated dependoparvovirus B (formerly AAV-5).

Until the 1990s, virtually all AAV biology was studied using AAV serotype 2. However, AAV is highly prevalent in humans and other primates and several serotypes have been isolated from various tissue samples. Serotypes 2, 3, 5, and 6 were discovered in human cells, AAV serotypes 1, 4, and 7–11 in nonhuman primate samples. As of 2006 there have been 11 AAV serotypes described, the 11th in 2004. AAV capsid proteins contain 12 hypervariable surface regions, with most variability occurring in the threefold proximal peaks, but the parvovirus genome in general presents highly conserved replication and structural genes across serotypes. All of the known serotypes can infect cells from multiple diverse tissue types. Tissue specificity is determined by the capsid serotype and pseudotyping of AAV vectors to alter their tropism range will likely be important to their use in therapy.

Serotype 2

Serotype 2 (AAV2) has been the most extensively examined so far. AAV2 presents natural tropism towards skeletal muscles, neurons, vascular smooth muscle cells and hepatocytes.

Three cell receptors have been described for AAV2: heparan sulfate proteoglycan (HSPG), αVβ5 integrin and fibroblast growth factor receptor 1 (FGFR-1). The first functions as a primary receptor, while the latter two have a co-receptor activity and enable AAV to enter the cell by receptor-mediated endocytosis. These study results have been disputed by Qiu, Handa, et al. HSPG functions as the primary receptor, though its abundance in the extracellular matrix can scavenge AAV particles and impair the infection efficiency.

Studies have shown that serotype 2 of the virus (AAV-2) apparently kills cancer cells without harming healthy ones. "Our results suggest that adeno-associated virus type 2, which infects the majority of the population but has no known ill effects, kills multiple types of cancer cells yet has no effect on healthy cells," said Craig Meyers, a professor of immunology and microbiology at the Penn State College of Medicine in Pennsylvania in 2005. This could lead to a new anti-cancer agent.

Other serotypes

Although AAV2 is the most popular serotype in various AAV-based research, it has been shown that other serotypes can be more effective as gene delivery vectors. For instance AAV6 appears much better in infecting airway epithelial cells, AAV7 presents very high transduction rate of murine skeletal muscle cells (similar to AAV1 and AAV5), AAV8 is superb in transducing hepatocytes and AAV1 and 5 were shown to be very efficient in gene delivery to vascular endothelial cells. In the brain, most AAV serotypes show neuronal tropism, while AAV5 also transduces astrocytes. AAV6, a hybrid of AAV1 and AAV2, also shows lower immunogenicity than AAV2.

Serotypes can differ with respect to the receptors they bind. For example, AAV4 and AAV5 transduction can be inhibited by soluble sialic acids (of different form for each of these serotypes), and AAV5 was shown to enter cells via the platelet-derived growth factor receptor.

Synthetic serotypes

There have been many efforts to engineer and improve new AAV variants for both clinical and research purposes. Such modifications include new tropisms to target specific tissues, and modified surface residues to evade detection by the immune system. Beyond opting for particular strains of recombinant AAV (rAAV) to target particular cells, researchers have also explored AAV pseudotyping, the practice of creating hybrids of certain AAV strains to approach an even more refined target. The hybrid is created by taking a capsid from one strain and the genome from another strain. For example, research involving AAV2/5, a hybrid with the genome of AAV2 and the capsid of AAV5, was able to achieve more accuracy and range in brain cells than AAV2 would be able to achieve unhybridized. Researchers have continued to experiment with pseudotyping by creating strains with hybrid capsids. AAV-DJ has a hybrid capsid from eight different strains of AAV; as such, it can infect different cells throughout many areas of the body, a property which a single strain of AAV with a limited tropism would not have. Other efforts to engineer and improve new AAV variants have involved the ancestral reconstruction of virus variants to generate new vectors with enhanced properties for clinical applications and the study of AAV biology.
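The pseudotype naming used above (AAV2/5 for an AAV2 genome in an AAV5 capsid) can be read mechanically. The Python sketch below is a minimal illustration that assumes names of the form "AAV<genome>/<capsid>" with numeric serotypes; the class and function names are hypothetical, and engineered capsids such as AAV-DJ do not follow this pattern.

```python
import re
from typing import NamedTuple


class Pseudotype(NamedTuple):
    genome: str  # serotype that supplies the genome
    capsid: str  # serotype that supplies the capsid


def parse_pseudotype(name: str) -> Pseudotype:
    """Parse a name such as 'AAV2/5' into its genome and capsid serotypes."""
    match = re.fullmatch(r"AAV(\d+)/(\d+)", name)
    if match is None:
        raise ValueError(f"not a genome/capsid pseudotype name: {name!r}")
    return Pseudotype(genome=f"AAV{match.group(1)}", capsid=f"AAV{match.group(2)}")


if __name__ == "__main__":
    print(parse_pseudotype("AAV2/5"))
```

For "AAV2/5" this yields an AAV2 genome and an AAV5 capsid, matching the description in the text.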

Immunology

AAV is of particular interest to gene therapists due to its apparent limited capacity to induce immune responses in humans, a factor which should positively influence vector transduction efficiency while reducing the risk of any immune-associated pathology.

AAV is not considered to have any known role in disease.

Innate

The innate immune response to the AAV vectors has been characterised in animal models. Intravenous administration in mice causes transient production of pro-inflammatory cytokines and some infiltration of neutrophils and other leukocytes into the liver, which seems to sequester a large percentage of the injected viral particles. Both soluble factor levels and cell infiltration appear to return to baseline within six hours. By contrast, more aggressive viruses produce innate responses lasting 24 hours or longer.

Humoral

The virus is known to instigate robust humoral immunity in animal models and in the human population, where up to 80% of individuals are thought to be seropositive for AAV2. Antibodies are known to be neutralising, and for gene therapy applications these do impact on vector transduction efficiency via some routes of administration. As well as persistent AAV specific antibody levels, it appears from both prime-boost studies in animals and from clinical trials that the B-cell memory is also strong. In seropositive humans, circulating IgG antibodies for AAV2 appear to be primarily composed of the IgG1 and IgG2 subclasses, with little or no IgG3 or IgG4 present.

Cell-mediated

The cell-mediated response to the virus and to vectors is poorly characterised, and has been largely ignored in the literature as recently as 2005. Clinical trials using an AAV2-based vector to treat haemophilia B seem to indicate that targeted destruction of transduced cells may be occurring. Combined with data that shows that CD8+ T-cells can recognise elements of the AAV capsid in vitro, it appears that there may be a cytotoxic T lymphocyte response to AAV vectors. Cytotoxic responses would imply the involvement of CD4+ T helper cells in the response to AAV and in vitro data from human studies suggests that the virus may indeed induce such responses, including both Th1 and Th2 memory responses. A number of candidate T cell stimulating epitopes have been identified within the AAV capsid protein VP1, which may be attractive targets for modification of the capsid if the virus is to be used as a vector for gene therapy.

Infection cycle

There are several steps in the AAV infection cycle, from infecting a cell to producing new infectious particles:
  1. attachment to the cell membrane
  2. receptor-mediated endocytosis
  3. endosomal trafficking
  4. escape from the late endosome or lysosome
  5. translocation to the nucleus
  6. uncoating
  7. formation of double-stranded DNA replicative form of the AAV genome
  8. expression of rep genes
  9. genome replication
  10. expression of cap genes, synthesis of progeny ssDNA particles
  11. assembly of complete virions, and
  12. release from the infected cell.
Some of these steps may look different in various types of cells, which, in part, contributes to the defined and quite limited native tropism of AAV. Replication of the virus can also vary in one cell type, depending on the cell's current cell cycle phase.

The characteristic feature of the adeno-associated virus is a deficiency in replication and thus its inability to multiply in unaffected cells. Adeno-associated virus spreads by co-infecting a cell with a helper virus. The first helper virus that was described as providing successful generation of new AAV particles was the adenovirus, from which the AAV name originated. It was then shown that AAV replication can be facilitated by selected proteins derived from the adenovirus genome, by other viruses such as HSV or vaccinia, or by genotoxic agents, such as UV irradiation or hydroxyurea. Depending on the presence or absence of a helper virus, the life cycle of AAV follows either a lytic or lysogenic pathway, respectively. If there is a helper virus, AAV's gene expression activates, allowing the virus to replicate using the host cell's polymerase. When the helper virus kills the host cell, the new AAV virions are released. If there is no helper virus present, AAV exhibits lysogenic behavior. When AAV infects a cell alone, its gene expression is repressed (AAV does not replicate), and its genome is incorporated into the host genome (into human chromosome 19). In rare cases, lysis can occur without a helper virus, but usually AAV cannot replicate and kill a cell on its own.

The minimal set of the adenoviral genes required for efficient generation of progeny AAV particles was discovered by Matsushita, Ellinger et al. This discovery allowed for new production methods of recombinant AAV, which do not require adenoviral co-infection of the AAV-producing cells. In the absence of helper virus or genotoxic factors, AAV DNA can either integrate into the host genome or persist in episomal form. In the former case integration is mediated by Rep78 and Rep68 proteins and requires the presence of ITRs flanking the region being integrated. In mice, the AAV genome has been observed persisting for long periods of time in quiescent tissues, such as skeletal muscles, in episomal form (a circular head-to-tail conformation).

Butane

From Wikipedia, the free encyclopedia ...