DNA sequencers
Manufacturers: Roche, Illumina, Life Technologies, Beckman Coulter, Pacific Biosciences
A DNA sequencer is a scientific instrument used to automate the DNA sequencing process. Given a sample of DNA, a DNA sequencer is used to determine the order of the four bases: G (guanine), C (cytosine), A (adenine) and T (thymine). This is then reported as a text string, called a read. Some DNA sequencers can also be considered optical instruments, as they analyze light signals originating from fluorochromes attached to nucleotides.
The first automated DNA sequencer, invented by Lloyd M. Smith, was introduced by Applied Biosystems in 1987. It used the Sanger sequencing method, a technology that formed the basis of the “first generation” of DNA sequencers and enabled the completion of the human genome project in 2001. First-generation DNA sequencers are essentially automated electrophoresis systems that detect the migration of labelled DNA fragments. These sequencers can therefore also be used for genotyping genetic markers where only the length of the DNA fragments needs to be determined (e.g. microsatellites, AFLPs).
The Human Genome Project spurred the development of cheaper, higher-throughput and more accurate platforms, known as next-generation sequencers (NGS), to sequence the human genome. These include the 454, SOLiD and Illumina DNA sequencing platforms. Next-generation sequencing machines have increased the rate of DNA sequencing substantially compared with the earlier Sanger methods. DNA samples can be prepared automatically in as little as 90 minutes, while a human genome can be sequenced at 15 times coverage in a matter of days.
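The "15 times coverage" figure is simple arithmetic: average coverage is the total number of sequenced bases divided by the genome length. The following is a minimal Python sketch of that relationship. The genome size (about 3.1 billion bases) and the 150-base read length are illustrative assumptions, not figures from this article.

```python
import math

# Assumed values for illustration only (not taken from the article).
HUMAN_GENOME_BASES = 3_100_000_000
READ_LENGTH = 150


def reads_needed(genome_length: int, read_length: int, coverage: float) -> int:
    """Number of reads required to reach a given average coverage."""
    return math.ceil(coverage * genome_length / read_length)


def average_coverage(n_reads: int, read_length: int, genome_length: int) -> float:
    """Average number of times each genome position is covered by a read."""
    return n_reads * read_length / genome_length


if __name__ == "__main__":
    n = reads_needed(HUMAN_GENOME_BASES, READ_LENGTH, 15)
    print(f"{n:,} reads of {READ_LENGTH} bases give about 15x coverage")
```

Under these assumptions, roughly 310 million reads are needed. Real experiments usually sequence more, because coverage is uneven across the genome.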
More recent, third-generation DNA sequencers such as SMRT and Oxford Nanopore measure the addition of nucleotides to a single DNA molecule in real time.
Because of limitations in DNA sequencer technology, these reads are short compared to the length of a genome, so they must be assembled into longer contigs. The data may also contain errors, caused by limitations in the DNA sequencing technique or by errors during PCR amplification. DNA sequencer manufacturers use a number of different methods to detect which DNA bases are present. The specific protocols applied in different sequencing platforms have an impact on the final data that is generated. Therefore, comparing data quality and cost across different technologies can be a daunting task. Each manufacturer provides its own way of reporting sequencing errors and scores. However, errors and scores from different platforms cannot always be compared directly. Since these systems rely on different DNA sequencing approaches, choosing the best DNA sequencer and method will typically depend on the experiment objectives and available budget.
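To illustrate what assembling reads into a contig means, here is a toy greedy overlap merger in Python. It is a sketch under strong assumptions: the reads are error-free, all come from the same strand, and none is contained inside another. Production assemblers use far more elaborate methods; the function names and the minimum overlap of 3 bases are hypothetical choices made for this example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return reads  # nothing left to merge: these are the contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

With the three example reads, the sketch returns the single contig `ATGGCGTACCTGA`. Reads that share no sufficient overlap stay as separate contigs.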
History
The first DNA sequencing methods were developed by Gilbert (1973) and Sanger (1975).
Gilbert introduced a sequencing method based on chemical modification
of DNA followed by cleavage at specific bases whereas Sanger’s technique
is based on dideoxynucleotide
chain termination. The Sanger method became popular due to its
increased efficiency and low radioactivity. The first automated DNA
sequencer was the AB370A, introduced in 1986 by Applied Biosystems.
The AB370A could sequence 96 samples simultaneously, process 500
kilobases per day, and reach read lengths of up to 600 bases. This was
the beginning of the “first generation” of DNA sequencers,
which implemented Sanger sequencing, fluorescent dideoxy nucleotides
and polyacrylamide gel sandwiched between glass plates (slab gels). The
next major advance was the release in 1995 of the AB310 which utilized a
linear polymer in a capillary in place of the slab gel for DNA strand
separation by electrophoresis. These techniques formed the basis for the
completion of the human genome project in 2001.
The human genome project spurred the development of cheaper,
higher-throughput and more accurate platforms known as Next Generation
Sequencers (NGS). In 2005, 454 Life Sciences
released the 454 sequencer, followed by Solexa Genome Analyzer and
SOLiD (Supported Oligo Ligation Detection) by Agencourt in 2006. Applied
Biosystems acquired Agencourt in 2006, and in 2007, Roche bought 454
Life Sciences, while Illumina purchased Solexa. Ion Torrent entered the
market in 2010 and was acquired by Life Technologies (now Thermo Fisher
Scientific). These are still the most common NGS systems due to their
competitive cost, accuracy, and performance.
More recently, a third generation of DNA sequencers was
introduced. The sequencing methods applied by these sequencers do not
require DNA amplification (polymerase chain reaction – PCR), which
speeds up the sample preparation before sequencing and reduces errors.
In addition, sequencing data is collected from the reactions caused by
the addition of nucleotides in the complementary strand in real time.
Two companies introduced different approaches in their third-generation
sequencers. Pacific Biosciences sequencers utilize a method called
single-molecule real-time (SMRT) sequencing, where sequencing data is produced by
light (captured by a camera) emitted when a fluorescently labelled nucleotide
is added to the complementary strand by a polymerase enzyme. Oxford
Nanopore Technologies is another company developing third-generation
sequencers using electronic systems based on nanopore sensing
technologies.
Manufacturers of DNA sequencers
DNA sequencers have been developed, manufactured, and sold by the following companies, among others.
Roche
The 454 DNA sequencer was the first next-generation sequencer to become commercially successful.
It was developed by 454 Life Sciences and purchased by Roche in 2007.
454 utilizes the detection of pyrophosphate released by the DNA
polymerase reaction when adding a nucleotide to the template strand.
Roche manufactured two systems based on its pyrosequencing technology: the GS FLX+ and the GS Junior System.
The GS FLX+ System promised read lengths of approximately 1000 base
pairs, while the GS Junior System promised 400 base pair reads.
A predecessor to the GS FLX+, the 454 GS FLX Titanium system was released
in 2008, achieving an output of 0.7 gigabases (Gb) of data per run, 99.9%
accuracy after quality filtering, and read lengths of up to 700 bp. In
2009, Roche launched the GS Junior, a benchtop version of the 454
sequencer with read lengths of up to 400 bp and simplified library
preparation and data processing.
One of the advantages of 454 systems is their running speed.
Manpower can also be reduced through automation of library preparation and
semi-automation of emulsion PCR. A disadvantage of the 454 system is
that it is prone to errors when estimating the number of bases in a long
string of identical nucleotides. This is referred to as a homopolymer
error and occurs when there are 6 or more identical bases in a row. Another disadvantage is that reagents are relatively expensive compared with those of other next-generation sequencers.
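The regions at risk of homopolymer errors can be located with a simple scan for runs of identical bases. This is a minimal Python sketch that uses the threshold of 6 bases quoted above. The function name and the output format are hypothetical choices for this example, not part of any Roche software.

```python
from itertools import groupby


def homopolymer_runs(read: str, min_length: int = 6):
    """Return (start, base, length) for each run of identical bases
    that is at least `min_length` long. Positions are 0-based."""
    runs = []
    position = 0
    for base, group in groupby(read):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


if __name__ == "__main__":
    # One run of seven T's (flagged) and one run of five A's (not flagged).
    print(homopolymer_runs("ACGTTTTTTTGCAAAAAC"))
```

For the example read, only the run of seven T's starting at position 3 is reported. The run of five A's falls below the threshold.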
In 2013 Roche announced that they would be shutting down
development of 454 technology and phasing out 454 machines completely in
2016.
Roche produces a number of software tools which are optimised for the analysis of 454 sequencing data. GS Run Processor
converts raw images generated by a sequencing run into intensity
values. The process consists of two main steps: image processing and
signal processing. The software also applies normalization, signal
correction, base-calling and quality scores for individual reads. The
software outputs data in Standard Flowgram Format (or SFF) files to be
used in data analysis applications (GS De Novo Assembler, GS Reference
Mapper or GS Amplicon Variant Analyzer). GS De Novo Assembler is a tool
for de novo assembly of whole genomes up to 3GB in size from
shotgun reads alone or combined with paired-end data generated by 454
sequencers. It also supports de novo assembly of transcripts (including
analysis) and isoform variant detection.
GS Reference Mapper maps short reads to a reference genome, generating a
consensus sequence. The software is able to generate output files for
assessment, indicating insertions, deletions and SNPs. It can handle large
and complex genomes of any size.
Finally, the GS Amplicon Variant Analyzer aligns reads from amplicon
samples against a reference, identifying variants (linked or not) and
their frequencies. It can also be used to detect unknown and
low-frequency variants. It includes graphical tools for analysis of
alignments.
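The idea of reporting variants and their frequencies can be illustrated with a toy tally over reads that are already aligned to the reference. This sketch is not the GS Amplicon Variant Analyzer algorithm. It assumes substitution-only differences, reads of the same length as the reference, and `N` for uncalled bases; the function name is hypothetical.

```python
from collections import Counter


def variant_frequencies(reference: str, aligned_reads):
    """Map (position, ref_base, alt_base) to the fraction of reads
    covering that position that show alt_base instead of ref_base."""
    variant_counts = Counter()
    depth = Counter()
    for read in aligned_reads:
        for pos, (ref_base, read_base) in enumerate(zip(reference, read)):
            if read_base == "N":  # uncalled base: does not count as coverage
                continue
            depth[pos] += 1
            if read_base != ref_base:
                variant_counts[(pos, ref_base, read_base)] += 1
    return {key: n / depth[key[0]] for key, n in variant_counts.items()}


if __name__ == "__main__":
    reads = ["ACGTAC", "ACTTAC", "ACTTAC", "ACGTAC"]
    print(variant_frequencies("ACGTAC", reads))
```

In the example, half of the reads carry a G to T substitution at position 2, so the sketch reports that variant with a frequency of 0.5.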
Illumina
Illumina produces a number of next-generation sequencing machines using technology acquired from Manteia Predictive Medicine and developed by Solexa.
These include the HiSeq, Genome Analyzer IIx, MiSeq and the
HiScanSQ, which can also process microarrays.
The technology leading to these DNA sequencers was first released by Solexa in 2006 as the Genome Analyzer.
Illumina purchased Solexa in 2007. The Genome Analyzer uses a
sequencing by synthesis method. The first model produced 1 gigabase (Gb) per run.
During 2009 the output was increased from 20 Gb per run in August
to 50 Gb per run in December. In 2010 Illumina released the HiSeq 2000
with an output of 200 Gb and later 600 Gb per run, with a run taking 8 days. At
its release the HiSeq 2000 provided one of the cheapest sequencing
platforms at $0.02 per million bases, as costed by the Beijing Genomics Institute.
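The run figures above translate into daily throughput and cost per run with a few lines of arithmetic. In this small Python sketch the function names are hypothetical. Applying the at-release price of $0.02 per million bases to the later 600-gigabase run is an assumption made only for illustration.

```python
def throughput_gb_per_day(output_gb: float, run_days: float) -> float:
    """Average gigabases produced per day over one run."""
    return output_gb / run_days


def sequencing_cost_usd(bases: float, usd_per_million_bases: float) -> float:
    """Cost of sequencing `bases` bases at a given price per million bases."""
    return bases / 1_000_000 * usd_per_million_bases


if __name__ == "__main__":
    per_day = throughput_gb_per_day(600, 8)
    # Assumption: the $0.02 per million bases price applies to a 600 Gb run.
    run_cost = sequencing_cost_usd(600e9, 0.02)
    print(f"{per_day:.0f} Gb/day, about ${run_cost:,.0f} per 600 Gb run")
```

Under these assumptions, a 600-gigabase run over 8 days averages 75 gigabases per day and costs about 12,000 US dollars.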
In 2011 Illumina released a benchtop sequencer called the MiSeq.
At its release the MiSeq could generate 1.5 Gb per run with paired-end
150 bp reads. A sequencing run can be performed in 10 hours when using
automated DNA sample preparation.
The Illumina HiSeq uses two software tools to calculate the
number and position of DNA clusters to assess the sequencing quality:
the HiSeq control system and the real-time analyzer. These methods help
to assess if nearby clusters are interfering with each other.
Life Technologies
Life Technologies (now Thermo Fisher Scientific) produces DNA sequencers under the Applied Biosystems and Ion Torrent brands. Applied Biosystems makes the SOLiD next-generation sequencing platform, and Sanger-based DNA sequencers such as the 3500 Genetic Analyzer.
Under the Ion Torrent brand, Applied Biosystems produces four
next-generation sequencers: the Ion PGM System, Ion Proton System, Ion
S5 and Ion S5xl systems.
The company is also believed to be developing a new capillary DNA
sequencer called SeqStudio, to be released in early 2018.
The SOLiD system was acquired by Applied Biosystems in 2006. SOLiD applies sequencing by ligation and dual base encoding.
The first SOLiD system was launched in 2007, generating read lengths
of 35 bp and 3 gigabases (Gb) of data per run. After five upgrades, the 5500xl sequencing
system was released in 2010, considerably increasing read length to
85 bp, improving accuracy up to 99.99% and producing 30 Gb per 7-day run.
The limited read length of the SOLiD has remained a significant shortcoming
and has to some extent limited its use to experiments where read length
is less vital, such as resequencing and transcriptome analysis and, more
recently, ChIP-Seq and methylation experiments.
The DNA sample preparation time for SOLiD systems has become much
quicker with the automation of sequencing library preparations such as
the Tecan system.
The colour space data produced by the SOLiD platform can be
decoded into DNA bases for further analysis; however, software that
considers the original colour space information can give more accurate
results. Life Technologies has released BioScope,
a data analysis package for resequencing, ChIP-Seq and transcriptome
analysis. It uses the MaxMapper algorithm to map the colour space reads.
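Decoding colour space into bases can be sketched in a few lines of Python. In the commonly described two-base encoding, each colour (0 to 3) represents a pair of adjacent bases. With the bases numbered A=0, C=1, G=2, T=3, the colour equals the bitwise XOR of the two base codes. The sketch below assumes that convention and a read that starts with one known base. It keeps that leading base in the output for simplicity, and the function names are hypothetical.

```python
BASES = "ACGT"  # assumed numbering: A=0, C=1, G=2, T=3


def encode_colour_space(sequence: str) -> str:
    """Encode bases as the first base followed by one colour per adjacent pair."""
    codes = [BASES.index(base) for base in sequence]
    colours = [str(a ^ b) for a, b in zip(codes, codes[1:])]
    return sequence[0] + "".join(colours)


def decode_colour_space(colour_read: str) -> str:
    """Decode a read such as 'A3103' (known first base, then colours)."""
    current = BASES.index(colour_read[0])
    decoded = [colour_read[0]]
    for colour in colour_read[1:]:
        current ^= int(colour)  # the next base follows from the previous base and the colour
        decoded.append(BASES[current])
    return "".join(decoded)


if __name__ == "__main__":
    print(encode_colour_space("ATGGC"))   # A3103
    print(decode_colour_space("A3103"))   # ATGGC
    print(decode_colour_space("A3203"))   # ATCCG: one wrong colour shifts all later bases
```

Because each base is derived from the previous one, a single miscalled colour changes every downstream base when decoded naively. This is one reason why tools that work directly with the colour space information can be more accurate.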
Beckman Coulter
Beckman Coulter (now Danaher)
has previously manufactured chain termination and capillary
electrophoresis-based DNA sequencers under the model name CEQ, including
the CEQ 8000. The company now produces the GeXP Genetic Analysis
System, which uses dye terminator sequencing. This method uses a thermocycler in much the same way as PCR to denature, anneal, and extend DNA fragments, amplifying the sequenced fragments.
Pacific Biosciences
Pacific Biosciences produces the PacBio RS and Sequel sequencing systems, which use a single-molecule real-time (SMRT) sequencing method.
This system can produce read lengths of multiple thousands of base
pairs. Higher raw read errors are corrected using either circular
consensus (where the same strand is read over and over again) or
optimized assembly strategies. Scientists have reported 99.9999% accuracy with these strategies. The Sequel system was launched in 2015 with an increased capacity and a lower price.
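The principle behind circular consensus, where repeated noisy passes over the same molecule are combined into a more accurate sequence, can be illustrated with a per-position majority vote. This is a toy Python sketch, not the PacBio algorithm. It assumes the passes are already aligned, have equal length, and differ only by substitutions, whereas real consensus callers must also handle insertions and deletions.

```python
from collections import Counter


def majority_consensus(passes):
    """Most common base at each column across equal-length, pre-aligned passes."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("passes must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )


if __name__ == "__main__":
    # Five passes over the same molecule, three of them with one error each.
    passes = ["ACGTAC", "ACGAAC", "TCGTAC", "ACGTAC", "ACGTCC"]
    print(majority_consensus(passes))  # ACGTAC
```

As long as errors fall at different positions in different passes, the vote recovers the underlying sequence. Using an odd number of passes avoids ties in this simple scheme.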
Oxford Nanopore
Oxford Nanopore Technologies has begun shipping early versions of its nanopore sequencing MinION sequencer to selected labs. The device is four inches long and gets power from a USB port. MinION decodes DNA directly as the molecule is drawn at the rate of 450 bases/second through a nanopore
suspended in a membrane. Changes in electric current indicate which
base is present. It is 60 to 85 percent accurate, compared with 99.9
percent in conventional machines. Even inaccurate results may prove
useful because the device produces long reads. GridION is a slightly
larger sequencer that processes up to five MinION flow cells at once.
PromethION is another (unreleased) product that will use as many as
100,000 pores in parallel, making it more suitable for high-volume sequencing.
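Accuracy percentages such as those quoted above are often expressed on the Phred quality scale, Q = -10 log10(p), where p is the per-base error probability. The Python sketch below converts between the two and estimates the expected number of errors in a read. It assumes a uniform error rate along the read, and the 1,000-base read length is an illustrative assumption. Platforms calibrate their scores differently, so the results are only a rough comparison.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Phred quality score for a per-base accuracy between 0 and 1 (exclusive of 1)."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(error)


def expected_errors(read_length: int, accuracy: float) -> float:
    """Expected number of miscalled bases, assuming a uniform error rate."""
    return read_length * (1.0 - accuracy)


if __name__ == "__main__":
    for accuracy in (0.60, 0.85, 0.999):
        q = phred_from_accuracy(accuracy)
        errors = expected_errors(1000, accuracy)  # 1,000-base read is an assumption
        print(f"accuracy {accuracy:.1%}: Q{q:.1f}, ~{errors:.0f} errors per 1,000 bases")
```

On this scale, 99.9 percent accuracy corresponds to Q30, while 85 percent accuracy is a little above Q8.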