
Monday, February 24, 2025

Neural coding

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Neural_coding

Neural coding (or neural representation) is a neuroscience field concerned with characterising the hypothetical relationship between a stimulus and the responses of individual neurons or neuronal ensembles, and the relationships among the electrical activities of the neurons within an ensemble. Based on the theory that sensory and other information is represented in the brain by networks of neurons, it is believed that neurons can encode both digital and analog information.

Overview

Neurons have an ability uncommon among the cells of the body to propagate signals rapidly over large distances by generating characteristic electrical pulses called action potentials: voltage spikes that can travel down axons. Sensory neurons change their activities by firing sequences of action potentials in various temporal patterns in the presence of external sensory stimuli, such as light, sound, taste, smell and touch. Information about the stimulus is encoded in this pattern of action potentials and transmitted into and around the brain. Beyond this, specialized neurons, such as those of the retina, can communicate more information through graded potentials. These differ from action potentials because information about the strength of a stimulus directly correlates with the strength of the neurons' output. The signal decays much faster for graded potentials, necessitating short inter-neuron distances and high neuronal density. The advantage of graded potentials is higher information rates capable of encoding more states (i.e. higher fidelity) than spiking neurons.

Although action potentials can vary somewhat in duration, amplitude and shape, they are typically treated as identical stereotyped events in neural coding studies. If the brief duration of an action potential (about 1 ms) is ignored, an action potential sequence, or spike train, can be characterized simply by a series of all-or-none point events in time. The lengths of interspike intervals (ISIs) between two successive spikes in a spike train often vary, apparently randomly. The study of neural coding involves measuring and characterizing how stimulus attributes, such as light or sound intensity, or motor actions, such as the direction of an arm movement, are represented by neuron action potentials or spikes. In order to describe and analyze neuronal firing, statistical methods and methods of probability theory and stochastic point processes have been widely applied.
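
To make the point-process description concrete, the short Python sketch below (the spike times are made up, and numpy is assumed to be available) computes the interspike intervals of a single spike train and their coefficient of variation, a standard summary of ISI variability:

import numpy as np

# Hypothetical spike times (seconds) from a single trial.
spike_times = np.array([0.012, 0.030, 0.071, 0.083, 0.120, 0.190, 0.204])

# Interspike intervals (ISIs): time differences between successive spikes.
isis = np.diff(spike_times)

# Mean ISI and coefficient of variation (CV); a Poisson process has CV close to 1.
mean_isi = isis.mean()
cv = isis.std() / mean_isi
print(f"mean ISI: {mean_isi * 1000:.1f} ms, CV: {cv:.2f}")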

With the development of large-scale neural recording and decoding technologies, researchers have begun to crack the neural code and have already provided the first glimpse into the real-time neural code as memory is formed and recalled in the hippocampus, a brain region known to be central for memory formation. Neuroscientists have initiated several large-scale brain decoding projects.

Encoding and decoding

The link between stimulus and response can be studied from two opposite points of view. Neural encoding refers to the map from stimulus to response. The main focus is to understand how neurons respond to a wide variety of stimuli, and to construct models that attempt to predict responses to other stimuli. Neural decoding refers to the reverse map, from response to stimulus, and the challenge is to reconstruct a stimulus, or certain aspects of that stimulus, from the spike sequences it evokes.

Hypothesized coding schemes

A sequence, or 'train', of spikes may contain information based on different coding schemes. In some neurons the strength with which a postsynaptic partner responds may depend solely on the 'firing rate', the average number of spikes per unit time (a 'rate code'). At the other end, a complex 'temporal code' is based on the precise timing of single spikes. They may be locked to an external stimulus such as in the visual and auditory system or be generated intrinsically by the neural circuitry.

Whether neurons use rate coding or temporal coding is a topic of intense debate within the neuroscience community, even though there is no clear definition of what these terms mean.

Rate code

The rate coding model of neuronal firing communication states that as the intensity of a stimulus increases, the frequency or rate of action potentials, or "spike firing", increases. Rate coding is sometimes called frequency coding.

Rate coding is a traditional coding scheme, assuming that most, if not all, information about the stimulus is contained in the firing rate of the neuron. Because the sequence of action potentials generated by a given stimulus varies from trial to trial, neuronal responses are typically treated statistically or probabilistically. They may be characterized by firing rates, rather than as specific spike sequences. In most sensory systems, the firing rate increases, generally non-linearly, with increasing stimulus intensity. Under a rate coding assumption, any information possibly encoded in the temporal structure of the spike train is ignored. Consequently, rate coding is inefficient but highly robust with respect to the ISI 'noise'.

During rate coding, precisely calculating the firing rate is very important. In fact, the term "firing rate" has a few different definitions, which refer to different averaging procedures, such as an average over time (the spike count of a single neuron divided by the trial duration) or an average over several repetitions of the experiment (the spike density of the PSTH).

In rate coding, learning is based on activity-dependent synaptic weight modifications.

Rate coding was originally shown by Edgar Adrian and Yngve Zotterman in 1926. In this simple experiment different weights were hung from a muscle. As the weight of the stimulus increased, the number of spikes recorded from sensory nerves innervating the muscle also increased. From these original experiments, Adrian and Zotterman concluded that action potentials were unitary events, and that the frequency of events, and not individual event magnitude, was the basis for most inter-neuronal communication.

In the following decades, measurement of firing rates became a standard tool for describing the properties of all types of sensory or cortical neurons, partly due to the relative ease of measuring rates experimentally. However, this approach neglects all the information possibly contained in the exact timing of the spikes. During recent years, more and more experimental evidence has suggested that a straightforward firing rate concept based on temporal averaging may be too simplistic to describe brain activity.

Spike-count rate (average over time)

The spike-count rate, also referred to as temporal average, is obtained by counting the number of spikes that appear during a trial and dividing by the duration of the trial. The length T of the time window is set by the experimenter and depends on the type of neuron recorded from and on the stimulus. In practice, to get sensible averages, several spikes should occur within the time window. Typical values are T = 100 ms or T = 500 ms, but the duration may also be longer or shorter (Chapter 1.5 in the textbook 'Spiking Neuron Models').
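
As a minimal illustration (the spike times and window length below are made up), the spike-count rate is simply the number of spikes falling in the window divided by its duration:

import numpy as np

# Hypothetical spike times (seconds) recorded during one trial.
spike_times = np.array([0.02, 0.09, 0.15, 0.22, 0.31, 0.40, 0.47])

T = 0.5  # counting window chosen by the experimenter (500 ms)

# Spike-count rate: spikes falling inside the window, divided by T.
rate = np.sum((spike_times >= 0.0) & (spike_times < T)) / T
print(f"spike-count rate: {rate:.1f} spikes/s")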

The spike-count rate can be determined from a single trial, but at the expense of losing all temporal resolution about variations in neural response during the course of the trial. Temporal averaging can work well in cases where the stimulus is constant or slowly varying and does not require a fast reaction of the organism — and this is the situation usually encountered in experimental protocols. Real-world input, however, is hardly stationary, but often changing on a fast time scale. For example, even when viewing a static image, humans perform saccades, rapid changes of the direction of gaze. The image projected onto the retinal photoreceptors therefore changes every few hundred milliseconds.

Despite its shortcomings, the concept of a spike-count rate code is widely used not only in experiments, but also in models of neural networks. It has led to the idea that a neuron transforms information about a single input variable (the stimulus strength) into a single continuous output variable (the firing rate).

There is a growing body of evidence that in Purkinje neurons, at least, information is not simply encoded in firing but also in the timing and duration of non-firing, quiescent periods. There is also evidence from retinal cells that information is encoded not only in the firing rate but also in spike timing.[19] More generally, whenever a rapid response of an organism is required, a firing rate defined as a spike-count over a few hundred milliseconds is simply too slow.

Time-dependent firing rate (averaging over several trials)

The time-dependent firing rate is defined as the average number of spikes (averaged over trials) appearing during a short interval between times t and t+Δt, divided by the duration of the interval. It works for stationary as well as for time-dependent stimuli. To experimentally measure the time-dependent firing rate, the experimenter records from a neuron while stimulating with some input sequence. The same stimulation sequence is repeated several times and the neuronal response is reported in a Peri-Stimulus-Time Histogram (PSTH). The time t is measured with respect to the start of the stimulation sequence. The Δt must be large enough (typically in the range of one or a few milliseconds) so that there is a sufficient number of spikes within the interval to obtain a reliable estimate of the average. The number of occurrences of spikes nK(t;t+Δt) summed over all repetitions of the experiment divided by the number K of repetitions is a measure of the typical activity of the neuron between time t and t+Δt. A further division by the interval length Δt yields the time-dependent firing rate r(t) of the neuron, which is equivalent to the spike density of the PSTH (Chapter 1.5 in the textbook 'Spiking Neuron Models').

For sufficiently small Δt, r(t)Δt is the average number of spikes occurring between times t and t+Δt over multiple trials. If Δt is small, there will never be more than one spike within the interval between t and t+Δt on any given trial. This means that r(t)Δt is also the fraction of trials on which a spike occurred between those times. Equivalently, r(t)Δt is the probability that a spike occurs during this time interval.
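
The sketch below (spike times, trial count, and bin width are made up) estimates the time-dependent firing rate r(t) from several repetitions, i.e. the spike density of the PSTH:

import numpy as np

# Hypothetical spike times (seconds) from K repetitions of the same stimulus.
trials = [
    np.array([0.011, 0.052, 0.101, 0.153]),
    np.array([0.013, 0.049, 0.099, 0.160]),
    np.array([0.010, 0.055, 0.104, 0.149]),
]
K = len(trials)

dt = 0.005                                   # bin width Δt (5 ms)
edges = np.arange(0.0, 0.2 + dt, dt)

# n_K(t; t+Δt): spike counts summed over all repetitions, bin by bin.
counts = sum(np.histogram(t, bins=edges)[0] for t in trials)

# Time-dependent firing rate r(t) = n_K / (K * Δt), the PSTH spike density.
rate = counts / (K * dt)
print(rate)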

As an experimental procedure, the time-dependent firing rate measure is a useful method to evaluate neuronal activity, in particular in the case of time-dependent stimuli. The obvious problem with this approach is that it cannot be the coding scheme used by neurons in the brain. Neurons cannot wait for a stimulus to be presented repeatedly in exactly the same manner before generating a response.

Nevertheless, the experimental time-dependent firing rate measure can make sense, if there are large populations of independent neurons that receive the same stimulus. Instead of recording from a population of N neurons in a single run, it is experimentally easier to record from a single neuron and average over N repeated runs. Thus, the time-dependent firing rate coding relies on the implicit assumption that there are always populations of neurons.

Temporal coding

When precise spike timing or high-frequency firing-rate fluctuations are found to carry information, the neural code is often identified as a temporal code. A number of studies have found that the temporal resolution of the neural code is on a millisecond time scale, indicating that precise spike timing is a significant element in neural coding. Such codes, which communicate via the time between spikes, are also referred to as interpulse interval codes, and have been supported by recent studies.

Neurons exhibit high-frequency fluctuations of firing-rates which could be noise or could carry information. Rate coding models suggest that these irregularities are noise, while temporal coding models suggest that they encode information. If the nervous system only used rate codes to convey information, a more consistent, regular firing rate would have been evolutionarily advantageous, and neurons would have utilized this code over other less robust options. Temporal coding supplies an alternate explanation for the "noise," suggesting that it actually encodes information and affects neural processing. To model this idea, binary symbols can be used to mark the spikes: 1 for a spike, 0 for no spike. Temporal coding allows the sequence 000111000111 to mean something different from 001100110011, even though the mean firing rate is the same for both sequences: six spikes over the same interval.
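
A small numerical sketch of this example (treating each binary symbol as a 1 ms bin):

import numpy as np

# Two spike trains as binary symbols (1 = spike, 0 = no spike), one per 1 ms bin.
a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1])
b = np.array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1])

# A rate code sees no difference: both trains contain six spikes.
print(a.sum(), b.sum())

# A temporal code can distinguish them, e.g. by their interspike intervals.
print(np.diff(np.flatnonzero(a)))   # [1 1 4 1 1]
print(np.diff(np.flatnonzero(b)))   # [1 3 1 3 1]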

Until recently, scientists had put the most emphasis on rate encoding as an explanation for post-synaptic potential patterns. However, functions of the brain are more temporally precise than the use of only rate encoding seems to allow. In other words, essential information could be lost due to the inability of the rate code to capture all the available information of the spike train. In addition, responses are different enough between similar (but not identical) stimuli to suggest that the distinct patterns of spikes contain a higher volume of information than is possible to include in a rate code.

Temporal codes (also called spike codes) employ those features of the spiking activity that cannot be described by the firing rate. For example, time-to-first-spike after the stimulus onset, phase-of-firing with respect to background oscillations, characteristics based on the second and higher statistical moments of the ISI probability distribution, spike randomness, or precisely timed groups of spikes (temporal patterns) are candidates for temporal codes. As there is no absolute time reference in the nervous system, the information is carried either in terms of the relative timing of spikes in a population of neurons (temporal patterns) or with respect to an ongoing brain oscillation (phase of firing). One way in which temporal codes are decoded, in the presence of neural oscillations, is that spikes occurring at specific phases of an oscillatory cycle are more effective in depolarizing the post-synaptic neuron.

The temporal structure of a spike train or firing rate evoked by a stimulus is determined both by the dynamics of the stimulus and by the nature of the neural encoding process. Stimuli that change rapidly tend to generate precisely timed spikes (and rapidly changing firing rates in PSTHs) no matter what neural coding strategy is being used. Temporal coding in the narrow sense refers to temporal precision in the response that does not arise solely from the dynamics of the stimulus, but that nevertheless relates to properties of the stimulus. The interplay between stimulus and encoding dynamics makes the identification of a temporal code difficult.

In temporal coding, learning can be explained by activity-dependent synaptic delay modifications. The modifications can themselves depend not only on spike rates (rate coding) but also on spike timing patterns (temporal coding), i.e., can be a special case of spike-timing-dependent plasticity.

The issue of temporal coding is distinct and independent from the issue of independent-spike coding. If each spike is independent of all the other spikes in the train, the temporal character of the neural code is determined by the behavior of time-dependent firing rate r(t). If r(t) varies slowly with time, the code is typically called a rate code, and if it varies rapidly, the code is called temporal.

Temporal coding in sensory systems

For very brief stimuli, a neuron's maximum firing rate may not be fast enough to produce more than a single spike. Due to the density of information about the abbreviated stimulus contained in this single spike, it would seem that the timing of the spike itself would have to convey more information than simply the average frequency of action potentials over a given period of time. This model is especially important for sound localization, which occurs within the brain on the order of milliseconds. The brain must obtain a large quantity of information based on a relatively short neural response. Additionally, if low firing rates on the order of ten spikes per second must be distinguished from arbitrarily close rates encoding different stimuli, then a neuron trying to discriminate these two stimuli may need to wait for a second or more to accumulate enough information. This is not consistent with the fact that numerous organisms are able to discriminate between stimuli on a time scale of milliseconds, suggesting that a rate code is not the only model at work.

To account for the fast encoding of visual stimuli, it has been suggested that neurons of the retina encode visual information in the latency time between stimulus onset and first action potential, also called latency to first spike or time-to-first-spike. This type of temporal coding has been shown also in the auditory and somato-sensory system. The main drawback of such a coding scheme is its sensitivity to intrinsic neuronal fluctuations. In the primary visual cortex of macaques, the timing of the first spike relative to the start of the stimulus was found to provide more information than the interval between spikes. However, the interspike interval could be used to encode additional information, which is especially important when the spike rate reaches its limit, as in high-contrast situations. For this reason, temporal coding may play a part in coding defined edges rather than gradual transitions.
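
As a sketch of such a latency code (all spike times below are invented for illustration), the readout is simply the time of the first spike after stimulus onset on each trial:

import numpy as np

# Hypothetical spike times (ms, relative to stimulus onset) for several trials.
trials = [
    np.array([23.5, 41.0, 58.2]),
    np.array([21.8, 44.6]),
    np.array([25.1, 39.9, 61.3, 80.0]),
]

# Time-to-first-spike per trial; a stronger stimulus would typically shorten it.
latencies = np.array([t.min() for t in trials])
print(latencies, latencies.mean())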

The mammalian gustatory system is useful for studying temporal coding because of its fairly distinct stimuli and the easily discernible responses of the organism. Temporally encoded information may help an organism discriminate between different tastants of the same category (sweet, bitter, sour, salty, umami) that elicit very similar responses in terms of spike count. The temporal component of the pattern elicited by each tastant may be used to determine its identity (e.g., the difference between two bitter tastants, such as quinine and denatonium). In this way, both rate coding and temporal coding may be used in the gustatory system – rate for basic tastant type, temporal for more specific differentiation.

Research on mammalian gustatory system has shown that there is an abundance of information present in temporal patterns across populations of neurons, and this information is different from that which is determined by rate coding schemes. Groups of neurons may synchronize in response to a stimulus. In studies dealing with the front cortical portion of the brain in primates, precise patterns with short time scales only a few milliseconds in length were found across small populations of neurons which correlated with certain information processing behaviors. However, little information could be determined from the patterns; one possible theory is they represented the higher-order processing taking place in the brain.

As with the visual system, in mitral/tufted cells in the olfactory bulb of mice, first-spike latency relative to the start of a sniffing action seemed to encode much of the information about an odor. This strategy of using spike latency allows for rapid identification of and reaction to an odorant. In addition, some mitral/tufted cells have specific firing patterns for given odorants. This type of extra information could help in recognizing a certain odor, but is not completely necessary, as average spike count over the course of the animal's sniffing was also a good identifier. Along the same lines, experiments done with the olfactory system of rabbits showed distinct patterns which correlated with different subsets of odorants, and a similar result was obtained in experiments with the locust olfactory system.

Temporal coding applications

The specificity of temporal coding requires highly refined technology to measure informative, reliable, experimental data. Advances made in optogenetics allow neurologists to control spikes in individual neurons, offering electrical and spatial single-cell resolution. For example, blue light causes the light-gated ion channel channelrhodopsin to open, depolarizing the cell and producing a spike. When blue light is not sensed by the cell, the channel closes, and the neuron ceases to spike. The pattern of the spikes matches the pattern of the blue light stimuli. By inserting channelrhodopsin gene sequences into mouse DNA, researchers can control spikes and therefore certain behaviors of the mouse (e.g., making the mouse turn left). Researchers, through optogenetics, have the tools to effect different temporal codes in a neuron while maintaining the same mean firing rate, and thereby can test whether or not temporal coding occurs in specific neural circuits.

Optogenetic technology also has the potential to enable the correction of spike abnormalities at the root of several neurological and psychological disorders. If neurons do encode information in individual spike timing patterns, key signals could be missed by attempting to crack the code while looking only at mean firing rates. Understanding any temporally encoded aspects of the neural code and replicating these sequences in neurons could allow for greater control and treatment of neurological disorders such as depression, schizophrenia, and Parkinson's disease. Regulation of spike intervals in single cells more precisely controls brain activity than the addition of pharmacological agents intravenously.

Phase-of-firing code

Phase-of-firing code is a neural coding scheme that combines the spike count code with a time reference based on oscillations. This type of code assigns a time label to each spike according to the phase of local ongoing oscillations at low or high frequencies.

It has been shown that neurons in some cortical sensory areas encode rich naturalistic stimuli in terms of their spike times relative to the phase of ongoing network oscillatory fluctuations, rather than only in terms of their spike count. The local field potential signals reflect population (network) oscillations. The phase-of-firing code is often categorized as a temporal code although the time label used for spikes (i.e. the network oscillation phase) is a low-resolution (coarse-grained) reference for time. As a result, often only four discrete values for the phase are enough to represent all the information content in this kind of code with respect to the phase of oscillations in low frequencies. Phase-of-firing code is loosely based on the phase precession phenomena observed in place cells of the hippocampus. Another feature of this code is that neurons adhere to a preferred order of spiking between a group of sensory neurons, resulting in a firing sequence.
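
A rough sketch of how such a phase label might be assigned (the LFP, spike times, filter band, and the choice of four phase bins are all assumptions made for illustration; numpy and scipy are required):

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0                                   # sampling rate (Hz)
t = np.arange(0, 2.0, 1 / fs)

# Hypothetical local field potential: a 6 Hz oscillation plus noise.
lfp = np.sin(2 * np.pi * 6 * t) + 0.3 * np.random.randn(t.size)

# Band-pass the LFP around the oscillation and extract its instantaneous phase.
b, a = butter(2, [4 / (fs / 2), 8 / (fs / 2)], btype="band")
phase = np.angle(hilbert(filtfilt(b, a, lfp)))        # radians in [-pi, pi)

# Hypothetical spike times (seconds); label each spike with the oscillation
# phase at which it occurred, then coarse-grain into four phase bins.
spike_times = np.array([0.10, 0.27, 0.43, 0.61, 0.95, 1.32])
spike_phase = phase[(spike_times * fs).astype(int)]
phase_bin = np.digitize(spike_phase, [-np.pi / 2, 0.0, np.pi / 2])
print(phase_bin)   # one coarse (0-3) phase label per spike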

The phase code in visual cortex has also been shown to involve high-frequency oscillations. Within a cycle of gamma oscillation, each neuron has its own preferred relative firing time. As a result, an entire population of neurons generates a firing sequence that has a duration of up to about 15 ms.

Population coding

Population coding is a method to represent stimuli by using the joint activities of a number of neurons. In population coding, each neuron has a distribution of responses over some set of inputs, and the responses of many neurons may be combined to determine some value about the inputs. From the theoretical point of view, population coding is one of a few mathematically well-formulated problems in neuroscience. It grasps the essential features of neural coding and yet is simple enough for theoretic analysis. Experimental studies have revealed that this coding paradigm is widely used in the sensory and motor areas of the brain.

For example, in the visual area medial temporal (MT), neurons are tuned to the direction of object motion. In response to an object moving in a particular direction, many neurons in MT fire with a noise-corrupted and bell-shaped activity pattern across the population. The moving direction of the object is retrieved from the population activity, making the readout immune to the fluctuations present in a single neuron's signal. When monkeys are trained to move a joystick towards a lit target, a single neuron will fire for multiple target directions. However, it fires fastest for one direction and more slowly for others, depending on how close the target is to the neuron's "preferred" direction. If each neuron represents movement in its preferred direction, and the vector sum of all neurons is calculated (each neuron has a firing rate and a preferred direction), the sum points in the direction of motion. In this manner, the population of neurons codes the signal for the motion. This particular population code is referred to as population vector coding.
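
A minimal sketch of population vector decoding (the preferred directions and firing rates below are made up):

import numpy as np

# Hypothetical population: preferred directions (radians) and observed firing
# rates (spikes/s) of eight direction-tuned neurons.
preferred = np.linspace(0, 2 * np.pi, 8, endpoint=False)
rates = np.array([5.0, 12.0, 30.0, 55.0, 62.0, 40.0, 15.0, 6.0])

# Population vector: each neuron's preferred-direction unit vector,
# weighted by its firing rate, then summed across the population.
vec = np.array([np.cos(preferred), np.sin(preferred)]) @ rates
decoded = np.degrees(np.arctan2(vec[1], vec[0]))
print(decoded)   # estimated movement direction, in degrees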

Place-time population codes, termed the averaged-localized-synchronized-response (ALSR) code, have been derived for the neural representation of auditory acoustic stimuli. This exploits both the place, or tuning, within the auditory nerve, as well as the phase-locking within each auditory nerve fiber. The first ALSR representation was for steady-state vowels; ALSR representations of pitch and formant frequencies in complex, non-steady-state stimuli were later demonstrated for voiced-pitch and formant representations in consonant-vowel syllables. The advantage of such representations is that global features such as pitch or formant transition profiles can be represented as global features across the entire nerve simultaneously via both rate and place coding.

Population coding has a number of other advantages as well, including reduction of uncertainty due to neuronal variability and the ability to represent a number of different stimulus attributes simultaneously. Population coding is also much faster than rate coding and can reflect changes in the stimulus conditions nearly instantaneously. Individual neurons in such a population typically have different but overlapping selectivities, so that many neurons, but not necessarily all, respond to a given stimulus.

Typically an encoding function has a peak value, such that the activity of the neuron is greatest when the perceptual value is close to the peak value and falls off for values farther from it. It follows that the actual perceived value can be reconstructed from the overall pattern of activity in the set of neurons. Vector coding is an example of simple averaging. A more sophisticated mathematical technique for performing such a reconstruction is the method of maximum likelihood based on a multivariate distribution of the neuronal responses. These models can assume independence, second-order correlations, or even more detailed dependencies such as higher-order maximum entropy models or copulas.

Correlation coding

The correlation coding model of neuronal firing claims that correlations between action potentials, or "spikes", within a spike train may carry additional information above and beyond the simple timing of the spikes. Early work suggested that correlation between spike trains can only reduce, and never increase, the total mutual information present in the two spike trains about a stimulus feature. However, this was later demonstrated to be incorrect. Correlation structure can increase information content if noise and signal correlations are of opposite sign. Correlations can also carry information not present in the average firing rates of pairs of neurons. A good example of this exists in the pentobarbital-anesthetized marmoset auditory cortex, in which a pure tone causes an increase in the number of correlated spikes, but not an increase in the mean firing rate, of pairs of neurons.

Independent-spike coding

The independent-spike coding model of neuronal firing claims that each individual action potential, or "spike", is independent of each other spike within the spike train.

Position coding

Plot of typical position coding

A typical population code involves neurons with a Gaussian tuning curve whose means vary linearly with the stimulus intensity, meaning that the neuron responds most strongly (in terms of spikes per second) to a stimulus near the mean. The actual intensity could be recovered as the stimulus level corresponding to the mean of the neuron with the greatest response. However, the noise inherent in neural responses means that a maximum likelihood estimation function is more accurate.

This type of code is used to encode continuous variables such as joint position, eye position, color, or sound frequency. Any individual neuron is too noisy to faithfully encode the variable using rate coding, but an entire population ensures greater fidelity and precision. For a population of unimodal tuning curves, i.e. with a single peak, the precision typically scales linearly with the number of neurons. Hence, for half the precision, half as many neurons are required. In contrast, when the tuning curves have multiple peaks, as in grid cells that represent space, the precision of the population can scale exponentially with the number of neurons. This greatly reduces the number of neurons required for the same precision.
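
The sketch below illustrates this kind of position code, and also the maximum-likelihood reconstruction mentioned earlier: a bank of Gaussian tuning curves, Poisson spike-count noise, and a grid search for the stimulus value that best explains the observed counts. All parameter values are made up for illustration.

import numpy as np

# Hypothetical population with Gaussian tuning curves tiling the stimulus axis.
prefs = np.linspace(0, 10, 21)         # preferred stimulus values
sigma, r_max, T = 1.0, 40.0, 0.2       # tuning width, peak rate (Hz), window (s)

def mean_counts(x):
    """Expected spike count of each neuron for stimulus value x."""
    return r_max * T * np.exp(-0.5 * ((x - prefs) / sigma) ** 2)

rng = np.random.default_rng(0)
true_x = 6.3
counts = rng.poisson(mean_counts(true_x))   # noisy observed spike counts

# Maximum-likelihood decoding under independent Poisson noise: choose the
# stimulus value whose expected counts best explain the observation.
grid = np.linspace(0, 10, 1001)
log_lik = [np.sum(counts * np.log(mean_counts(x) + 1e-12) - mean_counts(x))
           for x in grid]
print(grid[int(np.argmax(log_lik))])        # estimate close to true_x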

Sparse coding

In a sparse code, each item is encoded by the strong activation of a relatively small set of neurons. For each item to be encoded, this is a different subset of all available neurons. In contrast to sensor-sparse coding, sensor-dense coding implies that all information from possible sensor locations is known.

As a consequence, sparseness may refer to temporal sparseness ("a relatively small number of time periods are active") or to sparseness in an activated population of neurons. In the latter case, sparseness may be defined in one time period as the number of activated neurons relative to the total number of neurons in the population. This seems to be a hallmark of neural computations since, compared to traditional computers, information is massively distributed across neurons. Sparse coding of natural images produces wavelet-like oriented filters that resemble the receptive fields of simple cells in the visual cortex. The capacity of sparse codes may be increased by the simultaneous use of temporal coding, as found in the locust olfactory system.

Given a potentially large set of input patterns, sparse coding algorithms (e.g. sparse autoencoder) attempt to automatically find a small number of representative patterns which, when combined in the right proportions, reproduce the original input patterns. The sparse coding for the input then consists of those representative patterns. For example, the very large set of English sentences can be encoded by a small number of symbols (i.e. letters, numbers, punctuation, and spaces) combined in a particular order for a particular sentence, and so a sparse coding for English would be those symbols.

Linear generative model

Most models of sparse coding are based on the linear generative model. In this model, the symbols are combined in a linear fashion to approximate the input.

More formally, given a set of k-dimensional real-valued input vectors $\vec{\xi} \in \mathbb{R}^k$, the goal of sparse coding is to determine n k-dimensional basis vectors $\vec{b}_1, \ldots, \vec{b}_n \in \mathbb{R}^k$, corresponding to neuronal receptive fields, along with a sparse n-dimensional vector of weights or coefficients $\vec{s}$ for each input vector, so that a linear combination of the basis vectors with proportions given by the coefficients results in a close approximation to the input vector: $\vec{\xi} \approx \sum_{j=1}^{n} s_j \vec{b}_j$.
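
A minimal numerical sketch of this model (random data, a fixed random dictionary, and an arbitrary sparsity penalty, step size, and iteration count) finds the coefficients by iterative soft-thresholding (ISTA), which is only one of many ways to obtain a sparse code:

import numpy as np

rng = np.random.default_rng(0)

k, n = 8, 16                        # input dimensionality k, number of basis vectors n
B = rng.standard_normal((k, n))     # columns are the basis vectors ("receptive fields")
B /= np.linalg.norm(B, axis=0)

xi = rng.standard_normal(k)         # an input vector to be encoded

# Find a sparse coefficient vector s with xi ~ B @ s by minimizing
# ||xi - B s||^2 + lam * ||s||_1 using iterative soft-thresholding (ISTA).
lam, step = 0.1, 1.0 / np.linalg.norm(B, 2) ** 2
s = np.zeros(n)
for _ in range(500):
    g = s + step * (B.T @ (xi - B @ s))                          # gradient step
    s = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)     # soft threshold

print(np.count_nonzero(s), "of", n, "coefficients active")
print(np.linalg.norm(xi - B @ s))   # reconstruction error

A sparse autoencoder, mentioned above, additionally learns the basis vectors themselves rather than keeping them fixed.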

The codings generated by algorithms implementing a linear generative model can be classified into codings with soft sparseness and those with hard sparseness. These refer to the distribution of basis vector coefficients for typical inputs. A coding with soft sparseness has a smooth Gaussian-like distribution, but peakier than Gaussian, with many zero values, some small absolute values, fewer larger absolute values, and very few very large absolute values. Thus, many of the basis vectors are active. Hard sparseness, on the other hand, indicates that there are many zero values, no or hardly any small absolute values, fewer larger absolute values, and very few very large absolute values, and thus few of the basis vectors are active. This is appealing from a metabolic perspective: less energy is used when fewer neurons are firing.

Another measure of coding is whether it is critically complete or overcomplete. If the number of basis vectors n is equal to the dimensionality k of the input set, the coding is said to be critically complete. In this case, smooth changes in the input vector result in abrupt changes in the coefficients, and the coding is not able to gracefully handle small scalings, small translations, or noise in the inputs. If, however, the number of basis vectors is larger than the dimensionality of the input set, the coding is overcomplete. Overcomplete codings smoothly interpolate between input vectors and are robust under input noise. The human primary visual cortex is estimated to be overcomplete by a factor of 500, so that, for example, a 14 x 14 patch of input (a 196-dimensional space) is coded by roughly 100,000 neurons.

Other models are based on matching pursuit, a sparse approximation algorithm which finds the "best matching" projections of multidimensional data, and dictionary learning, a representation learning method which aims to find a sparse matrix representation of the input data in the form of a linear combination of basic elements as well as those basic elements themselves.

Biological evidence

Sparse coding may be a general strategy of neural systems to augment memory capacity. To adapt to their environments, animals must learn which stimuli are associated with rewards or punishments and distinguish these reinforced stimuli from similar but irrelevant ones. Such tasks require implementing stimulus-specific associative memories in which only a few neurons out of a population respond to any given stimulus and each neuron responds to only a few stimuli out of all possible stimuli.

Theoretical work on sparse distributed memory has suggested that sparse coding increases the capacity of associative memory by reducing overlap between representations. Experimentally, sparse representations of sensory information have been observed in many systems, including vision, audition, touch, and olfaction. However, despite the accumulating evidence for widespread sparse coding and theoretical arguments for its importance, a demonstration that sparse coding improves the stimulus-specificity of associative memory has been difficult to obtain.

In the Drosophila olfactory system, sparse odor coding by the Kenyon cells of the mushroom body is thought to generate a large number of precisely addressable locations for the storage of odor-specific memories. Sparseness is controlled by a negative feedback circuit between Kenyon cells and GABAergic anterior paired lateral (APL) neurons. Systematic activation and blockade of each leg of this feedback circuit shows that Kenyon cells activate APL neurons and APL neurons inhibit Kenyon cells. Disrupting the Kenyon cell–APL feedback loop decreases the sparseness of Kenyon cell odor responses, increases inter-odor correlations, and prevents flies from learning to discriminate similar, but not dissimilar, odors. These results suggest that feedback inhibition suppresses Kenyon cell activity to maintain sparse, decorrelated odor coding and thus the odor-specificity of memories.

Molecular paleontology

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Molecular_paleontology

Molecular paleontology refers to the recovery and analysis of DNA, proteins, carbohydrates, or lipids, and their diagenetic products from ancient human, animal, and plant remains. The field of molecular paleontology has yielded important insights into evolutionary events, species' diasporas, and the discovery and characterization of extinct species.

In shallow time, advancements in the field of molecular paleontology have allowed scientists to pursue evolutionary questions on a genetic level rather than relying on phenotypic variation alone. By applying molecular analytical techniques to DNA in recent animal remains, one can quantify the level of relatedness between any two organisms for which DNA has been recovered. Using various biotechnological techniques such as DNA isolation, amplification, and sequencing, scientists have been able to acquire and expand insights into the divergence and evolutionary history of countless recently extinct organisms. In February 2021, scientists reported the first sequencing of DNA from animal remains over a million years old, in this instance a mammoth, the oldest DNA sequenced to date.

In deep time, compositional heterogeneities in carbonaceous remains of a diversity of animals, ranging in age from the Neoproterozoic to the Recent, have been linked to biological signatures encoded in modern biomolecules via a cascade of oxidative fossilization reactions. The macromolecular composition of carbonaceous fossils, some Tonian in age, preserve biological signatures reflecting original biomineralization, tissue types, metabolism, and relationship affinities (phylogeny).

History

The study of molecular paleontology is said to have begun with the discovery by Abelson of 360 million year old amino acids preserved in fossil shells. However, Svante Pääbo is often considered to be the founder of the field of molecular paleontology.

The field of molecular paleontology has had several major advances since the 1950s and is a continuously growing field. Below is a timeline showing notable contributions that have been made.

Timeline

A timeline of important dates in molecular paleontology; the events shown are listed below.

mid-1950s: Abelson found preserved amino acids in fossil shells that were about 360 million years old. This produced the idea of comparing fossil amino acid sequences with those of existing organisms so that molecular evolution could be studied.

1970s: Fossil peptides are studied by amino acid analysis. Start to use whole peptides and immunological methods.

Late 1970s: Palaeobotanists (also spelled paleobotanists) studied molecules from well-preserved fossil plants.

1984: The first successful DNA sequencing of an extinct species, the quagga, a zebra-like species.

1991: Published article on the successful extraction of proteins from the fossil bone of a dinosaur, specifically the Seismosaurus.

2005: Scientists resurrect extinct 1918 influenza virus.

2006: Neanderthal nuclear DNA sequence segments begin to be analyzed and published.

2007: Scientists synthesize entire extinct human endogenous retrovirus (HERV-K) from scratch.

2010: A new species of early hominid, the Denisovans, discovered from mitochondrial and nuclear genomes recovered from bone found in a cave in Siberia. Analysis showed that the Denisovan specimen lived approximately 41,000 years ago, and shared a common ancestor with both modern humans and Neanderthals approximately 1 million years ago in Africa.

2013: The first entire Neanderthal genome is successfully sequenced. More information can be found at the Neanderthal genome project.

2013: A 400,000-year-old specimen with remnant mitochondrial DNA is sequenced and found to belong to Homo heidelbergensis, a common ancestor of Neanderthals and Denisovans.

2013: Mary Schweitzer and colleagues propose the first chemical mechanism explaining the potential preservation of vertebrate cells and soft tissues into the fossil record. The mechanism proposes that free oxygen radicals, potentially produced by redox-active iron, induce biomolecule crosslinking. This crosslinking mechanism is somewhat analogous to the crosslinking that occurs during histological tissue fixation, such as with formaldehyde. The authors also suggest the source of iron to be the hemoglobin from the deceased organism.

2015: A 110,000-year-old fossil tooth containing DNA from Denisovans was reported.

2018: Molecular paleobiologists link polymers of N-, O-, S-heterocycle composition (AGEs/ALEs, as referred to in the cited publication, Wiemann et al. 2018) in carbonaceous fossil remains mechanistically to structural biomolecules in original tissues. Through oxidative crosslinking, a process similar to the Maillard reaction, nucleophilic amino acid residues condense with Reactive Carbonyl Species derived from lipids and sugars. The processes of biomolecule fossilization, identified via Raman spectroscopy of modern and fossil tissues, experimental modelling, and statistical data evaluation, include Advanced Glycosylation and Advanced Lipoxidation.

2019: An independent laboratory of Molecular Paleontologists confirms the transformation of biomolecules through Advanced Glycosylation and Lipoxidation during fossilization. The authors use Synchrotron Fourier-Transform Infrared spectroscopy.

2020: Wiemann and colleagues identify biological signatures reflecting original biomineralization, tissue types, metabolism, and relationship affinity (phylogeny) in preserved compositional heterogeneities of a diversity of carbonaceous animal fossils. This is the first large-scale analysis of fossils ranging in age from the Neoproterozoic to the Recent, and the first published record of biological signals found in complex organic matter. The authors rely on statistical analyses of a uniquely large Raman spectroscopy data set.

2021: Geochemists find tissue type signals in the composition of carbonaceous fossils dating back to the Tonian, and apply these signals to identify epibionts. The authors use Raman spectroscopy.

2022: Raman spectroscopy data revealing patterns in the fossilization of structural biomolecules have been replicated with Fourier-Transform Infrared spectroscopy and a diversity of different Raman instruments, filters, and excitation sources.

2023: The first in-depth chemical description of how original, biological cells and tissues fossilize is published. Importantly, the study shows that the free oxygen radical hypothesis (proposed by Mary Schweitzer and colleagues in 2013) is in many cases identical to the AGE/ALE formation hypothesis (proposed by Jasmina Wiemann and colleagues in 2018). The combined hypotheses, along with thermal maturation and carbonization, form a loose framework for biological cell and tissue fossilization.

The quagga

The first successful DNA sequencing of an extinct species was in 1984, from a 150-year-old museum specimen of the quagga, a zebra-like species. Mitochondrial DNA (also known as mtDNA) was sequenced from desiccated muscle of the quagga, and was found to differ by 12 base substitutions from the mitochondrial DNA of a mountain zebra. It was concluded that these two species had a common ancestor 3-4 million years ago, which is consistent with known fossil evidence of the species.

Denisovans

The Denisovans of Eurasia, a hominid species related to Neanderthals and humans, were discovered as a direct result of DNA sequencing of a 41,000-year-old specimen recovered in 2008. Analysis of the mitochondrial DNA from a retrieved finger bone showed the specimen to be genetically distinct from both humans and Neanderthals. Two teeth and a toe bone were later found to belong to different individuals from the same population. Analysis suggests that both the Neanderthals and Denisovans were already present throughout Eurasia when modern humans arrived. In November 2015, scientists reported finding a fossil tooth containing DNA from Denisovans, and estimated its age at 110,000 years.

Mitochondrial DNA analysis

A photo of Neanderthal DNA extraction in process
Neanderthal DNA extraction. Working in a clean room, researchers at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, took extensive precautions to avoid contaminating Neanderthal DNA samples - extracted from bones like this one - with DNA from any other source, including modern humans. NHGRI researchers are part of the international team that sequenced the genome of the Neanderthal, Homo neanderthalensis.

The mtDNA from the Denisovan finger bone differs from that of modern humans by 385 bases (nucleotides) in the mtDNA strand out of approximately 16,500, whereas the difference between modern humans and Neanderthals is around 202 bases. In contrast, the difference between chimpanzees and modern humans is approximately 1,462 mtDNA base pairs. This suggested a divergence time around one million years ago. The mtDNA from a tooth bore a high similarity to that of the finger bone, indicating they belonged to the same population. From a second tooth, an mtDNA sequence was recovered that showed an unexpectedly large number of genetic differences compared to that found in the other tooth and the finger, suggesting a high degree of mtDNA diversity. These two individuals from the same cave showed more diversity than seen among sampled Neanderthals from all of Eurasia, and were as different as modern-day humans from different continents.

Nuclear genome analysis

Isolation and sequencing of nuclear DNA has also been accomplished from the Denisova finger bone. This specimen showed an unusual degree of DNA preservation and a low level of contamination. The researchers were able to achieve near-complete genomic sequencing, allowing a detailed comparison with Neanderthals and modern humans. From this analysis, they concluded that, in spite of the apparent divergence of their mitochondrial sequence, the Denisova population along with Neanderthals shared a common branch from the lineage leading to modern African humans. The estimated average time of divergence between Denisovan and Neanderthal sequences is 640,000 years ago, and the time between both of these and the sequences of modern Africans is 804,000 years ago. They suggest the divergence of the Denisova mtDNA results either from the persistence of a lineage purged from the other branches of humanity through genetic drift or else from an introgression from an older hominin lineage.

Homo heidelbergensis

A photo of Homo heidelbergensis Cranium 5 found at Sima de los Huesos
Homo heidelbergensis Cranium 5 is one of the most important discoveries in the Sima de los Huesos, Atapuerca (Spain). The mandible of this cranium appeared, nearly intact, some years after its find, close to the same location.

Homo heidelbergensis was first discovered in 1907 near Heidelberg, Germany, and later also found elsewhere in Europe, Africa, and Asia. However, it was not until 2013 that a specimen with retrievable DNA was found, in a ~400,000-year-old femur from the Sima de los Huesos Cave in Spain. The femur was found to contain both mtDNA and nuclear DNA. Improvements in DNA extraction and library preparation techniques allowed mtDNA to be successfully isolated and sequenced; however, the nuclear DNA was found to be too degraded in the observed specimen, and was also contaminated with DNA from an ancient cave bear (Ursus deningeri) present in the cave. The mtDNA analysis found a surprising link between the specimen and the Denisovans, and this finding raised many questions. Several scenarios were proposed in a January 2014 paper titled "A mitochondrial genome sequence of a hominin from Sima de los Huesos", elucidating the lack of convergence in the scientific community on how Homo heidelbergensis is related to other known hominin groups. One plausible scenario that the authors proposed was that H. heidelbergensis was an ancestor to both Denisovans and Neanderthals. Completely sequenced nuclear genomes from both Denisovans and Neanderthals suggest a common ancestor approximately 700,000 years ago, and one leading researcher in the field, Svante Pääbo, suggests that perhaps this new hominin group is that early ancestor.

Applications

Discovery and characterization of new species

Molecular paleontology techniques applied to fossils have contributed to the discovery and characterization of several new species, including the Denisovans and Homo heidelbergensis. These discoveries have helped scientists better understand the path that humans took as they populated the earth, and which species were present during this diaspora.

De-extinction

An artist's color drawing of the Pyrenean ibex
The Pyrenean ibex was temporarily brought back from extinction through cloning.

It is now possible to revive extinct species using molecular paleontology techniques. This was first accomplished via cloning in 2003 with the Pyrenean ibex, a type of wild goat that became extinct in 2000. Nuclei from the Pyrenean ibex's cells were injected into goat eggs emptied of their own DNA, and implanted into surrogate goat mothers. The offspring lived only seven minutes after birth, due to defects in its lungs. Other cloned animals have been observed to have similar lung defects.

There are many species that have gone extinct as a direct result of human activity. Some examples include the dodo, the great auk, the Tasmanian tiger, the Chinese river dolphin, and the passenger pigeon. An extinct species can be revived by using allelic replacement of a closely related species that is still living. By only having to replace a few genes within an organism, instead of having to build the extinct species' genome from scratch, it could be possible to bring back several species in this way, even Neanderthals.

The ethics surrounding the re-introduction of extinct species are very controversial. Critics of bringing extinct species back to life contend that it would divert limited money and resources from protecting the world's existing biodiversity. With current extinction rates approximated to be 100 to 1,000 times the background extinction rate, it is feared that a de-extinction program might lessen public concerns over the current mass extinction crisis, if it is believed that these species can simply be brought back to life. As the editors of a Scientific American article on de-extinction pose: Should we bring back the woolly mammoth only to let elephants become extinct in the meantime? The main driving factor for the extinction of most species in this era (post 10,000 BC) is the loss of habitat, and temporarily bringing back an extinct species will not recreate the environment it once inhabited.

Proponents of de-extinction, such as George Church, speak of many potential benefits. Reintroducing an extinct keystone species, such as the woolly mammoth, could help re-balance the ecosystems that once depended on them. Some extinct species could create broad benefits for the environments they once inhabited, if returned. For example, woolly mammoths may be able to slow the melting of the Russian and Arctic tundra in several ways such as eating dead grass so that new grass can grow and take root, and periodically breaking up the snow, subjecting the ground below to the arctic air. These techniques could also be used to reintroduce genetic diversity in a threatened species, or even introduce new genes and traits to allow the animals to compete better in a changing environment.

Research and technology

When a new potential specimen is found, scientists normally first analyze for cell and tissue preservation using histological techniques, and test the conditions for the survivability of DNA. They will then attempt to isolate a DNA sample using the technique described below, and conduct a PCR amplification of the DNA to increase the amount of DNA available for testing. This amplified DNA is then sequenced. Care is taken to verify that the sequence matches the phylogenetic traits of the organism. A technique called amino acid dating can be used to estimate how long ago the organism died. It inspects the degree of racemization of aspartic acid, leucine, and alanine within the tissue. As time passes, the D/L ratio (where "D" and "L" are mirror images of each other) increases from 0 to 1. In samples where the D/L ratio of aspartic acid is greater than 0.08, ancient DNA sequences cannot be retrieved (as of 1996).

Mitochondrial DNA vs. nuclear DNA

An infographic contrasting inheritance of mitochondrial and nuclear DNA
Unlike nuclear DNA (left), mitochondrial DNA is only inherited from the maternal lineage (right).

Mitochondrial DNA (mtDNA) is separate from one's nuclear DNA. It is present in organelles called mitochondria in each cell. Unlike nuclear DNA, which is inherited from both parents and rearranged every generation, an exact copy of mitochondrial DNA gets passed down from a mother to her sons and daughters. One benefit of performing DNA analysis with mitochondrial DNA is that it has a far higher mutation rate than nuclear DNA, making the tracking of lineages on the scale of tens of thousands of years much easier. Knowing the base mutation rate for mtDNA (in humans this rate is also known as the human mitochondrial molecular clock), one can determine the amount of time any two lineages have been separated. Another advantage of mtDNA is that thousands of copies of it exist in every cell, whereas only two copies of nuclear DNA exist in each cell. All eukaryotes, a group which includes all plants, animals, and fungi, have mtDNA. A disadvantage of mtDNA is that only the maternal line is represented. For example, a child will inherit 1/8 of its DNA from each of its eight great-grandparents; however, it will inherit an exact clone of its maternal great-grandmother's mtDNA. This is analogous to a child inheriting only his paternal great-grandfather's last name, and not a mix of all eight surnames.

Isolation

There are many things to consider when isolating a substance. First, depending upon what it is and where it is located, there are protocols that must be carried out in order to avoid contamination and further degradation of the sample. Then, handling of the materials is usually done in a physically isolated work area and under specific conditions (i.e., specific temperature, moisture, etc.), also to avoid contamination and further loss of sample.

Once the material has been obtained, depending on what it is, there are different ways to isolate and purify it. DNA extraction from fossils is one of the more popular practices and there are different steps that can be taken to get the desired sample. DNA extracted from amber-entombed fossils can be taken from small samples and mixed with different substances, centrifuged, incubated, and centrifuged again.[46] On the other hand, DNA extraction from insects can be done by grinding the sample, mixing it with buffer, and undergoing purification through glass fiber columns. In the end, regardless of how the sample was isolated for these fossils, the DNA isolated must be able to undergo amplification.

Amplification

An infographic showing the replication process of PCR
Polymerase chain reaction

The field of molecular paleontology benefited greatly from the invention of the polymerase chain reaction (PCR), which allows one to make billions of copies of a DNA fragment from just a single preserved copy of the DNA. One of the biggest challenges up until this point was the extreme scarcity of recovered DNA because of degradation of the DNA over time.

Sequencing

DNA sequencing is done to determine the order of nucleotides and genes. There are many different materials from which DNA can be extracted. In animals, the mitochondrial chromosome can be used for molecular study. In plants, chloroplast DNA can be studied as a primary source of sequence data.

An evolutionary tree of mammals

In the end, the sequences generated are used to build evolutionary trees. Methods to match data sets include: maximum likelihood; minimum evolution (also known as neighbor-joining), which searches for the tree with the shortest overall length; and the maximum parsimony method, which finds the tree requiring the fewest character-state changes. The groups of species defined within a tree can also be later evaluated by statistical tests, such as the bootstrap method, to see if they are indeed significant.
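
As a rough illustration of the neighbor-joining approach (the sequences and species labels below are invented, and Biopython is assumed to be installed), a tree can be built from a small alignment like this:

from Bio.Align import MultipleSeqAlignment
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor
from Bio import Phylo

# A toy alignment of short mtDNA-like fragments (hypothetical data).
aln = MultipleSeqAlignment([
    SeqRecord(Seq("ACGTACGTACGTACGT"), id="human"),
    SeqRecord(Seq("ACGTACGAACGTACGT"), id="neanderthal"),
    SeqRecord(Seq("ACGTTCGAACGTACGA"), id="denisovan"),
    SeqRecord(Seq("ATGTTCGAACTTACGA"), id="chimp"),
])

# Pairwise distances, then neighbor-joining tree construction.
dm = DistanceCalculator("identity").get_distance(aln)
tree = DistanceTreeConstructor().nj(dm)
Phylo.draw_ascii(tree)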

Limitations and challenges

Ideal environmental conditions for preserving DNA, where the organism was desiccated and uncovered, are difficult to come by, as is maintaining the specimen's condition until analysis. Nuclear DNA normally degrades rapidly after death through endogenous hydrolytic processes, UV radiation, and other environmental stressors.

Also, interactions with the organic breakdown products of surrounding soil have been found to help preserve biomolecular materials. However, they have also created the additional challenge of separating the various components in order to conduct the proper analysis on them. Some of these breakdown products have also been found to interfere with the action of some of the enzymes used during PCR.

Finally, one of the largest challenges in extracting ancient DNA, particularly ancient human DNA, is contamination during PCR. Small amounts of human DNA can contaminate the reagents used for extraction and PCR of ancient DNA. These problems can be overcome by rigorous care in the handling of all solutions, as well as of the glassware and other tools used in the process. It can also help if only one person performs the extractions, to minimize the number of different DNA sources present.

Polymerase chain reaction

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Polymerase_chain_reaction
A strip of eight PCR tubes, each containing a 100 μL reaction mixture
Placing a strip of eight PCR tubes into a thermal cycler

The polymerase chain reaction (PCR) is a method widely used to make millions to billions of copies of a specific DNA sample rapidly, allowing scientists to amplify a very small sample of DNA (or a part of it) sufficiently to enable detailed study. PCR was invented in 1983 by American biochemist Kary Mullis at Cetus Corporation. Mullis and biochemist Michael Smith, who had developed other essential ways of manipulating DNA, were jointly awarded the Nobel Prize in Chemistry in 1993.

PCR is fundamental to many of the procedures used in genetic testing and research, including analysis of ancient samples of DNA and identification of infectious agents. Using PCR, copies of very small amounts of DNA sequences are exponentially amplified in a series of cycles of temperature changes. PCR is now a common and often indispensable technique used in medical laboratory research for a broad variety of applications including biomedical research and forensic science.

The majority of PCR methods rely on thermal cycling. Thermal cycling exposes reagents to repeated cycles of heating and cooling to permit different temperature-dependent reactions—specifically, DNA melting and enzyme-driven DNA replication. PCR employs two main reagents—primers (short single-stranded DNA fragments known as oligonucleotides, with sequences complementary to the target DNA region) and a thermostable DNA polymerase. In the first step of PCR, the two strands of the DNA double helix are physically separated at a high temperature in a process called nucleic acid denaturation. In the second step, the temperature is lowered and the primers bind to the complementary sequences of DNA. The two DNA strands then become templates for DNA polymerase to enzymatically assemble a new DNA strand from free nucleotides, the building blocks of DNA. As PCR progresses, the DNA generated is itself used as a template for replication, setting in motion a chain reaction in which the original DNA template is exponentially amplified.

Almost all PCR applications employ a heat-stable DNA polymerase, such as Taq polymerase, an enzyme originally isolated from the thermophilic bacterium Thermus aquaticus. If the polymerase used were heat-susceptible, it would denature under the high temperatures of the denaturation step. Before the use of Taq polymerase, DNA polymerase had to be manually added every cycle, which was a tedious and costly process.

Applications of the technique include DNA cloning for sequencing, gene cloning and manipulation, gene mutagenesis; construction of DNA-based phylogenies, or functional analysis of genes; diagnosis and monitoring of genetic disorders; amplification of ancient DNA; analysis of genetic fingerprints for DNA profiling (for example, in forensic science and parentage testing); and detection of pathogens in nucleic acid tests for the diagnosis of infectious diseases.

Principles

An older, three-temperature thermal cycler for PCR

PCR amplifies a specific region of a DNA strand (the DNA target). Most PCR methods amplify DNA fragments of between 0.1 and 10 kilobase pairs (kbp) in length, although some techniques allow for amplification of fragments up to 40 kbp. The amount of amplified product is determined by the available substrates in the reaction, which become limiting as the reaction progresses.

A basic PCR set-up requires several components and reagents, including:

  • a DNA template that contains the DNA target region to amplify
  • a DNA polymerase, an enzyme that polymerizes new DNA strands; heat-resistant Taq polymerase is especially common, as it is more likely to remain intact during the high-temperature DNA denaturation process
  • two DNA primers that are complementary to the 3' (three prime) ends of each of the sense and anti-sense strands of the DNA target (DNA polymerase can only bind to and elongate from a double-stranded region of DNA; without primers, there is no double-stranded initiation site at which the polymerase can bind); specific primers that are complementary to the DNA target region are selected beforehand, and are often custom-made in a laboratory or purchased from commercial biochemical suppliers
  • deoxynucleoside triphosphates, or dNTPs (sometimes called "deoxynucleotide triphosphates"; nucleotides containing triphosphate groups), the building blocks from which the DNA polymerase synthesizes a new DNA strand
  • a buffer solution providing a suitable chemical environment for optimum activity and stability of the DNA polymerase
  • bivalent cations, typically magnesium (Mg) or manganese (Mn) ions; Mg2+ is the most common, but Mn2+ can be used for PCR-mediated DNA mutagenesis, as a higher Mn2+ concentration increases the error rate during DNA synthesis; and monovalent cations, typically potassium (K) ions.

The reaction is commonly carried out in a volume of 10–200 μL in small reaction tubes (0.2–0.5 mL volumes) in a thermal cycler. The thermal cycler heats and cools the reaction tubes to achieve the temperatures required at each step of the reaction (see below). Many modern thermal cyclers make use of a Peltier device, which permits both heating and cooling of the block holding the PCR tubes simply by reversing the device's electric current. Thin-walled reaction tubes permit favorable thermal conductivity to allow for rapid thermal equilibrium. Most thermal cyclers have heated lids to prevent condensation at the top of the reaction tube. Older thermal cyclers lacking a heated lid require a layer of oil on top of the reaction mixture or a ball of wax inside the tube.
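As a rough illustration of assembling a reaction within the 10–200 μL volumes just described, the sketch below computes per-tube reagent volumes from stock and final concentrations (C1·V1 = C2·V2) and scales them into a master mix for several tubes. Every component name, concentration, and the 10% pipetting overage here is a hypothetical example value, not a figure from this article; actual recipes come from the polymerase supplier's protocol.

```python
# A minimal sketch of planning a PCR master mix for the reaction volumes
# described above.  All concentrations are hypothetical illustrations.

REACTION_VOLUME_UL = 25.0          # per-tube reaction volume
OVERAGE = 1.1                      # 10% extra to cover pipetting losses

# component: (stock concentration, final concentration, unit)
COMPONENTS = {
    "10x PCR buffer":  (10.0, 1.0,  "x"),
    "dNTP mix":        (10.0, 0.2,  "mM"),
    "forward primer":  (10.0, 0.4,  "uM"),
    "reverse primer":  (10.0, 0.4,  "uM"),
    "Taq polymerase":  (5.0,  0.05, "U/uL"),
    # In practice the template is often added to each tube separately;
    # it is included here only to keep the example self-contained.
    "template DNA":    (10.0, 0.4,  "ng/uL"),
}

def per_reaction_volumes(components, reaction_volume):
    """C1*V1 = C2*V2: volume of each stock needed per reaction tube."""
    volumes = {name: final / stock * reaction_volume
               for name, (stock, final, _unit) in components.items()}
    volumes["water"] = reaction_volume - sum(volumes.values())
    return volumes

def master_mix(volumes, n_reactions, overage=OVERAGE):
    """Scale per-reaction volumes up for n reactions plus pipetting overage."""
    return {name: v * n_reactions * overage for name, v in volumes.items()}

if __name__ == "__main__":
    per_tube = per_reaction_volumes(COMPONENTS, REACTION_VOLUME_UL)
    mix = master_mix(per_tube, n_reactions=8)
    for name, volume in mix.items():
        print(f"{name:>16}: {volume:6.2f} uL for 8 reactions")
```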

Procedure

Typically, PCR consists of a series of 20–40 repeated temperature changes, called thermal cycles, with each cycle commonly consisting of two or three discrete temperature steps (see figure below). The cycling is often preceded by a single temperature step at a very high temperature (>90 °C (194 °F)), and followed by one hold at the end for final product extension or brief storage. The temperatures used and the length of time they are applied in each cycle depend on a variety of parameters, including the enzyme used for DNA synthesis, the concentration of bivalent ions and dNTPs in the reaction, and the melting temperature (Tm) of the primers. The individual steps common to most PCR methods are as follows:

  • Initialization: This step is only required for DNA polymerases that require heat activation by hot-start PCR. It consists of heating the reaction chamber to a temperature of 94–96 °C (201–205 °F), or 98 °C (208 °F) if extremely thermostable polymerases are used, which is then held for 1–10 minutes.
  • Denaturation: This step is the first regular cycling event and consists of heating the reaction chamber to 94–98 °C (201–208 °F) for 20–30 seconds. This causes DNA melting, or denaturation, of the double-stranded DNA template by breaking the hydrogen bonds between complementary bases, yielding two single-stranded DNA molecules.
  • Annealing: In the next step, the reaction temperature is lowered to 50–65 °C (122–149 °F) for 20–40 seconds, allowing annealing of the primers to each of the single-stranded DNA templates. Two different primers are typically included in the reaction mixture: one for each of the two single-stranded complements containing the target region. The primers are single-stranded sequences themselves, but are much shorter than the length of the target region, complementing only very short sequences at the 3' end of each strand.
It is critical to determine a proper temperature for the annealing step because efficiency and specificity are strongly affected by the annealing temperature. This temperature must be low enough to allow for hybridization of the primer to the strand, but high enough for the hybridization to be specific, i.e., the primer should bind only to a perfectly complementary part of the strand, and nowhere else. If the temperature is too low, the primer may bind imperfectly. If it is too high, the primer may not bind at all. A typical annealing temperature is about 3–5 °C below the Tm of the primers used. Stable hydrogen bonds between complementary bases are formed only when the primer sequence very closely matches the template sequence. During this step, the polymerase binds to the primer-template hybrid and begins DNA formation.
  • Extension/elongation: The temperature at this step depends on the DNA polymerase used; the optimum activity temperature for the thermostable DNA polymerase of Taq polymerase is approximately 75–80 °C (167–176 °F), though a temperature of 72 °C (162 °F) is commonly used with this enzyme. In this step, the DNA polymerase synthesizes a new DNA strand complementary to the DNA template strand by adding free dNTPs from the reaction mixture that are complementary to the template in the 5'-to-3' direction, condensing the 5'-phosphate group of the dNTPs with the 3'-hydroxyl group at the end of the nascent (elongating) DNA strand. The precise time required for elongation depends both on the DNA polymerase used and on the length of the DNA target region to amplify. As a rule of thumb, at their optimal temperature, most DNA polymerases polymerize a thousand bases per minute. Under optimal conditions (i.e., if there are no limitations due to limiting substrates or reagents), at each extension/elongation step the number of DNA target sequences is doubled. With each successive cycle, the original template strands plus all newly generated strands become template strands for the next round of elongation, leading to exponential (geometric) amplification of the specific DNA target region.
The processes of denaturation, annealing and elongation constitute a single cycle. Multiple cycles are required to amplify the DNA target to millions of copies. The formula used to calculate the number of DNA copies formed after a given number of cycles is 2^n, where n is the number of cycles. Thus, a reaction set for 30 cycles results in 2^30, or 1,073,741,824, copies of the original double-stranded DNA target region (a simple sketch of these cycling parameters follows this list).
  • Final elongation: This single step is optional, but is performed at a temperature of 70–74 °C (158–165 °F) (the temperature range required for optimal activity of most polymerases used in PCR) for 5–15 minutes after the last PCR cycle to ensure that any remaining single-stranded DNA is fully elongated.
  • Final hold: The final step cools the reaction chamber to 4–15 °C (39–59 °F) for an indefinite time, and may be employed for short-term storage of the PCR products.
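As a minimal sketch of the parameters above, the code below estimates primer melting temperatures with the Wallace rule (Tm = 2·(A+T) + 4·(G+C), a crude approximation valid only for short primers), sets the annealing temperature a few degrees below the lower of the two Tm values, builds a simple three-step cycling program, and reports the theoretical 2^n copy count. The primer sequences and step durations are hypothetical examples, not values from this article.

```python
# Minimal sketch of choosing an annealing temperature and laying out a
# three-step thermal-cycling program, as described in the steps above.

def wallace_tm(primer: str) -> float:
    """Rough melting temperature (deg C) of a short primer, Wallace rule."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

def cycling_program(annealing_c: float, cycles: int = 30):
    """Return a simple cycling program as (step name, deg C, seconds) tuples."""
    one_cycle = [
        ("denaturation", 95, 30),
        ("annealing", annealing_c, 30),
        ("extension", 72, 60),
    ]
    return [("initial denaturation", 95, 120)] + one_cycle * cycles + [
        ("final extension", 72, 300),
        ("hold", 4, None),
    ]

if __name__ == "__main__":
    forward = "AGCTGACCTGAAGTTCATCTGC"   # hypothetical primer
    reverse = "TCGATGTTGTGGCGGATCTT"     # hypothetical primer

    tm = min(wallace_tm(forward), wallace_tm(reverse))
    annealing = tm - 5                   # "about 3-5 degC below the Tm"
    program = cycling_program(annealing, cycles=30)

    print(f"estimated Tm: {tm:.0f} degC, annealing at {annealing:.0f} degC")
    print(f"program length: {len(program)} steps")
    print(f"theoretical copies after 30 cycles: {2**30:,}")
```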
Schematic drawing of a complete PCR cycle
Ethidium bromide-stained PCR products after gel electrophoresis. Two sets of primers were used to amplify a target sequence from three different tissue samples. No amplification is present in sample #1; DNA bands in sample #2 and #3 indicate successful amplification of the target sequence. The gel also shows a positive control, and a DNA ladder containing DNA fragments of defined length for sizing the bands in the experimental PCRs.

To check whether the PCR successfully generated the anticipated DNA target region (also sometimes referred to as the amplimer or amplicon), agarose gel electrophoresis may be employed for size separation of the PCR products. The size of the PCR products is determined by comparison with a DNA ladder, a molecular weight marker which contains DNA fragments of known sizes, which runs on the gel alongside the PCR products.
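Because migration distance in an agarose gel is roughly linear in the logarithm of fragment length, the ladder comparison can be reduced to a simple fit. The sketch below fits log10(size) against migration distance for the ladder bands and estimates the size of an unknown band; the ladder sizes and measured distances are hypothetical example numbers, not data from this article.

```python
# Minimal sketch of sizing a PCR product against a DNA ladder run on the
# same gel, using the approximately log-linear relation between fragment
# length and migration distance.

import math

# (fragment length in bp, migration distance in mm) for the ladder bands
LADDER = [(1000, 18.0), (750, 22.5), (500, 28.0), (250, 37.5), (100, 48.0)]

def estimate_size(distance_mm: float, ladder=LADDER) -> float:
    """Estimate fragment length (bp) from its migration distance."""
    # Fit log10(size) as a straight line in distance by least squares.
    xs = [d for _, d in ladder]
    ys = [math.log10(bp) for bp, _ in ladder]
    n = len(ladder)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return 10 ** (slope * distance_mm + intercept)

if __name__ == "__main__":
    band_distance = 31.0   # mm, hypothetical band in an experimental lane
    print(f"estimated product size: {estimate_size(band_distance):.0f} bp")
```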


Stages

Exponential amplification

As with other chemical reactions, the reaction rate and efficiency of PCR are affected by limiting factors. Thus, the entire PCR process can further be divided into three stages based on reaction progress:

  • Exponential amplification: At every cycle, the amount of product is doubled (assuming 100% reaction efficiency). After 30 cycles, a single copy of DNA can be increased up to 1,000,000,000 (one billion) copies. In a sense, then, the replication of a discrete strand of DNA is being manipulated in a tube under controlled conditions.[16] The reaction is very sensitive: only minute quantities of DNA need to be present (a toy model of these three stages is sketched after this list).
  • Leveling off stage: The reaction slows as the DNA polymerase loses activity and as consumption of reagents, such as dNTPs and primers, causes them to become more limited.
  • Plateau: No more product accumulates due to exhaustion of reagents and enzyme.
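A toy way to see these three stages is to let the product double each cycle while a finite reagent pool lasts: amplification is exponential at first, levels off as the pool is consumed, and finally plateaus. The starting copy number, capacity, and efficiency below are hypothetical example values, not measurements from this article.

```python
# Toy model of the three reaction stages described above: exponential
# growth, leveling off, and plateau as a limiting reagent runs out.

def simulate_pcr(start_copies=1.0, max_copies=1e12, efficiency=1.0, cycles=40):
    """Return product copy number after each cycle of a resource-limited PCR."""
    copies = start_copies
    history = []
    for _ in range(cycles):
        # New strands made this cycle shrink as the reagent pool is consumed.
        headroom = max(0.0, 1.0 - copies / max_copies)
        copies += copies * efficiency * headroom
        history.append(copies)
    return history

if __name__ == "__main__":
    trace = simulate_pcr()
    for cycle in (10, 20, 30, 40):
        print(f"cycle {cycle:2d}: {trace[cycle - 1]:.3g} copies")
```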

Optimization

In practice, PCR can fail for various reasons, such as poor sensitivity or contamination. Contamination with extraneous DNA can lead to spurious products and is addressed with lab protocols and procedures that separate pre-PCR mixtures from potential DNA contaminants. For instance, if DNA from a crime scene is analyzed, a single DNA molecule from lab personnel could be amplified and misguide the investigation. Hence the PCR set-up area is separated from areas where other PCR products are analyzed or purified, disposable plasticware is used, and the work surface is thoroughly cleaned between reaction set-ups.

Specificity can be adjusted by experimental conditions so that no spurious products are generated. Primer-design techniques are important in improving PCR product yield and in avoiding the formation of unspecific products. The use of alternate buffer components or polymerase enzymes can help with amplification of long or otherwise problematic regions of DNA. For instance, Q5 polymerase is said to be ≈280 times less error-prone than Taq polymerase. Adjusting the running parameters (e.g. temperature and duration of cycles) or adding reagents, such as formamide, may increase the specificity and yield of PCR. Computer simulations of theoretical PCR results (electronic PCR) may be performed to assist in primer design.
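The following sketch shows the kind of checks an electronic-PCR or primer-design tool might run before any reaction is set up: GC content, a rough Wallace-rule Tm, the Tm difference between the two primers, and 3'-end complementarity that could seed primer-dimers. The thresholds and the primer sequences are hypothetical rules of thumb and examples, not values from this article.

```python
# Minimal sketch of in-silico primer checks of the kind mentioned above.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_content(primer: str) -> float:
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)

def rough_tm(primer: str) -> float:
    """Wallace-rule Tm estimate (deg C), adequate only for short primers."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

def three_prime_dimer(a: str, b: str, window: int = 4) -> bool:
    """True if the 3' ends of the two primers are mutually complementary."""
    tail_a = a.upper()[-window:]
    tail_b = b.upper()[-window:]
    return tail_a == tail_b.translate(COMPLEMENT)[::-1]

def check_pair(forward: str, reverse: str):
    warnings = []
    for name, p in (("forward", forward), ("reverse", reverse)):
        if not 0.40 <= gc_content(p) <= 0.60:
            warnings.append(f"{name} primer GC content outside 40-60%")
    if abs(rough_tm(forward) - rough_tm(reverse)) > 5:
        warnings.append("primer Tm values differ by more than 5 degC")
    if three_prime_dimer(forward, reverse):
        warnings.append("3' ends are complementary (primer-dimer risk)")
    return warnings or ["no obvious problems found"]

if __name__ == "__main__":
    # Hypothetical primer pair; real designs are checked against the target.
    for warning in check_pair("AGCTGACCTGAAGTTCATCTGC", "TCGATGTTGTGGCGGATCTT"):
        print(warning)
```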

Applications

Selective DNA isolation

PCR allows isolation of DNA fragments from genomic DNA by selective amplification of a specific region of DNA. This use of PCR augments many methods, such as generating hybridization probes for Southern or northern hybridization and DNA cloning, which require larger amounts of DNA representing a specific DNA region. PCR supplies these techniques with high amounts of pure DNA, enabling analysis of DNA samples even from very small amounts of starting material.

Other applications of PCR include DNA sequencing to determine unknown PCR-amplified sequences, in which one of the amplification primers may be used in Sanger sequencing, and isolation of a DNA sequence to expedite recombinant DNA technologies involving the insertion of a DNA sequence into a plasmid, phage, or cosmid (depending on size) or into the genetic material of another organism. Bacterial colonies (such as E. coli) can be rapidly screened by PCR for correct DNA vector constructs. PCR may also be used for genetic fingerprinting, a forensic technique used to identify a person or organism by comparing experimental DNAs through different PCR-based methods.

Electrophoresis of PCR-amplified DNA fragments:
  1. Father
  2. Child
  3. Mother

The child has inherited some, but not all, of the fingerprints of each of its parents, giving it a new, unique fingerprint.

Some PCR fingerprint methods have high discriminative power and can be used to identify genetic relationships between individuals, such as parent-child or between siblings, and are used in paternity testing (Fig. 4). This technique may also be used to determine evolutionary relationships among organisms when certain molecular clocks are used (i.e. the 16S rRNA and recA genes of microorganisms).

Amplification and quantification of DNA

Because PCR amplifies the regions of DNA that it targets, PCR can be used to analyze extremely small amounts of sample. This is often critical for forensic analysis, when only a trace amount of DNA is available as evidence. PCR may also be used in the analysis of ancient DNA that is tens of thousands of years old. These PCR-based techniques have been successfully used on animals, such as a forty-thousand-year-old mammoth, and also on human DNA, in applications ranging from the analysis of Egyptian mummies to the identification of a Russian tsar and the body of English king Richard III.

Quantitative PCR or Real Time PCR (qPCR, not to be confused with RT-PCR) methods allow the estimation of the amount of a given sequence present in a sample—a technique often applied to quantitatively determine levels of gene expression. Quantitative PCR is an established tool for DNA quantification that measures the accumulation of DNA product after each round of PCR amplification.

qPCR allows the quantification and detection of a specific DNA sequence in real time, since it measures concentration while the synthesis process is taking place. There are two methods for simultaneous detection and quantification. The first method uses fluorescent dyes that are retained nonspecifically between the double strands. The second method uses fluorescently labeled probes that hybridize to specific sequences; detection with these probes occurs only after the probe has hybridized with its complementary DNA. A useful combination of techniques is real-time PCR with reverse transcription. This combined technique, called RT-qPCR, allows the quantification of a small quantity of RNA: mRNA is first converted to cDNA, which is then quantified using qPCR. This technique lowers the possibility of error at the end point of PCR, increasing the chances of detecting genes associated with genetic diseases such as cancer. Laboratories use RT-qPCR to measure gene regulation sensitively. The mathematical foundations for the reliable quantification of PCR and RT-qPCR facilitate the implementation of accurate fitting procedures for experimental data in research, medical, diagnostic and infectious disease applications.
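One widely used relative-quantification scheme built on these ideas is the delta-delta-Ct (Livak) method: the threshold cycle (Ct) of the target gene is normalized to a reference gene and compared between a treated and a control sample, assuming roughly 100% amplification efficiency (a doubling per cycle). The sketch below implements that arithmetic with hypothetical Ct values; it illustrates the calculation only and does not describe a specific assay from this article.

```python
# Minimal sketch of relative quantification of RT-qPCR data by the
# delta-delta-Ct method (fold change = 2^-ddCt), assuming ~100% efficiency.

def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression (treated vs. control) by the 2^-ddCt method."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalize to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control                 # compare the two samples
    return 2 ** (-dd_ct)

if __name__ == "__main__":
    # Hypothetical threshold-cycle (Ct) values from a qPCR run.
    fc = fold_change(ct_target_treated=24.1, ct_ref_treated=18.0,
                     ct_target_control=26.4, ct_ref_control=18.2)
    print(f"target gene is expressed {fc:.1f}x relative to control")
```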

Medical and diagnostic applications

Prospective parents can be tested for being genetic carriers, or their children might be tested for actually being affected by a disease. DNA samples for prenatal testing can be obtained by amniocentesis, chorionic villus sampling, or even by the analysis of rare fetal cells circulating in the mother's bloodstream. PCR analysis is also essential to preimplantation genetic diagnosis, where individual cells of a developing embryo are tested for mutations.

  • PCR can also be used as part of a sensitive test for tissue typing, vital to organ transplantation. As of 2008, there is even a proposal to replace the traditional antibody-based tests for blood type with PCR-based tests.
  • Many forms of cancer involve alterations to oncogenes. By using PCR-based tests to study these mutations, therapy regimens can sometimes be individually customized to a patient. PCR permits early diagnosis of malignant diseases such as leukemia and lymphomas, which is currently the most highly developed use of PCR in cancer research and is already being used routinely. PCR assays can be performed directly on genomic DNA samples to detect translocation-specific malignant cells at a sensitivity that is at least 10,000-fold higher than that of other methods. PCR is very useful in the medical field since it allows for the isolation and amplification of tumor suppressors. Quantitative PCR, for example, can be used to quantify and analyze single cells, as well as to recognize DNA, mRNA and protein conformations and combinations.

Infectious disease applications

PCR allows for rapid and highly specific diagnosis of infectious diseases, including those caused by bacteria or viruses.[36] PCR also permits identification of non-cultivatable or slow-growing microorganisms such as mycobacteria, anaerobic bacteria, or viruses from tissue culture assays and animal models. The basis for PCR diagnostic applications in microbiology is the detection of infectious agents and the discrimination of non-pathogenic from pathogenic strains by virtue of specific genes.

Characterization and detection of infectious disease organisms have been revolutionized by PCR in the following ways:

  • The human immunodeficiency virus (or HIV), is a difficult target to find and eradicate. The earliest tests for infection relied on the presence of antibodies to the virus circulating in the bloodstream. However, antibodies don't appear until many weeks after infection, maternal antibodies mask the infection of a newborn, and therapeutic agents to fight the infection don't affect the antibodies. PCR tests have been developed that can detect as little as one viral genome among the DNA of over 50,000 host cells. Infections can be detected earlier, donated blood can be screened directly for the virus, newborns can be immediately tested for infection, and the effects of antiviral treatments can be quantified.
  • Some disease organisms, such as that causing tuberculosis, are difficult to sample from patients and slow to grow in the laboratory. PCR-based tests have allowed detection of small numbers of disease organisms (whether live or dead) in convenient samples. Detailed genetic analysis can also be used to detect antibiotic resistance, allowing immediate and effective therapy. The effects of therapy can also be immediately evaluated.
  • The spread of a disease organism through populations of domestic or wild animals can be monitored by PCR testing. In many cases, the appearance of new virulent sub-types can be detected and monitored. The sub-types of an organism that were responsible for earlier epidemics can also be determined by PCR analysis.
  • Viral DNA can be detected by PCR. The primers used must be specific to the targeted sequences in the DNA of a virus, and PCR can be used for diagnostic analyses or DNA sequencing of the viral genome. The high sensitivity of PCR permits virus detection soon after infection and even before the onset of disease. Such early detection may give physicians a significant lead time in treatment. The amount of virus ("viral load") in a patient can also be quantified by PCR-based DNA quantitation techniques (see below). A variant of PCR (RT-PCR) is used for detecting viral RNA rather than DNA: in this test the enzyme reverse transcriptase is used to generate a DNA sequence which matches the viral RNA; this DNA is then amplified as per the usual PCR method. RT-PCR is widely used to detect the SARS-CoV-2 viral genome.
  • Diseases such as pertussis (whooping cough) are caused by the bacterium Bordetella pertussis. This bacterium causes a serious acute respiratory infection that affects various animals and humans and has led to the deaths of many young children. The pertussis toxin is a protein exotoxin that binds to cell receptors by two dimers and reacts with different cell types, such as the T lymphocytes which play a role in cell immunity. PCR is an important testing tool that can detect sequences within the gene for the pertussis toxin. Because PCR has a high sensitivity for the toxin and a rapid turnaround time, it is very efficient for diagnosing pertussis when compared to culture.

Forensic applications

The development of PCR-based genetic (or DNA) fingerprinting protocols has seen widespread application in forensics:

  • DNA samples are often taken at crime scenes and analyzed by PCR.
    In its most discriminating form, genetic fingerprinting can uniquely discriminate any one person from the entire population of the world. Minute samples of DNA can be isolated from a crime scene, and compared to that from suspects, or from a DNA database of earlier evidence or convicts. Simpler versions of these tests are often used to rapidly rule out suspects during a criminal investigation. Evidence from decades-old crimes can be tested, confirming or exonerating the people originally convicted.
  • Forensic DNA typing has been an effective way of identifying or exonerating criminal suspects through analysis of evidence discovered at a crime scene. The human genome has many repetitive regions that can be found within gene sequences or in non-coding regions of the genome. Specifically, up to 40% of human DNA is repetitive. There are two distinct categories of these repetitive, non-coding regions in the genome. The first category, variable number tandem repeats (VNTR), are 10–100 base pairs long; the second category, short tandem repeats (STR), consist of repeated 2–10 base pair sections. PCR is used to amplify several well-known VNTRs and STRs using primers that flank each of the repetitive regions. The sizes of the fragments obtained from any individual for each of the STRs will indicate which alleles are present. By analyzing several STRs for an individual, a set of alleles for each person will be found that statistically is likely to be unique. Researchers have identified the complete sequence of the human genome. This sequence can be easily accessed through the NCBI website and is used in many real-life applications. For example, the FBI has compiled a set of DNA marker sites used for identification, known as the Combined DNA Index System (CODIS) DNA database. Using this database enables statistical analysis to be used to determine the probability that a DNA sample will match (the product-rule calculation behind such match probabilities is sketched after this list). PCR is a very powerful and significant analytical tool for forensic DNA typing because researchers only need a very small amount of target DNA for analysis. For example, a single human hair with an attached hair follicle has enough DNA to conduct the analysis. Similarly, a few sperm, skin samples from under the fingernails, or a small amount of blood can provide enough DNA for conclusive analysis.
  • Less discriminating forms of DNA fingerprinting can help in DNA paternity testing, where an individual is matched with their close relatives. DNA from unidentified human remains can be tested, and compared with that from possible parents, siblings, or children. Similar testing can be used to confirm the biological parents of an adopted (or kidnapped) child. The actual biological father of a newborn can also be confirmed (or ruled out).
  • The PCR AMGX/AMGY design has been shown not only to facilitate amplification of DNA sequences from a very minuscule amount of genome, but also to allow real-time sex determination from forensic bone samples. This provides a powerful and effective way to determine sex in forensic cases and ancient specimens.
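The statistical weight of an STR match mentioned above usually comes from the product rule: assuming Hardy-Weinberg equilibrium and independent loci, the expected population frequency of a full profile is the product of the genotype frequencies at each locus (2pq for a heterozygote, p^2 for a homozygote). The locus names and allele frequencies in the sketch below are hypothetical examples, not CODIS figures.

```python
# Minimal sketch of the "product rule" for a multi-locus STR profile.

def genotype_frequency(p: float, q: float) -> float:
    """Expected population frequency of one genotype at one locus."""
    return p * p if p == q else 2 * p * q

def profile_frequency(loci):
    """Combined frequency of a multi-locus STR profile (product rule)."""
    freq = 1.0
    for p, q in loci.values():
        freq *= genotype_frequency(p, q)
    return freq

if __name__ == "__main__":
    # (frequency of allele 1, frequency of allele 2) observed at each locus.
    profile = {
        "locus_A": (0.10, 0.20),
        "locus_B": (0.05, 0.05),   # homozygote
        "locus_C": (0.15, 0.30),
        "locus_D": (0.08, 0.12),
    }
    f = profile_frequency(profile)
    print(f"expected profile frequency: {f:.2e} (about 1 in {1/f:,.0f})")
```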

Research applications

PCR has been applied to many areas of research in molecular genetics:

  • PCR allows rapid production of short pieces of DNA, even when not more than the sequence of the two primers is known. This ability of PCR augments many methods, such as generating hybridization probes for Southern or northern blot hybridization. PCR supplies these techniques with large amounts of pure DNA, sometimes as a single strand, enabling analysis even from very small amounts of starting material.
  • The task of DNA sequencing can also be assisted by PCR. Known segments of DNA can easily be produced from a patient with a genetic disease mutation. Modifications to the amplification technique can extract segments from a completely unknown genome, or can generate just a single strand of an area of interest.
  • PCR has numerous applications to the more traditional process of DNA cloning. It can extract segments for insertion into a vector from a larger genome, which may be only available in small quantities. Using a single set of 'vector primers', it can also analyze or extract fragments that have already been inserted into vectors. Some alterations to the PCR protocol can generate mutations (general or site-directed) of an inserted fragment.
  • Sequence-tagged site mapping is a process in which PCR is used as an indicator that a particular segment of a genome is present in a particular clone. The Human Genome Project found this application vital to mapping the cosmid clones it was sequencing, and to coordinating the results from different laboratories.
  • An application of PCR is the phylogenetic analysis of DNA from ancient sources, such as that found in the recovered bones of Neanderthals, in frozen tissues of mammoths, or in the brains of Egyptian mummies. In some cases the highly degraded DNA from these sources might be reassembled during the early stages of amplification.
  • A common application of PCR is the study of patterns of gene expression. Tissues (or even individual cells) can be analyzed at different stages to see which genes have become active, or which have been switched off. This application can also use quantitative PCR to quantitate the actual levels of expression.
  • The ability of PCR to simultaneously amplify several loci from individual sperm has greatly enhanced the more traditional task of genetic mapping by studying chromosomal crossovers after meiosis. Rare crossover events between very close loci have been directly observed by analyzing thousands of individual sperm cells. Similarly, unusual deletions, insertions, translocations, or inversions can be analyzed, all without having to wait (or pay) for the long and laborious processes of fertilization, embryogenesis, etc.
  • Site-directed mutagenesis: PCR can be used to create mutant genes with mutations chosen by scientists at will. These mutations can be chosen in order to understand how proteins accomplish their functions, and to change or improve protein function.

Advantages

PCR has a number of advantages. It is fairly simple to understand and to use, and produces results rapidly. The technique is highly sensitive, with the potential to produce millions to billions of copies of a specific product for sequencing, cloning, and analysis. qRT-PCR shares the same advantages as PCR, with the added advantage of quantification of the synthesized product. Therefore, it can be used to analyze alterations in gene expression levels in tumors, microbes, or other disease states.

PCR is a very powerful and practical research tool. The sequences behind the unknown etiologies of many diseases are being worked out with PCR. The technique can help identify the sequence of previously unknown viruses related to those already known, and thus give us a better understanding of the disease itself. If the procedure can be further simplified and sensitive non-radiometric detection systems can be developed, PCR will assume a prominent place in the clinical laboratory for years to come.

Limitations

One major limitation of PCR is that prior information about the target sequence is necessary in order to generate the primers that will allow its selective amplification. This means that, typically, PCR users must know the precise sequence(s) upstream of the target region on each of the two single-stranded templates in order to ensure that the DNA polymerase properly binds to the primer-template hybrids and subsequently generates the entire target region during DNA synthesis.

Like all enzymes, DNA polymerases are also prone to error, which in turn causes mutations in the PCR fragments that are generated.

Another limitation of PCR is that even the smallest amount of contaminating DNA can be amplified, resulting in misleading or ambiguous results. To minimize the chance of contamination, investigators should reserve separate rooms for reagent preparation, the PCR, and analysis of product. Reagents should be dispensed into single-use aliquots. Pipettors with disposable plungers and extra-long pipette tips should be routinely used. It is moreover recommended to ensure that the lab set-up follows a unidirectional workflow. No materials or reagents used in the PCR and analysis rooms should ever be taken into the PCR preparation room without thorough decontamination.

Environmental samples that contain humic acids may inhibit PCR amplification and lead to inaccurate results.

Variations

  • Allele-specific PCR or the amplification refractory mutation system (ARMS): a diagnostic or cloning technique based on single-nucleotide variations (SNVs, not to be confused with SNPs; single-base differences in a patient). Any mutation involving a single base change can be detected by this system. It requires prior knowledge of the DNA sequence, including differences between alleles, and uses primers whose 3' ends encompass the SNV (a base-pair buffer around the SNV is usually incorporated). PCR amplification under stringent conditions is much less efficient in the presence of a mismatch between template and primer, so successful amplification with an SNP-specific primer signals the presence of the specific SNP or small deletions in a sequence. See SNP genotyping for more information.
  • Assembly PCR or Polymerase Cycling Assembly (PCA): artificial synthesis of long DNA sequences by performing PCR on a pool of long oligonucleotides with short overlapping segments. The oligonucleotides alternate between sense and antisense directions, and the overlapping segments determine the order of the PCR fragments, thereby selectively producing the final long DNA product.
  • Asymmetric PCR: preferentially amplifies one DNA strand in a double-stranded DNA template. It is used in sequencing and hybridization probing where amplification of only one of the two complementary strands is required. PCR is carried out as usual, but with a great excess of the primer for the strand targeted for amplification. Because of the slow (arithmetic) amplification later in the reaction after the limiting primer has been used up, extra cycles of PCR are required.[49] A recent modification on this process, known as Linear-After-The-Exponential-PCR (LATE-PCR), uses a limiting primer with a higher melting temperature (Tm) than the excess primer to maintain reaction efficiency as the limiting primer concentration decreases mid-reaction.
  • Convective PCR: a pseudo-isothermal way of performing PCR. Instead of repeatedly heating and cooling the PCR mixture, the solution is subjected to a thermal gradient. The resulting thermal instability driven convective flow automatically shuffles the PCR reagents from the hot and cold regions repeatedly enabling PCR. Parameters such as thermal boundary conditions and geometry of the PCR enclosure can be optimized to yield robust and rapid PCR by harnessing the emergence of chaotic flow fields. Such convective flow PCR setup significantly reduces device power requirement and operation time.
  • Dial-out PCR: a highly parallel method for retrieving accurate DNA molecules for gene synthesis. A complex library of DNA molecules is modified with unique flanking tags before massively parallel sequencing. Tag-directed primers then enable the retrieval of molecules with desired sequences by PCR.
  • Digital PCR (dPCR): used to measure the quantity of a target DNA sequence in a DNA sample. The DNA sample is highly diluted so that, after running many PCRs in parallel, some of them do not receive a single molecule of the target DNA. The target DNA concentration is calculated using the proportion of negative outcomes, hence the name 'digital PCR' (the underlying Poisson calculation is sketched after this list).
  • Helicase-dependent amplification: similar to traditional PCR, but uses a constant temperature rather than cycling through denaturation and annealing/extension cycles. DNA helicase, an enzyme that unwinds DNA, is used in place of thermal denaturation.
  • Hot start PCR: a technique that reduces non-specific amplification during the initial set up stages of the PCR. It may be performed manually by heating the reaction components to the denaturation temperature (e.g., 95 °C) before adding the polymerase. Specialized enzyme systems have been developed that inhibit the polymerase's activity at ambient temperature, either by the binding of an antibody or by the presence of covalently bound inhibitors that dissociate only after a high-temperature activation step. Hot-start/cold-finish PCR is achieved with new hybrid polymerases that are inactive at ambient temperature and are instantly activated at elongation temperature.
  • In silico PCR (digital PCR, virtual PCR, electronic PCR, e-PCR) refers to computational tools used to calculate theoretical polymerase chain reaction results using a given set of primers (probes) to amplify DNA sequences from a sequenced genome or transcriptome. In silico PCR was proposed as an educational tool for molecular biology.
  • Intersequence-specific PCR (ISSR): a PCR method for DNA fingerprinting that amplifies regions between simple sequence repeats to produce a unique fingerprint of amplified fragment lengths.
  • Inverse PCR: commonly used to identify the flanking sequences around genomic inserts. It involves a series of DNA digestions and self-ligation, resulting in known sequences at either end of the unknown sequence.
  • Ligation-mediated PCR: uses small DNA linkers ligated to the DNA of interest and multiple primers annealing to the DNA linkers; it has been used for DNA sequencing, genome walking, and DNA footprinting.
  • Methylation-specific PCR (MSP): developed by Stephen Baylin and James G. Herman at the Johns Hopkins School of Medicine, and is used to detect methylation of CpG islands in genomic DNA. DNA is first treated with sodium bisulfite, which converts unmethylated cytosine bases to uracil, which is recognized by PCR primers as thymine. Two PCRs are then carried out on the modified DNA, using primer sets identical except at any CpG islands within the primer sequences. At these points, one primer set recognizes DNA with cytosines to amplify methylated DNA, and one set recognizes DNA with uracil or thymine to amplify unmethylated DNA. MSP using qPCR can also be performed to obtain quantitative rather than qualitative information about methylation.
  • Miniprimer PCR: uses a thermostable polymerase (S-Tbr) that can extend from short primers ("smalligos") as short as 9 or 10 nucleotides. This method permits PCR targeting to smaller primer binding regions, and is used to amplify conserved DNA sequences, such as the 16S (or eukaryotic 18S) rRNA gene.
  • Multiplex ligation-dependent probe amplification (MLPA): permits amplifying multiple targets with a single primer pair, thus avoiding the resolution limitations of multiplex PCR (see below).
  • Multiplex-PCR: consists of multiple primer sets within a single PCR mixture to produce amplicons of varying sizes that are specific to different DNA sequences. By targeting multiple genes at once, additional information may be gained from a single test-run that otherwise would require several times the reagents and more time to perform. Annealing temperatures for each of the primer sets must be optimized to work correctly within a single reaction, and amplicon sizes should differ enough in base-pair length to form distinct bands when visualized by gel electrophoresis.
  • Nanoparticle-assisted PCR (nanoPCR): some nanoparticles (NPs) can enhance the efficiency of PCR (hence the name nanoPCR), and some can even outperform the original PCR enhancers. It has been reported that quantum dots (QDs) can improve PCR specificity and efficiency. Single-walled carbon nanotubes (SWCNTs) and multi-walled carbon nanotubes (MWCNTs) are efficient in enhancing the amplification of long PCR. Carbon nanopowder (CNP) can improve the efficiency of repeated PCR and long PCR, while zinc oxide, titanium dioxide and Ag NPs were found to increase the PCR yield. Previous data indicated that non-metallic NPs retained acceptable amplification fidelity. Given that many NPs are capable of enhancing PCR efficiency, there is likely great potential for nanoPCR technology improvements and product development.
  • Nested PCR: increases the specificity of DNA amplification, by reducing background due to non-specific amplification of DNA. Two sets of primers are used in two successive PCRs. In the first reaction, one pair of primers is used to generate DNA products, which besides the intended target, may still consist of non-specifically amplified DNA fragments. The product(s) are then used in a second PCR with a set of primers whose binding sites are completely or partially different from and located 3' of each of the primers used in the first reaction. Nested PCR is often more successful in specifically amplifying long DNA fragments than conventional PCR, but it requires more detailed knowledge of the target sequences.
  • Overlap-extension PCR or Splicing by overlap extension (SOEing) : a genetic engineering technique that is used to splice together two or more DNA fragments that contain complementary sequences. It is used to join DNA pieces containing genes, regulatory sequences, or mutations; the technique enables creation of specific and long DNA constructs. It can also introduce deletions, insertions or point mutations into a DNA sequence.
  • PAN-AC: uses isothermal conditions for amplification, and may be used in living cells.
  • PAN-PCR: A computational method for designing bacterium typing assays based on whole genome sequence data.
  • Quantitative PCR (qPCR): used to measure the quantity of a target sequence (commonly in real time). It quantitatively measures starting amounts of DNA, cDNA, or RNA. Quantitative PCR is commonly used to determine whether a DNA sequence is present in a sample and the number of its copies in the sample. Quantitative PCR has a very high degree of precision. Quantitative PCR methods use fluorescent dyes, such as SYBR Green or EvaGreen, or fluorophore-containing DNA probes, such as TaqMan, to measure the amount of amplified product in real time. It is also sometimes abbreviated to RT-PCR (real-time PCR), but this abbreviation should be used only for reverse transcription PCR. qPCR is the appropriate contraction for quantitative PCR (real-time PCR).
  • Reverse Complement PCR (RC-PCR): Allows the addition of functional domains or sequences of choice to be appended independently to either end of the generated amplicon in a single closed tube reaction. This method generates target specific primers within the reaction by the interaction of universal primers (which contain the desired sequences or domains to be appended) and RC probes.
  • Reverse Transcription PCR (RT-PCR): for amplifying DNA from RNA. Reverse transcriptase reverse transcribes RNA into cDNA, which is then amplified by PCR. RT-PCR is widely used in expression profiling, to determine the expression of a gene or to identify the sequence of an RNA transcript, including transcription start and termination sites. If the genomic DNA sequence of a gene is known, RT-PCR can be used to map the location of exons and introns in the gene. The 5' end of a gene (corresponding to the transcription start site) is typically identified by RACE-PCR (Rapid Amplification of cDNA Ends).
  • RNase H-dependent PCR (rhPCR): a modification of PCR that utilizes primers with a 3' extension block that can be removed by a thermostable RNase HII enzyme. This system reduces primer-dimers and allows for multiplexed reactions to be performed with higher numbers of primers.
  • Single specific primer-PCR (SSP-PCR): allows the amplification of double-stranded DNA even when the sequence information is available at one end only. This method permits amplification of genes for which only a partial sequence information is available, and allows unidirectional genome walking from known into unknown regions of the chromosome.
  • Solid Phase PCR: encompasses multiple meanings, including Polony Amplification (where PCR colonies are derived in a gel matrix, for example), Bridge PCR (primers are covalently linked to a solid-support surface), conventional Solid Phase PCR (where Asymmetric PCR is applied in the presence of solid support bearing primer with sequence matching one of the aqueous primers) and Enhanced Solid Phase PCR (where conventional Solid Phase PCR can be improved by employing high Tm and nested solid support primer with optional application of a thermal 'step' to favour solid support priming).
  • Suicide PCR: typically used in paleogenetics or other studies where avoiding false positives and ensuring the specificity of the amplified fragment is the highest priority. It was originally described in a study to verify the presence of the microbe Yersinia pestis in dental samples obtained from 14th Century graves of people supposedly killed by the plague during the medieval Black Death epidemic. The method prescribes the use of any primer combination only once in a PCR (hence the term "suicide"), which should never have been used in any positive control PCR reaction, and the primers should always target a genomic region never amplified before in the lab using this or any other set of primers. This ensures that no contaminating DNA from previous PCR reactions is present in the lab, which could otherwise generate false positives.
  • Thermal asymmetric interlaced PCR (TAIL-PCR): for isolation of an unknown sequence flanking a known sequence. Within the known sequence, TAIL-PCR uses a nested pair of primers with differing annealing temperatures; a degenerate primer is used to amplify in the other direction from the unknown sequence.
  • Touchdown PCR (Step-down PCR): a variant of PCR that aims to reduce nonspecific background by gradually lowering the annealing temperature as PCR cycling progresses. The annealing temperature at the initial cycles is usually a few degrees (3–5 °C) above the Tm of the primers used, while at the later cycles, it is a few degrees (3–5 °C) below the primer Tm. The higher temperatures give greater specificity for primer binding, and the lower temperatures permit more efficient amplification from the specific products formed during the initial cycles.
  • Universal Fast Walking: for genome walking and genetic fingerprinting using a more specific 'two-sided' PCR than conventional 'one-sided' approaches (using only one gene-specific primer and one general primer—which can lead to artefactual 'noise') by virtue of a mechanism involving lariat structure formation. Streamlined derivatives of UFW are LaNe RAGE (lariat-dependent nested PCR for rapid amplification of genomic DNA ends), 5'RACE LaNe and 3'RACE LaNe.
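For the digital PCR entry above, the proportion of negative partitions is converted into a concentration through Poisson statistics: if target molecules distribute randomly across partitions, the mean number of copies per partition is lambda = -ln(fraction negative). The partition counts and droplet volume in the sketch are hypothetical example values, not figures from this article.

```python
# Minimal sketch of the Poisson calculation behind digital PCR, as described
# in the variations list above.

import math

def copies_per_microliter(n_total: int, n_negative: int,
                          partition_volume_ul: float) -> float:
    """Estimate target concentration from a digital PCR run."""
    negative_fraction = n_negative / n_total
    lam = -math.log(negative_fraction)        # mean copies per partition
    return lam / partition_volume_ul

if __name__ == "__main__":
    # Hypothetical run: 20,000 droplets of ~0.85 nL each, 14,200 negative.
    conc = copies_per_microliter(n_total=20_000, n_negative=14_200,
                                 partition_volume_ul=0.00085)
    print(f"estimated concentration: {conc:,.0f} copies/uL of reaction mix")
```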

History

Diagrammatic representation of an example primer pair. The use of primers in an in vitro assay to allow DNA synthesis was a major innovation that allowed the development of PCR.

The heat-resistant enzymes that are a key component in polymerase chain reaction were discovered in the 1960s as a product of a microbial life form that lived in the superheated waters of Yellowstone's Mushroom Spring.

A 1971 paper in the Journal of Molecular Biology by Kjell Kleppe and co-workers in the laboratory of H. Gobind Khorana first described a method of using an enzymatic assay to replicate a short DNA template with primers in vitro. However, this early manifestation of the basic PCR principle did not receive much attention at the time and the invention of the polymerase chain reaction in 1983 is generally credited to Kary Mullis.

"Baby Blue", a 1986 prototype machine for doing PCR

When Mullis developed the PCR in 1983, he was working in Emeryville, California for Cetus Corporation, one of the first biotechnology companies, where he was responsible for synthesizing short chains of DNA. Mullis has written that he conceived the idea for PCR while cruising along the Pacific Coast Highway one night in his car. He was playing in his mind with a new way of analyzing changes (mutations) in DNA when he realized that he had instead invented a method of amplifying any DNA region through repeated cycles of duplication driven by DNA polymerase. In Scientific American, Mullis summarized the procedure: "Beginning with a single molecule of the genetic material DNA, the PCR can generate 100 billion similar molecules in an afternoon. The reaction is easy to execute. It requires no more than a test tube, a few simple reagents, and a source of heat." DNA fingerprinting was first used for paternity testing in 1988.

Mullis has credited his use of LSD as integral to his development of PCR: "Would I have invented PCR if I hadn't taken LSD? I seriously doubt it. I could sit on a DNA molecule and watch the polymers go by. I learnt that partly on psychedelic drugs."

Mullis and biochemist Michael Smith, who had developed other essential ways of manipulating DNA, were jointly awarded the Nobel Prize in Chemistry in 1993, seven years after Mullis and his colleagues at Cetus first put his proposal to practice. Mullis's 1985 paper with R. K. Saiki and H. A. Erlich, "Enzymatic Amplification of β-globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia"—the polymerase chain reaction invention (PCR)—was honored by a Citation for Chemical Breakthrough Award from the Division of History of Chemistry of the American Chemical Society in 2017.

At the core of the PCR method is the use of a suitable DNA polymerase able to withstand the high temperatures of >90 °C (194 °F) required for separation of the two DNA strands in the DNA double helix after each replication cycle. The DNA polymerases initially employed for in vitro experiments presaging PCR were unable to withstand these high temperatures. So the early procedures for DNA replication were very inefficient and time-consuming, and required large amounts of DNA polymerase and continuous handling throughout the process.

The discovery in 1976 of Taq polymerase—a DNA polymerase purified from the thermophilic bacterium Thermus aquaticus, which naturally lives in hot (50 to 80 °C (122 to 176 °F)) environments such as hot springs—paved the way for dramatic improvements of the PCR method. The DNA polymerase isolated from T. aquaticus is stable at high temperatures, remaining active even after DNA denaturation, thus obviating the need to add new DNA polymerase after each cycle. This allowed an automated thermocycler-based process for DNA amplification.

Patent disputes

The PCR technique was patented by Kary Mullis and assigned to Cetus Corporation, where Mullis worked when he invented the technique in 1983. The Taq polymerase enzyme was also covered by patents. There have been several high-profile lawsuits related to the technique, including an unsuccessful lawsuit brought by DuPont. The Swiss pharmaceutical company Hoffmann-La Roche purchased the rights to the patents in 1992. The last of the commercial PCR patents expired in 2017.

A related patent battle over the Taq polymerase enzyme is still ongoing in several jurisdictions around the world between Roche and Promega. The legal arguments have extended beyond the lives of the original PCR and Taq polymerase patents, which expired on 28 March 2005.
