The effective population size is "the number of individuals in a population who contribute offspring to the next generation," or all the breeding adults in that population. Genetically derived estimates of effective population size tend to be lower than an actual head count.[1] In more technical terms, the effective population size is the number of individuals that an idealised population would need to have in order for some specified quantity of interest to be the same in the idealised population as in the real population. Idealised populations are based on unrealistic but convenient simplifications such as random mating, simultaneous birth of each new generation, constant population size, and equal numbers of children per parent. In some simple scenarios, the effective population size is the number of breeding individuals in the population. However, for most quantities of interest and most real populations, the census population size $N$ is larger than the effective population size $N_e$. The same population may have multiple effective population sizes, for different properties of interest, including for different genetic loci.
The effective population size is most commonly measured with respect to the coalescence time. In an idealised diploid population with no selection at any locus, the expectation of the coalescence time in generations is equal to twice the census population size. The effective population size is measured as within-species genetic diversity divided by four times the mutation rate, because in such an idealised population, the heterozygosity is equal to $4N\mu$, where $\mu$ is the mutation rate. In a population with selection at many loci and abundant linkage disequilibrium, the coalescent effective population size may not reflect the census population size at all, or may reflect its logarithm.
The concept of effective population size was introduced in the field of population genetics in 1931 by the American geneticist Sewall Wright.[2][3]
Overview: Types of effective population size
Depending on the quantity of interest, effective population size can be defined in several ways. Ronald Fisher and Sewall Wright originally defined it as "the number of breeding individuals in an idealised population that would show the same amount of dispersion of allele frequencies under random genetic drift or the same amount of inbreeding as the population under consideration". More generally, an effective population size may be defined as the number of individuals in an idealised population that has a value of any given population genetic quantity that is equal to the value of that quantity in the population of interest. The two population genetic quantities identified by Wright were the one-generation increase in variance across replicate populations (variance effective population size) and the one-generation change in the inbreeding coefficient (inbreeding effective population size). These two are closely linked, and derived from F-statistics, but they are not identical.[4]
Today, the effective population size is usually estimated empirically with respect to the sojourn or coalescence time, estimated as the within-species genetic diversity divided by four times the mutation rate, yielding a coalescent effective population size.[5] Another important effective population size is the selection effective population size $1/s_{\text{critical}}$, where $s_{\text{critical}}$ is the critical value of the selection coefficient at which selection becomes more important than genetic drift.[6]
Empirical measurements
In Drosophila populations of census size 16, the variance effective population size has been measured as equal to 11.5.[7] This measurement was achieved through studying changes in the frequency of a neutral allele from one generation to another in over 100 replicate populations.
For coalescent effective population sizes, a survey of publications on 102 mostly wildlife animal and plant species yielded 192 $N_e/N$ ratios. Seven different estimation methods were used in the surveyed studies, and the ratios accordingly ranged widely, from $10^{-6}$ for Pacific oysters to 0.994 for humans, with an average of 0.34 across the examined species.[8]
A genealogical analysis of human hunter-gatherers (Eskimos) determined the effective-to-census population size ratio separately for haploid (mitochondrial DNA, Y-chromosomal DNA) and diploid (autosomal DNA) loci: it was estimated as 0.6–0.7 for autosomal and X-chromosomal DNA, 0.7–0.9 for mitochondrial DNA and 0.5 for Y-chromosomal DNA.[9]
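The replicate-population measurement described at the start of this section can be mimicked in a few lines. The sketch below is an illustration, not the procedure used in the cited study: it assumes the standard Wright-Fisher result that the one-generation variance in allele frequency is $p(1-p)/(2N_e)$, simulates replicates of an idealised population, and solves for $N_e$. The function names, seed, replicate count and starting frequency are all hypothetical choices.

```python
import random


def variance_effective_size(p, observed_variance):
    """Solve var(p' | p) = p(1 - p) / (2 Ne) for Ne."""
    return p * (1.0 - p) / (2.0 * observed_variance)


def simulate_next_frequencies(n_diploid, p, replicates, rng):
    """One Wright-Fisher generation per replicate: sample 2N gene copies."""
    copies = 2 * n_diploid
    return [
        sum(rng.random() < p for _ in range(copies)) / copies
        for _ in range(replicates)
    ]


rng = random.Random(1)   # fixed seed so the sketch is repeatable
p0 = 0.5                 # hypothetical starting allele frequency
freqs = simulate_next_frequencies(16, p0, 2000, rng)

# The mean squared deviation from p0 estimates the conditional variance.
observed = sum((f - p0) ** 2 for f in freqs) / len(freqs)
print(round(variance_effective_size(p0, observed), 1))
```

Because the simulated population is itself idealised, the estimate should land close to the census size of 16. A real population of the same census size would typically show a larger variance and therefore a smaller estimate, as in the Drosophila figure of 11.5.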
Variance effective size
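In the Wright-Fisher idealised population model, the conditional variance of the allele frequency $p'$, given the allele frequency $p$ in the previous generation, is
$$\operatorname{var}(p' \mid p) = \frac{p(1-p)}{2N}.$$
The variance effective population size is the value of $N$ for which this expression matches the variance observed in the population under consideration.
Theoretical examples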
In the following examples, one or more of the assumptions of a strictly idealised population are relaxed, while other assumptions are retained. The variance effective population size of the more relaxed population model is then calculated with respect to the strict model.
Variations in population size
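Population size varies over time. Suppose there are $t$ non-overlapping generations; the effective population size is then given by the harmonic mean of the population sizes:[10]
$$\frac{1}{N_e} = \frac{1}{t}\sum_{i=1}^{t}\frac{1}{N_i}$$
where $N_i$ is the population size in generation $i$.
Dioeciousness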
If a population is dioecious, i.e. there is no self-fertilisation, then
$$N_e = N + \tfrac{1}{2}.$$
When $N$ is large, $N_e$ approximately equals $N$, so this correction is usually trivial and often ignored:
$$N_e = N + \tfrac{1}{2} \approx N.$$
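Both adjustments in this part of the section, the harmonic mean for fluctuating sizes and the small correction for dioecy, are easy to check numerically. The following is a minimal sketch: the function names and the six census sizes are hypothetical, and the dioecy correction is assumed to take the form $N + 1/2$.

```python
def harmonic_mean_ne(sizes):
    """Effective size over non-overlapping generations (harmonic mean)."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty sequence of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


def dioecious_ne(n):
    """Dioecy correction, assumed here to be N + 1/2."""
    return n + 0.5


# Hypothetical census sizes for six generations, including two bottlenecks.
sizes = [10, 100, 50, 80, 20, 500]
ne = harmonic_mean_ne(sizes)
arithmetic_mean = sum(sizes) / len(sizes)

print(f"harmonic-mean Ne = {ne:.1f}, arithmetic mean N = {arithmetic_mean:.1f}")
print(f"dioecy correction for N = 1000: {dioecious_ne(1000)}")
```

With these numbers the harmonic mean is about 30.8 while the arithmetic mean is about 126.7, which shows how strongly the smallest generations dominate. The dioecy correction, by contrast, changes a population of 1000 by only half an individual.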
Variance in reproductive success
If population size is to remain constant, each individual must contribute on average two gametes to the next generation. An idealised population assumes that this follows a Poisson distribution, so that the variance of the number of gametes contributed, $k$, is equal to the mean number contributed, i.e. 2:
$$\operatorname{var}(k) = \bar{k} = 2.$$
Non-Fisherian sex-ratios
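When the sex ratio of a population varies from the Fisherian 1:1 ratio, effective population size is given by:
$$N_e = \frac{4 N_m N_f}{N_m + N_f}$$
where $N_m$ is the number of males and $N_f$ is the number of females.
Inbreeding effective size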
Alternatively, the effective population size may be defined by noting how the average inbreeding coefficient changes from one generation to the next, and then defining $N_e$ as the size of the idealised population that has the same change in average inbreeding coefficient as the population under consideration. The presentation follows Kempthorne (1957).[11]
For the idealised population, the inbreeding coefficients follow the recurrence equation
$$F_t = \frac{1}{N}\left(\frac{1 + F_{t-2}}{2}\right) + \left(1 - \frac{1}{N}\right)F_{t-1}.$$
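The definition above can be turned into a small calculation. As a simplified sketch, assume an idealised population with random union of gametes (selfing allowed at random), for which the textbook recurrence is $F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F_{t-1}$. The per-generation rate of inbreeding is then $\Delta F = (F_t - F_{t-1})/(1 - F_{t-1}) = 1/(2N)$, and the inbreeding effective size of a real population is the $N$ that reproduces its observed $\Delta F$. The function names and the example coefficients below are hypothetical.

```python
def inbreeding_effective_size(f_prev, f_next):
    """Ne such that 1 - f_next = (1 - 1/(2 Ne)) * (1 - f_prev)."""
    if not (0.0 <= f_prev < 1.0) or not (f_prev < f_next <= 1.0):
        raise ValueError("need 0 <= f_prev < f_next <= 1")
    delta_f = (f_next - f_prev) / (1.0 - f_prev)  # rate of inbreeding
    return 1.0 / (2.0 * delta_f)


def ideal_inbreeding_series(n, generations, f0=0.0):
    """Iterate F_t = 1/(2N) + (1 - 1/(2N)) * F_{t-1} (random selfing assumed)."""
    series = [f0]
    for _ in range(generations):
        series.append(1.0 / (2 * n) + (1.0 - 1.0 / (2 * n)) * series[-1])
    return series


series = ideal_inbreeding_series(n=25, generations=5)

# Any consecutive pair from the idealised series should recover N.
print(inbreeding_effective_size(series[3], series[4]))

# Hypothetical pedigree estimates for two successive generations.
print(inbreeding_effective_size(0.10, 0.12))
```

The first call returns approximately 25, the size used to generate the series, and the second returns approximately 22.5, since an increase from 0.10 to 0.12 corresponds to $\Delta F \approx 0.022$.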
Theoretical example: overlapping generations and age-structured populations
When organisms live longer than one breeding season, effective population sizes have to take into account the life tables for the species.
Haploid
Assume a haploid population with discrete age structure. An example might be an organism that can survive several discrete breeding seasons. Further, define the following age structure characteristics:
- $v_i$: Fisher's reproductive value for age $i$,
- $\ell_i$: the chance an individual will survive to age $i$,
- $N_0$: the number of newborn individuals per breeding season, and
- $T$: the generation time, i.e. the average age of a reproducing individual.
Diploid
Similarly, the inbreeding effective number can be calculated for a diploid population with discrete age structure. This was first given by Johnson,[13] but the notation more closely resembles Emigh and Pollak.[14]
Assume the same basic parameters for the life table as given for the haploid case, but distinguishing between male and female, such as $N_0^f$ and $N_0^m$ for the number of newborn females and males, respectively (notice lower case $f$ for females, compared to upper case $F$ for inbreeding).
The inbreeding effective number is
Coalescent effective size
According to the neutral theory of molecular evolution, a neutral allele remains in a population for $N_e$ generations, where $N_e$ is the effective population size. An idealised diploid population will have a pairwise nucleotide diversity equal to $4\mu N_e$, where $\mu$ is the mutation rate. The sojourn effective population size can therefore be estimated empirically by dividing the nucleotide diversity by four times the mutation rate.[5]
The coalescent effective size may have little relationship to the number of individuals physically present in a population.[15] Measured coalescent effective population sizes vary between genes in the same population, being low in genome areas of low recombination and high in genome areas of high recombination.[16][17] Sojourn times are proportional to $N$ in neutral theory, but for alleles under selection, sojourn times are proportional to $\log(N)$. Genetic hitchhiking can cause neutral mutations to have sojourn times proportional to $\log(N)$: this may explain the relationship between measured effective population size and the local recombination rate.
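The diversity-based estimate described in this section amounts to a single division. The sketch below assumes the neutral expectation that pairwise diversity equals $4\mu N_e$ in diploids (and $2\mu N_e$ in haploids). The function name, the `ploidy` parameter and the input values are hypothetical, chosen only to show the arithmetic.

```python
def coalescent_ne(nucleotide_diversity, mutation_rate, ploidy=2):
    """Estimate Ne from pi = 2 * ploidy * Ne * mu (4 * Ne * mu for diploids)."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    if ploidy not in (1, 2):
        raise ValueError("this sketch only covers haploid (1) or diploid (2) loci")
    return nucleotide_diversity / (2 * ploidy * mutation_rate)


# Hypothetical inputs: diversity per site and mutation rate per site per generation.
pi = 0.001
mu = 1.25e-8

print(f"diploid Ne estimate: {coalescent_ne(pi, mu):.0f}")
print(f"haploid Ne estimate: {coalescent_ne(pi, mu, ploidy=1):.0f}")
```

Since diversity and mutation rate both vary along the genome, applying such a function region by region would give different values for the same population, in line with the dependence on local recombination rate noted above.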