Neutral theory of molecular evolution

Overview

The neutral theory of molecular evolution, proposed by Motoo Kimura in 1968, holds that the vast majority of evolutionary changes at the molecular level are caused not by natural selection but by the random fixation of selectively neutral or nearly neutral mutations through genetic drift. The proposal fundamentally changed how evolutionary biologists interpret DNA and protein sequence data.
The theory predicts an approximately constant rate of molecular evolution for functionally unconstrained sequences, providing the theoretical foundation for the molecular clock used to estimate divergence times between species, and explains the observed correlation between population size and levels of genetic polymorphism.
While the neutral theory is now accepted as the default null model for molecular evolution, the relative proportions of neutral, nearly neutral, and adaptive substitutions remain actively debated, with genomic studies revealing that the fraction of adaptive evolution varies substantially across taxa, genomic regions, and protein classes.

The neutral theory of molecular evolution, proposed by the Japanese population geneticist Motoo Kimura in 1968, holds that the overwhelming majority of evolutionary changes at the molecular level (substitutions of one nucleotide for another in DNA, or of one amino acid for another in proteins) are driven not by natural selection but by the random fixation of selectively neutral mutations through genetic drift.^{1, 2} Kimura did not deny that natural selection is the primary force shaping organismal adaptation and morphological evolution. His claim was narrower and more precise: at the level of DNA and protein sequences, most evolutionary change is selectively neutral, meaning that the mutant alleles involved are functionally equivalent to the alleles they replace and therefore invisible to natural selection.² Similar arguments were put forward independently by Jack King and Thomas Jukes, who in 1969 published a provocatively titled paper, "Non-Darwinian Evolution," based on protein sequence data.³

Figure: Effect of population size on genetic drift. Ten simulations of random allele frequency change over 50 generations for populations of 20, 200, and 2,000 individuals. Smaller populations fix alleles faster by chance alone, the mechanism central to the neutral theory. (Image: Professor marginalia, Wikimedia Commons, CC BY-SA 3.0)

Historical context

Before Kimura's proposal, the dominant view in evolutionary biology was that natural selection was the primary cause of evolutionary change at all levels, from morphology to molecules. This "selectionist" or "panselectionist" position, articulated most forcefully by Ernst Mayr and the architects of the modern evolutionary synthesis, held that most genetic variation within populations was maintained by balancing selection (heterozygote advantage, frequency-dependent selection) and that substitutions between species were driven by positive (directional) selection.^{2, 12}

Two empirical developments in the 1960s challenged this view. First, the application of protein electrophoresis to natural populations by Richard Lewontin, John Hubby, and Harry Harris revealed unexpectedly high levels of genetic polymorphism — typically 30 percent or more of protein-coding loci were polymorphic in most species surveyed. If all this variation were maintained by balancing selection, the cumulative fitness cost (the "segregation load") would be enormous, potentially exceeding what populations could sustain.^{2, 12} Second, comparisons of homologous protein sequences across species (haemoglobin, cytochrome c, fibrinopeptides) by Emile Zuckerkandl and Linus Pauling revealed that amino acid substitutions accumulated at roughly constant rates over evolutionary time — the molecular clock — a pattern more consistent with a stochastic process like drift than with the irregular pace expected from adaptation to changing environments.⁶

Core principles

The neutral theory rests on several key propositions derived from population genetics theory. The first is that most mutations arising in natural populations are either deleterious (and rapidly removed by purifying selection) or selectively neutral (having no measurable effect on the fitness of the organism). Only a small fraction of mutations are advantageous. The neutral theory does not claim that most mutations are neutral — most are deleterious — but that among those mutations that are fixed (that is, that spread through the entire population and replace the ancestral allele), the vast majority are neutral rather than advantageous.^{1, 2}

The second key result is that the rate of neutral evolution is equal to the neutral mutation rate and is independent of population size. This elegant result, derived by Kimura, follows from the mathematics of genetic drift. In an idealised diploid population of effective size N_e there are 2N_e gene copies at a locus, so each new neutral mutation starts at frequency 1/(2N_e) and has a probability of fixation of 1/(2N_e). The total number of new neutral mutations arising per generation is 2N_eμ, where μ is the neutral mutation rate per gamete per generation. The product of these two quantities, the overall rate of substitution, is simply μ, independent of N_e. This result provides the theoretical foundation for the molecular clock: if the neutral mutation rate is approximately constant across lineages, then neutral substitutions accumulate at a roughly constant rate, allowing divergence times to be estimated from sequence differences.^{2, 6}
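The cancellation of population size can be checked numerically. The following is a minimal sketch that assumes a Wright-Fisher model (random binomial sampling of 2N gene copies each generation, a standard idealisation that the article does not name explicitly). The function names, trial count, and seed are illustrative choices.

```python
import random


def neutral_fixation_fraction(n_diploid, n_trials, seed=0):
    """Estimate the fixation probability of a single new neutral mutation.

    Assumes a Wright-Fisher population: every generation, each of the
    2N gene copies is drawn independently from the previous generation.
    """
    rng = random.Random(seed)
    n_copies = 2 * n_diploid
    fixed = 0
    for _ in range(n_trials):
        count = 1  # the new mutation starts as a single copy
        while 0 < count < n_copies:
            freq = count / n_copies
            # Binomial sampling of the next generation
            count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        if count == n_copies:
            fixed += 1
    return fixed / n_trials


def substitution_rate(n_diploid, mu, p_fix):
    """Substitutions per generation: (new mutations per generation) x P(fix)."""
    return 2 * n_diploid * mu * p_fix


if __name__ == "__main__":
    mu = 1e-8  # hypothetical neutral mutation rate, for illustration only
    for n in (5, 10, 20):
        estimate = neutral_fixation_fraction(n, 5000)
        print(n, estimate, 1 / (2 * n), substitution_rate(n, mu, 1 / (2 * n)))
```

The simulated fixation fraction should fall close to 1/(2N) for each population size, and multiplying the theoretical value by 2Nμ returns μ in every case.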

The third prediction concerns genetic polymorphism within populations. Under the neutral theory, the expected level of heterozygosity at a locus is determined by the product of the effective population size and the mutation rate (the parameter θ = 4N_eμ for diploids). Species with larger effective population sizes should, all else being equal, harbour more neutral genetic variation. This prediction is broadly supported: Drosophila species, with large population sizes, are typically more polymorphic at the DNA level than mammals, which have smaller effective populations.^{2, 11, 16}
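As a worked illustration, one standard way to turn θ into an expected heterozygosity is the infinite-alleles result H = θ/(1 + θ). That model is an assumption added here, not something the article specifies, and the parameter values below are hypothetical.

```python
def expected_heterozygosity(n_e, mu):
    """Equilibrium heterozygosity under the infinite-alleles model (assumed).

    n_e: effective population size (diploid)
    mu:  neutral mutation rate per locus per generation
    """
    theta = 4 * n_e * mu
    return theta / (1 + theta)


if __name__ == "__main__":
    mu = 1e-6  # hypothetical per-locus mutation rate
    for n_e in (1_000, 10_000, 100_000, 1_000_000):
        print(n_e, round(expected_heterozygosity(n_e, mu), 4))
```

Heterozygosity rises with N_e but saturates below one, so under this model very large populations are not expected to be proportionally more variable.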

The nearly neutral theory

Tomoko Ohta, Kimura's student and collaborator, extended the neutral theory in the early 1970s with her nearly neutral theory, which emphasises the importance of mutations whose selective effects are very small — not strictly zero but too small to be effectively "seen" by natural selection in populations of finite size. The behaviour of such mutations depends on the relationship between the strength of selection (measured by the selection coefficient s) and the effective population size: a mutation is effectively neutral when |s| < 1/(2N_e), because in this regime genetic drift dominates over selection.⁴
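The threshold can be made concrete with Kimura's diffusion approximation for the fixation probability of a new mutation, u = (1 - e^(-2s)) / (1 - e^(-4N_e s)). The sketch below assumes additive (genic) selection and a census size equal to N_e; the function names and example values are illustrative.

```python
import math


def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for a new mutation at frequency 1/(2N_e).

    Assumes additive selection with coefficient s and census size equal to n_e.
    """
    if s == 0:
        return 1 / (2 * n_e)  # strictly neutral limit
    if -4 * n_e * s > 700:
        return 0.0  # strongly deleterious: avoid floating-point overflow
    # expm1(x) = e^x - 1 keeps precision when x is tiny
    return math.expm1(-2 * s) / math.expm1(-4 * n_e * s)


def is_effectively_neutral(s, n_e):
    """Rule of thumb from the text: drift dominates when |s| < 1/(2N_e)."""
    return abs(s) < 1 / (2 * n_e)


if __name__ == "__main__":
    n_e = 1000
    neutral = fixation_probability(0, n_e)
    for s in (-1e-3, -1e-5, 1e-5, 1e-3):
        ratio = fixation_probability(s, n_e) / neutral
        print(s, is_effectively_neutral(s, n_e), round(ratio, 3))
```

For |s| well below 1/(2N_e) the ratio to the neutral fixation probability stays close to one, whereas a deleterious mutation with |s| above the threshold is fixed far less often than a neutral one.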

The nearly neutral theory makes a distinctive prediction: slightly deleterious mutations should be fixed more frequently in species with small effective population sizes (where drift is stronger relative to selection) than in species with large populations. This predicts a negative correlation between population size and the rate of slightly deleterious substitutions, particularly at sites under weak functional constraint such as synonymous sites affected by codon usage bias. Ohta's theory has been supported by comparative genomic studies showing, for example, that mammals (with small N_e) accumulate slightly deleterious nonsynonymous substitutions at higher rates relative to synonymous substitutions than Drosophila (with large N_e).^{4, 7}

The nearly neutral theory blurs the sharp boundary between neutral and selected mutations, recognising a continuum from strongly deleterious through weakly deleterious, nearly neutral, and weakly advantageous to strongly advantageous. In practice, the fate of any given mutation depends not only on its intrinsic fitness effect but also on the demographic history and effective population size of the species in which it arises.^{4, 16}

The neutralist-selectionist debate

Kimura's theory provoked one of the most intense and productive debates in the history of evolutionary biology, pitting "neutralists" against "selectionists" in a controversy that lasted from the late 1960s through the 1990s. Selectionists, including Mayr, Lewontin, and John Gillespie, argued that natural selection was pervasive at the molecular level and that much of the observed polymorphism and divergence reflected the action of various forms of selection (balancing, directional, background, hitchhiking) rather than drift alone.^{2, 12}

The debate was difficult to resolve empirically because neutral and selected substitutions can produce similar patterns in sequence data. Several statistical tests were developed to distinguish between them. The Hudson-Kreitman-Aguade (HKA) test compares the ratio of polymorphism to divergence across multiple loci, looking for outliers that deviate from the neutral expectation. The McDonald-Kreitman (MK) test, introduced in 1991, compares the ratio of nonsynonymous to synonymous changes within species (polymorphism) and between species (divergence): under strict neutrality, these ratios should be equal, while an excess of nonsynonymous divergence relative to polymorphism indicates positive selection.^{8, 10}
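The logic of the MK test can be sketched as a 2x2 contingency table. The example below uses a hand-written two-sided Fisher exact test (one common choice of significance test, assumed here) together with the neutrality index and the derived estimate α = 1 - NI of the adaptive fraction. The counts are illustrative and are not taken from this article, and all four counts are assumed to be positive.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (neutrality index, alpha, p-value) for an MK table.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    """
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1 - neutrality_index  # estimated adaptive fraction of dn
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    return neutrality_index, alpha, p_value


if __name__ == "__main__":
    ni, alpha, p = mk_test(dn=7, ds=17, pn=2, ps=42)  # illustrative counts
    print(f"NI={ni:.3f} alpha={alpha:.3f} p={p:.4f}")
```

A neutrality index below one (α above zero) together with a small p-value corresponds to the excess of nonsynonymous divergence described above. A neutrality index above one instead suggests segregating slightly deleterious variants.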

The consensus that gradually emerged, and that prevails today, is that the neutral theory provides the correct null model for molecular evolution — the baseline expectation against which the action of selection must be demonstrated — but that a significant fraction of molecular evolution is adaptive, particularly in species with large effective population sizes where selection is more efficient relative to drift.^{13, 15}

The molecular clock

One of the most important practical consequences of the neutral theory is its theoretical justification of the molecular clock. Zuckerkandl and Pauling observed in 1965 that the number of amino acid differences between homologous proteins in different species is roughly proportional to the time since those species diverged, as estimated from the fossil record.⁶ The neutral theory explains this regularity: if most substitutions are neutral and the neutral mutation rate is approximately constant, then substitutions accumulate at a roughly clock-like rate, providing a tool for estimating divergence times between lineages.²
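A strict-clock calculation can be sketched as follows. Two lineages each accumulate substitutions after they split, so a distance d (substitutions per site) and a rate r (substitutions per site per year per lineage) give t = d / (2r). The Jukes-Cantor correction for multiple substitutions at the same site is a simple model chosen here for illustration, and the numerical inputs are hypothetical.

```python
import math


def jukes_cantor_distance(p):
    """Convert an observed proportion of differing nucleotide sites into
    substitutions per site, using the Jukes-Cantor model (assumed here)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor model")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time(distance, rate_per_lineage):
    """Strict molecular clock: both lineages evolve, hence the factor of two."""
    return distance / (2 * rate_per_lineage)


if __name__ == "__main__":
    p_observed = 0.02  # hypothetical: 2 percent of aligned sites differ
    rate = 1e-9        # hypothetical substitutions per site per year
    d = jukes_cantor_distance(p_observed)
    print(d, divergence_time(d, rate))
```

With these inputs the estimate is on the order of ten million years. The result scales inversely with the assumed rate, which is why calibration matters so much in practice.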

In practice, the molecular clock is not perfectly constant. Rates of molecular evolution vary among lineages (the "rate heterogeneity" problem), among genes, and among different functional regions of the same gene. Substitution rates are highest at sites under the weakest functional constraint — pseudogenes, introns, and synonymous sites — and lowest at sites under strong purifying selection, such as functionally critical amino acid positions. This pattern is precisely what the neutral theory predicts: functionally constrained sites tolerate fewer neutral mutations, so their effective neutral mutation rate (and hence their substitution rate) is lower.^{2, 5}

Modern phylogenetic analyses employ "relaxed clock" models that allow substitution rates to vary across branches of a phylogenetic tree, calibrated by fossil dates or biogeographic events. These methods, which owe their theoretical foundation to Kimura's neutral theory, are now standard tools for estimating the timing of evolutionary divergences across the tree of life.^{5, 14}

Evidence from genomics

The genomic era has provided large-scale data bearing on the neutral theory's predictions. Genome-wide analyses using the McDonald-Kreitman framework have estimated that approximately 45 percent of amino acid substitutions in Drosophila melanogaster have been driven by positive selection, a substantial departure from strict neutrality.⁷ In contrast, estimates for humans and other mammals with small effective population sizes suggest that a much lower fraction of nonsynonymous substitutions (perhaps 10 to 20 percent) is adaptive, with the remainder being neutral or slightly deleterious — consistent with the nearly neutral theory's prediction that drift plays a larger role in small populations.¹³

Comparative genomic studies have confirmed several specific predictions of the neutral theory. The ratio of nonsynonymous to synonymous substitution rates (dN/dS or ω) is typically much less than one for most protein-coding genes, indicating that purifying selection dominates and that most nonsynonymous mutations are deleterious. Rates of substitution at synonymous sites and in pseudogenes (nonfunctional gene copies) are higher than at nonsynonymous sites, as expected if functional constraint reduces the proportion of mutations that are effectively neutral. Regions of the genome under no known functional constraint, such as ancient transposon insertions, evolve at rates close to the total mutation rate, consistent with the neutral expectation.^{5, 9}

The human effective population size has been estimated at approximately 10,000 from patterns of genetic diversity, a figure consistent with neutral theory predictions and independently supported by coalescent analyses of multiple unlinked loci.¹¹ This small effective population size, relative to Drosophila or many bacterial species, means that genetic drift is a more potent force in human evolution and that slightly deleterious mutations are more likely to drift to fixation than in species with larger populations.^{4, 16}
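The arithmetic behind such an estimate can be sketched by inverting θ = 4N_eμ, treating per-site nucleotide diversity π as an estimate of θ. The input values below are round illustrative numbers (a diversity of about 0.1 percent and a mutation rate of 2.5 × 10^-8 per site per generation), not figures taken from this article, and published mutation-rate estimates vary.

```python
def effective_size_from_diversity(pi, mu):
    """Estimate N_e for a diploid species from pi ~ theta = 4 * N_e * mu.

    pi: nucleotide diversity per site
    mu: mutation rate per site per generation
    """
    return pi / (4 * mu)


if __name__ == "__main__":
    # Illustrative round numbers; a lower assumed mu gives a larger N_e
    print(effective_size_from_diversity(pi=0.001, mu=2.5e-8))
    print(effective_size_from_diversity(pi=0.001, mu=1.25e-8))
```

Because the estimate is inversely proportional to the assumed mutation rate, halving μ doubles the inferred N_e.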

Significance and continuing debates

The neutral theory transformed molecular evolutionary biology by establishing a rigorous null hypothesis against which the action of natural selection at the molecular level must be tested. Before Kimura, the default assumption was that all evolutionary change, including molecular change, was adaptive. After Kimura, the burden of proof shifted: deviations from neutral expectations require positive evidence, and many patterns previously attributed to selection were reinterpreted as consequences of drift and mutation.^{2, 15}

The theory also provided the quantitative framework for the molecular clock, which remains one of the most widely used tools in evolutionary biology for estimating divergence times and calibrating phylogenetic trees. Without the neutral theory's prediction of approximately clock-like substitution rates at functionally unconstrained sites, the molecular clock would lack theoretical justification.^{6, 14}

Debates continue over the proportion of molecular evolution that is adaptive versus neutral, the importance of nearly neutral mutations in shaping genome evolution, and the extent to which linked selection (background selection and genetic hitchhiking) distorts neutral predictions. These are empirical questions that genomic data are progressively resolving, but they are asked within the conceptual framework that Kimura established. The neutral theory did not replace natural selection as the explanation for adaptation; it demonstrated that adaptation and molecular evolution are substantially different processes, governed by different dynamics, and that the molecular level is dominated by the stochastic fixation of variants that make no difference to the organism.^{2, 13, 15}

References

Evolutionary rate at the molecular level

Kimura, M. · Nature 217: 624–626, 1968

Genetic drift and population size