
Gene duplication


Overview

  • Gene duplication—through unequal crossing over, retrotransposition, or whole-genome duplication—is the primary mechanism by which genomes acquire new genetic raw material, directly addressing the claim that evolution cannot create new information.
  • After duplication, one copy is freed from selective constraint and can accumulate mutations that produce entirely new functions (neofunctionalization) or partition the original function between the two copies (subfunctionalization).
  • Major evolutionary innovations including the hemoglobin gene family, Hox gene clusters for body patterning, and the opsins underlying color vision all arose through successive rounds of gene duplication and divergence.

In 1970, Susumu Ohno published a landmark book arguing that gene duplication is the primary mechanism by which genomes acquire new genetic material—the raw substrate upon which natural selection and genetic drift act to produce evolutionary novelty.1 Half a century of genomic research has confirmed and extended Ohno's thesis. Gene duplication literally creates new DNA sequences, and the subsequent divergence of duplicate copies has produced many of the most important gene families in biology, from the hemoglobins that transport oxygen in vertebrate blood to the Hox genes that pattern animal body plans to the opsins that enable color vision.2, 13 The process directly refutes the common objection that evolution "cannot create new information"—duplication creates the information, and mutation and selection reshape it into new functions.

Mechanisms of gene duplication

Genes can be duplicated through several distinct mechanisms, each producing copies with different characteristics. Unequal crossing over during meiosis occurs when homologous chromosomes misalign, resulting in one chromosome with a duplicated segment and another with a corresponding deletion. This mechanism typically produces tandem duplicates—copies located adjacent to each other on the same chromosome—and is responsible for the expansion of many clustered gene families, including the globin and olfactory receptor gene clusters.1, 13
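The misalignment described above can be sketched with a toy model. This is a minimal illustration, assuming chromosomes can be represented as ordered lists of gene labels; the function name `unequal_crossover` and the labels A to D are hypothetical and not drawn from any cited source.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Recombine two chromosomes (lists of gene labels) cut at different indices.

    When break_a != break_b the homologs are misaligned and the exchange
    is unequal: one product gains material and the other loses it.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Two homologous chromosomes carrying the same genes in the same order.
homolog_1 = ["A", "B", "C", "D"]
homolog_2 = ["A", "B", "C", "D"]

# Misalignment: homolog_1 is cut after gene C, homolog_2 after gene B.
duplicated, deleted = unequal_crossover(homolog_1, homolog_2, break_a=3, break_b=2)
print(duplicated)  # ['A', 'B', 'C', 'C', 'D']  tandem duplicate of C
print(deleted)     # ['A', 'B', 'D']            corresponding deletion
```

The two copies of C sit next to each other in the first product, which is why this mechanism yields tandem duplicates.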

Retrotransposition involves the reverse transcription of an mRNA molecule back into DNA, followed by its insertion at a new genomic location by the machinery of transposable elements. Because the copy is made from processed mRNA, retrotransposed duplicates (retrocopies) lack introns and the original gene's regulatory sequences, and they are inserted at random positions in the genome. Most retrocopies become pseudogenes, but some acquire new regulatory elements from their insertion site and evolve into functional retrogenes with novel expression patterns.2, 13
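A rough sketch of why retrocopies lack introns and regulatory sequence, assuming a gene can be modeled as a list of labeled segments and the genome as a plain string. The helper `retrotranspose`, the segment labels, and the toy sequences are all hypothetical.

```python
import random


def retrotranspose(gene, genome, rng):
    """Insert an intron-free copy of `gene` at a random position in `genome`.

    `gene` is a list of (kind, sequence) pairs, where kind is
    'promoter', 'exon' or 'intron'. `genome` is a string.
    """
    # Processed mRNA keeps only exons, so the promoter and introns are absent.
    retrocopy = "".join(seq for kind, seq in gene if kind == "exon")
    # The insertion site is chosen at random along the genome.
    position = rng.randrange(len(genome) + 1)
    new_genome = genome[:position] + retrocopy + genome[position:]
    return new_genome, retrocopy


toy_gene = [
    ("promoter", "TATA"),
    ("exon", "ATGGCC"),
    ("intron", "GTAAGT"),
    ("exon", "TTCTAA"),
]
toy_genome = "n" * 20  # placeholder background sequence

new_genome, retrocopy = retrotranspose(toy_gene, toy_genome, random.Random(0))
print(retrocopy)  # ATGGCCTTCTAA
```

Because the copy arrives without its promoter, it is expressed only if the insertion site happens to supply regulatory elements, which matches the observation that most retrocopies become pseudogenes.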

The most dramatic form of gene duplication is whole-genome duplication (WGD), or polyploidy, in which the entire genome is copied. WGD events have occurred multiple times in the history of life, particularly in plants and early vertebrates. A single WGD event doubles every gene in the genome simultaneously, providing an enormous burst of raw material for evolutionary innovation.1, 11

Evolutionary fates of duplicate genes

When a gene is duplicated, both copies initially perform the same function. Because the organism now has two copies where one sufficed, one of the duplicates is partly released from purifying selection: mutations that would be strongly deleterious in a single-copy gene can accumulate in one duplicate because the other copy continues to provide the essential function. Lynch and Conery estimated that gene duplicates arise at a rate of approximately 0.01 per gene per million years, and that most duplicates are silenced (pseudogenized) within a few million years.2 The minority that survive, however, can follow several evolutionary trajectories of profound importance.
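The rate quoted above lends itself to a short back-of-the-envelope calculation. The sketch below takes the 0.01 per gene per million years figure from the text; the genome size of 20,000 genes, the 4-million-year half-life, and the simple exponential-decay model of silencing are illustrative assumptions, not values taken from this article.

```python
import math

DUPLICATION_RATE = 0.01  # duplications per gene per million years (quoted above)


def expected_new_duplicates(n_genes, interval_myr):
    """Expected number of new duplicates arising in a genome over an interval."""
    return DUPLICATION_RATE * n_genes * interval_myr


def surviving_fraction(age_myr, half_life_myr):
    """Fraction of duplicates not yet silenced, assuming exponential decay."""
    return 0.5 ** (age_myr / half_life_myr)


# Hypothetical genome of 20,000 genes observed over one million years.
new_per_myr = expected_new_duplicates(n_genes=20_000, interval_myr=1)

# Hypothetical half-life of 4 million years for an unsilenced duplicate.
still_intact_after_20_myr = surviving_fraction(age_myr=20, half_life_myr=4)

print(round(new_per_myr))                   # 200
print(round(still_intact_after_20_myr, 5))  # 0.03125
```

Under these assumptions, a steady supply of new duplicates coexists with rapid loss, so only a small minority persists long enough to diverge in function.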

Neofunctionalization occurs when one copy retains the original function while the other accumulates mutations that confer an entirely new function. This is the classical model proposed by Ohno, and it accounts for many of the most important evolutionary innovations produced by gene duplication.1, 4 Subfunctionalization, formalized by Force et al., occurs when both copies accumulate complementary degenerative mutations, such that each retains a subset of the ancestral gene's functions. Neither copy alone can perform the full original role, so both are preserved by natural selection. Subfunctionalization may serve as a transitional state that preserves duplicates long enough for neofunctionalization to occur in one copy.3, 14
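The distinction between these fates can be made concrete by treating each gene as the set of functions it performs. This is a simplified sketch under that assumption; the function `classify_fate` and the function labels are hypothetical, and real duplicates often fall between these clean categories.

```python
def classify_fate(ancestral, copy_a, copy_b):
    """Classify a duplicate pair from the set of functions each copy performs."""
    if not copy_a or not copy_b:
        # One copy has lost every function.
        return "pseudogenization"
    if (copy_a | copy_b) - ancestral:
        # At least one copy performs a function the ancestor lacked.
        return "neofunctionalization"
    if (copy_a | copy_b) == ancestral and copy_a != ancestral and copy_b != ancestral:
        # Complementary losses: only the pair together covers the ancestral role.
        return "subfunctionalization"
    return "redundant or unresolved"


ancestral = {"function_1", "function_2"}

print(classify_fate(ancestral, {"function_1"}, {"function_2"}))
# subfunctionalization
print(classify_fate(ancestral, ancestral, ancestral | {"novel_function"}))
# neofunctionalization
print(classify_fate(ancestral, ancestral, set()))
# pseudogenization
```

In the subfunctionalization case, deleting either copy removes a function the organism still needs, which is why both copies are preserved.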

The globin gene family

The vertebrate globin gene family is perhaps the most thoroughly studied example of evolution through gene duplication. All globin proteins—including hemoglobin (which transports oxygen in red blood cells), myoglobin (which stores oxygen in muscle tissue), and cytoglobin (expressed in connective tissue)—descended from a single ancestral globin gene through a series of duplication events spanning over 800 million years.5, 6

The earliest duplication, approximately 800 million years ago, produced the split between the lineage leading to myoglobin and the lineage leading to hemoglobin. Subsequent duplications within the hemoglobin lineage produced the alpha-globin and beta-globin gene clusters, which are now located on different chromosomes in humans (chromosome 16 and chromosome 11, respectively). Further duplications within each cluster generated the fetal and embryonic globin variants that are expressed at different stages of development: embryonic hemoglobins with extremely high oxygen affinity for extracting oxygen from maternal blood, fetal hemoglobin with intermediate affinity, and adult hemoglobin optimized for oxygen delivery to tissues.5, 6

Each of these globin variants is a duplicate that diverged from its sibling copy to serve a distinct biological role. The family tree of globin genes can be reconstructed from sequence comparisons, and its branching order is consistent with the order of duplication events inferred from comparing the globin genes present in different vertebrate species.5

Hox gene clusters

Hox genes encode transcription factors that specify positional identity along the anterior-posterior body axis of animals. They are among the most important regulatory genes in developmental biology, and their evolutionary history provides a dramatic illustration of evolution through gene duplication at multiple scales.7, 8

The ancestral animal Hox cluster likely contained a small number of genes arranged in tandem on a single chromosome. Through a series of tandem duplications, this cluster expanded to approximately 10–13 genes in the common ancestor of bilaterian animals. Invertebrates such as Drosophila retain a single Hox cluster (split into two sub-clusters in flies), while vertebrates possess four Hox clusters (HoxA, HoxB, HoxC, HoxD) containing a total of 39 genes in mammals.7, 8 The four vertebrate clusters arose through whole-genome duplication events at the base of the vertebrate lineage—the same events that doubled other gene families. Teleost fish, which experienced an additional whole-genome duplication, possess seven or eight Hox clusters.7, 12
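The cluster counts above follow from simple doubling arithmetic. The sketch below assumes a single ancestral cluster and, as a stated simplification, 13 gene positions per cluster (the upper end of the 10–13 range quoted above); the variable and function names are hypothetical.

```python
PARALOG_GROUPS = 13    # assumption: upper end of the 10-13 range quoted above
MAMMAL_HOX_GENES = 39  # total quoted above for mammals


def clusters_after_wgd(initial_clusters, rounds):
    """Each whole-genome duplication doubles the number of clusters."""
    return initial_clusters * 2 ** rounds


vertebrate_clusters = clusters_after_wgd(1, rounds=2)
teleost_clusters_max = clusters_after_wgd(vertebrate_clusters, rounds=1)

# If every cluster had kept all 13 positions, mammals would carry 52 genes.
potential_genes = vertebrate_clusters * PARALOG_GROUPS
genes_lost = potential_genes - MAMMAL_HOX_GENES

print(vertebrate_clusters)   # 4
print(teleost_clusters_max)  # 8
print(potential_genes)       # 52
print(genes_lost)            # 13
```

The gap between the potential and observed totals illustrates that duplication is followed by loss: teleosts retain seven or eight clusters rather than always eight, and mammals retain 39 genes rather than 52 under this assumption.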

The expansion of Hox gene numbers through duplication is thought to have provided the genetic toolkit for the increased morphological complexity of vertebrate body plans. After duplication, individual Hox genes diverged in their expression patterns and regulatory targets, enabling finer-grained patterning of the body axis and the evolution of novel structures such as limbs and specialized vertebral regions.7, 8

Opsins and color vision

The evolution of color vision in primates provides a clear and well-understood example of new function arising through gene duplication. Most mammals are dichromats, possessing only two types of cone opsin (short-wavelength and medium/long-wavelength). Old World monkeys, apes, and humans (the catarrhine primates) are trichromats with three cone opsins: short-wavelength (blue), medium-wavelength (green), and long-wavelength (red).9, 10

Trichromatic vision arose through a duplication of the ancestral medium/long-wavelength opsin gene on the X chromosome, followed by sequence divergence between the two copies such that one shifted its peak absorption toward longer wavelengths (red) while the other retained sensitivity to medium wavelengths (green). The human red and green opsin genes are located in tandem on the X chromosome and share approximately 96% amino acid sequence identity, reflecting their recent origin from a single gene by duplication approximately 30–40 million years ago.9, 10 The critical amino acid differences that shift spectral sensitivity between the two opsins have been identified and experimentally verified. This represents neofunctionalization in action: a duplicated gene diverging to provide a sensory capability, discrimination between red and green wavelengths, that the single ancestral gene could not supply.
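A figure such as "96% sequence identity" is computed by comparing two aligned sequences position by position. The sketch below shows the calculation on made-up 25-residue strings; these are not real opsin sequences, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy 25-residue "ancestral" protein (not a real opsin sequence).
ancestral_copy = "ACDEFGHIKLMNPQRSTVWY" + "ACDEF"

# The duplicate accumulates a single substitution (M to A at index 10).
diverged_copy = ancestral_copy[:10] + "A" + ancestral_copy[11:]

print(percent_identity(ancestral_copy, diverged_copy))  # 96.0
```

High identity between two tandem genes is the signature of a recent duplication: the copies have had time to accumulate only a few differences, some of which alter function.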

The 2R hypothesis

Ohno proposed that two rounds of whole-genome duplication (WGD) occurred early in vertebrate evolution, an idea now known as the 2R hypothesis.1 If correct, this would mean that every gene in the ancestral vertebrate genome was quadrupled, providing an enormous pool of duplicate genes from which vertebrate-specific innovations could evolve. Genomic analyses have provided substantial support for this hypothesis, showing that many vertebrate gene families exist in groups of four (or fewer, due to subsequent gene loss), arranged in patterns consistent with two rounds of genome-wide duplication.11, 12
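The "four or fewer" pattern can be illustrated with a small simulation. This is a toy model only: it assumes each copy is lost independently with a fixed probability after each round and that at least one copy of every gene always survives. The loss probability of 0.5 and the function name `simulate_2r` are arbitrary illustrative choices.

```python
import random
from collections import Counter


def simulate_2r(n_genes, loss_prob, rng):
    """Return a Counter of gene-family sizes after two rounds of WGD with loss."""
    sizes = []
    for _ in range(n_genes):
        copies = 1
        for _round in range(2):
            copies *= 2  # whole-genome duplication doubles every copy
            survivors = sum(rng.random() > loss_prob for _ in range(copies))
            # Assumption: the gene is essential, so one copy always remains.
            copies = max(1, survivors)
        sizes.append(copies)
    return Counter(sizes)


family_sizes = simulate_2r(n_genes=1000, loss_prob=0.5, rng=random.Random(1))
for size in sorted(family_sizes):
    print(size, family_sizes[size])
```

No family can exceed four members in this model, and most end up smaller, which is the qualitative pattern the genomic analyses look for.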

Dehal and Boore analyzed the human genome and found widespread evidence of ancient tetraploidy (four-fold duplication) consistent with two WGD events occurring before the divergence of jawed and jawless vertebrates.11 Holland et al. confirmed this pattern using the amphioxus genome as an outgroup, showing that many genomic regions in vertebrates display a four-to-one relationship with corresponding amphioxus regions.12 These ancient WGD events are thought to have been instrumental in the evolution of vertebrate complexity, providing the raw genetic material for innovations in the nervous system, immune system, and skeletal development.

Gene duplication and the creation of new information

A persistent claim in anti-evolution literature is that mutation and natural selection cannot generate "new genetic information." Gene duplication directly falsifies this claim. When a gene is duplicated, the total amount of DNA in the genome increases—new sequence is literally created. The duplicate is then free to diverge through point mutations, insertions, deletions, and domain shuffling, eventually acquiring a function that did not previously exist in the genome.1, 2, 13

The examples above demonstrate this process at work. The ancestral globin gene encoded a single oxygen-binding protein; through duplication and divergence, it gave rise to a family of proteins with distinct functions optimized for different tissues and developmental stages. A single ancestral opsin gene gave rise to the three opsins that enable trichromatic color vision. A small cluster of Hox genes expanded into the four clusters that pattern the vertebrate body plan. In each case, gene duplication provided the raw material, and natural selection shaped the diverging copies into new functional genes—a well-documented, experimentally verified mechanism for the origin of biological novelty.1, 4, 13

References

1. Ohno, S. Evolution by Gene Duplication. Springer-Verlag, Berlin, 1970.

2. Lynch, M. & Conery, J. S. The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155, 2000.

3. Force, A. et al. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151: 1531–1545, 1999.

4. Zhang, J. Gene duplication and the adaptive evolution of a classic genetic switch. Nature 428: 567–571, 2004.

5. Hardison, R. C. Evolution of the globin gene family. PNAS 93: 5675–5679, 1996.

6. Goodman, M., Moore, G. W. & Matsuda, G. Molecular evolution of the vertebrate globin gene family. Nature 253: 603–608, 1975.

7. Garcia-Fernàndez, J. Evolution of Hox gene clusters. Developmental Dynamics 232: 31–37, 2005.

8. Carroll, S. B. Hox genes and the evolution of vertebrate body plans. Nature 376: 479–485, 1995.

9. Yokoyama, S. Molecular genetics and evolution of color vision in vertebrates. Gene 300: 69–78, 2002.

10. Surridge, A. K., Osorio, D. & Mundy, N. I. The evolution of trichromatic color vision in primates. Animal Behaviour 66: 7–16, 2003.

11. Dehal, P. & Boore, J. L. Deconstructing the 2R hypothesis. PLoS Computational Biology 1(3): e38, 2005.

12. Holland, P. W. H. et al. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biology 6(7): e174, 2008.

13. Kondrashov, F. A. Gene duplication: a drive for phenotypic diversity and cause of human disease. Annual Review of Genetics 46: 97–118, 2012.

14. Rastogi, S. & Liberles, D. A. Subfunctionalization of duplicated genes as a transition state to neofunctionalization. BMC Evolutionary Biology 5: 28, 2005.