Paleogenomics methods


Overview

  • Ancient DNA survives in recognizable form for at most a few million years under ideal permafrost conditions, and degrades through two characteristic processes — fragmentation into short pieces and cytosine deamination — that simultaneously destroy genetic information and serve as authentication markers proving a sequence is genuinely ancient.
  • The petrous portion of the temporal bone yields DNA concentrations orders of magnitude higher than other skeletal elements, and single-stranded library preparation methods developed after 2012 recover damaged molecules that standard double-stranded protocols discard, together transforming what can be sequenced.
  • Population-genetic statistics developed specifically for ancient DNA — the D-statistic, f-statistics, qpAdm, and admixture graphs — detect and quantify gene flow between archaic and modern human populations even from low-coverage, error-prone sequence data.

Paleogenomics — the sequencing and analysis of genetic material recovered from ancient biological remains — has transformed paleoanthropology more completely than any development since radiocarbon dating. Before the field existed, researchers inferred the relationships of extinct human populations entirely from the shapes of bones. Since the early 2010s, it has become possible to read the actual genomes of Neanderthals, Denisovans, and thousands of prehistoric Homo sapiens, revealing population movements, episodes of interbreeding, and patterns of natural selection that left no trace in the fossil record. The methods that make this possible — spanning chemistry, molecular biology, sequencing technology, and population genetics — are as consequential as the discoveries they enable, and understanding them is prerequisite to evaluating what ancient DNA evidence can and cannot establish.3, 5

Origins of the field

The first demonstration that DNA could be recovered from ancient material came in 1984, when Russell Higuchi and colleagues at the University of California, Berkeley, cloned and sequenced short fragments of mitochondrial DNA from a museum specimen of the quagga, a southern African equid that had been extinct since 1883.1 The result was extraordinary: genetic material had survived more than a century of museum storage and could be amplified using the molecular cloning techniques then available. The following year, Svante Pääbo — then a graduate student in Uppsala — published the extraction of DNA from Egyptian mummies dating to roughly 2,400 years old, demonstrating that ancient DNA could be recovered from human tissue and opening the question of how far back in time the technique might reach.2

For the next two decades, progress was hampered by a fundamental limitation: the only amplification method available was the polymerase chain reaction (PCR), which requires short primer sequences flanking a target region. Researchers were confined to studying small, targeted segments of the genome, almost always mitochondrial DNA, because mitochondria are present in thousands of copies per cell and their DNA therefore survives at higher concentrations than nuclear DNA. This era produced a large literature on ancient mitochondrial lineages but could not address questions about the nuclear genome — the portion that carries most information about ancestry, relatedness, and phenotype.1, 2

The transformation came with the maturation of next-generation sequencing (NGS) platforms in the mid-2000s. Rather than amplifying specific loci, NGS converts all DNA molecules in a sample into sequenceable fragments simultaneously, allowing the entire surviving genetic content of a specimen to be read in a single experiment. Pääbo’s laboratory at the Max Planck Institute for Evolutionary Anthropology harnessed this technology to produce a draft sequence of the Neanderthal genome in 2010, assembled from DNA extracted from three bones found in Vindija Cave, Croatia, and representing the first time the nuclear genome of an extinct human species had been sequenced.3 That same year, the same laboratory reported the genome of a new and entirely unknown archaic human lineage — the Denisovans — from a finger bone found in Denisova Cave, Siberia.5 These two publications marked the founding of paleogenomics as a mature discipline.

How ancient DNA degrades

Understanding what makes ancient DNA challenging to work with requires knowing how DNA breaks down after an organism dies. Two processes dominate. The first is hydrolytic fragmentation: water molecules cleave the phosphodiester bonds that link nucleotides in the DNA strand, breaking the molecule into progressively shorter pieces over time. Ancient DNA molecules are typically only 30 to 100 base pairs long, compared with the millions of base pairs that constitute intact chromosomal DNA in living cells.17 The short length of recovered fragments means that any individual read of sequence provides only a tiny window onto the original genome, requiring massive numbers of fragments to reconstruct coverage of the full genome.

The second process is cytosine deamination: hydrolysis strips the amino group from the cytosine ring, converting cytosine (C) to uracil (U). When the damaged template is copied during library preparation or sequencing, uracil is read as thymine (T), producing characteristic C-to-T substitution errors concentrated at the ends of ancient DNA fragments. This deamination pattern is not a nuisance but a forensic asset: because modern contaminant DNA does not show this damage signature, damage patterns serve as the primary proof that a sequence is genuinely ancient rather than the product of modern contamination.17 A sample showing no elevated C-to-T damage at fragment termini cannot be authenticated and is most plausibly dominated by modern DNA.
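
The terminal damage profile is computed from aligned reads; tools such as mapDamage produce exactly this kind of position-by-position misincorporation table. The sketch below is illustrative only (not any published tool's implementation) and assumes indel-free alignments of read against reference:

```python
from collections import Counter

def ct_damage_by_position(alignments, n_positions=5):
    """Fraction of reference-C sites read as T at each position from the
    5' end of a read.  `alignments` is a list of (reference, read) tuples
    of equal-length aligned strings (no indels assumed)."""
    c_total = Counter()   # reference C sites seen at each position
    c_to_t = Counter()    # of those, how many were read as T
    for ref, read in alignments:
        for i in range(min(n_positions, len(ref))):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / c_total[i] if c_total[i] else 0.0
            for i in range(n_positions)]

# Toy reads: heavy C->T damage at position 0, none further in.
reads = [
    ("CCAG", "TCAG"),   # C->T at position 0
    ("CTGC", "TTGC"),   # C->T at position 0
    ("CATC", "CATC"),   # undamaged
]
rates = ct_damage_by_position(reads, n_positions=2)
```

An authentic ancient library shows exactly this shape: a C-to-T rate of several percent or more at position 0, decaying rapidly toward the interior of the fragment.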

DNA survival depends critically on environmental conditions, above all temperature. Cold slows every chemical reaction, including hydrolysis and oxidative damage. This is why permafrost is the most productive environment for ancient DNA recovery: bones and other organic material buried in permanently frozen ground can preserve readable genetic sequences for extraordinary spans of time. The current record, as of the mid-2020s, belongs to environmental DNA extracted from Greenland permafrost and dating to approximately two million years ago — the oldest authenticated ancient DNA yet recovered from any source.18 In temperate or tropical environments, DNA typically degrades to unrecoverable levels within a few thousand years, and the prospect of recovering ancient DNA from early hominin remains in Africa — where most of human evolution took place — remains limited by this thermal constraint.
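
The temperature dependence can be made concrete with a deliberately simplified random-cleavage model. The rate constants below are assumptions chosen only to illustrate the orders of magnitude, not measured values:

```python
def expected_fragment_length(k_per_site_per_year, years):
    """Under a random-cleavage (Poisson) model, each phosphodiester bond
    breaks independently at rate k; after time t the expected spacing
    between breaks, i.e. the mean surviving fragment length, is roughly
    1 / (k * t)."""
    return 1.0 / (k_per_site_per_year * years)

# Illustrative (assumed) per-bond cleavage rates: hydrolysis proceeds far
# more slowly in permafrost than in a temperate burial environment.
k_permafrost = 1e-8   # breaks per bond per year (assumption)
k_temperate = 1e-5    # assumption: ~1000x faster when warm and wet

len_cold_1my = expected_fragment_length(k_permafrost, 1_000_000)
len_warm_10ky = expected_fragment_length(k_temperate, 10_000)
```

Under these assumed rates, a million years in permafrost still leaves fragments around 100 bp, while ten thousand years in a temperate burial drives the mean length down toward 10 bp, below what can be usefully mapped.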

Extraction: finding the best source material

Not all skeletal elements preserve DNA equally well, and the choice of sampling location is among the most consequential methodological decisions in any paleogenomics project. For most of the field’s early history, researchers preferentially sampled teeth and compact cortical bone from the shafts of long bones. These materials protect DNA from environmental exposure and were assumed to offer the best preservation. That assumption was overturned in 2015, when a systematic comparison by Ron Pinhasi and colleagues demonstrated that the petrous portion of the temporal bone — a dense, pyramid-shaped region at the base of the skull that houses the inner ear — yields endogenous ancient DNA at concentrations one to two orders of magnitude higher than any other skeletal element tested.9 The petrous is among the densest bones in the human body, and its exceptional density appears to shield the DNA within from microbial degradation and chemical attack. This discovery immediately became standard practice: petrous sampling is now the default approach when the specimen permits it.

A second high-yield source identified more recently is dental cementum, the calcified tissue that covers the root of a tooth and attaches it to the jaw via the periodontal ligament. Cementum appears to preserve DNA at concentrations comparable to the petrous, possibly because its tight mineral structure limits access by exogenous microbes and water. Where petrous bone is unavailable — because the specimen has been heavily sampled previously, or because the cranium is not preserved — the cementum layer of a molar root offers an alternative with similarly high DNA yields.9

The extraction protocol itself has been refined considerably since the early days. Modern protocols use silica-based spin columns to bind DNA fragments of all sizes, including the very short molecules (below 30 base pairs) that earlier silica chemistries lost. Digestion of the bone powder proceeds in a lysis buffer optimized for ancient material: proteinase K degrades protein matrices that bind and protect DNA, while EDTA chelates calcium ions that stabilize the mineral lattice, releasing DNA into solution. The entire process takes place in dedicated ancient DNA laboratories built with positive-pressure filtered air, ultraviolet-irradiated work surfaces, and strict protocols to prevent contamination from the researchers themselves.8

Library preparation for degraded DNA

Before ancient DNA fragments can be sequenced on a next-generation platform, they must be converted into a sequencing library: a collection of molecules with standardized adapter sequences ligated to each end, allowing them to bind to a sequencing flow cell and be read. Standard library preparation protocols developed for modern, high-molecular-weight DNA work poorly on ancient material because they depend on DNA polymerases that require relatively long, intact templates to function efficiently, and because they lose molecules shorter than approximately 100 base pairs during size-selection steps.10

Two innovations substantially improved library recovery from heavily degraded ancient DNA. The first was single-stranded DNA library preparation, developed in Matthias Meyer's group at the Max Planck Institute and first deployed at scale for the 2012 high-coverage Denisovan genome. Conventional double-stranded protocols require DNA to be in intact duplex form before adapter ligation; single-stranded protocols instead denature the DNA and ligate adapters to single-stranded molecules, recovering fragments that the double-stranded method cannot capture. The result is a dramatic increase in the number of ancient molecules that make it into the sequencing library, particularly the shortest and most heavily damaged fragments — which are also, paradoxically, the most authentically ancient.6, 10 Single-stranded library preparation is now considered the gold standard for heavily degraded specimens.

The second innovation involves enzymatic treatment to control damage before sequencing. The enzyme uracil-DNA glycosylase (UDG) excises uracil residues from DNA, and when applied to ancient DNA libraries it effectively removes the deaminated cytosines that cause C-to-T errors. Full UDG treatment produces libraries with very low damage rates, making the resulting sequences easier to map accurately to a reference genome and reducing error rates in variant calls. The trade-off is that authentication via damage patterns is no longer directly possible on UDG-treated libraries, because the signature has been enzymatically erased. A partial UDG treatment — which reduces but does not eliminate damage — is often used as a compromise, yielding cleaner sequence data while preserving enough residual damage for authentication.6

Sequencing strategies: shotgun versus targeted capture

Once a library has been prepared, researchers face a choice between two fundamentally different sequencing strategies. Shotgun sequencing feeds the entire library into the sequencer without prior selection, reading whatever molecules are present. In specimens with high endogenous DNA content — such as bones from Siberian permafrost — shotgun sequencing is highly efficient. The high-coverage Denisovan genome was produced by shotgun sequencing to approximately 30-fold depth, meaning each position in the genome was read an average of 30 times, yielding a sequence comparable in quality to modern human reference genomes.6 Similarly, a high-coverage Neanderthal genome from the Altai Mountains, published in 2014, was generated by shotgun sequencing and provided sufficient resolution to detect gene flow between Neanderthals, Denisovans, and early modern humans at the level of individual chromosomal segments.4

In specimens with poor DNA preservation — which describes the vast majority of ancient human remains from temperate climates — most of the DNA in the library comes not from the individual being studied but from environmental microbes, fungi, and soil bacteria that infiltrated the bone after death. In such samples, endogenous human DNA may constitute less than one percent, sometimes less than one tenth of one percent, of the total sequencing output. Shotgun sequencing in these cases is extremely expensive for the amount of informative data returned. The alternative is targeted enrichment (also called hybridisation capture), in which synthetic RNA or DNA oligonucleotides complementary to genomic regions of interest are used to selectively pull down matching sequences from the library before sequencing. Capture panels targeting several hundred thousand single-nucleotide polymorphisms (SNPs) informative for human population history can enrich endogenous sequences by two to three orders of magnitude, making it economically feasible to genotype even poorly preserved specimens.12 Targeted capture has been instrumental in the large-scale ancient genomics projects that have traced population movements across prehistoric Europe, the Americas, and South Asia, where bones from temperate burial sites routinely have low endogenous DNA content.
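
The economics behind this choice reduce to simple coverage arithmetic. The sketch below uses assumed round numbers (50 bp fragments, a 3.1 Gb genome) to show why shotgun sequencing of a 0.1%-endogenous library is prohibitive; note that real capture experiments target a SNP panel rather than the whole genome, a simplification this toy calculation ignores:

```python
def reads_needed(target_coverage, genome_size, mean_frag_len, endog_frac):
    """Total reads to sequence so that the *endogenous* fraction alone
    reaches the target depth:
    coverage = (endogenous reads * fragment length) / genome size."""
    endogenous_reads = target_coverage * genome_size / mean_frag_len
    return endogenous_reads / endog_frac

GENOME = 3.1e9   # human genome size in bp (round number)
FRAG = 50        # assumed mean ancient fragment length, bp

# 1x coverage of a well-preserved bone (50% endogenous) ...
good = reads_needed(1.0, GENOME, FRAG, 0.50)
# ... versus a poorly preserved one (0.1% endogenous): 500x more reads.
poor = reads_needed(1.0, GENOME, FRAG, 0.001)
# A 1000-fold capture enrichment of the poor library closes the gap.
captured = reads_needed(1.0, GENOME, FRAG, min(1.0, 0.001 * 1000))
```

The well-preserved bone needs on the order of 10^8 reads; the poorly preserved one needs tens of billions, which is why capture enrichment, not brute-force sequencing, made large-scale studies of temperate-climate remains affordable.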

Targeted enrichment has also been applied to non-human sequences, most notably to ancient pathogen genomes. By designing capture probes based on the genome of a pathogen of interest, researchers can reconstruct pathogen sequences from the remains of individuals who died of infectious disease. This approach has produced ancient genome sequences of Yersinia pestis (plague), Vibrio cholerae (cholera), Mycobacterium tuberculosis, and the 1918 influenza virus, enabling direct study of how these pathogens evolved, spread, and changed in virulence over historical time.11

Authentication and contamination

The most persistent methodological challenge in paleogenomics is distinguishing genuine ancient DNA from modern contamination. A single skin cell shed by a researcher, or a fragment of modern DNA carried on a tool or reagent, can overwhelm the endogenous signal in a low-preservation specimen. The field learned this lesson painfully in the 1990s, when several high-profile claims of extremely old DNA — from dinosaur bones, amber-preserved insects, and Miocene plant material — turned out on reanalysis to be contamination artifacts. The stringency of current protocols is a direct response to those failures.17

Authentication rests on several independent lines of evidence. The damage pattern — elevated C-to-T substitutions at the 5′ ends and G-to-A at the 3′ ends of fragments — is the most direct indicator of authentic ancient DNA, because modern contaminants do not carry this signature.17 Fragment length distribution provides a second check: authentic ancient DNA should consist overwhelmingly of fragments below 100 base pairs, following a characteristic exponential size distribution. A sample dominated by longer fragments is suspect. Third, in the case of human specimens, the fraction of reads that map to the human reference genome versus to environmental microbes — the endogenous rate — should be consistent across independent extractions from the same specimen; large variation between extractions suggests process-level contamination rather than authentic signal.
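
The first two sequence-level criteria can be expressed as a crude triage function. The thresholds below are illustrative placeholders, not published cutoffs:

```python
def authentication_flags(frag_lengths, terminal_ct_rate,
                         max_median_len=100, min_damage=0.05):
    """Crude triage of two authentication criteria: a short-fragment
    length profile and elevated C->T damage at the 5' terminus.
    Thresholds are illustrative assumptions only."""
    frag_lengths = sorted(frag_lengths)
    median_len = frag_lengths[len(frag_lengths) // 2]
    return {
        "short_fragments": median_len <= max_median_len,
        "terminal_damage": terminal_ct_rate >= min_damage,
    }

# A plausible ancient sample: short fragments, 22% terminal C->T damage.
flags = authentication_flags([38, 45, 52, 61, 74], terminal_ct_rate=0.22)
```

A real pipeline would additionally test the exponential shape of the length distribution and the consistency of endogenous rates across extractions, as described above.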

Contamination from living humans is addressed through several complementary strategies. Ancient DNA facilities maintain databases of genotypes of everyone who has handled specimens, allowing contaminating sequences to be identified and computationally excluded. For male specimens, the ratio of X to Y chromosome reads provides an independent contamination estimate: if a male skeleton shows more X chromosome reads than expected, the excess likely derives from female laboratory workers. Mitochondrial contamination rates can be estimated by building a consensus ancient mitochondrial haplotype and counting reads that diverge from it, since most modern contaminants will not share the exact mitochondrial haplotype of a prehistoric individual.17
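
The X-chromosome logic for a male specimen can be written as a toy two-component mixture; this simplification ignores mapping bias and any male-derived contamination:

```python
def female_contamination_from_x(x_autosome_cov_ratio):
    """For a male specimen, normalized X coverage (X depth divided by
    autosomal depth) is expected to be 0.5; contamination by female DNA
    pulls it toward 1.0.  Under a simple two-component mixture,
    observed_ratio = 0.5 * (1 - c) + 1.0 * c, so c = 2r - 1.
    Illustrative model only, not a published estimator."""
    c = 2.0 * x_autosome_cov_ratio - 1.0
    return min(1.0, max(0.0, c))   # clamp to a valid proportion

# An uncontaminated male gives ratio 0.5 -> estimate 0.0;
# a ratio of 0.55 implies roughly 10% female contamination.
```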

Population-genetic analysis of ancient genomes

Sequencing a genome is only the first step. Extracting historical inference from ancient genomes requires statistical methods designed to handle the particular features of ancient DNA data: low coverage, systematic damage errors, and the absence of phasing information (which of a diploid individual’s two chromosome copies carries each variant). The main analytical toolkit of modern paleogenomics was developed primarily by David Reich’s laboratory at Harvard, drawing on theoretical foundations from population genetics.14

The D-statistic (also called the ABBA-BABA test) detects gene flow between populations by counting shared derived alleles in a four-taxon comparison. If a test population shares more derived alleles with one of two reference populations than expected under a simple tree model, the asymmetry is taken as evidence of gene flow. The D-statistic was the tool used to demonstrate Neanderthal admixture in non-African modern humans in the 2010 Neanderthal genome paper, and it remains the standard first-pass test for admixture.3, 14 Its strength is that it requires only a single allele at each position per individual, making it robust to the pseudo-haploid data typically extracted from low-coverage ancient genomes by randomly sampling one read at each site.
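
A minimal pseudo-haploid D-statistic can be written out directly from the site-pattern definition; real implementations such as qpDstat also compute block-jackknife standard errors, omitted in this sketch:

```python
def d_statistic(p1, p2, p3, outgroup):
    """ABBA-BABA test on pseudo-haploid data: each argument is a list of
    single alleles, one per site.  Sites are polarized against the
    outgroup; ABBA sites reflect P2-P3 sharing, BABA sites P1-P3."""
    abba = baba = 0
    for a1, a2, a3, o in zip(p1, p2, p3, outgroup):
        d1, d2, d3 = (a1 != o), (a2 != o), (a3 != o)
        if not d3:
            continue                 # informative sites need derived P3
        if d2 and not d1:
            abba += 1                # pattern A B B A
        elif d1 and not d2:
            baba += 1                # pattern B A B A
    total = abba + baba
    return (abba - baba) / total if total else 0.0

# Toy data: P2 shares derived alleles with P3 more often than P1 does,
# giving a positive D, the signature of P3 -> P2 gene flow.
p1 = ["A", "A", "G", "A", "T"]
p2 = ["T", "T", "G", "A", "A"]
p3 = ["T", "T", "C", "A", "T"]
og = ["A", "A", "G", "A", "A"]
d = d_statistic(p1, p2, p3, og)   # 2 ABBA, 1 BABA -> D = 1/3
```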

A more general family of statistics, the f-statistics (f2, f3, f4), were formalised by Nick Patterson and David Reich in a 2012 paper that laid out a comprehensive framework for testing and quantifying population relationships using allele-frequency correlations.14 The f3-statistic tests whether a target population is the product of admixture between two source populations: a significantly negative f3 value is statistical evidence of mixture. The f4-statistic is formally equivalent to the D-statistic but is computed on allele frequencies and is more sensitive when large numbers of individuals are available. These statistics make no assumptions about population topology and can be applied to genome-wide SNP data from any combination of ancient and modern individuals.
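
The core of the admixture f3 test is a one-line average over sites. The sketch below omits the correction term for sampling noise in the target that the published estimator includes:

```python
def f3(target, source_a, source_b):
    """Naive f3(target; A, B): the mean over sites of (t - a)(t - b),
    where t, a, b are allele frequencies in the three populations.
    A clearly negative value indicates the target is admixed between
    A-like and B-like sources.  (Patterson et al.'s unbiased estimator
    subtracts a heterozygosity correction, omitted here.)"""
    terms = [(t - a) * (t - b)
             for t, a, b in zip(target, source_a, source_b)]
    return sum(terms) / len(terms)

# A target whose frequencies sit between two diverged sources gives f3 < 0:
src_a = [0.9, 0.8, 0.95, 0.85]
src_b = [0.1, 0.2, 0.05, 0.15]
mixed = [0.5, 0.5, 0.5, 0.5]
value = f3(mixed, src_a, src_b)   # negative: evidence of admixture
```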

For questions of admixture proportion — estimating what fraction of a target population’s ancestry derives from each of a set of sources — the method qpAdm uses a set of reference populations to simultaneously estimate mixture weights while testing whether the model is consistent with the data.15 qpAdm is the workhorse of large-scale ancient population history projects, applied in hundreds of studies to reconstruct the ancestry of prehistoric Europeans, South Asians, Native Americans, and African populations. Its results are presented as proportions of ancestry from each modelled source, with confidence intervals derived from block jackknife resampling over the genome.
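
The idea behind qpAdm's weight estimation can be illustrated for the special case of exactly two sources, where the least-squares fit of the target's f-statistic vector to a mixture of the sources' vectors has a closed form. This is a toy model, not the published algorithm, which handles arbitrary numbers of sources, model-consistency rank tests, and jackknife confidence intervals:

```python
def two_source_weight(target_f4, source1_f4, source2_f4):
    """Find the weight w minimizing the squared difference between the
    target's vector of f4-statistics (against a panel of reference
    populations) and w*source1 + (1-w)*source2.  With two sources the
    least-squares solution is closed-form."""
    num = sum((t - b) * (a - b)
              for t, a, b in zip(target_f4, source1_f4, source2_f4))
    den = sum((a - b) ** 2
              for a, b in zip(source1_f4, source2_f4))
    return num / den

# Synthetic check: a target constructed as a 30/70 mix of the sources'
# f4 vectors should return a weight near 0.3 for source 1.
s1 = [0.02, -0.01, 0.05, 0.00]
s2 = [-0.03, 0.04, -0.02, 0.01]
tgt = [0.3 * a + 0.7 * b for a, b in zip(s1, s2)]
w = two_source_weight(tgt, s1, s2)
```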

The most complex analyses use admixture graphs — explicit models specifying the full population history as a graph of splits and mixture events, with mixture proportions and genetic drift parameters fitted to minimise the discrepancy between observed and expected f-statistics.13 The software package TreeMix and the graph-fitting routines in the ADMIXTOOLS suite implement versions of this approach. Admixture graphs can in principle capture any population history, but they become unidentifiable as complexity increases, and the space of possible graphs is too large to search exhaustively, so results depend heavily on the starting topology assumed by the researcher.

Landmark achievements

The methods described above have produced a series of discoveries that have fundamentally redrawn the map of human prehistory. The Neanderthal genome project, culminating in a high-coverage genome published in 2014, demonstrated that all non-African modern humans carry approximately one to two percent Neanderthal-derived DNA, the legacy of interbreeding after the out-of-Africa dispersal, and revealed that Neanderthals had extremely low genetic diversity — lower than any living human population — consistent with a long history of small, isolated populations.4

Perhaps the most striking demonstration of what paleogenomics can recover from minimal material was the identification of the Denisovans. The high-coverage Denisovan genome was assembled from DNA extracted from a single distal finger phalanx no larger than a fingernail. The bone yielded sufficient endogenous DNA to sequence the genome to 30-fold coverage — quality comparable to a modern clinical genome — despite being more than 50,000 years old. From this single fragment, researchers determined that the Denisovans were a sister group to Neanderthals, that they had contributed roughly 4–6 percent of the ancestry of present-day Melanesians and Aboriginal Australians, and that their population history included a deep divergence from a superarchaic source.6

The deepest nuclear genome yet recovered from a hominin comes from the Sima de los Huesos site in northern Spain, a narrow shaft in the Atapuerca cave system that has yielded the remains of at least 28 individuals dating to roughly 430,000 years ago. Mitochondrial DNA from these specimens, published in 2013, showed an unexpected affinity with Denisovans rather than Neanderthals, posing a puzzle about early hominin relationships. In 2016, Meyer and colleagues succeeded in recovering nuclear DNA from the same specimens using single-stranded library preparation and targeted enrichment.7 The nuclear genome placed the Sima de los Huesos hominins as early Neanderthals — consistent with their morphology — and suggested that the Neanderthal-Denisovan split occurred before 430,000 years ago, making this the oldest nuclear hominin genome then sequenced by a factor of more than four.

Large-scale ancient genomics has also revolutionised the understanding of prehistoric population movements within Homo sapiens. Studies of hundreds and then thousands of ancient European genomes have documented three major episodes of population replacement or mixture: the initial arrival of anatomically modern humans replacing or assimilating earlier populations, the spread of Anatolian farmers into Europe beginning around 9,000 years ago who largely replaced the indigenous hunter-gatherers, and a massive migration of steppe pastoralists from the Pontic-Caspian steppe beginning around 4,500 years ago, who themselves carried large proportions of Eastern European hunter-gatherer and Caucasus-related ancestry and are now thought to have been the primary vehicle for the spread of Indo-European languages into Europe.19

Limitations and future directions

Despite its achievements, paleogenomics operates under constraints that cannot be overcome by improved methodology alone. The most fundamental is thermodynamic: DNA degradation is a chemical process driven by temperature, and no extraction or sequencing technology can recover information that has been chemically destroyed. In practice, this limits ancient DNA research primarily to cold environments. The record for oldest recovered ancient DNA belongs not to a hominin but to environmental DNA — plant, animal, and microbial DNA preserved in Greenland permafrost sediments — dated to approximately two million years ago.18 For hominins, the oldest securely authenticated nuclear genome remains the Sima de los Huesos specimen at around 430,000 years, and it is uncertain whether nuclear DNA will ever be recoverable from the older hominin fossils of Africa and Asia that document the deeper history of the genus Homo.7, 18

A second limitation is the geography of preservation. Most of the key events in human evolution occurred in sub-Saharan Africa, a region where tropical temperatures and humid conditions are maximally destructive to ancient DNA. The entirety of the period from the emergence of Homo sapiens around 300,000 years ago to the dispersal out of Africa around 60,000–70,000 years ago is essentially inaccessible to direct genomic interrogation. Researchers must instead infer this period from the genomes of living Africans and from the few ancient specimens preserved in more favorable environments — for instance, ancient Moroccan specimens that provided some of the earliest ancient African genomic data and revealed unexpected Eurasian-related back-migration into North Africa.16

A third limitation concerns the interpretation of statistical methods. Population-genetic statistics such as the D-statistic and f-statistics detect gene flow but cannot always distinguish between alternative historical models that produce identical expected values. An asymmetric D-statistic can result from direct admixture between the populations being tested, from admixture from an unsampled third population, or from ancestral population structure predating the split of the populations being compared. Distinguishing among these possibilities requires careful model testing and, ideally, the inclusion of relevant ancient or extant reference genomes that can serve as proxies for the unsampled populations. As the number of sequenced ancient genomes grows into the thousands and tens of thousands, the statistical power to resolve these ambiguities increases, but the interpretive challenges remain substantial.14

Looking forward, improvements in sequencing technology — particularly long-read platforms that can read fragments of hundreds of base pairs in a single pass — promise to recover haplotype information from ancient specimens, reconstructing which variants were located on the same chromosome and enabling finer-grained inference of population history. Advances in ancient proteomics, which uses mass spectrometry to sequence fossil proteins surviving in contexts where DNA has long since degraded, offer a complementary approach that can reach temporal depths of several million years. The integration of ancient genomics with archaeological dating, stable isotope analysis, and the morphological study of the fossil record increasingly allows inference about ancient human biology that none of these methods could achieve in isolation.3, 12

References

1. Higuchi, R. et al. Quagga DNA: Cloning and sequencing studies of an extinct equid. Nature 312: 282–284, 1984.

2. Pääbo, S. Molecular cloning of ancient Egyptian mummy DNA. Nature 314: 644–645, 1985.

3. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328: 710–722, 2010.

4. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505: 43–49, 2014.

5. Reich, D. et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468: 1053–1060, 2010.

6. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338: 222–226, 2012.

7. Meyer, M. et al. Nuclear DNA sequences from the Middle Pleistocene Sima de los Huesos hominins. Nature 531: 504–507, 2016.

8. Dabney, J. et al. Improved methods for ancient DNA extraction from skeletal remains. BioTechniques 59: 87–93, 2015.

9. Pinhasi, R. et al. Petrous bone as a source of ancient DNA. PLOS ONE 10: e0129102, 2015.

10. Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. PNAS 110: 15758–15763, 2013.

11. Schuenemann, V. J. et al. Targeted enrichment of ancient pathogens yielding the pPCP1 plasmid of Yersinia pestis from victims of the Black Death. PNAS 108: E746–E752, 2011.

12. Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513: 409–413, 2014.

13. Reich, D. et al. Reconstructing Indian population history. Nature 461: 489–494, 2009.

14. Patterson, N. et al. Ancient admixture in human history. Genetics 192: 1065–1093, 2012.

15. Haak, W. et al. Modeling the formation of human populations based on genome-wide patterns of variation. PNAS 113: 9051–9056, 2016.

16. Fregel, R. et al. Ancient DNA from the Green Sahara reveals ancestry of early Holocene North Africans. Science Advances 4: eaao6266, 2018.

17. Prüfer, K. Identification and authentication of ancient DNA: current challenges and approaches. F1000Research 7: F1000 Faculty Rev-82, 2018.

18. Kjær, K. H. et al. Ancient environmental genomics of permafrost reveals a 2-million-year-old ecosystem. Nature 612: 283–291, 2022.

19. Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522: 207–211, 2015.