Introduction

Genetic disease is a necessary product of evolution (Box 1). Fundamental biological systems, such as DNA replication, transcription and translation, evolved very early in the history of life. Although these ancient evolutionary innovations gave rise to cellular life, they also created the potential for disease. Subsequent innovations along life’s long evolutionary history have similarly enabled both adaptation and the potential for dysfunction. Against this ancient background, young genetic variants specific to the human lineage interact with modern environments to produce human disease phenotypes. Consequently, the substrates for genetic disease in modern humans are often far older than the human lineage itself, but the genetic variants that cause them are usually unique to humans.

The advent of high-throughput genomic technologies has enabled the sequencing of the genomes of diverse species from across the tree of life1. Analysis of these genomes has, in turn, revealed the striking conservation of many of the molecular pathways that underlie the function of biological systems that are essential for cellular life2. The same technologies have also spearheaded a revolution in human genomics3; currently, more than 120,000 individual whole human genome sequences are publicly available, and genome-scale data from hundreds of thousands more have been generated by consumer genomics companies4. Huge nationwide biobanks are also characterizing the genotypes and phenotypes of millions of people from around the world5,6,7. These studies are radically changing our understanding of the genetic architecture of disease8. It is also now possible to extract and sequence ancient DNA from remains of organisms that are thousands of years old, enabling scientists to reconstruct the history of recent human adaptation with unprecedented resolution9,10. These breakthroughs have revealed the recent, often complicated, history of our species and how it influences the genetic architecture of disease8,11. With the expansion of clinical whole-genome sequencing and personalized medicine, the influence of our evolutionary past and its implications for understanding human disease can no longer remain overlooked by medical practice; evolutionary perspectives must inform medicine12,13.

Much like a family’s medical history over generations, the genome is fundamentally a historical record. Decoding the evolution of the human genome provides valuable context for interpreting and modelling disease. This context is not limited to recent human evolution but also includes more ancient events that span life’s history. In this Review, we trace the 4 billion-year interplay between evolution and disease by illustrating how innovations during the course of life’s history have established the potential for, and inevitability of, disease. Beginning with events in the very deep past, where most genes and pathways involved in human disease originate, we explain how ancient biological systems, recent genetic variants and dynamic environments interact to produce both adaptation and disease risk in human populations. Given this scope, we cannot provide a comprehensive account of all evolutionary events relevant to human disease. Instead, our goal is to illustrate through examples the relevance of both deep and recent evolution to the study and treatment of genetic disease. Many of these key insights stem from recent discoveries, which have yet to be integrated into the broader canvas of evolutionary biomedicine (Box 2).

Macroevolutionary imprints on human disease

Systems involved in disease have ancient origins

Many of cellular life’s essential biological systems and processes, such as DNA replication, transcription and translation, represent ancient evolutionary innovations shared by all living organisms. Although essential, each of these ancient innovations generated the conditions for modern disease (Fig. 1). In this section, we provide examples of how several ancient innovations have created substrates for dysfunction and disease, and how considering these histories contributes to understanding the biology of disease and extrapolating results from model systems to humans.

Fig. 1: Evolutionary events in both the deep evolutionary past and recent human evolution shape the potential for disease.

A timeline of evolutionary events (top) in the deep evolutionary past and on the human lineage that are relevant to patterns of human disease risk (bottom). The ancient innovations on this timeline (left) formed biological systems that are essential, but are also foundations for disease. During recent human evolution (right), the development of new traits and recent rapid demographic and environmental changes have created the potential for mismatches between genotypes and modern environments that can cause disease. The timeline is schematic and not shown to scale. bya, billion years ago; kya, thousand years ago; mya, million years ago.

As a foundational (if obvious) example, the origin of self-replicating molecules 4 billion years ago formed the basis of life, but also the root of genetic diseases12,14,15. Similarly, asymmetric cell division may have evolved as an efficient way to handle cellular damage, but it also established the basis for ageing in multicellular organisms16,17. Myriad age-related diseases in humans, and many other multicellular organisms, are a manifestation of this first evolutionary trade-off.

The evolution of multicellularity, which has occurred many times across the tree of life, illustrates the interplay between evolutionary innovation and disease18. The origin of multicellularity enabled complex body plans with trillions of cells, involving innovations associated with the ability of cells to regulate their cell cycles, modulate their growth and form intricate networks of communication. But multicellularity also established the foundation for cancer19,20. Genes that regulate cell cycle control are often divided into two groups: caretakers and gatekeepers21,22. The caretakers are involved in basic control of the cell cycle and DNA repair, and mutations in these genes often lead to increased mutation rates or genomic instability, both of which increase cancer risk. Caretaker genes are enriched for functions with origins dating back to the first cells23. The gatekeepers appeared later, at the genesis of metazoan multicellularity23. The gatekeepers are directly linked to tumorigenesis through their roles in regulating cell growth, death and communication. The progression of individual tumours in a given patient is likewise informed by an evolutionary perspective. Designing treatments that account for the evolution of drug resistance and heterogeneity in tumours is a tenet of modern cancer therapy24,25,26,27,28,29.

Like multicellularity, the evolution of immune systems also set the stage for dysregulation and disease. Mammalian innate and adaptive immune systems are both ancient. Components of the innate immune system are present across metazoans and even some plants30,31, whereas the adaptive immune system is present across jawed vertebrates32. These systems provided molecular mechanisms for self-/non-self-recognition and response to pathogens, but they evolved in a piecemeal fashion, using many different, pre-existing genes and processes. For example, co-option of endogenous retroviruses provided novel regulatory elements for interferon response33. It is also clear that the human immune system has co-evolved with parasites, such as helminths, over millions of years. Helminth infection both induces and modulates an immune response in humans34.

Evolutionary analyses of development have revealed that new anatomical structures often arise by co-opting existing structures and molecular pathways that were established earlier in the history of life. For example, animal eyes, limb structure in tetrapods and pregnancy in mammals (Box 3) each evolved by adapting and integrating ancient genes and regulatory circuits in new ways35,36,37,38. This integration of novel traits into the existing network of biological systems gives rise to links between diverse traits via the shared genes that underlie their development and function36. As a result, many genes are pleiotropic — they have effects on multiple, seemingly unrelated, traits. We do not have space here to cover the full evolutionary scope of these innovations and their legacies, but just as in each of the cases described above, innovations and adaptations spanning from the origin of metazoans to modern human populations shape the substrate upon which disease appears.

Medical implications

Although ancient macroevolutionary innovations may seem far removed from modern human phenotypes, their imprint remains on the human body and genome. Understanding the constraints they impose can provide insight into mechanisms of disease.

Mapping the origins and evolution of traits and identifying the genetic networks that underlie them are critical to the accurate selection of model systems and extrapolation to human populations. Failure to consider the evolutionary history of homologous systems, their phylogenetic relationships and their functional contexts in different organisms can lead to inaccurate generalization. Instead, when considering a model system, key evolutionary questions about both the organism and the trait of interest can indicate how translatable the research will be to humans39,40. For example, is the similarity between the trait in humans and the trait in the model system due to shared ancestry, that is, homology? The presence of homology in a human gene or system of study suggests potential as a model system; however, homology alone is not sufficient justification. Environmental and life history factors shape traits, and divergence between species complicates the simple assumption that homology provides genetic or mechanistic similarity. Thus, homology must be supplemented by understanding of whether the evolutionary divergence between humans and the proposed model led to functional divergence. For example, the rapid evolution of the placenta and variation in reproductive strategy across mammals have made it challenging to extrapolate results about the regulation of birth timing from model organisms, such as mouse, to humans (Box 3). More broadly, differences in genetic networks that underlie the development of homologous traits across mammals explain why the majority of successful animal trials fail to translate to human clinical trials41,42. Molecular mechanisms of ancient systems, such as DNA replication, can be studied using phylogenetically distant species; however, ‘humanizing’ these models to research human-specific aspects of traits may not be possible and comparative studies of closely related species may be required40.

Although evolutionary divergence in homologous traits is an impediment to the direct translation of findings from a model system to humans, understanding how these evolutionary differences came about can also yield insights into disease mechanisms. For example, intuition would suggest that large animals (many cells and cell divisions) with long lifespans (many ageing cells), such as elephants and whales, would be at increased risk for developing cancer. However, size and lifespan are not significantly correlated with cancer risk across species; despite their large size, elephants and whales do not have a higher risk of developing cancer43,44. Why is this so? Recent studies of the evolution of genes involved in the DNA damage response in elephants have revealed mechanisms that may contribute to cancer resistance. An ancient leukaemia inhibiting factor pseudogene (LIF6) regained its function in the ancestor of modern elephants. This gene works in conjunction with the tumour suppressor gene TP53, which has increased in copy number in elephants, to reduce elephants’ risk for cancer despite their large body size45,46. This illustrates a basic life history trade-off: selection has created mechanisms for cancer suppression and somatic maintenance in large vertebrates that are not needed in small short-lived vertebrates. Studying such seeming paradoxes, especially those with clear contrasts to human disease risk, will shed light on broader disease mechanisms and suggest targets for functional interventions with translation potential.
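The intuition behind this paradox can be made concrete with a deliberately naive calculation. The sketch below assumes that every cell division carries the same small, independent probability of a transforming event, so that lifetime risk is 1 - (1 - p)^(N·d) for N cells each dividing d times. The function name, parameter names and the numbers in the usage note are illustrative assumptions, not values from the studies cited above.

```python
import math


def naive_lifetime_cancer_risk(n_cells: float,
                               divisions_per_cell: float,
                               p_transform: float) -> float:
    """Probability of at least one transforming event under a naive model.

    Assumes each of n_cells * divisions_per_cell divisions independently
    transforms with probability p_transform, and that no suppression
    mechanism scales with body size: P = 1 - (1 - p)^(N * d).
    """
    if not 0.0 <= p_transform < 1.0:
        raise ValueError("p_transform must be in [0, 1)")
    if n_cells < 0 or divisions_per_cell < 0:
        raise ValueError("cell and division counts must be non-negative")
    trials = n_cells * divisions_per_cell
    # log1p/expm1 keep the result accurate when p_transform is tiny.
    return -math.expm1(trials * math.log1p(-p_transform))
```

Under these assumptions, a 1,000-fold increase in cell number (for example, `n_cells=1e12` versus `n_cells=1e9` with the same `divisions_per_cell` and `p_transform`) raises the predicted risk sharply. The observation that large, long-lived species do not show such an increase is what points to additional, lineage-specific suppression mechanisms.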

Human-specific evolution

Human adaptation, trade-offs and disease

The macroevolutionary events described above created the foundation of genetic disease, but considering the more recent changes that occurred during the evolutionary history of the human lineage is necessary to illuminate the full context of human disease. Comparisons between humans and their closest living primate relatives, such as chimpanzees, have revealed diseases that either do not appear in other species or take very different courses47. We are beginning to understand the genetic differences underlying some of these human-specific conditions, with particular insights into infectious diseases.

The last common ancestors of humans and chimpanzees underwent a complex speciation event that is likely to have involved multiple rounds of gene flow between ~12 and 6 million years ago (mya)48. Over the millions of years after this divergence, climatic, demographic and social pressures drove the evolution of many physical and behavioural traits unique to the human lineage, including bipedalism (~7 mya), lack of body hair (~2–3 mya) and larger brain volume relative to body size (~2 mya)12,47. These traits evolved in a diverse array of hominin groups, mainly in Africa, although some of these species, such as Homo erectus, ventured into Europe and Asia.

These human adaptations developed on the substrate of tightly integrated systems shaped by billions of years of evolution, and thus beneficial adaptations with respect to one system often incurred trade-offs in the form of costs on other linked systems49. The trade-off concept derives from a branch of evolutionary biology known as life history theory. It is based on the observation that organisms contain combinations of traits that cannot be simultaneously optimized by natural selection50,51. For example, many fitness-related traits draw on common energetic reserves, and investment in one comes at the expense of another52. Large body size may improve survival in certain environments, but it comes at the expense of longer development and lower numerical investment in reproduction.

The trade-off concept is clinically relevant because it dispenses with the notion of a single ‘optimal’ phenotype or fitness state for an individual49,53,54. Given the interconnected deep evolution of the human body, many diseases are tightly linked, in the sense that decreasing the risk for one increases the risk for the other. Such diametric diseases and the trade-offs that produce them are the starkest when there is competition within the body for limited resources; for example, energy used for reproduction cannot be used for growth, immune function or other energy-consuming survival processes54. The molecular basis for diametric diseases often results from antagonistic pleiotropy at the genetic level, in which a variant has contrasting effects on multiple bodily systems. In extreme cases, some diseases that manifest well after reproductive age, for example, Alzheimer disease, have been less visible to selection and, thus, potentially more susceptible to trade-offs. Cancer and neurodegenerative disorders also exhibit this diametric pattern, where cancer risk is inversely associated with Alzheimer disease, Parkinson disease and Huntington disease. This association is hypothesized to be mediated by differences in neuronal energy use and trade-offs in cell proliferation and apoptosis pathways49. Similarly, osteoarthritis (breakdown of cartilage in joints often accompanied by high bone mineral density) and osteoporosis (low bone mineral density) rarely co-occur. Their diametric pattern reflects, at least in part, differences across individuals in the probability that mesenchymal stem cells within bone marrow develop into osteoblasts versus non-bone cells such as adipocytes49,55. In another example, a history of selection for a robust immune response can now lead to an increased risk for autoimmune and inflammatory diseases, especially when coupled with new environmental mismatches49,54. Other examples of trade-offs are found throughout the human body, manifesting in risk for diverse diseases, including psychiatric and rheumatoid disorders49,56.

Just as adaptations in deep evolutionary time created new substrates for disease, evolutionary pressures exerted on the human lineage established the foundation for complex cognitive capabilities, but they also established the potential for many neuropsychiatric or neurodevelopmental diseases. For example, genomic structural variants enabled functional innovation in the brain through the emergence of novel genes57,58,59,60. Many human-specific segmental duplications influence genes that are essential to the development of the human brain, such as SRGAP2C and ARHGAP11B. Both of these genes function in cortical development and may be involved in the expansion of human brain size61,62,63. The human-specific NOTCH2NL is also hypothesized to have evolved from a partial duplication event, and is implicated in increased output during human corticogenesis, another potential key contributor to human brain size59,60. Although these structural variants were probably adaptive58, they may have also predisposed humans to neuropsychiatric diseases and developmental disorders. Copy number variation in the region flanking ARHGAP11B, specifically a microdeletion at 15q13.3, is associated with risk for intellectual disability, autism spectrum disorder (ASD), schizophrenia and epilepsy58,64. Duplications and deletions of NOTCH2NL and surrounding regions are implicated in macrocephaly and ASD or microcephaly and schizophrenia, respectively59. These trade-offs also play out at the protein domain level. For example, the Olduvai domain (previously known as DUF1220) is a 1.4-kb sequence that appears in ~300 copies in the human genome; this domain has experienced a large human-specific increase in copy number. These domains appear in tandem arrays in neuroblastoma breakpoint family (NBPF) genes, and have been associated with both increased brain size and neuropsychiatric diseases, including autism and schizophrenia65. These examples suggest that the genomic organization of these human-specific duplications may have enabled human-specific changes in brain development while also increasing the likelihood of detrimental rearrangements that cause human disease59,64. Furthermore, genomic regions associated with neuropsychiatric diseases have experienced human-specific accelerated evolution and recent positive selection, providing additional evidence for the role of recent evolutionary pressures on human disease risk66,67. Schizophrenia-associated loci, for example, are enriched near human accelerated regions (HARs) that are conserved in non-human primates68. Variation in HARs has also been associated with risk for ASD, possibly through perturbations of gene regulatory architecture69.

Human immune systems have adapted in response to changes in environment and lifestyles over the past few million years; however, the rapid evolution of the immune system may have left humans vulnerable to certain diseases, such as HIV-1 infection. A similar virus, simian immunodeficiency virus (SIV), is found in chimpanzees and other primates, and studies in the early 2000s found evidence of AIDS-like symptoms (primarily a reduction in CD4+ T cells) in chimpanzees infected with SIV. Although the effects of SIV in chimpanzees mirror some of the effects of HIV in humans70, captive chimpanzees infected with HIV-1 do not typically develop AIDS and have better clinical outcomes. The differences in outcome are influenced by human-specific immune evolution. For example, humans have lost expression of several Siglecs, cell surface proteins that bind sialic acids, in T lymphocytes compared with great apes71. Consistent with this loss contributing to worse outcomes, human T cells with high Siglec-5 expression survive longer after HIV-1 infection72. Moreover, there is a possible role for the rapidly evolving Siglecs in other diseases, such as epithelial cancers, that differentially affect humans relative to closely related primates73,74.

Another human-specific immune change is the deletion of an exon of CMP-N-acetylneuraminic acid hydroxylase (CMAH) leading to a difference in human cell surface sialoglycans compared with other great apes75,76,77. The change in human sialic acid to an N-acetylneuraminic acid (Neu5Ac) termination, rather than N-glycolylneuraminic acid (Neu5Gc), may have been driven by pressure to escape infection by Plasmodium reichenowi, a parasite that binds Neu5Gc and causes malaria in chimpanzees. Conversely, the prevalence of Neu5Ac probably made humans more susceptible to infection by the malaria parasite Plasmodium falciparum, which binds to Neu5Ac78,79, and another human-specific pathology: typhoid fever80. Typhoid toxin binds specifically and is cytotoxic to cells expressing Neu5Ac glycans. Thus, the deletion of CMAH was likely to have been selected for by pressure from pathogens, but has in turn enabled other human-specific diseases such as malaria and typhoid fever81. The rapid evolution of the human immune system creates the potential for human-specific disease. As a result, human-specific variation in many other human immune genes influences human-specific disease risk82,83.

Medical implications

These examples from recent human evolution highlight the ongoing interplay of genetic variation, adaptation and disease. Understanding the evolutionary history of traits along with the aetiology of related diseases can help identify and evaluate risks for unintended consequences of treatments due to trade-offs. For example, ovarian steroids have pleiotropic effects stimulating both bone growth and mitosis in breast tissues to mobilize calcium stores during lactation54. However, later in life this link gives rise to a clinical trade-off. Hormone replacement therapy in postmenopausal women reduces the risk for osteoporosis and ovarian cancer, but also, as a result of its effects on breast tissue, increases the risk for breast cancer. Given the commonality of the trade-off between maintenance and proliferation, this is just one of many examples of cancer risk emerging as a result of trade-offs in immune, reproductive and metabolic systems56,84. Pregnancy is also rife with clinically relevant trade-offs given the interaction between multiple individuals and genomes (mother, father and fetus) with different objectives (Box 3). Trade-offs at the cellular level also have medical implications. For example, cellular senescence is a necessary and beneficial part of many basic bodily responses, but the accumulation of senescent cells underlies many ageing-related disorders. Thus, individuals with different solutions to this trade-off may have very different ‘molecular’ versus ‘chronological’ ages85.

Identifying such trade-offs by studying disease and treatment response is of great interest, but is challenging for several reasons: the number of possible combinations of traits to consider is large; many humans must have experienced the negative effects; and data must be available on both traits in the same individuals. Here, evolution paired with massive electronic health record (EHR)-linked biobanks5,86,87 provides a possible solution. By considering the evolutionary context and potential linkages between traits, the search space of possible trade-offs can be constrained. Then, diametric traits can be tested for among individuals in the EHRs by performing phenome-wide association studies (PheWAS) either on traits or genetic loci of interest and looking for inverse relationships88. The mechanisms underlying the observed associations could then be evaluated in model systems and, if validated, anticipated in future human treatments.
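As a minimal sketch of the screening step described above, the following code tests whether two diagnoses co-occur less often than expected in a cohort by computing an odds ratio with a Wald confidence interval from a 2×2 table. The function names, the record layout and the decision rule are illustrative assumptions. A real PheWAS would typically use regression models adjusted for covariates such as age, sex and ancestry, and would correct for the number of trait pairs tested.

```python
import math
from typing import Iterable, Tuple


def odds_ratio_with_ci(records: Iterable[Tuple[bool, bool]],
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Return (odds_ratio, ci_low, ci_high) for co-occurrence of two traits.

    Each record is (has_trait_a, has_trait_b) for one individual.
    A 0.5 continuity correction is added to every cell so that an
    empty cell does not cause a division by zero.
    """
    both = only_a = only_b = neither = 0
    for has_a, has_b in records:
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    n11, n10, n01, n00 = (c + 0.5 for c in (both, only_a, only_b, neither))
    log_or = math.log((n11 * n00) / (n10 * n01))
    se = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


def looks_diametric(records: Iterable[Tuple[bool, bool]],
                    z: float = 1.96) -> bool:
    """Flag a trait pair as a candidate when the whole interval lies below 1."""
    _, _, ci_high = odds_ratio_with_ci(records, z)
    return ci_high < 1.0
```

For a hypothetical cohort of 1,000 individuals in which 200 have each diagnosis but only 10 have both, the odds ratio falls well below 1 and the pair would be flagged as a candidate for follow-up. A flag of this kind is a starting point for mechanistic study, not evidence of a trade-off by itself.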

In addition to trade-offs, evolutionary analyses can help us identify therapeutic targets for uniquely human diseases. A small subset of humans infected with HIV never progress to AIDS, a resistance phenotype that has been generally attributed to host genomics89,90,91. Identifying and understanding the genes that contribute to non-progression is of great interest in the development of vaccines and treatments for HIV infection. Genome-wide association studies (GWAS) and functional studies have supported the role of the MHC class I region, specifically the HLA-B*27/B*57 molecules, in HIV non-progression92,93,94. Comparative genomics with chimpanzees identified a chimpanzee MHC class I molecule functionally analogous to that of the non-progressors that contains amino acid substitutions that change binding affinity for conserved areas of the HIV-1 and SIV viruses. Evolutionary analysis of this region suggests that these substitutions are the result of an ancient selective sweep in chimpanzee genomes that did not occur in humans95. This analysis not only helps us understand how humans are uniquely susceptible to HIV progression but also highlights functional variants in the MHC that are potential targets of medical intervention.

Recent human demographic history

Most genetic variants are young, but have diverse histories

The complex demographic history of modern humans in the past 200,000 years has created differences in the genetic architecture of and risk for specific diseases among human populations. With genomic sequences of thousands of humans from diverse locations, we can compare genetic information over time and geography to better understand the origins and evolution of both individual genetic variants and human populations96,97,98. The vast majority of human genetic variants are not shared with other species99. Demographic events such as bottlenecks, introgression and population expansion shaped the genetic composition of human populations, whereas rapid introduction of humans into new environments and the subsequent adaptations created potential for evolutionary mismatches (Figs 2,3).

Fig. 2: Recent adaptation has produced evolutionary trade-offs that lead to disease in some environments.

Representative genes that have experienced local adaptive evolution over the past 100,000 years as humans moved across the globe. We focus on adaptations that also produced the potential for disease due to trade-offs or mismatches with modern environments. For each, we list the evolutionary pressure, the trait(s) influenced and the associated disease(s). The approximate regions where the adaptations occurred are indicated by blue circles. Arrows represent the expansion of human populations, and purple shading represents introgression events with archaic hominins. Supplementary Table S1 presents more details and references. COVID-19, coronavirus disease 2019; G6PD, glucose-6-phosphate dehydrogenase; UV, ultraviolet.

Fig. 3: Effects of recent demographic events in human history on genetic mechanisms underlying disease.

Ancient human migrations, introgression events with other archaic hominins and recent population expansions have all contributed to the introduction of variants associated with human disease. Schematic of human evolutionary history, where the branches represent different human populations and the branch widths represent population size (top left). Letter labels refer to the processes illustrated in parts a–d. a | Human populations migrating out of Africa maintained only a subset of genetic diversity present in African populations. The resulting out-of-Africa bottleneck is likely to have increased the fraction of deleterious, disease-associated variants in non-African populations. Coloured circles represent different genetic variants. Circles marked with X denote deleterious, disease-associated variants. b | When anatomically modern humans left Africa, they encountered other archaic hominin populations. Haplotypes introduced by archaic introgression events (illustrated in grey) contained Neanderthal-derived variants (denoted by red circles) associated with increased disease risk in modern populations. c | In the last 10,000 years, the burden of rare disease-associated variants (denoted by yellow circles) has increased due to rapid population expansion. d | Modern human individuals with admixture in their recent ancestry, such as African Americans, can have differences in genetic risk for disease, because of each individual’s unique mix of genomic regions with African and European evolutionary ancestry. For example, each of the three admixed individuals depicted has the same proportions of African and European ancestry, but not all of them carry the disease-associated variant found at higher frequency in European populations (illustrated by yellow circles). Summarizing clinical risk for a patient requires a higher resolution view of evolutionary ancestry along the genome and improved representation of genetic variation from diverse human populations.

Approximately 200,000 years ago, ‘anatomically modern humans’ (AMHs) first appeared in Africa. This group had the key physical characteristics of modern human groups and exhibited unique behavioural and cognitive abilities that enabled rapid improvements in tool development, art and material culture. Approximately 100,000 years ago, AMH groups began to migrate out of Africa. The populations ancestral to all modern Eurasians are likely to have left Africa tens of thousands of years later98, but quickly spread across Eurasia. Expansions into the Americas and further bottlenecks are thought to have occurred between 35,000 and 15,000 years ago. The details and uncertainties surrounding these origin and migration events are more extensively reviewed elsewhere98.

Populations that experience bottlenecks and founder effects have a higher mutation load than populations that do not, largely due to their lower effective population sizes reducing the efficacy of selection100 (Fig. 3a). During this dispersal, the migrant human populations harboured less genetic variation than was present in Africa. The reduction in diversity caused by the out-of-Africa and subsequent bottlenecks shaped the genetic landscape of all populations outside Africa.
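The loss of diversity during a bottleneck can be illustrated with the standard neutral expectation for a diploid Wright–Fisher population, in which expected heterozygosity decays each generation by a factor of 1 - 1/(2Ne). The sketch below applies that formula across successive demographic phases. The function names and any population sizes or generation counts passed to them are illustrative assumptions, not estimates of human demographic history, and mutation and selection are ignored.

```python
from typing import Iterable, Tuple


def expected_heterozygosity(h0: float, ne: float, generations: int) -> float:
    """Expected heterozygosity after drift at a constant effective size.

    Neutral diploid Wright-Fisher result: H_t = H_0 * (1 - 1/(2*Ne))**t.
    """
    if ne <= 0:
        raise ValueError("ne must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


def through_phases(h0: float, phases: Iterable[Tuple[float, int]]) -> float:
    """Apply successive (Ne, generations) phases, e.g. large, small, large."""
    h = h0
    for ne, generations in phases:
        h = expected_heterozygosity(h, ne, generations)
    return h
```

Comparing, for example, `through_phases(0.5, [(10000, 100)])` with `through_phases(0.5, [(100, 100)])` shows that the same number of generations removes far more diversity at the smaller effective size. Diversity lost in a bottleneck phase is not restored by a later expansion in this model, because new variation can enter only through mutation or gene flow.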

AMHs did not live in isolation after migration out of Africa. Instead, there is evidence of multiple admixture events with other archaic hominin groups, namely Neanderthals and Denisovans101,102. Modern non-African populations derive approximately 2% of their ancestry from Neanderthals, with some Asian populations having an even higher proportion of archaic hominin ancestry (Fig. 3b). African populations have only a small amount of Neanderthal and Denisovan ancestry, largely from back migration from European populations with archaic ancestry103. However, there is evidence of admixture with other, as yet unknown, archaic hominins in the genomes of modern African populations104,105,106.

Following their expansion around the globe, humans have experienced explosive growth over the past 10,000 years, in particular in modern Eurasian populations107,108 (Fig. 3c). Growth in population size modifies the genetic architecture of traits by increasing the efficacy of selection and generating many more low-frequency genetic variants. Although the impact of rare alleles is not completely understood, they often have a deleterious role in variation in traits in modern populations109. Although there is still debate about the combined effects of these recent demographic differences, a consensus is emerging that they are likely to have only minor effects on the efficacy of selection and the mutation load between human populations100,110,111,112,113,114. Nonetheless, there are substantial differences in allele frequency between populations that are relevant to disease risk115.

The exposure of humans to new environments and major lifestyle shifts, such as agriculture and urbanization, created the opportunity for adaptation96,116. Ancient DNA sequencing efforts coupled with recent statistical advances are beginning to enable the linking of human adaptations to specific environmental shifts in the recent past96,117,118. However, these rapid environmental changes also created new patterns of complex disease. Mismatch between our biological suitability for ancestral environments and modern environments accounts for the prevalence of many common diseases, such as obesity, diabetes or heart disease that derive from sedentary lifestyles and poor nutrition. The ancestral susceptibility model proposes that ancestral alleles that were adapted to ancient environments can, in modern populations, increase the risk for disease119,120. Supporting this hypothesis, both ancestral and derived alleles increase disease risk in modern humans121,122. However, underscoring the importance of recent demographic history, patterns of risk for ancestral and derived alleles differ in African and European populations, with ancestral risk alleles at higher frequencies in African populations115.

Medical implications

The different evolutionary histories of modern human individuals and populations described in the previous section influence disease susceptibilities and outcomes. Perhaps most striking are the mismatches and trade-offs resulting from recent immune system adaptations. Classic examples include genetic variants conferring resistance to malaria also causing sickle cell-related diseases in homozygotes96,123, or the predominantly African G1 and G2 variants in APOL1 protecting against trypanosomes and ‘sleeping sickness’ but leading to chronic kidney disease in individuals with these genotypes96. Similarly, a variant in CREBRF that is thought to have improved survival for people in times of starvation is now linked to obesity and type 2 diabetes124. In a study of ancient European populations, a variant in SLC22A4, the ergothioneine transporter, that may have been selected for to protect against deficiency of ergothioneine (an antioxidant) is also associated with gastrointestinal problems such as coeliac disease, ulcerative colitis and irritable bowel syndrome118. The variant responsible did not reach high frequency in European populations until relatively recently, and current disease associations are likely to be new, perhaps as a result of mismatches with the current environment118. The possibility of mismatch is further supported by the varying prevalence of coeliac disease between human populations related to population-specific selection for several risk alleles82. Indeed, recent studies suggest that there is a relationship between ancestry and immune response, with individuals of African ancestry demonstrating stronger responses. This could be the result of selective processes in response to new environments for European populations, or a larger pathogen burden in Africa now leading to a higher incidence of inflammatory and autoimmune disorders. This is still an open area of research, and more evidence is needed before strong conclusions can be drawn125.

In modern human environments, there is also a mismatch between the current low parasite infection levels and the immune system that evolved under higher parasite load. This mismatch is hypothesized to contribute to the increase in inflammatory and autoimmune diseases seen in modern humans34. For example, loci associated with ten different inflammatory diseases, including Crohn’s disease and multiple sclerosis, show evidence of selection consistent with the hygiene hypothesis126. Furthermore, recent positive selection on variants in the type 2 immune response pathway favoured alleles associated with susceptibility to asthma127. This suggests that recent evolutionary processes may have led to elevated or altered immune responses at the expense of increased susceptibility to inflammatory and autoimmune diseases. This insight has broad clinical implications, including the potential targeted use of helminths and natural products for immune modulation in patients with chronic inflammatory disease128,129.

Archaic introgression is relevant to modern medicine because alleles introduced by these evolutionary events continue to have an impact on modern populations even though the archaic hominin lineages are now extinct (Fig. 3b). Archaic hominins had considerably lower effective population sizes than AMHs, and thus they probably carried a larger fraction of weakly deleterious mutations than AMHs101. As a result, Neanderthal introgression is predicted to have substantially increased the genetic load of non-African AMHs130,131. Large-scale sequencing efforts, in combination with analysis of clinical biobanks and improved computational methods, have revealed the potential impacts of introgressed DNA on modern human genomes. Several recent studies link regions of archaic admixture in modern populations with a range of diseases, including immunological, neuropsychiatric and dermatological phenotypes102,132,133,134,135,136,137,138,139. This demonstrates the functional impact of introgressed sequence on disease risk in non-African humans today. However, some of these associations may be influenced by linked non-Neanderthal alleles140. For example, in addition to alleles of Neanderthal origin, introgression also reintroduced ancestral alleles that were lost in modern Eurasian populations prior to interbreeding (for example, in the out-of-Africa bottleneck)141. Some introgressed alleles may have initially lessened adverse effects from migration to northern climates, dietary changes and exposure to novel pathogens117,142,143. For example, Neanderthal alleles contribute to variation in innate immune response across populations125,132,134,144 and probably helped AMHs adapt to new viruses, in particular RNA viruses in Europe145. However, due to recent demographic and environmental changes, some previously adaptive Neanderthal alleles may no longer provide the same benefits146. For example, there is evidence that an introgressed Neanderthal haplotype increases the risk of severe COVID-19 following SARS-CoV-2 infection (ref.147).

Physicians regularly rely on proxies for our more recent evolutionary history in the form of self-reported ancestry in their clinical practice; however, these measures fail to capture the complex evolutionary ancestry of each individual patient. For example, two individuals who identify as African Americans may both have 15% European ancestry, but this ancestry will be at different genomic loci and from different ancestral European and African populations (Fig. 3d). Thus, one may carry a disease-increasing European ancestry allele whereas the other does not. Mapping fine-scale genetic ancestry across patients’ genomes can improve our ability to summarize clinically relevant risk148, but such approaches require broad sampling across populations and awareness of human diversity (Box 4). The profound need to increase the sampling of diverse groups is demonstrated by the lack of diversity in genomic studies, and the potential for health disparities caused by the over-representation of European-ancestry populations149,150,151 (Fig. 4). In 2016, 81% of GWAS data were from studies conducted on European populations149. Although this is an improvement from 96% in 2009, most non-European populations still lack appropriate representation. The problem is more extreme for many phenotypes or traits of interest. For example, only 1.2% of the studies in a survey of 569 GWAS on neurological phenotypes included individuals of African ancestry150,152.
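
The distinction between global and local ancestry drawn above can be made concrete with a toy example. The sketch below is illustrative only: the two individuals, the 20 loci, the ancestry labels and the position of the risk locus are all hypothetical, and each person is reduced to a single haploid string of calls, whereas real local-ancestry inference works on phased diploid haplotypes.

```python
# Hypothetical local-ancestry calls at 20 loci.
# "E" = European-ancestry segment, "A" = African-ancestry segment.
person_1 = list("AAAEAAAAAAEAAAAAAEAA")
person_2 = list("EAAAAAEAAAAAAAAAAAAE")

# Hypothetical index of a locus where the risk allele is assumed to
# sit on European-ancestry haplotypes.
RISK_LOCUS = 10


def global_ancestry(calls, label="E"):
    """Fraction of loci assigned to the given ancestry label."""
    return calls.count(label) / len(calls)


def ancestry_at(calls, locus):
    """Local ancestry label at a single locus."""
    return calls[locus]


for name, calls in (("person 1", person_1), ("person 2", person_2)):
    print(
        f"{name}: global European ancestry = {global_ancestry(calls):.2f}, "
        f"ancestry at risk locus = {ancestry_at(calls, RISK_LOCUS)}"
    )
```

Both toy individuals have the same global summary (15% European ancestry), yet only one carries European ancestry at the assumed risk locus. A single genome-wide proportion, like a self-reported label, cannot capture that difference.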

Fig. 4: Illustrations of the need to consider diverse human populations in the genetic analysis of disease.

a | Interactions between the maternal killer cell inhibitory receptor (KIR) genotype and the fetal trophoblasts illustrate evolutionary trade-offs in pregnancy. Birthweight is under stabilizing selection in human populations. The interaction between maternal KIR genotypes (a diversity of which are maintained in the population) and the fetal trophoblasts influences birthweight. African (AFR) populations, relative to European (EUR) populations, maintain larger proportions of the KIR AA haplotype176, which is associated with improved maternal immune response to some viral challenges; however, it is also associated with low birthweight. Alternatively, the KIR BB haplotype is associated with higher birthweight but increased risk of pre-eclampsia. b | Current strategies for predicting genetic risk are confounded by a lack of inclusion of diverse human populations. Thus, they are more likely to fail in genetic risk prediction in populations that are under-represented in genetic databases. For example, polygenic risk score (PRS) models trained on European populations often perform poorly when applied to African populations. This poor performance stems from the fact that the genetic diversity of African populations, differences in effect sizes between populations and differential evolutionary pressures are not taken into account. The weights for each variant (blue circles) in the PRS derived from genome-wide association studies are signified by w1, w2 and w3. c | Population-specific adaptation and genetic hitch-hiking can produce different disease risk between populations. Haplotypes with protective effects against disease may rise to high frequency in specific populations through genetic hitch-hiking with nearby alleles under selection for a different trait. For example, selection for lighter skin pigmentation caused a haplotype that carried a variant associated with lighter skin (blue circle) to increase in frequency in European populations compared with African populations. This haplotype also carried a variant protective against prostate cancer (blue triangle).

Ancestry biases in genomic databases and GWAS propagate through other strategies that are designed to translate population genetic insights to the clinic, such as polygenic risk scores (PRSs)153,154 (Fig. 4b). PRSs hold the promise of predicting medical outcomes from genomic data alone. However, the evolutionary perspective suggests that the genetic architecture of diseases should differ between populations due to the effects of the demographic and environmental differences discussed above. Indeed, many PRSs generalize poorly across populations and are subject to biases155,156. Prioritization of Mendelian disease genes is also challenging in under-represented populations. Generally, African-ancestry individuals have significantly more variants, yet we know less about the pathogenicity of variants that are absent from or less frequent in European populations157. Patients of African and Asian ancestry are currently more likely than those of European ancestry to receive ambiguous genetic test results after exome sequencing or be told that they have variants of uncertain significance (VUS)158. Indeed, disease-causing variants of African origin are under-represented in common databases159. This under-representation covers a range of phenotypic traits and outcomes, including interpreting the effects of CYP2D6 variants on drug response160,161, risk identification and classification for breast cancer across populations162, and disparate effects of GWAS associations for traits including body mass index (BMI) and type 2 diabetes in non-European populations163. In a study on hypertrophic cardiomyopathy, benign variants in African Americans were incorrectly classified as pathogenic on the basis of GWAS results from a European ancestry cohort. Inclusion of individuals of African descent in the initial GWAS could have prevented these errors164.
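
In its simplest additive form, a PRS is a weighted sum of an individual's risk-allele counts, with the weights (w1, w2 and w3 in Fig. 4b) taken from GWAS effect-size estimates. The sketch below assumes that form; the weights and genotypes are hypothetical numbers chosen for illustration, and real pipelines add steps omitted here, such as variant selection, linkage-disequilibrium adjustment and score standardization.

```python
def polygenic_risk_score(dosages, weights):
    """Additive PRS: sum over variants of weight * risk-allele dosage.

    dosages: number of risk alleles (0, 1 or 2) carried at each variant.
    weights: per-variant effect sizes estimated in a discovery GWAS.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical weights w1..w3, assumed to come from a European-ancestry GWAS.
weights_eur = [0.20, 0.05, 0.10]

# Hypothetical genotype of one individual at the same three variants.
dosages = [2, 0, 1]

score = polygenic_risk_score(dosages, weights_eur)
print(f"PRS = {score:.2f}")
```

The score is only as good as its weights. If effect sizes, allele frequencies or the linkage between the scored variants and the causal variants differ in the population being scored, the same arithmetic yields a less informative number, which is one way that discovery-cohort bias propagates into prediction.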

Conclusions and future perspectives

All diseases have evolutionary histories, and the signatures of those histories are archived in our genomes. Recent advances in genomics are enabling us to read these histories with high accuracy, resolution and depth. Insights from evolutionary genomics reveal that there is not one answer to the question of why we get sick. Rather, diseases affect patchworks of ancient biological systems that evolved over millennia, and although the systems involved are ancient, the variation that is relevant for human disease is recent. Furthermore, evolutionary genomics approaches have the power to identify potential mechanisms, pathways and networks and to suggest clinical targets. In this context, we argue that an evolutionary perspective can aid the implementation of precision medicine in the era of genome sequencing and editing165 (Box 4).

Combining knowledge of evolutionary events along the human lineage with results from recent genomic studies provides an explanatory framework beyond descriptions of disease risk or association. For example, a recent analysis of the higher incidence of prostate cancer among men of African ancestry not only discovered a set of genetic variants associated with increased risk, but also used measures of selection to propose an evolutionary explanation of genetic hitch-hiking for the lower incidence in non-African populations166. Haplotypes with protective effects against prostate cancer may have risen to higher frequency in non-African populations because of selection on the nearby variants associated with skin pigmentation (Fig. 4c). Thus, evolutionary perspectives not only help answer the question of how we get sick but also why we get sick.

As the genetic information available from diverse populations increases, we can specifically map the genetics of traits in different populations and more precisely define disease risk on an individual basis167,168. However, we emphasize that environmental and social factors are major determinants of disease risk that often contribute more than genetics, and thus must be prioritized. Studying diverse human populations will provide additional power to discover trait-associated loci and understand genetic architecture across different environmental exposures and evolutionary histories150,169. For example, a GWAS with small sample size in a Greenlandic Inuit population found a variant in a fatty-acid enzyme that affects height in both this population and European populations170. Previous GWAS probably missed this variant due to its low frequency in European populations (0.017 compared with 0.98 in the Inuit); nevertheless, it has a much greater effect on height than other variants previously identified through GWAS170. Similarly, a recent study of height in 3,000 Peruvians identified another variant with an even greater influence on height171. The growth of large DNA biobanks in which hundreds of thousands of patients’ EHRs are linked to DNA samples represents a substantial untapped resource for evolutionary medicine5,86,87. These data enable testing of the functional effects of genetic variants on diverse traits at minimal additional cost. Shifting from single-ancestry GWAS to trans-ethnic or multi-ethnic GWAS will capitalize on the benefits of both a larger sample size and the inherent diversity of human populations for replication of established signals and discovery of new ones172,173,174,175.

Although evolutionary assumptions are tacit in medical practice, until recently self-reported family history remained the best representation of our evolutionary ancestry’s imprint on our disease risk. However, a family history cannot fully capture the complex evolutionary and demographic history of each individual. New technologies now enable the collection and interpretation of an individual’s family history in a much longer and complementary form — their genome. New data and methods are substantially increasing the resolution and depth with which these histories can be quantified, providing opportunities for evolution to inform medical practice.