Abstract
The house mouse is a powerful model for dissecting the genetic basis of phenotypic variation and for studying human disease. Despite a wealth of discoveries, most classical laboratory strains capture only a small fraction of the genetic variation known to segregate in their wild progenitors, and existing strains are often related to each other in complex ways. Inbred strains of mice independently derived from natural populations have the potential to increase power in genetic studies by adding novel genetic variation. Here, we perform exome enrichment and high-throughput sequencing (~8× coverage) of 26 wild-derived strains known in the mouse research community as the “Montpellier strains.” We identified 1.46 million SNPs in our dataset, approximately 19% of which have not been detected in other inbred strains. This novel genetic variation is expected to contribute to phenotypic variation, as it includes 18,496 nonsynonymous variants and 262 early stop codons. Simulations demonstrate that the higher density of genetic variation in the Montpellier strains provides increased power for quantitative genetic studies. Because the power to connect genotype to phenotype depends on genetic variation, it is important to incorporate these additional strains into future research programs.
References
Auwera GA, Carneiro MO, Hartl C et al (2013) From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline. Curr Protoc Bioinform 43:11.10.11–11.10.33
Beck JA, Lloyd S, Hafezparast M, Lennon-Pierce M, Eppig JT, Festing MF, Fisher EM (2000) Genealogies of mouse inbred strains. Nat Genet 24:23–25
Bonhomme F, Martin S, Thaler L (1978) Hybridation en laboratoire de Mus musculus L. et Mus spretus Lataste. Experientia 34:1140–1141
Boursot P, Jacquart T, Bonhomme F, Britton-Davidian J, Thaler L (1985) Differenciation geographique du genome mitochondrial chez Mus spretus Lataste. Comptes rendus de l’Academie des sciences 301:161–166
Boursot P, Din W, Anand R, Darviche D, Dod B, Von Deimling F, Talwar GP, Bonhomme F (1996) Origin and radiation of the house mouse: mitochondrial DNA phylogeny. J Evol Biol 9:391–415
Britton J, Thaler L (1978) Evidence for the presence of two sympatric species of mice (genus Mus L.) in southern France based on biochemical genetics. Biochem Genet 16:213–225
Broman KW, Sen S (2009) A guide to QTL mapping with R/qtl. Springer, New York
Broman KW, Wu H, Sen S, Churchill GA (2003) R/qtl: QTL mapping in experimental crosses. Bioinformatics 19:889–890
Burgio G, Szatanik M, Guenet J-L, Arnau M-R, Panthier J-J, Montagutelli X (2007) Interspecific recombinant congenic strains between C57BL/6 and mice of the Mus spretus species: a powerful tool to dissect genetic control of complex traits. Genetics 177:2321–2333
Church DM, Goodstadt L, Hillier LW et al (2009) Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol 7:e1000112
Dai J-g, Min J-x, Xiao Y-b, Lei X, Shen W-h, Wei H (2005) The absence of mitochondrial DNA diversity among common laboratory inbred mouse strains. J Exp Biol 208:4445–4450
Dejager L, Libert C, Montagutelli X (2009) Thirty years of Mus spretus: a promising future. Trends Genet 25:234–241
DePristo MA, Banks E, Poplin R et al (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43:491–498
Didion J, Pardo-Manuel de Villena F (2013) Deconstructing Mus gemischus: advances in understanding ancestry, structure, and variation in the genome of the laboratory mouse. Mamm Genome 24:1–20
Earl DA (2012) STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour 4:359–361
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 14:2611–2620
Falush D, Stephens M, Pritchard JK (2007) Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes 7:574–578
Felsenstein J (1993) PHYLIP: phylogenetic inference package, version 3.5c
Ferris SD, Sage RD, Wilson AC (1982) Evidence from mtDNA sequences that common laboratory strains of inbred mice are descended from a single female. Nature 295:163–165
Frazer KA, Eskin E, Kang HM et al (2007) A sequence-based variation map of 8.27 million SNPs in inbred mouse strains. Nature 448:1050–1053
Futschik A, Hotz T, Munk A, Sieling H (2014) Multiscale DNA partitioning: statistical evidence for segments. Bioinformatics 30:2255–2262
Geraldes A, Basset P, Gibson B, Smith KL, Harr B, Yu HT, Bulatova N, Ziv Y, Nachman MW (2008) Inferring the history of speciation in house mice from autosomal, X-linked, Y-linked and mitochondrial genes. Mol Ecol 17:5349–5363
Geraldes A, Basset P, Smith KL, Nachman MW (2011) Higher differentiation among subspecies of the house mouse (Mus musculus) in genomic regions with low recombination. Mol Ecol 20:4722–4736
Green RE, Krause J, Briggs AW et al (2010) A draft sequence of the Neandertal genome. Science 328:710–722
Grubb SC, Churchill GA, Bogue MA (2004) A collaborative database of inbred mouse strain characteristics. Bioinformatics 20:2857–2859
Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324
Halligan DL, Oliver F, Eyre-Walker A, Harr B, Keightley PD, Nachman MW (2010) Evidence for pervasive adaptive protein evolution in wild mice. PLoS Genet 6:e1000825
Hardouin EA, Orth A, Teschke M, Darvish J, Tautz D, Bonhomme F (2015) Eurasian house mouse (Mus musculus L.) differentiation at microsatellite loci identifies the Iranian plateau as a phylogeographic hotspot. BMC Evol Biol 15:26
Harr B, Karakoc E, Neme R et al (2016) Genomic resources for wild populations of the house mouse, Mus musculus and its close relative Mus spretus. Sci Data 3:160075
Hubisz MJ, Falush D, Stephens M, Pritchard JK (2009) Inferring weak population structure with the assistance of sample group information. Mol Ecol Resour 9:1322–1332
Ideraabdullah FY, de la Casa-Esperon E, Bell TA, Detwiler DA, Magnuson T, Sapienza C, Pardo-Manuel de Villena F (2004) Genetic and haplotype diversity among wild-derived mouse inbred strains. Genome Res 14:1880–1887
Keane TM, Goodstadt L, Danecek P et al (2011) Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477:289–294
Kessler MD, Dean MD (2014) Effective population size does not predict codon usage bias in mammals. Ecol Evol 4:3887–3900
Korneliussen TS, Albrechtsen A, Nielsen R (2014) ANGSD: analysis of next generation sequencing data. BMC Bioinform 15:1
Laurie CC, Nickerson DA, Anderson AD, Weir BS, Livingston RJ, Dean MD, Smith KL, Schadt EE, Nachman MW (2007) Linkage disequilibrium in wild mice. PLoS Genet 3:e144
Lee TH, Guo H, Wang X, Kim C, Paterson AH (2014) SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data. BMC Genomics 15:162
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760
Lindblad-Toh K, Winchester E, Daly MJ et al (2000) Large-scale discovery and genotyping of single-nucleotide polymorphisms in the mouse. Nat Genet 24:381–386
Lundrigan BL, Jansa SA, Tucker PK (2002) Phylogenetic relationships in the genus Mus, based on paternally, maternally, and biparentally inherited characters. Syst Biol 51:410–431
McCarthy EE, Celebi JT, Baer R, Ludwig T (2003) Loss of Bard1, the heterodimeric partner of the Brca1 tumor suppressor, results in early embryonic lethality and chromosomal instability. Mol Cell Biol 23:5056–5063
McKenna A, Hanna M, Banks E et al (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–1303
Moriwaki K (1994) Wild mouse from a geneticist’s viewpoint. In: Genetics in wild mice. Japan Scientific Societies Press, Tokyo
Morse HC (1978) Origins of inbred mice. Academic Press, Cambridge
Morse HCI (2007) Building a better mouse: one hundred years of genetics and biology. In: Fox JG, Barthold SW, Davisson MT, Newcomer CE, Quimby FW, Smith AL (eds) The mouse in biomedical research. Elsevier, Waltham
Nagamine CM, Nishioka Y, Moriwaki K, Boursot P, Bonhomme F, Lau YFC (1992) The musculus-type Y chromosome of the laboratory mouse is of Asian origin. Mamm Genome 3:84–91
Nikolskiy I, Conrad DF, Chun S, Fay JC, Cheverud JM, Lawson HA (2015) Using whole-genome sequences of the LG/J and SM/J inbred mouse strains to prioritize quantitative trait genes and nucleotides. BMC Genom 16:415
Novembre JA (2002) Accounting for background nucleotide composition when measuring codon usage bias. Mol Biol Evol 19:1390–1394
Orsini P, Cassaing J, Duplantier J, Croset H (1982) Premieres donnees sur l’ecologie des populations naturelles de souris, Mus spretus Lataste et Mus musculus domesticus Rutty dans le Midi de la France
Paigen K (2003a) One hundred years of mouse genetics: an intellectual history. I. The classical period (1902–1980). Genetics 163:1–7
Paigen K (2003b) One hundred years of mouse genetics: an intellectual history. II. The molecular revolution (1981–2002). Genetics 163:1227–1235
Paradis E (2012) Analysis of phylogenetics and evolution with R. Springer, New York
Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290
Petkov PM, Ding Y, Cassell MA et al (2004) An efficient SNP system for mouse genome scanning and elucidating strain relationships. Genome Res 14:1806–1811
Phifer-Rixey M, Bonhomme F, Boursot P, Churchill GA, Piálek J, Tucker PK, Nachman MW (2012) Adaptive evolution and effective population size in wild house mice. Mol Biol Evol 29:2949–2955
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959
Rajabi-Maham H, Orth A, Siahsarvie R, Boursot P, Darvish J, Bonhomme F (2012) The south-eastern house mouse Mus musculus castaneus (Rodentia: Muridae) is a polytypic subspecies. Biol J Linn Soc 107:295–306
Rohland N, Reich D (2012) Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Res 22:939–946
Salcedo T, Geraldes A, Nachman MW (2007) Nucleotide variation in wild and inbred mice. Genetics 177:2277–2291
Sarver B, Keeble S, Cosart T, Tucker P, Dean MD, Good JM (2017) Phylogenomic insights into mouse evolution using a pseudoreference approach. Genome Biol Evol 9:726–739
Schliep KP (2011) Phangorn: phylogenetic analysis in R. Bioinformatics 27:592–593
She JX, Bonhomme F, Boursot P, Thaler L, Catzeflis F (1990) Molecular phylogenies in the genus Mus - comparative analysis of electrophoretic, scnDNA hybridization, and mtDNA RFLP data. Biol J Linn Soc 41:83–103
Sherry ST, Ward M, Sirotkin K (1999) dbSNP—database for single nucleotide polymorphisms and other classes of minor genetic variation. Genome Res 9:677–679
Sherry ST, Ward M-H, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 29:308–311
Silver L (1995) Mouse genetics: concepts and applications. Oxford University Press, New York
Srivastava A, Morgan AP, Najarian ML et al (2017) Genomes of the mouse collaborative cross. Genetics 206:537–556
Suzuki H, Shimada T, Terashima M, Tsuchiya K, Aplin K (2004) Temporal, spatial, and ecological modes of evolution of Eurasian Mus based on mitochondrial and nuclear gene sequences. Mol Phylogenet Evol 33:626–646
Tucker PK, Lee BK, Lundrigan BL, Eicher EM (1992) Geographic origin of the Y chromosomes in “old” inbred strains of mice. Mamm Genome 3:254–261
Wade CM, Daly MJ (2005) Genetic variation in laboratory mice. Nat Genet 37:1175–1180
Wade CM, Kulbokas EJ 3rd, Kirby AW, Zody MC, Mullikin JC, Lander ES, Lindblad-Toh K, Daly MJ (2002) The mosaic structure of variation in the laboratory mouse genome. Nature 420:574–578
Wang X, Pandey AK, Mulligan MK et al (2016) Joint mouse–human phenome-wide association to test gene function and disease risk. Nat Commun. doi:10.1038/ncomms10464
Waterston RH, Lindblad-Toh K, Birney E et al (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562
Westermark UK, Reyngold M, Olshen AB, Baer R, Jasin M, Moynahan ME (2003) BARD1 participates with BRCA1 in homology-directed repair of chromosome breaks. Mol Cell Biol 23:7926–7936
White MA, Ané C, Dewey CN, Larget BR, Payseur BA (2009) Fine-scale phylogenetic discordance across the house mouse genome. PLoS Genet 5:e1000729
White JK, Gerdin A-K, Karp NA et al (2013) Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes. Cell 154:452–464
Wong K, Bumpstead S, Van Der Weyden L, Reinholdt LG, Wilming LG, Adams DJ, Keane TM (2012) Sequencing and characterization of the FVB/NJ mouse genome. Genome Biol 13:R72
Yalcin B, Fullerton J, Miller S et al (2004) Unexpected complexity in the haplotypes of commonly used inbred strains of laboratory mice. Proc Natl Acad Sci USA 101:9734–9739
Yang H, Bell TA, Churchill GA, Pardo-Manuel de Villena F (2007) On the subspecific origin of the laboratory mouse. Nat Genet 39:1100–1107
Yang H, Ding Y, Hutchins LN, Szatkiewicz J, Bell TA, Paigen BJ, Graber JH, de Villena FP-M, Churchill GA (2009) A customized and versatile high-density genotyping array for the mouse. Nat Meth 6:663–666
Yang H, Wang JR, Didion JP et al (2011) Subspecific origin and haplotype diversity in the laboratory mouse. Nat Genet 43:648–655
Yonekawa H, Gotoh O, Tagashira Y, Matsushima Y, Shi LI, Cho WS, Miyashita N, Moriwaki K (1986) A hybrid origin of Japanese mice “Mus musculus molossinus”. Curr Top Microbiol Immunol 127:62–67
Yonekawa H, Moriwaki K, Gotoh O, Miyashita N, Matsushima Y, Shi LM, Cho WS, Zhen XL, Tagashira Y (1988) Hybrid origin of Japanese mice “Mus musculus molossinus”: evidence from restriction analysis of mitochondrial DNA. Mol Biol Evol 5:63–78
Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012) A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics 28:3326–3328
Acknowledgements
We thank Charlie Nicolet and Selene Tyndale from the Epigenome Center at USC. Brent Young, Rachel Mangels, and Lorraine Provencio helped with molecular work. Matt Salomon and Rob Williams gave many helpful suggestions. Jean-Jacques Duquesne maintained the wild mouse repository in Montpellier. Funding was provided by the National Institutes of Health Grant #GM098536 (MDD), National Science Foundation Grant #1146525 (MDD), the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health Grant #HD073439 (JMG), and the University of Montana Genomics Core, supported by a grant from the M.J. Murdock Charitable Trust.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Illumina sequencing data are available in NCBI under the BioProject PRJNA326865.
Electronic supplementary material
Below is the link to the electronic supplementary material.
335_2017_9704_MOESM1_ESM.jpg
Supplementary Figure 1—Genealogical relationships among 26 Montpellier strains, 36 strains from dbSNP, and 67 wild-caught mice from Harr et al. (2016). Similar to Fig. 1, but note that species names have changed to accommodate larger sample sizes. Strain names on a black box indicate novel Montpellier exomes sequenced in this study. Nodes labeled with small black circles had at least 95% bootstrap support. Importantly, as seen in Fig. 1, the classical inbred strains cluster to the exclusion of wild-derived inbred strains (JPG 312 KB)
335_2017_9704_MOESM2_ESM.pdf
Supplementary Figure 2—Principal components analysis of genetic variation among 26 Montpellier strains and 36 “classical inbred” strains. Colors follow Fig. 1 (PDF 29 KB)
335_2017_9704_MOESM3_ESM.pdf
Supplementary Figure 3—Distribution of stop codons, arranged by the proportion of wild-type protein that remains (PDF 25 KB)
335_2017_9704_MOESM4_ESM.pdf
Supplementary Figure 4—Proportion of wild-type protein truncated by early stop codons, separated by whether the early stop codon occurs on a facultative or constitutive exon (PDF 26 KB)
335_2017_9704_MOESM5_ESM.pdf
Supplementary Figure 5—Examples of pairwise distribution of genetic divergence vs. haplotype length for 2 pairs of wild-derived inbred strains (PDF 45 KB)
Cite this article
Chang, P.L., Kopania, E., Keeble, S. et al. Whole exome sequencing of wild-derived inbred strains of mice improves power to link phenotype and genotype. Mamm Genome 28, 416–425 (2017). https://doi.org/10.1007/s00335-017-9704-9
DOI: https://doi.org/10.1007/s00335-017-9704-9