Abstract
Background
The U.K. 100,000 Genomes Project is in the process of investigating the role of genome sequencing in patients with undiagnosed rare diseases after usual care and the alignment of this research with health care implementation in the U.K. National Health Service. Other parts of this project focus on patients with cancer and infection.
Methods
We conducted a pilot study involving 4660 participants from 2183 families, among whom 161 disorders covering a broad spectrum of rare diseases were present. We collected data on clinical features with the use of Human Phenotype Ontology terms, undertook genome sequencing, applied automated variant prioritization on the basis of applied virtual gene panels and phenotypes, and identified novel pathogenic variants through research analysis.
Results
Diagnostic yields varied among family structures and were highest in family trios (both parents and a proband) and families with larger pedigrees. Diagnostic yields were much higher for disorders likely to have a monogenic cause (35%) than for disorders likely to have a complex cause (11%). Diagnostic yields for intellectual disability, hearing disorders, and vision disorders ranged from 40 to 55%. We made genetic diagnoses in 25% of the probands. A total of 14% of the diagnoses were made by means of the combination of research and automated approaches, which was critical for cases in which we found etiologic noncoding, structural, and mitochondrial genome variants and coding variants poorly covered by exome sequencing. Cohortwide burden testing across 57,000 genomes enabled the discovery of three new disease genes and 19 new associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for clinical decision making for the patients or their relatives.
Conclusions
Our pilot study of genome sequencing in a national health care system showed an increase in diagnostic yield across a range of rare diseases. (Funded by the National Institute for Health Research and others.)
The 100,000 Genomes Pilot on Rare Disease Diagnosis in Healthcare − A Preliminary Report
Abstract
Background
The UK 100,000 Genomes Project is in the process of investigating the role of genome sequencing of patients with undiagnosed rare disease following usual care, and the alignment of research with healthcare implementation in the UK’s national health service. (Other parts of this Project focus on patients with cancer and infection.)
Methods
We enrolled participants, collected clinical features with human phenotype ontology terms, undertook genome sequencing and applied automated variant prioritization based on virtual gene panels (PanelApp) and phenotypes (Exomiser), alongside identification of novel pathogenic variants through research analysis. We report results on a pilot study of 4660 participants from 2183 families with 161 disorders covering a broad spectrum of rare disease.
Results
Diagnostic yields varied by family structure and were highest in trios and larger pedigrees. Likely monogenic disorders had much higher diagnostic yields (35%) than disorders with more complex etiologies (11%), with intellectual disability, hearing and vision disorders achieving yields between 40 and 55%. The overall diagnostic yield was 25%. Combining research and automated approaches was critical to 14% of diagnoses in which we found etiologic non-coding, structural and mitochondrial genome variants and coding variants poorly covered by exome sequencing. Cohort-wide burden testing across 57,000 genomes enabled discovery of 3 new disease genes and 19 novel associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for clinical decision-making for the patient or their relatives.
Conclusion
Our pilot study of genome sequencing in a national health care system demonstrates diagnostic uplift across a range of rare diseases.
(Funded by National Institute for Health Research and others)
Rare disease is a worldwide healthcare challenge with approximately 10,000 disorders affecting 6% of the population in Western societies.1,2 Over 80% of rare diseases have a genetic component and these conditions are disabling and expensive to manage. One-third of children with a rare disease die before their fifth birthday.1 The adoption of next generation sequencing has improved rare disease diagnostic rates over the past decade.3–5 However, the majority of rare disease patients remain without a molecular diagnosis following standard diagnostic testing.3–5 To address this, the UK Government launched the 100,000 Genomes Project (100KGP) in 2013 to apply whole genome sequencing (WGS) to rare disease, cancer and infection in national healthcare.6
To assess the impact of this WGS approach on the genetic diagnosis of rare disease in the UK’s National Health Service, we carried out a pilot study in which we enrolled families and undertook detailed clinical phenotyping of the proband.4 We collected electronic health records from all participants in a multi-petabyte research environment.5 When necessary, we carried out wet bench orthogonal tests and used in-silico approaches.
Methods
Patients
Following ethical approval, participants with a broad range of rare diseases who remained undiagnosed after usual care in the NHS (which ranged from no available test to approved tests that did not include genome sequencing) were identified by healthcare professionals and researchers, recruited by nine English hospitals and consented through the National Institute for Health Research (NIHR) BioResource for Rare Diseases. To test the broad applicability of genome sequencing, participants were eligible if they had a rare disease (defined in the UK as a disorder affecting 1 in 2000 people or fewer), were likely to have a single-gene or oligogenic aetiology, and had no genomic diagnosis. Data on prior proband testing were collected where possible, including single-gene tests, karyotyping, single nucleotide polymorphism (SNP) arrays, next generation sequencing panels, and exomes. Probands and, where feasible, parents and/or other family members were enrolled by multiple clinical specialties in the NHS. Standardized baseline clinical data were recorded using the Human Phenotype Ontology (HPO)7 against disease-specific data models8 and whole blood was drawn for DNA extraction. The participants are followed over their life course using electronic health records (all hospital episodes, registries and cause of death).
Genome Sequencing
Genome sequencing9 was performed using the Illumina TruSeq DNA PCR-Free sample preparation kit by Illumina Laboratory Sciences, Cambridge UK on a HiSeq 2500 sequencer, generating a mean depth of 32× (range 27× to 54×) and greater than 15× for at least 95% of the reference human genome. WGS reads were aligned to the Genome Reference Consortium human genome build 37 (GRCh37) using Isaac Genome Alignment Software. Family-based variant calling of single nucleotide variants and insertions and deletions (indels) for chromosomes 1 to 22, X, and the mitochondrial genome (mean coverage 2814×, range 142× to 16581×) was performed using the Platypus variant caller.10
The Diagnostic Pipeline
We constructed an automated analytical pipeline to filter the genome down to rare, segregating and predicted damaging candidate variants in coding regions. To limit the possibility of overlooking or inefficiently prioritizing diagnoses, we focussed initially on virtual gene panels based on both the recruited clinical indication/disease and submitted HPO terms (applied virtual panels). To address the issue of which genes have sufficient evidence to attribute causation and so to be included in these virtual gene panels, we used our PanelApp software to enable expert, crowd-sourced review and curation of genes with diagnostic-grade evidence for each of our disease categories, e.g. evidence in at least three unrelated families.11 Loss of function (LoF) or de novo, protein altering variants affecting genes in the applied virtual panels were classified as tier 1, other variant types such as missense variants affecting these genes were classified as tier 2, and all other filtered variants were classified as tier 3 (Figure S1 in the Supplementary Appendix). To further reduce the possibility of missing or inefficiently prioritizing diagnoses, we ran Exomiser12, a phenotype-based approach that looks across all genes in the genome for a diagnosis. Exomiser prioritizes rare, segregating, predicted pathogenic variants in genes where the patient phenotypes match previous reference knowledge from human disease or model organism databases. The ontology-driven phenotype matching can detect patients with an atypical phenotypic profile for a disease.
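The tiering rule described above can be illustrated with a minimal Python sketch. It is not the project's pipeline code: the `Variant` class, the consequence labels and the `assign_tier` function are hypothetical names introduced here, and the sets of consequence terms are assumptions, since the text does not list the exact annotations that were used.

```python
from dataclasses import dataclass

# Assumed consequence labels (hypothetical; not taken from the source).
LOSS_OF_FUNCTION = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor"}
PROTEIN_ALTERING = LOSS_OF_FUNCTION | {"missense", "inframe_indel"}


@dataclass
class Variant:
    gene: str
    consequence: str
    de_novo: bool = False


def assign_tier(variant: Variant, panel_genes: set) -> int:
    """Assign tier 1, 2 or 3 to a variant that has already passed filtering."""
    if variant.gene not in panel_genes:
        # Filtered variant outside the applied virtual panels.
        return 3
    if variant.consequence in LOSS_OF_FUNCTION:
        # Loss-of-function variant in a panel gene.
        return 1
    if variant.de_novo and variant.consequence in PROTEIN_ALTERING:
        # De novo, protein-altering variant in a panel gene.
        return 1
    # Any other variant type in a panel gene, e.g. an inherited missense variant.
    return 2


panel = {"CHM", "TCN2"}
print(assign_tier(Variant("TCN2", "frameshift"), panel))
print(assign_tier(Variant("CHM", "missense", de_novo=True), panel))
print(assign_tier(Variant("CHM", "missense"), panel))
print(assign_tier(Variant("EBF3", "missense", de_novo=True), panel))
```

In this sketch, panel membership is checked first, so a de novo variant in a gene outside the applied panels is still tier 3; such variants would be candidates for the phenotype-based and research analyses rather than the panel-based route.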
Decision support systems and clinical genetics teams provided by Congenica Ltd and Fabric Genomics13,14 assisted us in variant prioritization and return of candidate variants to the 13 NHS Genomic Medicine Centres (GMC). These variants were reviewed by NHS clinical scientists and clinicians using the American College of Medical Genetics and Genomics guidelines and a diagnostic report was issued for each proband.15 Final clinical outcomes included whether a genetic diagnosis was obtained, the variant(s) involved, whether they explained all, or some of the phenotypes and whether an intervention was deployed.
The pilot participants were recruited and sequenced throughout 2014-2016, while the infrastructure to collect, QC, process and return data was being established. Results were returned to the GMCs from May 2016 to April 2019. In our post-pilot phase with an established pipeline, we now return results to the GMCs within 6 weeks of sample collection.
Novel Pathogenic Variants
Researchers investigated coding and non-coding regions for novel diagnoses in genes matching the patients’ phenotypes, including the presence of de novo variants in highly constrained coding regions16 with 95% confidence. We used a novel methodology for mitochondrial DNA that accounts for heteroplasmy,17 Genomiser,18 and ExpansionHunter for simple tandem repeat expansions.19 Finally we employed a novel random forest method to analyse Canvas20 and Manta21 calls and identify potentially pathogenic copy number and structural variants.
Gene-based burden testing to detect enrichment of rare, predicted pathogenic, segregating variants in novel genes in specific disease cohorts relative to controls was performed on the pilot genomes as well as additional genomes from the rest of the 100KGP to increase power (57,002 genomes; see Supplementary Methods).
The pilot genomic and clinical data are freely accessible to members of a Genomics England Clinical Interpretation Partnership (GeCIP) domain (https://www.genomicsengland.co.uk/about-gecip/).
Statistical Analysis
Testing was performed using the R (version 3.6.0) and Stata (version 16) statistical packages. Further detail on individual methods is given in the Supplementary Appendix.
Results
Patients
We enrolled 4660 participants (2183 probands and 2477 family members) from 161 broad categories across rare disease (Table 1), with neurologic, ophthalmologic and tumor syndromes commonly represented. Participants were recruited with varying numbers of affected and unaffected family members. We aimed, with varying degrees of success, to recruit trios or larger family structures to facilitate more effective variant prioritization. Of the probands with multiple bowel polyps whom we recruited, 93% were singletons. In contrast, 12% of probands with intellectual disability were singletons. Adult probands were more commonly enrolled than pediatric probands (age at recruitment 18 years or younger) (74% vs. 26%), in line with the general population (79% vs. 21%; 2011 census of England and Wales). The preponderance of adults is unusual compared to previous sequencing projects and reflects an eligibility criterion: probands had already undergone usual care, which in many cases involved standard genetic testing (mostly single-gene or panel-based). A lower percentage of female probands were recruited across most disease categories, especially among pediatric cases, where the difference was significant (232 female vs. 339 male; P<0.001, based on the expected female proportion of 51% from the 2011 census of England and Wales). The increased susceptibility of males to recessive X-linked conditions may account for this sex bias: over 6% of total diagnoses involved variants on the X chromosome (which represents approximately 5% of the genome). The inferred ancestry of the probands (see Supplementary Appendix) was in line with that expected from the population (86% white, 7.5% Asian, 3.3% black, 2.2% mixed, 1% other: 2011 census of England and Wales). However, significantly more pediatric probands were of South Asian ancestry compared to adult probands (16% vs. 4%, P<0.001); our results indicated potential consanguinity in 43% of pediatric South Asian probands and 1% for the other pediatric probands (Table 1).
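The comparison of the pediatric sex ratio with the census proportion can be reproduced approximately with a one-sample test of a proportion. The sketch below uses a normal approximation from the Python standard library; the text does not say which test the authors applied, so this choice and the function name `two_sided_proportion_test` are assumptions made for illustration.

```python
import math


def two_sided_proportion_test(successes: int, n: int, expected: float):
    """Normal-approximation test of an observed count against an expected proportion."""
    standard_error = math.sqrt(n * expected * (1 - expected))
    z = (successes - n * expected) / standard_error
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 232 female probands among 571 pediatric probands; expected female proportion 0.51.
z, p_value = two_sided_proportion_test(232, 571, 0.51)
print(f"z = {z:.2f}, p = {p_value:.1e}")
```

With these counts the z statistic is close to -5, which is consistent with the reported P<0.001; an exact binomial test would be a reasonable alternative.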
Table 1
Variable | All probands (N=2183) | Paediatric (age at recruitment <=18) probands (N=571) | Adult (age at recruitment >18) probands (N=1612) |
---|---|---|---|
Sex — no. (%) | | | |
Male | 1138 (52) | 339 (16) | 799 (37) |
Female | 1045 (48) | 232 (11) | 813 (37) |
Total | 2183 (100) | 571 (26) | 1612 (74) |
Median (IQR) age in years at recruitment | 35 (18-54) | 9 (5-14) | 45 (31-60) |
Race or ethnic group — no. (%), % consanguinity suggested in record | | | |
African | 50 (2), 0 | 25 (4), 0 | 25 (2), 0 |
Admixed American | 26 (1), 23 | 12 (2), 25 | 14 (1), 21 |
East Asian | 8 (<1), 0 | 2 (<1), 0 | 6 (<1), 0 |
European | 1931 (88), <1 | 438 (77), <1 | 1493 (93), <1 |
South Asian | 163 (7), 36 | 93 (16), 43 | 70 (4), 25 |
Not determined | 5 (<1), 0 | 1 (<1), 0 | 4 (<1), 0 |
Total | 2183 (100), 3 | 571 (26), 8 | 1612 (74), 2 |
Clinical Data and Sequencing
We collected HPO terms for each participant (median of 4 present terms, range 1-61 and median of 4 absent terms (phenotypes not exhibited by the proband), range 0-144). We then carried out genome sequencing followed by quality assurance to check coverage, sequence quality, presence of repeat sample submissions or sample swaps, and consistency with reported family structures (see Supplementary Appendix).
The Diagnostic Yield
We obtained genetic diagnoses for 25% of probands and deposited the genotypes into the ClinVar repository (accession numbers XXXX to YYYY). Of these diagnoses, 60% were made on the basis of coding SNV/indels in the applied virtual panels, 26% from coding SNV/indels affecting well-established disease genes outside the virtual panels using phenotype-based prioritization and/or expert review by the clinicians, Congenica Ltd, or Fabric Genomics, and 14% from genome-wide, phenotype-agnostic research analysis looking beyond SNV/indels, coding regions, and disease genes in the virtual panels (Figure 1). Following international guidelines15 a further 10% of probands were classified with variants of unknown significance in genes consistent with the phenotype by clinical review at the site, but with further functional validation required. Fewer candidate variants were returned after filtering in larger family structures (Table 3), making it easier to identify causative variants, in turn leading to higher diagnostic rates for trios, quads and more complex family structures (Figure 2a), even within a disorder e.g. for hereditary ataxia the diagnostic rate increased from 21% for singletons to 32% for trios (Table S4 in the Supplementary Materials).
Table 3
Candidate variants per case | All family structures | Singletons | Duos | Trios | Other family structures |
---|---|---|---|---|---|
Variants after filtering | 221 (49-288) | 292 (258-327) | 149 (117-213) | 29 (17-136) | 22.5 (9-71) |
In virtual panels | 1 (0-2) | 1 (0-2) | 1 (0-3) | 1 (0-2) | 0 (0-1) |
Unsurprisingly, we obtained a higher diagnostic yield for diseases that were considered more likely to have a monogenic cause (Table S4 in the Supplementary Appendix) than those we considered more likely to have complex etiology (35% vs 11%) (Figure 2a). Likely monogenic diseases equate to those with a presence in OMIM and where genetic testing is part of the standard diagnostic workup, based on the consensus blinded review of three clinical geneticists. Diagnostic yield was highly variable by disease (Figure 2b, Table S3 in the Supplementary Appendix), varying from 40-55% for intellectual disability and various vision and hearing disorders to 6% for tumor syndromes.
We obtained data on the presence or absence of prior genetic testing for a subset (1177) of the participants. The number of tests per proband ranged from 0 to 16 with a median of 1 (IQR 0-2), and approximately half of the probands in this subset had been tested at least once. The overall diagnostic uplift from genome sequencing in this subset was 32% with only a slight difference depending on whether prior testing had been performed (33%), or not (31%). However, many of these prior tests were not recent. The diagnostic yield provided by genome sequencing varied from 28% to 45% depending on the type of prior testing (Figure 2c, Table S5 in the Supplementary Appendix) which, for the most part, involved targeted single gene and panel testing (Table S6 in the Supplementary Appendix).
Diagnostic Pipeline
The aim of the automated diagnostic pipeline is to identify a few potentially causative candidate variants from the millions in a whole genome, through removal of extremely unlikely candidates (filtering) and identification of the most likely in the remainder (prioritization). This allows the GMCs to efficiently perform manual, clinical interpretation and issue a diagnostic report. The virtual panel-based pipeline identified 322 (66%) of the 490 SNV/indel-based diagnoses from the genomes, with a high positive predictive value given the millions of variants in the whole genomes: of 1041 returned candidate variants, 291 (28%) proved to be diagnostic. We re-ran this analysis in December 2019 to assess the impact of using updated versions of the virtual panels containing the latest disease gene discoveries, improved virtual panel selection based on the patient’s phenotype and advances in variant filtering strategies, e.g. allowing for incomplete penetrance where suspected. This increased the number of genetic diagnoses detected from 322 to 377 (77%) with a positive predictive value of 15% (Figure 2d), demonstrating effective filtering and prioritization of the variants with only a median of 1 (IQR 0-2) candidate variant in panels returned to the clinicians at the GMCs per case (Table 3). Ongoing evolution of the virtual panels with new disease genes is expected to continue increasing the yield from this approach.
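The recall and positive predictive value quoted above follow directly from the reported counts. The short Python sketch below recomputes them; the helper name `percent` is introduced here for illustration, and the counts are taken as stated in the text.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return a proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


SNV_INDEL_DIAGNOSES = 490   # all SNV/indel-based diagnoses
FOUND_INITIAL = 322         # found by the original panel-based pipeline
FOUND_RERUN = 377           # found after the December 2019 re-run
RETURNED_VARIANTS = 1041    # candidate variants returned by the original pipeline
DIAGNOSTIC_VARIANTS = 291   # returned variants that proved diagnostic

print("initial recall (%):", percent(FOUND_INITIAL, SNV_INDEL_DIAGNOSES))
print("re-run recall (%):", percent(FOUND_RERUN, SNV_INDEL_DIAGNOSES))
print("initial PPV (%):", percent(DIAGNOSTIC_VARIANTS, RETURNED_VARIANTS))
```

The number of candidate variants returned in the re-run is not given in the text, so the 15% positive predictive value for the re-run cannot be recomputed in the same way.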
Phenotype-based prioritization using Exomiser detected 77%, 86%, and 88% of these diagnoses in the top, top 3 and top 5 ranked candidates respectively (Figure 2d). Exomiser and the virtual panels were complementary, with 92% of these diagnoses recalled when the two approaches were combined (last blue bar in Figure 2d). Precision phenotyping of our patients was essential both for Exomiser and for the selection of additional virtual panels, without which only 54% of these diagnoses would have been prioritized in the recruited disease virtual panel and presented to the GMCs as a likely candidate (first blue bar in Figure 2d).
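The top, top 3 and top 5 figures are top-k recall values: the fraction of known diagnoses whose causative variant is ranked at or above position k. A minimal Python sketch follows, using made-up ranks for ten hypothetical cases rather than the study data; `top_k_recall` is a name introduced here, and `None` marks a case in which the diagnostic variant was not ranked at all.

```python
from typing import Optional, Sequence


def top_k_recall(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of diagnosed cases whose causative variant is ranked within the top k."""
    found = sum(1 for rank in ranks if rank is not None and rank <= k)
    return found / len(ranks)


# Illustrative ranks only (not study data): one entry per diagnosed case.
example_ranks = [1, 1, 2, 4, None, 1, 7, 3, 1, 1]

for k in (1, 3, 5):
    print(f"top-{k} recall: {top_k_recall(example_ranks, k):.0%}")
```

Cases that are never ranked stay in the denominator, so the measure reflects both filtering and ranking performance.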
Research-based Diagnoses
Fourteen percent of the genetic diagnoses required research outside the diagnostic pipeline (Figure 1). This research involved comparisons with the genome sequences and clinical data in our research environment, with validation using wet bench orthogonal tests and in-silico approaches (Table S7 in the Supplementary Appendix). Additional diagnoses were made by screening for the presence of de novo variants in highly constrained coding regions16. These diagnoses included a de novo EBF3 missense variant in a patient with hereditary ataxia. Mitochondrial genome analysis, taking into account heteroplasmy, detected 4 new diagnoses (as well as the 9 that had already been detected by the main pipeline). Twelve probands had intronic splicing variants prioritized by Exomiser due to the known pathogenic status of these variants in ClinVar.23 Nine novel non-coding diagnoses involving previously undescribed variants required exploration of the whole genome and in vitro functional validation via reverse transcription polymerase chain reaction, mini-gene, or luciferase assays.24,25,26 Here, unsolved probands were queried for non-coding variants affecting genes in the applied virtual panels, either alone or in compound heterozygosity with loss-of-function variants. These were identified using either Genomiser or, for retinal disorder probands, systematic analysis of the untranslated regions, promoter or introns. A further 43 probands were fully or partially explained by structural variants or simple tandem repeat expansions in the genes HTT or FXN in probands with hereditary spastic paraplegia.
Novel Disease Gene Associations
We performed burden testing to discover novel Mendelian disease gene associations and potential genetic diagnoses for unsolved probands; 828 significant disease-gene associations (q value < 0.1) were identified, including 249 known and 579 novel genes (novel with respect to their association with disease), with only 0.03 ± 0.2 (range 0-3) associations in 10,000 permutations in which cases and controls were assigned randomly. Twenty-two candidates represent the most likely new, fully penetrant, Mendelian disease genes (Table S8 in the Supplementary Appendix and ClinVar accession numbers SCV001759972 - SCV001760540) with three recently independently confirmed diagnoses: UBAP1 in hereditary spastic paraplegia,27 FOXJ1 in non-CF bronchiectasis,28 and SORD in Charcot-Marie-Tooth disease.29 Diagnostic reports were issued for three probands with these genes (Figure 1) and we are investigating others in GeneMatcher and by functional validation studies in model organisms.
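A q-value threshold such as the one above is a false discovery rate cut-off applied to per-gene p-values. The text does not state which procedure produced the q values, so the Python sketch below assumes the widely used Benjamini-Hochberg adjustment purely for illustration; the function name `bh_qvalues` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def bh_qvalues(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (q values), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Illustrative p-values for four hypothetical disease-gene tests.
example_p = [0.001, 0.02, 0.03, 0.5]
example_q = bh_qvalues(example_p)
significant = [q < 0.1 for q in example_q]
print(example_q, significant)
```

The permutation analysis described in the text, in which case and control labels are shuffled, gives an empirical check on how many associations would pass the threshold by chance.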
Diagnostic Sequelae
These findings ended long diagnostic odysseys for some patients and their families (the median duration of the odyssey was 75 months and the median number of hospital visits was 68; Table S1 in the Supplementary Appendix); we speculate that they will mitigate NHS resource costs (183,273 episodes of hospital care costing £87 million for affected participants; Table S3 in the Supplementary Appendix). In addition, 134 (25%) of the 533 genetic diagnoses were reported by clinicians to be of immediate clinical actionability with only 11 (0.2%) described as having no benefit. As of now, the remainder of the diagnoses are of unknown utility. Healthcare benefits included 4 diagnoses leading to a suggested change in medication, 26 suggesting additional surveillance for the proband or relatives, 13 allowing clinical trial eligibility, 59 informing future reproductive choices, and 32 with other benefits (Table S9 in the Supplementary Appendix).
In several specific probands, diagnoses have had important clinical actionability. In a 36-year-old man with suspected choroideraemia, we detected a novel CHM promoter variant causing loss of gene expression26 and offering eligibility for a gene-replacement trial. A male neonate proband presented with severe infection and transient neurologic symptoms immediately after birth and died at 4 months with no diagnosis but healthcare costs of approximately £80,000 (Table S10 in the Supplementary Appendix). A diagnosis of transcobalamin 2 deficiency due to a homozygous frameshift in TCN2 was made from this study, which enabled predictive testing to be offered to the younger brother within one week of birth. The younger child, who received a positive result, received weekly hydroxocobalamin injections to prevent metabolic decompensation. A 10-year-old girl was admitted to intensive care with life-threatening chicken pox. She had endured a diagnostic odyssey over seven years at a total cost of £356,571 across 307 secondary care episodes (Table S11 in the Supplementary Appendix). We were able to diagnose CTPS1 deficiency due to a homozygous, known pathogenic splice acceptor variant. The diagnosis enabled a curative bone marrow transplant (cost £70,000) and predictive testing of her siblings showed no further family members to be at risk. One proband had waited until his sixth decade for a genomic diagnosis of an INF2 mutation causing focal segmental glomerulosclerosis. His father, brother and uncle had all died of renal failure. He had received two kidney transplants, had transmitted the condition to his daughter and was concerned about whether his 15-year-old granddaughter, who was under surveillance, was at risk. After he received his genetic diagnosis, the granddaughter was tested, found to be negative, and discharged from regular medical surveillance.
Discussion
Our findings demonstrate a substantial uplift in genomic diagnoses achieved for patients by genome sequencing across a broad spectrum of rare disease. The enhanced diagnostic benefit was observed regardless of whether participants had undergone prior genetic testing (31% in those who had received testing and 33% in those who had not). For 25% of those who received a genetic diagnosis, there was immediate clinical actionability. Standardizing procedures, from enrolment of patients to the return of NHS-validated results to clinicians, was critical to our success. For example, clinical data collection using disease-specific data models and HPO terms enabled diagnoses, confirming the value of standardization through ontologies and clinical annotation in precision medicine.30 These additional diagnoses, beyond the 264 (49% of total diagnoses) observed in the single disease virtual panel, came from Exomiser and additional, applied virtual panels. The diagnostic discoveries derived by combining research, decision support and clinical validation and assessment leveraged an additional 72 diagnoses.
Diagnostic yield was influenced by family structure, and for disorders with a likely Mendelian inheritance and a single gene etiology our yield increased to 35%: ophthalmological, metabolic and neurologic disorders yielded the greatest percentage of diagnoses. The scale of our dataset enabled cohort-wide burden testing which identified numerous novel disease–gene associations including three that have now been confirmed and 19 with compelling evidence that are likely to be confirmed in independent datasets.
Of the diseases we diagnosed through genome-sequencing, 13% were caused by mutations in non-coding sequence or mitochondrial genomes, tandem repeat expansions in Huntington disease, and a wide range of structural variants with nucleotide resolution of breakpoints using a novel random forest method. An additional 2% of diagnoses involved coding variants in regions of low coverage on exome sequencing. Our results provide new evidence of the value of genome sequencing and mirror previous studies where 53% of participants who received new diagnoses from genome sequencing had previously received testing by exome sequencing.5
Previous studies have demonstrated how next-generation sequencing can reveal diagnoses with yields of between 25% and 29% from exome sequencing in persons who had received no prior genetic testing.32–34 The Undiagnosed Disease Network reported a 26% yield from a mixture of exome and genome sequence analysis of 382 patients5 and another genome sequencing study gave a 42% yield in 50 families with intellectual disability in whom prior testing had been carried out.35 We obtained similar results with a broad range of disorders (161) with unmet diagnostic need. Our approach is limited to diagnoses that are readily made through short-read genome sequencing. Fully phased, long-read sequencing better detects structural variation and delivers sequence from parts of the genome that are poorly captured by short-read sequencing.31
This pilot has underpinned the case for genome-sequencing in the diagnosis of certain specific rare diseases in the new NHS National Genomic Test Directory36. For patients in the National Health Service for specific disorders, such as intellectual disability, genome-sequencing will now be the first-line test (Table S12 in the Supplementary Appendix) and the NHS in England, through a new National Genomic Medicine Service, is in the process of sequencing 500,000 whole genomes in rare disease and cancer in healthcare. We hope our findings will assist other health systems in considering the role of genome sequencing in the care of patients with rare diseases.
Disclosure forms provided by the authors are available with the full text of this article at NEJM.org.
Table 2
Primary symptoms — no. (%) | All Families | Singletons | Duos | Trios | Larger families |
---|---|---|---|---|---|
Cardiovascular | 147 (7) | 56 (3) | 24 (1) | 49 (2) | 18 (1) |
Ciliopathies | 69 (3) | 34 (2) | 14 (1) | 16 (1) | 5 (<1) |
Dermatological | 38 (2) | 9 (<1) | 5 (<1) | 22 (1) | 2 (<1) |
Dysmorphic and congenital abnormalities | 20 (1) | 10 (<1) | 2 (<1) | 7 (<1) | 1 (<1) |
Endocrine | 87 (4) | 57 (3) | 14 (1) | 12 (1) | 4 (<1) |
Gastroenterological | 32 (1) | 18 (1) | 14 (1) | ||
Growth | 3 (<1) | 3 (<1) | |||
Haematological and immunological | 5 (<1) | 2 (<1) | 3 (<1) | ||
Haematological | 7 (<1) | 3 (<1) | 2 (<1) | 2 (<1) | |
Hearing and ear | 35 (2) | 6 (<1) | 5 (<1) | 17 (1) | 7 (<1) |
Metabolic | 93 (4) | 24 (1) | 12 (1) | 48 (2) | 9 (<1) |
Intellectual disability (ID) | 130 (6) | 10 (<1) | 24 (1) | 78 (4) | 18 (1) |
Neurology and neurodevelopmental (excl. ID) | 521 (24) | 193 (9) | 93 (4) | 194 (9) | 41 (2) |
Ophthalmological | 348 (16) | 74 (3) | 62 (3) | 199 (9) | 13 (1) |
Renal and urinary tract | 176 (8) | 125 (6) | 21 (1) | 26 (1) | 4 (<1) |
Respiratory | 2 (<1) | 1 (<1) | 1 (<1) | ||
Rheumatological | 48 (2) | 14 (1) | 6 (<1) | 25 (1) | 3 (<1) |
Skeletal | 62 (3) | 15 (1) | 11 (1) | 23 (1) | 13 (1) |
Tumour syndromes | 293 (13) | 231 (11) | 31 (1) | 27 (1) | 4 (<1) |
Other | 67 (3) | 17 (1) | 12 (1) | 34 (2) | 4 (<1) |
Total | 2183 (100) | 881 (40) | 343 (16) | 797 (37) | 162 (7) |
Acknowledgements
We thank all the participants and healthcare teams in Addenbrooke’s Hospital in Cambridge, Great Ormond St Hospital NHS Foundation Trust, University College London NHS Foundation Trust, Guys and St Thomas’s Hospital, Barts Health, Oxford University Hospitals NHS Foundation Trust, Manchester University NHS Foundation Trust and The Newcastle Hospitals NHS Foundation Trust. Mark Caulfield and Willem H Ouwehand are NIHR Senior Investigators. This work is part of the portfolio of translational research at the NIHR Biomedical Research Centres at Barts, Cambridge University Hospitals NHS Foundation Trust, Great Ormond Street Foundation NHS Trust, Manchester University NHS Foundation Trust, Moorfield’s NHS Foundation Trust, The Newcastle Hospitals NHS Foundation Trust, Oxford University Hospitals NHS Foundation Trust, and University College London NHS Foundation Trust. This work was made possible through the generosity of NHS patients and their families and uses clinical data from the NHS and NHS Digital.
We thank all those across the world who have contributed to the PanelApp knowledgebase and to the validation and reporting working group (Dom McMullan, Helen Firth, Steve Abbs, Sian Ellard) for their role in supporting the development of the bioinformatics pipeline and reporting process. We received extremely valuable feedback on our work from Dr. David Bick and Prof. Gil McVean. We are grateful for the support from Professor Dame Sue Hill and the team in NHS England for the work to fund and establish the 13 Genomic Medicine Centres, which enabled the NHS contribution, including the clinical return of results within the NHS in a standardized and validated format; this led to the confirmation of the diagnoses, provided additional information and led to the patient benefit reported. Maria Bitner-Glindzicz from Great Ormond Street Hospital and Institute of Child Health was a key contributor to the 100,000 Genomes Project Pilot but died during the preparation of this manuscript.
Funding
Genomics England and the 100,000 Genomes Project were funded by the National Institute for Health Research, the Wellcome Trust, the Medical Research Council, Cancer Research UK, the Department of Health and Social Care and NHS England. The NIHR BioResource is funded by the NIHR.
PFC is a Wellcome Trust Principal Research Fellow (212219/Z/18/Z) and an NIHR Senior Investigator, who receives support from the Medical Research Council Mitochondrial Biology Unit (MC_UU_00015/9), the Medical Research Council (MRC) International Centre for Genomic Medicine in Neuromuscular Disease (MR/S005021/1), and the NIHR Biomedical Research Centre based at Cambridge University Hospitals NHS Foundation Trust and the University of Cambridge. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. LRW is supported by grants from Versus Arthritis (21593), the NIHR BRC at GOSH, and the Medical Research Council (MR/R013926/1).
We are grateful for the support of the Monarch Initiative on HPO and Exomiser, funded by the National Institutes of Health (NIH) Office of the Director (OD) [1R24OD011883]. DS, PCM and VC were funded by National Institutes of Health Grant 5-UM1-HG006370. GA is supported by a Fight for Sight (UK) Early Career Investigator Award (5045/46), NIHR-BRC at Great Ormond Street Hospital Institute for Child Health and Moorfields Eye Charity (Stephen and Elizabeth Archer in memory of Marion Woods). The Moorfields/UCL Institute of Ophthalmology team are additionally funded by NIHR-BRC at Moorfields Eye Hospital and UCL Institute of Ophthalmology. We gratefully acknowledge the Illumina Laboratory Services team at Hinxton for genome sequencing and secondary analysis.
Full text links
Read article at publisher's site: https://doi.org/10.1056/nejmoa2035790
Read article for free, from open access legal sources, via Unpaywall: https://www.nejm.org/doi/pdf/10.1056/NEJMoa2035790?articleTools=true
Funding
Funders who supported this work.
Action Medical Research (1)
Grant ID: 2063
Ataxia UK (1)
Grant ID: ZHORVATH
Cancer Research U.K.
Fight for Sight (3)
Grant ID: 1570/1571
Grant ID: 1479/80
Grant ID: 24TP171
Medical Research Council (20)
Colorectal Adenoma/carcinoma Prevention Project 2 (CAPP2): RCT of aspirin and resistant starch in HNPCC(Lynch Syndrome)
Prof. Sir John Burn, Newcastle University
Grant ID: G0100496
Examining the coupling of small GTPase activation and metabolism of the phosphoinositide lipid PI(3,5)P2 in Charcot Marie Tooth Type 4 Neuropathies
Dr Laura Swan, University of Liverpool
Grant ID: MR/N010035/1
Identification of novel short tandem repeat expansions in neurological disorders
Dr Arianna Tucci, Queen Mary University of London
Grant ID: MR/S006753/1
Do secondary mitochondrial DNA defects cause retinal ganglion cell death in dominant optic atrophy?
Professor Patrick Yu Wai Man, Newcastle University
Grant ID: G0701386
Molecular mechanism of the recovery in infantile reversible cytochrome c oxidase (COX) deficiency myopathy
Prof Rita Horvath, Newcastle University
Grant ID: G1000848
Exosomal protein deficiencies: how abnormal RNA metabolism results in childhood-onset neurological diseases
Prof Rita Horvath, Newcastle University
Grant ID: MR/N025431/1
UK Infrastructure for Large-scale Clinical Genomics Research
Professor Sir Mark Caulfield, Queen Mary University of London
Grant ID: MC_EX_MR/M009203/1
Grant ID: HDR-9004
Using transcriptomics to transform the diagnosis and understanding of inherited adult neurological disorders
Professor Mina Ryten, University College London
Grant ID: MR/N008324/1
Grant ID: MR/N027302/1
Grant ID: MR/N027302/2
What disease mechanisms contribute to multisystem tissue involvement in dominant optic atrophy due to OPA1 mutations?
Professor Patrick Yu Wai Man, Newcastle University
Grant ID: G1002570
ISCF HDRUK DIH Sprint Exemplar: Cloud-based integration of phenotype and genotype data for rare disease research
Professor John R Bradley, Cambridge University Hospitals Trust
Grant ID: MC_PC_18030
UK Infrastructure for Large-scale Clinical Genomics Research
Professor Sir Mark Caulfield, Queen Mary University of London
Grant ID: MR/M009203/1
Exosomal protein deficiencies: how abnormal RNA metabolism results in childhood-onset neurological diseases
Prof Rita Horvath, University of Cambridge
Grant ID: MR/N025431/2
UK Infrastructure for Large-scale Clinical Genomics Research
Professor Sir Mark Caulfield, Queen Mary University of London
Grant ID: MC_PC_14089
Understanding genetic mechanisms modulating the clinical expression of mitochondrial disease.
Prof Patrick F. Chinnery, University of Cambridge
Grant ID: MC_UU_00015/9
MICA: Childhood arthritis and its associated uveitis: stratification through endotypes and mechanism to deliver benefit; the CLUSTER Consortium.
Professor Lucy Wedderburn, University College London
Grant ID: MR/R013926/1
MRC Strategic Award to establish an International Centre for Genomic Medicine in Neuromuscular Diseases
Professor Michael Hanna, University College London
Grant ID: MR/S005021/1
Targeting the cellular metabolism to treat tissue-specific mitochondrial diseases
Prof Rita Horvath, University of Cambridge
Grant ID: MR/V009346/1
Muscular Dystrophy UK (1)
Grant ID: RA4/0924
NHGRI NIH HHS (2)
Grant ID: UM1 HG006370
Grant ID: U24 HG011449
NHS England
NIH HHS (1)
Grant ID: R24 OD011883
NIH Office of the Director (2)
Grant ID: 1R24OD011883
Grant ID: 5-UM1-HG006370
National Institute for Health Research (NIHR) (8)
Grant ID: NF-SI-0616-10099
Grant ID: NIHR200158
Grant ID: ICA-CDRF-2015-01-046
Grant ID: NF-SI-0507-10376
Grant ID: ACF-2016-14-010
Grant ID: NF-SI-0512-10113
Grant ID: NF-SI-0513-10141
Grant ID: NF-SI-0617-10154
Sight Research UK (1)
Grant ID: SEE 003
Versus Arthritis (2)
Centre for Adolescent Rheumatology Versus Arthritis at University College London (UCL), University College London Hospitals (UCLH) and Great Ormond Street Hospital (GOSH) NHS Trusts
Professor Lucy Wedderburn, University College London
Grant ID: 21593
Novel mechanistic insights into the pathogenesis of juvenile dermatomyositis.
Professor Lucy Wedderburn, University College London
Grant ID: 21552
Wellcome Trust (3)
Grant ID: 109915/A/15/Z
Nuclear genomic control of mitochondrial DNA heteroplasmy in humans: population genetics & disease
Prof Patrick F. Chinnery, University of Cambridge
Grant ID: 212219
Nuclear genomic control of mitochondrial DNA heteroplasmy in humans: population genetics & disease
Prof Patrick F. Chinnery, University of Cambridge
Grant ID: 212219/Z/18/Z