Europe PMC
Abstract 


Background

The U.K. 100,000 Genomes Project is in the process of investigating the role of genome sequencing in patients with undiagnosed rare diseases after usual care and the alignment of this research with health care implementation in the U.K. National Health Service. Other parts of this project focus on patients with cancer and infection.

Methods

We conducted a pilot study involving 4660 participants from 2183 families, among whom 161 disorders covering a broad spectrum of rare diseases were present. We collected data on clinical features with the use of Human Phenotype Ontology terms, undertook genome sequencing, applied automated variant prioritization on the basis of applied virtual gene panels and phenotypes, and identified novel pathogenic variants through research analysis.

Results

Diagnostic yields varied among family structures and were highest in family trios (both parents and a proband) and families with larger pedigrees. Diagnostic yields were much higher for disorders likely to have a monogenic cause (35%) than for disorders likely to have a complex cause (11%). Diagnostic yields for intellectual disability, hearing disorders, and vision disorders ranged from 40 to 55%. We made genetic diagnoses in 25% of the probands. A total of 14% of the diagnoses were made by means of the combination of research and automated approaches, which was critical for cases in which we found etiologic noncoding, structural, and mitochondrial genome variants and coding variants poorly covered by exome sequencing. Cohortwide burden testing across 57,000 genomes enabled the discovery of three new disease genes and 19 new associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for clinical decision making for the patients or their relatives.

Conclusions

Our pilot study of genome sequencing in a national health care system showed an increase in diagnostic yield across a range of rare diseases. (Funded by the National Institute for Health Research and others.)

N Engl J Med. Author manuscript; available in PMC 2022 Aug 3.
Published in final edited form as:
PMCID: PMC7613219
EMSID: EMS151082
NIHMSID: NIHMS1823225
PMID: 34758253

The 100,000 Genomes Pilot on Rare Disease Diagnosis in Healthcare − A Preliminary Report

The 100,000 Genomes Project Pilot Investigators, Damian Smedley, Ph.D,#1,2 Katherine R Smith, Ph.D,#1,2 Antonio Rueda Martin, M.Sc,#1 Ellen A Thomas, M.D,#1 Ellen M McDonagh, Ph.D,#1,3 Valentina Cipriani, Ph.D,#2,4,5,6 Jamie M Ellingford, Ph.D,#7,8 Gavin Arno, Ph.D,#4,5 Arianna Tucci, M.D,#1,2 Jana Vandrovcova, Ph.D,#9 Georgia Chan, Ph.D,#1 Hywel J Williams, Ph.D,#10,11 Thiloka Ratnaike, MBBS, Ph.D,12,13,14 Wei Wei, Ph.D,12,13 Kathleen Stirrups, Ph.D,15,16 Kristina Ibanez, Ph.D,1 Loukas Moutsianas, Ph.D,1,2 Matthias Wielscher, Ph.D,1 Anna Need, Ph.D,1 Michael R Barnes, Ph.D,2 Letizia Vestito, M.Sc,17,18,19 James Buchanan, D.Phil,20,21 Sarah Wordsworth, Ph.D,20,21 Sofie Ashford, B.Sc,15 Karola Rehmstrom, Ph.D,22 Emily Li, Ph.D,22 Gavin Fuller, MMedSci,23 Philip Twiss, M.Sc,23 Olivera Spasic-Boskovic, M.Sc,23 Sally Halsall, Ph.D,23 R. Andres Floto, M.D.,Ph.D,22 Kenneth Poole, M.D.,Ph.D,22,23 Annette Wagner, M.D.,Ph.D,23 Sarju G Mehta, M.D,23 Mark Gurnell, M.D.,Ph.D,24 Nigel Burrows, M.D,23 Roger James, Ph.D,15 Christopher Penkett, D.Phil,15,16 Eleanor Dewhurst, B.A,15 Stefan Gräf, Ph.D,15,25,16 Rutendo Mapeta, B.Sc,15,16 Mary Kasanicki, Ph.D,15,23 Andrea Haworth, M.Sc. FRCPath,26 Helen Savage, M.Sc, DipRCPath,26 Melanie Babcock, Ph.D,27 Martin G Reese, Ph.D,27 Mark Bale,1 Emma Baple, MBBS, Ph.D,1,28,29 Christopher Boustred, Ph.D,1 Helen Brittain, M.D,1 Anna de Burca, MBBS, PhD,30 Marta Bleda, Ph.D,1 Andrew Devereau, Ph.D,1 Dina Halai, M.Sc,1 Eik Haraldsdottir, M.Sc,1 Zerin Hyder, M.D,1,8 Dalia Kasperaviciute, Ph.D,1,2 Christine Patch, Ph.D,1 Dimitris Polychronopoulos, Ph.D,1 Angela Matchan, M.Sc,1 Razvan Sultana, Ph.D,1 Mina Ryten, M.D.,Ph.D,1,31,18,32 Ana Lisa Taylor Tavares, MBBS,1 Carolyn Tregidgo, Ph.D,1 Clare Turnbull, M.D.,Ph.D,1,33 Matthew Welland, M.Sc,1 Suzanne Wood, M.Sc,1,2 Catherine Snow, Ph.D,1 Eleanor Williams, Ph.D,1 Sarah Leigh, Ph.D,1 Rebecca E Foulger, Ph.D,1 Louise C Daugherty, M.Sc,1 Olivia Niblock, M.Sc,1 Ivone U.S. 
Leong, Ph.D,1 Caroline F Wright, Ph.D,1,28 Jim Davies, D.Phil,21 Charles Crichton, B.A,21 James Welch, B.A,21 Kerrie Woods, B.A,21 Lara Abulhoul, M.D,34 Paul Aurora, MRCP, Ph.D,35 Detlef Bockenhauer, M.D,17,36 Alexander Broomfield, M.D,17 Maureen A Cleary, M.D,17 Tanya Lam, MBBS, MPH,17 Mehul Dattani, FRCP,18,37 Emma Footitt, Ph.D,17 Vijeya Ganesan, M.D,17 Stephanie Grunewald, M.D.,Ph.D,34,38 Sandrine Compeyrot-Lacassagne, M.D,17,38 Francesco Muntoni, M.D,17,38 Clarissa Pilkington, MBBS,17,38 Rosaline Quinlivan, M.D,17 Nikhil Thapar, M.D.,Ph.D,39,40 Colin Wallis, M.D,17 Lucy R Wedderburn, FRCP, Ph.D,17,35,38 Austen Worth, M.D,17 Teofila Bueser, M.Sc,32,41 Cecilia Compton, M.Sc,32 Charu Deshpande, MRCPCH,32 Hiva Fassihi, FRCP,42 Eshika Haque, M.Sc,32 Louise Izatt, Ph.D,32 Dragana Josifova, M.D,32 Shehla Mohammed, FRCP,32 Leema Robert, MRCPCH,32 Sarah Rose, M.Sc,32 Deborah Ruddy, Ph.D,32 Robert Sarkany, FRCP,42 Genevieve Say, M.Sc,32 Adam C Shaw, M.D,32 Agata Wolejko, M.Sc,43 Bishoy Habib, B.Sc,43 Gavin Burns, Ph.D,43 Sarah Hunter, M.Sc,43 Russell J Grocock, Ph.D,43 Sean J Humphray, B.Sc,43 Peter N Robinson, M.D,44 Melissa Haendel, Ph.D,45 Michael A Simpson, Ph.D,46 Siddharth Banka, M.D.,Ph.D,7,8 Jill Clayton-Smith, FRCP,7,8 Sofia Douzgou, FRCP, Ph.D,7,8 Georgina Hall, M.Sc,7,8 Huw B Thomas, Ph.D,7 Raymond T O’Keefe, Ph.D,7 Michel Michaelides, FRCOphth,5,4 Anthony T Moore, FRCOphth,5,4,47 Sam Malka, B.Sc,5,4 Nikolas Pontikos, Ph.D,5,4 Andrew C Browning, M.D.,Ph.D,48 Volker Straub, M.D, PhD,49 Gráinne S Gorman, FRCP, Ph.D,50,51,52 Rita Horvath, M.D, PhD,50,12 Richard Quinton, M.D,53,54 Andrew M Schaefer, MRCP,50,51 Patrick Yu-Wai-Man, FRCOphth, Ph.D,55,13,56 Doug M Turnbull, FMedSci, FRS,50,51,52 Robert McFarland, MRCPCH, Ph.D,50,51 Robert W Taylor, FRCPath, Ph.D,50,51 OConnor Emer, M.D,9 Yip Janice, MRes,9 Newland Katrina, M.Sc,9 Huw R Morris, FRCP, Ph.D,9 James Polke, FRCPath, Ph.D,9 Nicholas W Wood, Ph.D, FMedSci,9,6 Carolyn Campbell, FRCPath,57 Carme Camps, 
Ph.D,58,21 Kate Gibson, B.Sc,57 Nils Koelling, Ph.D,59 Tracy Lester, Ph.D, FRCPath,57 Andrea H Németh, FRCP, D.Phil,60,30 Claire Palles, Ph.D,61 Smita Patel, FRCP, FRCPath, Ph.D,62,21 Noemi BA Roy, FRCPath, D.Phil,59,63,21 Arjune Sen, MRCP, Ph.D,64,21,65 John Taylor, Ph.D,57,21 Pilar Cacheiro, Ph.D,2 Julius O Jacobsen, Ph.D,2 Eleanor G Seaby, M.D,66 Val Davison, FRCPath,67 Lyn Chitty, Ph.D, MRCOG,17,18,38 Angela Douglas, Ph.D, FRCPath,68,67 Kikkeri Naresh, FRCPath,69 Dom McMullan, Ph.D, FRCPath,70 Sian Ellard, Ph.D, FRCPath,71 I. Karen Temple, Ph.D, FRCPath,72,73 Andrew D Mumford, Ph.D, FRCPath,74 Gill Wilson, FRCP,75 Phil Beales, FMedSci,18,17,38 Maria Bitner-Glindzicz, MBBS, Ph.D,18,17,38 Graeme Black, M.D, D.Phil,7,8 John R Bradley, DM,15 Paul Brennan, FRCP,49 John Burn, MBBS, Ph.D,76 Patrick F Chinnery, MedSci,12,13,15 Perry Elliott, M.D,77 Frances Flinter, M.D,32 Henry Houlden, M.D,9 Melita Irving, M.D,32,78 William Newman, M.D, PhD,7,8 Shamima Rahman, FRCP,FRCPCH, Ph.D,34,79 John A Sayer, MB ChB, PhD,53,54,80 Jenny C Taylor, Ph.D,58,21 Andrew R Webster, FRCOphth,5,4 Andrew OM Wilkie, FMedSci, FRS,59 Willem H Ouwehand, FMedSci,15,81,82,16 F Lucy Raymond, M.D.,Ph.D,15,22 NIHR Bioresource,15 John Chisholm, FREng,1 Sue Hill, Ph.D,67 David Bentley, D.Phil,43 Richard H Scott, M.D.,Ph.D,#1,17 Tom Fowler, Ph.D,#1,2 Augusto Rendon, Ph.D,#1,16 and Mark Caulfield, FRCP, FMedScicorresponding author#1,2

Abstract

Background

The UK 100,000 Genomes Project is in the process of investigating the role of genome sequencing of patients with undiagnosed rare disease following usual care, and the alignment of research with healthcare implementation in the UK’s national health service. (Other parts of this Project focus on patients with cancer and infection.)

Methods

We enrolled participants, collected clinical features with human phenotype ontology terms, undertook genome sequencing and applied automated variant prioritization based on virtual gene panels (PanelApp) and phenotypes (Exomiser), alongside identification of novel pathogenic variants through research analysis. We report results on a pilot study of 4660 participants from 2183 families with 161 disorders covering a broad spectrum of rare disease.

Results

Diagnostic yields varied by family structure and were highest in trios and larger pedigrees. Likely monogenic disorders had much higher diagnostic yields (35%), with intellectual disability, hearing and vision disorders achieving yields between 40 and 55%. Those with more complex etiologies had a yield of 11%, and the overall yield was 25%. Combining research and automated approaches was critical to 14% of diagnoses in which we found etiologic non-coding, structural and mitochondrial genome variants and coding variants poorly covered by exome sequencing. Cohort-wide burden testing across 57,000 genomes enabled discovery of 3 new disease genes and 19 novel associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for the clinical decision-making for the patient or their relatives.

Conclusion

Our pilot study of genome sequencing in a national health care system demonstrates diagnostic uplift across a range of rare diseases.

(Funded by National Institute for Health Research and others)

Rare disease is a worldwide healthcare challenge with approximately 10,000 disorders affecting 6% of the population in Western societies.1,2 Over 80% of rare diseases have a genetic component and these conditions are disabling and expensive to manage. One-third of children with a rare disease die before their fifth birthday.1 The adoption of next generation sequencing has improved rare disease diagnostic rates over the past decade.3–5 However, the majority of rare disease patients remain without a molecular diagnosis following standard diagnostic testing.3–5 To address this, the UK Government launched the 100,000 Genomes Project (100KGP) in 2013 to apply whole genome sequencing (WGS) to rare disease, cancer and infection in national healthcare.6

To assess the impact of this WGS approach on the genetic diagnosis of rare disease in the UK’s National Health Service, we carried out a pilot study in which we enrolled families and undertook detailed clinical phenotyping of the proband.4 We collected electronic health records from all participants in a multi-petabyte research environment.5 When necessary, we carried out wet bench orthogonal tests and in-silico approaches.

Methods

Patients

Following ethical approval, consenting participants (identified by healthcare professionals and researchers) with a broad range of rare diseases without diagnoses after undergoing usual care in the NHS (which ranged from no available test through approved tests which did not include genome sequencing) were recruited by nine English hospitals and consented through the National Institute for Health Research (NIHR) BioResource for Rare Diseases. To test the broad applicability of genome sequencing, participants were eligible if they had a rare disease (defined in the UK as a disorder affecting 1 in 2000 people or fewer), were likely to have a single gene or oligogenic aetiology, and had no genomic diagnosis. Data on prior proband testing were collected where possible, including single-gene tests, karyotyping, single nucleotide polymorphism (SNP) arrays, next generation sequencing panels, and exomes. Probands and, where feasible, parents and/or other family members were enrolled by multiple clinical specialties in the NHS. Standardized baseline clinical data were recorded using the Human Phenotype Ontology (HPO)7 against disease-specific data models8 and whole blood was drawn for DNA extraction. The participants are followed over their life course using electronic health records (all hospital episodes, registries and cause of death).

Genome Sequencing

Genome sequencing9 was performed using the Illumina TruSeq DNA PCR-Free sample preparation kit by Illumina Laboratory Sciences, Cambridge UK on a HiSeq 2500 sequencer, generating a mean depth of 32× (range from 27× to 54×) and greater than 15× for at least 95% of the reference human genome. WGS reads were aligned to the Genome Reference Consortium human genome build 37 (GRCh37) using Isaac Genome Alignment Software. Family-based variant calling of single nucleotide variants (SNVs) and insertions/deletions (indels) for chromosomes 1 to 22, X, and the mitochondrial genome (mean 2814× coverage, range 142-16581×) was performed using the Platypus variant caller.10

The Diagnostic Pipeline

We constructed an automated analytical pipeline to filter the genome down to rare, segregating and predicted damaging candidate variants in coding regions. To limit the possibility of overlooking or inefficiently prioritizing diagnoses, we focussed initially on virtual gene panels based on both the recruited clinical indication/disease and submitted HPO terms (applied virtual panels). To address the issue of which genes have sufficient evidence to attribute causation and include in these virtual gene panels, we used our PanelApp software to enable expert, crowd-sourced review and curation of genes with diagnostic-grade evidence for each of our disease categories, e.g. evidence in at least three unrelated families.11 Loss of function (LoF) or de novo, protein-altering variants affecting genes in the applied virtual panels were classified as tier 1, other variant types such as missense variants affecting these genes were classified as tier 2, and all other filtered variants were classified as tier 3 (Figure S1 in the Supplementary Appendix). To further reduce the possibility of missing or inefficiently prioritizing diagnoses, we ran Exomiser,12 a phenotype-based approach that looks across all genes in the genome for a diagnosis. Exomiser prioritizes rare, segregating, predicted pathogenic variants in genes where the patient phenotypes match previous reference knowledge from human disease or model organism databases. The ontology-driven phenotype matching can detect patients with an atypical profile for a disease.
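The tiering rule described above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual implementation: the `Variant` fields, the consequence label `"lof"`, and the function name `assign_tier` are hypothetical, and the sketch assumes each variant has already passed the rarity, segregation and predicted-damage filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified record of a variant that passed filtering."""
    gene: str
    consequence: str        # illustrative labels, e.g. "lof" or "missense"
    de_novo: bool
    protein_altering: bool


def assign_tier(variant: Variant, applied_panel_genes: set) -> int:
    """Return tier 1, 2 or 3 following the rule stated in the text.

    Tier 1: LoF, or de novo protein-altering, in an applied virtual panel gene.
    Tier 2: any other variant type (e.g. missense) in an applied panel gene.
    Tier 3: every other filtered variant.
    """
    if variant.gene in applied_panel_genes:
        is_lof = variant.consequence == "lof"
        is_de_novo_altering = variant.de_novo and variant.protein_altering
        return 1 if (is_lof or is_de_novo_altering) else 2
    return 3


# Illustrative usage with a made-up panel and made-up variants.
panel = {"TCN2", "CTPS1"}
print(assign_tier(Variant("TCN2", "lof", False, True), panel))
print(assign_tier(Variant("CTPS1", "missense", False, True), panel))
print(assign_tier(Variant("GENE_X", "lof", False, True), panel))
```

Keeping the panel membership check outermost mirrors the text: variant type only matters for genes that are in an applied virtual panel.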

Decision support systems and clinical genetics teams provided by Congenica Ltd and Fabric Genomics13,14 assisted us in variant prioritization and return of candidate variants to the 13 NHS Genomic Medicine Centres (GMC). These variants were reviewed by NHS clinical scientists and clinicians using the American College of Medical Genetics and Genomics guidelines and a diagnostic report was issued for each proband.15 Final clinical outcomes included whether a genetic diagnosis was obtained, the variant(s) involved, whether they explained all, or some of the phenotypes and whether an intervention was deployed.

The pilot participants were recruited and sequenced throughout 2014-2016, while the infrastructure to collect, QC, process and return data was being established. Results were returned to the GMCs from May 2016 to April 2019. In our post-pilot phase with an established pipeline, we now return results to the GMCs within 6 weeks of sample collection.

Novel Pathogenic Variants

Researchers investigated coding and non-coding regions for novel diagnoses in genes matching the patients’ phenotypes, including the presence of de novo variants in highly constrained coding regions16 with 95% confidence. We used a novel methodology for mitochondrial DNA that accounts for heteroplasmy,17 Genomiser,18 and ExpansionHunter for simple tandem repeat expansions.19 Finally we employed a novel random forest method to analyse Canvas20 and Manta21 calls and identify potentially pathogenic copy number and structural variants.

Gene-based burden testing to detect enrichment of rare, predicted pathogenic, segregating variants in novel genes in specific disease cohorts relative to controls was performed on the pilot genomes as well as additional genomes from the rest of the 100KGP to increase power (57,002 genomes; see Supplementary Methods).

The pilot genomic and clinical data are freely accessible to members of a Genomics England Clinical Interpretation Partnership (GeCIP) domain (https://www.genomicsengland.co.uk/about-gecip/).

Statistical Analysis

Testing was performed using the R (version 3.6.0) and Stata (version 16) statistical packages. Further detail on individual methods is given in the Supplementary Appendix.

Results

Patients

We enrolled 4660 participants (2183 probands and 2477 family members) from 161 broad categories across rare disease (Table 1), with neurologic, ophthalmologic and tumor syndromes commonly represented. Participants were recruited with varying numbers of affected and unaffected family members. We aimed, with varying degrees of success, to recruit trios or larger family structures to facilitate more effective variant prioritization. Of the probands with multiple bowel polyps whom we recruited, 93% were singletons. In contrast, 12% of probands with intellectual disability were singletons. Adult probands were more commonly enrolled than pediatric probands (age at recruitment 18 years or younger) (74% vs. 26%), in line with the general population (79% vs. 21%; 2011 census of England and Wales). The preponderance of adults is unusual compared to previous sequencing projects and reflects an eligibility criterion: probands had already undergone usual care, which in many cases involved standard genetic testing (mostly single-gene or panel-based). A lower percentage of female probands were recruited across most disease categories, especially for pediatric cases, where the difference was significant (232 female vs. 339 male; P<0.001, based on the expected female proportion of 51% from the 2011 census of England and Wales). The increased susceptibility of males to recessive X-linked conditions may account for this sex bias: over 6% of total diagnoses involved variants on the X chromosome (which represents approximately 5% of the genome). The inferred ancestry of the probands (see Supplementary Appendix) was in line with that expected from the population (86% white, 7.5% Asian, 3.3% black, 2.2% mixed, 1% other; 2011 census of England and Wales). However, significantly more pediatric probands were of South Asian ancestry compared to adult probands (16% vs. 4%, P<0.001); our results indicated potential consanguinity in 43% of pediatric South Asian probands and 1% of the other pediatric probands (Table 1).
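The comparison of the pediatric sex split against the census proportion can be reproduced approximately as follows. The text does not name the test that was used, so this sketch assumes a two-sided exact binomial test; the function names are illustrative, and only the counts (232 female, 339 male) and the expected proportion (51%) come from the text.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_p(k: int, n: int, p: float) -> float:
    """Two-sided exact binomial P value.

    Sums the probability of every outcome that is no more likely than the
    observed one (a common convention for the exact two-sided test).
    """
    observed = binomial_pmf(k, n, p)
    tolerance = observed * 1e-9  # guard against floating-point ties
    total = 0.0
    for i in range(n + 1):
        prob = binomial_pmf(i, n, p)
        if prob <= observed + tolerance:
            total += prob
    return min(1.0, total)


females, males = 232, 339            # pediatric probands, from the text
expected_female_fraction = 0.51      # 2011 census figure quoted in the text
p_value = two_sided_binomial_p(females, females + males, expected_female_fraction)
print(f"P = {p_value:.2e}")
```

Under these assumptions the P value falls well below the 0.001 threshold reported in the text.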

Table 1

Demographics (including inferred ancestry) of the 100,000 Genomes Project pilot.
Variable | All probands (N=2183) | Paediatric (age at recruitment <=18) probands (N=571) | Adult (age at recruitment >18) probands (N=1612)
Sex, no. (%)
Male | 1138 (52) | 339 (16) | 799 (37)
Female | 1045 (48) | 232 (11) | 813 (37)
Total | 2183 (100) | 571 (26) | 1612 (74)
Median (IQR) age in years at recruitment | 35 (18-54) | 9 (5-14) | 45 (31-60)
Race or ethnic group, no. (%), % consanguinity suggested in record
African | 50 (2), 0 | 25 (4), 0 | 25 (2), 0
Admixed American | 26 (1), 23 | 12 (2), 25 | 14 (1), 21
East Asian | 8 (<1), 0 | 2 (<1), 0 | 6 (<1), 0
European | 1931 (88), <1 | 438 (77), <1 | 1493 (93), <1
South Asian | 163 (7), 36 | 93 (16), 43 | 70 (4), 25
Not determined | 5 (<1), 0 | 1 (<1), 0 | 4 (<1), 0
Total | 2183 (100), 3 | 571 (26), 8 | 1612 (74), 2

Clinical Data and Sequencing

We collected HPO terms for each participant: a median of 4 present terms (range 1-61) and a median of 4 absent terms, i.e. phenotypes not exhibited by the proband (range 0-144). We then carried out genome sequencing followed by quality assurance to check coverage, sequence quality, presence of repeat sample submissions or sample swaps, and consistency with reported family structures (see Supplementary Appendix).

The Diagnostic Yield

We obtained genetic diagnoses for 25% of probands and deposited the genotypes into the ClinVar repository (accession numbers XXXX to YYYY). Of these diagnoses, 60% were made on the basis of coding SNV/indels in the applied virtual panels, 26% from coding SNV/indels affecting well-established disease genes outside the virtual panels using phenotype-based prioritization and/or expert review by the clinicians, Congenica Ltd, or Fabric Genomics, and 14% from genome-wide, phenotype-agnostic research analysis looking beyond SNV/indels, coding regions, and disease genes in the virtual panels (Figure 1). Following international guidelines15 a further 10% of probands were classified with variants of unknown significance in genes consistent with the phenotype by clinical review at the site, but with further functional validation required. Fewer candidate variants were returned after filtering in larger family structures (Table 3), making it easier to identify causative variants, in turn leading to higher diagnostic rates for trios, quads and more complex family structures (Figure 2a), even within a disorder e.g. for hereditary ataxia the diagnostic rate increased from 21% for singletons to 32% for trios (Table S4 in the Supplementary Materials).

Figure 1. Overview of the diagnostic and research pipeline and source of diagnoses. Results were returned to the Genomic Medicine Centres (GMCs) of the recruiting hospitals on 2183 pilot probands. Of these, 25% received a positive diagnosis and 10% had variant(s) of unknown significance (VUS) in genes consistent with the phenotype according to clinical geneticists at the recruiting site, but with further functional validation required. The remaining 65% received a negative report at the time but will be reanalysed. The numbers and sources of the positive diagnoses are shown at each stage of the automated diagnostic pipeline and of the additional research performed where a clear diagnosis was not immediately obvious.

Figure 2. Diagnoses in the rare disease pilot.

(a) Percentage diagnostic yield for all samples and sub-divided by family structure or whether likely monogenic (35% yield) vs more complex aetiologies (11% yield) with the numbers of probands shown on bars, (b) Percentage diagnostic yield by disease area (numbers of closed probands shown on bars), (c) Percentage diagnostic yield for probands with/without prior genetics testing and broken down by most extensive testing type: chromosomal (karyotyping, arrayCGH, SNP arrays), targeted single gene tests, NGS panels or WES (numbers of closed probands shown on bars), (d) Performance of virtual panel-based and Exomiser prioritization for identifying the diagnoses. Virtual disease panel only: a single panel for the recruited disease category. Applied panels: all applied virtual panels used in the pipeline, including the recruited disease associated panel as well as 0 or more additionally selected panels based on the patient phenotypes (HPO terms). Proportions of diagnoses detected are in blue (sensitivity) along with proportions of prioritized variants leading to a positive diagnosis in orange (positive predictive value). Proportions are also shown on bars. Here, diagnosed variant(s) are true positives and other returned candidate variants are false positives.

Table 3

Number of candidate variants returned to the NHS per case by automated virtual panel-based analysis pipeline. Duos refer strictly to parent-child pairs and trios to both parents and a child in a family. Values shown are median (IQR).

Measure | All family structures | Singletons | Duos | Trios | Other family structures
Variants after filtering | 221 (49-288) | 292 (258-327) | 149 (117-213) | 29 (17-136) | 22.5 (9-71)
In virtual panels | 1 (0-2) | 1 (0-2) | 1 (0-3) | 1 (0-2) | 0 (0-1)

Unsurprisingly, we obtained a higher diagnostic yield for diseases that were considered more likely to have a monogenic cause (Table S4 in the Supplementary Appendix) than those we considered more likely to have complex etiology (35% vs 11%) (Figure 2a). Likely monogenic diseases equate to those with a presence in OMIM and where genetic testing is part of the standard diagnostic workup, based on the consensus blinded review of three clinical geneticists. Diagnostic yield was highly variable by disease (Figure 2b, Table S3 in the Supplementary Appendix), varying from 40-55% for intellectual disability and various vision and hearing disorders to 6% for tumor syndromes.

We obtained data on the presence or absence of prior genetic testing for a subset (1177) of the participants. The number of tests per proband ranged from 0-16 with a median of 1 (IQR 0-2), and approximately half of the probands in this subset had been tested at least once. The overall diagnostic uplift from genome sequencing in this subset was 32%, with only a slight difference depending on whether prior testing had been performed (31%) or not (33%). However, many of these prior tests were not recent. The diagnostic yield provided by genome sequencing varied from 28% to 45% depending on the type of prior testing (Figure 2c, Table S5 in the Supplementary Appendix) which, for the most part, involved targeted single gene and panel testing (Table S6 in the Supplementary Appendix).

Diagnostic Pipeline

The aim of the automated, diagnostic pipeline is to identify a few, potentially causative candidate variants, from the millions in a whole genome, through removal of extremely unlikely candidates (filtering) and identification of the most likely in the remainder (prioritization). This allows the GMCs to efficiently perform manual, clinical interpretation and issue a diagnostic report. The virtual panel-based pipeline identified 322 (66%) of the 490 SNV/indel-based diagnoses from the genomes, with a high positive predictive value given the millions of variants in the whole genomes: of 1041 of returned candidate variants, 291 (28%) proved to be diagnostic. We re-ran this analysis in December 2019 to assess the impact of using updated versions of the virtual panels containing the latest disease gene discoveries, improved virtual panel selection based on the patient’s phenotype and advances in variant filtering strategies, e.g. allowing for incomplete penetrance where suspected. This increased the number of genetic diagnoses detected from 322 to 377 (77%) with a positive predictive value of 15% (Figure 2d), demonstrating effective filtering and prioritization of the variants with only a median of 1 (IQR 0-2) candidate variant in panels returned to the clinicians at the GMCs per case (Table 3). Ongoing evolution of the virtual panels with new disease genes is expected to continue increasing the yield from this approach.
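The percentages quoted above follow directly from the counts in the text. The short sketch below recomputes them; the helper name `percent` is illustrative, and the counts (322, 377, 490, 291, 1041) are taken from the paragraph above. Sensitivity here means the share of SNV/indel-based diagnoses that the pipeline identified, and positive predictive value means the share of returned candidate variants that proved diagnostic.

```python
def percent(numerator: int, denominator: int) -> int:
    """Proportion expressed as a whole-number percentage."""
    return round(100 * numerator / denominator)


total_snv_indel_diagnoses = 490

# Original run of the virtual panel-based pipeline.
sensitivity_original = percent(322, total_snv_indel_diagnoses)   # 66
ppv_original = percent(291, 1041)                                # 28

# December 2019 re-run with updated panels and filtering.
sensitivity_rerun = percent(377, total_snv_indel_diagnoses)      # 77

print(sensitivity_original, ppv_original, sensitivity_rerun)
```

The re-run trades some positive predictive value (15%, as reported in the text) for higher sensitivity, which is the expected direction when panels grow and filters are relaxed.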

Phenotype-based prioritization using Exomiser detected 77%, 86%, and 88% of these diagnoses in the top, top 3 and top 5 ranked candidates respectively (Figure 2d). Exomiser and the virtual panels were complementary, with 92% of these diagnoses recalled when the two approaches were combined (last blue bar in Figure 2d). Precision phenotyping of our patients was essential both for Exomiser and for the selection of additional virtual panels, without which only 54% of these diagnoses would have been prioritized in the recruited disease virtual panel and presented to the GMCs as a likely candidate (first blue bar in Figure 2d).
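The top, top 3 and top 5 figures are recall-at-rank measures. A minimal sketch of how such a measure can be computed is given below; the function name and the toy ranks are hypothetical and do not reproduce the study data or Exomiser's own output format.

```python
from typing import Optional, Sequence


def top_k_recall(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of diagnosed cases whose causal variant is ranked within the top k.

    `ranks` holds, for each diagnosed case, the 1-based rank of the diagnosed
    variant among the prioritized candidates, or None if it was not returned.
    """
    if not ranks:
        raise ValueError("at least one diagnosed case is required")
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


# Toy example: eight diagnosed cases with made-up ranks.
toy_ranks = [1, 1, 2, 4, None, 1, 7, 3]
for k in (1, 3, 5):
    print(k, top_k_recall(toy_ranks, k))
```

Cases whose diagnosed variant was never returned count against recall at every k, which is why combining two complementary methods can raise recall beyond either alone.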

Research-based Diagnoses

Fourteen percent of the genetic diagnoses required research outside the diagnostic pipeline (Figure 1). This research involved comparisons with the genome sequences and clinical data in our research environment, with validation using wet bench orthogonal tests and in-silico approaches (Table S7 in the Supplementary Appendix). Additional diagnoses were made by screening for the presence of de novo variants in highly constrained coding regions.16 These diagnoses included a de novo EBF3 missense variant in a patient with hereditary ataxia. Mitochondrial genome analysis, taking into account heteroplasmy, detected 4 new diagnoses (as well as the 9 that had already been detected by the main pipeline). Twelve probands had intronic splicing variants prioritized by Exomiser due to the known pathogenic status of these variants in ClinVar.23 Nine novel non-coding diagnoses involving previously undescribed variants required exploration of the whole genome and in vitro functional validation via reverse transcription polymerase chain reaction, mini-gene, or luciferase assays.24,25,26 Here, unsolved probands were queried for non-coding variants affecting genes in the applied virtual panels, either alone or in compound heterozygosity with loss-of-function variants. These were identified using either Genomiser or, for retinal disorder probands, systematic analysis of the untranslated regions, promoter or introns. A further 43 probands were fully or partially explained by structural variants or simple tandem repeat expansions in the genes HTT or FXN in probands with hereditary spastic paraplegia.

Novel Disease Gene Associations

We performed burden testing to discover novel Mendelian disease gene associations and potential genetic diagnoses for unsolved probands; 828 significant disease-gene associations (q value < 0.1) were identified, including 249 known and 579 novel genes (novel with respect to their association with disease), with only 0.03 ± 0.2 (range 0-3) associations from 10,000 permutations where cases and controls were assigned randomly. Twenty-two candidates represent the most likely new, fully penetrant, Mendelian disease genes (Table S8 in the Supplementary Appendix and ClinVar accession numbers SCV001759972 - SCV001760540) with three recently independently confirmed diagnoses: UBAP1 in hereditary spastic paraplegia,27 FOXJ1 in non-CF bronchiectasis,28 and SORD in Charcot-Marie-Tooth disease.29 Diagnostic reports were issued for three probands with these genes (Figure 1) and we are investigating others in GeneMatcher and by functional validation studies in model organisms.
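The q value threshold above implies a multiple-testing correction across all disease-gene tests. The text does not state which procedure produced the q values (details are in the Supplementary Methods), so the sketch below assumes the widely used Benjamini-Hochberg adjustment purely for illustration; the function name and the toy P values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted values in the input order.

    Each P value is scaled by (number of tests / rank), and a running minimum
    taken from the largest P value downwards keeps the result monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy P values for five hypothetical disease-gene tests.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_qvalues = benjamini_hochberg(toy_pvalues)
significant = [q < 0.1 for q in toy_qvalues]
print(toy_qvalues, significant)
```

Comparing the number of associations passing the threshold with the number obtained after randomly permuting case and control labels, as the text describes, is an independent check that the threshold is not admitting spurious signal.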

Diagnostic Sequelae

These findings ended long diagnostic odysseys for some patients and their families (the median duration of odyssey was 75 months and the number of hospital visits was 68; Table S1 in the Supplementary Appendix); we speculate that they will mitigate NHS resource costs (183,273 episodes of hospital care costing £87 million for affected participants; Table S3 in the Supplementary Appendix). In addition, 134 (25%) of the 533 genetic diagnoses were reported by clinicians to be of immediate clinical actionability, with only 11 (2%) described as having no benefit. As of now, the remainder of the diagnoses are of unknown utility. Healthcare benefits included 4 diagnoses leading to a suggested change in medication, 26 suggesting additional surveillance for the proband or relatives, 13 allowing clinical trial eligibility, 59 informing future reproductive choices, and 32 with other benefits (Table S9 in the Supplementary Appendix).

In several specific probands, diagnoses have had important clinical actionability. In a 36-year-old male with suspected choroideraemia, we detected a novel CHM promoter variant causing loss of gene expression26 and offering eligibility for a gene-replacement trial. A male neonate proband presented with severe infection and transient neurologic symptoms immediately after birth and died at 4 months with no diagnosis but healthcare costs of approximately £80,000 (Table S10 in the Supplementary Appendix). A diagnosis of transcobalamin 2 deficiency due to a homozygous frameshift in TCN2 was made from this study, which enabled predictive testing to be offered to the younger brother within one week of birth. The younger child, who tested positive, received weekly hydroxocobalamin injections to prevent metabolic decompensation. A 10-year-old girl was admitted to intensive care with life-threatening chicken pox. She had endured a diagnostic odyssey over seven years at a total cost of £356,571 across 307 secondary care episodes (Table S11 in the Supplementary Appendix). We were able to diagnose CTPS1 deficiency due to a homozygous, known pathogenic splice acceptor variant. This diagnosis enabled a curative bone marrow transplant (cost £70,000), and predictive testing of her siblings showed no further family members to be at risk. One proband had waited until his sixth decade for a genomic diagnosis of an INF2 mutation causing focal segmental glomerulosclerosis. His father, brother and uncle had all died of renal failure. He had received two kidney transplants, had transmitted the condition to his daughter and was concerned about whether his 15-year-old granddaughter, who was under surveillance, was at risk. After he received his genetic diagnosis, the granddaughter was tested, found to be negative, and discharged from regular medical surveillance.

Discussion

Our findings demonstrate a substantial uplift in genomic diagnoses achieved for patients by genome sequencing across a broad spectrum of rare disease. The enhanced diagnostic benefit was observed regardless of whether participants had undergone prior genetic testing (31% in those who had received testing and 33% in those who had not). For 25% of those who received a genetic diagnosis, there was immediate clinical actionability. Standardizing procedures, from enrolment of patients to the return of NHS-validated results to clinicians, was critical to our success. For example, clinical data collection using disease-specific data models and HPO terms enabled diagnoses, confirming the value of standardization through ontologies and clinical annotation in precision medicine.30 These additional diagnoses, beyond the 264 (49% of total diagnoses) observed in the single disease virtual panel, came from Exomiser and additional, applied virtual panels. Combining research, decision support, and clinical validation and assessment yielded an additional 72 diagnoses.

Diagnostic yield was influenced by family structure, and for disorders with a likely Mendelian inheritance and a single-gene etiology our yield increased to 35%; ophthalmological, metabolic and neurologic disorders yielded the greatest percentage of diagnoses. The scale of our dataset enabled cohort-wide burden testing, which identified numerous novel disease–gene associations, including three that have now been confirmed and 19 with compelling evidence that are likely to be confirmed in independent datasets.
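
To illustrate the general idea behind gene-based burden testing, the following Python sketch compares how often a qualifying rare variant in one gene is carried by probands with a given disorder versus everyone else in the cohort. It is a minimal sketch under stated assumptions: the one-sided Fisher exact test, the Bonferroni threshold, the gene count and all carrier counts are hypothetical choices made for illustration, not the statistical method or data used in this study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: cases carrying a qualifying variant in the gene
    b: cases not carrying one
    c: controls carrying a qualifying variant
    d: controls not carrying one
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    cases = a + b
    carriers = a + c
    n = a + b + c + d
    denom = comb(n, carriers)
    upper = min(cases, carriers)
    # Sum hypergeometric probabilities for every table at least as extreme.
    tail = sum(
        comb(cases, k) * comb(n - cases, carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Hypothetical counts for one gene and one disease category (not study data).
p_value = fisher_exact_greater(a=8, b=92, c=10, d=9890)

# Hypothetical Bonferroni correction for testing about 20,000 genes.
n_genes_tested = 20_000
threshold = 0.05 / n_genes_tested

print(f"p = {p_value:.3g}, significant: {p_value < threshold}")
```

A one-sided test is used here because the question is whether carriers are enriched among cases; a real analysis would also need to address variant filtering, relatedness and population structure.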

Of the diseases we diagnosed through genome-sequencing, 13% were caused by mutations in non-coding sequence or mitochondrial genomes, tandem repeat expansions in Huntington disease, and a wide range of structural variants with nucleotide resolution of breakpoints using a novel random forest method. An additional 2% of diagnoses involved coding variants in regions of low coverage on exome sequencing. Our results provide new evidence of the value of genome sequencing and mirror previous studies where 53% of participants who received new diagnoses from genome sequencing had previously received testing by exome sequencing.5

Previous studies have demonstrated how next-generation sequencing can reveal diagnoses with yields of between 25% and 29% from exome sequencing in persons who had received no prior genetic testing.32–34 The Undiagnosed Disease Network reported a 26% yield from a mixture of exome and genome sequence analysis of 382 patients5 and another genome sequencing study gave a 42% yield in 50 families with intellectual disability in whom prior testing had been carried out.35 We obtained similar results with a broad range of disorders (160) with unmet diagnostic need. Our approach is limited to diagnoses that are readily made through short-read genome sequencing. Fully phased, long-read sequencing better detects structural variation and delivers sequence from parts of the genome that are poorly captured by short-read sequencing.31

This pilot has underpinned the case for genome sequencing in the diagnosis of certain specific rare diseases in the new NHS National Genomic Test Directory.36 For NHS patients with specific disorders, such as intellectual disability, genome sequencing will now be the first-line test (Table S12 in the Supplementary Appendix), and the NHS in England, through a new National Genomic Medicine Service, is in the process of sequencing 500,000 whole genomes in rare disease and cancer in healthcare. We hope our findings will assist other health systems in considering the role of genome sequencing in the care of patients with rare diseases.

Disclosure forms provided by the authors are available with the full text of this article at NEJM.org.

Table 2

Clinical features of the 100,000 Genomes Project pilot
Primary symptoms, no. (%) | All Families | Singletons | Duos | Trios | Larger families
Cardiovascular | 147 (7) | 56 (3) | 24 (1) | 49 (2) | 18 (1)
Ciliopathies | 69 (3) | 34 (2) | 14 (1) | 16 (1) | 5 (<1)
Dermatological | 38 (2) | 9 (<1) | 5 (<1) | 22 (1) | 2 (<1)
Dysmorphic and congenital abnormalities | 20 (1) | 10 (<1) | 2 (<1) | 7 (<1) | 1 (<1)
Endocrine | 87 (4) | 57 (3) | 14 (1) | 12 (1) | 4 (<1)
Gastroenterological | 32 (1) | 18 (1) | 14 (1)
Growth | 3 (<1) | 3 (<1)
Haematological and immunological | 5 (<1) | 2 (<1) | 3 (<1)
Haematological | 7 (<1) | 3 (<1) | 2 (<1) | 2 (<1)
Hearing and ear | 35 (2) | 6 (<1) | 5 (<1) | 17 (1) | 7 (<1)
Metabolic | 93 (4) | 24 (1) | 12 (1) | 48 (2) | 9 (<1)
Intellectual disability (ID) | 130 (6) | 10 (<1) | 24 (1) | 78 (4) | 18 (1)
Neurology and neurodevelopmental (excl. ID) | 521 (24) | 193 (9) | 93 (4) | 194 (9) | 41 (2)
Ophthalmological | 348 (16) | 74 (3) | 62 (3) | 199 (9) | 13 (1)
Renal and urinary tract | 176 (8) | 125 (6) | 21 (1) | 26 (1) | 4 (<1)
Respiratory | 2 (<1) | 1 (<1) | 1 (<1)
Rheumatological | 48 (2) | 14 (1) | 6 (<1) | 25 (1) | 3 (<1)
Skeletal | 62 (3) | 15 (1) | 11 (1) | 23 (1) | 13 (1)
Tumour syndromes | 293 (13) | 231 (11) | 31 (1) | 27 (1) | 4 (<1)
Other | 67 (3) | 17 (1) | 12 (1) | 34 (2) | 4 (<1)
Total | 2183 (100) | 881 (40) | 343 (16) | 797 (37) | 162 (7)

Acknowledgements

We thank all the participants and healthcare teams in Addenbrooke’s Hospital in Cambridge, Great Ormond St Hospital NHS Foundation Trust, University College London NHS Foundation Trust, Guys and St Thomas’s Hospital, Barts Health, Oxford University Hospitals NHS Foundation Trust, Manchester University NHS Foundation Trust and The Newcastle Hospitals NHS Foundation Trust. Mark Caulfield and Willem H Ouwehand are NIHR Senior Investigators. This work is part of the portfolio of translational research at the NIHR Biomedical Research Centres at Barts, Cambridge University Hospitals NHS Foundation Trust, Great Ormond Street Foundation NHS Trust, Manchester University NHS Foundation Trust, Moorfield’s NHS Foundation Trust, The Newcastle Hospitals NHS Foundation Trust, Oxford University Hospitals NHS Foundation Trust, and University College London NHS Foundation Trust. This work was made possible through the generosity of NHS patients and their families and uses clinical data from the NHS and NHS Digital.

We thank all those across the world who have contributed to the PanelApp knowledgebase, and the validation and reporting working group (Dom McMullan, Helen Firth, Steve Abbs, Sian Ellard) for their role in supporting the development of the bioinformatics pipeline and reporting process. We received extremely valuable feedback on our work from Dr. David Bick and Prof. Gil McVean. We are grateful to Professor Dame Sue Hill and the team in NHS England for their work to fund and establish the 13 Genomic Medicine Centres, which enabled the NHS contribution, including the clinical return of results within the NHS in a standardized and validated format; this led to the confirmation of the diagnoses, provided additional information, and led to the patient benefit reported. Maria Bitner-Glindzicz from Great Ormond Street Hospital and Institute of Child Health was a key contributor to the 100,000 Genomes Project Pilot but died during the preparation of this manuscript.

Funding

Genomics England and the 100,000 Genomes Project were funded by the National Institute for Health Research, the Wellcome Trust, the Medical Research Council, Cancer Research UK, the Department of Health and Social Care and NHS England. The NIHR BioResource is funded by the NIHR.

PFC is a Wellcome Trust Principal Research Fellow (212219/Z/18/Z), and an NIHR Senior Investigator, who receives support from the Medical Research Council Mitochondrial Biology Unit (MC_UU_00015/9), the Medical Research Council (MRC) International Centre for Genomic Medicine in Neuromuscular Disease (MR/S005021/1), and the NIHR Biomedical Research Centre based at Cambridge University Hospitals NHS Foundation Trust and the University of Cambridge. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. LRW is supported by Grants from Versus Arthritis (21593), The NIHR BRC at GOSH and Medical Research Council (MR/R013926/1).

We are grateful for the support of the Monarch Initiative on HPO and Exomiser, funded by the National Institutes of Health (NIH) Office of the Director (OD) [1R24OD011883]. DS, PCM and VC were funded by National Institutes of Health Grant 5-UM1-HG006370. GA is supported by a Fight for Sight (UK) Early Career Investigator Award (5045/46), NIHR-BRC at Great Ormond Street Hospital Institute for Child Health and Moorfields Eye Charity (Stephen and Elizabeth Archer in memory of Marion Woods). The Moorfields/UCL Institute of Ophthalmology team are additionally funded by NIHR-BRC at Moorfields Eye Hospital and UCL Institute of Ophthalmology. We gratefully acknowledge the Illumina Laboratory Services team at Hinxton for genome sequencing and secondary analysis.

References

2. Ferreira CR. The burden of rare diseases. Am J Med Genet A. 2019;179:885–892.
3. Boycott KM, Rath A, Chong JX, et al. International Cooperation to Enable the Diagnosis of All Rare Genetic Diseases. Am J Hum Genet. 2017;100(5)
4. Taylor JC, Martin HC, Lise S, et al. Factors influencing success of clinical genome sequencing across a broad spectrum of disorders. Nat Genet. 2015 Jul;47(7):717–726.
5. Splinter K, Adams DR, Bacino CA, et al. Undiagnosed Diseases Network. Effect of Genetic Diagnosis on Patients with Previously Undiagnosed Disease. N Engl J Med. 2018;379(22):2131–2139.
7. Köhler S, Carmody L, Vasilevsky N, et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res. 2019;47(D1):D1018–D1027.
8. Genomics England data models 2018. https://www.genomicsengland.co.uk/?wpdmdl=5500
9. Bentley DR, Balasubramanian S, Swerdlow HP, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456(7218):53–9.
10. Rimmer A, Phan H, Mathieson I, Iqbal Z, et al. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat Genet. 2014 Aug;46(8):912–918.
11. Martin AR, Williams E, Foulger RE, et al. PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nat Genet. 2019 Nov;51(11):1560–1565.
12. Smedley D, Jacobsen JO, Jäger M, et al. Next-generation diagnostics and disease-gene discovery with the Exomiser. Nat Protoc. 2015 Dec;10(12):2004–15.
13. Congenica platform. https://www.congenica.com/platform
14. Fabric Genomics platform. https://fabricgenomics.com/
15. Richards S, Aziz N, Bale S, Bick D, et al. ACMG Laboratory Quality Assurance Committee. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17(5):405–24. doi:10.1038/gim.2015.30.
16. Havrilla JM, Pedersen BS, Layer RM, Quinlan AR. A map of constrained coding regions in the human genome. Nat Genet. 2019 Jan;51(1):88–95.
17. Wei W, Tuna S, Keogh MJ, Smith KR, et al. Germline selection shapes human mitochondrial DNA diversity. Science. 2019 May 24;364(6442). pii: eaau6520. doi:10.1126/science.aau6520.
18. Smedley D, Schubach M, Jacobsen JOB, et al. A Whole-Genome Analysis Framework for Effective Identification of Pathogenic Regulatory Variants in Mendelian Disease. Am J Hum Genet. 2016 Sep 1;99(3):595–606.
19. Dolzhenko E, van Vugt JJFA, Shaw RJ, et al. Detection of long repeat expansions from PCR-free whole-genome sequence data. Genome Res. 2017;27(11):1895–1903. doi:10.1101/gr.225672.117.
20. Zhang L, Bai W, Yuan N, Du Z. Comprehensively benchmarking applications for detecting copy number variation. PLoS Comput Biol. 2019 May 28;15(5):e1007069. doi:10.1371/journal.pcbi.1007069. Erratum in: PLoS Comput Biol. 2019 Sep 20;15(9):e1007367.
21. Kosugi S, Momozawa Y, Liu X, Terao C, Kubo M, Kamatani Y. Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing. Genome Biol. 2019 Jun 3;20(1):117. doi:10.1186/s13059-019-1720-5.
23. Landrum MJ, Lee JM, Benson M, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018 Jan 4;46(D1):D1062–D1067.
24. Carss KJ, Arno G, Erwood M, et al. Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease. Am J Hum Genet. 2017 Jan 5;100(1):75–90.
26. Radziwon A, Arno G, Wheaton KD, et al. Single-base substitutions in the CHM promoter as a cause of choroideremia. Hum Mutat. 2017 Jun;38(6):704–715.
27. Farazi Fard MA, Rebelo AP, Buglo E, et al. Truncating Mutations in UBAP1 Cause Hereditary Spastic Paraplegia. Am J Hum Genet. 2019 Apr 4;104(4):767–773.
28. Wallmeier J, Frank D, Shoemark A, et al. De Novo Mutations in FOXJ1 Result in a Motile Ciliopathy with Hydrocephalus and Randomization of Left/Right Body Asymmetry. Am J Hum Genet. 2019 Nov 7;105(5):1030–1039.
29. Cortese A, et al. Biallelic mutations in SORD cause a common and potentially treatable hereditary neuropathy with implications for diabetes. Nat Genet. 2020;52:473–481.
30. Haendel MA, Chute CG, Robinson PN. Classification, Ontology, and Precision Medicine. N Engl J Med. 2018;379:1452–1462.
31. Eichler EE. Genetic Variation, Comparative Genomics, and the Diagnosis of Disease. N Engl J Med. 2019 Jul 4;381(1):64–74.
32. Yang Y, Muzny DM, Xia F, et al. Molecular findings among patients referred for clinical whole-exome sequencing. JAMA. 2014 Nov 12;312(18):1870–9.
33. Hu X, Li N, Xu Y, et al. Proband-only medical exome sequencing as a cost-effective first-tier genetic diagnostic test for patients without prior molecular tests and clinical diagnosis in a developing country: the China experience. Genet Med. 2018 Sep;20(9):1045–1053.
34. Vissers LELM, van Nimwegen KJM, Schieving JH, et al. A clinical utility study of exome sequencing versus conventional genetic testing in pediatric neurology. Genet Med. 2017 Sep;19(9):1055–1063.
35. Gilissen C, Hehir-Kwa JY, Thung DT, et al. Genome sequencing identifies major causes of severe intellectual disability. Nature. 2014 Jul 17;511(7509):344–7.
