Abstract
T cell alloreactivity against minor histocompatibility antigens (mHAgs)—polymorphic peptides resulting from donor–recipient (D–R) disparity at sites of genetic polymorphisms—is at the core of the therapeutic effect of allogeneic hematopoietic cell transplantation (allo-HCT). Despite the crucial role of mHAgs in graft-versus-leukemia (GvL) and graft-versus-host disease (GvHD) reactions, it remains challenging to consistently link patient-specific mHAg repertoires to clinical outcomes. Here we devise an analytic framework to systematically identify mHAgs, including their detection on HLA class I ligandomes and functional verification of their immunogenicity. The method relies on the integration of polymorphism detection by whole-exome sequencing of germline DNA from D–R pairs with organ-specific transcriptional- and proteome-level expression. Application of this pipeline to 220 HLA-matched allo-HCT D–R pairs demonstrated that total and organ-specific mHAg load could independently predict the occurrence of acute GvHD and chronic pulmonary GvHD, respectively, and defined promising GvL targets, confirmed in a validation cohort of 58 D–R pairs, for the prevention or treatment of post-transplant disease recurrence.
Data availability
WES and RNA-seq data from the training HCT, DFCI-MRD and HP-MRD cohorts are available through the dbGaP portal (accession phs003394.v1.p1). The original mass spectra, PSMs and protein sequence databases used for searches have been deposited in the public proteomics repository MassIVE (https://massive.ucsd.edu) and are accessible at ftp://massive.ucsd.edu/v08/MSV000095025/. Original mass spectrometry data for the previously published B721.221 monoallelic immunopeptidomes are accessible at ftp://massive.ucsd.edu/MSV000080527. The GI GvHD scRNA-seq dataset is available from the corresponding author upon reasonable request. All external datasets used in this study (identified by reference number) are summarized in Supplementary Fig. 18, and their accession numbers are as follows: GSE164403 (ref. 29); GSE124395 (ref. 30); GSE115469 (ref. 31); EGAS00001002649 (ref. 32); GSE123904 (ref. 33); GSE116222 (ref. 34); GSE125970 (ref. 35); GSE164241 (ref. 36); GSE164403 (ref. 37); GSE116256 (ref. 39); dbGaP study ID 30641 (ref. 40), accession ID phs001657.v1.p1; E-MTAB-8581 (ref. 62), accessed online through https://developmentcellatlas.ncl.ac.uk; www.proteinatlas.org/about/download (ref. 43); GSE109093 (ref. 44); GSE113046 (ref. 45). GTEx data were accessed from https://gtexportal.org/home/, IEDB from https://www.iedb.org/database_export_v3.php and the 1000 Genomes Project from https://www.internationalgenome.org/data. Additional databases/datasets used include Interferome (www.interferome.org) and MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/). Source data are provided with this paper.
Code availability
The source code and documentation for the mHAg pipeline are available at https://github.com/nidhih2/mhags (https://doi.org/10.5281/zenodo.11658572 (ref. 97) for autosomal mHAg prediction and https://doi.org/10.5281/zenodo.11658599 (ref. 98) for Y mHAg prediction).
References
Copelan, E. A. Hematopoietic stem-cell transplantation. N. Engl. J. Med. 354, 1813–1826 (2006).
Griffioen, M., van Bergen, C. A. & Falkenburg, J. H. Autosomal minor histocompatibility antigens: how genetic variants create diversity in immune targets. Front. Immunol. 7, 100 (2016).
Mutis, T., Xagara, A. & Spaapen, R. M. The connection between minor h antigens and neoantigens and the missing link in their prediction. Front. Immunol. 11, 1162 (2020).
Zeiser, R. & Blazar, B. R. Acute graft-versus-host disease—biologic process, prevention, and therapy. N. Engl. J. Med. 377, 2167–2179 (2017).
Zeiser, R. & Blazar, B. R. Pathophysiology of chronic graft-versus-host disease and therapeutic targets. N. Engl. J. Med. 377, 2565–2579 (2017).
Aljurf, M. et al. Worldwide network for blood & marrow transplantation (WBMT) special article, challenges facing emerging alternate donor registries. Bone Marrow Transplant. 54, 1179–1188 (2019).
Cieri, N., Maurer, K. & Wu, C. J. 60 years young: the evolving role of allogeneic hematopoietic stem cell transplantation in cancer immunotherapy. Cancer Res. 81, 4373–4384 (2021).
Bolon, Y., Atshan, R., Allbee-Johnson, M., Estrada-Merly, N. & Lee, S. Current use and outcome of hematopoietic stem cell transplantation: CIBMTR summary slides. CIBMTR https://cibmtr.org/CIBMTR/Resources/Summary-Slides-Reports (2022).
Spellman, S. R. Hematology 2022—what is complete HLA match in 2022? Hematology Am. Soc. Hematol. Educ. Program 2022, 83–89 (2022).
Goulmy, E., Gratama, J. W., Blokland, E., Zwaan, F. E. & van Rood, J. J. A minor transplantation antigen detected by MHC-restricted cytotoxic T lymphocytes during graft-versus-host disease. Nature 302, 159–161 (1983).
Wang, W. et al. Human H–Y: a male-specific histocompatibility antigen derived from the SMCY protein. Science 269, 1588–1590 (1995).
Den Haan, J. M. et al. Identification of a graft versus host disease-associated human minor histocompatibility antigen. Science 268, 1476–1480 (1995).
Goulmy, E., Termijtelen, A., Bradley, B. A. & van Rood, J. J. Y-antigen killing by T cells of women is restricted by HLA. Nature 266, 544–545 (1977).
Goulmy, E. et al. Mismatches of minor histocompatibility antigens between HLA-identical donors and recipients and the development of graft-versus-host disease after bone marrow transplantation. N. Engl. J. Med. 334, 281–285 (1996).
Spierings, E. et al. Multicenter analyses demonstrate significant clinical effects of minor histocompatibility antigens on GvHD and GvL after HLA-matched related and unrelated hematopoietic stem cell transplantation. Biol. Blood Marrow Transplant. 19, 1244–1253 (2013).
Grumet, F. C. et al. CD31 mismatching affects marrow transplantation outcome. Biol. Blood Marrow Transplant. 7, 503–512 (2001).
McCarroll, S. A. et al. Donor–recipient mismatch for common gene deletion polymorphisms in graft-versus-host disease. Nat. Genet. 41, 1341–1344 (2009).
Spellman, S. et al. Effects of mismatching for minor histocompatibility antigens on clinical outcomes in HLA-matched, unrelated hematopoietic stem cell transplants. Biol. Blood Marrow Transplant. 15, 856–863 (2009).
Kogler, G. et al. Recipient cytokine genotypes for TNF-α and IL-10 and the minor histocompatibility antigens HY and CD31 codon 125 are not associated with occurrence or severity of acute GVHD in unrelated cord blood transplantation: a retrospective analysis. Transplantation 74, 1167–1175 (2002).
Martin, P. J. et al. A model of minor histocompatibility antigens in allogeneic hematopoietic cell transplantation. Front. Immunol. 12, 782152 (2021).
Story, C. M. et al. Genetics of HLA peptide presentation and impact on outcomes in HLA-matched allogeneic hematopoietic cell transplantation. Transplant. Cell Ther. 27, 591–599 (2021).
Warren, E. H. et al. Effect of MHC and non-MHC donor/recipient genetic disparity on the outcome of allogeneic HCT. Blood 120, 2796–2806 (2012).
Bykova, N. A., Malko, D. B. & Efimov, G. A. In silico analysis of the minor histocompatibility antigen landscape based on the 1000 Genomes project. Front. Immunol. 9, 1819 (2018).
Jadi, O. et al. Associations of minor histocompatibility antigens with outcomes following allogeneic hematopoietic cell transplantation. Am. J. Hematol. 98, 940–950 (2023).
Lang, F., Schrors, B., Lower, M., Tureci, O. & Sahin, U. Identification of neoantigens for individualized therapeutic cancer vaccines. Nat. Rev. Drug Discov. 21, 261–282 (2022).
Fotakis, G., Trajanoski, Z. & Rieder, D. Computational cancer neoantigen prediction: current status and recent advances. Immunooncol. Technol. 12, 100052 (2021).
Peters, B., Nielsen, M. & Sette, A. T cell epitope predictions. Annu. Rev. Immunol. 38, 123–145 (2020).
Sarkizova, S. et al. A large peptidome dataset improves HLA class I epitope prediction across most of the human population. Nat. Biotechnol. 38, 199–209 (2020).
Reynolds, G. et al. Developmental cell programs are co-opted in inflammatory skin disease. Science 371, eaba6500 (2021).
Aizarani, N. et al. A human liver cell atlas reveals heterogeneity and epithelial progenitors. Nature 572, 199–204 (2019).
MacParland, S. A. et al. Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat. Commun. 9, 4383 (2018).
Vieira Braga, F. A. et al. A cellular census of human lungs identifies novel cell states in health and in asthma. Nat. Med. 25, 1153–1163 (2019).
Laughney, A. M. et al. Regenerative lineages and immune-mediated pruning in lung cancer metastasis. Nat. Med. 26, 259–269 (2020).
Parikh, K. et al. Colonic epithelial cell diversity in health and inflammatory bowel disease. Nature 567, 49–55 (2019).
Wang, Y. et al. Single-cell transcriptome analysis reveals differential nutrient absorption functions in human intestine. J. Exp. Med. 217, e20191130 (2020).
Williams, D. W. et al. Human oral mucosa cell atlas reveals a stromal-neutrophil axis regulating tissue immunity. Cell 184, 4090–4104 (2021).
Bannier-Hélaouët, M. et al. Exploring the human lacrimal gland using organoids and single-cell sequencing. Cell Stem Cell 28, 1221–1232 (2021).
Kanate, A. S. et al. Indications for hematopoietic cell transplantation and immune effector cell therapy: guidelines from the American Society for Transplantation and Cellular Therapy. Biol. Blood Marrow Transplant. 26, 1247–1256 (2020).
Van Galen, P. et al. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell 176, 1265–1281 (2019).
Tyner, J. W. et al. Functional genomic landscape of acute myeloid leukaemia. Nature 562, 526–531 (2018).
Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Jiang, L. et al. A quantitative proteome map of the human body. Cell 183, 269–283 (2020).
Uhlen, M. et al. A genome-wide transcriptomic analysis of protein-coding genes in human blood cells. Science 366, eaax9198 (2019).
Cesana, M. et al. A CLK3-HMGA2 alternative splicing axis impacts human hematopoietic stem cell molecular identity throughout development. Cell Stem Cell 22, 575–588 (2018).
Drissen, R., Thongjuea, S., Theilgaard-Monch, K. & Nerlov, C. Identification of two distinct pathways of human myelopoiesis. Sci. Immunol. 4, eaau7148 (2019).
Kim, H. T. et al. Donor and recipient sex in allogeneic stem cell transplantation: what really matters. Haematologica 101, 1260–1266 (2016).
Ofran, Y. et al. Diverse patterns of T-cell response against multiple newly identified human Y chromosome-encoded minor histocompatibility epitopes. Clin. Cancer Res. 16, 1642–1651 (2010).
Miklos, D. B. et al. Antibody response to DBY minor histocompatibility antigen is induced after allogeneic stem cell transplantation and in healthy female donors. Blood 103, 353–359 (2004).
Feng, X., Hui, K. M., Younes, H. M. & Brickner, A. G. Targeting minor histocompatibility antigens in graft versus tumor or graft versus leukemia responses. Trends Immunol. 29, 624–632 (2008).
Bachireddy, P. et al. Mapping the evolution of T cell states during response and resistance to adoptive cellular therapy. Cell Rep. 37, 109992 (2021).
Bachireddy, P. et al. Distinct evolutionary paths in chronic lymphocytic leukemia during resistance to the graft-versus-leukemia effect. Sci. Transl. Med. 12, eabb7661 (2020).
Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
Torikai, H. et al. A novel HLA-A*3303-restricted minor histocompatibility antigen encoded by an unconventional open reading frame of human TMSB4Y gene. J. Immunol. 173, 7046–7054 (2004).
Ouspenskaia, T. et al. Unannotated proteins expand the MHC-I-restricted immunopeptidome in cancer. Nat. Biotechnol. 40, 209–217 (2022).
Andreatta, M. et al. MS-Rescue: a computational pipeline to increase the quality and yield of immunopeptidomics experiments. Proteomics 19, e1800357 (2019).
Lee, P. C. et al. Reversal of viral and epigenetic HLA class I repression in Merkel cell carcinoma. J. Clin. Invest. 132, e151666 (2022).
Oliveira, G. et al. Phenotype, specificity and avidity of antitumour CD8+ T cells in melanoma. Nature 596, 119–125 (2021).
Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2019).
Chowell, D. et al. TCR contact residue hydrophobicity is a hallmark of immunogenic CD8+ T cell epitopes. Proc. Natl Acad. Sci. USA 112, E1754–E1762 (2015).
Schaefer, M. R. et al. A novel trafficking signal within the HLA-C cytoplasmic tail allows regulated expression upon differentiation of macrophages. J. Immunol. 180, 7804–7817 (2008).
Gabrielsen, I. S. M. et al. Transcriptomes of antigen presenting cells in human thymus. PLoS ONE 14, e0218858 (2019).
Park, J. E. et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 367, eaay3224 (2020).
Holtan, S. G. et al. Composite end point of graft-versus-host disease-free, relapse-free survival after allogeneic hematopoietic cell transplantation. Blood 125, 1333–1338 (2015).
1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Lin, M. J. et al. Cancer vaccines: the next immunotherapy frontier. Nat. Cancer 3, 911–926 (2022).
Rojas, L. A. et al. Personalized RNA neoantigen vaccines stimulate T cells in pancreatic cancer. Nature 618, 144–150 (2023).
Lansford, J. L. et al. Computational modeling and confirmation of leukemia-associated minor histocompatibility antigens. Blood Adv. 2, 2052–2062 (2018).
Olsen, K. S. et al. Shared graft-versus-leukemia minor histocompatibility antigens in DISCOVeRY-BMT. Blood Adv. 7, 1635–1649 (2023).
Parkhurst, M. R. et al. Unique neoantigens arise from somatic mutations in patients with gastrointestinal cancers. Cancer Discov. 9, 1022–1035 (2019).
Wolff, D. et al. National Institutes of Health Consensus Development Project on criteria for clinical trials in chronic graft-versus-host disease: IV. The 2020 highly morbid forms report. Transplant. Cell Ther. 27, 817–835 (2021).
Lybaert, L. et al. Neoantigen-directed therapeutics in the clinic: where are we? Trends Cancer 9, 503–519 (2023).
Bacigalupo, A. & Jones, R. PTCy: the ‘new’ standard for GVHD prophylaxis. Blood Rev. 62, 101096 (2023).
Murata, M., Warren, E. H. & Riddell, S. R. A human minor histocompatibility antigen resulting from differential expression due to a gene deletion. J. Exp. Med. 197, 1279–1289 (2003).
Broen, K. et al. A polymorphism in the splice donor site of ZNF419 results in the novel renal cell carcinoma-associated minor histocompatibility antigen ZAPHIR. PLoS ONE 6, e21699 (2011).
Griffioen, M. et al. Identification of 4 novel HLA-B*40:01 restricted minor histocompatibility antigens and their potential as targets for graft-versus-leukemia reactivity. Haematologica 97, 1196–1204 (2012).
Spierings, E. et al. Identification of HLA class II-restricted H–Y-specific T-helper epitope evoking CD4+ T-helper cells in H–Y-mismatched transplantation. Lancet 362, 610–615 (2003).
Coghill, J. M. et al. Effector CD4+ T cells, the cytokines they generate, and GVHD: something old and something new. Blood 117, 3268–3276 (2011).
Jones, S. C., Murphy, G. F., Friedman, T. M. & Korngold, R. Importance of minor histocompatibility antigen expression by nonhematopoietic tissues in a CD4+ T cell-mediated graft-versus-host disease model. J. Clin. Invest. 112, 1880–1886 (2003).
Chaves, F. A., Lee, A. H., Nayak, J. L., Richards, K. A. & Sant, A. J. The utility and limitations of current web-available algorithms to predict peptides recognized by CD4 T cells in response to pathogen infection. J. Immunol. 188, 4235–4248 (2012).
Dohner, H. et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 129, 424–447 (2017).
Greenberg, P. L. et al. Revised international prognostic scoring system for myelodysplastic syndromes. Blood 120, 2454–2465 (2012).
Przepiorka, D. et al. 1994 consensus conference on acute GVHD grading. Bone Marrow Transplant. 15, 825–828 (1995).
Glucksberg, H. et al. Clinical manifestations of graft-versus-host disease in human recipients of marrow from HLA-matched sibling donors. Transplantation 18, 295–304 (1974).
Pavletic, S. Z. et al. NCI first international workshop on the biology, prevention, and treatment of relapse after allogeneic hematopoietic stem cell transplantation: report from the committee on the epidemiology and natural history of relapse following allogeneic cell transplantation. Biol. Blood Marrow Transplant. 16, 871–890 (2010).
Parry, E. M. et al. Evolutionary history of transformation from chronic lymphocytic leukemia to Richter syndrome. Nat. Med. 29, 158–169 (2023).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
Kim, H. et al. Development of a validated interferon score using NanoString technology. J. Interferon Cytokine Res. 38, 171–185 (2018).
Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Poplin, R. et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 36, 983–987 (2018).
Quentmeier, H. et al. The LL-100 panel: 100 cell lines for blood cancer studies. Sci. Rep. 9, 8218 (2019).
Szolek, A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 30, 3310–3316 (2014).
Klaeger, S. et al. Optimized liquid and gas phase fractionation increases HLA-peptidome coverage for primary cell and tissue samples. Mol. Cell. Proteomics 20, 100133 (2021).
Cui, K. H., Warnes, G. M., Jeffrey, R. & Matthews, C. D. Sex determination of preimplantation embryos by human testis-determining-gene amplification. Lancet 343, 79–82 (1994).
Bui, H. H. et al. Predicting population coverage of T-cell epitope-based diagnostics and vaccines. BMC Bioinformatics 7, 153 (2006).
Samarajiwa, S. A., Forster, S., Auchettl, K. & Hertzog, P. J. INTERFEROME: the database of interferon regulated genes. Nucleic Acids Res. 37, D852–D857 (2009).
Hookeri, N. nidhih2/mhags: v1.0.0 (v1.0.0). Zenodo https://doi.org/10.5281/zenodo.11658572 (2024).
Hookeri, N. nidhih2/mhags-fm: v1.0.0 (v1.0.0). Zenodo https://doi.org/10.5281/zenodo.11658599 (2024).
Acknowledgements
We are grateful to S. Pollock, H. Lyon and F. Dao for expert help with sample collection and management at the Broad Institute; K. Rizza and the OTTR team (DFCI Department of Cellular Therapy) for assistance with clinical databases; A. Gusev for fruitful discussion on correlative analysis design; D. Hearsey and the members of the DFCI Ted and Eileen Pasquarello Tissue Bank in Hematologic Malignancies for provision of samples; the patients who generously consented for the research use of these samples; and all members of the Wu Laboratory for productive discussions. This research was supported by grants from the National Institutes of Health (NIH/NCI-P01 CA229092 and NIH/NHLBI P01 HL158505 to C.J.W. and NIH R01 HL157174 to D.B.K.) and from the Leukemia & Lymphoma Society (SCOR-22937-22 to C.J.W. and R.J.S.). Statistical analysis was supported by the DF/HCC Cancer Center Support Grant 5P30 CA006516. Mass spectrometry-based immunopeptidomics data acquisition and analysis were supported in part by NIH P01CA206978 (to S.A.C.), NCI Clinical Proteomic Tumor Analysis Consortium program U24CA270823 and U01CA271402 (to S.A.C.), as well as a grant from the Dr. Miriam and Sheldon G. Adelson Medical Research Foundation (to S.A.C.). N.C. was supported by the 2020 American Association for Cancer Research-Incyte Immuno-oncology Research Fellowship (20-40-46-CIER) and the Helen Gurley Brown Foundation. H.J. was supported by the NCI CaNCURE (grant 5R25CA174650). L.P. is a scholar of the American Society of Hematology (ASH), is a participant in the BIH Charité Digital Clinician Scientist Program funded by the DFG, the Charité – Universitätsmedizin Berlin and the Berlin Institute of Health at Charité (BIH) and is supported by the Max-Eder program of the German Cancer Aid (Deutsche Krebshilfe), by the Else Kröner-Fresenius-Stiftung (2023_EKEA.102) and the DKMS John Hansen Research Grant. D.A.B. acknowledges support from the Department of Defense Early Career Investigator grant (KCRP AKCI-ECI and W81XWH-20-1-0882), the Louis Goodman and Alfred Gilman Yale Scholar Fund and the Yale Cancer Center (supported by NIH/NCI research grant P30CA016359). G.O. was supported by the Claudia Adams Barr Program for Innovative Cancer Research and by DF/HCC Kidney Cancer SPORE P50 CA101942. S.L. is supported by the NCI Research Specialist Award (R50CA251956). L.S.K. is supported by the NIH under grants NIH/NIAID U19 Al1051731, NIH/NHLBI R01 HL095791, NIH/NHLBI P01 HL158504, NIH/NHLBI P01 HL158505 and NIH/NIAID U19 AI174967. Visual elements in Fig. 1 were created with BioRender.com.
Author information
Authors and Affiliations
Contributions
N.C. and C.J.W. conceived the project and directed the overall study. N.C. designed and performed the experimental and data analyses together with H.J., K.P. and L.P. K.S. developed the computational pipeline under the supervision of N.C., C.J.W., C.S. and G.G. N.H. analyzed public single-cell datasets, docked the pipeline on Terra and applied it to the DFCI-MRD patient cohort. L.S.K. provided the allo-HCT GI single-cell libraries. Y.S. analyzed the GvHD single-cell dataset. R.K.-R. and Y.S. ran the pipeline on the HP-MRD cohort. S.L. and K.J.L. assisted with NGS preparation and analysis. N.C. curated the clinical annotation of the patient cohort with the help of K.A.K., H.T.K. and V.T.H. J.S. and W.J.L. provided the DFCI-MRD cohort DNA samples. P.D.-F., V.G.-G.S. and C.M.-C. provided the samples and clinical annotation for the HP-MRD cohort. L.L. performed the 1000 Genomes simulation under the guidance of N.C., C.S. and G.G. J.K. and N.C. designed and performed the statistical analyses under the supervision of D.N. C.F. and S.S. curated AML cell line genomic and transcriptomic analysis. G.M.H., S.K., J.A., S.S., G.O., D.A.B., D.B.K., K.R.C. and S.A.C. generated and analyzed mass spectrometry results. G.O., R.J.S., J.R. and V.T.H. contributed to data discussion and interpretation. N.C. and C.J.W. wrote the manuscript. All authors discussed the results and read and approved the manuscript.
Corresponding author
Ethics declarations
Competing interests
C.J.W. holds equity in BioNTech and receives research support from Pharmacyclics. D.B.K. is a scientific advisor for Immunitrack and Breakbio and owns equity in Affimed N.V., Agenus, Armata Pharmaceuticals, Breakbio, BioMarin Pharmaceutical, Celldex Therapeutics, Editas Medicine, Gilead Sciences, Immunitybio, IMV, Lexicon Pharmaceuticals and Neoleukin Therapeutics. BeiGene supported unrelated SARS-CoV-2 research at Translational Immunogenomics Lab. R.J.S. consults or is on the advisory board of Kiadis, Juno Therapeutics, Gilead, Jasper, Jazz Pharmaceuticals, Precision Biosciences, Rheo Therapeutics, Takeda and NMDP/Be the Match. J.R. receives research funding from Kite/Gilead, Novartis and Oncternal and consults or is on advisory boards for Clade Therapeutics, Garuda Therapeutics, LifeVault Bio, Smart Immune and TriArm Bio. V.T.H. receives funding from Jazz Pharmaceuticals and consults or is on advisory boards for Jazz Pharmaceuticals, Janssen, Alexion Pharmaceuticals and Omeros. W.J.L. consults or is on the advisory board of CareDx, One Lambda and Thermo Fisher Scientific and receives royalty payments from Thermo Fisher Scientific. K.J.L. holds equity in Standard BioTools and is on the scientific advisory board for MBQ Pharma. S.A.C. is a member of the scientific advisory boards of PTM BioLabs, Kymera, Seer and PrognomIQ and holds equity in the latter three. D.A.B. reports honoraria from LM Education/Exchange Services; advisory board fees from Exelixis and AVEO; personal fees from Schlesinger Associates, Cancer Expert Now, Adnovate Strategies, MDedge, CancerNetwork, Catenion, OncLive, Cello Health BioConsulting, PWW Consulting, Haymarket Medical Network, Aptitude Health, ASCO Post/Harborside, Targeted Oncology, AbbVie, DLA Piper and Elephas; equity in CurIOS Therapeutics, Elephas and Fortress Biotech (subsidiary); research support from Exelixis (US) and AstraZeneca (UK), outside of the submitted work. G.O. is a consultant for Bicycle Therapeutics. L.S.K. is on the scientific advisory board for Mammoth Biosciences and HiFiBio; received research funding from Magenta Therapeutics, Tessera Therapeutics, Novartis, EMD Serono, Gilead Pharmaceuticals and Regeneron Pharmaceuticals; consulting fees from Vertex; grants/personal fees from Bristol Myers Squibb and royalties/partial funding for the current study from Bristol Myers Squibb. L.S.K.’s conflict of interest with Bristol Myers Squibb is managed under an agreement with Harvard Medical School. D.N. holds equity in Madrigal Pharmaceutics. G.G. receives research funds from Pharmacyclics, Bayer, Genentech, Ultima Genomics and IBM; is an inventor of patent applications related to MSMuTect, MSMutSig, MSIDetect, POLYSOLVER, SignatureAnalyzer-GPU and MinimuMM-seq; is a founder and consultant and holds privately held equity in Scorpion Therapeutics and is a founder and holds privately held equity in PreDICTA Biosciences. The other authors declare no competing interests.
Peer review
Peer review information
Nature Biotechnology thanks Marcel van den Brink and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Pipeline details.
a, Detailed workflow for the prediction of autosomal (left) and Y-encoded (right) mHAgs. b, Pipeline outputs for the training AML cohort composed of 11 D–R pairs (see Supplementary Table 3). Shown are the median number (and interquartile range) of hits for each step of the pipeline for matched-related donor (MRD, blue; n = 2) and unrelated donor (URD, orange; n = 9) transplants. The number of discordant variants between donors and recipients was, as expected, higher in URD than in MRD transplants. Pie charts (right) show the distribution of the types of discordant variants in MRD (top) and URD (bottom) D–R pairs. SNPs, single-nucleotide polymorphisms; DEL, deletions; INS, insertions; GvHD, graft-versus-host disease; GvL, graft-versus-leukemia.
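The notion of a discordant variant between donor and recipient can be illustrated with a minimal Python sketch. It is not the published pipeline: the site key, the `gvh_discordant` name and the rule that an uncalled donor site is treated as homozygous reference are all assumptions made for illustration. It keeps only alleles carried by the recipient and absent from the donor, the direction relevant to donor T cells recognizing recipient tissue.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical site key: (chromosome, position, reference allele).
Site = Tuple[str, int, str]
Alleles = FrozenSet[str]


def gvh_discordant(donor: Dict[Site, Alleles],
                   recipient: Dict[Site, Alleles]) -> Dict[Site, Alleles]:
    """Return, per site, the alleles present in the recipient but not the donor."""
    discordant: Dict[Site, Alleles] = {}
    for site, rec_alleles in recipient.items():
        # Assumption: a site with no donor call is homozygous reference.
        donor_alleles = donor.get(site, frozenset({site[2]}))
        novel = rec_alleles - donor_alleles
        if novel:
            discordant[site] = novel
    return discordant


# Toy genotypes (invented for illustration only).
example_donor = {
    ("chr1", 100, "A"): frozenset({"A"}),
    ("chr2", 200, "G"): frozenset({"G", "T"}),
}
example_recipient = {
    ("chr1", 100, "A"): frozenset({"A", "C"}),   # recipient-only allele C
    ("chr2", 200, "G"): frozenset({"G"}),        # nothing new for the donor
    ("chr3", 300, "T"): frozenset({"T", "TA"}),  # insertion, donor uncalled
}
example_hits = gvh_discordant(example_donor, example_recipient)
```

In this toy case the chr2 site is dropped because the donor already carries every recipient allele, which is why related donors would be expected to yield fewer hits than unrelated ones.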
Extended Data Fig. 2 Single-cell data analysis to define the GvHD filter gene set.
a, Summary of the single-cell datasets used, covering the following GvHD target organs: oral mucosa, lacrimal gland (eye), skin, liver, colon (GI) and lung. For each organ analyzed, the number of datasets and their accession numbers are shown, together with the total number of cells after standard single-cell data QC analysis, as well as after removal of immune cells. The ID of the second lung dataset has been abbreviated for ease of visualization, but the full identifier is reported in the legend of c. b, UMAP plots showing clustering of the resident cell types for each organ. Note that for ‘eye’, the dataset includes both primary cells and organoids (derived from ductal cells), which were both analyzed. c, Violin plots of lineage-defining markers for each cell type across the different organ datasets. d, UMAP plots showing the relative contribution of individual datasets, for those organs with two datasets available. Vasc., vascular; Lymph., lymphatic; KC, keratinocyte; LSEC, liver sinusoidal endothelial cells; CT, crypt top.
Extended Data Fig. 3 Threshold definition for single-cell-based expression atlas.
a, To minimize the drop-out effect common in single-cell RNA-seq data, gene expression was analyzed in a pseudo-bulk fashion for each cluster. To define the threshold for positive expression with the best signal-to-noise ratio, the expression levels of lineage-specific markers such as MLANA (melan-A, expressed only in melanocytes), SFTPB (surfactant protein B, expressed only in the lung) and ALB (albumin, expressed only in the liver) were analyzed, using a pan-expressed gene, B2M, as control. The dot plot shows the expression levels for all single-cell clusters (with the tiles next to their names indicating the organ of origin: green for ‘liver’, light blue for ‘lung’, yellow for ‘skin’, orange for ‘GI’, red for ‘oral mucosa’ and navy blue for ‘eye’). b, Comparison of the expression profile of fibroblasts from two independent single-cell datasets (oral mucosa and skin). Results of the linear regression analysis (R squared and p value) are reported and show a substantial transcriptional identity between fibroblasts across different anatomical sites and datasets. c, Comparison of the expression profile of fibroblasts derived from single-cell sequencing data versus bulk sequencing available through the GTEx repository. Fibroblasts were chosen as they were the only purified cell type available in both GTEx and single-cell datasets. Results of the linear regression analysis are reported, showing a significant transcriptional similarity between single-cell and bulk RNA sequencing. Vasc., vascular; Lymph., lymphatic; KC, keratinocyte; LSEC, liver sinusoidal endothelial cells; CT, crypt top; CPM, counts per million; TPM, transcripts per million.
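The pseudo-bulk step described in a can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the toy counts and the `threshold` value are assumptions, and the actual cutoff chosen from the marker analysis is not stated in this legend. Raw counts are summed over all cells of a cluster, rescaled to counts per million (CPM), and a gene is called expressed when its CPM reaches the threshold.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set


def pseudobulk_cpm(counts: Sequence[Sequence[float]],
                   clusters: Sequence[str],
                   genes: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Sum raw counts per cluster, then rescale each cluster to CPM."""
    sums: Dict[str, List[float]] = defaultdict(lambda: [0.0] * len(genes))
    for row, label in zip(counts, clusters):
        for i, value in enumerate(row):
            sums[label][i] += value
    result: Dict[str, Dict[str, float]] = {}
    for label, vector in sums.items():
        total = sum(vector)
        result[label] = {
            gene: (value / total * 1e6 if total else 0.0)
            for gene, value in zip(genes, vector)
        }
    return result


def expressed_genes(cpm: Dict[str, Dict[str, float]],
                    threshold: float) -> Dict[str, Set[str]]:
    """Genes whose pseudo-bulk CPM is at or above a (hypothetical) threshold."""
    return {
        label: {gene for gene, value in profile.items() if value >= threshold}
        for label, profile in cpm.items()
    }


# Toy data: rows are cells, columns are genes (values invented).
toy_genes = ["ALB", "B2M"]
toy_counts = [[0, 10], [0, 30], [50, 50]]
toy_clusters = ["fibroblast", "fibroblast", "hepatocyte"]
toy_cpm = pseudobulk_cpm(toy_counts, toy_clusters, toy_genes)
toy_calls = expressed_genes(toy_cpm, threshold=1.0)
```

Pooling cells before thresholding means that a gene missed in individual cells through drop-out can still be called in the cluster, while a lineage-restricted marker such as ALB stays absent from unrelated clusters.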
Extended Data Fig. 4 GI single-cell RNA-sequencing of allo-HCT patients.
a, Schema depicting the patients included in the allo-HCT dataset and the analytic pipeline for the single-cell RNA-sequencing analysis. Briefly, single-cell RNA-sequencing data were generated from the biopsies of 3 allo-HCT patients undergoing diagnostic colonoscopy for suspected GI GvHD at a median time from transplant of 90 days (range: 22–103). Upon standard processing and QC, viable cells were clustered and manually annotated (see Supplementary Fig. 2). Immune cell clusters were excluded, and remaining resident non-immune cells were merged and harmonized with the healthy subject GI dataset used for the generation of the GvHD filter. CPM, counts per million. b, UMAP showing cluster annotations from the merged Seurat object containing both allo-HCT and healthy subject-derived GI cells (top) and violin plots depicting the lineage-defining markers used for cluster annotation (bottom). c, UMAP depicting the clusters colored based on the dataset of origin. d, Venn diagram showing the number of genes that were present in the allo-HCT GI dataset vs. the GI healthy subject dataset. e, Venn diagram showing the overlap of the genes expressed in the allo-HCT GI dataset vs. the overall GvHD filter. f, Enrichment analysis of interferon-related signatures (ref. 88; MSigDB IFNa and IFNg, ref. 89) in the allo-HCT vs. GI healthy subject datasets (p < 0.0001, 2-tailed Mann–Whitney test).
Extended Data Fig. 5 GvL filter details.
a, Schematic depicting the generation of the 2 components of the GvL filter, that is, the AML and hematopoietic (‘heme’) filters. For the ‘AML filter’, a single-cell-based classifier (ref. 39) was applied to bulk RNA-seq data from the Beat AML cohort (ref. 40), to fully capture the AML transcriptional heterogeneity. For the ‘heme filter’, bulk RNA-seq from 18 purified mature hematopoietic cell types (ref. 43) as well as from hematopoietic stem and progenitor cells (HSPCs; refs. 44,45) were the starting source. From the list of expressed genes (TPM > 2), all those with expression in adult non-hematopoietic tissues per the GTEx RNA and protein repositories were excluded to define a set of 650 genes with preferential expression in AML and/or hematopoietic cells. b, Gender-specificity of the GvL filters: the GTEx filtering step was performed in a gender-specific fashion, as genes expressed in the male reproductive organs are not filtered out if the patient is female and, vice versa, genes expressed in the female reproductive organs are retained if the patient is male. c, Histograms depicting the chromosomal location of the 650 genes comprising the ‘AML’ and ‘heme’ filters; below each bar, relative chromosome size is depicted. For the X chromosome, only the pseudo-autosomal regions have been included in the analysis. d, Subcellular localization of the genes in the ‘AML’ and ‘heme’ filters. e, Biological functions of the genes included in the filters. Biological functions (from GO and superpaths) have been manually clustered in macro-groups as specified in Supplementary Table 2.
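The filtering logic in a and b can be sketched as a set operation. The Python below is a minimal illustration under stated assumptions, not the published implementation: the tissue names, the `off_cutoff` value and the function name are hypothetical, and only the TPM > 2 on-target cutoff comes from the legend. A gene is kept when it is expressed in AML or hematopoietic cells and is not expressed in any non-hematopoietic tissue that is relevant for the patient's sex.

```python
from typing import Dict, Set

# Hypothetical tissue labels; the real GTEx tissue list is not reproduced here.
MALE_ONLY_TISSUES = {"testis", "prostate"}
FEMALE_ONLY_TISSUES = {"ovary", "uterus"}


def gvl_filter(heme_tpm: Dict[str, float],
               tissue_tpm: Dict[str, Dict[str, float]],
               patient_sex: str,
               on_cutoff: float = 2.0,    # TPM > 2, as stated in the legend
               off_cutoff: float = 1.0    # assumed value, not from the source
               ) -> Set[str]:
    """Keep genes expressed in AML/heme cells but in no relevant other tissue."""
    if patient_sex == "F":
        ignored = MALE_ONLY_TISSUES      # organs the patient does not have
    elif patient_sex == "M":
        ignored = FEMALE_ONLY_TISSUES
    else:
        raise ValueError("patient_sex must be 'F' or 'M'")

    kept: Set[str] = set()
    for gene, tpm in heme_tpm.items():
        if tpm <= on_cutoff:
            continue  # not expressed in the AML/hematopoietic compartment
        off_target = tissue_tpm.get(gene, {})
        if any(value >= off_cutoff
               for tissue, value in off_target.items()
               if tissue not in ignored):
            continue  # expressed in a tissue the patient actually has
        kept.add(gene)
    return kept


# Toy expression values (invented; gene names are placeholders).
toy_heme = {"G1": 10.0, "G2": 10.0, "G3": 1.0, "G4": 10.0}
toy_tissue = {
    "G1": {"liver": 0.1},
    "G2": {"liver": 5.0},
    "G4": {"testis": 20.0, "liver": 0.0},
}
female_set = gvl_filter(toy_heme, toy_tissue, "F")
male_set = gvl_filter(toy_heme, toy_tissue, "M")
```

The placeholder gene G4 shows the sex-specific step: it is retained for a female patient because its only off-target expression is in testis, and removed for a male patient.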
Extended Data Fig. 6 Y-encoded mHAg filter.
a, Schematic depicting the structure of the Y chromosome with a special focus on the genes in the male-specific region (MSY). Heatmap showing the expression pattern of the genes in the MSY across different healthy adult tissues: only the first 9 genes (RPS4Y1, DDX3Y, KDM5D, EIF1AY, ZFY, USP9Y, TMSB4Y, UTY and NLGN4Y) have evidence of expression (≥1 TPM) in ≥1 adult tissue site of GvHD. PAR, pseudo-autosomal region. b, Stacked histograms showing the number of predicted Y epitopes across individual HLA-A, HLA-B and HLA-C alleles and divided based on the MSY gene of origin. c, Bubble plot showing the median number of predicted epitopes for each MSY gene grouped based on the HLA peptide-binding motif from ref. 28.
Extended Data Fig. 7 Antigenicity and immunogenicity of Y mHAgs.
a, Correlation matrix showing the peptide-binding motifs of the HLA-A, HLA-B and HLA-C alleles from ref. 28; lateral panels display the individual HLA alleles belonging to each peptide-binding motif, whose corresponding monoallelic B721.221 immunopeptidomes have been analyzed in Fig. 2b. b, Hydrophobicity scores of the 410 Y mHAg peptides tested for immunogenicity, grouped by individual HLA restrictions: HLA-A0201 had the highest number of predicted binders with a score >0 (boxplots show min to max and median values; Kruskal–Wallis test with Dunn’s multiple comparisons test). c, Hydrophobicity scores of the 410 Y mHAgs grouped by HLA groups. Whiskers indicate min and max values, with all individual values shown (Kruskal–Wallis test with Dunn’s multiple comparisons test). d, Hydrophobicity scores of the predicted binders for each HLA allele, grouped based on the experimental evidence of T cell immunogenicity (per Fig. 2f): only for HLA-A0101 and HLA-C0501 was hydrophobicity significantly associated with immunogenicity (assessed with 2-tailed unpaired t test). Whiskers indicate min and max values, with all individual values shown. e, UMAP showing cluster annotations of single-cell thymic epithelial cells (TECs) from ref. 62 (left), with feature plots of cluster-defining markers (middle); normalized expression of HLA-A, HLA-B and HLA-C genes per cluster (right).
Extended Data Fig. 8 Tracking of Y mHAg-specific T cells ex vivo.
a, Flow cytometry plots showing the percentage of circulating CD8+ T cells specific for the indicated Y mHAgs at the listed time points, including the donor before allo-HCT, in a patient transplanted from his HLA-identical sister and experiencing severe chronic GvHD. An irrelevant epitope from the EBV EBNA3A protein (HLA-B0702-restricted) was used as control, as both patient and donor were EBV seropositive. b, Timeline depicting the patient clinical course, highlighting the onset and course of the severe chronic GvHD, involving primarily skin and liver as shown by the liver function tests (ALT in red and total bilirubin in green). Triangles, peripheral blood samples used for Y mHAg-specific T cell tracking; diamonds, EBV reactivation. MMF, mycophenolate mofetil; tx, transplant. c, Quantification of ZFY-C0501-specific T cells in the leukapheresis (LK) products of 7 additional female donors to male patients (F to M), compared with T cells stained with a control C0501-dextramer. Boxplots show min to max and median values; significance was assessed with 2-tailed Wilcoxon paired t test.
Extended Data Fig. 9 Autosomal mHAgs and GvHD.
a, Normal distribution of the autosomal mHAg load in the DFCI-MRD cohort. b, Cumulative incidence of NIH moderate/severe chronic GvHD, stratifying patients by overall autosomal mHAg load below (orange) or above (yellow) the median. No difference in 5-year cumulative incidence was observed: 42% (95% confidence interval: 33–52%) and 39% (95% confidence interval: 30–49%) for < median and > median, respectively; 2-sided p = 0.8 (Gray’s test). c, Distribution of patients experiencing grade II–IV skin (left) and GI (right) acute GvHD across deciles of skin and GI mHAgs, respectively. d, Distribution of patients experiencing NIH moderate/severe organ-specific chronic GvHD across deciles of mHAgs expressed in the indicated GvHD target organs: from left to right, skin, GI, liver, eye and oral. e, Heatmap depicting the co-occurrence of the 7 SNPs associated with liver acute GvHD in, from left to right, patients with liver acute GvHD, patients experiencing acute GvHD without liver involvement and patients with chronic liver GvHD. f, Number of co-occurring driver liver mHAgs in the 3 patient groups outlined in e and defined with the same color code. Boxplots show min to max and median values (Kruskal–Wallis test with Dunn’s multiple comparisons test). g, Promoter analysis of the genes harboring the SNPs associated with liver acute GvHD: 4 of 7 genes have interferon-responsive elements in their promoter region. Transcription factor binding site locations within 1500 base pairs (bp) upstream of the transcription start site (position 0) and the 5′ UTR are indicated.
Extended Data Fig. 10 Population coverage simulating a T cell-based immunotherapy approach targeting GRFS mHAgs.
a, Heatmap showing the donor–recipient pairs (DRPs) that are informative for the pool of 54 GRFS epitopes indicated in the columns. DRPs (rows) are grouped by population of origin of the simulated recipient, as shown in the inner right bar. The outer right bar shows the number of predicted epitopes per DRP. Gray histograms on the bottom indicate the number of informative DRPs for each epitope. b, Population coverage analysis for: from top to bottom, overall simulation cohort, EUR, EAS, SAS, AFR and AMR. The histogram bars denote the percentage of DRPs that are informative for the indicated number of epitope hits, while the open circles indicate the cumulative percentage of population coverage for each number of epitope hits. The percentage indicated in the top-right corner of each graph shows the % of population for which ≥1 GRFS epitope could be potentially targeted. The red line denotes the 90% threshold of population coverage, which is considered optimal.
Supplementary information
Supplementary Information
Supplementary Figs. 1–18.
Supplementary Tables
Supplementary Tables 1–7.
Supplementary Data
Code for correlative outcome analyses.
Source data
Source Data Fig. 2
Statistical source data for Fig. 2i.
Source Data Fig. 3
Statistical source data for Fig. 3.
Source Data Fig. 4
Statistical source data for Fig. 4c,g.
Source Data Fig. 5
Statistical source data for Fig. 5.
Source Data Extended Data Fig. 3
Statistical source data for Fig. 3b,c.
Source Data Extended Data Fig. 4
Statistical source data for Fig. 4f.
Source Data Extended Data Fig. 7
Statistical source data for Fig. 7b–d.
Source Data Extended Data Fig. 8
Statistical source data for Fig. 8c.
Source Data Extended Data Fig. 9
Statistical source data for Fig. 9f.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cieri, N., Hookeri, N., Stromhaug, K. et al. Systematic identification of minor histocompatibility antigens predicts outcomes of allogeneic hematopoietic cell transplantation. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-024-02348-3
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41587-024-02348-3