Abstract
Accurately predicting the molecular subtype of cancer patients is of great significance for the personalized diagnosis and treatment of cancer. The growing availability of multi-omics data and advances in data-driven methods are expected to promote the molecular subtyping of cancer. However, existing methods are limited in their ability to handle high-dimensional data and are affected by misleading and irrelevant factors, which results in ambiguous and overlapping subtypes. This article proposes a method called Multi-Omics Subtypes of Digestive System Tumors (MSDST) for subtype identification of digestive system tumors. The method learns a new representation of the relationships between samples from multi-omics data, and uses an autoencoder composed of omics-specific graph convolutional networks to learn a high-level representation of the features of each omics data type while taking the prognosis prediction results into account. Finally, the k-means algorithm is used to cluster the samples for analysis. Compared with other state-of-the-art methods, the proposed method performs better in identifying digestive system tumor subtypes. Subsequent clinical data analysis and functional enrichment analysis further confirm the specific biological characteristics and functional differences of the identified subtypes. This research provides new ideas and methods for precision medicine, and is expected to promote personalized treatment and improve the prognosis of digestive system tumors.
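The overall flow described in the abstract (a sample-relationship graph per omics type, graph convolution over that graph, then k-means on the resulting representation) can be illustrated with a minimal sketch. This is not the MSDST implementation: the cosine k-nearest-neighbour graph, the single untrained propagation step standing in for the trained autoencoder, the omission of the prognosis prediction term, and all function names and toy data below are assumptions made purely for illustration.

```python
import numpy as np

def knn_graph(x, k):
    """Symmetric k-nearest-neighbour adjacency from cosine similarity (assumed graph choice)."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)            # exclude each sample from its own neighbours
    nearest = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(x.shape[0]), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # make the graph undirected

def gcn_propagate(adj, x):
    """One untrained graph-convolution step: D^-1/2 (A + I) D^-1/2 X."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return (a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]) @ x

def kmeans(z, n_clusters, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [z[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.linalg.norm(z - c, axis=1) for c in centers], axis=0)
        centers.append(z[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(z[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def subtype_sketch(omics, n_clusters, k=3):
    """Embed each omics matrix over its own sample graph, concatenate, then cluster."""
    embeddings = [gcn_propagate(knn_graph(x, k), x) for x in omics]
    return kmeans(np.hstack(embeddings), n_clusters)

# Toy data (hypothetical): 20 samples in two well-separated groups, two omics types.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 10)

def toy_omics(n_features):
    means = np.zeros((2, n_features))
    means[0, : n_features // 2] = 5.0
    means[1, n_features // 2 :] = 5.0
    return means[truth] + rng.normal(scale=0.5, size=(20, n_features))

omics = [toy_omics(8), toy_omics(6)]
labels = subtype_sketch(omics, n_clusters=2)
print(labels)
```

In a real pipeline the propagation step would be replaced by trained, omics-specific encoder and decoder layers, and the number of clusters would be chosen from the data rather than fixed in advance.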
Data Availability
This study analyzed publicly available datasets generated by the Cancer Genome Atlas (TCGA), managed by the National Cancer Institute (NCI). These datasets can be found at: http://cancergenome.nih.gov.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. U20A20225 and U2013601), in part by the Anhui Province Natural Science Funds for Distinguished Young Scholar (Grant No. 2308085J02), in part by the Science and Technology Innovation 2030 - "New Generation Artificial Intelligence" Major Project (Grant No. 2022ZD0116305), in part by the Innovation Leading Talent of Anhui Province TeZhi plan, in part by the Natural Science Foundation of Hefei, China (Grant No. 202321), and in part by the CAAI-Huawei MindSpore Open Fund (Grant No. CAAIXSJLJJ-2022-011A). The authors also acknowledge funding support from the Anhui Engineering Research Center on Information Fusion and Control of Intelligent Robot (Grant No. IFCIR2024001).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zhou, L., Wang, N., Zhu, Z. et al. Identification of subtypes in digestive system tumors based on multi-omics data and graph convolutional network. Int. J. Mach. Learn. & Cyber. 15, 3567–3577 (2024). https://doi.org/10.1007/s13042-024-02109-3
DOI: https://doi.org/10.1007/s13042-024-02109-3