Abstract
The abundance of biomarkers across histology, imaging, and clinical endpoints makes it difficult to select indicators for personalized clinical decision support. Patient heterogeneity calls for an adaptive, incremental approach to indicator selection, and missing data further complicates this process. To address these challenges, we propose Forest Chain (F-Chain), a learning framework that incrementally selects prognostic indicators for each patient. Using a proposed surrogate preference function, F-Chain achieves consistent evaluations across multiple doctors and data sources. We introduce an indicator selection strategy that integrates data information and gradually adds relevant indicators. Additionally, we develop a missingness-incorporated decision tree for predicting outcomes on multi-source datasets with substantial missing values. We validate F-Chain on the SEER database and on real clinical data from a hospital, where it achieves better overall survival (OS) prediction results than state-of-the-art methods.
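To make the idea of a missingness-aware tree split concrete, the following is a minimal sketch and not the F-Chain implementation, whose details the abstract does not give. It assumes binary outcome labels, a single numeric indicator, and Gini impurity as the split criterion. The names `gini` and `best_split_with_missing` are hypothetical. Each candidate threshold is scored twice, once with missing values routed to the left child and once to the right, so that missingness itself can carry information for the split.

```python
import math
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_with_missing(
    values: List[Optional[float]], labels: List[int]
) -> Tuple[float, bool, float]:
    """Return (threshold, missing_goes_left, weighted_gini) for one indicator.

    A value of None marks a missing measurement. Illustrative assumption:
    missing values are routed as a group to whichever child gives the
    lower weighted impurity.
    """
    observed = sorted({v for v in values if v is not None})
    best = (math.inf, True, math.inf)  # (threshold, missing_left, impurity)
    n = len(labels)
    # Candidate thresholds are midpoints between consecutive observed values.
    for lo, hi in zip(observed, observed[1:]):
        threshold = (lo + hi) / 2.0
        for missing_left in (True, False):
            left: List[int] = []
            right: List[int] = []
            for v, y in zip(values, labels):
                go_left = missing_left if v is None else v <= threshold
                (left if go_left else right).append(y)
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (threshold, missing_left, score)
    return best
```

For example, calling `best_split_with_missing([1.0, 2.0, None, 8.0, 9.0, None], [0, 0, 1, 1, 1, 1])` returns `(5.0, False, 0.0)`: the two patients with a missing value share the outcome of the high-value group, so routing them to the right child yields a pure split.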
Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
This work was supported by the Joint Medical-Industrial Intersection Foundation of Dalian University of Technology (DUT23YG204), the National Natural Science Foundation of China (62006035), the Dalian Science and Technology Innovation Foundation (2023JJ13SN065), and the Fundamental Research Funds for the Central Universities (DUT22RC(3)011).
Author information
Contributions
Qiucen Li and Zedong Du contributed equally to this work and should be regarded as co-first authors. Qiucen L. and Z.D. performed conceptualization, methodology, formal analysis, investigation, and writing, including the original draft, review, and editing. Qiu.L. and P.Z. were responsible for supervision and data curation. H.G. verified the code and methods, and conducted data preprocessing. X.H. visualized the experimental results. D.L. verified the code and methods, and revised the manuscript. Z.C. provided funding support for the entire project and was responsible for project management and supervision. All authors reviewed the manuscript.
Ethics declarations
Conflicts of interest
The authors declare no competing interests.
About this article
Cite this article
Li, Q., Du, Z., Li, Q. et al. F-Chain: personalized overall survival prediction based on incremental adaptive indicators and multi-source clinical records. Memetic Comp. 16, 269–284 (2024). https://doi.org/10.1007/s12293-024-00415-5
DOI: https://doi.org/10.1007/s12293-024-00415-5