Predicting the Conversion from Mild Cognitive Impairment to Alzheimer’s Disease Using an Explainable AI Approach
Figure 1. Demographics summary.
Figure 2. Sankey diagram of patients' clinical state from baseline until the 48th-month visit.
Figure 3. Missing values at baseline visit.
Figure 4. Prediction method using baseline and first annual visit data.
Figure 5. Training workflow used for creating the model.
Figure 6. Recursive Feature Elimination workflow as implemented in scikit-learn.
Figure 7. Summarized confusion matrix from 5-fold cross-validation.
Figure 8. SHAP summary plot.
Figure 9. SHAP feature importance plot.
Abstract
1. Introduction
2. Materials and Methods
2.1. Data and Preprocessing
2.1.1. Model Selection
2.1.2. XGBoost
2.1.3. CatBoost
2.1.4. Light Gradient Boosting Machine
2.1.5. Standard Machine Learning Classifiers
2.1.6. Experiment Setup and Model Comparison
2.2. Training Pipeline
Hyperparameter Tuning
3. Results
Shapley Additive Explanations
4. Discussion
5. Conclusions
6. Limitations
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Index | Feature | Median | Mean | Max | Std | Membership Status |
---|---|---|---|---|---|---|
0 | DX_bl | | | | | [2.0, 1.0] |
1 | AGE | 74 | 73.53801743 | 91.4 | 7.610487740946301 | |
2 | PTEDUCAT | 16 | 15.790849673202615 | 20 | 2.8848952582917042 | |
3 | APOE4 | | | | | [1.0, 0.0, 2.0] |
4 | FDG | 1.422475 | 1.279733683442266 | 1.57338 | 0.17810577352607126 | |
5 | CDRSB | 2 | 2.6919389978213513 | 15 | 2.3115707124725917 | |
6 | ADAS11 | 11.33 | 13.06316557734205 | 56.33 | 7.848523985235578 | |
7 | ADAS13 | 19 | 20.335211328976037 | 71.33 | 10.712926813970377 | |
8 | ADASQ4 | 7 | 6.328322440087145 | 10 | 2.8895712942428418 | |
9 | MMSE | 27 | 25.861220043572985 | 30 | 3.819875621530087 | |
10 | RAVLT_immediate | 30 | 30.709586056644877 | 71 | 11.989765521262475 | |
11 | RAVLT_learning | 3 | 3.3753812636165574 | 12 | 2.626026021958254 | |
12 | RAVLT_forgetting | 5 | 4.550762527233116 | 14 | 2.727335365616795 | |
13 | RAVLT_perc_forgetting | 83.3333 | 68.94227735294118 | 100 | 66.12887476892881 | |
14 | LDELTOTAL | 4 | 5.311111111111112 | 25 | 5.1133143144938495 | |
15 | TRABSCOR | 107.5 | 138.7888888888889 | 300 | 83.34054120772674 | |
16 | FAQ | 4 | 7.128322440087145 | 30 | 7.761732851339966 | |
17 | Ventricles | 39,943.79 | 44,969.496797385626 | 151,426 | 23,686.835523710066 | |
18 | Hippocampus | 6390.5 | 6366.0328758169935 | 10,452 | 1201.5157640328403 | |
19 | WholeBrain | 1,007,820 | 1,009,664.1372549019 | 1,428,190 | 112,656.48231469237 | |
20 | Entorhinal | 3281.6 | 3285.5557734204795 | 5770 | 773.1931836208981 | |
21 | Fusiform | 16,739.5 | 16,767.563834422657 | 28,878 | 2781.09519 | |
22 | MidTemp | 18,553.4 | 18,716.070370370373 | 29,006 | 2960.8444166415316 | |
23 | ICV | 1,522,580 | 1,540,374.3877995643 | 2,100,210 | 164,601.06052299083 | |
24 | mPACCdigit | −8.21001 | −8.66117474 | 5.95912 | 6.933461669969725 | |
25 | mPACCtrailsB | −7.9708 | −8.337507947 | 6.13315 | 6.824617271140973 | |
26 | CDRSB_bl | 1.5 | 2.070806100217865 | 10 | 1.5453513515946802 | |
27 | ADAS11_bl | 11 | 11.881372549019607 | 36 | 5.700384925550372 | |
28 | ADAS13_bl | 18 | 18.951222222222224 | 50 | 8.290366685918455 | |
29 | MMSE_bl | 27 | 26.715686274509803 | 30 | 2.550000345894958 | |
30 | RAVLT_immediate_bl | 30 | 32.001960784313724 | 68 | 10.788177445521617 | |
31 | RAVLT_learning_bl | 3 | 3.621786492374728 | 11 | 2.575508392318168 | |
32 | RAVLT_forgetting_bl | 5 | 4.614596949891068 | 13 | 2.296667824916341 | |
33 | RAVLT_perc_forgetting_bl | 71.4286 | 66.49644300653596 | 100 | 32.46909675562216 | |
34 | LDELTOTAL_BL | 4 | 4.790849673202614 | 18 | 3.62889217 | |
35 | TRABSCOR_bl | 105.1 | 131.97015250544663 | 300 | 76.91764281066624 | |
36 | FAQ_bl | 3 | 5.153159041394336 | 30 | 6.195501396652368 | |
37 | mPACCdigit_bl | −7.385415 | −7.820029706 | 2.23768 | 4.944942117584728 | |
38 | mPACCtrailsB_bl | −6.988085 | −7.485260064 | 2.7732 | 4.926875777371472 | |
39 | Ventricles_bl | 37,837.5 | 42,501.538061002175 | 157,713 | 22,875.046972145934 | |
40 | Hippocampus_bl | 6528 | 6547.429477124183 | 9929 | 1178.2954938265814 | |
41 | WholeBrain_bl | 1,015,965 | 1,020,942.0840958606 | 1,443,990 | 113,929.04757611542 | |
42 | Entorhinal_bl | 3360.5 | 3365.5603485838783 | 5896 | 770.1566993159872 | |
43 | Fusiform_bl | 17,023.5 | 17,064.395642701526 | 26,280 | 2763.6379991806716 | |
44 | MidTemp_bl | 19,086.3 | 19,171.861437908494 | 29,292 | 2966.722264517154 | |
45 | ICV_bl | 1,527,190 | 1,542,547.285 | 2,714,340 | 169,932.87108013846 | |
46 | FDG_bl | 1.1870850000000002 | 1.2005082629629629 | 1.70113 | 0.1342208035987261 | |
47 | Years_bl | 1.00205 | 1.0118026601307188 | 1.2512 | 0.049772686 | |
48 | TAU_bl | 298.89 | 305.8513442265795 | 816.9 | 110.61959394290388 | |
49 | ABETA_bl | 754.74 | 867.6725054466232 | 1700 | 380.51187522862955 | |
50 | PTAU_bl | 29.145 | 30.04860784313726 | 94.86 | 12.535837824740023 | |
51 | TAU | 297.9 | 317.38917211328976 | 802.4 | 73.68806040025473 | |
52 | DX | | | | | [2, 1] |
53 | MOCA | 23 | 22.769162995594716 | 30 | 4.020161189133941 | |
54 | EcogPtLang | 1.77778 | 1.8973843788546254 | 4 | 0.6874612645623314 | |
55 | EcogPtVisspat | 1.28571 | 1.507761947136564 | 4 | 0.6135489175939899 | |
56 | EcogPtPlan | 1.4 | 1.5351908810572688 | 3.8 | 0.6146249328058496 | |
57 | EcogPtDivatt | 2 | 2.006057268722467 | 4 | 0.8197753125466491 | |
58 | EcogPtTotal | 1.74359 | 1.8405794669603526 | 3.69231 | 0.5772989648737149 | |
59 | EcogSPLang | 1.58611 | 1.8575385903083699 | 4 | 0.7996222455335457 | |
60 | EcogSPPlan | 1.5 | 1.8114684140969162 | 4 | 0.9016202278521113 | |
61 | EcogSPOrgan | 1.66667 | 1.9126724317180617 | 4 | 0.9180337285803657 | |
62 | EcogSPDivatt | 2 | 2.182672577092511 | 4 | 0.9494293241028594 | |
63 | EcogSPTotal | 1.74359 | 1.9563972290748899 | 3.97368 | 0.7966188010118953 | |
64 | MOCA_bl | 23 | 22.822907488986782 | 30 | 3.5954600273400152 | |
65 | EcogPtLang_bl | 1.77778 | 1.9069771189427316 | 4 | 0.6946476139514485 | |
66 | EcogPtVisspat_bl | 1.28571 | 1.4918661453744495 | 4 | 0.619250998 | |
67 | EcogPtOrgan_bl | 1.416666 | 1.6129223039647576 | 4 | 0.6744607742485486 | |
68 | EcogPtDivatt_bl | 1.75 | 1.9948237885462554 | 4 | 0.8070685154926579 | |
69 | EcogPtTotal_bl | 1.7142400000000002 | 1.8485160044052862 | 3.85294 | 0.5781348132153441 | |
70 | EcogSPMem_bl | 2.25 | 2.3429122026431717 | 4 | 0.8585914694597968 | |
71 | EcogSPLang_bl | 1.55556 | 1.7631507488986786 | 4 | 0.7370702823814411 | |
72 | EcogSPVisspat_bl | 1.266666 | 1.5279054140969162 | 4 | 0.6853481425640744 | |
73 | EcogSPOrgan_bl | 1.5 | 1.735022181 | 4 | 0.800743658 | |
74 | EcogSPTotal_bl | 1.71053 | 1.8598564933920703 | 3.89744 | 0.6867589759636067 | |
75 | Transition | | | | | [0, 1] |
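The per-feature statistics in this appendix table (median, mean, maximum, standard deviation, and the set of observed values for categorical features) can be recomputed from a feature column with a few lines of Python. The sketch below is illustrative only: the function name `summarize_feature`, the `categorical` flag, and the toy values are assumptions, and it uses the sample standard deviation, which may not be the estimator used for the table.

```python
import statistics

def summarize_feature(values, categorical=False):
    """Summarize one feature column, skipping missing entries (None)."""
    observed = [v for v in values if v is not None]
    if categorical:
        # Categorical features are described by their set of observed levels.
        return {"membership": sorted(set(observed))}
    return {
        "median": statistics.median(observed),
        "mean": statistics.fmean(observed),
        "max": max(observed),
        # Sample standard deviation (n - 1 denominator); an assumption.
        "std": statistics.stdev(observed),
    }

# Toy values for illustration, not taken from the study data.
toy_mmse = [27, 30, None, 24, 29]
toy_apoe4 = [1.0, 0.0, 2.0, 0.0, None]

mmse_summary = summarize_feature(toy_mmse)
apoe4_summary = summarize_feature(toy_apoe4, categorical=True)
print(mmse_summary)
print(apoe4_summary)
```

Missing entries are dropped before summarizing, so each statistic reflects only the visits where the feature was actually recorded.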
Algorithm | Class | Precision | Recall | Accuracy |
---|---|---|---|---|
XGBoost | Stable | 0.90 | 0.89 | 0.86 |
XGBoost | Transition | 0.79 | 0.77 | |
CatBoost | Stable | 0.90 | 0.88 | 0.85 |
CatBoost | Transition | 0.77 | 0.76 | |
Light Gradient Boosting | Stable | 0.89 | 0.90 | 0.86 |
Light Gradient Boosting | Transition | 0.76 | 0.75 | |
Decision Tree | Stable | 0.85 | 0.86 | 0.79 |
Decision Tree | Transition | 0.67 | 0.66 | |
Logistic Regression | Stable | 0.70 | 0.91 | 0.66 |
Logistic Regression | Transition | 0.32 | 0.11 | |
Naive Bayes | Stable | 0.87 | 0.62 | 0.66 |
Naive Bayes | Transition | 0.47 | 0.78 | |
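When the classes are unevenly represented, overall accuracy can hide poor recall on one of them. One way to read the comparison table with that in mind is balanced accuracy, the unweighted mean of the per-class recalls. The sketch below recomputes it from the recall values reported above; the dictionary layout and the function name `balanced_accuracy` are illustrative assumptions, not part of the original analysis.

```python
# Per-class recall as reported in the model comparison table: (Stable, Transition).
reported_recall = {
    "XGBoost": (0.89, 0.77),
    "CatBoost": (0.88, 0.76),
    "Light Gradient Boosting": (0.90, 0.75),
    "Decision Tree": (0.86, 0.66),
    "Logistic Regression": (0.91, 0.11),
    "Naive Bayes": (0.62, 0.78),
}

def balanced_accuracy(recalls):
    """Unweighted mean of per-class recall values."""
    return sum(recalls) / len(recalls)

balanced = {
    name: round(balanced_accuracy(recalls), 3)
    for name, recalls in reported_recall.items()
}

# Print the models from highest to lowest balanced accuracy.
for name, score in sorted(balanced.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

On these reported values, logistic regression falls to about 0.51 despite an accuracy of 0.66, while the three gradient boosting models stay at 0.82 or higher.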
Parameter | Value |
---|---|
colsample_bytree | 0.6714223800630487 |
gamma | 0.7244817045778367 |
learning_rate | 0.01 |
min_child_weight | 10 |
n_estimators | 1000 |
scale_pos_weight | 5.0 |
subsample | 0.5676926067435525 |
max_depth | 7 |
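The tuned values above map onto keyword arguments of the scikit-learn-style XGBoost classifier. The following is a minimal sketch, assuming the table entries correspond to the standard XGBoost parameter names (`colsample_bytree`, `learning_rate`, `scale_pos_weight`, `max_depth`, and so on); the helper `build_classifier` is hypothetical, and the `xgboost` package must be installed before it is called.

```python
# Tuned hyperparameters as listed in the table (values copied verbatim).
TUNED_PARAMS = {
    "colsample_bytree": 0.6714223800630487,
    "gamma": 0.7244817045778367,
    "learning_rate": 0.01,
    "min_child_weight": 10,
    "n_estimators": 1000,
    "scale_pos_weight": 5.0,
    "subsample": 0.5676926067435525,
    "max_depth": 7,
}

def build_classifier(params=TUNED_PARAMS):
    """Return an unfitted XGBClassifier configured with the given parameters."""
    # Imported lazily so the dictionary can be inspected without xgboost installed.
    from xgboost import XGBClassifier
    return XGBClassifier(**params)

for name, value in TUNED_PARAMS.items():
    print(f"{name} = {value}")
```

A `scale_pos_weight` above 1 up-weights the positive class during training, which here is presumably the Transition class.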
Target Class | Precision | Recall | F1-Score | Accuracy | ROC AUC |
---|---|---|---|---|---|
Stable | 0.94 | 0.84 | 0.90 | 0.85 | 0.86 |
Transition | 0.71 | 0.88 | 0.79 | | |
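The F1-score column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick check, the sketch below recomputes it from the rounded precision and recall in the table; the function name `f1_score` is an illustrative choice, and small differences from the reported values are expected because the inputs are already rounded to two decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the table above.
reported = {"Stable": (0.94, 0.84), "Transition": (0.71, 0.88)}

f1 = {label: f1_score(p, r) for label, (p, r) in reported.items()}
for label, value in f1.items():
    print(f"{label}: F1 = {value:.3f}")
```

From the rounded inputs this gives roughly 0.79 for Transition, matching the table, and roughly 0.89 for Stable, slightly below the reported 0.90.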
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Grammenos, G.; Vrahatis, A.G.; Vlamos, P.; Palejev, D.; Exarchos, T.; for the Alzheimer’s Disease Neuroimaging Initiative. Predicting the Conversion from Mild Cognitive Impairment to Alzheimer’s Disease Using an Explainable AI Approach. Information 2024, 15, 249. https://doi.org/10.3390/info15050249