Abstract
Recently, there has been great interest in computer-aided diagnosis of Alzheimer’s disease (AD) and its prodromal stage, mild cognitive impairment (MCI). Unlike previous methods that considered simple low-level features, such as gray matter tissue volumes from MRI and mean signal intensities from PET, in this paper we propose a deep learning-based latent feature representation with a stacked auto-encoder (SAE). We believe that latent, complicated non-linear patterns, such as relations among features, are inherent in the low-level features. Combining the latent information with the original features helps build a robust model for AD/MCI classification with high diagnostic accuracy. Furthermore, thanks to the unsupervised nature of pre-training in deep learning, we can use target-unrelated samples to initialize the parameters of the SAE, thus finding optimal parameters when fine-tuning with the target-related samples and further enhancing classification performance across four binary classification problems: AD vs. healthy normal control (HC), MCI vs. HC, AD vs. MCI, and MCI converter (MCI-C) vs. MCI non-converter (MCI-NC). In our experiments on the ADNI dataset, we validated the effectiveness of the proposed method, obtaining accuracies of 98.8, 90.7, 83.7, and 83.3 % for AD/HC, MCI/HC, AD/MCI, and MCI-C/MCI-NC classification, respectively. We believe that deep learning can shed new light on neuroimaging data analysis, and our work demonstrates the applicability of this method to brain disease diagnosis.
Notes
Although the ADNI database contains more than 800 subjects in total, only 202 subjects have baseline data that include all of the MRI, FDG-PET, and CSF modalities.
Refer to http://www.adniinfo.org for the details.
While the low-level simple features should be the voxels in MRI and FDG-PET, their high dimensionality and the small sample size make this impractical. In this paper, we therefore take an ROI-based approach and consider the conical GM tissue volume and the mean intensity of each ROI, from MRI and FDG-PET respectively, as the low-level features.
In this work, we set γ = 0.01 and ρ = 0.05.
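As a hedged illustration of how these two values are typically used, the sketch below assumes the conventional sparse auto-encoder objective, in which ρ is the target average activation of each hidden unit and γ weights a Kullback-Leibler sparsity penalty added to the reconstruction error. This reading of the symbols is an assumption, since the note gives only the values; the function names and the toy activations are hypothetical.

```python
import numpy as np

GAMMA = 0.01  # value from the note; ASSUMED to weight the sparsity penalty
RHO = 0.05    # value from the note; ASSUMED to be the target mean activation


def kl_sparsity_penalty(hidden, rho=RHO, eps=1e-8):
    """Sum over hidden units of KL(rho || rho_hat_j).

    hidden: array of shape (n_samples, n_hidden) with activations in (0, 1).
    """
    # Mean activation of each hidden unit, clipped to keep the logs finite.
    rho_hat = np.clip(hidden.mean(axis=0), eps, 1.0 - eps)
    kl = (rho * np.log(rho / rho_hat)
          + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
    return float(kl.sum())


def sparse_ae_loss(x, x_rec, hidden, gamma=GAMMA, rho=RHO):
    """Mean squared reconstruction error plus the weighted sparsity penalty."""
    reconstruction = 0.5 * np.mean(np.sum((x - x_rec) ** 2, axis=1))
    return reconstruction + gamma * kl_sparsity_penalty(hidden, rho)


# Toy activations (hypothetical): 4 samples, 3 hidden units.
on_target = np.full((4, 3), RHO)   # mean activation equals rho
too_active = np.full((4, 3), 0.5)  # mean activation far above rho
penalty_on_target = kl_sparsity_penalty(on_target)
penalty_too_active = kl_sparsity_penalty(too_active)
```

Under this assumed objective, hidden units whose mean activation matches ρ contribute no penalty, while more active units are penalized in proportion to γ.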
In our case, the tasks are to regress the class label and the MMSE and ADAS-Cog scores.
In this work, \(\mathbf{t}^{(1)}_{s}=\cdots=\mathbf{t}^{(m)}_{s}=\cdots=\mathbf{t}^{(M)}_{s}.\)
CONCAT represents a concatenation of the features from MRI, FDG-PET, and CSF into a single vector, which is the most direct and intuitive way of combining multimodal information.
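The CONCAT operation described in this note can be sketched as follows. The feature counts, the random data, and the helper name `concat_modalities` are hypothetical and chosen only for illustration.

```python
import numpy as np


def concat_modalities(mri, pet, csf):
    """Concatenate per-subject feature matrices along the feature axis."""
    mri, pet, csf = (np.asarray(a, dtype=float) for a in (mri, pet, csf))
    # Every modality must have one row per subject, in the same order.
    if not (mri.shape[0] == pet.shape[0] == csf.shape[0]):
        raise ValueError("all modalities must describe the same subjects")
    return np.concatenate([mri, pet, csf], axis=1)


# Hypothetical sizes: 5 subjects, 4 MRI, 4 FDG-PET, and 3 CSF features.
rng = np.random.default_rng(0)
mri = rng.normal(size=(5, 4))
pet = rng.normal(size=(5, 4))
csf = rng.normal(size=(5, 3))
features = concat_modalities(mri, pet, csf)
```

Each row of `features` is one subject's single multimodal vector, with the MRI features first, then FDG-PET, then CSF.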
We considered [100, 300, 500, 1,000]–[50, 100]–[10, 20, 30] and [10, 20, 30]–[1, 2, 3] (bottom–top) for three-layer and two-layer networks, respectively.
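To make the size of this candidate set concrete, the sketch below enumerates every combination of hidden-unit counts listed in the note. It assumes an exhaustive grid over the listed values, which the note does not state explicitly; the names are hypothetical.

```python
from itertools import product

# Candidate hidden-unit counts per layer, listed bottom to top as in the note.
THREE_LAYER = [[100, 300, 500, 1000], [50, 100], [10, 20, 30]]
TWO_LAYER = [[10, 20, 30], [1, 2, 3]]


def enumerate_architectures(layer_options):
    """Return every combination of one hidden-unit count per layer."""
    return [tuple(combo) for combo in product(*layer_options)]


three_layer_candidates = enumerate_architectures(THREE_LAYER)
two_layer_candidates = enumerate_architectures(TWO_LAYER)
```

Under this assumption there are 4 × 2 × 3 = 24 three-layer candidates and 3 × 3 = 9 two-layer candidates.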
Refer to "Sparse auto-encoder" for an explanation of the supervised learning.
Acknowledgments
This work was supported in part by NIH grants EB006733, EB008374, EB009634, AG041721, MH100217, and AG042599, and also by the National Research Foundation grant (No. 2012-005741) funded by the Korean government.
Cite this article
Suk, HI., Lee, SW., Shen, D. et al. Latent feature representation with stacked auto-encoder for AD/MCI diagnosis. Brain Struct Funct 220, 841–859 (2015). https://doi.org/10.1007/s00429-013-0687-3
DOI: https://doi.org/10.1007/s00429-013-0687-3