
Unified Multi-modal Learning for Any Modality Combinations in Alzheimer’s Disease Diagnosis

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 (MICCAI 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15003)


Abstract

Our method addresses unified multi-modal learning in a diverse and imbalanced setting, the key features that distinguish medical modalities from the extensively studied ones. Unlike existing works that assume a fixed or maximum number of modalities, our model not only manages any missing-modality scenario but also handles new modalities and unseen combinations. We argue that the key to such an any-combination model is the proper design of alignment, which should guarantee both modality invariance across diverse inputs and effective modeling of complementarities within the unified metric space. Instead of exact cross-modal alignment, we propose to decouple these two functions into representation-level and task-level alignment, which we empirically show are both indispensable for this task. Moreover, we introduce a tunable modality-agnostic Transformer to unify the representation learning process, which significantly reduces modality-specific parameters and enhances the scalability of our model. Experiments show that the proposed method enables a single model to handle all possible combinations of the six seen modalities and two new modalities in Alzheimer's Disease diagnosis, with superior performance on longer combinations.
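
To make the architectural idea concrete, here is a minimal PyTorch sketch of a modality-agnostic Transformer that accepts any subset of modalities, with a shared task head (task-level alignment) and per-modality tokens that a representation-level alignment loss could act on. This is an illustration under stated assumptions (linear tokenizers, a standard encoder, made-up feature dimensions), not the authors' implementation; every name and hyperparameter below is hypothetical.

```python
import torch
import torch.nn as nn

class AnyCombinationNet(nn.Module):
    """One shared Transformer over tokens from whichever modalities are
    present; only the tokenizers are modality-specific, so supporting a
    new modality means adding one small tokenizer."""

    def __init__(self, modal_dims, d_model=256, n_classes=3):
        super().__init__()
        # Lightweight per-modality tokenizers (the only modality-specific weights).
        self.tokenizers = nn.ModuleDict(
            {name: nn.Linear(dim, d_model) for name, dim in modal_dims.items()}
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)  # modality-agnostic
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))  # fused summary token
        self.head = nn.Linear(d_model, n_classes)            # shared diagnosis head

    def forward(self, inputs):
        # `inputs`: dict of modality name -> (batch, feat_dim); any subset works.
        tokens = [self.tokenizers[m](x).unsqueeze(1) for m, x in inputs.items()]
        batch = tokens[0].shape[0]
        seq = torch.cat([self.cls.expand(batch, -1, -1)] + tokens, dim=1)
        out = self.backbone(seq)
        fused, per_modality = out[:, 0], out[:, 1:]
        # `fused` feeds the task-level objective (shared classification head);
        # `per_modality` tokens could feed a representation-level alignment loss,
        # e.g. a contrastive objective pulling tokens of the same subject together.
        return self.head(fused), per_modality

# Usage on an arbitrary (even previously unseen) modality combination:
model = AnyCombinationNet({"mri": 512, "pet": 512, "csf": 5, "cognitive": 12})
logits, tokens = model({"mri": torch.randn(4, 512), "csf": torch.randn(4, 5)})
```

Because the backbone never assumes a fixed input length or ordering, the same weights serve every combination; this is the property that lets modality-specific parameters stay small relative to the shared encoder.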


Acknowledgments

The study was funded by the General Research Fund of the Hong Kong Research Grants Council (No. 15218521) and by a grant under the Theme-based Research Scheme of the Hong Kong Research Grants Council (No. T45-401/22-N).

Author information

Corresponding author

Correspondence to Jing Qin.


Ethics declarations

Disclosure of Interests

Author Yidan Feng has received research grants from the General Research Fund of the Hong Kong Research Grants Council (No. 15218521) and the Theme-based Research Scheme of the Hong Kong Research Grants Council (No. T45-401/22-N).


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Feng, Y., Gao, B., Deng, S., Qiu, A., Qin, J. (2024). Unified Multi-modal Learning for Any Modality Combinations in Alzheimer’s Disease Diagnosis. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15003. Springer, Cham. https://doi.org/10.1007/978-3-031-72384-1_46


  • DOI: https://doi.org/10.1007/978-3-031-72384-1_46


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72383-4

  • Online ISBN: 978-3-031-72384-1

  • eBook Packages: Computer Science, Computer Science (R0)
