Abstract
Echocardiography (echo) is an ultrasound imaging modality widely used for a range of cardiovascular diagnosis tasks. Because echo-based diagnosis suffers from inter-observer variability, arising from differences in image acquisition and in experience-dependent interpretation, vision-based machine learning (ML) methods have gained popularity as a secondary layer of verification. For such safety-critical applications, any proposed ML method must offer a level of explainability alongside good accuracy. It must also be able to process several echo videos acquired from different heart views, and the interactions among them, to produce predictions for a variety of cardiovascular measurement and interpretation tasks. Prior work either lacks explainability or is limited in scope to a single cardiovascular task. To remedy this, we propose a General, Echo-based, Multi-Level Transformer (GEMTrans) framework that provides explainability while enabling multi-video training, in which the interplay among echo image patches within a frame, among all frames of a video, and among videos is captured according to the downstream task. We demonstrate the flexibility of our framework on two critical tasks: ejection fraction (EF) estimation and aortic stenosis (AS) severity detection. Our model achieves mean absolute errors of 4.15 and 4.84 for single- and dual-video EF estimation, respectively, and an accuracy of 96.5% for AS detection, while providing informative task-specific attention maps and prototype-based explainability.
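The three-level attention hierarchy described in the abstract (patches within a frame, frames within a video, videos within a study) can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch implementation of that idea; all module names, dimensions, the mean-pooled token aggregation, and the scalar regression head are assumptions for illustration, not the authors' actual GEMTrans implementation.

import torch
import torch.nn as nn

def encoder(dim, heads, depth):
    # A stack of standard batch-first transformer encoder layers.
    layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=depth)

class MultiLevelEchoTransformer(nn.Module):
    # Hypothetical three-level attention stack: patch -> frame -> video.
    def __init__(self, dim=768, heads=8, depth=2):
        super().__init__()
        self.spatial = encoder(dim, heads, depth)      # patches within one frame
        self.temporal = encoder(dim, heads, depth)     # frames within one video
        self.cross_video = encoder(dim, heads, depth)  # videos within one study
        self.head = nn.Linear(dim, 1)                  # e.g., scalar EF regression

    def forward(self, x):
        # x: (batch, videos, frames, patches, dim) of pre-embedded patch tokens
        b, v, f, p, d = x.shape
        x = self.spatial(x.reshape(b * v * f, p, d)).mean(dim=1)   # frame embeddings
        x = self.temporal(x.reshape(b * v, f, d)).mean(dim=1)      # video embeddings
        x = self.cross_video(x.reshape(b, v, d)).mean(dim=1)       # study embedding
        return self.head(x)

# Usage: one study with two views, 16 frames each, 196 patch tokens per frame.
model = MultiLevelEchoTransformer()
ef = model(torch.randn(1, 2, 16, 196, 768))  # -> shape (1, 1)

Because each level is a standard self-attention encoder, its attention weights can be read out per patch, per frame, and per video, which is what makes task-specific attention maps at every level possible.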
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Mokhtari, M., Ahmadi, N., Tsang, T.S.M., Abolmaesumi, P., Liao, R. (2024). GEMTrans: A General, Echocardiography-Based, Multi-level Transformer Framework for Cardiovascular Diagnosis. In: Cao, X., Xu, X., Rekik, I., Cui, Z., Ouyang, X. (eds.) Machine Learning in Medical Imaging. MLMI 2023. Lecture Notes in Computer Science, vol. 14349. Springer, Cham. https://doi.org/10.1007/978-3-031-45676-3_1
Print ISBN: 978-3-031-45675-6
Online ISBN: 978-3-031-45676-3