Abstract
Deformable image registration minimizes the discrepancy between moving and fixed images by establishing linear and nonlinear spatial correspondences. It plays a crucial role in surgical navigation, image fusion and disease analysis. Its main challenges are the large number of deformation parameters and the uncertainty of acquisition conditions. Convolutional neural networks, with their strong ability to capture hierarchical features and spatial relationships, have driven great progress in medical image registration. More recently, the long-range relationship modeling and adaptive selection offered by self-attention have shown great potential and attracted considerable attention from researchers. Inspired by this, we propose a new method called Multi-scale Large Kernel Attention UNet (MLKA-Net), which combines large kernel convolution with the attention mechanism through a multi-scale strategy and uses a correction module to fine-tune the deformation field for high-accuracy registration. Specifically, we first propose a multi-scale large kernel attention mechanism (MLKA), which generates attention maps by aggregating information from convolution kernels at different scales to improve the local feature modeling capability of attention. Furthermore, we employ large kernel dilated convolution in the proposed attention to construct sufficiently long-range relationships while keeping the number of parameters low. Finally, to further improve the local accuracy of the registration, we design an additional correction module and an unsupervised framework that fine-tune the deformation field, addressing the loss of original information in multilayer networks. We compare our method qualitatively and quantitatively with 24 representative and advanced methods on three publicly available 3D datasets: the IXI database, the LPBA40 dataset and the OASIS database. The experiments demonstrate the excellent performance of the proposed method.
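The abstract states that large kernel dilated convolution gives long-range context with few parameters. The arithmetic behind that trade-off can be illustrated with a minimal sketch. The kernel sizes (5 and 7), the dilation (3), the channel count (16) and all function names below are assumptions chosen for illustration only; they are not taken from the paper and do not describe the actual MLKA-Net configuration. Bias terms are ignored.

```python
def receptive_field(layers):
    """Per-axis receptive field of stacked stride-1 convolutions.

    layers: iterable of (kernel_size, dilation) pairs.
    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


def dense_conv3d_params(channels, kernel_size):
    """Weights of one dense 3D convolution with `channels` inputs and outputs."""
    return channels * channels * kernel_size ** 3


def decomposed_params(channels, layers):
    """Weights of depthwise 3D convolutions followed by a 1x1x1 pointwise conv.

    A depthwise kernel has one k*k*k filter per channel; dilation spreads
    the taps apart but adds no weights.
    """
    depthwise = sum(channels * kernel_size ** 3 for kernel_size, _ in layers)
    pointwise = channels * channels
    return depthwise + pointwise


# Hypothetical configuration (assumed, not from the paper):
# a 5x5x5 depthwise conv, then a 7x7x7 depthwise conv with dilation 3.
CHANNELS = 16
LAYERS = [(5, 1), (7, 3)]

if __name__ == "__main__":
    rf = receptive_field(LAYERS)
    print("receptive field per axis:", rf)
    print("dense kernel of that size:", dense_conv3d_params(CHANNELS, rf))
    print("decomposed large kernel:", decomposed_params(CHANNELS, LAYERS))
```

Under these assumed settings the decomposed stack covers 23 voxels per axis with 7,744 weights, while a single dense 3D kernel of the same extent would need 3,114,752. This is the kind of saving the abstract refers to, although the exact figures depend on the configuration the authors actually use.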
Data availability
All datasets used in this study are publicly available. The IXI dataset is available from Brain Development (http://biomedic.doc.ic.ac.uk/brain-development/downloads/IXI/IXI-T1, accessed on 27 January 2024). The 3D and 2D OASIS MRI datasets are available from the OASIS database (https://www.oasis-brains.org, accessed on 27 January 2024). The LPBA40 dataset is available at https://www.loni.usc.edu/research/atlas_downloads (accessed on 27 January 2024).
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China under Grant No. 2020YFB1313900, the Shenzhen Science and Technology Program under Grants No. JCYJ20200109115201707 and No. JCYJ20220818101408019, and the Graduate Innovative Fund of Wuhan Institute of Technology under Grant No. CX2023289.
Ethics declarations
Conflict of interest
The authors state that they have no financial conflicts of interest or personal relationships that could have influenced the research presented in this paper.
About this article
Cite this article
Chen, Y., Hu, X., Lu, T. et al. A multi-scale large kernel attention with U-Net for medical image registration. J Supercomput 81, 70 (2025). https://doi.org/10.1007/s11227-024-06489-9