Abstract
Transformer-based methods have demonstrated impressive results in medical image restoration, attributed to the multi-head self-attention (MSA) mechanism in the spatial dimension. However, the majority of existing Transformers conduct attention within fixed and coarsely partitioned regions (e.g. the entire image or fixed patches), resulting in interference from irrelevant regions and fragmentation of continuous image content. To overcome these challenges, we introduce a novel Region Attention Transformer (RAT) that utilizes a region-based multi-head self-attention mechanism (R-MSA). The R-MSA dynamically partitions the input image into non-overlapping semantic regions using the robust Segment Anything Model (SAM) and then performs self-attention within these regions. This region partitioning is more flexible and interpretable, ensuring that only pixels from similar semantic regions complement each other, thereby eliminating interference from irrelevant regions. Moreover, we introduce a focal region loss to guide our model to adaptively focus on recovering high-difficulty regions. Extensive experiments demonstrate the effectiveness of RAT in various medical image restoration tasks, including PET image synthesis, CT image denoising, and pathological image super-resolution. Code is available at https://github.com/RAT.
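The core idea of R-MSA, attending only among pixels that share a semantic region, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses a single head, identity weights in place of learned query/key/value projections, and assumes the per-pixel region labels have already been produced by a segmenter such as SAM.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def region_attention(x, labels):
    """Single-head self-attention restricted to semantic regions.

    x      : (N, C) flattened pixel features
    labels : (N,)   integer region id per pixel (e.g. from SAM)
    A pixel attends only to pixels carrying the same region id,
    so cross-region interference is masked out entirely.
    """
    n, c = x.shape
    q, k, v = x, x, x                            # toy identity projections
    scores = q @ k.T / np.sqrt(c)                # (N, N) attention logits
    same_region = labels[:, None] == labels[None, :]
    scores = np.where(same_region, scores, -np.inf)  # block cross-region links
    attn = softmax(scores, axis=-1)              # rows sum to 1 within a region
    return attn @ v

# Toy usage: two pixels in region 0, one outlier pixel in region 1.
x = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 5.0]])
labels = np.array([0, 0, 1])
out = region_attention(x, labels)
# The region-1 pixel attends only to itself, so its feature is unchanged.
```

Because the mask zeroes all cross-region attention weights, the outlier pixel in region 1 cannot contaminate the two region-0 pixels, which is the interference-suppression property the abstract describes.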
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grants 62371016, U23B2063, 62022010, and 62176267; the Beijing Natural Science Foundation Haidian District Joint Fund under Grant L222032; the Beijing Hope Run Special Fund of the Cancer Foundation of China under Grant LC2018L02; the Fundamental Research Funds for the Central Universities of China from the State Key Laboratory of Software Development Environment at Beihang University; the 111 Project of China under Grant B13003; SinoUnion Healthcare Inc. under the eHealth program; and the high-performance computing (HPC) resources at Beihang University.
Ethics declarations
Disclosure of Interests
We have no conflicts of interest to disclose.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, Z. et al. (2024). Region Attention Transformer for Medical Image Restoration. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15007. Springer, Cham. https://doi.org/10.1007/978-3-031-72104-5_58
DOI: https://doi.org/10.1007/978-3-031-72104-5_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72103-8
Online ISBN: 978-3-031-72104-5
eBook Packages: Computer Science, Computer Science (R0)