
Region Attention Transformer for Medical Image Restoration

  • Conference paper
  • First Online:
Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 (MICCAI 2024)

Abstract

Transformer-based methods have demonstrated impressive results in medical image restoration, attributed to the multi-head self-attention (MSA) mechanism in the spatial dimension. However, the majority of existing Transformers conduct attention within fixed and coarsely partitioned regions (e.g. the entire image or fixed patches), resulting in interference from irrelevant regions and fragmentation of continuous image content. To overcome these challenges, we introduce a novel Region Attention Transformer (RAT) that utilizes a region-based multi-head self-attention mechanism (R-MSA). The R-MSA dynamically partitions the input image into non-overlapping semantic regions using the robust Segment Anything Model (SAM) and then performs self-attention within these regions. This region partitioning is more flexible and interpretable, ensuring that only pixels from similar semantic regions complement each other, thereby eliminating interference from irrelevant regions. Moreover, we introduce a focal region loss to guide our model to adaptively focus on recovering high-difficulty regions. Extensive experiments demonstrate the effectiveness of RAT in various medical image restoration tasks, including PET image synthesis, CT image denoising, and pathological image super-resolution. Code is available at https://github.com/RAT.
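The core idea of R-MSA, restricting self-attention to pixels that share a semantic region label, can be sketched as follows. This is an illustrative single-head NumPy version, not the authors' implementation: the `region_attention` function name, the identity Q/K/V projections, and the toy features are assumptions for demonstration only; in the paper the region labels come from SAM and the projections are learned.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def region_attention(x, region_ids):
    """Single-head self-attention restricted to same-region pixels.

    x          : (N, C) array of N pixel features.
    region_ids : (N,) integer region labels, e.g. from a segmentation
                 model such as SAM.
    Each pixel attends only to pixels with the same label, so
    semantically unrelated regions cannot interfere with one another.
    """
    N, C = x.shape
    # Identity weights stand in for learned Q/K/V projections.
    q, k, v = x, x, x
    scores = q @ k.T / np.sqrt(C)                    # (N, N) attention logits
    same_region = region_ids[:, None] == region_ids[None, :]
    scores = np.where(same_region, scores, -np.inf)  # mask cross-region pairs
    return softmax(scores, axis=-1) @ v              # (N, C) output

# Toy example: two regions with constant features.
x = np.array([[1.0], [1.0], [5.0], [5.0]])
ids = np.array([0, 0, 1, 1])
out = region_attention(x, ids)
# Each pixel mixes only within its own region, so the two
# feature values stay separated instead of bleeding together.
```

With a dense (unmasked) attention map, the two regions would average toward each other; the region mask keeps each output pixel a convex combination of features from its own semantic region only.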



Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grants 62371016, U23B2063, 62022010, and 62176267; the Beijing Natural Science Foundation Haidian District Joint Fund under Grant L222032; the Beijing Hope Run Special Fund of the Cancer Foundation of China under Grant LC2018L02; the Fundamental Research Funds for the Central Universities of China from the State Key Laboratory of Software Development Environment at Beihang University; the 111 Project in China under Grant B13003; SinoUnion Healthcare Inc. under the eHealth program; and the high-performance computing (HPC) resources at Beihang University.

Author information


Corresponding author

Correspondence to Yan Xu.


Ethics declarations

Disclosure of Interests

We have no conflicts of interest to disclose.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yang, Z. et al. (2024). Region Attention Transformer for Medical Image Restoration. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15007. Springer, Cham. https://doi.org/10.1007/978-3-031-72104-5_58


  • DOI: https://doi.org/10.1007/978-3-031-72104-5_58

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72103-8

  • Online ISBN: 978-3-031-72104-5

  • eBook Packages: Computer Science, Computer Science (R0)
