SA-DETR: Saliency Attention-based DETR for salient object detection

Published: 06 December 2024

Abstract

Research on the Salient Object Detection (SOD) task has advanced considerably with deep learning methods. However, most methods focus on predicting a fine mask rather than on finding the most salient objects, and most SOD datasets likewise evaluate pixel-wise accuracy rather than “saliency”. In this study, we use the Salient Objects in Clutter (SOC) dataset to focus more directly on the saliency of objects. We propose an architecture that extends the cross-attention mechanism of the Transformer to the DETR architecture in order to learn the relationship between global image semantics and objects. We further add a Saliency Attention (SA) module to the network, yielding SA-DETR, which detects salient objects based on object-level saliency. With its cross- and saliency-attention, the proposed method shows superior results in detecting salient objects among multiple objects compared to other methods. We demonstrate its effectiveness by showing that it outperforms the existing state-of-the-art SOD method by 4.7% in MAE and 0.2% in mean E-measure.
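For readers unfamiliar with the MAE metric named in the abstract, the following is a minimal sketch of how mean absolute error is conventionally computed in SOD evaluation: the average per-pixel absolute difference between a predicted saliency map and a binary ground-truth mask, both scaled to [0, 1]. The function name and the toy 2×2 maps are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the authors' evaluation code.

```python
from typing import Sequence


def mean_absolute_error(
    pred: Sequence[Sequence[float]],
    gt: Sequence[Sequence[float]],
) -> float:
    """Average per-pixel |pred - gt| for two equally sized 2-D maps in [0, 1].

    Hypothetical helper for illustration; lower values mean a closer match.
    """
    same_shape = len(pred) == len(gt) and all(
        len(p_row) == len(g_row) for p_row, g_row in zip(pred, gt)
    )
    if not same_shape:
        raise ValueError("prediction and ground truth must have the same shape")

    total = 0.0
    count = 0
    for p_row, g_row in zip(pred, gt):
        for p, g in zip(p_row, g_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


# Toy example (made-up values): a 2x2 predicted saliency map and its mask.
toy_pred = [[0.9, 0.1], [0.2, 0.8]]
toy_gt = [[1.0, 0.0], [0.0, 1.0]]
toy_mae = mean_absolute_error(toy_pred, toy_gt)
```

Because MAE is an error measure, an improvement in MAE corresponds to a lower value, whereas an improvement in mean E-measure corresponds to a higher one.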



Published In

Pattern Analysis & Applications  Volume 28, Issue 1
Mar 2025
524 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 06 December 2024
Accepted: 13 November 2024
Received: 11 January 2024

Author Tags

  1. DETR-based SOD
  2. Object-level global cross attention
  3. Object detection in clutter
  4. Saliency Attention mechanism
  5. Salient object detection

Author Tag

  1. Psychology and Cognitive Sciences
  2. Psychology

Qualifiers

  • Research-article
