Abstract
To improve network performance in salient object segmentation, many researchers have modified loss functions and assigned weights to per-pixel losses. However, these loss functions pay little attention to intermediate pixels, whose predicted probabilities lie in the region between correct and incorrect classification. To solve this problem, a focusing intermediate pixels loss is proposed. Firstly, the foreground and background are each divided into correctly and incorrectly classified sets to identify intermediate pixels whose category is difficult to determine. Secondly, the intermediate pixels receive more attention according to their predicted probabilities. Finally, misclassified pixels are strengthened dynamically as training epochs progress. The proposed method can 1) make the model focus on intermediate pixels, which carry more uncertainty; and 2) solve the vanishing gradient problem of Focal Loss for well-classified pixels. Experimental results on six public datasets and two different types of network structures show that the proposed method outperforms other state-of-the-art weighted loss functions, and the average Fβ is increased by about 2.7% compared with typical cross entropy.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
Chen LC, Papandreou G, Schroff F et al (2017) Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587
Chen Z, Zhou H, Lai J et al (2020) Contour-aware loss: boundary-aware learning for salient object segmentation. IEEE Trans Image Process 30:431–443
Cheng MM, Mitra NJ, Huang X et al (2014) SalientShape: group saliency in image collections. Vis Comput 30(4):443–453
Deng Z, Hu X, Zhu L et al (2018) R3Net: recurrent residual refinement network for saliency detection. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence. AAAI Press, Menlo Park, CA, USA, pp 684–690
Fan DP, Cheng MM, Liu JJ et al (2018) Salient objects in clutter: bringing salient object detection to the foreground. In: Proceedings of the European Conference on Computer Vision, pp 186–202
Fan DP, Ji GP, Sun G et al (2020) Camouflaged object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2777–2787
Goyal P, Dollár P, Girshick R et al (2017) Accurate, large minibatch SGD: training ImageNet in 1 hour. arXiv:1706.02677
Gu J, Wang Z, Kuen J et al (2018) Recent advances in convolutional neural networks. Pattern Recogn 77:354–377
He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778
He T, Zhang Z, Zhang H et al (2019) Bag of tricks for image classification with convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 558–567
Hossain MS, Betts JM, Paplinski AP (2021) Dual Focal Loss to address class imbalance in semantic segmentation. Neurocomputing 462:69–87
Hou Q, Cheng MM, Hu X et al (2017) Deeply supervised salient object detection with short connections. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3203–3212
Ji Y, Zhang H, Zhang Z et al (2021) CNN-based encoder-decoder networks for salient object detection: a comprehensive review and recent advances. Inf Sci 546:835–857
Kim T, Lee H, Kim D (2021) UACANet: uncertainty augmented context attention for polyp segmentation. In: Proceedings of the 29th ACM International Conference on Multimedia, pp 2167–2175
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Feng L, Shu S, Lin Z et al (2020) Can cross entropy loss be robust to label noise? In: Proceedings of the 29th International Joint Conference on Artificial Intelligence, pp 2206–2212
Li G, Yu Y (2015) Visual saliency based on multiscale deep features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 5455–5463
Li Y, Hou X, Koch C et al (2014) The secrets of salient object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 280–287
Li X, Yu L, Chang D et al (2019) Dual cross-entropy loss for small-sample fine-grained vehicle classification. IEEE Trans Veh Technol 68(5):4204–4212
Lin TY, Goyal P, Girshick R et al (2017) Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2980–2988
Liu JJ, Hou Q, Cheng MM et al (2019) A simple pooling-based design for real-time salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3917–3926
Liu Z, Tang J, Xiang Q et al (2020) Salient object detection for RGB-D images by generative adversarial network. Multimed Tools Appl 79:25403–25425
Liu Z, Lin Y, Cao Y et al (2021) Swin Transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE International Conference on Computer Vision, pp 10012–10022
Mao Y, Zhang J, Wan Z et al (2021) Generative transformer for accurate and reliable salient object detection. arXiv:2104.10127
Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17(3):671–682
Pan C, Yan WQ (2020) Object detection based on saturation of visual perception. Multimed Tools Appl 79:19925–19944
Pang Y, Zhao X, Xiang TZ et al (2022) Zoom in and out: a mixed-scale triplet network for camouflaged object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2160–2170
Rahman MA, Wang Y (2016) Optimizing intersection-over-union in deep neural networks for image segmentation. In: International Symposium on Visual Computing. Springer, Cham, pp 234–244
Singh VK, Kumar N, Singh N (2020) A hybrid approach using color spatial variance and novel object position prior for salient object detection. Multimed Tools Appl 79:30045–30067
Wang Q, Zhang L, Li Y et al (2020) Overview of deep-learning based methods for salient object detection in videos. Pattern Recogn 104:107340
Wei J, Wang S, Huang Q (2020) F3Net: fusion, feedback and focus for salient object detection. Proc AAAI Conf Artif Intell 34(07):12321–12328
Yang C, Zhang L, Lu H et al (2013) Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3166–3173
Zhang P, Liu W, Lu H et al (2018) Salient object detection by lossless feature reflection. arXiv:1802.06527
Zhao S, Wu B, Chu W et al (2019) Correlation maximized structural similarity loss for semantic segmentation. arXiv:1910.08711
Funding
This work was supported by the Natural Science Foundation of China (61801512, 62071484) and the Natural Science Foundation of Jiangsu Province (BK20180080).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest/Competing interests
We declare that we have no conflict of interest related to this work.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
When calculating the gradient of the loss function, we found that the gradient we computed for background pixels (Fig. 5b) is inconsistent with the one reported for Dual Focal Loss (DFL) in Fig. 1b of [11].
Upon investigation, we found that Hossain et al. [11] misinterpreted the cross entropy loss formula. The binary cross entropy loss is calculated as follows.
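$$\ell_i = -\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right]$$

where $y_i \in \{0, 1\}$ is the ground-truth label of pixel $i$ and $p_i \in [0, 1]$ is its predicted probability of being foreground.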
In the foreground, $y_i$ is 1, so the cross entropy loss of foreground pixels is as follows.
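$$\ell_i^{fg} = -\log p_i$$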
In the background, $y_i$ is 0, so the cross entropy loss of background pixels is as follows.
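$$\ell_i^{bg} = -\log(1 - p_i)$$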
However, Hossain et al. [11] calculated the cross entropy loss of foreground and background pixels in the same way.
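To make the discrepancy concrete, the following minimal NumPy sketch (our own illustration, not code from [11]) compares the background gradient of the correct cross entropy with the gradient obtained when the foreground formula is reused for background pixels; note that the two gradients even differ in sign.

```python
import numpy as np

# Predicted foreground probabilities for background pixels (y_i = 0).
p = np.linspace(0.01, 0.99, 5)

# Correct background cross entropy -log(1 - p): gradient dL/dp = 1 / (1 - p).
grad_correct = 1.0 / (1.0 - p)

# If the foreground formula -log(p) is mistakenly reused for background
# pixels, the gradient becomes dL/dp = -1 / p instead.
grad_mistaken = -1.0 / p

for pi, gc, gm in zip(p, grad_correct, grad_mistaken):
    print(f"p={pi:.2f}  correct={gc:+7.2f}  mistaken={gm:+7.2f}")
```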
Exploring the reason for this error, we believe it may have been influenced by the notation of Focal Loss (FL) [20]. On page 3 of the FL paper, for convenience of writing, cross entropy is written as follows.
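$$\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t)$$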
Focal loss is written as follows.
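$$\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma}\log(p_t)$$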
However, $p_t$ is not simply the predicted probability of a pixel: its definition distinguishes between foreground and background. Hossain et al. [11] may have directly regarded $p_t$ as the predicted probability of the pixel. The relevant passage of the original Focal Loss paper is quoted below.
Focal Loss
The Focal Loss is designed to address the one-stage object detection scenario in which there is an extreme imbalance between foreground and background classes during training (e.g., 1:1000). We introduce the focal loss starting from the cross entropy (CE) loss for binary classification:
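$$\mathrm{CE}(p, y) = \begin{cases} -\log(p) & \text{if } y = 1 \\ -\log(1 - p) & \text{otherwise.} \end{cases}$$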
In the above, y ∈ {±1} specifies the ground-truth class and p ∈ [0, 1] is the model's estimated probability for the class with label y = 1. For notational convenience, we define $p_t$:
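$$p_t = \begin{cases} p & \text{if } y = 1 \\ 1 - p & \text{otherwise,} \end{cases}$$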
and rewrite $\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t)$.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, L., Cao, T., Zheng, Y. et al. Focusing intermediate pixels loss for salient object segmentation. Multimed Tools Appl 83, 19747–19766 (2024). https://doi.org/10.1007/s11042-023-15873-1