
Depth-Aware Salient Object Detection and Segmentation via Multiscale Discriminative Saliency Fusion and Bootstrap Learning

Published: 01 September 2017

Abstract

This paper proposes a depth-aware salient object detection and segmentation framework based on multiscale discriminative saliency fusion (MDSF) and bootstrap learning for RGBD images (RGB color images with corresponding depth maps) and stereoscopic images. By exploiting low-level feature contrasts, mid-level feature weighted factors, and high-level location priors, various saliency measures over four classes of features are computed on a multiscale region segmentation. At each scale, a random forest regressor performs discriminative saliency fusion (DSF) to generate a DSF saliency map, and the DSF saliency maps across scales are combined into the MDSF saliency map. Furthermore, an effective bootstrap learning-based salient object segmentation method is proposed: it draws its training samples from the MDSF saliency map and learns multiple-kernel support vector machines. Experimental results on two large datasets show how the different feature categories contribute to saliency detection performance and demonstrate that the proposed framework achieves better performance on both saliency detection and salient object segmentation.
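The pipeline the abstract describes (per-scale saliency measures → random-forest fusion → cross-scale combination → bootstrap-trained multiple-kernel SVM) can be sketched as follows. This is a minimal illustration on synthetic region features, assuming scikit-learn's regressor and classifier stand in for the paper's learners; the feature dimensions, quantile thresholds, and equal linear/RBF kernel weights are illustrative choices, not the authors' exact design.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVC
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel

rng = np.random.default_rng(0)

# Stage 1: discriminative saliency fusion (DSF) at each scale.
# Each region carries several hand-crafted saliency measures (colour contrast,
# depth contrast, location prior, ...); a random-forest regressor maps them to
# a single fused saliency score per region.
n_scales, n_regions, n_features = 3, 200, 8
dsf_maps = []
for _ in range(n_scales):
    X = rng.random((n_regions, n_features))                      # per-region measures
    y = X.mean(axis=1) + 0.05 * rng.standard_normal(n_regions)   # proxy ground truth
    forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
    dsf_maps.append(forest.predict(X))

# Stage 2: combine the per-scale DSF maps into the MDSF saliency map.
mdsf = np.mean(dsf_maps, axis=0)

# Stage 3: bootstrap a multiple-kernel SVM from confident MDSF regions.
# High-saliency regions become positive seeds, low-saliency ones negative; a
# weighted sum of linear and RBF Gram matrices stands in for the paper's
# multiple-kernel machinery. X here reuses the last scale's features.
hi, lo = np.quantile(mdsf, 0.8), np.quantile(mdsf, 0.2)
seeds = np.where((mdsf >= hi) | (mdsf <= lo))[0]
X_seed, y_seed = X[seeds], (mdsf[seeds] >= hi).astype(int)

def mixed_kernel(A, B):
    # Equal-weight combination of two base kernels (an MKL-style stand-in).
    return 0.5 * linear_kernel(A, B) + 0.5 * rbf_kernel(A, B)

svm = SVC(kernel="precomputed").fit(mixed_kernel(X_seed, X_seed), y_seed)

# Classify every region to obtain the final foreground/background mask.
mask = svm.predict(mixed_kernel(X, X_seed))
```

In the actual framework the regions come from a hierarchical image segmentation and the regressor is trained on ground-truth saliency; the sketch only shows how the three stages hand data to one another.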




Published In

IEEE Transactions on Image Processing, Volume 26, Issue 9, September 2017, 489 pages

        Publisher

        IEEE Press


        Qualifiers

        • Research-article


        Cited By

• (2024) "Transformer Fusion and Pixel-Level Contrastive Learning for RGB-D Salient Object Detection," IEEE Transactions on Multimedia, vol. 26, pp. 1011–1026, Jan. 2024. DOI: 10.1109/TMM.2023.3275308
• (2024) "CalibNet: Dual-Branch Cross-Modal Calibration for RGB-D Salient Instance Segmentation," IEEE Transactions on Image Processing, vol. 33, pp. 4348–4362, Jan. 2024. DOI: 10.1109/TIP.2024.3432328
• (2024) "AirSOD: A Lightweight Network for RGB-D Salient Object Detection," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 3, pp. 1656–1669, Mar. 2024. DOI: 10.1109/TCSVT.2023.3295588
• (2024) "Bidirectional feature learning network for RGB-D salient object detection," Pattern Recognition, vol. 150, Jun. 2024. DOI: 10.1016/j.patcog.2024.110304
• (2024) "Transformer-based cross-modality interaction guidance network for RGB-T salient object detection," Neurocomputing, vol. 600, Oct. 2024. DOI: 10.1016/j.neucom.2024.128149
• (2024) "Depth awakens," Image and Vision Computing, vol. 143, Mar. 2024. DOI: 10.1016/j.imavis.2024.104924
• (2024) "CMDCF: An effective cross-modal dense cooperative fusion network for RGB-D SOD," Neural Computing and Applications, vol. 36, no. 23, pp. 14361–14378, Aug. 2024. DOI: 10.1007/s00521-024-09692-0
• (2024) "Multi-modality information refinement fusion network for RGB-D salient object detection," The Visual Computer, vol. 40, no. 6, pp. 4183–4199, Jun. 2024. DOI: 10.1007/s00371-023-03076-6
• (2023) "Symmetry-aware transformer-based mirror detection," Proc. AAAI Conference on Artificial Intelligence, pp. 935–943, Feb. 2023. DOI: 10.1609/aaai.v37i1.25173
• (2023) "Dynamic Message Propagation Network for RGB-D and Video Salient Object Detection," ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 20, no. 1, pp. 1–21, Sep. 2023. DOI: 10.1145/3597612
