Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration

Authors:

Mihai Marian Puscas,

Nicu Sebe

MM '16: Proceedings of the 24th ACM international conference on Multimedia

Pages 831 - 840

https://doi.org/10.1145/2964284.2964295

Published: 01 October 2016

Abstract

Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, which achieve top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph cutting strategies. However, these two components are often handled in two separate steps, so the obtained similarity graph may not be optimal for segmentation, which may lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and the video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors to each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, a new rank constraint is imposed on the Laplacian matrix of the similarity graph, such that the number of connected components in the resulting similarity graph is exactly equal to the number of segments. Furthermore, JGLVS can automatically weigh multiple cues and calibrate the pairwise distance of superpixels based on their topology structures. Most notably, empirical results on the challenging VSB100 benchmark dataset show that JGLVS achieves promising performance, outperforming the state of the art by up to 11% for the BPR metric.
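
The rank constraint in the abstract rests on a standard fact from spectral graph theory: for a graph with nonnegative edge weights, the multiplicity of the eigenvalue 0 of its Laplacian equals the number of connected components, so a graph on n vertices has exactly c components when the Laplacian has rank n - c. The following is a minimal NumPy sketch of that property on a toy graph. The helper names `laplacian` and `count_components`, the choice of the unnormalized Laplacian, and the toy similarity matrix are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def laplacian(S):
    """Unnormalized Laplacian L = D - W of the symmetrized similarity matrix."""
    W = (S + S.T) / 2.0
    return np.diag(W.sum(axis=1)) - W


def count_components(S, tol=1e-8):
    """Count connected components as the number of near-zero Laplacian eigenvalues."""
    eigvals = np.linalg.eigvalsh(laplacian(S))
    return int(np.sum(eigvals < tol))


# Toy similarity graph (assumed for illustration): 5 vertices in two
# disconnected blocks, {0, 1, 2} and {3, 4}.
S = np.zeros((5, 5))
S[0, 1] = S[1, 2] = S[0, 2] = 1.0
S[3, 4] = 1.0
S = S + S.T

n = S.shape[0]
c = count_components(S)
rank = np.linalg.matrix_rank(laplacian(S))
print(f"components = {c}, rank(L) = {rank}, n - c = {n - c}")
```

In this sketch, enforcing rank(L) = n - c on a learned similarity matrix would mean the graph already splits into c groups, which is why no separate graph cutting step would be needed once such a graph is found.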

Index Terms

Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
        Video segmentation

Information & Contributors

Information

Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia

October 2016

1542 pages

ISBN:9781450336031

DOI:10.1145/2964284

General Chairs: Alan Hanjalic (Delft University of Technology), Cees Snoek (Qualcomm Research Netherlands / University of Amsterdam), Marcel Worring (University of Amsterdam)

Moderator: Dick Bulterman (CWI / VU University Amsterdam)

Program Chairs: Benoit Huet (EURECOM), Aisling Kelliher (Virginia Tech), Yiannis Kompatsiaris (CERTH-ITI), Jin Li (Microsoft)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

FP7 EC project
National Natural Science Foundation of China
the Fundamental Research Funds for the Central Universities

Conference

MM '16

Sponsor:

SIGMM

MM '16: ACM Multimedia Conference

October 15 - 19, 2016

Amsterdam, The Netherlands

Acceptance Rates

MM '16 Paper Acceptance Rate 52 of 237 submissions, 22%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 17
Total Downloads: 478
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 2

Reflects downloads up to 16 Dec 2024

Citations

Cited By

Shi M, Guo J, An J, Zhang X, Zhang W, Xu P (2021). Unsupervised 2D dimensionality reduction by jointly learning structural and temporal correlation. Applied Intelligence. Online publication date: 18-Aug-2021.
https://doi.org/10.1007/s10489-021-02439-7

Zhuo X, Fraundorfer F, Kurz F, Reinartz P (2019). Automatic Annotation of Airborne Images by Label Propagation Based on a Bayesian-CRF Model. Remote Sensing 11:2 (145). Online publication date: 13-Jan-2019.
https://doi.org/10.3390/rs11020145

Sun J, Xie J, Hu J, Lin Z, Lai J, Zeng W, Zheng W, Amsaleg L, Huet B, Larson M, Gravier G, Hung H, Ngo C, Tsang Ooi W (2019). Predicting Future Instance Segmentation with Contextual Pyramid ConvLSTMs. Proceedings of the 27th ACM International Conference on Multimedia (2043-2051). Online publication date: 15-Oct-2019.
https://dl.acm.org/doi/10.1145/3343031.3350949

Zhou Q, Yang W, Gao G, Ou W, Lu H, Chen J, Latecki L (2019). Multi-scale deep context convolutional neural networks for semantic segmentation. World Wide Web 22:2 (555-570). Online publication date: 1-Mar-2019.
https://dl.acm.org/doi/10.1007/s11280-018-0556-3

Yu Z, Shao J, Yang Q, Sun Z (2019). ProfitLeader. World Wide Web 22:2 (533-553). Online publication date: 1-Mar-2019.
https://dl.acm.org/doi/10.1007/s11280-018-0537-6

Guo Y, Zhang J, Gao L (2019). Exploiting long-term temporal dynamics for video captioning. World Wide Web 22:2 (735-749). Online publication date: 1-Mar-2019.
https://dl.acm.org/doi/10.1007/s11280-018-0530-0

Gao L, Song J, Zhang D, Shen H (2018). Coarse-to-fine image co-segmentation with intra and inter rank constraints. Proceedings of the 27th International Joint Conference on Artificial Intelligence (719-725). Online publication date: 13-Jul-2018.
https://dl.acm.org/doi/10.5555/3304415.3304518

Song J, Zhou Z, Gao L, Xu X, Shen H, Boll S, Mu Lee K, Luo J, Zhu W, Byun H, Wen Chen C, Lienhart R, Mei T (2018). Cumulative Nets for Edge Detection. Proceedings of the 26th ACM international conference on Multimedia (1847-1855). Online publication date: 15-Oct-2018.
https://dl.acm.org/doi/10.1145/3240508.3240688

Shi H, Li H, Wu Q, Meng F, Ngan K, Boll S, Mu Lee K, Luo J, Zhu W, Byun H, Wen Chen C, Lienhart R, Mei T (2018). Boosting Scene Parsing Performance via Reliable Scale Prediction. Proceedings of the 26th ACM international conference on Multimedia (492-500). Online publication date: 15-Oct-2018.
https://dl.acm.org/doi/10.1145/3240508.3240657

Qiu Z, Yao T, Mei T (2018). Learning Deep Spatio-Temporal Dependence for Semantic Video Segmentation. IEEE Transactions on Multimedia 20:4 (939-949). Online publication date: 1-Apr-2018.
https://dl.acm.org/doi/10.1109/TMM.2017.2759504
