Abstract
Manually segmenting and labeling objects in video sequences is quite tedious, yet such annotations are valuable for learning-based approaches to object and activity recognition. While automatic label propagation can help, existing methods simply propagate annotations from arbitrarily selected frames (e.g., the first one) and so may fail to best leverage the human effort invested. We define an active frame selection problem: select k frames for manual labeling, such that automatic pixel-level label propagation can proceed with minimal expected error. We propose a solution that directly ties a joint frame selection criterion to the predicted errors of a flow-based random field propagation model. It selects the set of k frames that together minimize the total mislabeling risk over the entire sequence. We derive an efficient dynamic programming solution to optimize the criterion. Further, we show how to automatically determine how many total frames k should be labeled in order to minimize the total manual effort spent labeling and correcting propagation errors. We demonstrate our method’s clear advantages over several baselines, saving hours of human effort per video.
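The joint selection described above can be illustrated with a small dynamic program. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a precomputed matrix `err[i][j]` holding the predicted error at frame j when labels are propagated from labeled frame i (the abstract's flow-based random field model that would produce such predictions is not reproduced here), and it assumes each unlabeled frame takes its labels from whichever of its two nearest labeled neighbors has the lower predicted error. The names `select_frames` and `choose_k`, and the linear effort model with `label_cost` and `fix_cost`, are hypothetical.

```python
def select_frames(err, k):
    """Pick k frames to label so that total predicted propagation error is minimal.

    err[i][j] is an assumed input: predicted error at frame j when labels are
    propagated from labeled frame i (err[i][i] is taken to be 0).
    Returns (total_error, sorted list of chosen frame indices).
    """
    n = len(err)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")
    inf = float("inf")

    def between(p, i):
        # Frames strictly between two labeled frames use the better neighbor.
        return sum(min(err[p][j], err[i][j]) for j in range(p + 1, i))

    # best[m][i]: minimal error over frames 0..i when frame i is the m-th labeled frame.
    best = [[inf] * n for _ in range(k + 1)]
    back = [[-1] * n for _ in range(k + 1)]
    for i in range(n):
        # Frames before the first labeled frame can only be reached from it.
        best[1][i] = sum(err[i][j] for j in range(i))
    for m in range(2, k + 1):
        for i in range(m - 1, n):
            for p in range(m - 2, i):
                cost = best[m - 1][p] + between(p, i)
                if cost < best[m][i]:
                    best[m][i], back[m][i] = cost, p

    # Frames after the last labeled frame can only be reached from it.
    total, last = min(
        (best[k][i] + sum(err[i][j] for j in range(i + 1, n)), i)
        for i in range(k - 1, n)
    )
    chosen = [last]
    for m in range(k, 1, -1):
        last = back[m][last]
        chosen.append(last)
    return total, chosen[::-1]


def choose_k(err, label_cost, fix_cost):
    """Pick the k that minimizes an assumed linear effort model:
    label_cost per labeled frame plus fix_cost per unit of predicted error.
    Returns (effort, k, chosen frames)."""
    best = None
    for k in range(1, len(err) + 1):
        total_err, frames = select_frames(err, k)
        effort = label_cost * k + fix_cost * total_err
        if best is None or effort < best[0]:
            best = (effort, k, frames)
    return best
```

With a toy matrix such as `err = [[abs(i - j) for j in range(5)] for i in range(5)]`, `select_frames(err, 2)` returns one optimal pair of frames, and `choose_k(err, 2.0, 1.0)` trades labeling cost against correction cost. The segment sums are recomputed naively here for clarity; caching them would reduce the running time for long sequences.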
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vijayanarasimhan, S., Grauman, K. (2012). Active Frame Selection for Label Propagation in Videos. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7576. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33715-4_36
DOI: https://doi.org/10.1007/978-3-642-33715-4_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33714-7
Online ISBN: 978-3-642-33715-4
eBook Packages: Computer Science (R0)