DOI: 10.1145/3123266.3123363

Occlusion-aware Video Temporal Consistency

Published: 19 October 2017

Abstract

Image color editing techniques such as color transfer, HDR tone mapping, dehazing, and white balance have been widely used and studied in recent decades. However, naively applying them to videos frame by frame often leads to flickering or color inconsistency. To address this problem in a general way, earlier methods rely on temporal filtering or warping from the previous frame, but they still fail under occlusion and produce blurry results. We introduce a new framework that addresses these challenges: (1) We develop an online keyframe strategy to keep track of dynamic objects, which provides more temporal information than a single previous frame. (2) To preserve image details, we employ a local color affine model. The main idea of this post-processing step is to capture the color transformation applied by the editing algorithm while maintaining the detail structures of the raw image. In practice, our approach takes a raw video and its per-frame processed version and generates a temporally consistent output. In addition, we propose a video quality metric to evaluate temporal coherence. Extensive experiments and a subjective test demonstrate the superiority of the proposed framework with respect to color fidelity, detail preservation, and temporal consistency.
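
As a rough illustration of the local color affine idea in the abstract, the sketch below fits a per-channel gain and offset that maps a patch of the raw frame onto the same patch of the processed frame, then re-applies that mapping to the raw pixels. This is not the authors' implementation: the abstract does not specify the model's form, so the per-channel (diagonal) simplification, the patch representation as lists of RGB tuples, and the names `fit_channel_affine`, `fit_patch_model`, and `apply_patch_model` are all assumptions made for this example.

```python
from statistics import fmean


def fit_channel_affine(raw, processed, eps=1e-12):
    """Least-squares gain and offset with processed ~= gain * raw + offset."""
    mean_r = fmean(raw)
    mean_p = fmean(processed)
    var = sum((r - mean_r) ** 2 for r in raw)
    if var < eps:
        # Flat patch: the gain cannot be identified, so assume a pure offset.
        return 1.0, mean_p - mean_r
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(raw, processed))
    gain = cov / var
    return gain, mean_p - gain * mean_r


def fit_patch_model(raw_patch, processed_patch):
    """Fit one (gain, offset) pair per RGB channel for a patch.

    Both patches are lists of (r, g, b) tuples covering the same pixels.
    """
    return [
        fit_channel_affine([px[c] for px in raw_patch],
                           [px[c] for px in processed_patch])
        for c in range(3)
    ]


def apply_patch_model(model, raw_patch):
    """Recolor raw pixels with the fitted model, keeping raw detail structure."""
    return [
        tuple(gain * px[c] + offset for c, (gain, offset) in enumerate(model))
        for px in raw_patch
    ]


# Hypothetical 3-pixel patch; the "editing algorithm" here is a toy
# brightness/contrast change applied identically to every channel.
raw_patch = [(0.10, 0.20, 0.30), (0.40, 0.50, 0.60), (0.70, 0.80, 0.90)]
processed_patch = [tuple(1.5 * v + 0.05 for v in px) for px in raw_patch]

model = fit_patch_model(raw_patch, processed_patch)
restored = apply_patch_model(model, raw_patch)
```

Because the output is computed from the raw pixels rather than copied from the processed frame, edges and texture come from the raw image while the colors follow the edit, which is the trade-off the abstract describes. A full affine model would also mix channels; the diagonal form is used here only to keep the sketch short.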

Published In

MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. color transfer
  2. occlusion
  3. temporal consistency
  4. video processing

Qualifiers

  • Research-article

Conference

MM '17: ACM Multimedia Conference
October 23-27, 2017
Mountain View, California, USA

Acceptance Rates

MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 63
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 14 Nov 2024.

Cited By

  • (2024) Temporal Optimization for Face Swapping Video based on Consistency Inheritance. Proceedings of the ACM Turing Award Celebration Conference - China 2024 (165-170). DOI: 10.1145/3674399.3674457. Online publication date: 5-Jul-2024
  • (2024) Unsupervised Model-based Learning for Simultaneous Video Deflickering and Deblotching. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4105-4113). DOI: 10.1109/WACV57701.2024.00407. Online publication date: 3-Jan-2024
  • (2024) Deep Image Matting With Sparse User Interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:2 (881-895). DOI: 10.1109/TPAMI.2023.3326693. Online publication date: Feb-2024
  • (2024) MCD-Net: Toward RGB-D Video Inpainting in Real-World Scenes. IEEE Transactions on Image Processing 33 (1095-1108). DOI: 10.1109/TIP.2024.3358675. Online publication date: 2024
  • (2024) Hierarchical Color Fusion Network (HCFN): Enhancing exemplar-based video colorization. Neurocomputing 598 (128121). DOI: 10.1016/j.neucom.2024.128121. Online publication date: Sep-2024
  • (2024) Temporally consistent video colorization with deep feature propagation and self-regularization learning. Computational Visual Media 10:2 (375-395). DOI: 10.1007/s41095-023-0342-8. Online publication date: 3-Jan-2024
  • (2024) Single-Video Temporal Consistency Enhancement with Rolling Guidance. Computational Visual Media (109-130). DOI: 10.1007/978-981-97-2092-7_6. Online publication date: 10-Apr-2024
  • (2023) Interactive Control over Temporal Consistency while Stylizing Video Streams. Computer Graphics Forum 42:4. DOI: 10.1111/cgf.14891. Online publication date: 26-Jul-2023
  • (2023) Volumetric Avatar Reconstruction with Spatio-Temporally Offset RGBD Cameras. 2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR) (72-82). DOI: 10.1109/VR55154.2023.00023. Online publication date: Mar-2023
  • (2023) Hybrid High Dynamic Range Imaging fusing Neuromorphic and Conventional Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 45:7 (8553-8565). DOI: 10.1109/TPAMI.2022.3231334. Online publication date: 1-Jul-2023
