Occlusion-aware Video Temporal Consistency

Authors:

Chia-Yang Chang,

Shao-Yi Chien

MM '17: Proceedings of the 25th ACM international conference on Multimedia

Pages 777 - 785

https://doi.org/10.1145/3123266.3123363

Published: 19 October 2017

Abstract

Image color editing techniques such as color transfer, HDR tone mapping, dehazing, and white balance have been widely used and investigated in recent decades. However, naively applying them to videos frame by frame often leads to flickering or color inconsistency. To address this problem in a general way, earlier methods rely on temporal filtering or warping from the previous frame, but they still fail under occlusion and produce blurry results. We introduce a new framework that addresses these challenges: (1) We develop an online keyframe strategy to keep track of dynamic objects, which provides more temporal information than a single previous frame. (2) To preserve image details, a local color affine model is employed. The main concept of this post-processing step is to capture the color transformation applied by the editing algorithm while maintaining the detail structures of the raw image. In practice, our approach takes a raw video and its per-frame processed version and generates a temporally consistent output. In addition, we propose a video quality metric to evaluate temporal coherence. Extensive experiments and a subjective test demonstrate the superiority of the proposed framework with respect to color fidelity, detail preservation, and temporal consistency.
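The abstract does not give the formulation of the local color affine model, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that, within a small patch, the edited colors can be approximated as an affine function of the raw colors, and fits that map by regularized least squares. The function names `fit_local_affine` and `apply_local_affine`, the patch size, and the regularization weight `reg` are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_local_affine(raw_patch, edited_patch, reg=1e-6):
    """Fit a 4x3 affine color map taking raw RGB to edited RGB within one patch.

    Assumption for illustration: one affine map per patch, solved with
    ridge-regularized least squares. The paper's actual model may differ.
    """
    X = raw_patch.reshape(-1, 3).astype(np.float64)
    Y = edited_patch.reshape(-1, 3).astype(np.float64)
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous colors, shape (N, 4)
    # The small ridge term keeps the solve stable on nearly flat patches.
    return np.linalg.solve(Xh.T @ Xh + reg * np.eye(4), Xh.T @ Y)

def apply_local_affine(raw_patch, A):
    """Apply a fitted 4x3 affine color map to every pixel of a raw patch."""
    X = raw_patch.reshape(-1, 3).astype(np.float64)
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xh @ A).reshape(raw_patch.shape)

# Synthetic check: an edit that is exactly affine should be recovered closely.
rng = np.random.default_rng(0)
raw = rng.random((8, 8, 3))                      # hypothetical 8x8 raw patch
M = np.array([[1.20, 0.00, 0.10],
              [0.00, 0.90, 0.00],
              [0.05, 0.00, 1.10]])               # made-up color matrix
b = np.array([0.02, -0.01, 0.03])                # made-up color offset
edited = raw @ M + b

A = fit_local_affine(raw, edited)
recon = apply_local_affine(raw, A)
max_err = float(np.abs(recon - edited).max())
```

Because the output is computed from the raw pixels through a smooth color map, the fine structure of the raw patch carries over to the result, which is the detail-preserving property the abstract attributes to this kind of model.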

Index Terms

Occlusion-aware Video Temporal Consistency
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
        Computational photography
  2. Computer graphics
    1. Graphics systems and interfaces
      1. Perception
    2. Image manipulation
      1. Image processing
2. General and reference
  1. Cross-computing tools and techniques
    1. Metrics

Information & Contributors

Information

Published In

MM '17: Proceedings of the 25th ACM international conference on Multimedia

October 2017

2028 pages

ISBN:9781450349062

DOI:10.1145/3123266

General Chairs:
Qiong Liu, FXPAL, USA
Rainer Lienhart, Universität Augsburg, Germany
Haohong Wang, TCL America, USA

Program Chairs:
Sheng-Wei "Kuan-Ta" Chen, Academia Sinica, Taiwan
Susanne Boll, University of Oldenburg, Germany
Phoebe Chen, La Trobe University, Australia
Gerald Friedland, Lawrence Livermore National Lab, USA
Jia Li, Google, USA
Shuicheng Yan, Qihoo 360, China

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '17

Sponsor:

SIGMM

MM '17: ACM Multimedia Conference

October 23 - 27, 2017

Mountain View, California, USA

Acceptance Rates

MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Total Citations: 33
Total Downloads: 578
Downloads (Last 12 months): 44
Downloads (Last 6 weeks): 2

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Zhijian D, Zhou W, Liu K, Zhang W, Yu N. (2024). Temporal Optimization for Face Swapping Video based on Consistency Inheritance. Proceedings of the ACM Turing Award Celebration Conference - China 2024, 165-170. Online publication date: 5-Jul-2024.
https://dl.acm.org/doi/10.1145/3674399.3674457
Fulari A, Mulleti S, Rajwade A. (2024). Unsupervised Model-based Learning for Simultaneous Video Deflickering and Deblotching. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 4105-4113. Online publication date: 3-Jan-2024.
https://doi.org/10.1109/WACV57701.2024.00407
Wei T, Chen D, Zhou W, Liao J, Zhao H, Zhang W, Hua G, Yu N. (2024). Deep Image Matting With Sparse User Interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:2, 881-895. Online publication date: Feb-2024.
https://doi.org/10.1109/TPAMI.2023.3326693
Hou J, Ji Z, Yang J, Wang C, Zheng F. (2024). MCD-Net: Toward RGB-D Video Inpainting in Real-World Scenes. IEEE Transactions on Image Processing 33, 1095-1108. Online publication date: 2024.
https://doi.org/10.1109/TIP.2024.3358675
Wu S, Liu Z, Zhang B, Zimmermann R, Ba Z, Zhang X, Ren K. (2024). Do as I Do: Pose Guided Human Motion Copy. IEEE Transactions on Dependable and Secure Computing 21:6, 5293-5307. Online publication date: Nov-2024.
https://doi.org/10.1109/TDSC.2024.3371530
Yin W, Lu P, Peng X, Zhao Z, Yu J. (2024). Hierarchical Color Fusion Network (HCFN): Enhancing exemplar-based video colorization. Neurocomputing 598, 128121. Online publication date: Sep-2024.
https://doi.org/10.1016/j.neucom.2024.128121
Liu Y, Zhao H, Chan K, Wang X, Loy C, Qiao Y, Dong C. (2024). Temporally consistent video colorization with deep feature propagation and self-regularization learning. Computational Visual Media 10:2, 375-395. Online publication date: 3-Jan-2024.
https://doi.org/10.1007/s41095-023-0342-8
Fang X, Zhang S. (2024). Single-Video Temporal Consistency Enhancement with Rolling Guidance. Computational Visual Media, 109-130. Online publication date: 10-Apr-2024.
https://dl.acm.org/doi/10.1007/978-981-97-2092-7_6
Shekhar S, Reimann M, Hilscher M, Semmo A, Döllner J, Trapp M. (2023). Interactive Control over Temporal Consistency while Stylizing Video Streams. Computer Graphics Forum 42:4. Online publication date: 26-Jul-2023.
https://doi.org/10.1111/cgf.14891
Rendle G, Kreskowski A, Froehlich B. (2023). Volumetric Avatar Reconstruction with Spatio-Temporally Offset RGBD Cameras. 2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR), 72-82. Online publication date: Mar-2023.
https://doi.org/10.1109/VR55154.2023.00023
