Abstract
Recently, “epitomes” were introduced as patch-based probability models that are learned by compiling together a large number of example patches from input images. In this paper, we describe how epitomes can be used to model video data, and we present significant computational speedups that can be incorporated into the epitome inference and learning algorithms. In the case of videos, epitomes are estimated so as to model most of the small space-time cubes from the input data. The epitome can then be used for various modeling and reconstruction tasks, of which we show results for video super-resolution, video interpolation, and object removal. Besides computational efficiency, an interesting advantage of the epitome as a representation is that it can be reliably estimated even from videos with large amounts of missing data. We illustrate this ability on the tasks of reconstructing the dropped frames in a video broadcast using only the degraded video and of denoising a severely corrupted video.
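To make the idea of modeling space-time cubes with missing data more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the epitome stores a per-voxel mean and variance (a diagonal Gaussian, as in the image epitome formulation), uses a uniform prior over positions, and searches positions exhaustively without wrap-around or any of the speedups mentioned above. The function names `extract_cubes` and `mapping_posterior`, the array shapes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def extract_cubes(video, size, stride):
    """Collect space-time cubes of shape `size` = (t, h, w) from a (T, H, W) video."""
    T, H, W = video.shape
    t, h, w = size
    cubes = [
        video[i:i + t, j:j + h, k:k + w]
        for i in range(0, T - t + 1, stride)
        for j in range(0, H - h + 1, stride)
        for k in range(0, W - w + 1, stride)
    ]
    return np.stack(cubes)

def mapping_posterior(cube, observed, mean, var):
    """Posterior over epitome positions for one cube.

    Assumes a diagonal Gaussian per epitome voxel and a uniform prior over
    positions. Only voxels where `observed` is True contribute, so missing
    data is simply left out of the likelihood.
    """
    t, h, w = cube.shape
    Te, He, We = mean.shape
    shape = (Te - t + 1, He - h + 1, We - w + 1)
    loglik = np.empty(shape)
    for i, j, k in np.ndindex(*shape):
        m = mean[i:i + t, j:j + h, k:k + w]
        v = var[i:i + t, j:j + h, k:k + w]
        ll = -0.5 * (np.log(2 * np.pi * v) + (cube - m) ** 2 / v)
        loglik[i, j, k] = ll[observed].sum()
    loglik -= loglik.max()          # stabilize before exponentiating
    post = np.exp(loglik)
    return post / post.sum()

# Usage on synthetic data (all values here are made up for illustration).
rng = np.random.default_rng(0)
epitome_mean = rng.normal(size=(4, 8, 8))
epitome_var = np.full(epitome_mean.shape, 0.01)

cube = epitome_mean[1:3, 2:5, 3:6].copy()   # a cube the epitome explains exactly
observed = np.ones(cube.shape, dtype=bool)
observed[0] = False                          # pretend the first frame was dropped

post = mapping_posterior(cube, observed, epitome_mean, epitome_var)
best = np.unravel_index(post.argmax(), post.shape)
```

In this sketch the cube is still matched to its source position in the epitome even though one of its two frames is treated as missing, because the likelihood is evaluated only over observed voxels. A learning procedure would alternate this kind of posterior computation with updates of the epitome mean and variance; that step is omitted here.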
Cite this article
Cheung, V., Frey, B.J. & Jojic, N. Video Epitomes. Int J Comput Vis 76, 141–152 (2008). https://doi.org/10.1007/s11263-006-0001-4
DOI: https://doi.org/10.1007/s11263-006-0001-4