Abstract
We present a novel method for transferring speech animation recorded in low-resolution videos onto realistic 3D facial models. Unsupervised learning is applied to a speech video corpus to find the underlying manifold of facial configurations, and K-means clustering in the resulting low-dimensional space identifies key speech-related facial shapes. Using a small set of laser-scanned 3D models that correspond to the cluster centroids, the facial animation in the 2D videos is transferred onto 3D shapes. In particular, a weak perspective projection model is used to recover the underlying mandible rotation from the videos, and this rotation drives the 3D skull movements. The adaptation of a generic skull to the facial models is guided by a 2D image, the Tissue Map. With modest data requirements, our system transfers the animation and achieves realistic rendering that respects the underlying anatomical structure.
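The clustering step can be illustrated with a minimal sketch. It assumes the manifold learning has already produced one low-dimensional coordinate tuple per video frame; the abstract does not say which embedding method is used, so the toy `embedding` below is made up. The names `kmeans` and `key_frames` and the farthest-point initialisation are illustrative choices, not the authors' implementation.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on a list of equal-length coordinate tuples."""
    # Assumption: deterministic farthest-point initialisation, chosen here
    # only so the sketch is reproducible.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each frame to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignment


def key_frames(points, centroids):
    """Index of the frame closest to each centroid (a candidate key shape)."""
    return [
        min(range(len(points)), key=lambda i: math.dist(points[i], c))
        for c in centroids
    ]


# Hypothetical 2D embedding of nine video frames (three loose groups).
embedding = [
    (0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
    (5.0, 5.1), (5.1, 5.0), (5.05, 5.05),
    (10.0, 0.1), (10.1, 0.0), (10.05, 0.05),
]
centroids, assignment = kmeans(embedding, k=3)
print(key_frames(embedding, centroids))
```

In a pipeline like the one described, the frames returned by `key_frames` would be the ones matched with laser-scanned 3D models, so the number of clusters sets how many scans are needed.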
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pei, Y., Zha, H. (2006). Vision Based Speech Animation Transferring with Underlying Anatomical Structure. In: Narayanan, P.J., Nayar, S.K., Shum, HY. (eds) Computer Vision – ACCV 2006. ACCV 2006. Lecture Notes in Computer Science, vol 3851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11612032_60
DOI: https://doi.org/10.1007/11612032_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31219-2
Online ISBN: 978-3-540-32433-1
eBook Packages: Computer Science, Computer Science (R0)