Current Projects:

Real-time speech motion synthesis

Data-driven approaches have been used successfully for realistic visual speech synthesis. However, little effort has been devoted to real-time lip-synching for interactive applications. In particular, algorithms that are based on a graph of motions are notorious for their exponential complexity. In this work, we present a greedy graph search algorithm that yields vastly superior performance and allows real-time motion synthesis from a large database of motions. The time complexity of the algorithm is linear in the length of the input utterance. In our experiments, the synthesis time for an input sentence of average length is under a second. This performance is sufficient for interactive virtual environments such as video games.
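
To illustrate why a greedy search runs in time linear in the utterance length, here is a minimal Python sketch. It is not the algorithm from the paper: the `Segment` type, the scalar "pose" values, the phoneme labels, and the transition cost are all hypothetical stand-ins. For each phoneme, the sketch keeps only the candidate segment that joins most smoothly onto the previous choice and never revisits earlier decisions.

```python
from typing import Dict, List, NamedTuple, Optional


class Segment(NamedTuple):
    """Hypothetical recorded motion clip for one phoneme."""
    name: str
    start_pose: float  # stand-in for the facial pose at the first frame
    end_pose: float    # stand-in for the facial pose at the last frame


def greedy_synthesis(phonemes: List[str],
                     database: Dict[str, List[Segment]]) -> List[Segment]:
    """Pick one segment per phoneme, committing to the locally best choice.

    Each phoneme is visited once and only its own candidates are scanned,
    so the work grows linearly with the number of phonemes.
    """
    path: List[Segment] = []
    prev: Optional[Segment] = None
    for ph in phonemes:
        candidates = database[ph]
        if prev is None:
            # Assumption: with no predecessor, take the first candidate.
            choice = candidates[0]
        else:
            # Assumed transition cost: pose jump between consecutive clips.
            choice = min(candidates,
                         key=lambda seg: abs(seg.start_pose - prev.end_pose))
        path.append(choice)
        prev = choice
    return path


# Toy database (all values invented for illustration).
DATABASE = {
    "HH": [Segment("HH_a", 0.0, 0.2), Segment("HH_b", 0.0, 0.8)],
    "AH": [Segment("AH_a", 0.7, 0.5), Segment("AH_b", 0.1, 0.4)],
    "L":  [Segment("L_a", 0.9, 0.3), Segment("L_b", 0.45, 0.6)],
    "OW": [Segment("OW_a", 0.25, 0.0), Segment("OW_b", 0.65, 0.1)],
}

chosen = greedy_synthesis(["HH", "AH", "L", "OW"], DATABASE)
print([seg.name for seg in chosen])  # ['HH_a', 'AH_b', 'L_b', 'OW_b']
```

The trade-off of any greedy strategy is that it may miss the globally smoothest sequence, since an early choice is never reconsidered; in exchange, the cost per phoneme stays bounded by the number of candidates for that phoneme.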

Demo Video Download: (640x480) (41.9M)

No One Ever Liked Me! - Monologue created by Speech-driven Motion Synthesis

The motion in this animation was created using our novel techniques for automatic expressive speech motion synthesis. The input to our system is a spoken utterance and a set of emotional tags. Its output is a realistic facial animation that is synched to the input audio and faithfully conveys the specified emotions.

The story is taken from the play "The Food Chain" by Nicky Silver. The monologue No One Ever Liked Me! in this animation is an alternate ending that comes at the end of a running tirade by Otto, a hugely overweight, insecure, rage-filled, Jewish, out-of-control verbal tornado. Otto's got a gun.

Video Download: (720x540): No One Ever Liked Me! (15.5M)

Interactive Motion Decomposition

We introduce a novel method for editing the style of motion data through motion decomposition. Our method extracts the style of a motion using linear decomposition based on Independent Component Analysis. The extracted style components are applied to other motions through a variety of editing operations. The resulting motions retain their original basic content while exhibiting the style of a different motion.
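
As a rough illustration of the editing step, the following Python sketch assumes that two motions have already been decomposed into components over a shared mixing matrix, and that the user has identified which component carries style. The function names, the 2x2 mixing matrix, and the `walk` and `sneak` data are hypothetical and chosen only to keep the arithmetic easy to follow; this is not the implementation used in the project.

```python
def reconstruct(mixing, components):
    """Rebuild motion channels as mixing matrix times components, per frame."""
    n_frames = len(components[0])
    return [
        [sum(w * comp[t] for w, comp in zip(row, components))
         for t in range(n_frames)]
        for row in mixing
    ]


def transfer_style(target, donor, style_index):
    """Copy the donor's style component into a copy of the target's components."""
    edited = [list(c) for c in target]
    edited[style_index] = list(donor[style_index])
    return edited


def scale_style(components, style_index, gain):
    """Exaggerate (gain > 1) or damp (gain < 1) the style component."""
    edited = [list(c) for c in components]
    edited[style_index] = [gain * v for v in edited[style_index]]
    return edited


# Hypothetical decomposition: row 0 is "content", row 1 is "style".
mixing = [[1.0, 0.5],
          [0.0, 2.0]]
walk = [[0.0, 1.0, 0.0, -1.0], [0.0, 0.0, 0.0, 0.0]]
sneak = [[0.0, 0.5, 0.0, -0.5], [1.0, 1.0, 1.0, 1.0]]

# Keep the walk's content but apply the sneak's style, then reconstruct.
stylized_walk = reconstruct(mixing, transfer_style(walk, sneak, style_index=1))
print(stylized_walk)  # [[0.5, 1.5, 0.5, -0.5], [2.0, 2.0, 2.0, 2.0]]
```

Because the decomposition is linear, each edit is a simple operation on one component followed by a matrix product, which is what makes interactive editing plausible.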

Supporting Video Download: (640x480) (21.7M)

Facial Animation Editing with Independent Component Analysis (ICA)

We present a new method for editing speech-related facial motions. Our method uses an unsupervised learning technique, Independent Component Analysis (ICA), to extract a set of meaningful parameters without any annotation of the data. With ICA, we are able to solve a blind source separation problem and describe the original data as a linear combination of two sources. One source captures content (speech) and the other captures style (emotion). By manipulating the independent components we can edit the motions in intuitive ways.
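
The two-source separation idea can be sketched in plain Python. This is a toy, not the method used in the paper: it handles only two mixed one-dimensional signals, whitens them, and then searches for the rotation that makes the outputs as non-Gaussian as possible (measured here by squared excess kurtosis, one common ICA contrast). The synthetic "content" and "style" signals, the mixing weights, and all function names are assumptions made for illustration; real facial motion data is high-dimensional and would normally be handled by an established ICA implementation.

```python
import math


def whiten(x1, x2):
    """Center two signals and transform them to unit, uncorrelated variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c11 = sum(p * p for p in a) / n
    c22 = sum(q * q for q in b) / n
    c12 = sum(p * q for p, q in zip(a, b)) / n
    # Rotation angle that diagonalizes the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    var1 = c * c * c11 + 2.0 * c * s * c12 + s * s * c22
    var2 = s * s * c11 - 2.0 * c * s * c12 + c * c * c22
    z1 = [(c * p + s * q) / math.sqrt(var1) for p, q in zip(a, b)]
    z2 = [(-s * p + c * q) / math.sqrt(var2) for p, q in zip(a, b)]
    return z1, z2


def excess_kurtosis(y):
    """Excess kurtosis of a zero-mean signal (0 for a Gaussian)."""
    n = len(y)
    m2 = sum(v * v for v in y) / n
    m4 = sum(v ** 4 for v in y) / n
    return m4 / (m2 * m2) - 3.0


def ica_two_sources(x1, x2, steps=90):
    """Grid-search the rotation of whitened data that maximizes non-Gaussianity."""
    z1, z2 = whiten(x1, x2)
    best_score, best = -1.0, None
    for k in range(steps):
        phi = 0.5 * math.pi * k / steps  # angles in [0, 90 degrees) suffice
        c, s = math.cos(phi), math.sin(phi)
        y1 = [c * p + s * q for p, q in zip(z1, z2)]
        y2 = [-s * p + c * q for p, q in zip(z1, z2)]
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if score > best_score:
            best_score, best = score, (y1, y2)
    return best


def correlation(u, v):
    """Pearson correlation, used only to check the recovered components."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = math.sqrt(sum((p - mu) ** 2 for p in u))
    sv = math.sqrt(sum((q - mv) ** 2 for q in v))
    return cov / (su * sv)


def demo_signals(n=1000):
    """Synthetic stand-ins: a sine for 'content', a sawtooth for 'style'."""
    t = [i / n for i in range(n)]
    content = [math.sin(2.0 * math.pi * 7.0 * x) for x in t]
    style = [2.0 * ((13.0 * x) % 1.0) - 1.0 for x in t]
    mixed1 = [0.8 * p + 0.3 * q for p, q in zip(content, style)]
    mixed2 = [0.4 * p + 0.9 * q for p, q in zip(content, style)]
    return content, style, mixed1, mixed2


content, style, mixed1, mixed2 = demo_signals()
y1, y2 = ica_two_sources(mixed1, mixed2)
print(abs(correlation(y1, content)), abs(correlation(y1, style)))
print(abs(correlation(y2, content)), abs(correlation(y2, style)))
```

ICA recovers sources only up to order, sign, and scale, so the check compares absolute correlations and accepts either assignment of outputs to sources. In the facial motion setting, deciding which recovered component corresponds to speech content and which to emotion is a separate interpretation step.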

Supporting Video Download: (480x360)
Part1(5.7M), Part2(16.8M), Part3(5.9M), Part4(7.2M)

Data-driven Visual Speech with Emotion Control

We present a set of novel techniques for automatically synthesizing speech-driven expressive facial animation. The input to our system is a spoken utterance and a set of emotional tags. These emotional tags can be specified by a user or extracted from the speech signal using a classifier. The output is a realistic facial animation that is synched to the input audio and faithfully conveys the specified emotions. This approach relies on a database of high-fidelity recorded facial motions. This database includes speech-related motions with variations across multiple emotions. Our main contribution is a system that is able to generate expressive speech facial animation with real-time performance.

Demo Videos for Synthesized Speech Download: (640x480) (54.7M)


Publications:

·         Yong Cao, Petros Faloutsos, Fred Pighin “Expressive Speech-Driven Facial Animation”, to appear in ACM Transactions on Graphics, October 2005. (Download PDF 2.2M)

·         Yong Cao, Petros Faloutsos, Eddie Kohler, Fred Pighin “Real-time Speech Motion Synthesis from Recorded Motions”, In Proceedings of the 2004 ACM SIGGRAPH / Eurographics Symposium on Computer Animation, pages 347-355. (Download PDF 415K)

·         Ari Shapiro, Yong Cao, Petros Faloutsos “Stylistic Motion Decomposition”, ACM SIGGRAPH / Eurographics Symposium on Computer Animation 2004, Poster Paper.

·         Ari Shapiro, Yong Cao, Petros Faloutsos “Interactive Motion Decomposition”, ACM SIGGRAPH 2004 Technical Sketches. (Download PDF 238K)

·         Yong Cao, Petros Faloutsos, Fred Pighin “Unsupervised Learning for Speech Motion Editing”, In Proceedings of the 2003 ACM SIGGRAPH / Eurographics Symposium on Computer Animation, pages 225-231. (Download PDF 755K)

·         Yong Cao, Tian Jie & Qiu Feng “Research of Progressive Meshes Algorithm Applied in Virtual Endoscopy System”, Journal of Software (Chinese Academy of Sciences) 2002, Vol. 13, No.4, pp. 677-685.

·         Qiu Feng, Tian Jie, Yong Cao “The Summarization of PACS System”, Chinese Journal of Medical Imaging Technology 2002, Vol. 16, No.1, pp. 73-75.

·         Liu Jingchun, Tian Jie, Yong Cao “The Architecture and Implementation of PACS System”, Chinese Journal of Medical Imaging Technology 2000, Vol. 16, No.1, pp. 76-78.