KR101089184B1 - Method and system for providing a speech and expression of emotion in 3D charactor - Google Patents
- Publication number
- KR101089184B1 KR20100000837A
- Authority
- KR
- South Korea
- Prior art keywords
- expression
- speech
- lip
- lip shape
- character
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Processing Or Creating Images (AREA)
Abstract
The present invention relates to a system and method for providing the speech and emotion expression of a character, in which a three-dimensional character appearing in 3D animation, a 3D virtual space, advertisement content delivery, and the like executes an utterance operation that expresses the content of speech simultaneously with an emotion expression operation such as crying or laughing, so that story delivery, advertisement delivery, and content delivery can be made clearly.
A system for providing speech and emotion expression of a character according to the present invention includes: a situation recognition unit for recognizing the surrounding situation; a speech sentence selection unit for selecting a speech sentence according to the recognized surrounding situation; an utterance image selection unit for selecting the lip shapes required to express the selected speech sentence; an expression selection unit for selecting a facial expression corresponding to the emotion expression according to the recognized surrounding situation; a sound source generation unit for generating a sound source corresponding to the selected speech sentence; a syntax analysis unit for extracting the consonant and vowel information necessary for generating the lip shapes from the speech sentence and generating time information indicating when the consonants and vowels that change the lip shape are pronounced; a controller configured to control the facial expression, the lip shape, and the sound source to be synchronized; and an emotion expression unit for expressing the synchronized facial expression, lip shape, and sound source.
According to the present invention, it is possible to provide a 2D or 3D character capable of displaying a facial expression and speech content simultaneously. Accordingly, various emotion expressions can be provided through the character's facial expressions and utterances.
Description
The present invention relates to a system and method for simultaneously providing the utterance motion and the emotion expression motion of a three-dimensional character. More specifically, a three-dimensional character appearing in 3D animation, a 3D virtual space, advertisement content delivery, and the like performs an utterance operation that expresses the content of speech while performing an emotion expression operation such as crying or laughing, so that stories, advertisements, and content can be communicated clearly through the 3D character.
The main direction of conventional facial animation research has been to find efficient ways to handle emotions and lip movements. Many studies on facial expression behavior have been conducted at home and abroad, yet it is still hard to say that the characters appearing in 3D games and animations produce natural facial expressions. Nevertheless, face modeling and animation have advanced dramatically in recent years.
Computer graphics technology for 3D animation production is growing and developing globally, and research is under way on expanding and improving the expressive range, improving performance to shorten production time and reduce production cost, and improving interfaces for user convenience.
In addition, voice recognition and speaker authentication technology has been steadily developed around the world and shows very satisfactory performance in limited environments. To improve the performance of a voice recognition or speaker authentication system, it is essential to extract clear phoneme boundaries from continuous speech. The most important consideration for the natural facial expression of animated characters is the synchronization of the voice signal and the lip movement.
When producing animation, a voice actor first records the dialogue, and the character animation is created accordingly. Conventional text-based mouth-shape synchronization and facial-expression animation methods are therefore difficult to use at actual production sites, and techniques that generate animation by extracting phonemes directly from the recorded voice data have been studied.
However, although much research has been done on facial expressions and on the movements of facial parts themselves, including in medicine and art, the three-dimensional face models actually in use are mainly animated frame by frame by an animator, by hand or with three-dimensional software, so even when animation was produced, its quality was low relative to the working time.
In addition, when emotion expression and utterance motion were applied to a 3D character, they proceeded as separate, sequential operations: for example, the character first made a smiling expression with its lips and then moved its lips to speak, or spoke first and then performed a crying motion. Therefore, a technology is required that allows the utterance operation to be performed simultaneously with an emotion expressing operation such as crying or laughing, in order to improve the content delivery or story delivery of the 3D character's motion.
An object of the present invention for solving the above-described problems is to provide a system and method for providing speech and emotion expression of a character in which a three-dimensional character appearing in 3D animation, a 3D virtual space, advertisement content delivery, and the like performs an utterance operation that expresses the delivered content in words while performing an emotion expression operation such as crying or laughing, so that story delivery, advertisement delivery, and content delivery can be performed clearly through the three-dimensional character.
According to an aspect of the present invention, a system for providing speech and emotion expression of a character includes: a situation recognition unit for recognizing a surrounding situation; a speech sentence selection unit for selecting a speech sentence according to the recognized surrounding situation; an utterance image selection unit for selecting the lip shapes required to express the selected speech sentence; an expression selection unit for selecting a facial expression corresponding to the emotion expression according to the recognized surrounding situation; a sound source generation unit for generating a sound source corresponding to the selected speech sentence; a syntax analysis unit for extracting the consonant and vowel information necessary for generating the lip shapes from the speech sentence and generating time information indicating when the consonants and vowels that change the lip shape are pronounced; a controller configured to control the facial expression, the lip shape, and the sound source to be synchronized; and an emotion expression unit for expressing the synchronized facial expression, lip shape, and sound source.
The system may further include: a facial expression database for storing the facial expressions as images; an utterance image database for storing the lip shapes as utterance images; an utterance sentence database for storing data corresponding to the utterance sentences; and an emotion adding unit for changing the tone of the generated sound source to add emotion information.
The emotion expression unit may include a display unit for displaying the synchronized facial expression and lip shape on a screen, and a sound source output unit for outputting the sound source synchronized with the facial expression and lip shape.
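As a rough illustration of how these units could fit together, the following Python sketch wires the pipeline end to end. It is ours, not the patent's: every function body, name, and placeholder value is an assumption, since the patent defines the units' roles rather than their algorithms.

```python
def run_pipeline(surroundings: str) -> dict:
    """Toy end-to-end flow of the units listed above; all bodies are stand-ins."""
    # situation recognition unit: classify the surroundings
    situation = "greeting" if "visitor" in surroundings else "idle"
    # speech sentence selection unit: pick a sentence for the situation
    sentence = {"greeting": "안녕하세요", "idle": ""}[situation]
    # expression selection unit: pick an emotional expression
    expression = {"greeting": "smile", "idle": "neutral"}[situation]
    # utterance image selection unit + syntax analysis unit:
    # lip keyframes paired with the times their phonemes are pronounced
    lip_track = [(shape, 0.15 * i) for i, shape in enumerate(["ㅏ", "ㅕ", "ㅏ", "ㅔ", "ㅛ"])]
    # sound source generation unit: synthesize audio for the sentence (stubbed)
    sound = ("tts", sentence)
    # control unit: hand the synchronized tracks to the emotion expression unit
    return {"expression": expression, "lips": lip_track, "sound": sound}

print(run_pipeline("a visitor approaches"))
```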
In addition, the controller analyzes the consonants and vowels of the speech sentence and controls the lip shape based on the vowel that changes the lip shape the most; for consonants pronounced with the lips closed, it controls the lips to close before the next vowel is expressed.
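Because the lip shape is driven by vowels, with closed-lip consonants forcing the mouth shut, the consonant/vowel analysis can be sketched directly on Korean text. The sketch below is our assumption of one way to do it: it uses the standard Unicode Hangul syllable decomposition and treats the bilabials ㅁ, ㅂ, ㅃ, and ㅍ as the closed-lip consonants.

```python
# Standard Hangul decomposition: syllables occupy U+AC00..U+D7A3.
CHO = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")            # 19 onset consonants
JUNG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")        # 21 vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 codas + none

BILABIAL = {"ㅁ", "ㅂ", "ㅃ", "ㅍ"}   # consonants pronounced with the lips closed

def lip_events(sentence):
    """Yield (jamo, event) pairs: 'close' for bilabial consonants,
    'vowel' for the vowel that determines the key lip shape."""
    for ch in sentence:
        idx = ord(ch) - 0xAC00
        if not 0 <= idx <= 11171:        # skip anything that is not a Hangul syllable
            continue
        cho, jung, jong = CHO[idx // 588], JUNG[idx % 588 // 28], JONG[idx % 28]
        if cho in BILABIAL:
            yield cho, "close"           # close the lips before the vowel opens them
        yield jung, "vowel"              # the vowel drives the dominant lip keyframe
        if jong in BILABIAL:
            yield jong, "close"

print(list(lip_events("안녕하세요")))     # five vowel events, no bilabial closures
```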
For lip movement, the controller may provide connection lines ("bones"), corresponding to human bones, in the lip-shaped graphic objects of the upper and lower lips, and control the lip shape to move like joints according to the movement of the connection lines.
The controller may control, for the upper lip, a plurality of connection lines, a plurality of rotation control points on the connection lines, and a plurality of position control points at the tips of the lips, and, for the lower lip, a plurality of connection lines and a plurality of position control points.
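A data layout consistent with those counts might look as follows; the structure and every name here are illustrative assumptions, since the patent only specifies which control elements each lip has.

```python
# Hypothetical rig layout mirroring the counts in the text; all names are invented.
UPPER_LIP_RIG = {
    "bones": ["upper_l", "upper_c", "upper_r"],        # connection lines ("bones")
    "rotation_points": ["rot_l", "rot_r"],             # rotation control points on the bones
    "position_points": ["tip_l", "tip_c", "tip_r"],    # position control points at the lip tips
}
LOWER_LIP_RIG = {
    "bones": ["lower_l", "lower_c", "lower_r"],        # connection lines, plus...
    "position_points": ["tip_l", "tip_c", "tip_r"],    # ...position points (no rotation points)
}
```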
In addition, the controller controls the lip shape by moving or rotating a control point, or controls the motion of the lip shape by applying acceleration/deceleration to the object connecting two control points.
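The acceleration/deceleration between control points can be realized with any easing curve; the patent does not name one, so the minimal sketch below assumes a smoothstep curve.

```python
def ease_in_out(t: float) -> float:
    """Smoothstep curve: accelerates, then decelerates, over t in [0, 1]."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)

def move_control_point(start, end, t):
    """Interpolate a lip control point from start to end with
    acceleration/deceleration applied along the way."""
    s = ease_in_out(t)
    return tuple(a + (b - a) * s for a, b in zip(start, end))

print(move_control_point((0.0, 0.0), (1.0, 2.0), 0.5))   # -> (0.5, 1.0)
```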
The controller controls the lip shape by applying a weight to the control points of the lip shape within the facial expression according to the emotional state.
The controller may further control the facial expression, the lip shape, and the sound source to be synchronized according to a synchronization function composed of pairs of facial expression and expression time, pairs of speech sentence and speech time, and the difference between expression time and speech time.
On the other hand, in order to achieve the above object, a method of providing speech and emotion expression of a character according to the present invention includes: (a) recognizing a surrounding situation; (b) selecting a speech sentence according to the recognized surrounding situation; (c) selecting the lip shapes needed to express the selected speech sentence; (d) selecting a facial expression corresponding to the emotion expression according to the recognized surrounding situation; (e) generating a sound source corresponding to the selected speech sentence; (f) extracting the consonant and vowel information necessary for lip shape generation from the speech sentence and generating time information indicating when the consonants and vowels that change the lip shape are pronounced; and (g) expressing the facial expression, the lip shape, and the sound source in synchronization.
In step (c), the consonants and vowels of the utterance are analyzed, the lip shape is selected based on the vowel that changes the lip shape the most, and, for consonants pronounced with the lips closed, a closed lip shape is selected before the next vowel is expressed.
In step (g), the facial expression, the lip shape, and the sound source are expressed in synchronization according to a synchronization function composed of pairs of facial expression and expression time, pairs of speech sentence and speech time, and the difference between expression time and speech time.
Also, in step (c), connection lines ("bones") corresponding to human bones are provided in the graphic objects of the upper and lower lips so that they move like joints, and a lip shape formed according to the movement of the connection lines is selected.
Also, in step (c), a changed lip shape is selected by moving or rotating a control point, or a lip shape to which acceleration/deceleration is applied on the object connecting two control points is selected.
In step (c), a lip shape is selected by applying a weight to the control points of the lip shape within the facial expression according to the emotional state.
According to the present invention, it is possible to provide a 2D or 3D character capable of displaying a facial expression and speech content simultaneously.
Accordingly, various emotion expressions may be provided according to facial expressions and utterances of the character.
FIG. 1 is a block diagram schematically illustrating the functional blocks of a system for providing speech and emotion expression of a character according to an exemplary embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of providing speech and emotion expression of a character according to an exemplary embodiment of the present invention.
FIG. 3 is a view showing an example of a lip shape provided with bones according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of synchronizing facial expression and speech information based on time information.
FIG. 5 is a view showing an example in which the facial expression and the lip shape are expressed simultaneously according to an embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings, wherein like reference numerals refer to like elements throughout.
FIG. 1 is a block diagram schematically illustrating the functional blocks of a system for providing speech and emotion expression of a character according to an exemplary embodiment of the present invention.
Referring to FIG. 1, the character speech and emotion expression providing system 100 includes a situation recognition unit 102, a speech sentence selection unit 104, an utterance image selection unit 106, an expression selection unit 108, a sound source generation unit 110, a syntax analysis unit 112, a control unit 114, an emotion expression unit 116, a facial expression DB 118, an utterance image DB 120, an utterance sentence DB 122, and an emotion adding unit 124.
The situation recognition unit 102 recognizes the surrounding situation.
The speech sentence selection unit 104 selects a speech sentence according to the recognized surrounding situation.
In addition, a user input unit may be provided so that a user may arbitrarily input emotions and speech sentences.
The utterance image selection unit 106 selects the lip shapes required to express the selected speech sentence.
The expression selection unit 108 selects a facial expression corresponding to the emotion expression according to the recognized surrounding situation.
The sound source generation unit 110 generates a sound source corresponding to the selected speech sentence.
The syntax analysis unit 112 extracts the consonant and vowel information necessary for generating the lip shapes from the speech sentence, and generates time information indicating when the consonants and vowels that change the lip shape are pronounced.
The control unit 114 controls the facial expression, the lip shape, and the sound source to be synchronized.
The emotion expression unit 116 expresses the synchronized facial expression, lip shape, and sound source, and may include a display unit for displaying the synchronized facial expression and lip shape on a screen and a sound source output unit for outputting the synchronized sound source.
The facial expression DB 118 stores the facial expressions as images.
The utterance image DB 120 stores the lip shapes as utterance images.
The utterance sentence DB 122 stores data corresponding to the utterance sentences.
The emotion adding unit 124 changes the tone of the generated sound source to add emotion information.
In addition, the control unit 114 analyzes the consonants and vowels of the speech sentence and controls the lip shape based on the vowel that changes the lip shape the most; for consonants pronounced with the lips closed, it controls the lips to close before the next vowel is expressed.
In addition, the control unit 114 provides connection lines ("bones"), corresponding to human bones, in the lip-shaped graphic objects of the upper and lower lips, and controls the lip shape to move like joints according to the movement of the connection lines.
In addition, the control unit 114 controls, for the upper lip, a plurality of connection lines, a plurality of rotation control points on the connection lines, and a plurality of position control points at the tips of the lips, and, for the lower lip, a plurality of connection lines and a plurality of position control points.
In addition, the control unit 114 controls the lip shape by moving or rotating a control point, or controls the motion of the lip shape by applying acceleration/deceleration to the object connecting two control points.
In addition, the control unit 114 controls the lip shape by applying a weight to the control points of the lip shape within the facial expression according to the emotional state.
Then, the control unit 114 controls the facial expression, the lip shape, and the sound source to be synchronized according to a synchronization function composed of Tai, Tbi, and Tci.
Here, Tai consists of facial expression i and its expression time tai, Tbi consists of speech sentence i and its speech time tbi, and Tci represents the difference between the expression time and the speech time for index i.
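In code, that synchronization function might be realized as below. The dictionary layout and the decision to keep Tci as an offset for the expression track are our assumptions; the patent defines only the three inputs.

```python
def synchronize(expressions, sentences):
    """Build a combined timeline from Tai = (expression i, time tai) and
    Tbi = (speech sentence i, time tbi); Tci = tai - tbi is the offset that
    aligns the expression track with the speech track."""
    timeline = []
    for (expression, ta), (sentence, tb) in zip(expressions, sentences):
        tc = ta - tb                                   # Tci for index i
        timeline.append({"start": tb, "sentence": sentence,
                         "expression": expression, "expression_offset": tc})
    return timeline

print(synchronize([("smile", 0.4)], [("안녕하세요", 0.0)]))
# -> [{'start': 0.0, 'sentence': '안녕하세요', 'expression': 'smile', 'expression_offset': 0.4}]
```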
FIG. 2 is a flowchart illustrating a method of providing speech and emotion expression of a character according to an exemplary embodiment of the present invention.
Referring to FIG. 2, the character speech and emotion expression providing system 100 first recognizes the surrounding situation through the situation recognition unit 102.
Here, the user may also arbitrarily input an emotion and a speech sentence through the user input unit in place of the recognized situation.
Subsequently, the character utterance and emotion expression providing system 100 selects a speech sentence according to the recognized surrounding situation.
Subsequently, the character utterance and emotion expression providing system 100 selects the lip shapes required to express the selected speech sentence.
At this time, the character utterance and emotion expression providing system 100 analyzes the consonants and vowels of the speech sentence and selects the lip shape based on the vowel that changes the lip shape the most; for consonants pronounced with the lips closed, it selects a closed lip shape before the next vowel is expressed.
In addition, the character utterance and emotion expression providing system 100 provides connection lines ("bones"), corresponding to human bones, in the lip-shaped graphic objects of the upper and lower lips, and selects a lip shape formed according to the movement of the connection lines.
In addition, the character utterance and emotion expression providing system 100 selects a changed lip shape by moving or rotating a control point, or selects a lip shape to which acceleration/deceleration is applied on the object connecting two control points.
When selecting the lip shape within the facial expression according to the emotional state, the character utterance and emotion expression providing system 100 applies a weight k to the control points of the lip shape.
Here, the value k represents a weight for determining the final lip shape.
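One simple way to realize such a weight, assuming (as the patent does not specify) a linear blend between the expression's lip control points and the speech lip control points:

```python
def final_lip_point(expression_point, speech_point, k):
    """Linear blend of a lip control point from the emotional expression with
    the point required for speech; k in [0, 1] weights the speech shape."""
    return tuple(e * (1.0 - k) + s * k
                 for e, s in zip(expression_point, speech_point))

print(final_lip_point((0.0, 1.0), (0.0, 0.0), 0.5))   # -> (0.0, 0.5)
```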
Subsequently, the character utterance and emotion expression providing system 100 selects a facial expression corresponding to the emotion expression according to the recognized surrounding situation.
Subsequently, the character speech and emotion expression providing system 100 generates a sound source corresponding to the selected speech sentence.
Subsequently, the character utterance and emotion expression providing system 100 extracts the consonant and vowel information necessary for generating the lip shapes from the speech sentence, and generates time information indicating when the consonants and vowels that change the lip shape are pronounced.
Subsequently, the character speech and emotion expression providing system 100 expresses the facial expression, the lip shape, and the sound source in synchronization according to the synchronization function based on the generated time information.
That is, the character speech and emotion expression providing system 100 displays the synchronized facial expression and lip shape on the screen through the display unit, and outputs the synchronized sound source through the sound source output unit.
Therefore, it is possible to provide users with a three-dimensional character capable of expressing a lip shape and a sound source simultaneously with facial expressions such as a smiling face.
As described above, according to the present invention, a three-dimensional character appearing in 3D animation, a 3D virtual space, advertisement content delivery, and the like executes an utterance operation that expresses the delivered content in words simultaneously with an emotion expression operation such as crying or laughing, thereby realizing a system and method for providing speech and emotion expression of a character in which story delivery, advertisement delivery, and content delivery can be made clearly through the three-dimensional character.
Since those skilled in the art to which the present invention pertains may implement the present invention in other specific forms without changing its technical spirit or essential features, the embodiments described above are illustrative in all respects and should not be construed as limiting. The scope of the present invention is defined by the following claims rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.
100: character utterance and emotion expression providing system 102: situation recognition unit
104: speech sentence selection unit 106: utterance image selection unit
108: facial expression selection unit 110: sound source generation unit
112: syntax analysis unit 114: control unit
116: emotion expression unit 118: facial expression DB
120: utterance image DB 122: utterance sentence DB
124: emotion adding unit
Claims (15)
A system for providing speech and emotion expression of a character, the system comprising: a situation recognition unit for recognizing a surrounding situation;
A speech sentence selection unit for selecting a speech sentence according to the recognized surrounding situation;
An utterance image selection unit for selecting the lip shapes required to express the selected speech sentence;
An expression selection unit for selecting a facial expression corresponding to the emotion expression according to the recognized surrounding situation;
A sound source generator for generating a sound source corresponding to the selected speech sentence;
A syntax analysis unit for extracting the consonant and vowel information necessary for generating the lip shapes from the speech sentence and for generating time information indicating when the consonants and vowels that change the lip shape are pronounced;
A controller configured to synchronize the facial expression, the lip shape, and the sound source according to a synchronization function based on the generated time information; And
An emotion expression unit for expressing the synchronized facial expression, lip shape, and sound source.
The character speech and emotion expression providing system further comprising:
An expression database for storing the facial expressions as images;
An utterance image database for storing the lip shapes as utterance images;
An utterance sentence database for storing data corresponding to the utterance sentences; And
An emotion adding unit for changing the tone of the generated sound source to add emotion information.
The character speech and emotion expression providing system wherein the emotion expression unit includes a display unit for displaying the synchronized facial expression and lip shape on a screen, and a sound source output unit for outputting the sound source synchronized with the facial expression and lip shape.
The character speech and emotion expression providing system wherein the control unit analyzes the consonants and vowels of the speech sentence, controls the lip shape based on the vowel that changes the lip shape the most, and, for consonants pronounced with the lips closed, controls the lips to close before the next vowel is expressed.
The character speech and emotion expression providing system wherein the control unit provides connection lines ("bones"), corresponding to human bones, in the lip-shaped graphic objects of the upper and lower lips, and controls the lip shape to move like joints according to the movement of the connection lines.
The character speech and emotion expression providing system wherein the control unit controls, for the upper lip, a plurality of connection lines, a plurality of rotation control points on the connection lines, and a plurality of position control points at the tips of the lips, and, for the lower lip, a plurality of connection lines and a plurality of position control points.
The character speech and emotion expression providing system wherein the controller controls the motion of the lip shape by moving or rotating a control point, or by applying acceleration/deceleration to the object connecting two control points.
The character speech and emotion expression providing system wherein the control unit controls the lip shape by applying a weight to the control points of the lip shape within the facial expression according to the emotional state.
The character speech and emotion expression providing system wherein the control unit controls the facial expression, the lip shape, and the sound source to be synchronized according to a synchronization function composed of pairs of facial expression and expression time, pairs of speech sentence and speech time, and the difference between expression time and speech time.
A method of providing speech and emotion expression of a character, the method comprising:
(a) recognizing a surrounding situation;
(b) selecting a speech sentence according to the recognized surrounding situation;
(c) selecting the lip shapes needed to express the selected speech sentence;
(d) selecting a facial expression corresponding to the emotional expression according to the recognized surrounding situation;
(e) generating a sound source corresponding to the selected speech sentence;
(f) extracting the consonant and vowel information necessary for lip shape generation from the speech sentence, and generating time information indicating when the consonants and vowels that change the lip shape are pronounced; And
(g) expressing the facial expression, the lip shape, and the sound source in synchronization according to a synchronization function based on the generated time information.
The character speech and emotion expression providing method wherein, in step (c), the consonants and vowels of the utterance are analyzed, the lip shape is selected based on the vowel that changes the lip shape the most, and, for consonants pronounced with the lips closed, a closed lip shape is selected before the next vowel is expressed.
The character speech and emotion expression providing method wherein, in step (g), the facial expression, the lip shape, and the sound source are expressed in synchronization according to a synchronization function composed of pairs of facial expression and expression time, pairs of speech sentence and speech time, and the difference between expression time and speech time.
The character speech and emotion expression providing method wherein, in step (c), connection lines ("bones") corresponding to human bones are provided in the lip-shaped graphic objects of the upper and lower lips so that they move like joints, and a lip shape formed according to the movement of the connection lines is selected.
The character speech and emotion expression providing method wherein, in step (c), a changed lip shape is selected by moving or rotating a control point, or a lip shape to which acceleration/deceleration is applied on the object connecting two control points is selected.
The character speech and emotion expression providing method wherein, in step (c), a lip shape is selected by applying a weight to the control points of the lip shape within the facial expression according to the emotional state.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20100000837A KR101089184B1 (en) | 2010-01-06 | 2010-01-06 | Method and system for providing a speech and expression of emotion in 3D charactor |
PCT/KR2011/000071 WO2011083978A2 (en) | 2010-01-06 | 2011-01-06 | System and method for providing utterances and emotional expressions of a character |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20100000837A KR101089184B1 (en) | 2010-01-06 | 2010-01-06 | Method and system for providing a speech and expression of emotion in 3D charactor |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20110081364A KR20110081364A (en) | 2011-07-14 |
KR101089184B1 true KR101089184B1 (en) | 2011-12-02 |
Family
ID=44305944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR20100000837A KR101089184B1 (en) | 2010-01-06 | 2010-01-06 | Method and system for providing a speech and expression of emotion in 3D charactor |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR101089184B1 (en) |
WO (1) | WO2011083978A2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102045761B1 (en) | 2019-09-26 | 2019-11-18 | 미디어젠(주) | Device for changing voice synthesis model according to character speech context |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9165404B2 (en) | 2011-07-14 | 2015-10-20 | Samsung Electronics Co., Ltd. | Method, apparatus, and system for processing virtual world |
KR101358999B1 (en) * | 2011-11-21 | 2014-02-07 | (주) 퓨처로봇 | method and system for multi language speech in charactor |
KR102522867B1 (en) * | 2017-12-18 | 2023-04-17 | 주식회사 엘지유플러스 | Method and apparatus for communication |
JP6776409B1 (en) * | 2019-06-21 | 2020-10-28 | 株式会社コロプラ | Programs, methods, and terminals |
CN112669420A (en) * | 2020-12-25 | 2021-04-16 | 江苏匠韵文化传媒有限公司 | 3D animation production method and calculation production device |
CN114928755B (en) * | 2022-05-10 | 2023-10-20 | 咪咕文化科技有限公司 | Video production method, electronic equipment and computer readable storage medium |
CN115222856B (en) * | 2022-05-20 | 2023-09-26 | 一点灵犀信息技术(广州)有限公司 | Expression animation generation method and electronic equipment |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100856786B1 (en) * | 2006-07-27 | 2008-09-05 | 주식회사 와이즌와이드 | System for multimedia naration using 3D virtual agent and method thereof |
KR20080018408A (en) * | 2006-08-24 | 2008-02-28 | 한국문화콘텐츠진흥원 | Computer-readable recording medium with facial expression program by using phonetic sound libraries |
KR100912877B1 (en) * | 2006-12-02 | 2009-08-18 | 한국전자통신연구원 | A mobile communication terminal having a function of the creating 3d avata model and the method thereof |
2010
- 2010-01-06: Application KR20100000837A filed in KR; patent granted as KR101089184B1 (active, IP right grant)
2011
- 2011-01-06: Application PCT/KR2011/000071 filed as WO2011083978A2 (active, application filing)
Also Published As
Publication number | Publication date |
---|---|
WO2011083978A3 (en) | 2011-11-10 |
WO2011083978A2 (en) | 2011-07-14 |
KR20110081364A (en) | 2011-07-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022048403A1 (en) | Virtual role-based multimodal interaction method, apparatus and system, storage medium, and terminal | |
US8224652B2 (en) | Speech and text driven HMM-based body animation synthesis | |
KR101089184B1 (en) | Method and system for providing a speech and expression of emotion in 3D charactor | |
KR102035596B1 (en) | System and method for automatically generating virtual character's facial animation based on artificial intelligence | |
CN106653052B (en) | Virtual human face animation generation method and device | |
US20120130717A1 (en) | Real-time Animation for an Expressive Avatar | |
EP1269465B1 (en) | Character animation | |
KR102116309B1 (en) | Synchronization animation output system of virtual characters and text | |
Naert et al. | A survey on the animation of signing avatars: From sign representation to utterance synthesis | |
CN111145777A (en) | Virtual image display method and device, electronic equipment and storage medium | |
CN113781610A (en) | Virtual face generation method | |
US20150187112A1 (en) | System and Method for Automatic Generation of Animation | |
Fernández-Baena et al. | Gesture synthesis adapted to speech emphasis | |
CN112734889A (en) | Mouth shape animation real-time driving method and system for 2D character | |
KR20080018408A (en) | Computer-readable recording medium with facial expression program by using phonetic sound libraries | |
Massaro et al. | A multilingual embodied conversational agent | |
Karpov et al. | Multimodal synthesizer for Russian and Czech sign languages and audio-visual speech | |
KR100813034B1 (en) | Method for formulating character | |
JP2003058908A (en) | Method and device for controlling face image, computer program and recording medium | |
Verma et al. | Animating expressive faces across languages | |
Lacerda et al. | Enhancing Portuguese Sign Language Animation with Dynamic Timing and Mouthing | |
Yang et al. | A multimodal approach of generating 3D human-like talking agent | |
Wang et al. | A real-time text to audio-visual speech synthesis system. | |
Fagel | Merging methods of speech visualization | |
Safabakhsh et al. | AUT-Talk: a farsi talking head |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| A201 | Request for examination | |
| E902 | Notification of reason for refusal | |
| E701 | Decision to grant or registration of patent right | |
| GRNT | Written decision to grant | |
| FPAY | Annual fee payment | Payment date: 20141128; year of fee payment: 4 |
| FPAY | Annual fee payment | Payment date: 20151127; year of fee payment: 5 |
| FPAY | Annual fee payment | Payment date: 20161128; year of fee payment: 6 |
| FPAY | Annual fee payment | Payment date: 20171124; year of fee payment: 7 |
| FPAY | Annual fee payment | Payment date: 20181128; year of fee payment: 8 |
| FPAY | Annual fee payment | Payment date: 20191128; year of fee payment: 9 |