JP2004506996A

JP2004506996A - Apparatus and method for generating synthetic face image based on form information of face image

Info

Publication number: JP2004506996A
Application number: JP2002521224A
Authority: JP
Inventors: リー，ソング，ファン; リュ，チャン−ユ; ファン，ボン，ウー
Original assignee: バーチャルメディア　カンパニー　リミテッド
Priority date: 2000-08-22
Filing date: 2001-07-07
Publication date: 2004-03-04
Also published as: KR100407111B1; KR20020015642A; WO2002017234A1; CN1447955A; AU2001269581A1; KR20000064110A

Abstract

本発明は、入力された顔映像の形態情報に基づいて新しい合成顔映像を生成する装置およびその方法に関する。本発明による入力された顔映像の形態情報に基づいて新しい顔映像を合成する装置は、使用者インタフェース装置から伝送される顔映像情報から所定の参照映像に対する変形場として表示される入力顔映像の形態情報および参照映像にマッピングされた入力映像の色相または明暗の情報である質感情報を抽出し、使用者制御命令に応じて入力顔映像の形態情報を用いて映像データベースに予め格納され参照映像と同じ形態を有する種々の顔映像を入力顔映像の形態情報が反映された顔映像に合成する。本発明によれば、参照映像の形態情報と同じ形態情報を有し種々の質感情報を有する映像と抽出された入力顔映像の形態情報を用いて入力映像の状態には無関係に自然でかつ高品質の新しい映像を合成することができる。The present invention relates to an apparatus and a method for generating a new synthetic face image based on morphological information of an input face image. An apparatus for synthesizing a new face image based on morphological information of an input face image according to the present invention is configured to convert an input face image displayed as a deformation field for a predetermined reference image from face image information transmitted from a user interface device. Extract the texture information that is the hue or lightness / darkness information of the input video mapped to the morphological information and the reference video, and store the reference video stored in advance in the video database using the morphological information of the input facial video according to the user control command. Various face images having the same form are combined with a face image reflecting the form information of the input face image. According to the present invention, a video having the same morphological information as the morphological information of the reference video and having various texture information and the morphological information of the extracted input face video are used to obtain a natural and high image regardless of the state of the input video. A new quality image can be synthesized.

Description

〔技術分野〕
本発明は、合成された顔映像を生成する装置およびその方法に関し、特に、入力された顔映像の形態情報に基づいて新しい合成顔映像を生成する装置およびその方法に関する。
【０００１】
一般に、顔映像は、個人の特徴を一番良く示し、対話を自然でかつ円滑にする媒介体として利用されている。このような顔映像の応用分野としては、出入統制・セキュリティシステム、犯罪者検索・モンタージュ作成システム、コンピュータインタフェース、アニメ、ゲームなどが挙げられる。顔映像の応用分野において、顔映像の合成技術を用いる体表的なものとしては、キャラクター映像生成とメークアップデザインがある。
【０００２】
キャラクター映像の一種である顔映像のカリカチュアは、特定の人物の顔特徴をとらえて作られる。従って、顔映像のカリカチュアは、漫画製作または娯楽番組の製作に利用されるだけでなく、自分を代表する象徴またはアイコンなどのような用途として活用し得る。また、パソコン通信や電子メールにおける自分の固有な署名（Ｓｉｇｎａｔｕｒｅ）や、バーチャルリアリティでの使用者のアバター（Ａｖａｔａｒ）などとして活用することができる。
【０００３】
〔背景技術〕
このようなカリカチュアを生成するため、従来、専門画家が直接手作業でカリカチュアを描く方法、デジタルフィルタを用いて顔映像を自動処理する方法などが行われている。ここで、デジタルフィルタを用いた映像処理技法は、入力映像に対して適切な効果を与えるフィルタの組み合せを用いて水彩画風または木炭画風の映像効果を加味し、入力映像に対して全体として手作業で生成したカリカチュアの感じを与える方法である。
【０００４】
ところで、専門画家が直接カリカチュアを描く方法は、自然でかつ完成度の高いものが得られるが、手作業で行っているため、相当な時間がかかるという点、また、一様な品質を維持し難いという点から、制限的な状況にのみ適用し得る方法であるといえる。デジタルフィルタを用いた映像処理技法は、照明や背景などが制限された環境で撮影した映像に対して適用される方法であるため、背景とオブジェクトとの区分のない単なる二次元映像に対しては映像の照明または他の環境変化によって出力映像の質が大きく変化してしまい、この変化を補完する適切な方法が求められている。また、従来のカリカチュアの生成方法によれば、オブジェクトに対する形態情報を別に生成していないため、生成されたカリカチュア上の顔特徴を誇張したり表情を変化させるなどの修正作業が非常に複雑であり、顔映像の復元または三次元アバターなどへの拡張のような作業が殆どできないという問題点もあった。
【０００５】
メークアップデザインは、従来、消費者が、雑誌などからメークアップの施されたモデルの写真を見て、間接的に自分のスタイルなどを決定する方式で行われている。近年、コンピュータを用いるメークアップデザイン方法が紹介されている。この方法は、サンプルとしてのモデルの映像に製品を様々に適用してみるための方法であって、消費者が自分の顔映像に直接メークアップすることで得られる自然なメークアップの効果が得られていない。即ち、同一の色相を有する製品であっても、周辺の照明、顔の形態的な特徴による陰影や反射光などのような複雑な条件によって異になるため、消費者が、モデルの映像に適用されたメークアップ効果を見て自分の顔映像に適用されるメークアップ効果を自然に類推することはほとんどできないという問題点があった。
【０００６】
〔発明の開示〕
本発明は、上記の問題点を解決するためになされたもので、本発明の目的は、入力された顔映像から顔の形態情報を抽出し、この抽出された情報に基づいて顔映像を再合成する方法を用いて、使用者は、一層自然でかつ精巧なカリカチュア映像を得ることができ、また、使用者が自分の顔映像上で行われたメークアップデザインの映像を予め見ることができ、さらに、使用者が合成映像に直接種々のアクセサリをつけたり合成映像を変形することが容易であり、その結果、映像を使用者がリアルタイムで確認することができる、顔映像の形態情報に基づく合成顔映像の生成装置およびその方法を提供することにある。
【０００７】
上記の目的を達成するため、本発明による入力された顔映像の形態情報に基づいて新しい顔映像を合成する装置は、顔映像情報および使用者制御命令の入力を受けて映像処理装置に伝送し、映像処理装置で合成された顔映像情報を伝送され前記使用者制御命令に応じて出力または格納する使用者インタフェース装置、および、前記使用者インタフェース装置から伝送される顔映像情報から所定の参照映像に対する変形場として表示される入力顔映像の形態情報および参照映像にマッピングされた入力映像の色相または明暗の情報である質感情報を抽出し、前記使用者制御命令に応じて映像データベースに予め格納され参照映像と同じ形態を有する質感映像から選択された質感映像または前記選択された質感映像と前記抽出された質感情報が反映された質感映像の重み付けで生成された映像を前記入力顔映像の形態情報を用いて変換することで、合成顔映像を生成する映像処理装置を含むことを特徴とする。
【０００８】
本発明の他の目的を達成するため、入力された顔映像の形態情報に基づいて新しい顔映像を合成する方法において、（ａ）入力された顔映像情報から所定の参照映像に対する変形場として表示される入力顔映像の形態情報および参照映像にマッピングされた入力映像の色相または明暗の情報である質感情報を抽出するステップ、および、（ｂ）使用者制御命令に応じて、映像データベースに格納され参照映像と同じ形態を有する質感映像のうちから選択された質感映像または前記選択された質感映像と前記抽出された質感情報が反映された質感映像の重み付けで生成された映像を前記入力顔映像の形態情報を用いて変換することで、合成顔映像を生成するステップを含むことを特徴とする。
【０００９】
〔発明を実施するための最良の形態〕
以下、本発明の好ましい実施例を添付の図面を参照して詳細に説明する。
【００１０】
図１Ａおよび図１Ｂは、それぞれ本発明による顔映像の形態情報に基づく合成顔映像生成装置の第１実施例（１）および第２実施例（４０）を示す構成図であって、図１Ａに示された本発明の第１実施例（１）は、少なくとも１つ以上の使用者インタフェース装置（１０ａ、１０ｂ）、通信網（２０）および映像処理装置（３０）で構成され、ネットワーク環境で動作し、図１Ｂに示された本発明の第２実施例（４０）は、使用者インタフェース装置（５０）および映像処理装置（６０）で構成された単一のコンピュータシステムで動作する。
【００１１】
本発明の第１の実施例（１）の構成要素である使用者インタフェース装置（１０ａ、１０ｂ）と映像処理装置（３）、また、本発明の第２の実施例（４０）は、それぞれ図２に示したように少なくとも１つ以上の中央処理装置（ＣＰＵ）（７４）とメモリ装置（７３）を備えたコンピュータ（７２）、入力装置（７５）および出力装置（７６）を含むコンピュータシステム（７０）で構成される。コンピュータシステム（７０）の構成要素は、少なくとも１つ以上のバス構造（７７）によって相互連結されている。
【００１２】
図示された中央処理装置（７４）は、算術演算および論理演算を行う算術・論理演算装置（ＡＬＵ）（７４１）、データと命令語を仮格納するレジスタセット（７４２）およびコンピュータシステム（７０）の動作を制御する制御装置（７４３）を備える。本発明で使用される中央処理装置（７４）は、特定の製造社によって製造された特定の構造に限定されるものではないため、上記のような基本的な構成を有していれば、全ての形態のプロセッサが使用可能である。
【００１３】
メモリ装置（７３）は、高速のメインメモリ（７３１）とデータの長期格納のために使用される補助メモリ（７３２）を備える。メインメモリ（７３１）は、ＲＡＭ（ＲａｎｄｏｎＡｃｃｅｓｓＭｅｍｏｒｙ）とＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）半導体チップで構成され、補助メモリ（７３２）は、フロッピーディスク、ハードディスク、ＣＤ−ＲＯＭ、フラッシュメモリ、そして、電気、磁気、光または他の記録媒体を用いてデータを格納する装置で構成される。また、メインメモリ（７３１）は、ディスプレイ装置を介して映像をディスプレイするためのビデオディスプレイメモリを備えることもできる。本発明に関する平均的な知識を有する者であれば、メモリ装置（７３）が、種々の格納能力を有する種々の代替可能な構成要素を包含し得ることが容易に理解できるだろう。
【００１４】
入力装置（７５）には、キーボード、マウス、物理的変換機（例えば、マイク）などが含まれ、出力装置（７６）には、ディスプレイ、プリンタ、物理的変換機（例えば、スピーカ）などが含まれる。また、ネットワークインタフェースまたはモデムのような装置が、入力・出力装置として使用できる。
【００１５】
コンピュータシステム（７０）は、運営体制と少なくとも１つ以上の応用プログラムを備えている。運営体制は、コンピュータシステム（７０）の動作と資源の割当てを制御する一連のソフトウェアであり、応用プログラムは、運営体制を介して利用可能なコンピュータ資源を使用して使用者の要求する作業を行う一連のソフトウェアである。これらのいずれも、図示されたメモリ装置（７３）に格納される。結局、本発明によるコンピュータ基盤のキャラクター自動生成装置は、コンピュータシステム（７０）およびコンピュータシステム（７０）に設置されて動作する１つ以上の応用プログラムとして具現される。
【００１６】
図１Ａに示された本発明の第１の実施例（１）は、図１Ｂに示された第２の実施例（４０）に比べて通信網（２０）を通じたデータ伝送のための通信処理部（１４、３１）をさらに含んでいるのを除き、他の機能は同様であるため、以下、第１の実施例（１）を基準にして説明する。
【００１７】
図１Ａを参照して、少なくとも１つ以上の使用者インタフェース装置（１０ａ、１０ｂ）は、それぞれ使用者から顔映像情報と使用者制御命令の入力を受け、使用者制御命令に応じて合成された映像を伝送されて補正、格納または出力を行う装置であって、映像情報入力部（１１）と、使用者命令入力部（１２）、入出力制御部（１３）、通信処理部（１４）、映像補正部（１５）、映像格納部（１６）および出力部（１７）を備える。
【００１８】
映像情報入力部（１１）は、使用者から顔映像情報を入力される装置であって、例えば、スキャナまたはデジタルカメラなどのような装置が挙げられる。また、映像情報入力部（１１）は、種々の角度で撮影された映像を入力されるための多数のカメラおよび照明調節装置のようなカメラ補助装置を含んで構成することができる。なお、本発明の構成要素としての映像情報入力部（１１）は、機能的な側面から考慮する必要があるため、図２の入力装置（７５）だけでなく、顔映像情報を予め格納している補助メモリ（７３２）も含む広い意味で解釈される必要がある。
【００１９】
使用者命令入力部（１２）は、使用者から使用者制御命令（例えば、使用者情報、顔映像合成制御信号、映像補正制御信号など）の入力を受ける装置であって、使用者が選択し入力できるキーボード、マウス、タッチスクリーンなどのような装置が挙げられる。
【００２０】
入出力制御部（１３）は、映像情報入力部（１１）を介して入力された顔映像情報および使用者命令入力部（１２）を介して入力された使用者制御命令を通信処理部（１４）を介して映像処理装置（３０）に伝送するように制御し、映像処理装置（３０）で使用者制御命令に応じて新しく合成された映像情報を通信処理部（１４）を介して伝送され、補正、格納または出力するように制御する装置である。
【００２１】
通信処理部（１４）は、入出力制御部（１３）に連結され、通信網（２０）を通じて映像処理装置（３０）とデータを送受信する装置であって、例えば、インターネットを通じて映像情報を含むデータを送受信するイーサーネット（Ｅｔｈｅｒｎｅｔ（登録商標））カード、内部連結のためのシリアル／パラレルポート、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）ポートまたはＩＥＥＥ１３９４ポートなどのような装置が挙げられる。
【００２２】
映像補正部（１５）は、入出力制御部（１３）に連結され、映像処理装置（３０）で新しく合成され伝送された映像情報を使用者命令入力部（１２）を介して入力された使用者制御命令に応じて映像の角度、大きさおよび質感などを補正する。
【００２３】
映像格納部（１６）は、図２の補助メモリ（７３２）に対応する装置であって、映像処理装置（３０）で新しく合成され伝送された映像情報または映像補正部（１５）で補正された映像情報を入出力制御部（１３）の制御によって格納する。
【００２４】
また、出力部（１７）は、図２の出力装置（７６）に対応する装置であって、入出力制御部（１３）の制御によって映像処理装置（３０）が新しい映像を合成する時に要求される使用者制御命令の入力を受けるための使用者インタフェース画面情報をディスプレイし、映像処理装置（３０）で新しく合成され伝送された映像情報または映像補正部（１５）によって補正された映像情報をディスプレイまたはプリントする。
【００２５】
図１Ａに示された本発明の第１の実施例（１）において少なくとも１つ以上の使用者インタフェース装置（１０ａ、１０ｂ）と映像処理装置（３０）との間でデータを伝送する通信網（２０）としては、その実施形態によって有・無線インターネット、近距離ネットワーク、専用線などの多様な形態のネットワークが挙げられる。
【００２６】
図１Ａに示された本発明の第１の実施例における映像処理装置（３０）は、少なくとも１つ以上の使用者インタフェース装置（１０ａ、１０ｂ）から伝送される映像情報を処理し、伝送された使用者制御命令に応じて当該映像情報に基づいて新しい映像の合成を行ってから対応する使用者インタフェース装置に伝送する装置であって、通信処理部（３１）、映像処理部（３２）および映像データベース（３３）を含んで構成される。
【００２７】
通信処理部（３１）は、通信網（２０）を介して１つ以上の使用者インタフェース装置（１０ａ、１０ｂ）とデータを送受信する装置であって、使用者インタフェース装置を構成する通信処理部（１４）に対応してインターネットを介して映像情報を含むデータを送受信するイーサーネット（Ｅｔｈｅｒｎｅｔ（登録商標））カード、内部連結のためのシリアル／パラレルポート、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）ポートまたはＩＥＥＥ１３９４ポートなどのような装置が挙げられる。
【００２８】
映像処理部（３２）は、使用者インタフェース装置（１０ａ、１０ｂ）から伝送された顔映像情報から参照映像の形態情報に対する変形場として表示される入力顔映像の形態情報および参照映像にマッピングされた入力映像の色相または明暗情報である質感情報を抽出し、使用者インタフェース装置（１０ａ、１０ｂ）から伝送される使用者制御命令による使用者の要求を分析し、分析された使用者の要求に応じて抽出された入力顔映像の形態情報、抽出された質感情報および映像データベース（３３）に格納された種々の映像を用いて新しい顔映像を合成する装置であって、顔情報抽出部（３２１）、顔映像合成部（３２２）、部分映像代替部（３２３）およびアクセサリ映像追加部（３２４）を備える。
【００２９】
顔情報抽出部（３２１）は、使用者インタフェース装置（１０ａ、１０ｂ）から伝送される顔映像情報から参照映像に対する変形場として表示される入力顔映像の形態情報と、これを用いて参照映像にマッピングされた入力映像の色相または明暗の情報である質感情報とを抽出する。
【００３０】
顔映像合成部（３２２）は、使用者制御命令に応じて映像データベース（３３）に格納された質感映像のうちから選択された質感映像、または、選択された質感映像と顔情報抽出部（３２１）から抽出された質感情報が反映された質感映像の重み付けで生成される映像を顔情報抽出部（３２１）から抽出される入力映像の形態情報を用いて変換することで、新しい顔映像を合成する。
【００３１】
部分映像代替部（３２３）は、顔映像合成部（３２２）によって合成される新しい顔映像の一部または全体領域を、映像データベース（３３）に格納された標本映像のうち類似度が一番高いものと代替する。
【００３２】
アクセサリ映像追加部（３２４）は、映像データベース（３３）に格納されたアクセサリ映像のうち使用者制御命令に応じて選択されたアクセサリ映像を顔映像合成部（３２２）によって合成された顔映像に追加する。
【００３３】
映像データベース（３３）は、映像処理部（３２）で入力顔映像を処理するために要求される映像情報を予め格納するが、顔模型データベース（３３１）、付加映像データベース（３３２）、標本映像データベース（３３３）、メークアップ映像データベース（３３４）およびアクセサリ映像データベース（３３５）などを備える。
【００３４】
顔模型データベース（３３１）には、顔情報抽出部（３２１）が入力顔映像から参照映像に基づく形態情報および質感情報を抽出するために使用される各種の情報（多数の模型顔映像によって予め求められた形態平均、質感平均、形態固有ベクター、質感固有ベクターなど）が格納される。顔模型データベース（３３１）に格納される各種の情報については、図４に関する説明において詳述する。
【００３５】
付加映像データベース（３３２）には、参照映像と同じ形態を有し、アニメスタイル、スケッチスタイル、水彩画スタイルなどのような質感情報として表現される色々なスタイルのカリカチュア映像に関する情報が格納される。
【００３６】
標本映像データベース（３３３）には、顔映像の特定の部位別に形態変化や表情変化などを含む種々のカリカチュア標本映像に関する情報が格納される。
【００３７】
メークアップ映像データベース（３３４）には、参照映像と同じ形態を有して種々のサンプルメークアップを表現する質感情報としてのメークアップ映像に関する情報が格納される。
【００３８】
アクセサリ映像データベース（３３５）には、合成された顔映像に追加する、眼鏡、ヘアスタイル、帽子、イヤリング、体形などの映像に関する情報が格納される。
【００３９】
図１Ａに示されたように、本発明の第１の実施例（１）においては、少なくとも１つ以上の使用者インタフェース装置（１０ａ、１０ｂ）と１つの映像処理装置（３０）とが相互通信処理部（１４、３１）と通信網（２０）を介して接続されるように構成されているが、図１Ｂに示された本発明の第２の実施例（４０）と同様に、使用者インタフェース装置（５０）と映像処理装置（６０）とが１つのコンピュータシステム（７０）内に一体化されて運営されることも可能である。
【００４０】
以下、図３を参照して、本発明による顔映像に基づくキャラクター映像自動生成装置（１、４０）の基本的な動作過程を説明する。
【００４１】
先ず、映像処理部（３２、６２）の顔情報抽出部（３２１、６２１）は、使用者インタフェース装置（１０ａ、１０ｂ、５０）から入力された顔映像を伝送されて予め決められた参照映像に対する入力顔映像の形態情報および参照映像にマッピングされた入力映像の色相または明暗情報である質感情報を抽出する（Ｓ１０、Ｓ１１）。
【００４２】
次いで、映像処理部（３２、６２）の顔映像合成部（３２２、６２２）は、使用者インタフェース装置（１０ａ、１０ｂ、５０）から入力された使用者制御命令（顔映像合成制御信号）に応じて参照映像に基づく質感情報および顔情報抽出部（３２１、６２１）によって抽出された入力顔映像の形態情報を用いて新しい顔映像を合成する（Ｓ１２）。即ち、顔映像合成部（３２２、６２２）は、抽出された入力顔映像の形態情報を使用して入力顔映像の形状を復元し、抽出された入力顔映像の形態情報を使用して復元された入力顔映像の形状に抽出された質感情報をワーピング（ｗａｒｐｉｎｇ）することにより、使用者の顔映像の合成を行う。ここで、顔映像の合成時、使用する参照映像に基づく質感情報を適切に変更または代替することによって、入力される顔映像の形態を有する種々の新しい合成映像が生成される。
【００４３】
このように合成される顔映像は、使用者インタフェース装置（１０ａ、１０ｂ、５０）に伝送され出力部（１７、５７）によってディスプレイされ、使用者インタフェース装置（１０ａ、１０ｂ、５０）の使用者命令入力部（１２、５２）は、ディスプレイされた顔映像において形態情報を変更するか否かに関する使用者制御命令を使用者から入力される（Ｓ１３）。
【００４４】
使用者制御命令によって形態情報の変形が決定された場合、入力映像の形態情報は、形態情報変更のための使用者制御命令（例えば、再合成されてディスプレイされる顔映像から特定の部位をマウスでドラッグして拡大・縮小するなどの部分領域別の変形またはスライドバーを用いて顔全体を誇張するなどの全体的な変形を指示する制御信号）に応じて変形され、ステップＳ１２に進み、新しい顔映像が合成される。
【００４５】
ステップＳ１３において使用者が形態情報の変形を希望しない場合、追加的な使用者命令に応じてアクセサリ映像追加部（３２４、６２４）がステップＳ１２で合成された顔映像に映像データベース（６３）に格納された種々のアクセサリ映像を追加し、または、部分映像代替部（３２３、６２３）で、ステップＳ１２で合成された顔映像の特定部位を映像データベース（６３）に格納された種々の標本映像に代替することで、様々な付加効果が付与される（Ｓ１４）。
【００４６】
次いで、映像処理装置（３０、６０）で合成された顔映像は、使用者インタフェース装置（１０ａ、１０ｂ、５０）に伝送されて使用者にディスプレイされ、映像補正部（１５、５５）は、使用者命令入力部（１２、５２）を介して入力された使用者制御命令（映像補正制御信号）に応じて合成された顔映像の最終補正を行う（Ｓ１５）。映像補正部（１５、５５）によって補正された合成顔映像は、映像格納部（１６、５６）に格納されるか、出力部（１７、５７）によってディスプレイまたはプリントされる（Ｓ１６）。
【００４７】
図３の顔情報抽出ステップ（Ｓ１１）は、入力された顔映像から顔模型を基に形態情報Ｓ_ｉｎと質感情報Ｔ_ｉｎとを得る過程と要約される。
【００４８】
本発明において、顔映像の形態情報は、参照映像に関する変形場（ｄｅｆｏｒｍａｔｉｏｎｆｉｅｌｄ）と表示され、顔映像の質感情報は、参照映像にマッピングされた入力映像の色相または明暗情報として表示される。即ち、顔映像の形態情報Ｓは、参照映像上の各点ｐ_ｉ（ｉ＝１、…、ｎ；ここで、ｎは参照映像において予め決められた点の個数）と顔映像との対応点に対する平面座標上の位置の差と、顔映像の質感情報Ｔは、参照映像上の各点ｐ_ｉ（ｉ＝１、…、ｎ）に対する入力映像の対応点の色相または明暗の値と定義される。本発明の一実施例で使用される参照映像は、形態平均と質感平均を用いて合成したものであるが、本発明で使用され得る参照映像がこれに限定されるのではなく、予め用意されているｍ個の顔映像のうちのいずれかの映像を参照映像として使用することができる。
【００４９】
顔模型データベース（３３１、６３１）に格納される顔模型は、次のように事前に求められる。先ず、予め用意されているｍ個の顔映像からそれぞれ参照映像に基づいて形態情報Ｓ_ｊ（ｊ＝１、…、ｍ）と質感情報Ｔ_ｊ（ｊ＝１、…、ｍ）を抽出する。次いで、ｍ個の形態情報Ｓ_ｊ（ｊ＝１、…、ｍ）の各点ｐ_ｉ（ｉ＝１、…、ｎ）別の平均値からなる形態平均
【００５０】
【数１】

【００５１】
、ｍ個の質感情報Ｔ_ｊ（ｊ＝１、…、ｍ）の各点ｐ_ｉ（ｉ＝１、…、ｎ）別の平均値からなる質感平均
【００５２】
【数２】

【００５３】
を求め、形態の差
【００５４】
【数３】

【００５５】
（ｊ＝１、…、ｍ）の共分散Ｃ_Ｓと、質感の差
【００５６】
【数４】

【００５７】
（ｊ＝１、…、ｍ）の共分散Ｃ_Ｔを求める。
【００５８】
このように求められた値を主成分分析（ｐｒｉｎｃｉｐａｌｃｏｍｐｏｎｅｎｔａｎａｌｙｓｉｓ）処理してｍ個の顔模型に対する共分散の形態固有ベクターｓ_ｉ（ｉ＝１、…、ｍ−１）および質感固有ベクターｔ_ｉ（ｉ＝１、…、ｍ−１）を得ることができる。これに基づいて、顔映像は、形態固有ベクターｓ_ｉ（ｉ＝１、…、ｍ−１）および質感固有ベクターｔ_ｉ（ｉ＝１、…、ｍ−１）を基にして下記の式１のように表現することができる。
【００５９】
【数５】

【００６０】
（式中、
【００６１】
【数６】

【００６２】
であり、ｍは模型の個数である。）
このような過程を通じて形態平均
【００６３】
【数７】

【００６４】
、質感平均
【００６５】
【数８】

【００６６】
、形態固有ベクターｓ_ｉ（ｉ＝１、…、ｍ−１）および質感固有ベクターｔ_ｉ（ｉ＝１、…、ｍ−１）は、顔模型データベース（３３１、６３１）に格納され、入力された顔映像の形態情報および質感情報の抽出のために使用される。
【００６７】
以下、図４を参照して図３の顔情報抽出ステップ（Ｓ１１）をより詳しく説明する。図４の顔映像正規化ステップ（Ｓ１１１）では、入力された顔映像に対して所定の特徴点（例えば、両目の中間点および唇の中間点）を抽出し、抽出された入力顔映像の特徴点の位置が参照映像の特徴点の位置に一致するように入力顔映像を上下左右に移動してその大きさを調節する。このような映像正規化の過程は、所定のソフトウェアによって自動的に行われるか、使用者から制御命令を受け手動的に行うこともできるが、その細部的な過程は、本発明の範囲から逸脱するため、詳細な説明は省略する。
【００６８】
形態情報抽出ステップ（Ｓ１１２）では、顔映像正規化ステップ（Ｓ８）で正規化された入力顔映像と参照映像（または、参照映像と同じ形態を有する合成された質感推定映像
【００６９】
【数９】

【００７０】
）を対象にして階層型勾配法光流アルゴリズム（ｈｉｅｒａｒｃｈｉｃａｌ、ｇｒａｄｉｅｎｔ−ｂａｓｅｄｏｐｔｉｃａｌｆｌｏｗａｌｇｏｒｏｔｈｍ）［ＬｕｃａｓａｎｄＫａｎａｄｅ］を適用して参照映像に基づく形態情報
【００７１】
【数１０】

【００７２】
（正規化された入力顔映像と参照映像との対応点に対する位置の差の値）を推定する。階層型勾配法光流アルゴリズムは、類似した２つの映像に対する明るさ値（ｉｎｔｅｎｓｉｔｙ）を用いて２つ映像間の対応関係を光流で示す機能を行うアルゴリズムであって、本発明の属する技術分野で広く知られているため、具体的な説明は省略する。
【００７３】
形態情報推定ステップ（Ｓ１１２）で使用される階層型勾配法光流アルゴリズムで得られた形態情報には、入力された顔映像の照明や影などによるエラー値が含まれることができる。従って、形態情報補正ステップ（Ｓ１１３）では、形態情報推定ステップ（Ｓ１１２）で推定された形態情報
【００７４】
【数１１】

【００７５】
に対して形態固有ベクターｓ_ｉ（ｉ＝１、…、ｍ−１）に基づく線形分解（ｌｉｎｅａｒｄｅｃｏｍｐｏｓｉｔｉｏｎ）を行った後、さらに線形重畳（ｌｉｎｅａｒｓｕｐｅｒｐｏｓｉｔｉｏｎ）を行うことによって、エラー値の補正された形態情報
【００７６】
【数１２】

【００７７】
を得る。このとき、変形の自由度を高めるため、下記の式２によって求められるように、形態情報推定ステップ（Ｓ１１２）で推定された形態情報
【００７８】
【数１３】

【００７９】
と形態情報補正ステップ（Ｓ１１３）で補正された形態情報
【００８０】
【数１４】

【００８１】
との重み付けＳ_ｉｎ−１結果値として使用することが好ましい。
【００８２】
【数１５】

【００８３】
（式中、
【００８４】
【数１６】

【００８５】
）
逆ワーピングステップ（Ｓ１１４）では、モデル基盤の形態情報補正ステップ（Ｓ１１３）を通じて得られた形態情報
【００８６】
【数１７】

【００８７】
を用いて入力顔映像を参照映像の形態に変形する。この過程を「逆ワーピング（ＢａｃｋｗａｒｄＷａｒｐｉｎｇ）」という。
【００８８】
質感情報変形ステップ（Ｓ１１５）では、逆ワーピングされた映像の質感情報を質感固有ベクターｔ_ｉ（ｉ＝１、…、ｍ−１）に基づいて線形分解（ｌｉｎｅａｒｄｅｃｏｍｐｏｓｉｔｉｏｎ）を行った後、さらに線形重畳（ｌｉｎｅａｒｓｕｐｅｒｐｏｓｉｔｉｏｎ）を行うことで、入力顔映像に関する質感情報
【００８９】
【数１８】

【００９０】
を得る。
【００９１】
次いで、形態情報推定ステップ（Ｓ１１２）で正規化された入力顔映像を逆ワーピングステップ（Ｓ１１４）で参照映像の形態に変形された入力顔映像に、参照映像を参照映像と同様な形態を有する質感映像に代替してステップＳ１１２からステップＳ１１６を繰り返すことで、
【００９２】
【数１９】

【００９３】
を求める。即ち、ｋ番目の繰返し過程において、ステップＳ１１２およびステップＳ１１３の
【００９４】
【数２０】

【００９５】
は、それぞれ
【００９６】
【数２１】

【００９７】
に代替され、ステップＳ１１４の
【００９８】
【数２２】

【００９９】
は、
【０１００】
【数２３】

【０１０１】
に代替される。また、ステップＳ１１５での
【０１０２】
【数２４】

【０１０３】
は、
【０１０４】
【数２５】

【０１０５】
に代替され、最後の繰返し過程において決定された
【０１０６】
【数２６】

【０１０７】
が最終的な入力顔映像の質感情報であるＴ_ｉｎとなる。このような繰返し過程は、
【０１０８】
【数２７】

【０１０９】
のｖｅｃｔｏｒｎｏｒｍである
【０１１０】
【数２８】

【０１１１】
が所定の閾値より少なくなるまで、または、一定の繰返し回数が所定の回数以上となるまで行われ、参照映像に基づく入力顔映像の形態情報Ｓ_ｉｎを得る（Ｓ１１７）。
【０１１２】
次いで、入力された顔映像は、参照映像に基づく入力顔映像の形態情報Ｓ_ｉｎおよび質感情報Ｔ_ｉｎを用いて復元され得る。即ち、参照映像に基づく入力顔映像の質感情報Ｔ_ｉｎを参照映像に基づく入力顔映像の形態情報Ｓ_ｉｎを用いて変形させることで、入力された顔映像が再合成される。
【０１１３】
このような合成映像の特性を活用し得る代表例として、顔映像のカリカチュア生成、メークアップデザインなどが挙げられる。顔映像のカリカチュアの生成方法は、さらに、顔映像の合成方法と標本映像の代替方法とに分けられる。
【０１１４】
以下、図５を参照して、本発明による顔映像に基づくキャラクター映像自動生成装置（１、４０）によって行われる顔映像合成方法でカリカチュア映像を生成する過程について説明する。
【０１１５】
先ず、映像処理部（３２、６２）の顔情報抽出部（３２１、６２１）は、使用者インタフェース装置（１０ａ、１０ｂ、５０）から入力された顔映像を伝送され予め決められた参照映像に基づく入力顔映像の形態情報Ｓ_ｉｎおよび質感情報Ｔ_ｉｎを抽出する（Ｓ２０、Ｓ２１）。
【０１１６】
次に、顔映像合成部（３２２、６２２）は、付加映像データベース（３３２、６３２）に格納された様々なスタイルのカリカチュア映像（例えば、アニメスタイル、スケッチスタイル、水彩画スタイルなどの映像）を使用者インタフェース装置（１０ａ、１０ｂ、５０）を介して使用者に提示し、使用者が希望のスタイルのカリカチュアを選択する（Ｓ２２）。このとき、付加映像データベース（３３２、６３２）に格納された様々なスタイルのカリカチュア映像は、参照映像と同じ形態を有するものである。
【０１１７】
顔映像合成ステップ（Ｓ２３）では、顔映像合成部（３２２、６２２）が使用者によって選択されたスタイルのカリカチュア映像または選択されたカリカチュア映像と入力顔映像の質感情報Ｔ_ｉｎが反映された映像の重み付けによって生成される映像を、入力顔映像の形態情報Ｓ_ｉｎと合成することによって、使用者自分の形態情報が反映されるカリカチュア映像を合成する。
【０１１８】
このように合成されたカリカチュア映像は、使用者インタフェース装置（１０ａ、１０ｂ、５０）に伝送され出力部（１７、５７）によってディスプレイされ、使用者インタフェース装置（１０ａ、１０ｂ、５０）の使用者命令入力部（１２、５２）は、使用者からディスプレイされるカリカチュア映像の形態情報を変更するか否かに関する使用者制御命令の入力を受ける（Ｓ２４）。
【０１１９】
使用者制御命令によって形態情報の変更が決定された場合、入力映像の形態情報Ｓ_ｉｎは、形態情報変更のための使用者制御命令（例えば、再合成されてディスプレイされる顔映像において特定の部位をマウスでドラッグして拡大または縮小するなどの部分領域別の変形またはスライドバーを用いて顔全体を誇張するなどの全体的な変形を指示する制御信号）に応じて変更され、ステップＳ２２に進み、新しいカリカチュア映像が合成される。
【０１２０】
アクセサリ追加ステップ（Ｓ２５）では、使用者制御命令に応じてアクセサリ映像追加部（３２４、６２４）がアクセサリ映像データベース（３３５、６３５）から様々なアクセサリ（例えば、眼鏡、ヘアスタイル、帽子、イヤリング、体形などの映像）を引き出してカリカチュア映像に追加する。アクセサリ映像追加部（３２４、６２４）がアクセサリ映像を追加する時、ステップＳ２１で抽出された顔映像の形態情報Ｓ_ｉｎを用いて自動に大きさおよび位置の調節を行うことで、より自然な結果が得られる。また、部分映像代替部（３２３、６２３）がカリカチュア映像の特定部位を標本映像データベース（３３３、６３３）から引き出した標本映像に代替することで、嬉しさ、悲しみ、怒りなどの表情を示すか、表情変化の過程を表現するアニメフレームを用いて動画像効果を果たすことができる。
【０１２１】
次いで、映像処理装置（３０、６０）で合成されたカリカチュア映像は、使用者インタフェース装置（１０ａ、１０ｂ、５０）に伝送されて使用者にディスプレイされ、映像補正部（１５、５５）は、使用者命令入力部（１２、５２）を介して入力される使用者制御命令（映像補正制御信号）に応じてカリカチュア映像の最終補正を行う（Ｓ２６）。映像補正部（１５、５５）で補正されたカリカチュア映像は、映像格納部（１６、５６）に格納されるか、出力部（１７、５７）によってディスプレイまたはプリントされる（Ｓ２７）。
【０１２２】
このように得られたカリカチュア映像は、直に特定の用途で使用され、または、カリカチュアの製作時に下図として使用され、手動処理時の生産性を高めることができる。
【０１２３】
以下、図６を参照して、本発明による顔映像に基づくキャラクター映像自動生成装置（１、４０）によって行われる標本映像代替方法でカリカチュア映像を生成する過程について説明する。
【０１２４】
図６は、図５において類似度測定ステップ（Ｓ３５）と部分映像代替ステップ（Ｓ３６）とが追加されたもので、他のステップ（Ｓ３０乃至Ｓ３４、Ｓ３７乃至Ｓ３９）に関する重複説明は、省略する。即ち、図６に示された方法は、図５と同様な方法で合成されるカリカチュア映像の一部または全体領域を標本映像データベース（３３３、６３３）に予め用意されている標本映像に代替する方法である。
【０１２５】
標本映像データベース（３３３、６３３）に格納されている標本映像は、様々な顔映像に対して形態情報に関する統計分析を行った後、これを基に作られる。標本映像の構成方法は、標本映像の変形を許容する場合と、許容しない場合との２つに分けられる。
【０１２６】
先ず、標本映像の変形を許容する場合の標本映像の構成方法は、一定にして正規化された標本映像を構成し、ステップＳ３１で抽出された入力顔映像の形態情報Ｓ_ｉｎを基に、標本映像の大きさおよび模様を変形して代替する方式である。この方式は、入力顔映像の形態を充実に反映することができ、比較的少ない数の標本映像が要求されるというメリットがあるが、標本映像の変形による映像の歪みおよび全体的な画質低下が発生するというデメリットがある。
【０１２７】
その反面、標本映像の変形を許容しない場合の標本映像の構成方法は、予め構成された標本映像のみを用いてカリカチュア映像の一部または全体領域を代替して新しいカリカチュア映像を合成しているため、高画質の結果が得られるというメリットがあるが、入力顔映像の形態を充実に反映することが困難であり、発生可能な全ての変形に対する標本映像を予め用意する必要があるというデメリットがある。
【０１２８】
図６の類似度測定ステップ（Ｓ３５）で使用される類似度Ｄの測定方法は、下記の式３のように求められる。
【０１２９】
【数２９】

【０１３０】
上記の式３において、
【０１３１】
【数３０】

【０１３２】
であり、Ｃ_ｓｉ（ｉ＝１、…、ｎ）は、入力映像の形態情報を、Ｃ_ｒｉ（ｉ＝１、…、ｎ）は、標本映像の形態情報を、Ｃ_ｔｉ（ｉ＝１、…、ｎ）は、入力映像の質感情報と参照映像の質感情報Ｔ_ｒｅｆとの差を、Ｃ_ｑｉ（ｉ＝１、…、ｎ）は、標本映像の質感情報と参照映像の質感情報Ｔ_ｒｅｆとの差を示す。実行方法によっては、数学式３のＣ_ｓｉ、Ｃ_ｒｉ、Ｃ_ｔｉ、Ｃ_ｑｉに形態および質感情報をそのまま使用せずに、数学式１のように形態情報および質感情報を線形分解して得られる固有ベクターの係数を用いることもできる。このような場合、係数は、（ｍ−１）の次元を有する。
【０１３３】
部分映像代替部（３２３、６２３）は、入力映像と各標本映像との類似度Ｄを測定して（Ｓ３５）、この値が最小値となる標本映像で、カリカチュア映像の一部または全体領域を代替する（Ｓ３６）。
【０１３４】
標本映像代替方法でカリカチュア映像を生成する場合、低速の通信環境で生成されたカリカチュア映像の全体を伝送せず、代替される標本映像のコードを圧縮伝送することで、圧縮率を画期的に高めることができる。
【０１３５】
以下、図７を参照して、本発明による顔映像に基づくキャラクター映像自動生成装置（１、４０）によって行われるマークアップデザインの過程を説明する。
【０１３６】
図７は、図５におけるカリカチュア種類選択ステップ（Ｓ２２）、顔映像合成ステップ（Ｓ２３）および形態情報変更ステップ（Ｓ２４）の代わりに、メークアップ種類選択ステップ（Ｓ４２）、顔映像合成ステップ（Ｓ４３）、メークアップ修正ステップ（Ｓ４４）および満足可否確認ステップ（Ｓ４５）が追加されたものであり、他のステップ（Ｓ４０、Ｓ４１、Ｓ４６、Ｓ４７、Ｓ４８）に関する重複説明は省略する。
【０１３７】
メークアップ種類選択ステップ（Ｓ４２）では、顔映像合成部（３２２、６２２）がメークアップ映像データベース（３３４、６３４）に格納された種々のサンプルのメークアップ映像を使用者インタフェース装置（１０ａ、１０ｂ、５０）を介して使用者に提示し、使用者が希望のメークアップデザインを選択する。このとき、サンプルのメークアップ映像は、参照映像と同じ形態を有するものである。
【０１３８】
顔映像合成ステップ（Ｓ４３）では、顔映像合成部（３２２、６２２）が使用者によって選択されるメークアップ映像、または、選択されたサンプルのメークアップ映像とステップＳ４１で抽出された入力顔映像の質感情報Ｔ_ｉｎが反映された映像の重み付けで生成される映像を、ステップＳ４１で抽出された入力顔映像の形態情報Ｓ_ｉｎを用いて変換することによって、使用者自分の顔に選択されたメークアップデザインが適用された顔映像を合成することができる。
【０１３９】
このようにメークアップデザインの適用された顔映像は、使用者インタフェース装置（１０ａ、１０ｂ、５０）に伝送され出力部（１７、５７）によってディスプレイされ、使用者インタフェース装置（１０ａ、１０ｂ、５０）の使用者命令入力部（１２、５２）は、使用者からディスプレイされた顔映像上のメークアップの修正を指示する使用者制御命令の入力を受ける（Ｓ４４）。顔映像合成部（３２２、６２２）は、使用者制御命令に応じてメークアップデザインの適用された顔映像を修正し、修正された顔映像は、再度使用者インタフェース装置（１０ａ、１０ｂ、５０）に伝送され出力部（１７、５７）によってディスプレイされる。
【０１４０】
次いで、使用者インタフェース装置（１０ａ、１０ｂ、５０）の使用者命令入力部（１２、５２）は、使用者からディスプレイされた顔映像に対して満足するか否かを確認する使用者制御命令の入力を受ける（Ｓ４５）。このとき、使用者制御命令が「満足」の場合、アクセサリ追加ステップ（Ｓ４６）に進むが、そうでない場合は、メークアップ種類選択ステップ（Ｓ４２）に戻してメークアップデザインを再度行うようになる。
【０１４１】
以上、本発明の好適な実施例について述べてきたが、本発明の属する技術分野で通常の知識を有する者であれば、本発明の本質的な特性から逸脱しない範囲で変形された形態で実行することができることはいうまでもない。それで、上記の実施例は、限定的な観点でなく説明的な観点から考慮する必要がある。本発明の範囲は、前述の説明でなく請求の範囲に示されており、その同等な範囲内の全ての相違点は本発明に含まれるものと解釈される。
【０１４２】
〔産業上の利用可能性〕
以上のように、本発明によれば、第一、入力顔映像から参照映像に対する変形場として表示される入力顔映像の形態情報を抽出し、参照映像と同じ形態を有して種々の質感情報を有する映像と抽出された入力顔映像の形態情報を用いて入力された顔映像の形状が投影されながら入力映像の状態とは関係なく自然で高品質の新しい映像を合成することができるため、本発明は、キャラクター映像生成、仮想メークアップデザイン、犯罪者検索のためのモンタージュ作成、アニメおよび娯楽などの様々な分野で有用に活用されることができる。
【０１４３】
第二、カリカチュア映像生成においては、使用者の形態的特性を含んでいる様々なカリカチュアを直に生成することができ、生成されたカリカチュアは、一部または全体として誇張したり変形することができる。さらに、使用者の顔形態に関する情報を有しているため、複雑な映像補正過程を単純化し自動化することができ、キャラクター生成の生産性を向上させることができる。
【０１４４】
第三、メークアップデザインにおいては、使用者が簡便に自分のメークアップをデザインした後、確認することができ、一部または全体としての修正が容易に行われる。
【０１４５】
第四、合成された新しい映像に種々のアクセサリをつけた状態を直に確認することができ、形態情報に基づくバーチャルリアリティでのアバター、三次元顔映像の復元、ビデオチャットなどのように顔映像を必要とする数多い応用分野において容易に適用することができる。
【図面の簡単な説明】
【図１Ａ】
図１Ａは、本発明による合成顔映像生成装置の第１の実施例の機能的構成を示すブロック図である。
【図１Ｂ】
図１Ｂは、本発明による合成顔映像生成装置の第２の実施例の機能的構成を示すブロック図である。
【図２】
図２は、本発明の第１の実施例および第２の実施例が実行されるコンピュータシステムの装置的構成を示すブロック図である。
【図３】
図３は、本発明による合成顔映像の生成過程を示す基本的なフローチャートである。
【図４】
図４は、図３の顔情報抽出ステップをより詳しく示すフローチャートである。
【図５】
図５は、本発明による合成顔映像の生成装置によって行われる顔映像合成方法によるカリカチュア映像の生成過程を示すフローチャートである。
【図６】
図６は、本発明による合成顔映像の生成装置によって行われる標本映像代替方法によるカリカチュア映像の生成過程を示すフローチャートである。
【図７】
図７は、本発明による合成顔映像の生成装置によって行われるメークアップデザインの過程を示すフローチャートである。〔Technical field〕
The present invention relates to an apparatus and a method for generating a combined face image, and more particularly, to an apparatus and a method for generating a new combined face image based on input form information of the face image.
[0001]
In general, a facial image is used as a medium that shows the characteristics of an individual best and makes a dialogue natural and smooth. Application fields of such facial images include access control / security systems, criminal search / montage creation systems, computer interfaces, animations, games, and the like. In the field of application of face images, there are character image generation and make-up design as typical examples using face image synthesis technology.
[0002]
A caricature of a face image, which is a type of character image, is created by capturing the facial features of a specific person. Therefore, the caricature of the facial image can be used not only for producing a manga or an entertainment program but also as a symbol or icon representing oneself. In addition, it can be used as a personal signature (Signature) in personal computer communication or e-mail, or as an avatar of a user in virtual reality.
[0003]
(Background technology)
Conventionally, in order to generate such a caricature, a method in which a professional painter directly draws a caricature, a method in which a face image is automatically processed using a digital filter, and the like are performed. Here, the image processing technique using a digital filter adds a watercolor or charcoal style image effect using a combination of filters that give an appropriate effect to an input image, and manually performs a manual operation on the input image as a whole. This is a method to give the feeling of caricature generated by.
[0004]
By the way, the method of drawing caricatures directly by professional painters is natural and has a high degree of perfection.However, since it is done by hand, it takes a considerable amount of time. Because of the difficulty, it can be said that it is a method that can be applied only to restrictive situations. Image processing using digital filters is a method applied to images shot in environments with limited lighting and background, so it can be applied to mere 2D images that have no distinction between background and objects. Lighting of the image or other environmental changes can significantly alter the quality of the output image, and there is a need for an appropriate method to compensate for this change. Also, according to the conventional caricature generation method, since the morphological information for the object is not separately generated, the correction work such as exaggerating the facial features on the generated caricature or changing the expression is very complicated. There is also a problem that work such as restoration of a face image or extension to a three-dimensional avatar or the like can hardly be performed.
[0005]
2. Description of the Related Art Make-up design is conventionally performed by a method in which a consumer looks at a photograph of a model that has been made up from a magazine or the like and indirectly determines his or her own style. In recent years, a makeup design method using a computer has been introduced. This is a method for applying the product to the model image as a sample in various ways, and it is possible to obtain the natural makeup effect that can be obtained by directly making up the consumer's own facial image. Not been. In other words, even if the products have the same hue, they differ depending on the surrounding conditions, complex conditions such as shadows and reflected light due to the morphological features of the face. There is a problem that the makeup effect applied to the user's own face image can hardly be inferred by looking at the applied makeup effect.
[0006]
[Disclosure of the Invention]
SUMMARY OF THE INVENTION The present invention has been made to solve the above problems, and an object of the present invention is to extract face morphological information from an input face image, and to reproduce a face image based on the extracted information. By using the combining method, the user can obtain a more natural and sophisticated caricature image, and the user can preview the image of the make-up design performed on his / her own facial image. Furthermore, it is easy for the user to attach various accessories directly to the synthesized image or to deform the synthesized image, and as a result, the user can check the image in real time, and the synthesis based on the morphological information of the face image. An object of the present invention is to provide an apparatus and a method for generating a facial image.
[0007]
In order to achieve the above object, the apparatus for synthesizing a new face image based on the form information of the input face image according to the present invention transmits the face image information and the user control command to the image processing apparatus. A user interface device that transmits face image information synthesized by the image processing device and outputs or stores the image data in accordance with the user control command; and a predetermined reference image from the face image information transmitted from the user interface device. Extracting the morphological information of the input face image displayed as a deformation field and the texture information that is the hue or lightness / darkness information of the input image mapped to the reference image, and stored in the image database in advance according to the user control command. The texture image selected from the texture images having the same form as the reference image or the selected texture image and the extracted texture information are inverted. By converting the image generated by the weighting of by texture images using shape information of the input face image, characterized in that it comprises a video processing apparatus for generating a composite face image.
[0008]
According to another aspect of the present invention, there is provided a method of synthesizing a new face image based on shape information of an input face image, comprising the steps of: (a) displaying the input face image information as a deformation field for a predetermined reference image; Extracting texture information, which is hue or lightness / darkness information of the input image mapped to the reference image, and (b) stored in the image database according to the user control command. A texture image selected from texture images having the same form as the reference image or an image generated by weighting the selected texture image and the texture image in which the extracted texture information is reflected is used as the input face image. The method includes a step of generating a composite face image by performing conversion using the morphological information.
[0009]
[Best mode for carrying out the invention]
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
[0010]
FIGS. 1A and 1B are configuration diagrams showing a first embodiment (1) and a second embodiment (40) of a synthesized face image generating apparatus based on form information of a face image according to the present invention, respectively. The first embodiment (1) of the present invention shown comprises at least one or more user interface devices (10a, 10b), a communication network (20) and a video processing device (30), and operates in a network environment. The second embodiment (40) of the present invention shown in FIG. 1B operates on a single computer system including a user interface device (50) and a video processing device (60).
[0011]
A user interface device (10a, 10b) and a video processing device (3), which are components of the first embodiment (1) of the present invention, and a second embodiment (40) of the present invention are shown in FIG. 2, a computer system including a computer (72) having at least one central processing unit (CPU) (74) and a memory device (73), an input device (75) and an output device (76). 70). The components of the computer system (70) are interconnected by at least one or more bus structures (77).
[0012]
The illustrated central processing unit (74) includes an arithmetic and logical operation unit (ALU) (741) for performing arithmetic and logical operations, a register set (742) for temporarily storing data and instructions, and a computer system (70). A control device (743) for controlling the operation is provided. The central processing unit (74) used in the present invention is not limited to a specific structure manufactured by a specific manufacturer. A processor of the form described above can be used.
[0013]
The memory device (73) includes a high-speed main memory (731) and an auxiliary memory (732) used for long-term storage of data. The main memory (731) includes a RAM (Randon Access Memory) and a ROM (Read Only Memory) semiconductor chip, and the auxiliary memory (732) includes a floppy disk, a hard disk, a CD-ROM, a flash memory, and electric, magnetic, , A device that stores data using light or another recording medium. Also, the main memory 731 may include a video display memory for displaying an image via a display device. Those of ordinary skill in the art will readily recognize that the memory device (73) may include a variety of alternative components having different storage capacities.
[0014]
The input device (75) includes a keyboard, a mouse, a physical converter (for example, a microphone), and the like, and the output device (76) includes a display, a printer, a physical converter (for example, a speaker), and the like. It is. Also, a device such as a network interface or a modem can be used as an input / output device.
[0015]
The computer system (70) includes an operating system and at least one or more application programs. The operating system is a series of software for controlling the operation of the computer system (70) and the allocation of resources, and the application program performs the work requested by the user using the computer resources available through the operating system. A series of software. All of these are stored in the illustrated memory device (73). After all, the computer-based character automatic generation device according to the present invention is embodied as a computer system 70 and one or more application programs installed and operated in the computer system 70.
[0016]
The first embodiment (1) of the present invention shown in FIG. 1A is different from the second embodiment (40) shown in FIG. 1B in the communication processing for data transmission through the communication network (20). The other functions are the same except that they further include the units (14, 31). Therefore, the following description will be made based on the first embodiment (1).
[0017]
Referring to FIG. 1A, at least one or more user interface devices (10a, 10b) receive face image information and a user control command from a user, and are synthesized according to the user control command. A device for transmitting, correcting, storing, or outputting a video, comprising a video information input unit (11), a user command input unit (12), an input / output control unit (13), a communication processing unit (14), An image correction unit (15), an image storage unit (16), and an output unit (17) are provided.
[0018]
The video information input unit (11) is a device for inputting facial video information from a user, and includes, for example, a device such as a scanner or a digital camera. Further, the image information input unit (11) can include a number of cameras for inputting images photographed at various angles and a camera auxiliary device such as a lighting control device. Since the video information input unit (11) as a component of the present invention needs to be considered from a functional aspect, not only the input device (75) of FIG. 2 but also face video information is stored in advance. It needs to be interpreted in a broad sense including the auxiliary memory (732).
[0019]
The user command input unit (12) is a device that receives a user control command (for example, user information, a facial image synthesis control signal, a video correction control signal, etc.) from a user, and is used by the user to select a user control command. Devices such as a keyboard, a mouse, a touch screen, etc. that can be input are included.
[0020]
The input / output control unit (13) transmits the face image information input via the image information input unit (11) and the user control command input via the user command input unit (12) to the communication processing unit (14). ) Is transmitted to the video processing device (30), and the video information newly synthesized by the video processing device (30) according to the user control command is transmitted via the communication processing unit (14). , Correction, storage or output.
[0021]
The communication processing unit (14) is connected to the input / output control unit (13) and transmits and receives data to and from the image processing device (30) through the communication network (20). And a device such as an Ethernet (registered trademark) card, a serial / parallel port for internal connection, a USB (Universal Serial Bus) port or an IEEE 1394 port.
[0022]
The image correction unit (15) is connected to the input / output control unit (13), and uses the image information newly synthesized and transmitted by the image processing device (30) through the user command input unit (12). The angle, size, texture and the like of the image are corrected according to the operator control command.
[0023]
The video storage unit (16) is a device corresponding to the auxiliary memory (732) of FIG. 2 and is video information newly synthesized and transmitted by the video processing device (30) or corrected by the video correction unit (15). The video information is stored under the control of the input / output control unit (13).
[0024]
The output unit (17) is a device corresponding to the output device (76) in FIG. 2, and is required when the video processing device (30) synthesizes a new video under the control of the input / output control unit (13). And displaying the user interface screen information for receiving the user control command, and displaying the video information newly synthesized and transmitted by the video processing device (30) or the video information corrected by the video correction unit (15). Or print.
[0025]
In the first embodiment (1) of the present invention shown in FIG. 1A, a communication network (10) for transmitting data between at least one or more user interface devices (10a, 10b) and a video processing device (30). 20) includes various forms of networks such as wired / wireless Internet, short-distance networks, and dedicated lines depending on the embodiment.
[0026]
The video processing device (30) according to the first embodiment of the present invention shown in FIG. 1A processes and transmits video information transmitted from at least one or more user interface devices (10a, 10b). An apparatus for synthesizing a new video based on the video information according to a user control command and transmitting the synthesized video to a corresponding user interface device, comprising a communication processing unit (31), a video processing unit (32), and a video processing unit. It comprises a database (33).
[0027]
The communication processing unit (31) is a device that transmits and receives data to and from one or more user interface devices (10a, 10b) via a communication network (20), and is a communication processing unit ( 14) An Ethernet (registered trademark) card for transmitting and receiving data including video information via the Internet, a serial / parallel port for internal connection, a USB (Universal Serial Bus) port, an IEEE1394 port, etc. And the like.
[0028]
The image processing unit (32) is configured to map, from the face image information transmitted from the user interface devices (10a, 10b), the shape information of the input face image displayed as a deformation field for the shape information of the reference image and the reference image. Texture information, which is hue or light / dark information of the input image, is extracted, and a user's request according to a user control command transmitted from the user interface device (10a, 10b) is analyzed, and according to the analyzed user's request. An apparatus for synthesizing a new face image using the morphological information of the input face image extracted by the above, the extracted texture information and various images stored in the image database (33), and a face information extracting unit (321) , A face video synthesizing unit (322), a partial video replacement unit (323), and an accessory video adding unit (324).
[0029]
The face information extracting unit (321) converts the face image information transmitted from the user interface devices (10a, 10b) into the form information of the input face image displayed as a deformation field for the reference image, and uses the information to form the reference image. The hue of the mapped input image or texture information, which is information of light and dark, is extracted.
[0030]
The face image synthesizing unit (322) is a material image selected from the material images stored in the image database (33) according to the user control command, or the selected material image and the face information extracting unit (321). ) Is converted by using the morphological information of the input video extracted from the face information extracting unit (321) to synthesize a new facial video. I do.
[0031]
The partial image replacement unit (323) is configured to convert a part or the whole area of the new face image synthesized by the face image synthesis unit (322) into the highest similarity among the sample images stored in the image database (33). Substitute with something.
[0032]
The accessory video adding unit (324) adds the accessory video selected according to the user control command from the accessory videos stored in the video database (33) to the face video synthesized by the face video synthesizing unit (322). I do.
[0033]
The video database (33) stores in advance video information required for processing the input face video in the video processing unit (32). The video database (331), the additional video database (332), and the sample video database (333), a makeup video database (334) and an accessory video database (335).
[0034]
The face model database (331) includes various types of information used by the face information extraction unit (321) to extract morphological information and texture information based on the reference image from the input face image (predetermined from a large number of model face images). Morphological average, texture average, morphology specific vector, texture specific vector, etc.) are stored. Various types of information stored in the face model database (331) will be described in detail with reference to FIG.
[0035]
The additional video database (332) stores information on caricature videos of various styles having the same form as the reference video and expressed as texture information such as an animation style, a sketch style, and a watercolor style.
[0036]
The sample image database (333) stores information on various caricature sample images including a morphological change and a facial expression change for each specific part of the face image.
[0037]
The makeup video database (334) stores information on makeup videos as texture information expressing various sample makeups in the same form as the reference video.
[0038]
The accessory video database (335) stores information on video such as glasses, hairstyles, hats, earrings, and body shapes to be added to the synthesized facial video.
[0039]
As shown in FIG. 1A, in the first embodiment (1) of the present invention, at least one or more user interface devices (10a, 10b) and one video processing device (30) communicate with each other. Although it is configured to be connected to the processing units (14, 31) via the communication network (20), as in the second embodiment (40) of the present invention shown in FIG. The interface device (50) and the video processing device (60) can be integrated and operated in one computer system (70).
[0040]
Hereinafter, with reference to FIG. 3, a basic operation process of the automatic character image generating apparatus (1, 40) based on a face image according to the present invention will be described.
[0041]
First, the face information extraction unit (321, 621) of the image processing unit (32, 62) transmits the face image input from the user interface device (10a, 10b, 50) to a predetermined reference image. The morphological information of the input face image and the texture information, which is the hue or brightness information of the input image mapped to the reference image, are extracted (S10, S11).
[0042]
Next, the face image synthesis section (322, 622) of the image processing section (32, 62) responds to a user control command (face image synthesis control signal) input from the user interface device (10a, 10b, 50). Then, a new face image is synthesized using the texture information based on the reference image and the form information of the input face image extracted by the face information extraction unit (321, 621) (S12). That is, the face image synthesizing unit (322, 622) restores the shape of the input face image using the shape information of the extracted input face image, and restores the shape using the extracted shape information of the input face image. The texture information extracted into the shape of the input face image is warped to synthesize the user's face image. Here, when synthesizing the face image, various new synthesized images having the form of the input face image are generated by appropriately changing or replacing the texture information based on the reference image to be used.
[0043]
The face image synthesized in this way is transmitted to the user interface device (10a, 10b, 50) and displayed on the output unit (17, 57), and the user command of the user interface device (10a, 10b, 50) is displayed. The input unit (12, 52) receives a user control command regarding whether to change the form information in the displayed face image from the user (S13).
[0044]
When the deformation of the morphological information is determined by the user control command, the morphological information of the input image is used as the user control command for changing the morphological information (for example, a specific part is re-combined and displayed from the displayed face image with a mouse , And is transformed in accordance with a control signal for instructing an overall deformation such as exaggerating the entire face using a slide bar or a whole area such as exaggerating the entire face using a slide bar. The face image is synthesized.
[0045]
If the user does not desire to transform the morphological information in step S13, the accessory video adding section (324, 624) stores the face video synthesized in step S12 in the video database (63) according to the additional user command. Various accessory images that have been added are added, or a specific portion of the face image synthesized in step S12 is replaced with various sample images stored in the image database (63) in the partial image replacement unit (323, 623). By doing so, various additional effects are provided (S14).
[0046]
Next, the face image synthesized by the image processing device (30, 60) is transmitted to the user interface device (10a, 10b, 50) and displayed to the user, and the image correction unit (15, 55) uses the image. The final correction of the synthesized face image is performed according to the user control command (image correction control signal) input via the user command input unit (12, 52) (S15). The composite face image corrected by the image correction unit (15, 55) is stored in the image storage unit (16, 56) or displayed or printed by the output unit (17, 57) (S16).
[0047]
The face information extraction step (S11) in FIG. 3 includes the form information S based on the face model from the input face image. _in And texture information T _in And the process of obtaining
[0048]
In the present invention, the form information of the face image is displayed as a deformation field related to the reference image, and the texture information of the face image is displayed as hue or brightness information of the input image mapped to the reference image. That is, the morphological information S of the face image is expressed by each point p on the reference image. _i (I = 1,..., N; where n is the predetermined number of points in the reference image) and the difference between the position on the plane coordinates with respect to the corresponding point of the face image and the texture information T of the face image are Each point p on the reference video _i (I = 1,..., N) are defined as the hue or lightness / darkness value of the corresponding point of the input video. Although the reference video used in the embodiment of the present invention is synthesized using the morphological average and the texture average, the reference video that can be used in the present invention is not limited to this, and is prepared in advance. Any one of the m face images can be used as a reference image.
[0049]
The face model stored in the face model database (331, 631) is obtained in advance as follows. First, form information S based on reference images from m face images prepared in advance. _j (J = 1,..., M) and texture information T _j (J = 1,..., M) are extracted. Then, m pieces of form information S _j Each point p of (j = 1,..., M) _i (I = 1,..., N) Morphological average consisting of different averages
[0050]
(Equation 1)

[0051]
, M pieces of texture information T _j Each point p of (j = 1,..., M) _i (I = 1,..., N) Texture average consisting of different average values
[0052]
(Equation 2)

[0053]
And the difference in form
[0054]
[Equation 3]

[0055]
(J = 1,..., M) covariance C _S And the difference in texture
[0056]
(Equation 4)

[0057]
(J = 1,..., M) covariance C _T Ask for.
[0058]
The values obtained in this way are subjected to principal component analysis, and the covariance form eigenvectors s for m face models are processed. _i (I = 1,..., M-1) and the texture-specific vector t _i (I = 1,..., M-1) can be obtained. Based on this, the face image is transformed into a form-specific vector s _i (I = 1,..., M-1) and the texture-specific vector t _i Based on (i = 1,..., M−1), it can be expressed as the following equation 1.
[0059]
(Equation 5)

[0060]
(Where
[0061]
(Equation 6)

[0062]
And m is the number of models. )
Morphological average through such a process
[0063]
(Equation 7)

[0064]
, Texture average
[0065]
(Equation 8)

[0066]
, Form-specific vectors _i (I = 1,..., M-1) and the texture-specific vector t _i (I = 1,..., M-1) are stored in the face model database (331, 631), and are used for extracting morphological information and texture information of the input face image.
[0067]
Hereinafter, the face information extraction step (S11) of FIG. 3 will be described in more detail with reference to FIG. In the face image normalization step (S111) of FIG. 4, predetermined feature points (for example, the midpoint of both eyes and the midpoint of the lips) are extracted from the input face image, and the features of the extracted input face image are extracted. The size of the input face image is adjusted by moving it up, down, left, and right so that the position of the point matches the position of the feature point of the reference image. Such a process of image normalization may be automatically performed by a predetermined software or may be manually performed upon receiving a control command from a user. However, a detailed process thereof is out of the scope of the present invention. Therefore, detailed description is omitted.
[0068]
In the form information extraction step (S112), the input face image normalized in the face image normalization step (S8) and the reference image (or a synthesized texture estimation image having the same form as the reference image)
[0069]
(Equation 9)

[0070]
) And applying a hierarchical-gradient optical flow algorithm (Lucas and Kanade) to the morphological information based on the reference image
[0071]
(Equation 10)

[0072]
(The value of the position difference between the normalized input face image and the reference image with respect to the corresponding point) is estimated. The hierarchical gradient light flow algorithm is an algorithm that performs a function of indicating a correspondence relationship between two images by using a brightness value (intensity) of two similar images as a light flow. , And a detailed description is omitted.
[0073]
The morphological information obtained by the hierarchical gradient light flow algorithm used in the morphological information estimating step (S112) may include an error value due to illumination or shadow of the input face image. Therefore, in the form information correction step (S113), the form information estimated in the form information estimation step (S112)
[0074]
(Equation 11)

[0075]
For the shape-specific vectors _i After performing linear decomposition based on (i = 1,..., M−1), linear superposition is further performed to correct the morphological information with corrected error values.
[0076]
(Equation 12)

[0077]
Get. At this time, in order to increase the degree of freedom of deformation, the morphological information estimated in the morphological information estimating step (S112) is calculated as shown in the following equation
[0078]
(Equation 13)

[0079]
And the morphological information corrected in the morphological information correction step (S113)
[0080]
[Equation 14]

[0081]
And weight S _in-1 It is preferable to use it as a result value.
[0082]
(Equation 15)

[0083]
(Where
[0084]
(Equation 16)

[0085]
)
In the inverse warping step (S114), the morphological information obtained through the model-based morphological information correction step (S113)
[0086]
[Equation 17]

[0087]
To transform the input face image into a reference image. This process is referred to as "Backward Warping".
[0088]
In the texture information deformation step (S115), the texture information of the reverse-warped video is converted to a texture-specific vector t. _i After performing linear decomposition on the basis of (i = 1,..., M−1), and further performing linear superposition, the texture information on the input face image is obtained.
[0089]
(Equation 18)

[0090]
Get.
[0091]
Next, the input face image normalized in the morphological information estimation step (S112) is transformed into the reference image form in the inverse warping step (S114). By repeating steps S112 to S116 in place of video,
[0092]
[Equation 19]

[0093]
Ask for. In other words, in the k-th repetition process, steps S112 and S113
[0094]
(Equation 20)

[0095]
Respectively
[0096]
(Equation 21)

[0097]
In step S114
[0098]
(Equation 22)

[0099]
Is
[0100]
[Equation 23]

[0101]
Is replaced by Also, in step S115
[0102]
(Equation 24)

[0103]
Is
[0104]
(Equation 25)

[0105]
And was determined in the last iteration
[0106]
(Equation 26)

[0107]
Is T, which is the texture information of the final input face image. _in It becomes. Such an iterative process,
[0108]
[Equation 27]

[0109]
Is the vector norm of
[0110]
[Equation 28]

[0111]
Is smaller than a predetermined threshold value or until a certain number of repetitions is equal to or more than a predetermined number, and the form information S of the input face image based on the reference image _in Is obtained (S117).
[0112]
Next, the input face image is the form information S of the input face image based on the reference image. _in And texture information T _in Can be restored using That is, the texture information T of the input face image based on the reference image _in Of input face image based on reference image S _in , The input face image is re-synthesized.
[0113]
Representative examples that can utilize the characteristics of such a composite image include caricature generation of a face image and make-up design. The method of generating a caricature of a face image is further divided into a method of synthesizing a face image and a method of replacing a sample image.
[0114]
Hereinafter, a process of generating a caricature image by the method of synthesizing a face image performed by the apparatus for automatically generating a character image based on a face image according to the present invention will be described with reference to FIG.
[0115]
First, the face information extraction units (321, 621) of the video processing units (32, 62) transmit the face images input from the user interface devices (10a, 10b, 50) and are based on predetermined reference images. Input face image form information S _in And texture information T _in Is extracted (S20, S21).
[0116]
Next, the face video synthesizing unit (322, 622) uses the caricature video of various styles (for example, video such as animation style, sketch style, watercolor style, etc.) stored in the additional video database (332, 632) to the user. It is presented to the user via the interface device (10a, 10b, 50), and the user selects a caricature of a desired style (S22). At this time, the caricature videos of various styles stored in the additional video database (332, 632) have the same form as the reference video.
[0117]
In the face image synthesizing step (S23), the face image synthesizing unit (322, 622) uses the caricature image of the style selected by the user or the texture information T of the selected caricature image and the input face image. _in Is generated by the weighting of the video in which is reflected the shape information S of the input face video. _in To synthesize a caricature image in which the user's own form information is reflected.
[0118]
The caricature image synthesized in this way is transmitted to the user interface device (10a, 10b, 50) and displayed on the output unit (17, 57), and the user command of the user interface device (10a, 10b, 50) is displayed. The input unit (12, 52) receives a user control command regarding whether to change the form information of the caricature image displayed from the user (S24).
[0119]
When the change of the form information is determined by the user control command, the form information S of the input video is _in Is a user control command for changing the morphological information (for example, using a deformation or a slide bar for each partial area such as enlarging or reducing by dragging a specific portion with a mouse in a recomposed and displayed face image). Control signal instructing an overall deformation such as exaggerating the entire face), and the process proceeds to step S22 to synthesize a new caricature image.
[0120]
In the accessory adding step (S25), the accessory image adding unit (324, 624) sends various accessories (for example, glasses, hairstyles, hats, earrings, and body shapes) from the accessory image database (335, 635) according to the user control command. Video) and add it to the caricature video. When the accessory image adding unit (324, 624) adds an accessory image, the form information S of the face image extracted in step S21. _in By automatically adjusting the size and the position using, a more natural result can be obtained. In addition, the partial video replacement unit (323, 623) substitutes a specific part of the caricature video with a sample video extracted from the sample video database (333, 633) to indicate an expression such as joy, sadness, anger, etc. A moving image effect can be achieved using an animation frame expressing the process of changing facial expressions.
[0121]
Next, the caricature image synthesized by the image processing device (30, 60) is transmitted to the user interface device (10a, 10b, 50) and displayed to the user, and the image correction unit (15, 55) uses the caricature image. The final correction of the caricature image is performed according to a user control command (image correction control signal) input through the user command input unit (12, 52) (S26). The caricature image corrected by the image correction unit (15, 55) is stored in the image storage unit (16, 56) or displayed or printed by the output unit (17, 57) (S27).
[0122]
The caricature image obtained in this way can be used immediately for a specific application, or used as a figure below when producing a caricature, and can increase productivity during manual processing.
[0123]
Hereinafter, a process of generating a caricature image by a sample image replacement method performed by the automatic character image generation apparatus (1, 40) based on a face image according to the present invention will be described with reference to FIG.
[0124]
FIG. 6 is similar to FIG. 5 except that a similarity measurement step (S35) and a partial video substitution step (S36) are added, and redundant description of other steps (S30 to S34, S37 to S39) is omitted. That is, the method shown in FIG. 6 replaces a part or the whole area of the caricature image synthesized in the same manner as in FIG. 5 with a sample image prepared in advance in the sample image database (333, 633). It is.
[0125]
The sample images stored in the sample image databases (333, 633) are created based on statistical analysis on morphological information for various face images, and then. The method of composing the sample image is divided into two cases: a case where the deformation of the sample image is permitted and a case where the deformation of the sample image is not permitted.
[0126]
First, the method of constructing a sample image when the deformation of the sample image is permitted is to construct a sample image that is normalized to be constant, and the form information S of the input face image extracted in step S31. _in Is a method in which the size and pattern of the sample image are transformed and substituted based on This method has the advantage that the form of the input face image can be fully reflected and a relatively small number of sample images are required, but image distortion due to deformation of the sample image and overall image quality deterioration are caused. There is a disadvantage that it occurs.
[0127]
On the other hand, the method of constructing a sample image when the deformation of the sample image is not allowed is because a new caricature image is synthesized by substituting a part or the whole area of the caricature image using only the preconfigured sample image. Has the advantage that high quality results can be obtained, but it is difficult to fully reflect the form of the input face image, and there is a disadvantage that sample images for all possible deformations need to be prepared in advance. .
[0128]
The method of measuring the similarity D used in the similarity measurement step (S35) of FIG. 6 is obtained as in the following Expression 3.
[0129]
(Equation 29)

[0130]
In Equation 3 above,
[0131]
[Equation 30]

[0132]
And C _si (I = 1,..., N) represents the form information of the input video as C _ri (I = 1,..., N) represents the morphological information of the sample video as C _ti (I = 1,..., N) are the texture information of the input video and the texture information T of the reference video. _ref And C _qi (I = 1,..., N) are the texture information of the sample video and the texture information T of the reference video. _ref Shows the difference from Depending on the execution method, C _si , C _ri , C _ti , C _qi Instead of using the morphological and texture information as it is, the coefficients of the eigenvectors obtained by linearly decomposing the morphological and texture information as in Equation 1 can also be used. In such a case, the coefficients have a dimension of (m-1).
[0133]
The partial image replacement unit (323, 623) measures the similarity D between the input image and each sample image (S35), and determines a partial or entire area of the caricature image with the sample image having the minimum value. Substitute (S36).
[0134]
When a caricature image is generated by the sample image replacement method, the compression rate is dramatically improved by transmitting the code of the replacement sample image instead of transmitting the entire caricature image generated in a low-speed communication environment. Can be enhanced.
[0135]
Hereinafter, a process of a markup design performed by the apparatus for automatically generating a character image based on a face image according to the present invention will be described with reference to FIG.
[0136]
FIG. 7 shows a make-up type selection step (S42) and a face image synthesis step (S43) instead of the caricature type selection step (S22), the face image synthesis step (S23) and the morphological information change step (S24) in FIG. , A make-up correction step (S44) and a satisfaction / unsatisfaction confirmation step (S45) are added, and redundant description of the other steps (S40, S41, S46, S47, S48) will be omitted.
[0137]
In the make-up type selection step (S42), the face image synthesizing unit (322, 622) uses the make-up images of various samples stored in the make-up image database (334, 634) to the user interface devices (10a, 10b, 50), the user is presented with the desired make-up design. At this time, the sample make-up video has the same form as the reference video.
[0138]
In the face image synthesizing step (S43), the face image synthesizing section (322, 622) compares the make-up image selected by the user or the make-up image of the selected sample with the input face image extracted in step S41. Texture information T _in The video generated by weighting the video in which is reflected by the input face video extracted in step S41 _in , It is possible to synthesize a face image in which the selected makeup design is applied to the user's own face.
[0139]
The face image to which the makeup design is applied is transmitted to the user interface device (10a, 10b, 50) and displayed on the output unit (17, 57), and is displayed on the user interface device (10a, 10b, 50). The user command input unit (12, 52) receives a user control command for instructing a make-up correction on the displayed face image from the user (S44). The face image synthesizing unit (322, 622) modifies the face image to which the make-up design is applied according to the user control command, and the corrected face image is again used by the user interface device (10a, 10b, 50). And displayed by the output unit (17, 57).
[0140]
Next, a user command input unit (12, 52) of the user interface device (10a, 10b, 50) receives a user control command for confirming whether the user is satisfied with the displayed face image. An input is received (S45). At this time, if the user control command is "satisfied", the process proceeds to the accessory addition step (S46). If not, the process returns to the make-up type selection step (S42) and the make-up design is performed again.
[0141]
The preferred embodiment of the present invention has been described above. However, those skilled in the art to which the present invention pertains can execute the present invention in a modified form without departing from the essential characteristics of the present invention. It goes without saying that you can do it. Thus, the above embodiments need to be considered from a descriptive perspective rather than a limiting one. The scope of the invention is set forth in the following claims, rather than the foregoing description, and all differences within the scope of the invention are intended to be embraced by the invention.
[0142]
[Industrial applicability]
As described above, according to the present invention, first, form information of an input face image displayed as a deformation field for a reference image is extracted from an input face image, and various texture information having the same form as the reference image is extracted. Since the shape of the input face image is projected using the shape information of the image having the input face image and the extracted input face image, it is possible to synthesize a natural and high-quality new image regardless of the state of the input image. INDUSTRIAL APPLICABILITY The present invention can be effectively used in various fields such as character image generation, virtual make-up design, montage creation for criminal search, animation, and entertainment.
[0143]
Second, in caricature image generation, various caricatures including the morphological characteristics of the user can be directly generated, and the generated caricatures can be partially or entirely exaggerated or deformed. . Further, since the information about the face of the user is included, complicated image correction processes can be simplified and automated, and the productivity of character generation can be improved.
[0144]
Third, in the make-up design, the user can easily design his or her own make-up and then check it, so that a part or the whole can be easily corrected.
[0145]
Fourth, it is possible to directly check the state of various accessories attached to the synthesized new image, and to use face images such as avatars in virtual reality based on morphological information, 3D face image restoration, video chat, etc. Can be easily applied in many application fields that require.
[Brief description of the drawings]
FIG. 1A
FIG. 1A is a block diagram showing a functional configuration of a first embodiment of a synthetic facial image generation device according to the present invention.
FIG. 1B
FIG. 1B is a block diagram showing a functional configuration of a second embodiment of the synthetic facial image generation device according to the present invention.
FIG. 2
FIG. 2 is a block diagram showing an apparatus configuration of a computer system on which the first embodiment and the second embodiment of the present invention are executed.
FIG. 3
FIG. 3 is a basic flowchart showing a process of generating a composite face image according to the present invention.
FIG. 4
FIG. 4 is a flowchart showing the face information extracting step of FIG. 3 in more detail.
FIG. 5
FIG. 5 is a flowchart illustrating a process of generating a caricature image by a face image synthesizing method performed by the synthetic face image generating apparatus according to the present invention.
FIG. 6
FIG. 6 is a flowchart illustrating a process of generating a caricature image according to the sample image replacing method performed by the synthetic face image generating apparatus according to the present invention.
FIG. 7
FIG. 7 is a flowchart illustrating a makeup design process performed by the synthetic face image generating apparatus according to the present invention.

Claims

An apparatus for synthesizing a new face image based on the input form information of the face image,
A user interface device that receives face image information and a user control command, transmits the received face image information to a video processing device, and transmits face image information synthesized by the video processing device and outputs or stores the face image information according to the user control command. And a texture, which is shape information of an input face image displayed as a deformation field for a predetermined reference image from the face image information transmitted from the user interface device and hue or brightness information of the input image mapped to the reference image. The information is extracted, and the texture image selected from the texture images stored in advance in the video database according to the user control command and having the same form as the reference image or the selected texture image and the extracted texture information are reflected. The image generated by weighting the textured image is converted by using the morphological information of the input face image, so that the synthesized face image Generator of the synthetic face image based on the form information of the face image, which comprises a; generated video processing apparatus.

2. The apparatus according to claim 1, wherein the user interface device and the image processing device are integrated into one computer system and executed.

In the first aspect, the user interface device and the video processing device are executed by different computer systems, respectively.
An apparatus for generating a composite face image based on form information of a face image, further comprising a communication network for transmitting and receiving data between the user interface device and the image processing device.

2. The image processing device according to item 2 or 3,
From the face image information transmitted from the user interface device, form information of an input face image displayed as a deformation field for a reference image and texture information as hue or brightness information of the input image mapped to the reference image are extracted. Face information extraction unit;
A texture image previously stored in a video database according to the user control command and selected from texture images having the same form as the reference image, or a texture image in which the selected texture image and the extracted texture information are reflected. A face image synthesizing unit that generates a synthetic face image by converting an image generated by weighting of the input face image using the morphological information of the input face image; and
A video database storing information about the reference video and texture information of various videos having the same form as the reference video;
An apparatus for generating a synthetic face image based on morphological information of a face image, comprising:

In item 4, the video database comprises:
A morphological average generated by analyzing a principal component by determining morphological average, texture average, covariance of morphological difference and covariance of morphological difference from morphological information and texture information based on reference images extracted from many face images A face model database that stores texture averages, shape-specific vectors and texture-specific vectors;
An additional image database that stores information on caricature images of various styles having the same form as the reference image; and
A makeup video database storing information on various makeup design videos having the same form as the reference video;
An apparatus for generating a synthetic face image based on morphological information of a face image, comprising:

In the fifth aspect, the face information extraction unit includes:
A normalization module for normalizing the input face image according to the reference image;
A morphological information estimating module for estimating morphological information based on a reference image by applying a hierarchical gradient light flow algorithm to the normalized input face image and the reference image;
The morphological information estimated by the morphological information estimating module is subjected to linear decomposition and linear superposition based on the morphological eigenvectors stored in the face model database to generate morphological information with corrected error values. module;
An inverse warping module for transforming an input face image into a reference image using the corrected shape information;
A texture information determination module that determines texture information for the input face image by performing linear decomposition and linear superposition of the texture information of the inversely warped image based on the texture eigenvector stored in the face model database; and
A repetition module that generates the form information of the input face image based on the reference image by repeating the module until a predetermined condition is satisfied;
An apparatus for generating a synthetic face image based on morphological information of a face image, comprising:

In item 5, the video database comprises:
Further comprising a sample image database for storing information on the sample image of the caricature by the facial expression change for each specific part of the face image,
The image processing device,
The image processing apparatus further includes a partial image replacement unit that replaces a part or the entire area of the new face image synthesized by the face image synthesis unit with a sample image having the highest similarity among sample images stored in the sample image database. An apparatus for generating a synthesized face image based on face image morphological information.

In item 5, the video database comprises:
Further comprising an accessory video database storing information on various accessory videos to be added to the synthesized face video,
The face image further includes an accessory image adding unit that adds an accessory image selected by a user control command among the accessory images stored in the accessory image database to the face image synthesized by the face image synthesizing unit. For generating a synthetic face image based on the morphological information.

In a method of synthesizing a new face image based on the form information of the input face image,
(A) extracting, from the input face image information, morphological information of the input face image displayed as a deformation field with respect to a predetermined reference image and texture information which is hue or brightness information of the input image mapped to the reference image; ;and,
(B) in response to the user control command, a texture image selected in advance from texture images stored in advance in a video database and having the same form as the reference image, or the selected texture image and the extracted texture information are Generating a composite face image by converting an image generated by weighting the reflected texture image using the form information of the input face image;
A method for generating a synthesized face image based on morphological information of a face image, characterized by including:

In the ninth embodiment, the step (a) includes:
(A1) normalizing the input face image according to the reference image;
(A2) estimating morphological information based on the reference image by applying a hierarchical gradient light flow algorithm to the normalized input face image and the reference image;
(A3) The morphological information estimated in step (a2) is subjected to linear decomposition and linear superposition based on morphological eigenvectors stored in advance in a video database to generate morphological information with corrected error values. Steps;
(A4) transforming the input face image into a reference image using the shape information corrected in step (a3);
(A5) The texture information of the input face image is obtained by performing linear decomposition and linear superposition of the texture information of the image transformed into the form of the reference image in the step (a4) based on a texture-specific vector previously stored in the image database. Determining information; and (a6) repeating the steps (a2) to (a5) reflecting the results of the steps (a4) and (a5) until a predetermined condition is satisfied. Generating the form and texture information of the input face image based on the reference image;
A method for generating a synthesized face image based on morphological information of a face image, characterized by including:

In Item 10, the step (a) comprises:
(A0) Analyzing the principal components by calculating the morphological average, the texture average, the covariance of the morphological difference and the covariance of the texture difference from the morphological information and the texture information based on the reference images extracted from the many model face images, A method for generating a composite face image based on shape information of a face image, further comprising a step of generating a shape-specific vector and a texture-specific vector in advance.

In the ninth embodiment, the step (b) includes:
(B1) selecting one caricature image among various styles of caricature images stored in the image database and having the same form as the reference image according to the user control command; and
(B2) combining the caricature image selected in the step (b1) or the image generated by weighting the selected caricature image and the image reflecting the extracted texture information with the form information of the input face image; Synthesizing a caricature image in which the form information of the input face image is reflected;
A method for generating a synthesized face image based on morphological information of a face image, characterized by including:

In paragraph 12, the step (b) comprises:
(B3) when the change of the morphological information is determined by the user control command, the morphological information of the input video is modified according to the user control command for controlling the morphological information change, and the steps (b1) and (b2) are performed. And generating a composite face image based on the morphological information of the face image.

In paragraph 12,
And (c) replacing a part or the entire area of the caricature image synthesized in the step (b) with a sample image having the highest similarity among sample images stored in the image database. A method for generating a synthesized face image based on the form information of a featured face image.

In paragraph 14, the similarity is:
Weighting the sum of the difference between the morphological information and the texture information between the synthesized caricature image and the sample video, or the sum of the differences between the coefficients of the eigenvectors obtained by performing linear decomposition on the morphological information and the texture information A method for generating a synthesized face image based on morphological information of a face image, characterized by being determined by weighting.

In paragraph 12 or 14,
(D) adding the accessory image selected by the user control command among the accessory images stored in the image database to the synthesized caricature image, wherein the combining is based on the facial image form information. A method for generating facial images.

In paragraph 16,
The position and size of the accessory image added in the step (d) are determined using the form information of the input face image extracted in the step (a). A method for generating a composite face image based on the image.

In the ninth embodiment, the step (b) includes:
(B1) selecting one of the makeup design videos of various styles stored in the video database and having the same form as the reference video according to the user control command;
(B2) The form of the input face image is added to the makeup design image selected in the step (b1) or the image generated by weighting the selected makeup design image and the image reflecting the extracted texture information. Synthesizing information to synthesize a make-up design image in which the morphological information of the input face image is reflected; and (b3) texture information of the reference image in response to a user control instruction instructing make-up correction Changing the;
A method for generating a synthesized face image based on morphological information of a face image, characterized by including:

In paragraph 18,
(C) adding the accessory image selected by the user control command among the accessory images stored in the image database to the synthesized makeup design image. A method for generating a composite face image based on the image.

In paragraph 19,
The position and size of the accessory image added in the step (c) are determined using the form information of the input face image extracted in the step (a). A method for generating a composite face image based on the image.