JP2004034273A - Robot and system for generating action program during utterance of robot

Info

Publication number: JP2004034273A
Application number: JP2002198256A
Authority: JP
Inventors: Ryota Hiura; 日浦　亮太; Masahito Takuhara; 宅原　雅人
Original assignee: Mitsubishi Heavy Industries Ltd
Current assignee: Mitsubishi Heavy Industries Ltd
Priority date: 2002-07-08
Filing date: 2002-07-08
Publication date: 2004-02-05
Anticipated expiration: 2022-07-08
Also published as: JP3930389B2

Abstract

PROBLEM TO BE SOLVED: To make a robot move appropriately in correspondence with its utterance and to let the user know when the robot's utterance starts and ends.
SOLUTION: A database 31 stores data for a plurality of action patterns whose durations are fixed at mutually different values. An utterance time measuring section 32 measures the duration of the robot's utterance, before the utterance starts, from audio waveform data 35a generated in advance. An action pattern combination generating section 33 uses the contents of the database 31 to generate a combination of two or more action patterns, together with their reproduction order and timing, so that the start and end of the action coincide with the start and end of the utterance for the measured duration.
COPYRIGHT: (C)2004, JPO

Description

【０００１】
【発明の属する技術分野】
本発明はロボットが発話している間、ロボットを適度に動作させるとともに、ユーザに発話の開始及び終了のタイミングを伝えるための技術に関する。
【０００２】
【従来の技術】
従来の技術では、ロボットの発話中は、（１）　決まった動作を再生したり、（２）　発話中の波形データ（特に、音の強弱）に応じてロボットを動作させていた。
【０００３】
しかし、（１）　の決まった動作の再生では、発話時間は一定でないことから、発話の開始と終了とに動作の開始と終了とが多くの場合ぴったり一致しないため、ユーザがロボットと対話するときの発言タイミングをつかみにくいという欠点がある。（２）　波形データに応じたロボット動作では、動作パターンが音声データに対して瞬間的な動作の組合せとなるため、発話の意味に合致した動作とならないという欠点がある。
【０００４】
【発明が解決しようとする課題】
本発明の課題は、上述した従来の技術が持つ欠点を解消し、ロボットが発話している間、ロボットを適度に動作させるとともに、ユーザに発話の開始及び終了のタイミングを伝えることができる技術を提供することである。
【０００５】
【課題を解決するための手段】
第１発明は、上記課題を解決するロボット発話中の動作プログラム生成装置であり、動作時間が互いに異なる値に決まっている複数の動作パターンのデータを有するデータベースと、ロボットが発話を開始する前に生成される音声の波形データから発話時間の長さを発話開始前に測定する発話時間測定手段と、前記発話時間測定手段で測定した発話時間に対して発話の開始と終了とに動作の開始と終了とが一致するように、前記データベースの内容から２つ以上の動作パターンの組合せ、再生順序及び再生タイミングを生成する動作パターン組合せ生成手段とを備えることを特徴とする。
【０００６】
第２発明は、第１発明において、前記動作パターン組合せ生成手段は、発話開始と発話終了とに動作の開始と終了とが一致しない場合、前後に隣接する動作パターンの間に無動作時間を挿入することを特徴とするロボット発話中の動作プログラム生成装置である、
第３発明は、第１発明において、発声すべき文字データから動作種別を判断する動作種別判断手段を備えることを特徴とするロボット発話中の動作プログラム生成装置である。
【０００７】
第４発明は、第１発明から第３発明いずれかのロボット発話中の動作プログラム生成装置を有することを特徴とするロボットである。
【０００８】
【発明の実施の形態】
以下、本発明の実施の形態を、図面を参照しながら説明する。
【０００９】
本発明の説明に先立ち、本発明を適用するロボットの一例を図３を参照して説明する。図３に示すロボット１は、ロボット本体１１が頭部１２と胸部１３と胴部（台車部）１４と左右の腕部１５とを有する人間を模したものであり、頭部１２と腕部１５は図示しない駆動機構により回動可能であるとともに、胴部１４に装備された左右の走行用車輪１６は図示しない駆動機構により操舵及び走行可能であって、作業空間をバッテリ駆動により自律的に移動するように構成されている。このロボット１は、一般家庭等の屋内を作業空間として人間と共存し、例えば、一般家庭内でユーザの生活を補助・支援・介護するために用いられる。そのため、ロボット１は、内蔵のＣＰＵ（コンピュータ）及び各種のセンサにより、ユーザと会話したり、ユーザの行動を見守ったり、ユーザの行動を補助したり、ユーザと一緒に行動したりする機能を備えている。ロボット１の形状としては、愛玩用に動物を模したものなど、種々考えられる。
【００１０】
詳細には、図３に示すロボット１はＣＰＵを用いた制御部を持ち、頭部１２に２つのカメラ１７と２つのマイクロホン１８が装着され、胸部１３の中央部に音量・音源方向センサと焦電型赤外線センサからなる人検知センサ１９、左右にスピーカ２０ａが装着されている。また、胸部１３の中央部に画像ディスプレイ２０ｂが装着され、胴部４に超音波式障害物センサ２１やレーザ式障害物センサ２２が装着されている。キーボード及びタッチセンサ等が装着されることもある。
【００１１】
カメラ１７はユーザや屋内を撮影してその画像を制御部に出力し、マイクロホン１８はユーザの音声や電話のベル、呼び鈴、テレビの音などの生活音を取り込んで制御部に出力し、人検知センサ１９はユーザの有無を検出して制御部に出力する。スピーカ２０ａはマイクロホン１８とともにユーザとの会話に用いられ、画像ディスプレイ２０ｂはユーザに対する情報提供に用いられる。キーボードやタッチセンサはユーザ用のデータ入力機器であり、ユーザの生活パターンデータを入力したり、ユーザの意思を入力するために用いられる。スピーカ２０ａはユーザに対する情報提供にも用いられる。制御部には、所定期間にわたるカレンダ及び現在の年月日及び時刻を計時するカレンダクロックが備えられている。
【００１２】
制御部には、キーボード等のデータ入力機器と、カメラ１７、マイクロホン１８、人検知センサ１９及び障害物センサ２１、２２等のユーザの生活パターンを常時モニタする外界センサと、カレンダクロックといった内界センサとにより、ロボット１の自己位置に関する情報と、ユーザの位置に関する情報と、ユーザの行動に関する情報とが入力される。
【００１３】
制御部は、自己位置認識機能と、ユーザ認識機能と、データベース（データ蓄積部）と、生活パターンデータ処理機能と、駆動制御機能とを有している。
【００１４】
自己位置認識機能は、カメラ１７が撮影した画像情報に基づいてロボット１自身の位置及び方位（向き、姿勢）を認識する。ユーザ認識機能は、マイクロホン１８で取り込んだ音声からユーザの発言内容の認識（音声認識）を行うとともに、カメラ１７で撮影した画像情報とマイクロホン１８で取り込んだ音声とからユーザ個人の認識を行い、また、カメラ１７と人検知センサ１９の検出結果からユーザの位置、向き、姿勢、活動量を認識する。
【００１５】
データベースは、ユーザとの会話用やユーザへの話し掛け用の音声データ、ユーザとの会話時やユーザへの話し掛け時の身振り動作のデータ、ユーザの生活情報（ユーザの居住に関する部屋の間取りや家具の配置、ユーザ個人の日常の生活パターン、ユーザの趣味、健康状態など）を記憶している。ユーザの生活情報には、必要に応じて文字列や映像、音声などのキーワードが付される。生活パターンのデータとしては、起床、就寝、食事、薬服用、散歩、自由時間、入浴などの行動に関するイベントが上げられ、これらのイベントが時間データとともにタイムスケジュールとして記憶される。
【００１６】
生活パターンデータ処理機能は、ユーザがデータ入力機器から直接入力した生活情報をキーワードを付してデータベースに蓄積したり、カメラ１７やマイクロホン１８、人検知センサ１９で取得した音声認識結果などユーザの各種情報を処理することで一つ一つの生活情報にキーワードやその日時を付してデータベースに蓄積する。更に、生活パターンデータ処理機能は、ユーザからの指示や話し掛けに応じて、あるいは、時刻などに応じて、これに対応するデータをデータベースから選択し、選択したデータに応じた指令を動作制御部に与えて制御する。
【００１７】
駆動制御機能は、生活パターンデータ処理機能からの指令に応じて、走行用車輪１６を駆動してロボット１の走行及び操舵を実行するとともに、頭部１２や腕部１３を駆動してそれの回動を実行する。また、スピーカ２０ａや画像ディスプレイ２０ｂを駆動してユーザとの会話を実行する。
【００１８】
次に、図１、図２を参照して、本発明の実施例に係るロボット発話中の動作プログラム生成装置（以下、単に動作プログラム生成装置と呼ぶ）を説明する。
【００１９】
上述したロボット１は、図１に例示する動作プログラム生成装置を有しており、この動作プログラム生成装置は、データベース３１と、発話時間測定部３２と、動作パターン組合せ生成部３３とを備えたものである。
【００２０】
図１中、３４はロボットに備えられた音声合成装置であり、音声波形生成部３５と波形再生部３６からなる。音声波形生成部３５は発声すべき文字データを入力して、これに対応する音声の波形データ３５ａを発話開始前に生成し、波形再生部３６は波形データ３５ａを実時間の音声波形に再生する。再生された音声波形は増幅器３７で増幅されてスピーカ２０ａに与えられ、これによりロボットが発話する。
【００２１】
ここで、ロボットが突然しゃべり始めるとユーザが聞き取りにくいので、文字データの先頭に「あ−」とか「うーん」とか意味がない文字データを挿入している。これにより、ロボットが「あ−」とか「うーん」とか意味がない言葉を発してから本来の発話を行うので、ユーザの注意を引き、聞き取りやすくなる。
【００２２】
データベース３１には、動作時間が互いに異なる値に決まっている複数の動作パターンのデータが格納されている。動作パターンは頭部１２や腕部１５、胴部１４のジェスチャ等の動作をパターン化したものであり、その例を上げると、ロボットがユーザに話し掛ける動きの種類として、１秒間分の動作パターン、２秒間分の動作パターン、４秒間分の動作パターン、８秒間分の動作パターンが格納されている。また、ロボットがユーザに問い掛ける動きの種類として、１秒間分の動作パターン、２秒間分の動作パターンが格納されている。更に、ロボットが確認する動きの種類として、１秒間分の動作パターン、２秒間分の動作パターンが格納されている。
【００２３】
発話時間測定部３２は、音声波形生成部３５等によってロボットの発話開始前に生成される波形データ３５ａから、発話時間の長さを発話開始前に測定する。波形データ３５ａ自体は実時間の音声波形に再生される前のデータであるから、波形データ３５ａのデータ長さ（バイト数）を調べることなどにより、瞬時に発話時間を測定することができる。
【００２４】
動作パターン組合せ生成部３３は、発話時間測定部３２で測定した発話時間に対して、ロボットの発話の開始と終了とにロボットの動作の開始と終了とが一致するように、データベース３１の内容から２つ以上の動作パターンの組合せを選択し、それの再生順序及び再生タイミングを生成する。
【００２５】
また、動作パターン組合せ生成部３３は、発話開始と発話終了とに動作の開始と終了とがぴったり一致しない場合は、前後に隣接する動作パターンの間に無動作時間を挿入するようにしている。
【００２６】
動作パターン組合せ生成部３３が選択する複数の動作パターンは発話内容になるべく適していることが好ましく、そのために、動作種別選択信号３８ａを利用している。
【００２７】
本例では、動作種別判断部３８にて発声すべき文字データを解析することで、質問や話し掛け、確認等の動作種別を判断し、動作種別選択信号３８ａを得るようにしている。
【００２８】
動作制御装置３９は、動作パターン組合せ生成部３３が選択した複数の動作パターンを、動作パターン組合せ生成部３３が指定する再生順序と再生タイミングで再生し、頭部１２や腕部１５、胴部１４のアクチュエータ（駆動機構）４０を実際に動作させる。
【００２９】
図２に動作パターンの組合せ手順の例を示す。
【００３０】
図２においては、波形データ３５ａの長さから、発話時間は１４．５秒であり、動作種別はロボットからユーザに対する質問であるとしている。
【００３１】
そこで、動作パターン組合せ生成部３３は、まず、動作種別＝質問より、最後に行う動作として２秒所要する質問用の動作パターン（２秒分の問い掛ける動き）を決定する。次に、始めに行う動作として、８秒分の話し掛ける動きの動作パターンと４秒分の話し掛ける動きの動作パターンとを組み合わせ、全体で１４秒を埋める。残った０．５秒については、４秒分の話し掛ける動きの動作パターンと２秒分の問い掛ける動きの動作パターンとの間に、０．５秒の無動作時間を挿入する。
【００３２】
これにより、発話の開始と終了とに動作の開始と終了とがぴったり一致し、ロボット１が発話している間、ロボット１が適度に動作するとともに、ユーザに発話の開始及び終了のタイミングを伝えることができる。
【００３３】
【発明の効果】
第１発明は、動作時間が互いに異なる値に決まっている複数の動作パターンのデータを有するデータベースと、ロボットが発話を開始する前に生成される音声の波形データから発話時間の長さを発話開始前に測定する発話時間測定手段と、前記発話時間測定手段で測定した発話時間に対して発話の開始と終了とに動作の開始と終了とが一致するように、前記データベースの内容から２つ以上の動作パターンの組合せ、再生順序及び再生タイミングを生成する動作パターン組合せ生成手段とを備えるので、発話に合わせて適度にロボットを動作させるとともに、ユーザに発話の開始及び終了のタイミングを伝えることができる。
【００３４】
第２発明は、発話開始と発話終了とに動作の開始と終了とが一致しない場合、動作パターン組合せ生成手段が前後に隣接する動作パターンの間に無動作時間を挿入するので、発話の開始と終了とに動作の開始と終了とがぴったり一致する。
【００３５】
第３発明は、発声すべき文字データから動作種別を判断する動作種別判断手段を備えるので、発話内容に適した動作をロボットに行わせることができる。
【００３６】
第４発明は、ロボットが第１発明から第３発明いずれかのロボット発話中の動作プログラム生成装置を有するので、ロボットが発話に合わせて適度に動作するとともに、ユーザに発話の開始及び終了のタイミングを伝えることができる。
【図面の簡単な説明】
【図１】本発明の実施例に係るロボット発話中の動作プログラム生成装置の構成を示す図。
【図２】動作パターンの組合せ手順の例を示す図。
【図３】ロボットの一例を示す図。
【符号の説明】
１　ロボット
１２　頭部
１３　胸部
１４　胴部（台車部）
２０ａ　スピーカ
３１　データベース
３２　発話時間測定部
３３　動作パターン組合せ生成部
３４　音声合成装置
３５　音声波形生成部
３５ａ　音声の波形データ
３６　波形再生部
３７　増幅器
３８　動作種別判断部
３８ａ　動作種別選択信号
３９　動作制御部
４０　アクチュエータ

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a technique for moving a robot appropriately while the robot is speaking and for conveying to the user the timing at which the utterance starts and ends.
[0002]
[Prior art]
In the prior art, while a robot is speaking, either (1) a predetermined motion is played back, or (2) the robot is moved according to the waveform data of the utterance in progress (in particular, the loudness of the sound).
[0003]
However, with approach (1), playback of a predetermined motion, the utterance time is not constant, so in most cases the start and end of the motion do not coincide exactly with the start and end of the utterance. This has the disadvantage that the user has difficulty judging when to speak during a dialogue with the robot. With approach (2), robot motion driven by waveform data, the motion pattern is a combination of momentary motions reacting to the audio data, which has the disadvantage that the motion does not match the meaning of the utterance.
[0004]
[Problems to be solved by the invention]
An object of the present invention is to eliminate the above-mentioned drawbacks of the prior art and to provide a technique that can move the robot appropriately while the robot is speaking and convey to the user the timing at which the utterance starts and ends.
[0005]
[Means for Solving the Problems]
A first invention is a device for generating a motion program during robot utterance that solves the above problem, comprising: a database holding data of a plurality of motion patterns whose motion times are fixed at mutually different values; utterance time measuring means for measuring, before the utterance starts, the length of the utterance time from waveform data of the speech generated before the robot starts speaking; and motion pattern combination generating means for generating, from the contents of the database, a combination of two or more motion patterns together with their reproduction order and reproduction timing, so that the start and end of the motion coincide with the start and end of the utterance for the utterance time measured by the utterance time measuring means.
[0006]
A second invention is the device for generating a motion program during robot utterance according to the first invention, wherein the motion pattern combination generating means inserts a no-motion time between successive motion patterns when the start and end of the motion do not coincide with the start and end of the utterance.
A third invention is the device for generating a motion program during robot utterance according to the first invention, further comprising motion type determining means for determining a motion type from the character data to be uttered.
[0007]
A fourth invention is a robot having the device for generating a motion program during robot utterance according to any one of the first to third inventions.
[0008]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0009]
Before describing the present invention, an example of a robot to which the invention is applied will be described with reference to FIG. 3. The robot 1 shown in FIG. 3 is modeled on a human: its robot body 11 has a head 12, a chest 13, a torso (cart unit) 14, and left and right arms 15. The head 12 and the arms 15 can be rotated by drive mechanisms (not shown), and the left and right traveling wheels 16 mounted on the torso 14 can be steered and driven by drive mechanisms (not shown), so that the robot moves autonomously through its work space on battery power. The robot 1 coexists with humans, with an indoor space such as an ordinary home as its work space, and is used, for example, to assist, support, and care for the user's daily life in the home. To that end, the robot 1 uses a built-in CPU (computer) and various sensors to provide functions for conversing with the user, watching over the user's actions, assisting the user's actions, and acting together with the user. Various shapes are conceivable for the robot 1, such as one modeled on an animal for use as a pet.
[0010]
Specifically, the robot 1 shown in FIG. 3 has a control unit using a CPU. Two cameras 17 and two microphones 18 are mounted on the head 12. A human detection sensor 19, consisting of a volume and sound-source direction sensor and a pyroelectric infrared sensor, is mounted at the center of the chest 13, with speakers 20a mounted on its left and right. An image display 20b is also mounted at the center of the chest 13, and an ultrasonic obstacle sensor 21 and a laser obstacle sensor 22 are mounted on the torso 14. A keyboard, a touch sensor, and the like may also be mounted.
[0011]
The cameras 17 capture images of the user and the room and output them to the control unit. The microphones 18 pick up the user's voice and household sounds such as a telephone ring, a doorbell, and television sound and output them to the control unit. The human detection sensor 19 detects whether the user is present and outputs the result to the control unit. The speakers 20a are used together with the microphones 18 for conversation with the user, and the image display 20b is used to provide information to the user. The keyboard and the touch sensor are data input devices for the user, used to enter the user's life pattern data and the user's intentions. The speakers 20a are also used to provide information to the user. The control unit is equipped with a calendar clock that keeps a calendar covering a predetermined period together with the current date and time.
[0012]
The control unit receives information on the robot 1's own position, information on the user's position, and information on the user's behavior from data input devices such as the keyboard, from external sensors that constantly monitor the user's life pattern, such as the cameras 17, the microphones 18, the human detection sensor 19, and the obstacle sensors 21 and 22, and from internal sensors such as the calendar clock.
[0013]
The control unit has a self-position recognition function, a user recognition function, a database (data storage unit), a life pattern data processing function, and a drive control function.
[0014]
The self-position recognition function recognizes the position and orientation (direction, posture) of the robot 1 itself based on image information captured by the cameras 17. The user recognition function recognizes the content of the user's speech (speech recognition) from the sound picked up by the microphones 18, identifies the individual user from the image information captured by the cameras 17 and the sound picked up by the microphones 18, and recognizes the user's position, orientation, posture, and amount of activity from the detection results of the cameras 17 and the human detection sensor 19.
[0015]
The database stores voice data for conversing with and speaking to the user, gesture motion data used when conversing with or speaking to the user, and the user's life information (the room layout and furniture arrangement of the user's residence, the user's personal daily life pattern, the user's hobbies, health condition, and so on). Keywords such as character strings, video, or audio are attached to the user's life information as needed. Life pattern data consists of events for activities such as getting up, going to bed, eating, taking medicine, taking a walk, free time, and bathing, and these events are stored together with time data as a time schedule.
[0016]
The life pattern data processing function attaches keywords to life information that the user enters directly from a data input device and stores it in the database. It also processes the various kinds of user information obtained by the cameras 17, the microphones 18, and the human detection sensor 19, such as speech recognition results, attaches keywords and the date and time to each item of life information, and stores it in the database. In addition, in response to an instruction or remark from the user, or according to the time of day, the life pattern data processing function selects the corresponding data from the database and issues a command corresponding to the selected data to the motion control unit to control it.
[0017]
In response to commands from the life pattern data processing function, the drive control function drives the traveling wheels 16 to make the robot 1 travel and steer, and drives the head 12 and the arms 15 to rotate them. It also drives the speakers 20a and the image display 20b to carry out conversation with the user.
[0018]
Next, a device for generating a motion program during robot utterance according to an embodiment of the present invention (hereinafter simply called the motion program generation device) will be described with reference to FIG. 1 and FIG. 2.
[0019]
The robot 1 described above has the motion program generation device illustrated in FIG. 1, which comprises a database 31, an utterance time measurement unit 32, and a motion pattern combination generation unit 33.
[0020]
In FIG. 1, reference numeral 34 denotes a speech synthesizer provided in the robot, which consists of a speech waveform generation unit 35 and a waveform reproduction unit 36. The speech waveform generation unit 35 receives the character data to be uttered and generates the corresponding speech waveform data 35a before the utterance starts, and the waveform reproduction unit 36 reproduces the waveform data 35a as a real-time speech waveform. The reproduced speech waveform is amplified by an amplifier 37 and supplied to the speakers 20a, whereby the robot speaks.
[0021]
Here, because it is hard for the user to catch what is said if the robot suddenly starts talking, meaningless character data such as "Ah" or "Hmm" is inserted at the beginning of the character data. The robot thus utters a meaningless word such as "Ah" or "Hmm" before the actual utterance, which attracts the user's attention and makes the utterance easier to hear.
[0022]
The database 31 stores data of a plurality of motion patterns whose motion times are fixed at mutually different values. A motion pattern is a patterned motion, such as a gesture, of the head 12, the arms 15, or the torso 14. For example, for the type of motion in which the robot speaks to the user, a 1-second motion pattern, a 2-second motion pattern, a 4-second motion pattern, and an 8-second motion pattern are stored. For the type of motion in which the robot asks the user a question, a 1-second and a 2-second motion pattern are stored. For the type of motion in which the robot confirms something, a 1-second and a 2-second motion pattern are likewise stored.
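As an illustration only, the contents of the database 31 listed above could be represented as in the following Python sketch. The names MotionPattern, MOTION_DB, and durations, and the keys "talk", "ask", and "confirm", are assumptions introduced here; the patent gives only the motion types and their durations, not a data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionPattern:
    """One stored gesture with a fixed playback time (hypothetical structure)."""
    kind: str          # "talk", "ask", or "confirm" (labels assumed here)
    duration_s: float  # fixed motion time in seconds


# Durations taken from the example in the description: talking motions of
# 1, 2, 4, and 8 seconds; asking and confirming motions of 1 and 2 seconds.
MOTION_DB = {
    "talk": [MotionPattern("talk", d) for d in (1.0, 2.0, 4.0, 8.0)],
    "ask": [MotionPattern("ask", d) for d in (1.0, 2.0)],
    "confirm": [MotionPattern("confirm", d) for d in (1.0, 2.0)],
}


def durations(kind: str) -> list[float]:
    """Return the motion times available for one motion type."""
    return [p.duration_s for p in MOTION_DB[kind]]
```

Keeping the patterns grouped by motion type makes it straightforward for a later step to pick a closing pattern of the right type and fill the rest of the utterance with talking patterns.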
[0023]
The utterance time measurement unit 32 measures the length of the utterance time, before the utterance starts, from the waveform data 35a generated by the speech waveform generation unit 35 or the like before the robot starts speaking. Because the waveform data 35a is data that has not yet been reproduced as a real-time speech waveform, the utterance time can be measured instantly, for example by checking the data length (number of bytes) of the waveform data 35a.
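The byte-count measurement can be sketched as follows, assuming the waveform data 35a is uncompressed PCM audio. The function name and the default sample rate, sample width, and channel count are hypothetical; the patent does not state the audio format.

```python
def utterance_seconds(num_bytes: int,
                      sample_rate_hz: int = 16000,
                      bytes_per_sample: int = 2,
                      channels: int = 1) -> float:
    """Estimate playback time from the size of uncompressed PCM data.

    All format parameters are assumptions for illustration only.
    """
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return num_bytes / bytes_per_second


# 464,000 bytes of 16 kHz, 16-bit, mono PCM play for 14.5 seconds.
print(utterance_seconds(464_000))
```

Because the calculation needs only the buffer size, it can run before playback begins, which is what allows the motion schedule to be prepared in advance.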
[0024]
For the utterance time measured by the utterance time measurement unit 32, the motion pattern combination generation unit 33 selects a combination of two or more motion patterns from the contents of the database 31 and generates their reproduction order and reproduction timing, so that the start and end of the robot's motion coincide with the start and end of the robot's utterance.
[0025]
When the start and end of the motion would not coincide exactly with the start and end of the utterance, the motion pattern combination generation unit 33 inserts a no-motion time between successive motion patterns.
[0026]
The motion patterns selected by the motion pattern combination generation unit 33 should preferably suit the utterance content as closely as possible, and a motion type selection signal 38a is used for this purpose.
[0027]
In this example, a motion type determination unit 38 analyzes the character data to be uttered to determine the motion type, such as question, talking, or confirmation, and thereby obtains the motion type selection signal 38a.
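The description does not say how the character data is analyzed. Purely as a sketch, a rule-based classifier might look like the following; the function name, the returned labels, and every rule in it are assumptions made for illustration and are not taken from the patent.

```python
def classify_motion_type(text: str) -> str:
    """Return "confirm", "ask", or "talk" for the text to be uttered.

    The rules below are hypothetical heuristics, not the patent's method.
    """
    stripped = text.strip()
    lowered = stripped.lower()
    # Assumed rule: tag questions are treated as confirmations.
    if lowered.endswith(("right?", "correct?")):
        return "confirm"
    # Assumed rule: any other sentence ending in a question mark is a question.
    if stripped.endswith(("?", "？")):
        return "ask"
    # Everything else is treated as ordinary talking.
    return "talk"
```

A real system would more likely rely on the dialogue manager knowing the intent of each utterance, but a surface-level check like this is enough to drive the choice of closing pattern in a prototype.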
[0028]
The motion control device 39 plays back the motion patterns selected by the motion pattern combination generation unit 33 in the reproduction order and at the reproduction timing that the unit specifies, and actually drives the actuators (drive mechanisms) 40 of the head 12, the arms 15, and the torso 14.
[0029]
FIG. 2 shows an example of the procedure for combining motion patterns.
[0030]
In FIG. 2, it is assumed that the utterance time, obtained from the length of the waveform data 35a, is 14.5 seconds and that the motion type is a question from the robot to the user.
[0031]
The motion pattern combination generation unit 33 therefore first decides, because the motion type is question, that the last motion to be performed is the question motion pattern that takes 2 seconds (the 2-second asking motion). Next, as the motions to be performed first, it combines the 8-second talking motion pattern and the 4-second talking motion pattern, so that 14 seconds are filled in total. For the remaining 0.5 seconds, it inserts a no-motion time of 0.5 seconds between the 4-second talking motion pattern and the 2-second asking motion pattern.
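The FIG. 2 procedure can be reproduced with a short sketch. It assumes a longest-first greedy fill and allows a talking pattern to be reused, neither of which the patent specifies; the names build_schedule, TALK_DURATIONS, and CLOSING_DURATIONS are hypothetical, and the requirement of at least two patterns is not enforced.

```python
# Pattern durations from the description, listed longest first.
TALK_DURATIONS = (8.0, 4.0, 2.0, 1.0)
CLOSING_DURATIONS = {"ask": (2.0, 1.0), "confirm": (2.0, 1.0), "talk": ()}


def build_schedule(utterance_s: float, motion_type: str):
    """Return a list of (start_time_s, kind, duration_s) covering the utterance."""
    # Step 1: reserve the longest closing pattern of the requested type that fits.
    closing = next((d for d in CLOSING_DURATIONS[motion_type]
                    if d <= utterance_s), None)
    remaining = utterance_s - (closing or 0.0)

    # Step 2: fill the time before it with talking patterns, longest first
    # (greedy choice and pattern reuse are assumptions of this sketch).
    schedule, t = [], 0.0
    for d in TALK_DURATIONS:
        while d <= remaining:
            schedule.append((t, "talk", d))
            t += d
            remaining -= d

    # Step 3: whatever is left becomes a no-motion time. It sits before the
    # closing pattern, or at the end when there is no closing pattern.
    if remaining > 0:
        schedule.append((t, "pause", remaining))
        t += remaining
    if closing is not None:
        schedule.append((t, motion_type, closing))
    return schedule


for entry in build_schedule(14.5, "ask"):
    print(entry)
```

For a 14.5-second question this yields the 8-second and 4-second talking patterns, a 0.5-second pause starting at 12.0 seconds, and the 2-second asking pattern starting at 12.5 seconds, matching the worked example.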
[0032]
As a result, the start and end of the motion coincide exactly with the start and end of the utterance, so the robot 1 moves appropriately while it speaks and the timing at which the utterance starts and ends is conveyed to the user.
[0033]
[Effects of the invention]
The first invention comprises a database holding data of a plurality of motion patterns whose motion times are fixed at mutually different values; utterance time measuring means for measuring, before the utterance starts, the length of the utterance time from waveform data of the speech generated before the robot starts speaking; and motion pattern combination generating means for generating, from the contents of the database, a combination of two or more motion patterns together with their reproduction order and reproduction timing, so that the start and end of the motion coincide with the start and end of the utterance for the measured utterance time. The robot can therefore be moved appropriately in time with the utterance, and the timing at which the utterance starts and ends can be conveyed to the user.
[0034]
In the second invention, when the start and end of the motion do not coincide with the start and end of the utterance, the motion pattern combination generating means inserts a no-motion time between successive motion patterns, so the start and end of the motion coincide exactly with the start and end of the utterance.
[0035]
The third invention includes motion type determining means for determining the motion type from the character data to be uttered, so the robot can be made to perform motions suited to the content of the utterance.
[0036]
In the fourth invention, the robot has the device for generating a motion program during robot utterance according to any one of the first to third inventions, so the robot moves appropriately in time with the utterance and the timing at which the utterance starts and ends can be conveyed to the user.
[Brief description of the drawings]
FIG. 1 is a diagram showing the configuration of a device for generating a motion program during robot utterance according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the procedure for combining motion patterns.
FIG. 3 is a diagram illustrating an example of a robot.
[Explanation of symbols]
1 Robot
12 Head
13 Chest
14 Torso (cart unit)
20a Speaker
31 Database
32 Utterance time measurement unit
33 Motion pattern combination generation unit
34 Speech synthesizer
35 Speech waveform generation unit
35a Speech waveform data
36 Waveform reproduction unit
37 Amplifier
38 Motion type determination unit
38a Motion type selection signal
39 Motion control unit
40 Actuator

Claims

1. A device for generating a motion program during robot utterance, comprising: a database holding data of a plurality of motion patterns whose motion times are fixed at mutually different values; utterance time measuring means for measuring, before the utterance starts, the length of the utterance time from waveform data of speech generated before the robot starts speaking; and motion pattern combination generating means for generating, from the contents of the database, a combination of two or more motion patterns together with their reproduction order and reproduction timing, so that the start and end of the motion coincide with the start and end of the utterance for the utterance time measured by the utterance time measuring means.

2. The device for generating a motion program during robot utterance according to claim 1, wherein the motion pattern combination generating means inserts a no-motion time between successive motion patterns when the start and end of the motion do not coincide with the start and end of the utterance.

3. The device for generating a motion program during robot utterance according to claim 1, further comprising motion type determining means for determining a motion type from character data to be uttered.

4. A robot comprising the device for generating a motion program during robot utterance according to any one of claims 1 to 3.