JP4634889B2

JP4634889B2 - Voice dialogue scenario creation method, apparatus, voice dialogue scenario creation program, recording medium

Info

Publication number: JP4634889B2
Application number: JP2005235121A
Authority: JP
Inventors: 哲郎甘粕; 昇宮崎; 輝雄萩野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-08-15
Filing date: 2005-08-15
Publication date: 2011-02-16
Anticipated expiration: 2025-08-15
Also published as: JP2007052043A

Description

The present invention relates to a voice dialogue scenario creation method and apparatus for creating a voice dialogue scenario capable of handling a plurality of topics, to a voice dialogue scenario creation program that implements the apparatus on a computer, and to a recording medium on which the program is recorded.

Voice dialogue apparatuses combine speech recognition and speech synthesis so that a user can accomplish a goal, such as entering a command for some task, through spoken dialogue. In recent years, voice dialogue systems capable of dialogue that feels very natural to humans have been proposed, as in Non-Patent Document 1: such systems tolerate utterances of words for items other than the item the system is currently asking the user to enter, and tolerate hesitations and the like.
As in Patent Document 1, apparatuses have also been proposed that display an anthropomorphic animated agent character on the screen. Such an apparatus recognizes speech that the user utters as if talking to the agent, outputs the response in words by playing back pre-recorded or synthesized speech or by displaying text on the screen, and conveys the nuance of the response through animated gestures of the agent as the dialogue proceeds.

More recently, it has become common to build a voice dialogue system by writing a description in a script language in a text file and having an execution engine for that script language read the file.
The script-language text file describes the sentences to be played back when responding to the user, the branching rules for selecting the next response when speech input or other input arrives from the user, and minimal instructions to the speech recognizer, the speech synthesizer, and other input devices. Such a script-language text file is called a voice dialogue scenario.

An execution engine that has read a voice dialogue scenario drives speech recognition, speech synthesis, and other input/output devices according to the scenario and carries out the voice dialogue. The detailed control of each device is built into the execution engine in advance, in correspondence with the instructions of the script language. Combining a voice dialogue scenario with its execution engine has improved system portability and reduced construction cost as well as the knowledge and experience required of the system creator.
On the other hand, methods have also been proposed that assume a simple one-question-one-answer role for the dialogue system: for each input, one appropriate system is selected from a plurality of dialogue systems, and the output of that system is returned as the response (Patent Documents 2 and 3).
Patent Document 1: JP 2004-295837 A
Patent Document 2: JP 2004-240150 A
Patent Document 3: JP 2004-240225 A
Non-Patent Document 1: Junichi Hirasawa, Shunichiro Yamamoto, Takaaki Hori, Katsutoshi Otsuki, "RexDialog: a free-utterance voice dialogue system for CTI", Information Processing Society of Japan Research Report, 2003-SPL-47(8), pp. 35-40.

However, even though combining a voice dialogue scenario with its execution engine reduces construction cost, the cost can still be high. One such case is a system that handles a plurality of topics and must be able to accept the words and phrases of every topic whenever speech input is possible during the dialogue.
For example, a general voice dialogue system for a city hall service counter handles a plurality of topics, such as guidance on moving in or changing address, social insurance guidance, and acceptance of bulky-waste collection requests. Building a dialogue system that accepts speech input for the words of all those topics at any point in the dialogue, and responds appropriately when such input occurs, has the problem that the amount of description becomes enormous. The reason is that, if all topics are to be handled at all times, the process that determines the next response from the user's utterance must check, for every input item needed to achieve the goal of each topic (such an item is called a slot), whether the preceding utterance changed it. In other words, to handle a plurality of topics while always accepting the words each topic takes as input, it has been necessary to write, at a minimum, a number of branching rules that grows as a power whose exponent is the total number of slots across the topics. This means that the amount of description grows explosively when even a few topics are added.
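The growth described above can be made concrete with a small calculation. The sketch below is only an illustration: the text gives the exponent (the total number of slots) but not the base, so the base of 2 (each slot either changed or unchanged by the last utterance) and the function name `min_branch_rules` are assumptions made here.

```python
def min_branch_rules(slots_per_topic, states_per_slot=2):
    """Lower bound on branching rules when every slot of every topic
    must be checked after each utterance.

    states_per_slot=2 is an assumption (slot changed / not changed);
    the source only says the count is a power of the total slot count.
    """
    total_slots = sum(slots_per_topic)
    return states_per_slot ** total_slots


# Two topics with 3 slots each, then a third topic with 4 slots added.
two_topics = min_branch_rules([3, 3])
three_topics = min_branch_rules([3, 3, 4])
print(two_topics, three_topics)
```

Under these assumptions, adding one four-slot topic multiplies the rule count by 16, which is the kind of explosive growth the passage refers to.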

The methods of Patent Documents 2 and 3 mentioned above, which assume a simple one-question-one-answer role for the dialogue system and, for each input, select one appropriate system from a plurality of dialogue systems and return its output, avoid the enormous amount of description. However, they cannot be applied to a dialogue system in which achieving the goal of each topic requires several exchanges.

The present invention proposes a voice dialogue scenario creation apparatus that is provided with a plurality of original scenarios, each corresponding to only one distinct topic, and that links these original scenarios to create a voice dialogue scenario capable of handling a plurality of topics. The apparatus is characterized by comprising: information extraction means that extracts, from the resources provided in the original scenarios, the information needed to link the original scenarios, the information that determines which original scenario a topic transitions to, and the information for recognizing words related to the topics of all the original scenarios even after a topic transition; and matching information creation means that generates, from the information extracted by the information extraction means, matching information for establishing an environment in which, whichever original scenario has been transitioned to and whichever original scenario is running, user utterances about topics handled by the other original scenarios can be processed and a voice response can be given.

Further, in the above voice dialogue scenario creation apparatus, the matching information creation means comprises: link information creation means that creates link information describing the name of each original scenario handled in the linked scenario, the file name and file location of each original scenario, and the original scenario to be executed first; transition resource creation means that extracts, from the speech recognition resources of each original scenario, the words entered into the input items needed to achieve the goal, the name of each original scenario, and related words and phrases that call the original scenario to mind, and creates transition resources for correctly recognizing these as transition words and phrases that trigger a topic transition and for outputting the understanding result; and topic transition dialogue scenario creation means that creates a transition dialogue scenario which, when a transition word or phrase is recognized, confirms whether to transition to the original scenario corresponding to the recognized word or phrase, announces the transition, and transfers control to the other linked original scenario.

The matching information generation means used in the voice dialogue scenario creation apparatus of the present invention is characterized by comprising, in addition to the link information creation means, the transition resource creation means, and the transition scenario creation means, navigation scenario creation means for presenting the dialogue content executable in the linked original scenarios and guiding the user through the voice dialogue procedure.
The transition resource creation means used in the voice dialogue scenario creation apparatus of the present invention is characterized by comprising four means.
First, transition keyword dictionary creation means extracts keywords belonging to predetermined classes from the keyword dictionary among the resources of each original scenario, and creates a transition keyword collection in which information indicating the attribute of each keyword is added to the extracted keyword.
Second, scenario name keyword correspondence database creation means creates a scenario name keyword correspondence database, which acts as a table indicating which original scenario each keyword described in the original scenarios belongs to.
Third, transition utterance understanding rule creation means extracts, from the utterance understanding rule list described in each original scenario, the utterance understanding rules that contain a class included in the scenario keyword class list. It rewrites the class name in each extracted rule to the in-scenario keyword name with attribute information added, and rewrites the destination slot name to the transition destination scenario related word. It thereby creates utterance understanding rules stating that, if a keyword belonging to the in-scenario keyword class with attribute information added, or to the transition destination scenario related words, appears in the recognized word string, the keyword is to be assigned to the transition destination scenario related word slot. It also creates an utterance understanding rule stating that, if a keyword belonging to the scenario designation class appears in the recognized word string, the keyword is to be assigned to the transition destination scenario slot.
Fourth, transition recognition language model and dictionary creation means extracts, from the original scenario example sentence list used to generate the language model of each original scenario, the example sentences containing a class in the scenario keyword class list, and replaces the class names in each extracted example sentence with the class name to which information indicating the attribute of the in-scenario keyword is added. It further extracts, from each example sentence in the additional example sentence list, the words included in the scenario related keyword list and replaces them with the scenario related word class. It then segments each example sentence of the additional example sentence list after replacement into words by morphological analysis means, assigns readings, and creates a transition language model and a transition recognition dictionary.

According to the present invention, a voice dialogue scenario that handles a plurality of topics can be created with a very small amount of work. In addition, the user is freed from constraints such as having to memorize special commands for moving to another topic: content can be created that accepts most of the keywords the system can handle and, where necessary, automatically moves to a transition dialogue. A highly convenient voice dialogue system can therefore be provided.

The voice dialogue scenario creation method and apparatus according to the present invention can be implemented in hardware, but the simplest implementation, and the best mode, is to install the voice dialogue scenario creation program proposed by the present invention on a computer and have the computer function as the voice dialogue scenario creation apparatus.
To make a computer function as the voice dialogue scenario creation apparatus according to the present invention, the installed program constructs two means inside the computer. The first is information extraction means that extracts, from the resources provided in the original scenarios, the information needed to link the original scenarios, the information that determines which original scenario a topic is linked to, and the information for performing speech recognition suited to each original scenario after a topic transition. The second is matching information generation means that generates, from the extracted information, matching information for establishing an environment in which each original scenario can give voice responses properly whichever original scenario has been transitioned to. The matching information generation means generates the matching information, and by holding this matching information as a resource, a plurality of original scenarios, each corresponding to only one distinct topic, can be made to function as a voice dialogue scenario capable of handling a plurality of topics.

An overview of the voice dialogue scenario creation apparatus according to the present invention is given with reference to FIG. 1. In the figure, 100 denotes an original scenario group composed of original scenarios 100A, 100B, ... 100N, each corresponding to only one distinct topic. The original scenarios 100A, 100B, ... 100N are provided with resources 101A, 101B, ... 101N, which store the name of the original scenario, the speech recognition and utterance understanding resources used in that original scenario, and the like.
200 denotes the information extraction means. The information extraction means 200 extracts, from the resources 101A, 101B, ... 101N of the original scenarios 100A, 100B, ... 100N, the information needed to link these original scenarios. This information can be, for example, the name of each original scenario 100A, 100B, ... 100N, the file name and file location of each original scenario, and information designating the original scenario to be executed first.

The information extracted by the information extraction means 200 is passed to the matching information generation means 300, which generates matching information 400 and stores it in the matching information storage unit 500.
The voice dialogue scenario execution means 600 functions as a voice dialogue system that handles a plurality of topics, using the resources 101A, 101B, ... 101N of the original scenarios 100A, 100B, ... 100N stored in the original scenario group 100 together with the matching information 400 stored in the matching information storage unit 500.

The details of each part are described in order with reference to FIG. 2 and the subsequent figures. The original scenarios 100A, 100B, ... 100N are provided with scenario file names 101A-1, 101B-1, ..., keyword dictionaries 101A-2, 101B-2, ..., utterance understanding rule lists 101A-3, 101B-3, ..., and original scenario example sentence lists 101A-4, 101B-4, ....
Here, the keyword dictionaries 101A-2, 101B-2, ... are dictionaries of the keywords entered into the input items needed to achieve the goal (hereinafter called slots) in each original scenario 100A, 100B, .... The utterance understanding rule lists 101A-3, 101B-3, ... are collections of utterance understanding rules for extracting, from a recognized word string, the keywords that should go into slots and assigning them to those slots. The original scenario example sentence lists 101A-4, 101B-4, ... are the example utterances used for training when the class language model and word dictionary used for speech recognition during execution of each original scenario were created.
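The per-scenario resources listed above can be pictured as a simple record. The following is a minimal sketch, not the patent's data format: the class and field names are hypothetical, and the three keyword fields follow the three-part keyword description (notation, class, semantic feature) that this document gives for keywords.

```python
from dataclasses import dataclass, field


@dataclass
class Keyword:
    notation: str          # surface form of the keyword
    word_class: str        # class in the class language model
    semantic_feature: str  # meaning assigned to the keyword


@dataclass
class OriginalScenarioResources:
    """Hypothetical container for resources 101A-1 to 101A-4."""
    scenario_file_name: str                                   # 101A-1
    keyword_dictionary: list = field(default_factory=list)    # 101A-2
    understanding_rules: list = field(default_factory=list)   # 101A-3
    example_sentences: list = field(default_factory=list)     # 101A-4


def keywords_in_classes(resources, classes):
    """Return the dictionary keywords whose class is in `classes`."""
    return [k for k in resources.keyword_dictionary if k.word_class in classes]


garbage = OriginalScenarioResources(
    scenario_file_name="garbage_guide.scenario",  # illustrative file name
    keyword_dictionary=[
        Keyword("PET bottle", "garbage item", "pet_bottle"),
        Keyword("yes", "confirmation", "yes"),
    ],
)
print([k.notation for k in keywords_in_classes(garbage, {"garbage item"})])
```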

From these original scenarios 100A, 100B, ..., the information extraction means 200 shown in FIG. 1 extracts, as information to be prepared at linking time, the scenario file name 102, the scenario keyword class list 103, the scenario related keyword list 104, the additional example sentence list 105, and the description 106 of each scenario's content. This information is then input to the matching information generation means 300 shown in FIG. 3.
In the embodiment shown in FIG. 3, the matching information generation means 300 is composed of transition resource creation means 301, navigation scenario creation means 302, transition scenario creation means 303, and link information creation means 304.

The transition resource creation means 301 generates the transition resources 401, which form part of the matching information 400. The transition resources 401 consist of a transition keyword dictionary 401-1, transition utterance understanding rules 401-2, a transition language model 401-3, and a transition recognition dictionary 401-4. The transition resource creation means 301 also generates a scenario name keyword correspondence DB (database) 401-5.
The navigation scenario creation means 302 generates the navigation scenario 403, the transition scenario creation means 303 generates the transition scenario 404, and the link information creation means 304 generates the link information 405.

The operation of each part is described below. Here, a file name may, if necessary, also include the network address of the computer that holds the file, the location within a storage device of that computer, and so on. The same applies to file names hereinafter.
Each of the utterance understanding rule lists 101A-3, 101B-3, ... gives pairs consisting of a symbol representing the attribute of a word to be extracted as a keyword for a slot (hereinafter called a class) and an instruction saying which slot the keyword found at the position of that class should be assigned to. To make slot assignment more accurate, information about the word strings before and after the class symbol may be added. For example, for the utterance "from 'Yokohama Station' to 'Shinagawa Station'", the rule for assigning "Yokohama Station" to the departure station slot and "Shinagawa Station" to the arrival station slot is as follows.

Example utterance understanding rule: "from <station name>:departure station slot to <station name>:arrival station slot"
In this example, "<station name>" indicates that a word belonging to the station name class appears, and ":departure station slot" and ":arrival station slot" indicate that the word matched by the immediately preceding class symbol is assigned to the respective slot.
A keyword consists of three pieces of information: its notation (surface form), its class in the class language model, and its semantic feature.
The original scenarios 100A, 100B, ... also contain, in order to express the personality of each scenario's agent, parameters for the voice quality and speaking rate used when responses are produced by speech synthesis, and descriptions designating the image or video data used when the agent is displayed on the screen.
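How a rule of this kind could be applied to a recognized word string can be sketched as follows. This is an illustrative matcher under stated assumptions, not the patent's rule engine: the rule is represented as a list whose items are either literal words or (class, slot) pairs, the word order follows the English rendering of the example, and the names `apply_rule`, `WORD_CLASS`, and `RULE` are hypothetical.

```python
# Assumed lookup from recognized word to its class.
WORD_CLASS = {
    "Yokohama Station": "station name",
    "Shinagawa Station": "station name",
}

# "from <station name>:departure station slot to <station name>:arrival station slot"
RULE = [
    "from",
    ("station name", "departure station"),
    "to",
    ("station name", "arrival station"),
]


def apply_rule(rule, words, word_class):
    """Slide the rule over the word string; return filled slots or None."""
    n = len(rule)
    for start in range(len(words) - n + 1):
        slots = {}
        for item, word in zip(rule, words[start:start + n]):
            if isinstance(item, tuple):
                cls, slot = item
                if word_class.get(word) != cls:
                    break  # word is not in the required class
                slots[slot] = word
            elif item != word:
                break  # literal context word does not match
        else:
            return slots  # every item of the rule matched
    return None


recognized = ["um", "from", "Yokohama Station", "to", "Shinagawa Station"]
print(apply_rule(RULE, recognized, WORD_CLASS))
```

The literal words "from" and "to" play the role of the surrounding word-string information that the text says may be added to make slot assignment more accurate.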

To link scenarios, the following is needed in addition to the above. The scenario name is what each original scenario to be linked is called (for example, "garbage sorting guide"). Keywords in the keyword dictionaries 101A-2, 101B-2, ... of the original scenarios 100A, 100B, ... that characterize the scenario (for example, if the scenario name is "garbage sorting guide", words such as the names of individual waste items and waste sorting categories) are called scenario keywords, and the list of classes to which the scenario keywords of the scenario belong is the scenario keyword class list 103.

Words that are not among the scenario keywords above but are strongly related to the topic handled by the scenario or to the name of the topic (for example, "how to dispose of garbage") are called scenario related keywords. For each original scenario 100A, 100B, ... 100N, the following are prepared: the scenario related keyword list 104, which lists the scenario related keywords; the additional example sentence list 105, which collects example utterances in which the scenario related words registered in the scenario related keyword list appear and example utterances in which the scenario name appears; and the description 106 of the topics handled in the scenario file and their outline.

Also prepared are the start scenario name 406 (see FIG. 2) and the navigation scenario name 407 (see FIG. 2). The start scenario name 406 indicates, by scenario name, the partial scenario of the linked scenario from which the dialogue should start, whether that is one of the original scenarios before linking or the navigation scenario 403 created in this embodiment. The navigation scenario name 407 is the scenario name given to the navigation scenario 403 as the collective name of the linked scenario (for example, "XX City general guide").
In creating the linked dialogue scenario, the link information creation means 304 reads the scenario file name of each original scenario 100A, 100B, ... 100N, each of which becomes a partial scenario of the linked scenario, the scenario name given to each scenario, and the start scenario name. It then outputs link information 405, which lists these together with the file name of the navigation scenario 403 output by the navigation scenario creation means 302, the scenario name given to that scenario, and the scenario name or scenario file name of the partial scenario to be executed first when execution of the transition scenario is started.
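The content of the link information can be pictured as a small table of scenario names and file names plus a starting point. The sketch below is an assumption about shape only: the dictionary layout, the function name `build_link_info`, and the file names are hypothetical, and the check that the start scenario is one of the listed scenarios is added here for illustration.

```python
def build_link_info(original_scenarios, navigation_scenario, start_scenario_name):
    """Assemble link information from (scenario name, file name) pairs.

    original_scenarios: list of (name, file) for each original scenario.
    navigation_scenario: (name, file) of the navigation scenario.
    start_scenario_name: name of the partial scenario to run first.
    """
    entries = [{"name": name, "file": file} for name, file in original_scenarios]
    nav_name, nav_file = navigation_scenario
    entries.append({"name": nav_name, "file": nav_file})

    known_names = {entry["name"] for entry in entries}
    if start_scenario_name not in known_names:
        raise ValueError(f"unknown start scenario: {start_scenario_name}")

    return {"scenarios": entries, "start": start_scenario_name}


link_info = build_link_info(
    [("garbage sorting guide", "garbage.scenario"),
     ("social insurance guide", "insurance.scenario")],
    ("XX City general guide", "navigation.scenario"),
    start_scenario_name="XX City general guide",
)
print(link_info["start"], len(link_info["scenarios"]))
```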

FIG. 4 shows an example configuration of the transition resource creation means 301. The transition resource creation means 301 takes as input the keyword dictionaries, utterance understanding rule lists, original scenario example sentence lists, and additional example sentence lists extracted by the information extraction means 200, together with the navigation scenario name. Using these inputs, the means arranged inside the transition resource creation means 301 produce its outputs: the transition keyword dictionary creation means 310 outputs the transition keyword dictionary 401-1; the scenario name keyword correspondence DB creation means 320 outputs the scenario name keyword correspondence DB 401-5; the transition utterance understanding rule creation means 330 outputs the transition utterance understanding rules 401-2; and the transition recognition language model and dictionary creation means 340 outputs the transition language model 401-3 and the transition word dictionary 401-4.

The transition keyword dictionary creation means 310 generates the transition keyword dictionary 401-1 using the transition keyword dictionary creation method. FIG. 5 shows the flow.
The transition keyword dictionary creation method repeats the following procedure for every original scenario to be linked: an in-scenario keyword extraction step S5-1, an in-scenario keyword class reassignment step S5-2, a scenario related word class reassignment step S5-3, and a scenario name keyword generation step S5-4. It then writes the list of keywords obtained by this procedure to a file or the like (step S5-5).

In the in-scenario keyword extraction step S5-1, the keywords belonging to the classes given in the scenario keyword class list are extracted from the keyword dictionaries 101A-2, 101B-2, ... of the original scenarios.
Next, in the in-scenario keyword class reassignment step S5-2, the class names of all keywords extracted in step S5-1 are replaced with "in-scenario keyword-d". Here d is information indicating the attribute of the in-scenario keyword, namely the scenario name of the original scenario being processed. If the scenario name is "garbage sorting guide", the class name is replaced with "in-scenario keyword-garbage sorting guide".

Next, in the scenario-related word class reassignment step S5-3, each keyword in the scenario-related keyword list is read, and the keyword class of every such keyword is replaced with "scenario-related word-d". As in step S5-2, d is the name of the scenario currently being processed.
Further, in the scenario name keyword generation step S5-4, a keyword is generated whose notation and semantic feature are the scenario name and whose class is the "scenario designation" class.
The above is executed for each original scenario, and a scenario name keyword is generated in the same way for the navigation scenario name. All keywords rewritten or generated by the in-scenario keyword class reassignment, the scenario-related word class reassignment, and the scenario name keyword generation are then written to a file or the like as transition keywords.
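
As an illustration only, steps S5-1 to S5-4 can be sketched in Python as follows. The `Keyword` record, the dictionary layout of `scenarios`, and the choice to reuse a related word's notation as its semantic feature are assumptions made for this sketch and do not appear in the specification; writing the result to a file (S5-5) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    notation: str   # surface form of the keyword
    semantic: str   # semantic feature
    cls: str        # keyword class name


def build_transition_keywords(scenarios, scenario_keyword_classes, navigate_name):
    """scenarios maps a scenario name to {"keywords": [Keyword], "related": [str]}
    (hypothetical layout chosen for this sketch)."""
    out = []
    for name, res in scenarios.items():
        # S5-1: keep only keywords whose class is in the scenario keyword class list
        picked = [k for k in res["keywords"] if k.cls in scenario_keyword_classes]
        # S5-2: relabel their class as "in-scenario keyword-d"
        out += [Keyword(k.notation, k.semantic, f"in-scenario keyword-{name}")
                for k in picked]
        # S5-3: related words get the class "scenario-related word-d"
        # (assumption: the semantic feature is set to the notation)
        out += [Keyword(w, w, f"scenario-related word-{name}") for w in res["related"]]
        # S5-4: the scenario name itself becomes a "scenario designation" keyword
        out.append(Keyword(name, name, "scenario designation"))
    # The navigation scenario name also gets a scenario name keyword
    out.append(Keyword(navigate_name, navigate_name, "scenario designation"))
    return out
```

The returned list plays the role of the transition keyword dictionary 401-1 in this sketch.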

The scenario name keyword correspondence DB (database) creation means 320 uses the scenario name keyword correspondence DB creation method to create the scenario name keyword correspondence DB 401-5, which acts as a table indicating which original scenario a given keyword belongs to. FIG. 6 shows the flow of the scenario name keyword correspondence DB creation method.
In this method, the transition keyword dictionary 401-1 created by the transition keyword dictionary creation means 310 is read. Then, in the scenario name-notation acquisition step S6-1, the keywords belonging to the "in-scenario keyword-d" and "scenario-related word-d" classes are acquired, where d is the name of each original scenario to be linked. For each acquired keyword, its notation and the scenario name portion of its class name are obtained.
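
The acquisition step S6-1 might look like the following Python sketch. It assumes, purely for illustration, that the transition keyword dictionary is available as `(notation, class_name)` pairs and that the scenario name is whatever follows the class prefix; the function names are hypothetical.

```python
def build_scenario_keyword_records(transition_keywords):
    """Return (notation, scenario name) records for S6-1.

    transition_keywords: iterable of (notation, class_name) pairs
    (a simplified stand-in for the transition keyword dictionary).
    """
    prefixes = ("in-scenario keyword-", "scenario-related word-")
    records = []
    for notation, cls in transition_keywords:
        for prefix in prefixes:
            if cls.startswith(prefix):
                # The scenario name is the part of the class name after the prefix
                records.append((notation, cls[len(prefix):]))
                break
    return records


def scenarios_for(records, notation):
    """Look up which original scenarios a keyword notation belongs to."""
    return sorted({scenario for n, scenario in records if n == notation})
```

Keywords of the "scenario designation" class are skipped, since only the two d-suffixed classes are read in this step.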

The correspondence between the notation of each extracted keyword and its scenario name is registered in the DB as one record in the DB registration step S6-2. The database may be a general relational database system accessed through SQL or the like, or a text file in which each line holds a notation and its corresponding scenario name separated by a delimiter character such as a comma (,).
The transition utterance understanding rule creation means 330 generates the transition utterance understanding rules 401-2 using the transition utterance understanding rule creation method. FIG. 7 shows the flow of this method.

In the transition utterance understanding rule creation method, four steps, namely an understanding rule extraction step S7-1, an understanding rule rewriting step S7-2, a rule generation 1 step S7-3, and a rule generation 2 step S7-4, are repeated for each original scenario to be linked, and the generated rules are written to a file or the like.
In the understanding rule extraction step S7-1, utterance understanding rules that contain a class included in the scenario keyword class list 103 are extracted from the utterance understanding rule lists 101A-3, 101B-3, ... (see FIG. 2) of the original scenarios 100A, 100B, ....

In the understanding rule rewriting step S7-2, the class name in each rule extracted in step S7-1 is rewritten to "in-scenario keyword-d" (d being the name of the original scenario currently being processed), and the destination slot name is rewritten to the "transition destination scenario-related word" slot.
In the rule generation 1 step S7-3, an utterance understanding rule is generated stating that if a keyword belonging to "in-scenario keyword-d" or "scenario-related word-d" appears in the speech-recognized word string, that keyword is to be assigned to the "transition destination scenario-related word" slot.

In the rule generation 2 step S7-4, an utterance understanding rule is generated stating that if a keyword belonging to the "scenario designation" class appears in the speech-recognized word string, that keyword is to be assigned to the "transition destination scenario" slot.
The above is executed for each original scenario, and the rule generation 2 step S7-4 is additionally executed for the navigation scenario name. The utterance understanding rules rewritten or generated in steps S7-2, S7-3, and S7-4 are then written to a file or the like as the transition utterance understanding rules 401-2.
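
The rule-building loop described above can be sketched in Python as follows. The specification does not define a rule format, so this sketch assumes a rule is a `(pattern, class, slot)` tuple in which a class appears in the pattern as `<class name>`; that representation and all names below are hypothetical.

```python
RELATED_SLOT = "transition destination scenario-related word"
TARGET_SLOT = "transition destination scenario"


def build_transition_rules(scenario_rules, scenario_keyword_classes):
    """scenario_rules maps a scenario name to a list of (pattern, class, slot) rules."""
    rules = []

    def add(rule):
        if rule not in rules:  # skip exact duplicates
            rules.append(rule)

    for name, source_rules in scenario_rules.items():
        in_cls = f"in-scenario keyword-{name}"
        rel_cls = f"scenario-related word-{name}"
        for pattern, cls, _slot in source_rules:
            # S7-1: keep only rules whose class is in the scenario keyword class list
            if cls not in scenario_keyword_classes:
                continue
            # S7-2: rewrite the class name and the destination slot
            add((pattern.replace(f"<{cls}>", f"<{in_cls}>"), in_cls, RELATED_SLOT))
        # S7-3: a bare keyword of either class fills the related-word slot
        for cls in (in_cls, rel_cls):
            add((f"<{cls}>", cls, RELATED_SLOT))
        # S7-4: a scenario designation keyword fills the destination slot.
        # In this representation the rule is identical for every scenario and
        # for the navigation scenario name, so it is stored only once.
        add(("<scenario designation>", "scenario designation", TARGET_SLOT))
    return rules
```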

The transition recognition language model / dictionary creation means 340 creates the transition language model 401-3 and the transition recognition dictionary 401-4 using the transition recognition language model / dictionary creation method. FIG. 8 shows the flow of this method.
For each original scenario to be linked, the method performs an example sentence extraction step S8-1, an example sentence class conversion step S8-2, a keyword-class replacement step S8-3, and a morphological analysis step S8-4. It then performs a language model calculation and dictionary creation step S8-5 and a step S8-6 that writes out the results.

In the example sentence extraction step S8-1, example sentences containing a class in the scenario keyword class list are extracted from the original scenario example sentence list from which the language model used by each original scenario was generated (these sentences are already morphologically analyzed, with keyword positions replaced by class names).
In the example sentence class conversion step S8-2, the class names in the extracted example sentences are replaced with the "in-scenario keyword-d" class.
In the keyword-class replacement step S8-3, the additional example sentence list 105 (FIG. 2) is read, and wherever an example sentence in the list contains a word included in the scenario-related keyword list 104, that word is replaced with the "scenario-related word-d" class.
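
A rough Python sketch of the class replacements in steps S8-1 to S8-3 is given below. The replacement of class names in the extracted example sentences follows the wording of claim 4. The `<class name>` token convention, the function names, and the data layout are assumptions made for illustration; morphological analysis (S8-4) and language model estimation (S8-5) need external tools and are not shown.

```python
import re


def convert_original_examples(examples, keyword_classes, scenario):
    """S8-1 and S8-2 on already tokenized sentences (lists of tokens)."""
    marks = {f"<{c}>" for c in keyword_classes}
    target = f"<in-scenario keyword-{scenario}>"
    out = []
    for tokens in examples:
        # S8-1: keep sentences that contain a class from the scenario keyword class list
        if any(t in marks for t in tokens):
            # S8-2: replace that class token with the scenario-specific class
            out.append([target if t in marks else t for t in tokens])
    return out


def replace_related_words(sentences, related_keywords, scenario):
    """S8-3 on raw additional example sentences (plain strings)."""
    if not related_keywords:
        return list(sentences)
    target = f"<scenario-related word-{scenario}>"
    # Longest keyword first, so a shorter keyword cannot split a longer one
    ordered = sorted(related_keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    # A single pass avoids re-replacing text inside an inserted class token
    return [pattern.sub(lambda _m: target, s) for s in sentences]
```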

In the morphological analysis step S8-4, each example sentence in the additional example sentence list, after its keywords have been replaced with classes in step S8-3, is segmented into words by a morphological analyzer and annotated with readings.
In the language model calculation and dictionary generation step S8-5, a language model and a dictionary are created by a language model / dictionary creation method from the morphologically analyzed example sentences produced in the example sentence class conversion step S8-2 and the morphological analysis step S8-4, with the transition keyword dictionary supplying the in-class word information. The methods disclosed in Japanese Patent Application Laid-Open Nos. 2004-69858 and 2004-53745 can be used for this purpose.

The calculated language model and the generated dictionary are each saved to a file or the like and serve as the transition language model 401-3 and the transition recognition dictionary 401-4.
The navigation scenario creation means 302 (see FIG. 2) takes the navigation scenario name and the scenario names assigned to the original scenarios, and generates the navigation scenario 403, a dialogue scenario that plays the overall coordinating role within the linked dialogue scenario.
The navigation scenario 403 is a dialogue scenario written to explain which topics the system can converse about. It can be generated as follows: flows that can be written without knowing the names of the original scenarios to be linked are written in advance and stored as a template; when the navigation scenario creation means 302 is executed and the names of the original scenarios to be linked are determined, the information needed for completion is added to the template to produce a complete scenario file.
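
The template approach can be illustrated with Python's standard `string.Template`. The prompt wording and the `$topics` placeholder are invented for this sketch; the specification does not give the template contents or the scenario file format.

```python
from string import Template

# Hypothetical template: everything except the topic list is written in advance
NAV_TEMPLATE = Template(
    "This system can talk about the following topics: $topics. "
    "Which would you like?"
)


def build_navigation_prompt(scenario_names):
    """Fill in the template once the linked original scenario names are known."""
    return NAV_TEMPLATE.substitute(topics=", ".join(scenario_names))
```

For example, `build_navigation_prompt(["garbage separation guidance", "weather"])` yields a prompt that lists both topics.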

The transition scenario creation means 303 reads the original scenario names and the navigation scenario name and generates the transition scenario. The transition scenario is a dialogue scenario written to conduct dialogues such as the following. For example, when the user designates one of the linked original scenarios, that is, when a keyword belonging to the scenario designation class is input to the "transition destination scenario" slot, it responds, by enumerating scenario names, as to whether the original scenarios indicated by that scenario name are linked, and prompts the user to speak. When there are multiple candidate destination scenarios, that is, when searching the scenario name keyword correspondence DB 402 with the notation of the keyword input to the "transition destination scenario-related word" slot yields multiple scenario names, it confirms with the user which of them to transition to.
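
The decision logic of the transition scenario might be sketched in Python as follows. Only the multi-candidate confirmation is spelled out in the description. The direct hand-off for an explicitly named scenario and for a single candidate is an assumption of this sketch, as are the function name and the `(notation, scenario name)` record format.

```python
def decide_transition(slots, records):
    """Decide what the transition scenario should do next.

    slots:   dict mapping slot name -> recognized keyword notation
    records: list of (notation, scenario name) pairs standing in for the
             scenario name keyword correspondence DB
    """
    target = slots.get("transition destination scenario")
    if target:
        # Assumption: a directly named scenario is handed control
        return ("transition", target)

    word = slots.get("transition destination scenario-related word")
    if word:
        candidates = sorted({s for n, s in records if n == word})
        if len(candidates) == 1:
            # Assumption: a single candidate needs no disambiguation
            return ("transition", candidates[0])
        if len(candidates) > 1:
            # Several scenarios share the keyword: ask the user which one
            return ("confirm", candidates)
    return ("none", None)
```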

In this way, a linked dialogue scenario can be created.
The dialogue scenario execution means 600 (see FIG. 1) reads the link information, then reads and executes the scenario file of the scenario to be executed first (either one of the original scenarios or the navigation scenario).
If the first scenario to be executed is one of the original scenarios, the transition scenario and the transition resources are read at the same time, and, in parallel with the speech recognition and utterance understanding of the original scenario, speech recognition and utterance understanding using the transition resources are also run on the same user utterance. The response action of the transition scenario is executed when an utterance understanding result is obtained from the transition resources and either no recognition result is obtained from the original scenario's resources, or a result is obtained but its likelihood is extremely small compared with that of the transition resource result. When, as a result of the dialogue in the transition scenario, the transition scenario requests execution of another original scenario or of the navigation scenario, the dialogue scenario execution means 600 reads and executes that scenario. While the navigation scenario is running, speech recognition and utterance understanding are performed only against the transition resources. Depending on the result of its dialogue, the navigation scenario in turn requests the dialogue scenario execution means 600 to execute one of the original scenarios or to end the dialogue, and the dialogue scenario execution means 600 executes the original scenario or performs system termination processing accordingly.
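
The arbitration between the two parallel recognition results can be sketched as below. The description only says the original scenario's likelihood must be "extremely small" relative to the transition result, so the ratio threshold, the use of positive linear-scale likelihoods, and the function name are all assumptions of this sketch.

```python
def choose_handler(original, transition, ratio_threshold=0.01):
    """Pick which scenario responds to the current utterance.

    original, transition: None, or an (understanding_result, likelihood) pair
    with a positive likelihood on a linear scale (assumption).
    ratio_threshold: hypothetical cut-off for "extremely small".
    """
    if transition is None:
        # No understanding result from the transition resources
        return "original"
    if original is None:
        # Only the transition resources understood the utterance
        return "transition"
    _, l_orig = original
    _, l_trans = transition
    if l_orig < ratio_threshold * l_trans:
        # The original scenario's result is far less likely
        return "transition"
    return "original"
```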

In this embodiment, the only information that must be newly created by hand as input for linking dialogue systems is an enumeration of names and keywords. This creation cost is far lower than the cost of writing rules that wait on the slots of every topic the dialogue system is to handle (more than 2 to the power of the number of slots).
Owing to the generation process described above, the transition resources can recognize and understand the keywords handled by the original scenarios to be linked as well as keywords that evoke those scenarios.
In addition, once an utterance understanding result is obtained from the transition resources, the transition scenario and the navigation scenario make it possible, according to that result, to transition to a dialogue scenario that handles another topic.

The voice dialogue scenario creation device described above can be realized in hardware, but in practice the most feasible embodiment is to install the voice dialogue scenario creation program according to the present invention on a computer and have the computer function as the voice dialogue scenario creation device. The program is written in a computer-readable programming language and recorded on a computer-readable recording medium such as a CD-ROM or a magnetic disk. It is installed on the computer from such a recording medium or through a communication line, and is interpreted by the CPU of the computer to execute the voice dialogue scenario creation operation.

The voice dialogue scenario creation method and device according to the present invention are used in program development settings where voice dialogue systems are built.

FIG. 1 is a block diagram for explaining the outline of the voice dialogue scenario creation method and device according to the present invention.
FIG. 2 is a block diagram for explaining one embodiment of the voice dialogue scenario creation device according to the present invention.
FIG. 3 is a block diagram continuing from FIG. 2.
FIG. 4 is a block diagram for explaining the details of the transition resource creation means used in the embodiment of the present invention.
FIG. 5 is a flowchart for explaining the operation of the transition resource creation means shown in FIG. 4.
FIG. 6 is a flowchart for explaining the operation of the scenario name keyword correspondence database creation means used in the embodiment of the present invention.
FIG. 7 is a flowchart for explaining the operation of the transition utterance understanding rule creation means used in the embodiment of the present invention.
FIG. 8 is a flowchart for explaining the operation of the transition recognition language model / recognition dictionary creation means used in the embodiment of the present invention.

Explanation of symbols

100 Original scenario group
100A to 100N Original scenarios
101A to 101N Resources
101A-1 Scenario file name
101A-2 Keyword dictionary
101A-3 Utterance understanding rule list
101A-4 Original scenario example sentence list
102 Scenario file name
103 Scenario keyword list
104 Scenario-related keyword list
105 Additional example sentence list
106 Contents of each scenario
200 Information extraction means
300 Matching information creation means
301 Transition resource creation means
302 Dialogue scenario
303 Transition scenario creation means
304 Link information creation means
401-1 Transition keyword dictionary
401-2 Transition utterance understanding rules
401-3 Transition language model
401-4 Transition recognition dictionary
401-5 Scenario keyword correspondence DB
403 Navigation scenario
404 Transition scenario
405 Link information

Claims

1. A voice dialogue scenario creation method that takes a plurality of original scenarios, each handling only one distinct topic, and creates a voice dialogue scenario capable of handling a plurality of topics by linking the plurality of original scenarios, wherein:
from the resources provided in each original scenario, information necessary for linking the original scenarios, information for determining which original scenario a topic should transition to, and information for performing speech recognition suited to each original scenario once the topic has transitioned are extracted, and from this information matching information is generated that arranges an environment in which a voice response can be executed appropriately whichever original scenario is transitioned to;
the matching information comprises: link information describing the name of each original scenario handled in the linked original scenarios, the file name and file location of each original scenario, and information on the original scenario to be executed first; a transition resource, obtained by extracting from the speech recognition resources of each original scenario the words input to the input items needed to achieve its purpose, the name of each original scenario, and related phrases that evoke each original scenario, for appropriately recognizing these as transition phrases that cause a topic transition and outputting an understanding result; and a transition dialogue scenario for executing, when a transition phrase is recognized, operations that confirm whether to transition to the original scenario corresponding to the recognized phrase, indicate the transition, and transfer control to another linked original scenario; and
a voice dialogue scenario capable of handling a plurality of topics is created by holding this matching information.

2. A voice dialogue scenario creation device that takes a plurality of original scenarios, each handling only one distinct topic, and creates a voice dialogue scenario capable of handling a plurality of topics by linking the plurality of original scenarios, the device comprising:
information extraction means for extracting, from the resources provided in each original scenario, information necessary for linking the original scenarios, information for determining which original scenario a topic should transition to, and information for performing speech recognition suited to each original scenario once the topic has transitioned; and
matching information creation means for creating, from the information extracted by the information extraction means, matching information for processing and responding to a user utterance on a topic handled by another original scenario regardless of which original scenario is being executed,
wherein the matching information creation means is constituted by:
link information creation means for creating link information describing the name of each original scenario handled in the linked original scenarios, the file name and file location of each original scenario, and information on the original scenario to be executed first;
transition resource creation means for extracting, from the speech recognition resources of each original scenario, the words input to the input items needed to achieve its purpose, the name of each original scenario, and related phrases that evoke each original scenario, and creating a transition resource for appropriately recognizing these as transition phrases that cause a topic transition and outputting an understanding result; and
transition scenario creation means for creating a transition scenario for executing, when a transition phrase is recognized, operations that confirm whether to transition to the original scenario corresponding to the recognized phrase, indicate the transition, and transfer control to another linked original scenario.

3. The voice dialogue scenario creation device according to claim 2, wherein, in addition to the link information creation means, the transition resource creation means, and the transition scenario creation means, the matching information creation means comprises navigation scenario creation means for creating a navigation scenario that presents the dialogue content executable in the linked original scenarios and guides the voice dialogue procedure.

4. The voice dialogue scenario creation device according to claim 2 or 3, wherein the transition resource creation means comprises:
transition keyword dictionary creation means for extracting keywords belonging to predetermined classes from the keyword dictionary in the resources provided in each original scenario and creating a transition keyword collection in which information indicating the attribute of each keyword is added to the extracted keywords;
scenario name keyword correspondence database creation means for creating a scenario name keyword correspondence database that acts as a table indicating which original scenario each keyword described in each original scenario belongs to;
transition utterance understanding rule creation means for extracting, from the utterance understanding rule list described in each original scenario, utterance understanding rules that contain a class included in the scenario keyword class list, rewriting the class name of each extracted rule to a name in which attribute information is added to the in-scenario keyword and the destination slot name to the transition destination scenario-related word, and generating an utterance understanding rule stating that a keyword with attribute information added to the in-scenario keyword, or a keyword belonging to the transition destination scenario-related word, is to be assigned to the transition destination scenario-related word slot if it appears in the speech-recognized word string, and an utterance understanding rule stating that a keyword belonging to the scenario designation class is to be assigned to the transition destination scenario slot if it appears in the speech-recognized word string; and
transition recognition language model / dictionary creation means for extracting example sentences containing a class in the scenario keyword class list from the original scenario example sentence list from which the language model used by each original scenario was generated, replacing the class name in each extracted example sentence with a class name in which attribute-indicating information is added to the in-scenario keyword, further extracting keywords included in the scenario-related keyword list from each example sentence in the additional example sentence list and replacing the class of each such keyword with the scenario-related word class, segmenting each example sentence in the additional example sentence list into words by morphological analysis means, and creating a transition language model and a transition recognition dictionary.

5. A voice dialogue scenario creation program written in a computer-readable programming language that causes a computer to function as at least the voice dialogue scenario creation device according to any one of claims 2 to 4.

6. A computer-readable recording medium on which the voice dialogue scenario creation program according to claim 5 is recorded.