JP2002091858A - Information providing device, information generator, information providing system connected therewith, method therefor and recording medium recorded with program therefor

Info

Publication number: JP2002091858A
Application number: JP2000278126A
Authority: JP
Inventors: Tatsuya Sakai; 達也酒井; Toshihiko Yoshida; 俊彦吉田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2000-09-13
Filing date: 2000-09-13
Publication date: 2002-03-29

Abstract

PROBLEM TO BE SOLVED: To provide an information providing device with which a user can acquire only the minimum vocabulary dictionary required for speech operation. SOLUTION: In response to a transmission request for information from terminal equipment, an information providing part 210 converts part of a display text character string in the designated information into speech vocabulary dictionary data to be used for speech operation, and adds this data to the information corresponding to the transmission request. The information is then transmitted to the terminal equipment through communication equipment 230. The user can therefore acquire only the minimum vocabulary dictionary required for the speech operation.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

Technical Field of the Invention: The present invention relates to the voice operation and voice input functions used when accessing information, such as homepages, provided over the Internet infrastructure. More particularly, it relates to an information providing device that provides, via the Internet, the speech vocabulary dictionary used by these functions, an information generating device that generates the speech vocabulary dictionary, an information providing system in which they are connected, methods therefor, and a recording medium on which programs therefor are recorded.

[0002]

Description of the Related Art: In recent years, with the spread of the Internet infrastructure, general users have become able to easily obtain and use information and services distributed by information service companies and the like. On the other hand, the man-machine interface of the information processing devices, such as computers, used to obtain that information remains very hard for general users to understand and use.

[0003] Computer makers and others have long been developing applications of speech recognition technology, typified by voice control of computers, in order to improve the man-machine interface. Among these, various man-machine interfaces using speech recognition have been developed in the field of information providing services, including the inventions disclosed in Japanese Patent Application Laid-Open Nos. 8-6589, 8-223309, and 2000-76040.

[0004] According to the telephone line voice input system disclosed in Japanese Patent Application Laid-Open No. 8-6589, the vocabulary and grammar information used in each scene is retrieved from a storage unit as the dialog between the user and the system progresses, and is sent to the user terminal via the telephone line network. The user terminal stores the received vocabulary and grammar information in a vocabulary grammar buffer. The speech recognition unit reads the vocabulary and grammar information from the buffer, recognizes the user's voice input signal received from the voice input unit, and sends the recognition result to the information service system via the telephone line network.

[0005] According to the voice input network service system disclosed in Japanese Patent Application Laid-Open No. 8-223309, the center system transmits the voice input vocabulary information and display information stored in its storage unit to the terminal device via a communication network. The terminal device accumulates the received voice input vocabulary information in a buffer, and outputs the reading information in the voice input vocabulary information and the reading information in the terminal's voice input dictionary to the speech recognition unit. The recognition result output from the speech recognition unit is converted into an action and executed.

[0006] According to the voice input network terminal device disclosed in Japanese Patent Application Laid-Open No. 2000-76040, recognition dictionary information downloaded from an application server is stored and managed in association with the network address of each application server. The recognition dictionary information describes the readings of the operation commands that enable voice operation, and the information for converting a voice-input operation command into a query to the application server. Speech recognition is performed on the user's voice input based on the recognition dictionary information, the stored information is searched based on the recognition result, and the destination application server is identified.

[0007]

Problems to Be Solved by the Invention: However, the prior art described above has two major problems, which are explained below.

[0008] FIG. 17 is a block diagram showing the general configuration of systems such as the telephone line voice input system disclosed in Japanese Patent Application Laid-Open No. 8-6589 mentioned above. The system includes terminal device 100, information providing server 200, and Internet connection provider 300. Terminal device 100 accesses information providing server 200 via Internet connection provider 300 and the Internet. In this system, information providing server 200 mainly provides page data in HTML (Hyper Text Markup Language) format.

[0009] The information terminal (terminal device) 100 includes speech recognition unit 110, action code storage unit 111, vocabulary dictionary storage unit 112, basic dictionary storage unit 113, browser 130 provided as an application program, page storage unit 131, input device 140, display device 150, and microphone 160.

[0010] First, character string data giving the network address (site address) of information providing server 200 is specified or entered through input device 140. Based on this character string data, browser 130 connects to information providing server 200 via the public telephone line and Internet connection provider 300, and issues a transmission request for information such as page data. On receiving the transmission request, information providing server 200 sends terminal device 100 the page data together with the vocabulary dictionary data and action codes corresponding to that page data.

[0011] On receiving the information, browser 130 stores the page data in page storage unit 131 and causes display device 150 to display it. Browser 130 also outputs the vocabulary dictionary data and action codes to speech recognition unit 110. Speech recognition unit 110 stores the vocabulary dictionary data and the action codes output from browser 130 in vocabulary dictionary storage unit 112 and action code storage unit 111, respectively.

[0012] When the user speaks into microphone 160, speech recognition unit 110 recognizes the voice input data by referring to the vocabulary dictionary data stored in vocabulary dictionary storage unit 112 and the basic dictionary stored in basic dictionary storage unit 113. Speech recognition unit 110 then refers to the action code data stored in action code storage unit 111 and generates the action code corresponding to the recognition result.
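The last step of this flow, mapping a recognition result to an action code, can be sketched roughly as follows. This is only an illustration: the phoneme strings, the action code format (`OPEN ...`), and the function name `action_for` are assumptions introduced here, and the acoustic matching that uses the basic dictionary is left out entirely.

```python
# Hypothetical data received with a page (formats are assumptions, not from the patent).
# vocabulary dictionary: phoneme string -> word; action codes: word -> action.
vocabulary_dictionary = {"nyu-su": "ニュース", "tenki": "天気"}
action_codes = {"ニュース": "OPEN news.html", "天気": "OPEN weather.html"}


def action_for(phoneme_string):
    """Return the action code for an already-decoded phoneme string, or None."""
    word = vocabulary_dictionary.get(phoneme_string)
    if word is None:
        # Nothing in the downloaded vocabulary matches: no action is generated.
        return None
    return action_codes.get(word)


print(action_for("tenki"))
```

An utterance that is not in the downloaded vocabulary simply yields no action, which matches the exact-match, command-style use of this prior-art system.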

[0013] In the system shown in FIG. 17, one might argue that the basic dictionary stored in basic dictionary storage unit 113 is equivalent to the vocabulary dictionary stored in vocabulary dictionary storage unit 112, so that there is no need to provide a basic dictionary in terminal device 100 in advance. In building a real system, however, the basic dictionary is required, for the reason explained below.

[0014] FIG. 18 schematically illustrates the processing performed in kana-kanji conversion. The conversion consists of two main steps: matching detection and sieving. Matching detection mainly extracts conversion candidates; sieving narrows the candidates down and ranks them. FIG. 18 shows these as serial processing, with sieving performed after matching detection, but in practice the two are generally run in parallel, for example to improve recognition accuracy. Because the problems are the same in either case, serial processing is used here for simplicity.

[0015] For example, when the input character string 「かがくはんのう」 (kagaku hannou) is entered, matching detection uses dictionary A shown in FIG. 19(a) to check whether the registered vocabulary contains a word that exactly matches the input string. If no word matches exactly, the input string is decomposed into words registered in dictionary A so that no characters are left over. As a result, multiple candidates are output, such as 「科学反応」 ("science reaction"), 「化学反応」 ("chemical reaction"), and 「化学班の兎」 ("the chemistry team's rabbit").
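The matching detection step can be sketched as a small recursive segmentation. This is a minimal sketch under stated assumptions: the contents of `DICT_A` are invented to reproduce the example (FIG. 19(a) is not available here), and the function names are hypothetical.

```python
from itertools import product

# Hypothetical dictionary A: reading -> possible written forms.
DICT_A = {
    "かがく": ["科学", "化学"],
    "はんのう": ["反応"],
    "かがくはん": ["化学班"],
    "の": ["の"],
    "う": ["兎"],
}


def segmentations(reading, dictionary):
    """Yield every split of `reading` into dictionary keys with no leftover characters."""
    if not reading:
        yield []
        return
    for end in range(1, len(reading) + 1):
        head = reading[:end]
        if head in dictionary:
            for rest in segmentations(reading[end:], dictionary):
                yield [head] + rest


def matching_candidates(reading, dictionary=DICT_A):
    """Return candidate word sequences: an exact match if one exists, else all decompositions."""
    if reading in dictionary:
        return [(word,) for word in dictionary[reading]]
    candidates = []
    for parts in segmentations(reading, dictionary):
        # Each part may have several written forms; take every combination.
        candidates.extend(product(*(dictionary[part] for part in parts)))
    return candidates


for words in matching_candidates("かがくはんのう"):
    print("".join(words))
```

With this toy dictionary the three candidates named in the text are produced; choosing among them is left to the sieving step.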

[0016] Sieving selects from these candidates only those that make sense as text, or ranks the candidates by priority. FIG. 19(b) shows part of dictionary B, which is used during sieving. Dictionary B stores the words that can be combined with one another. For the input string above, only 「化学反応」, the combination of 「化学」 ("chemistry") and 「反応」 ("reaction"), is registered in dictionary B among the candidates found by matching detection, so sieving outputs 「化学反応」 alone.
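A sieve over word-pair combinations can be sketched as below. The single entry in `DICT_B` and the candidate list are assumptions chosen to mirror the example; a real dictionary B would also carry priorities for ranking, which this sketch omits.

```python
# Hypothetical dictionary B: pairs of adjacent words that may be combined.
DICT_B = {("化学", "反応")}


def sieve(candidates, allowed_pairs=DICT_B):
    """Keep only candidates whose every adjacent word pair is registered in dictionary B."""
    kept = []
    for words in candidates:
        pairs = zip(words, words[1:])
        # Note: a single-word candidate has no pairs and therefore passes unchanged.
        if all(pair in allowed_pairs for pair in pairs):
            kept.append("".join(words))
    return kept


candidates = [("科学", "反応"), ("化学", "反応"), ("化学班", "の", "兎")]
result = sieve(candidates)
print(result)
```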

[0017] The description above used kana-kanji conversion as the example, but the same processing is performed in any recognition process, including speech recognition, handwriting recognition, and natural language recognition. This processing has the following two problems.

[0018] The first is that, if dictionary B registers the relationship between every pair of words registered in dictionary A, the amount of information in dictionary B can reach at most (the number of words registered in dictionary A)², which is enormous.
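As a rough illustration of this bound, write N for the number of words registered in dictionary A and |B| for the number of word pairs in dictionary B (both symbols are introduced here, not in the original). The figure of 10,000 words is an arbitrary example, not a value from the patent.

```latex
|B| \le N^{2}, \qquad \text{e.g. } N = 10^{4} \;\Rightarrow\; |B| \le 10^{8}.
```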

[0019] The second is that recognition, which works on word units in kana-kanji conversion, works on the finer unit of phonemes in speech recognition. A phoneme here means a phoneme symbol that expresses a phoneme waveform as specific character data, and it generally resembles a Roman-letter string. The amount of information therefore becomes even larger.

[0020] Suppose the vocabulary dictionary in systems such as the telephone line voice input system of Japanese Patent Application Laid-Open No. 8-6589 consists of phonemes. An attempt to build a system that accepts natural-language voice input, for example, then runs into the problem that the size of dictionary B does not simply grow in proportion to the number of recognized words but becomes enormous, as described above. This degrades speed and increases traffic.

[0021] Conversely, if the vocabulary dictionary is assumed to consist of hiragana/katakana character data, incorrect recognition results become less frequent, although this depends on how the recognition engine is controlled. However, it becomes more likely that no recognition result is output at all, that is, that nothing happens whatever the user says. For example, consider a system like the one described in the embodiment of Japanese Patent Application Laid-Open No. 2000-76040, which extracts 「あきはばら」 (Akihabara) and 「さいしゅうでんしゃ」 (last train) from the voice input 「あきはばらのさいしゅうでんしゃ」 ("Akihabara's last train") and removes the particle 「の」. In such a system dictionary B becomes enormous, and communication traffic increases every time different information is sent or received over the network. A system that does not send or receive dictionary B, on the other hand, cannot improve speech recognition accuracy by narrowing the vocabulary at all; in other words, it cannot become a system capable of removing the particle 「の」.

[0022] Accordingly, the technology of the telephone line voice input system and the like disclosed in Japanese Patent Application Laid-Open No. 8-6589 should be used not in systems that perform grammatical analysis but in systems at the level of voice commands, where a predetermined control is performed when the voice input data exactly matches an entry at the word level. The basic dictionary stored in basic dictionary storage unit 113 in FIG. 17 is a dictionary used for speech recognition. It describes the mapping between phoneme waveforms and phoneme symbols, which corresponds to dictionary A, and the relationships between phoneme units, which correspond to dictionary B. The vocabulary dictionary stored in vocabulary dictionary storage unit 112 is a dictionary that describes word strings, composed of phoneme symbols, related to the information provided by information providing server 200.

[0023] The other problem with the technology of the telephone line voice input system and the like disclosed in Japanese Patent Application Laid-Open No. 8-6589 concerns the development burden on the information provider. The vocabulary dictionary data sent from information providing server 200 can take the following four forms.

[0024]
(1) Standard speech waveform data
(2) Character data that may include kanji characters
(3) Character data consisting only of hiragana/katakana characters
(4) Phoneme symbol data based on a specific specification, composed of text character strings

Form (1), speech waveform data, makes the amount of transmitted data even larger than in the system described above, and is therefore unsuited to the system shown in FIG. 17.

[0025] Form (2), character data that includes kanji, requires terminal device 100 to perform at least kanji-to-reading conversion on the input data, and depending on the conversion logic the behavior may not match what the information provider intended. For example, 「昨日」 ("yesterday") can be read either 「きのう」 (kinou) or 「さくじつ」 (sakujitsu). Suppose the information provider wants 「昨日」 to be read 「さくじつ」 because 「きのう」 is to be assigned to a different command, 「機能」 ("function", also read kinou). Because the reading depends on the kanji-reading conversion logic inside the speech recognition engine of terminal device 100, the information provider has no way to arrange this.
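The clash in this example can be made concrete with a short sketch. The table `TERMINAL_READINGS` stands in for whatever kanji-reading logic a terminal's recognition engine happens to use; its contents and the function name are assumptions made for illustration only.

```python
from collections import defaultdict


def reading_collisions(words, reading_of):
    """Group command words by reading and return the readings shared by more than one word."""
    by_reading = defaultdict(list)
    for word in words:
        by_reading[reading_of(word)].append(word)
    return {reading: ws for reading, ws in by_reading.items() if len(ws) > 1}


# Hypothetical terminal-side conversion that the information provider cannot control.
TERMINAL_READINGS = {"昨日": "きのう", "機能": "きのう", "天気": "てんき"}

collisions = reading_collisions(["昨日", "機能", "天気"], TERMINAL_READINGS.get)
print(collisions)
```

If the terminal picks 「きのう」 for 「昨日」, the two commands become indistinguishable by voice, and nothing in the page data sent by the provider can change that.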

[0026] Form (3), character data consisting only of hiragana/katakana, has the problem that features such as the intonation of speech cannot be expressed in a hiragana/katakana code string. For example, a provider who wants to offer a playful homepage that jumps to a secret page address only when a given word is spoken with a particular rhythm cannot achieve this through the provider's own efforts alone.

[0027] Accordingly, form (4), phoneme symbol data, is the best data format for the vocabulary dictionary data because it has none of these problems, and it would be best for the phoneme symbols to be created under a specification agreed among speech recognition engine vendors.

[0028] In recent years, a growing number of people have been setting up homepages as individuals rather than as a business. Most of the people who view such personal homepages are friends or acquaintances of the owner, that is, individuals.

[0029] If the data format of the vocabulary dictionary were standardized on form (4), phoneme symbol data, the phoneme symbol specification would likely be hard for general users to understand. As such systems gain acceptance, vocabulary dictionary creation tools for converting text into phoneme symbols will probably be commercialized. Until such a tool is commercialized, or until the user buys one, however, voice operation would be possible when accessing the homepage of a company that employs professional homepage designers but not when accessing a friend's homepage. This contradicts the original purpose.

[0030] The present invention has been made to solve the above problems. Its first object is to provide an information providing device, an information generating device, an information providing system in which they are connected, methods therefor, and a recording medium on which programs therefor are recorded, with which a user can obtain only the minimum vocabulary dictionary needed for voice operation.

[0031] Its second object is to provide an information providing device, an information generating device, an information providing system in which they are connected, methods therefor, and a recording medium on which programs therefor are recorded, with which a user can access a linked file by voice operation.

[0032]

Means for Solving the Problems: According to one aspect of the present invention, an information providing device includes: communication means for receiving a transmission request for information and transmitting the information corresponding to the transmission request; and information providing means for, in response to the transmission request received by the communication means, converting part of the display text character string in the designated information into speech vocabulary dictionary data used for voice operation, attaching it to the information corresponding to the transmission request, and outputting the result to the communication means.

[0033] Because the information providing means converts part of the display text character string in the information designated by the user into speech vocabulary dictionary data used for voice operation, attaches it to the information corresponding to the transmission request, and outputs the result to the communication means, the user can obtain only the minimum vocabulary dictionary needed for voice operation.

[0034] Preferably, the information providing means merges the converted speech vocabulary dictionary data into the speech vocabulary dictionary data already present in the information corresponding to the transmission request, and outputs the result.
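A merge of this kind might look like the sketch below. The dictionary layout (word mapped to a phoneme string) and the rule that entries already present in the page take precedence are assumptions made for illustration; the text here does not specify how conflicts are resolved.

```python
def merge_vocabulary(existing, converted):
    """Merge newly converted entries into the entries already present in the page.

    Assumption: an entry already supplied with the page wins over a converted one.
    """
    merged = dict(existing)
    for word, phonemes in converted.items():
        merged.setdefault(word, phonemes)
    return merged


# Hypothetical entries: word -> phoneme string.
existing = {"昨日": "sa ku zi tu"}
converted = {"昨日": "ki no u", "天気": "te n ki"}
merged = merge_vocabulary(existing, converted)
print(merged)
```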

[0035] Therefore, only the necessary speech vocabulary dictionary data can be added. Preferably, the information providing means converts part of the display text character string associated with a hyperlink description in the designated information into speech vocabulary dictionary data used for voice operation, and outputs it to the communication means in association with the hyperlink description.

[0036] Therefore, the user can access the linked file by voice operation.
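Associating link text with link targets can be sketched with Python's standard `html.parser`. This is a sketch under assumptions: the sample page, the class name, and the output layout (display text mapped to `href`) are invented here, and the conversion of the text into phoneme symbols is omitted.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect (href, display text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")  # None if the anchor has no href
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def link_vocabulary(html):
    """Map each hyperlink's display text to its target, as raw material for a voice vocabulary."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    return {text: href for href, text in collector.links if text}


page = '<p><a href="news.html">ニュース</a> と <a href="weather.html">天気</a></p>'
print(link_vocabulary(page))
```

Each key would then be converted into speech vocabulary dictionary data and kept next to its `href`, so that speaking the link text can trigger the same navigation as selecting the link.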

[0037] More preferably, the information providing means converts the speech vocabulary dictionary data already present in the information corresponding to the transmission request into speech vocabulary dictionary data of a different format, and outputs it to the communication means.

[0038] Therefore, speech vocabulary dictionary data of different formats can also be handled. More preferably, the information providing device further includes a first dictionary for converting hiragana/katakana character strings into phoneme symbols, and the information providing means refers to the first dictionary to convert the hiragana/katakana in the display text character string into a phoneme symbol string.

[0039] More preferably, the information providing device further includes a second dictionary for converting kanji into hiragana/katakana character strings. The information providing means refers to the second dictionary to convert the kanji in the display text character string into a hiragana/katakana character string, and then refers to the first dictionary to convert that hiragana/katakana character string into a phoneme symbol string.
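The two-stage conversion can be sketched as two table lookups applied in sequence. Both tables below are tiny invented samples, and the Roman-letter phoneme notation is an assumption (the text says only that phoneme symbols resemble Roman-letter strings); the function names are hypothetical.

```python
# Hypothetical second dictionary: kanji word -> kana reading chosen on the provider side.
KANJI_TO_KANA = {"昨日": "さくじつ", "機能": "きのう"}

# Hypothetical first dictionary: kana character -> phoneme symbol.
KANA_TO_PHONEME = {
    "さ": "sa", "く": "ku", "じ": "zi", "つ": "tu",
    "き": "ki", "の": "no", "う": "u",
}


def kanji_to_kana(text, table=KANJI_TO_KANA):
    """Replace each registered kanji word with its kana reading, longest words first."""
    for word in sorted(table, key=len, reverse=True):
        text = text.replace(word, table[word])
    return text


def kana_to_phonemes(kana, table=KANA_TO_PHONEME):
    """Convert a kana string to space-separated phoneme symbols (KeyError on unknown kana)."""
    return " ".join(table[ch] for ch in kana)


def to_phonemes(text):
    """Second dictionary first (kanji -> kana), then first dictionary (kana -> phonemes)."""
    return kana_to_phonemes(kanji_to_kana(text))


print(to_phonemes("昨日"))
print(to_phonemes("機能"))
```

Because the reading is fixed on the provider side, 「昨日」 and 「機能」 end up with different phoneme strings regardless of the terminal's own reading logic.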

[0040] According to another aspect of the present invention, an information generating device includes: information generating means for generating information that includes a plurality of files having display text character strings; vocabulary dictionary generating means for converting part of the display text character strings of the files generated by the information generating means into speech vocabulary dictionary data used for voice operation, and attaching it to the files; and communication means for transmitting and uploading the files to which the vocabulary dictionary generating means has attached the speech vocabulary dictionary data.

[0041] Because the vocabulary dictionary generating means converts part of the display text character strings of the files generated by the information generating means into speech vocabulary dictionary data used for voice operation and attaches it to the files, uploading that information to an information providing device makes it possible to provide users with information that supports voice operation.

[0042] Preferably, the vocabulary dictionary generating means converts part of the display text character string associated with a hyperlink description in the files into speech vocabulary dictionary data used for voice operation, and outputs it to the communication means in association with the hyperlink description.

[0043] Therefore, the user can access the linked file by voice operation.

[0044] According to still another aspect of the present invention, an information providing system includes a terminal device, an information providing device that provides information in response to a transmission request for information from the terminal device, and an information generating device that generates the information to be uploaded to the information providing device. The information generating device includes: information generating means for generating information that includes a plurality of files having display text character strings; vocabulary dictionary generating means for converting part of the display text character strings of the files generated by the information generating means into speech vocabulary dictionary data used for voice operation, and attaching it to the files; and first communication means for transmitting the files to which the vocabulary dictionary generating means has attached the speech vocabulary dictionary data, thereby uploading them to the information providing device. The information providing device includes: second communication means for receiving a transmission request for information from the terminal device and transmitting the information corresponding to the transmission request; and information providing means for outputting the uploaded information to the second communication means in response to the transmission request received by the second communication means.

[0045] Because the information generating device converts part of the display text character strings of the files into speech vocabulary dictionary data used for voice operation, attaches it to the files, and uploads them to the information providing device, it becomes possible to provide users with information that supports voice operation.

[0046] More preferably, the vocabulary dictionary generating means converts part of the display text character string associated with a hyperlink description in the files into speech vocabulary dictionary data used for voice operation, and outputs it to the first communication means in association with the hyperlink description.

[0047] Therefore, the user can access the linked file by voice operation.

【００４８】According to still another aspect of the present invention, an information providing method includes the steps of: receiving a transmission request for information; in response to the received transmission request, converting part of a display text string in the designated information into speech vocabulary dictionary data used for voice operation and attaching the data to the information corresponding to the transmission request; and transmitting the information to which the speech vocabulary dictionary data has been attached.

【００４９】Part of the display text string in the information designated by the user is converted into speech vocabulary dictionary data used for voice operation, attached to the information corresponding to the transmission request, and transmitted. The user can therefore obtain only the minimum vocabulary dictionary required for voice operation.

【００５０】According to still another aspect of the present invention, an information generating method includes the steps of: generating information including a plurality of files having display text strings; converting part of the display text strings of the generated files into speech vocabulary dictionary data used for voice operation and attaching the data to the plurality of files; and transmitting and uploading the plurality of files to which the speech vocabulary dictionary data has been attached.

【００５１】Part of the display text strings of the generated files is converted into speech vocabulary dictionary data used for voice operation and attached to the plurality of files. Uploading that information to the information providing device therefore makes it possible to provide the user with information that supports voice operation.

【００５２】According to still another aspect of the present invention, there is provided a computer-readable recording medium on which a program for causing a computer to execute an information providing method is recorded. The information providing method includes the steps of: receiving a transmission request for information; in response to the received transmission request, converting part of a display text string in the designated information into speech vocabulary dictionary data used for voice operation and attaching the data to the information corresponding to the transmission request; and transmitting the information to which the speech vocabulary dictionary data has been attached.

【００５３】Part of the display text string in the information designated by the user is converted into speech vocabulary dictionary data used for voice operation, attached to the information corresponding to the transmission request, and transmitted. The user can therefore obtain only the minimum vocabulary dictionary required for voice operation.

【００５４】According to still another aspect of the present invention, there is provided a computer-readable recording medium on which a program for causing a computer to execute an information generating method is recorded. The information generating method includes the steps of: generating information including a plurality of files having display text strings; converting part of the display text strings of the generated files into speech vocabulary dictionary data used for voice operation and attaching the data to the plurality of files; and transmitting and uploading the plurality of files to which the speech vocabulary dictionary data has been attached.

【００５５】Part of the display text strings of the generated files is converted into speech vocabulary dictionary data used for voice operation and attached to the plurality of files. Uploading that information to the information providing device therefore makes it possible to provide the user with information that supports voice operation.

【００５６】[0056]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 is a diagram showing the data format of the information provided by the information providing system according to an embodiment of the present invention. This data format is defined specifically for this embodiment and differs from HTML and XML (eXtensible Markup Language), the data formats of information provided on the Internet. As a result, the embodiment differs from a real system in some details of the control method, for example in that the vocabulary dictionary management area described later is not transmitted, or is excluded, when the server transmits information. This does not detract from the essence of the present invention; rather, it means that the invention is not limited to a particular communication infrastructure such as the Internet or to a particular network environment.

【００５７】The data format shown in FIG. 1 consists of two main areas: a provided information description area and a vocabulary dictionary information description area. The provided information description area holds the content of the information to be provided and includes a display data area and an action object management area. The display data area holds the text data to be displayed. The action object management area holds information for managing action objects, that is, objects for which some operation by the viewer causes some change in the display. For example, it describes the position and control content of controlled objects such as buttons and links.

【００５８】For example, in the HTML file shown in FIG. 2, "テスト：" ("Test:"), "ここ" ("here"), and "をクリックしてください" ("please click") are the display data, and everything else can be regarded as action objects. In an actual HTML file, ordinary display data, action objects, and other plain definition statements are written contiguously. In this embodiment, however, they are assumed to be managed in separate areas to simplify the explanation.
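As a rough illustration of the area split described above, the following Python sketch separates a small HTML fragment into display data and link action objects. The HTML string is a hypothetical reconstruction (FIG. 2 itself is not reproduced here), and the class name `AreaSplitter`, the dictionary keys, the use of character offsets for POS, and the URL are all assumptions made for illustration, not part of the embodiment.

```python
from html.parser import HTMLParser


class AreaSplitter(HTMLParser):
    """Collects display text and link action objects from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.dtext = ""      # display data area (plain text only)
        self.actions = []    # action object management area
        self._open = None    # (href, start offset) of the <a> being read

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._open = (dict(attrs).get("href", ""), len(self.dtext))

    def handle_data(self, data):
        self.dtext += data

    def handle_endtag(self, tag):
        if tag == "a" and self._open is not None:
            href, start = self._open
            # POS is modelled as (start, end) character offsets into dtext.
            self.actions.append(
                {"ATY": "link", "POS": (start, len(self.dtext)), "ODT": href}
            )
            self._open = None


splitter = AreaSplitter()
splitter.feed('テスト：<a href="http://example.com/next">ここ</a>をクリックしてください')
splitter.close()  # flush any buffered trailing text
```

After parsing, `splitter.dtext` holds the three display strings joined together, and `splitter.actions` holds one link object whose POS span covers "ここ".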

【００５９】The action object management area holds, for each action object, the type of the action object, the position or region of the action object on the corresponding display data, and the action object content. The action object type ATY indicates the kind of object, such as a button or a link. It may be regarded as corresponding to a "tag" in an HTML file. POS, which indicates the position or region of the action object, is included because action objects are managed separately from the display data area. It can be regarded as a field indicating which text data in the display data area is the display data of the corresponding action object.

【００６０】The action object content ODT describes the control content and the like of the action object. This embodiment illustrates only the case where the action object type ATY is "link", in which case a character string representing a URL (Uniform Resource Locator) is assumed to be described. Although FIG. 1 does not show the number of action objects described, that number is assumed to be stored in a variable N reserved in the information file.

【００６１】The vocabulary dictionary information description area includes a vocabulary dictionary area and a vocabulary dictionary management area. The vocabulary dictionary area holds the vocabulary dictionary data for voice recognition that corresponds to the information in the provided information description area, that is, the speech vocabulary and related data used for voice recognition on the viewer's terminal. The vocabulary dictionary management area holds management information, such as whether a speech vocabulary dictionary is stored.

【００６２】The vocabulary dictionary area includes an array variable JDATA, which stores vocabulary data strings that are either phoneme symbol strings or hiragana/katakana character strings, and a variable ONO, which indicates the action object to which each vocabulary data string corresponds. In FIG. 1, for convenience, the values of ONO are assumed to follow the order in which the action objects appear. As with the number of action objects, FIG. 1 does not show the number of stored vocabulary data items, but that number is assumed to be stored in a variable M reserved in the information file.

【００６３】The vocabulary dictionary management area describes two variables, JFG and JTY. JFG indicates whether one or more vocabulary data items exist in the information file and, when vocabulary data exists, whether a vocabulary dictionary is to be automatically generated and merged with it when a transmission request is received. JTY indicates whether the currently registered vocabulary data is a phoneme symbol string or a hiragana/katakana character string.

【００６４】FIG. 3 is a block diagram showing the schematic configuration of the information providing system according to the embodiment of the present invention. The information providing system includes a terminal device 1, an information providing device (hereinafter also called the information providing server) 2, an Internet connection provider 3, and an information generating device 4. The terminal device 1 accesses the information providing server 2 or the information generating device 4 via the Internet connection provider 3 and the Internet.

【００６５】The terminal device 1 includes a voice recognition unit 10, an action code storage unit 11, a vocabulary dictionary storage unit 12, a basic dictionary storage unit 13, a browser 30 provided as an application program, a page storage unit 31, an input device 40, a display device 50, and a microphone 60. The input device 40 consists of a keyboard, a mouse, and the like.

【００６６】As described later, when the terminal device requests transmission of an information file, the information providing server transmits an information file described in the data format shown in FIG. 1. First, however, the processing procedure of the terminal device 1, which requests transmission of an information file and then browses and displays it, is described.

【００６７】FIG. 4 is a flowchart for explaining the processing procedure of the terminal device 1, mainly the browser 30, in the embodiment of the present invention. First, the browser 30 determines whether the user has designated a URL via the input device 40, such as a keyboard (S1). When the user has designated a URL (S1, Yes), the browser 30 determines whether the voice recognition unit 10 is running (S2). This determination allows for a browser 30 implemented as PC application software, which may be installed even on a terminal device that has no voice recognition engine (the voice recognition unit 10 in this embodiment).

【００６８】In general, a voice recognition engine also takes a form that can be incorporated into a PC system (a software form generally called driver software, as distinguished from application software). Strictly speaking, in a real PC system this determination therefore means "whether a voice recognition engine is incorporated in the system" rather than "whether it is running". The actual code for this control would be processed through a software interface, defined by the PC system (OS), that mediates between application software and driver software. The description here is simplified, both to make the control flow easier to follow and because the data format has been defined as shown in FIG. 1.

【００６９】If the voice recognition unit 10 is running (S2, Yes), the process proceeds to step S7. If the voice recognition unit 10 is not running (S2, No), the browser 30 outputs a transmission request for the information file at the designated URL without requesting a speech vocabulary dictionary (S3), and the process proceeds to step S8.

【００７０】If the user has not designated a URL in step S1 (S1, No), the browser 30 determines whether the voice recognition unit 10 is running (S4). If it is not running (S4, No), the process returns to step S1 and repeats the subsequent processing. If it is running (S4, Yes), the browser 30 determines whether the vocabulary dictionary corresponding to the currently displayed information has already been received (S5). If that vocabulary dictionary has not yet been received (S5, No), the process returns to step S1 and repeats the subsequent processing.

【００７１】If the vocabulary dictionary for the displayed information has already been received in step S5 (S5, Yes), the browser 30 determines whether the voice recognition unit 10 has output a URL designated by the user's voice (S6). If no URL has been output from the voice recognition unit 10 (S6, No), the process returns to step S1 and repeats the subsequent processing. If a URL has been output from the voice recognition unit 10 (S6, Yes), or if the voice recognition unit 10 was determined to be running in step S2 (S2, Yes), the browser 30 transmits a transmission request for the information file at the designated URL to the information providing server 2 via the general public line and the Internet connection provider 3.

【００７２】In step S8, the browser 30 determines whether it has received the information file for the URL. If not (S8, No), the browser 30 waits until the information file is received. When the browser 30 has received the information file (S8, Yes), it determines whether a vocabulary dictionary is attached to the information file (S9). If no vocabulary dictionary is attached (S9, No), the process proceeds to step S11. If a vocabulary dictionary is attached (S9, Yes), the browser 30 outputs the received vocabulary dictionary to the voice recognition unit 10 (S10). The browser 30 then displays the received information on the display device 50 (S11), returns to step S1, and repeats the subsequent processing.
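The decision logic of FIG. 4 (steps S1 to S11) can be summarized in a short sketch. This is an illustrative reading of the flowchart, not the embodiment's code: the function names, the dictionary-based information file, and the boolean `want_dictionary` are hypothetical. In particular, treating the request sent on the S7 path as one that also asks for the vocabulary dictionary is an assumption inferred from step S3 being described as a request made without one.

```python
from typing import List, Optional, Tuple


def next_request(
    typed_url: Optional[str],
    recognizer_running: bool,
    dictionary_received: bool,
    spoken_url: Optional[str],
) -> Optional[Tuple[str, bool]]:
    """Return (url, want_dictionary), or None when the loop goes back to S1."""
    if typed_url is not None:                   # S1: Yes
        # S2: Yes goes to the S7 path; S2: No goes to S3 (no dictionary request).
        return (typed_url, recognizer_running)
    if not recognizer_running:                  # S4: No
        return None
    if not dictionary_received:                 # S5: No
        return None
    if spoken_url is None:                      # S6: No
        return None
    return (spoken_url, True)                   # S6: Yes, S7 path


def handle_received(info_file: dict, recognizer_inbox: List[list], screen: List[str]) -> None:
    """Steps S9 to S11 for an information file that has arrived (S8: Yes)."""
    vocabulary = info_file.get("vocabulary")
    if vocabulary:                              # S9: dictionary attached?
        recognizer_inbox.append(vocabulary)     # S10: hand it to the recognizer
    screen.append(info_file["dtext"])           # S11: display the received text
```

A caller would loop over `next_request`, send the request when the result is not `None`, and pass the reply to `handle_received`.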

【００７３】FIG. 5 is a flowchart for explaining the processing procedure of the voice recognition unit 10. This procedure corresponds to the processing of the browser 30 shown in FIG. 4. First, the voice recognition unit 10 determines whether the user has input voice through the microphone 60 (S50). If no voice has been input (S50, No), it determines whether the browser 30 has output a speech vocabulary dictionary (S51). If so (S51, Yes), it registers the contents of the speech vocabulary dictionary (JDATA[1] to JDATA[M]) and the action codes (ONO[1] to ONO[M]) in the vocabulary dictionary storage unit 12 and the action code storage unit 11, respectively (S52), returns to step S1, and repeats the subsequent processing. If the browser 30 has not output a vocabulary dictionary (S51, No), the process returns to step S1 and repeats.

【００７４】If the user has input voice through the microphone 60 in step S50 (S50, Yes), the voice recognition unit 10 performs voice recognition with reference to JDATA in the vocabulary dictionary information description area and extracts the action code ONO corresponding to the matching speech vocabulary (S53). It then determines whether the type of that action code (the ATY value of the ONO-th action object) is "link" (S54).

【００７５】If the type of the action code is "link" (S54, Yes), the voice recognition unit 10 outputs to the browser 30 the ODT value (URL data), which is the action object content corresponding to the recognition result ONO (S55). If the type is other than "link" (S54, No), action control is performed according to the ODT value of the action object corresponding to the recognized speech vocabulary (S56).
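Steps S52 to S56 of FIG. 5 amount to a table lookup from a recognized vocabulary item to an action object. The sketch below illustrates that lookup under strong simplifying assumptions: real acoustic matching is replaced by exact string comparison against JDATA, the recognizer is handed the action objects as a list of (ATY, ODT) pairs, and the class name, method names, and returned tuples are hypothetical.

```python
from typing import List, Optional, Tuple


class VoiceRecognizerSketch:
    def __init__(self, actions: List[Tuple[str, str]]):
        self.actions = actions        # (ATY, ODT) pairs; ONO is a 1-based index into this list
        self.vocab: List[str] = []    # stands in for vocabulary dictionary storage unit 12
        self.codes: List[int] = []    # stands in for action code storage unit 11

    def register(self, jdata: List[str], ono: List[int]) -> None:
        """S52: store JDATA[1..M] and ONO[1..M] received from the browser."""
        self.vocab = list(jdata)
        self.codes = list(ono)

    def on_speech(self, recognized: str) -> Optional[Tuple[str, str]]:
        """S53 to S56, with string equality standing in for voice recognition."""
        if recognized not in self.vocab:
            return None                           # nothing in the dictionary matched
        ono = self.codes[self.vocab.index(recognized)]   # S53: action code
        aty, odt = self.actions[ono - 1]
        if aty == "link":                         # S54: Yes
            return ("open_url", odt)              # S55: URL handed to the browser
        return ("action", odt)                    # S56: other action control


recognizer = VoiceRecognizerSketch([("link", "http://example.com/next")])
recognizer.register(["koko"], [1])
result = recognizer.on_speech("koko")
```

Here `result` carries the URL that the browser would then request in its own loop.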

【００７６】FIG. 6 is a block diagram showing the schematic configuration of the information providing server 2 according to the embodiment of the present invention. The information providing server 2 includes an information providing unit 210, a kanji-to-kana conversion dictionary storage unit 211, a kana-to-phoneme conversion dictionary storage unit 212, a provided information storage device 220, a communication device 230, an input device 240, a display device 250, and a CD-ROM drive 260 into which a CD-ROM (Compact Disc-Read Only Memory) 270 is loaded. Here "kana" means hiragana/katakana. The information providing unit 210 includes a vocabulary dictionary generating unit 215 and performs overall control of the information providing server 2. The information providing unit 210 is implemented by a program (hereinafter called the information providing program). The information providing program is supplied on the CD-ROM 270 and installed on a hard disk or the like (not shown).

【００７７】The provided information storage device 220 stores the information to be provided in response to requests from the terminal device 1 and consists of a large-capacity disk such as a hard disk. The communication device 230 receives information provision commands from the terminal device 1 and transmits information files to the terminal device 1 via a dedicated line or a general public line.

【００７８】The kanji-to-kana conversion dictionary storage unit 211 stores a dictionary used to convert kanji among the text characters contained in an information file to be provided into hiragana/katakana. The kana-to-phoneme conversion dictionary storage unit 212 stores a dictionary used to convert into phonemes either the hiragana/katakana among the text characters contained in an information file to be provided, or the hiragana/katakana obtained by converting kanji.
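The two dictionaries form a two-stage pipeline: kanji to kana, then kana to phonemes. The following sketch shows the shape of that pipeline with toy lookup tables. The table contents, the function names, and the phoneme notation ("te N ki") are all hypothetical; the description notes that phoneme symbol definitions depend on the voice recognition engine, and real dictionaries would be far larger and would need proper word segmentation.

```python
# Toy stand-in for the kanji-to-kana dictionary (storage unit 211).
KANJI_TO_KANA = {"天気": "てんき"}

# Toy stand-in for the kana-to-phoneme dictionary (storage unit 212).
# The phoneme notation is an assumption made for this sketch.
KANA_TO_PHONEME = {"て": "te", "ん": "N", "き": "ki", "こ": "ko"}


def to_kana(text: str) -> str:
    """Replace every known kanji word with its kana reading."""
    for kanji, kana in KANJI_TO_KANA.items():
        text = text.replace(kanji, kana)
    return text


def to_phonemes(kana: str) -> str:
    """Map each kana character to a phoneme symbol; unknown characters raise KeyError."""
    return " ".join(KANA_TO_PHONEME[ch] for ch in kana)


from_kanji = to_phonemes(to_kana("天気"))  # kanji passes through both stages
from_kana = to_phonemes("ここ")            # kana input only needs the second stage
```

Keeping the two stages separate mirrors the two storage units and lets text that is already kana skip the first stage.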

【００７９】The input device 240 consists of a keyboard, a mouse, and the like. The input device 240 and the display device 250 are used for monitoring information users, maintaining information files, and the like.

【００８０】FIG. 7 is a flowchart for explaining the processing procedure of the information providing unit 210 shown in FIG. 6. First, the information providing unit 210 determines whether a transmission request for an information file has been received from the terminal device 1 via the communication device 230 (S100). If not (S100, No), it waits until one is received. When a transmission request is received (S100, Yes), the information providing unit 210 determines whether to attach a speech vocabulary dictionary to the information file when transmitting it (S101).

【００８１】If no speech vocabulary dictionary is to be attached (S101, No), the content of the information file corresponding to the requested URL, that is, the variables DTEXT, POS[1] to POS[N], and ODT[1] to ODT[N] shown in FIG. 1, is transmitted to the requesting terminal device 1 (S102). The process then returns to the entry point and repeats from step S100.

【００８２】If a speech vocabulary dictionary is to be attached (S101, Yes), it is determined whether the variable JFG of the information file corresponding to the requested URL is "1" (S103). If the JFG value is not "1" (S103, No), then either no vocabulary dictionary exists in the information file (JFG = 0), or a vocabulary dictionary already exists but the file specifies that its content is to be analyzed automatically to generate a speech vocabulary dictionary corresponding to the action objects, which is then merged with the existing vocabulary dictionary and transmitted (JFG = -1). In either case, the process proceeds to an entry point whose processing is described later.

【００８３】If the JFG value of the requested URL is "1" (S103, Yes), it is determined whether the variable JTY of the information file is "0" (S104). If the JTY value is not "0" (S104, No), the vocabulary dictionary JDATA[1] to JDATA[M] in the information file is in phoneme symbol string format. The speech vocabulary dictionary is therefore attached to the content of the information file, and the variables DTEXT, POS[1] to POS[N], ODT[1] to ODT[N], JDATA[1] to JDATA[M], and ONO[1] to ONO[M] shown in FIG. 1 are transmitted to the requesting terminal device 1 (S105). The process then returns to the entry point and repeats from step S100.

【００８４】If the variable JTY of the information file is "0" (S104, Yes), the vocabulary dictionary JDATA[1] to JDATA[M] is in hiragana/katakana format, so the process proceeds to an entry point, described later, where the hiragana/katakana is converted into phoneme symbol strings.
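The branching in steps S101 to S105 of FIG. 7 depends only on whether a dictionary is to be attached and on the JFG and JTY values. A minimal sketch of that decision follows; the function name and the returned labels are hypothetical, while the branch conditions follow the description.

```python
def choose_server_action(attach_dictionary: bool, jfg: int, jty: int) -> str:
    """Pick the branch of FIG. 7 taken for one transmission request."""
    if not attach_dictionary:               # S101: No
        return "send_without_dictionary"    # S102: DTEXT, POS[1..N], ODT[1..N]
    if jfg != 1:                            # S103: No (JFG is 0 or -1)
        return "generate_dictionary"        # continues in the flow of FIG. 8
    if jty != 0:                            # S104: No, JDATA is already phoneme symbols
        return "send_with_dictionary"       # S105: also sends JDATA[1..M], ONO[1..M]
    return "convert_kana_to_phonemes"       # S104: Yes, JDATA is hiragana/katakana
```

For example, a file with JFG = 1 and a nonzero JTY is sent as stored, while JFG = -1 always triggers generation regardless of JTY.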

【００８５】As described above, two data formats are assumed for the vocabulary dictionary, "phoneme symbols" and "kana", because the definition of the phoneme symbol data format depends on the voice recognition engine. Consider a service in which an information provider attaches a speech vocabulary dictionary to information provided over the Internet. When such a service is first commercialized, it is expected to carry the restriction that "the voice recognition engine in the terminal device must access an information providing server that depends on that voice recognition engine."

【００８６】For example, using a voice recognition engine made by Sharp would mean having to access information provided by an information providing server prepared by Sharp. Voice operation would not be possible unless the information is accessed via that server. As long as such a service is provided by a single company, users are unlikely to be much confused. As such services gain public recognition, however, each company will develop its own voice recognition engine, and phoneme symbols defined differently for each engine will come to exist. As that situation progresses further, the phoneme symbol definitions will probably be unified among the companies.

【００８７】This embodiment describes two types, phoneme symbols and kana, but these particular data formats have no special significance. They simply reflect the fact that multiple data formats may exist for information files. Although information files may come in various types, it is desirable to unify the format on the communication medium, so in this embodiment phoneme symbols are transmitted and received as the unified format. Information files on the storage medium are stored as phoneme symbol strings and kana character strings. The formats are not limited to these, however. A character string containing kanji codes, for example, or "standard Ver. 1 phoneme symbols" or "standard Ver. 2 phoneme symbols" may also be used.

【００８８】Note also that automatic generation of the speech vocabulary dictionary need not be performed by the server that manages and stores the requested information file. In this embodiment, for convenience of explanation, automatic generation is performed within the information providing server that manages and stores the information files. Alternatively, the information file to be provided and the identity of the requester may be output to a separate server that offers a speech vocabulary dictionary generation service, and that server may provide the requester with the information file with the speech vocabulary dictionary attached. As another alternative, the terminal device 1 may first receive the requested information file and then send it to a dedicated server, receiving back from that server the information file with the speech vocabulary dictionary attached. In actual services, the party that publishes information usually differs from the party that manages the system providing it, so generating and transmitting the speech vocabulary dictionary on such a dedicated server is considered the typical arrangement.

[0089] FIG. 8 is a flowchart of the processing (entry processing) performed when, in step S103 of FIG. 7, the variable JFG of the requested information file is not "1". First, the local variable i is initialized to "1" (S200). The information providing unit 210 then transfers the contents of the requested information file to a transmission memory (not shown) (S201). Although variable values of the information file are changed from step S201 onward, these changes apply to the copy transferred to the transmission memory, not to the information file stored in the provided information storage device 220.

[0090] Next, the information providing unit 210 determines whether the JFG value is "0" (S202). If the JFG value is "0" (S202, Yes), that is, if no speech vocabulary dictionary is attached, the variable M is initialized to "1" (S204). If the JFG value is not "0" (S202, No), that is, if a speech vocabulary dictionary is attached, the variable M is incremented by 1 (S203).

[0091] Next, the vocabulary dictionary generation unit 215 takes the text data string (kanji) indicated by the variable POS[i], which holds the text data position/area corresponding to the action object, and converts it into a hiragana/katakana character string CS1 (S205). It then converts that character string into a phoneme symbol string CS2 (S206). The vocabulary dictionary generation unit 215 stores CS2 in the variable JDATA[M] as vocabulary dictionary data and stores the current value of i in the variable ONO[M] as the action object number (S207).

[0092] Next, the information providing unit 210 determines whether the value of i equals the number N of action objects (S208). If it does not (S208, No), unprocessed action objects remain, so i and M are both incremented (S209) and the processing from step S205 is repeated.

[0093] If the value of i equals the number N of action objects (S208, Yes), the completed speech vocabulary dictionary data is attached to the information file, and the variables DTEXT, POS[1] to POS[N], ODT[1] to ODT[N], JDATA[1] to JDATA[M], and ONO[1] to ONO[M] shown in FIG. 1 are transmitted to the requesting terminal device 1 (S210). Processing then returns to the entry and repeats from step S100.
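
The loop of FIG. 8 (steps S202 to S209) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the function name `generate_entries` and the converter callables `to_kana` and `to_phonemes` are hypothetical stand-ins for the kanji-to-kana conversion (S205) and the kana-to-phoneme conversion (S206), whose internals the text does not specify. The sketch also assumes that, when JFG is not "0", M continues from the number of entries already present.

```python
from typing import Callable, Dict, List, Optional, Tuple


def generate_entries(
    texts: List[str],                        # display text indicated by POS[1..N]
    to_kana: Callable[[str], str],           # hypothetical converter for S205
    to_phonemes: Callable[[str], str],       # hypothetical converter for S206
    jfg: int = 0,
    jdata: Optional[Dict[int, str]] = None,  # existing JDATA, 1-based keys
    ono: Optional[Dict[int, int]] = None,    # existing ONO, 1-based keys
) -> Tuple[Dict[int, str], Dict[int, int]]:
    """Return copies of JDATA and ONO with one new entry per action object."""
    jdata = dict(jdata or {})
    ono = dict(ono or {})
    # S204: start at 1 when no dictionary is attached.
    # S203: otherwise continue after the existing entries (assumption).
    m = 1 if jfg == 0 else len(jdata) + 1
    for i, text in enumerate(texts, start=1):
        cs1 = to_kana(text)       # S205: kanji -> hiragana/katakana
        cs2 = to_phonemes(cs1)    # S206: kana -> phoneme symbols
        jdata[m] = cs2            # S207: vocabulary dictionary data
        ono[m] = i                # S207: action object number
        m += 1                    # S209
    return jdata, ono
```

The flowchart tests i against N only after an entry has been processed, which presumes at least one action object; the `for` loop used here also handles a file with none.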

[0094] FIG. 9 shows an example of the generation of speech vocabulary dictionary data. The HTML file shown in FIG. 9(a) produces the display shown in FIG. 9(b). For the underlined text "先頭ページに戻る" ("Return to the first page"), which is the part clicked with the mouse, a speech vocabulary dictionary entry such as "sentoupe-jinimodoru" is created. In other words, the system will not recognize the speech unless the user utters exactly the same sentence.

[0095] As this example shows, the display text data corresponding to an action object is not always a single word and may be a sentence. Although it depends on the information provided, a dictionary entry should generally be a word rather than a sentence. A word segmentation step can therefore be inserted before step S205 of FIG. 8, so that the conversions of steps S205 and S206 are applied only to the first word of the segmented text. Because the segmentation only has to extract the first word, a word dictionary containing nothing but nouns should be sufficient.
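
The first-word extraction suggested above might look like the following Python sketch. The text does not say how the segmentation works, so this sketch assumes a longest-prefix match against a noun set; the function name `first_word` and the sample noun set are hypothetical.

```python
from typing import Set


def first_word(text: str, nouns: Set[str]) -> str:
    """Return the longest noun in `nouns` that is a prefix of `text`.

    Falls back to the whole text when no dictionary noun matches.
    """
    for end in range(len(text), 0, -1):
        if text[:end] in nouns:
            return text[:end]
    return text


# Example with a toy noun dictionary (assumed contents).
print(first_word("先頭ページに戻る", {"先頭", "先頭ページ", "ページ"}))
```

Only the returned prefix would then be passed to the conversions of steps S205 and S206, so the user needs to utter just that word rather than the whole sentence.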

[0096] FIG. 10 is a flowchart of the processing (entry processing) performed when, in step S104 of FIG. 7, the variable JTY of the requested information file is "0". First, the information providing unit 210 initializes the local variable i to "1" (S300). It then transfers the contents of the requested information file to a transmission memory (not shown) (S301). Although variable values of the information file are changed from step S301 onward, these changes apply to the copy transferred to the transmission memory, not to the information file stored in the provided information storage device 220.

[0097] Next, the vocabulary dictionary generation unit 215 converts the current speech vocabulary dictionary content JDATA[i] (hiragana/katakana) into a phoneme symbol string CS2 (S302), and stores CS2 back in the variable JDATA[i] as vocabulary dictionary data (S303).

[0098] Next, the information providing unit 210 determines whether the value of i equals the number M of speech vocabulary data items (S304). If it does not (S304, No), unprocessed speech vocabulary data remains, so i is incremented (S305) and the processing from step S302 is repeated.

[0099] If the value of i equals the number M of speech vocabulary data items (S304, Yes), the completed speech vocabulary dictionary data is attached to the information file, and the variables DTEXT, POS[1] to POS[N], ODT[1] to ODT[N], JDATA[1] to JDATA[M], and ONO[1] to ONO[M] shown in FIG. 1 are transmitted to the requesting terminal device 1 (S306). Processing then returns to the entry and repeats from step S100.
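
The in-place kana-to-phoneme conversion of FIG. 10 (steps S302 to S305) can be illustrated with the Python sketch below. The mapping table is a toy assumption that covers only the kana in the FIG. 9 example, and the function names are hypothetical; the phoneme symbol set actually used by the embodiment is not given in this section.

```python
from typing import List

# Toy table (assumption): only the kana needed for the FIG. 9 example.
KANA_TO_PHONEME = {
    "せ": "se", "ん": "n", "と": "to", "う": "u", "ぺ": "pe", "ー": "-",
    "じ": "ji", "に": "ni", "も": "mo", "ど": "do", "る": "ru",
}


def to_hiragana(text: str) -> str:
    """Shift katakana (U+30A1..U+30F6) to hiragana; leave other characters."""
    return "".join(
        chr(ord(c) - 0x60) if "ァ" <= c <= "ヶ" else c for c in text
    )


def kana_to_phonemes(kana: str) -> str:
    """Convert a hiragana/katakana string to a phoneme symbol string.

    Raises KeyError for kana missing from the toy table.
    """
    return "".join(KANA_TO_PHONEME[c] for c in to_hiragana(kana))


def convert_dictionary(jdata: List[str]) -> List[str]:
    """Replace each kana entry with its phoneme string (S302-S305)."""
    for i in range(len(jdata)):
        jdata[i] = kana_to_phonemes(jdata[i])
    return jdata


print(kana_to_phonemes("せんとうページにもどる"))  # sentoupe-jinimodoru
```

Note that the list here is indexed from 0, whereas the text numbers JDATA[1] to JDATA[M] from 1.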

[0100] FIG. 11 is a block diagram showing the schematic configuration of the information generating device 4 according to the embodiment of the present invention. It differs from the information providing server 2 of the present embodiment shown in FIG. 6 only in that the information providing unit 210 is replaced by an information generation unit 211. Detailed description of the overlapping configuration and functions is therefore not repeated.

[0101] The provided information storage device 220 stores the information to be provided, but this information cannot be accessed from outside, for example from the terminal device 1. The stored information may also be incomplete. Once the information to be provided is complete, the information generation unit 211 uploads it to the information providing server 2 via the communication device 230. The information generation unit 211 is implemented by a program (hereinafter, the information generation program). This program is supplied on a CD-ROM 270 and installed on a hard disk or similar storage (not shown).

[0102] FIG. 12 is a flowchart of the processing procedure of the information generating device 4 shown in FIG. 11. It resembles a rough outline of the usual procedure for creating and opening a homepage, with the addition of step S24, which generates the vocabulary dictionary. First, the page materials for the homepage are created and prepared (S21). The basic page materials are document data (text data) and image data; the image data may be illustrations, photographs, animations, video, and so on. In this step, everything that will serve as page material is prepared as digital data that a computer can handle.

[0103] Software functions for producing this digital data include editor functions for creating various kinds of data and import functions for reading data from digital data input devices such as digital cameras. Others include functions that convert BMP (bitmap) files into GIF (Graphics Interchange Format) or JPEG (Joint Photographic Experts Group) format to reduce the data size, and image conversion functions that map an image file into the Web color range, restricting the image data to a specific number of colors and a specific range of color codes so as to reduce differences among the devices used to view the homepage.

[0104] In recent years, materials such as photographs, illustrations, sounds, movies, and CGI (Computer Graphics Interface) have also been distributed through magazine supplements, software packages, and the Internet infrastructure, so even individual users can build their own homepages from these material collections.

[0105] Next, page data is generated (S22). A single page is composed of multiple materials; for example, illustrations or graphic data are arranged around text, or image data is pasted into document data.

[0106] Software functions for generating page data include basic editing functions, layout functions that determine the positions of characters and images, and link functions that paste or embed materials such as image data. The data format of a homepage is the HTML file, and each page consists of one HTML file. The contents of the HTML file, that is, the page data, describe the materials used, their display positions, and so on as character information; when a material is image data, the file name of that image data is written as a character string.

[0107] Next, links between pages are set (S23). A homepage is generally hierarchical: a table-of-contents page that presents the contents of the whole homepage serves as the top page, and several pages are linked from it as its children. Step S23 edits and generates the structure of the entire homepage, but if the other pages are themselves regarded as materials, its processing is almost the same as that of step S22.

[0108] The notable point, however, is that links are not limited to materials the individual user created, such as files on the user's own personal computer. Links can also be made to other people's homepages.

[0109] Next, the speech vocabulary dictionary is generated (S24). The details of this processing are described later. Finally, the completed homepage is registered and published (S25). This step is needed because steps S21 to S24 merely create the individual user's homepage, and third parties cannot yet access it. Typically, the homepage is registered and published by uploading all of the files created in steps S21 to S24 to the server of the provider with which the user has a contract.

[0110] FIG. 13 is a flowchart describing the processing of step S24 of FIG. 12 in more detail. First, the information generation unit 211 causes the display device 250 to show a message asking whether to generate the vocabulary dictionary automatically and a message prompting for the type of vocabulary dictionary to generate. When the user responds through the input device 240 (S400), the information generation unit 211 determines whether the JFG value is "0" (S401).

[0111] If the JFG value is "0" (S401, Yes), that is, if the vocabulary dictionary is not to be generated automatically, the variables JFG and JTY of the vocabulary dictionary management area are added to every file constituting the information to be provided, and each file is re-registered in the provided information storage device 220 under the same file name (S402). At this point the JFG value is "0" and the JTY value is arbitrary. The information generation unit 211 then causes the display device 250 to show a message asking whether the user wishes to create the vocabulary dictionary manually, and prompts the user for input.

[0112] If the user does not choose to create the vocabulary dictionary manually (S403, No), processing proceeds to step S25 of FIG. 12. If the user does (S403, Yes), an editor for creating the vocabulary dictionary is launched and the dictionary is created. This processing is not directly related to the present invention and is therefore not described in detail.

[0113] If the JFG value is not "0" in step S401 (S401, No), that is, if the vocabulary dictionary is to be generated automatically, the file names of all files constituting the provided information are stored in the local array variable tfile[] (S404). The local variable FC holds the number of those files minus 1.

[0114] Next, the local variable k is cleared to "0" (S405), and it is determined whether k is greater than the value of FC (S406). If k is greater than FC (S406, Yes), processing proceeds to step S403 and continues from there. If k is less than or equal to FC (S406, No), processing proceeds to the entry and performs the processing described below.

[0115] FIG. 14 is a flowchart of the automatic generation of the vocabulary dictionary. First, the vocabulary dictionary generation unit 215 reads the contents of the file for which the dictionary is to be generated, that is, the file named by tfile[k] (S410). The local variable i is then initialized to "1" (S411).

[0116] Next, the vocabulary dictionary generation unit 215 determines whether the value of i is greater than the number N of action objects (S412). If i is less than or equal to N (S412, No), the text data string (kanji) indicated by the variable POS[i], which holds the text data position/area corresponding to the action object, is converted into a hiragana/katakana character string CS1 (S413).

[0117] Next, it is determined whether the variable JTY is "0" (S414). If JTY is not "0" (S414, No), that is, if the vocabulary dictionary data type specified by the user is phoneme symbols, the vocabulary dictionary generation unit 215 converts the hiragana/katakana character string into a phoneme symbol string CS2 (S415). It then stores CS2 in the variable JDATA[i] as vocabulary dictionary data and stores the current value of i in the variable ONO[i] as the action object number (S416). The variable i is then incremented (S417), and processing returns to step S412 and repeats; that is, the vocabulary dictionary entry for the next action object is generated.

[0118] If JTY is "0" in step S414 (S414, Yes), that is, if the vocabulary dictionary data type specified by the user is hiragana/katakana, the vocabulary dictionary generation unit 215 stores the hiragana/katakana character string CS1 in the variable JDATA[i] as vocabulary dictionary data and stores the current value of i in the variable ONO[i] as the action object number (S418). Processing then proceeds to step S417.

[0119] If the value of i is greater than the number N of action objects in step S412 (S412, Yes), (i - 1) is assigned to the variable M (S419). The vocabulary dictionary generation unit 215 then adds the variables JDATA[1] to JDATA[M], ONO[1] to ONO[M], JFG, and JTY shown in FIG. 1 to the contents of the file tfile[k], and re-registers the file in the provided information storage device 220 under the same file name (S420). The value of k is then incremented (S421), and processing proceeds to the loop entry Lp shown in FIG. 13.
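
The two nested loops of FIGS. 13 and 14 (over files tfile[k], and over action objects within each file) can be sketched in Python as below. This is a simplified illustration under assumptions: files are modeled as an in-memory mapping from file name to the list of display texts indicated by POS[1..N], the converters `to_kana` and `to_phonemes` are hypothetical callables, and the function names are invented here. Re-registration in the storage device (S420) is modeled by returning the new fields.

```python
from typing import Any, Callable, Dict, List


def generate_for_file(
    texts: List[str],
    jty: int,
    to_kana: Callable[[str], str],       # hypothetical converter (S413)
    to_phonemes: Callable[[str], str],   # hypothetical converter (S415)
) -> Dict[str, Any]:
    """Build JDATA/ONO for one file, following S411 to S419."""
    jdata: Dict[int, str] = {}
    ono: Dict[int, int] = {}
    i = 1                                        # S411
    while i <= len(texts):                       # S412
        cs1 = to_kana(texts[i - 1])              # S413
        # S418 keeps kana when JTY is 0; S415-S416 store phonemes otherwise.
        jdata[i] = cs1 if jty == 0 else to_phonemes(cs1)
        ono[i] = i
        i += 1                                   # S417
    return {"JDATA": jdata, "ONO": ono, "M": i - 1, "JTY": jty}  # S419


def generate_for_all(
    files: Dict[str, List[str]],
    jty: int,
    to_kana: Callable[[str], str],
    to_phonemes: Callable[[str], str],
) -> Dict[str, Dict[str, Any]]:
    """Run the per-file generation for every file (S404 to S421)."""
    tfile = list(files)          # S404: file names
    fc = len(tfile) - 1          # FC: number of files minus 1
    k = 0                        # S405
    result: Dict[str, Dict[str, Any]] = {}
    while k <= fc:               # S406
        name = tfile[k]
        result[name] = generate_for_file(files[name], jty, to_kana, to_phonemes)
        k += 1                   # S421
    return result
```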

[0120] FIG. 15 illustrates the transmission paths of the vocabulary dictionary in the information providing system of the present embodiment. FIG. 15(a) shows the case in which the information file with the vocabulary dictionary attached is transmitted from the information providing server 2. When the terminal device 1 requests transmission of an information file, the information providing server 2 outputs the information file to the information generating device 4 together with a vocabulary dictionary generation request. The information generating device 4 generates the vocabulary dictionary in response to the request and sends the information file with the dictionary attached to the information providing server 2. The information providing server 2 then sends that information file to the terminal device 1.

[0121] FIG. 15(b) shows the case in which the information file with the vocabulary dictionary attached is transmitted from the information generating device 4. When the terminal device 1 requests transmission of an information file, the information providing server 2 outputs to the information generating device 4 the information file and the ID of the requesting terminal device, together with a vocabulary dictionary generation request. The information generating device 4 generates the vocabulary dictionary in response to the request and sends the information file with the dictionary attached directly to the terminal device 1.

[0122] FIG. 15(c) shows the case in which the information file is sent directly from the terminal device 1 to the information generating device 4, and the information file with the vocabulary dictionary attached is sent back from the information generating device 4 in response. When the terminal device 1 requests transmission of an information file, the information providing server 2 sends the information file to the terminal device 1. The terminal device 1 outputs the information file and the transmission request to the information generating device 4, together with a vocabulary dictionary generation request. The information generating device 4 generates the vocabulary dictionary in response to the request and sends the information file with the dictionary attached directly to the terminal device 1.

[0123] In the three forms of information providing system shown in FIGS. 15(a) to 15(c), control differs slightly depending on whether vocabulary dictionary generation is offered for a fee. Specifically, only in the form of FIG. 15(a) is the information generating device 4 not notified of the ID of the terminal device 1 requesting the information. To charge for vocabulary dictionary generation, therefore, a mechanism is needed for receiving the ID of the terminal device 1 from the information providing server 2, or directly from the terminal device 1, so that it can be determined whether the requester is a subscriber. A subscriber database must also be provided for the paid service, and each requester must be checked against it.
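
The three routes of FIG. 15 and the observation about the terminal ID can be summarized in a small Python sketch. The hop lists below paraphrase the text; the names `ROUTES` and `generator_knows_terminal`, and the message labels, are illustrative assumptions rather than anything defined by the embodiment.

```python
from typing import Dict, List, Tuple

Hop = Tuple[str, str, str]  # (sender, receiver, payload description)

ROUTES: Dict[str, List[Hop]] = {
    "a": [
        ("terminal", "server", "file request"),
        ("server", "generator", "file + dictionary request"),
        ("generator", "server", "file with dictionary"),
        ("server", "terminal", "file with dictionary"),
    ],
    "b": [
        ("terminal", "server", "file request"),
        ("server", "generator", "file + terminal ID + dictionary request"),
        ("generator", "terminal", "file with dictionary"),
    ],
    "c": [
        ("terminal", "server", "file request"),
        ("server", "terminal", "file"),
        ("terminal", "generator", "file + dictionary request"),
        ("generator", "terminal", "file with dictionary"),
    ],
}


def generator_knows_terminal(route: str) -> bool:
    """True if some hop gives the generator the terminal's identity."""
    return any(
        dst == "generator" and (src == "terminal" or "terminal ID" in msg)
        for src, dst, msg in ROUTES[route]
    )
```

Under this model only route "a" leaves the generating device without the terminal's identity, which is why a paid service in that form needs the additional ID notification described above.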

[0124] FIG. 16 is a flowchart of the processing performed when the information providing service is offered for a fee. This processing is added to the information generating device 4. First, it is determined whether the service has been requested (S500). If it has not (S500, No), the device waits until it is requested. When the service is requested (S500, Yes), it is determined whether the requester is a subscriber (S501).

[0125] If the requester is not a subscriber (S501, No), the information file received from the information providing server 2 or the terminal device 1 is sent back unchanged to the information providing server 2 or the terminal device 1 (S502). Processing then returns to step S500 and repeats.

[0126] If the requester is a subscriber (S501, Yes), the vocabulary dictionary generation described above is performed (S503), and the information file with the vocabulary dictionary attached is sent to the information providing server 2 or the terminal device 1 (S504). Processing then returns to step S500 and repeats.
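
The subscriber check of FIG. 16 (steps S501 to S504) reduces to a single branch, sketched below in Python. The subscriber database is modeled as a plain set of IDs and the dictionary generation as a callable `attach_dictionary`; both, along with the function name `handle_request`, are assumptions made for illustration. The waiting loop of step S500 is omitted.

```python
from typing import Any, Callable, Dict, Set

InfoFile = Dict[str, Any]


def handle_request(
    requester_id: str,
    info_file: InfoFile,
    subscribers: Set[str],                             # stand-in subscriber database
    attach_dictionary: Callable[[InfoFile], InfoFile],  # stand-in for S503
) -> InfoFile:
    """Return the file to send back to the server or terminal."""
    if requester_id not in subscribers:      # S501, No
        return info_file                     # S502: return the file unchanged
    return attach_dictionary(info_file)      # S503-S504
```

A non-subscriber thus still receives the requested information file, only without the speech vocabulary dictionary.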

[0127] The embodiment described above has focused on speech recognition of link portions. However, speech vocabulary dictionaries can also be created for text other than link portions and associated with various action objects, so that a variety of operations can be performed by voice.

[0128] As described above, in the information providing system of the present embodiment, the information providing server 2 creates a vocabulary dictionary corresponding to the display text transmitted from the terminal device 1 and sends it to the terminal device 1. The user can therefore obtain only the minimum vocabulary dictionary needed for voice operation, which reduces the amount of vocabulary dictionary data held in the terminal device 1. In addition, the information provider no longer needs to create the vocabulary dictionary to be transmitted, which lightens the provider's burden.

[0129] Furthermore, because the information providing server 2 converts the display text transmitted from the terminal device 1 into phoneme symbols before sending it to the terminal device 1, the amount of transmitted data can be reduced while a drop in the speech recognition rate is prevented.

[0130] Furthermore, because the information generating device 4 converts part of the text character strings in each file of a homepage into a speech vocabulary dictionary and attaches it, viewers can operate the homepage by voice. In particular, creating the speech vocabulary dictionary in association with URL descriptions makes it easy to access linked files by voice operation.

[0131] The embodiments disclosed herein are to be considered in all respects illustrative and not restrictive. The scope of the present invention is defined by the claims rather than by the description above, and is intended to include all modifications within the meaning and scope equivalent to the claims.

[Brief description of the drawings]

FIG. 1 is a diagram showing the data format of information provided in the information providing system according to the embodiment of the present invention.

FIG. 2 is a diagram showing an example of an HTML file.

FIG. 3 is a block diagram showing the schematic configuration of the information providing system according to the embodiment of the present invention.

FIG. 4 is a flowchart of the processing procedure of the terminal device 1 according to the embodiment of the present invention.

FIG. 5 is a flowchart of the processing procedure of the voice recognition unit 10.

FIG. 6 is a block diagram showing the schematic configuration of the information providing server 2 according to the embodiment of the present invention.

FIG. 7 is a flowchart of the processing procedure of the information providing unit 210.

FIG. 8 is a flowchart of the processing performed when, in step S103 of FIG. 7, the variable JFG of the requested information file is not "1".

FIG. 9 is a diagram showing an example of the generation of speech vocabulary dictionary data.

FIG. 10 is a flowchart of the processing performed when, in step S104 of FIG. 7, the variable JTY of the requested information file is "0".

FIG. 11 is a block diagram showing the schematic configuration of the information generating device 4 according to the embodiment of the present invention.

FIG. 12 is a flowchart of the processing procedure of the information generating device 4 according to the embodiment of the present invention.

FIG. 13 is a flowchart describing the processing procedure of step S24 of FIG. 12 in more detail.

FIG. 14 is a flowchart of the processing procedure for automatic generation of the vocabulary dictionary.

FIG. 15 is a diagram illustrating the transmission paths of the vocabulary dictionary in the information providing system according to the embodiment of the present invention.

【図１６】情報提供サービスを有料とした場合の処理
手順を説明するためのフローチャートである。FIG. 16 is a flowchart illustrating a processing procedure when the information providing service is charged.

【図１７】従来の電話回線音声入力システム等におけ
る一般的なシステム構成を示すブロック図である。FIG. 17 is a block diagram showing a general system configuration in a conventional telephone line voice input system or the like.

【図１８】かな漢字変換の処理内容を模式的に説明す
るための図である。FIG. 18 is a diagram for schematically explaining processing contents of kana-kanji conversion.

【図１９】かな漢字変換処理において使用される辞書
の一例を示す図である。FIG. 19 is a diagram showing an example of a dictionary used in kana-kanji conversion processing.

[Explanation of symbols]

1 terminal device, 2 information providing device, 3 Internet connection provider, 4 information generating device, 10 voice recognition unit, 11 action code storage unit, 12 vocabulary dictionary storage unit, 13 basic dictionary storage unit, 30 browser, 31 page storage unit, 40, 240 input device, 50, 250 display device, 60 microphone, 210 information providing unit, 211 kanji to hiragana/katakana conversion dictionary storage unit, 212 hiragana/katakana to phoneme conversion dictionary storage unit, 214 information generation unit, 215 vocabulary dictionary generation unit, 220 information provision storage unit, 230 communication device, 260 CD-ROM drive, 270 CD-ROM.

Continued from the front page: (51) Int. Cl.⁷, identification symbol, FI, theme code (reference): // G06F 17/30 110 G10L 3/00 571A

Claims

[Claims]

1. An information providing device comprising: communication means for receiving a transmission request for information and transmitting information corresponding to the transmission request; and information providing means for, in response to the transmission request received by the communication means, converting a part of a display text character string in designated information into speech vocabulary dictionary data used for voice operation, adding the converted data to the information corresponding to the transmission request, and outputting the information to the communication means.

2. The information providing device according to claim 1, wherein the information providing means merges the converted speech vocabulary dictionary data with speech vocabulary dictionary data already existing in the information corresponding to the transmission request, and outputs the merged speech vocabulary dictionary data.

3. The information providing device according to claim 1, wherein the information providing means converts a part of a display text character string associated with a hyperlink description in the designated information into speech vocabulary dictionary data used for voice operation, and outputs the data to the communication means in association with the hyperlink description.

4. The information providing device according to any one of claims 1 to 3, wherein the information providing means converts speech vocabulary dictionary data already existing in the information corresponding to the transmission request into speech vocabulary dictionary data of a different format, and outputs the converted data to the communication means.

5. The information providing device according to any one of claims 1 to 4, further comprising a first dictionary for converting a hiragana/katakana character string into phoneme symbols, wherein the information providing means converts hiragana/katakana in the display text character string into a phoneme symbol string with reference to the first dictionary.

6. The information providing device according to claim 5, further comprising a second dictionary for converting kanji into a hiragana/katakana character string, wherein the information providing means converts kanji in the display text character string into a hiragana/katakana character string with reference to the second dictionary, and converts the hiragana/katakana character string into a phoneme symbol string with reference to the first dictionary.

7. An information generating device comprising: information generating means for generating information including a plurality of files each having a display text character string; vocabulary dictionary generating means for converting a part of the display text character strings of the plurality of files generated by the information generating means into speech vocabulary dictionary data used for voice operation and adding the data to the plurality of files; and communication means for transmitting and uploading the plurality of files to which the speech vocabulary dictionary data has been added by the vocabulary dictionary generating means.

8. The information generating device according to claim 7, wherein the vocabulary dictionary generating means converts a part of a display text character string associated with a hyperlink description in the plurality of files into speech vocabulary dictionary data used for voice operation, and outputs the data to the communication means in association with the hyperlink description.

9. An information providing system comprising a terminal device, an information providing device for providing information in response to an information transmission request from the terminal device, and an information generating device for generating information to be uploaded to the information providing device, wherein the information generating device includes: information generating means for generating information including a plurality of files each having a display text character string; vocabulary dictionary generating means for converting a part of the display text character strings of the plurality of files generated by the information generating means into speech vocabulary dictionary data used for voice operation and adding the data to the plurality of files; and first communication means for uploading, to the information providing device, the plurality of files to which the speech vocabulary dictionary data has been added by the vocabulary dictionary generating means, and wherein the information providing device includes: second communication means for receiving an information transmission request from the terminal device and transmitting information corresponding to the transmission request; and information providing means for outputting, to the second communication means, the information corresponding to the transmission request received by the second communication means.

10. The information providing system according to claim 9, wherein the vocabulary dictionary generating means converts a part of a display text character string associated with a hyperlink description in the plurality of files into speech vocabulary dictionary data used for voice operation, and outputs the data to the first communication means in association with the hyperlink description.

11. An information providing method comprising the steps of: receiving an information transmission request; in response to the received information transmission request, converting a part of a display text character string in designated information into speech vocabulary dictionary data used for voice operation and adding the data to information corresponding to the transmission request; and transmitting the information to which the speech vocabulary dictionary data has been added.

12. An information generating method comprising the steps of: generating information including a plurality of files each having a display text character string; converting a part of the display text character strings of the generated plurality of files into speech vocabulary dictionary data used for voice operation and adding the data to the plurality of files; and transmitting and uploading the plurality of files to which the speech vocabulary dictionary data has been added.

13. A computer-readable recording medium storing a program for causing a computer to execute an information providing method, the information providing method comprising the steps of: receiving an information transmission request; in response to the received information transmission request, converting a part of a display text character string in designated information into speech vocabulary dictionary data used for voice operation and adding the data to information corresponding to the transmission request; and transmitting the information to which the speech vocabulary dictionary data has been added.

14. A computer-readable recording medium storing a program for causing a computer to execute an information generating method, the information generating method comprising the steps of: generating information including a plurality of files each having a display text character string; converting a part of the display text character strings of the generated plurality of files into speech vocabulary dictionary data used for voice operation and adding the data to the plurality of files; and transmitting and uploading the plurality of files to which the speech vocabulary dictionary data has been added.