JP6477096B2

JP6477096B2 - Input device and sound synthesizer

Info

Publication number: JP6477096B2
Application number: JP2015058374A
Authority: JP
Inventors: 竜之介大道; ウイルソンマイケル
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2015-03-20
Filing date: 2015-03-20
Publication date: 2019-03-06
Anticipated expiration: 2035-03-20
Also published as: JP2016177639A

Description

The present invention relates to an input device that uses a multi-touch panel, which can be operated by touching a plurality of points simultaneously.

Touch panels are widely used as user interfaces for accepting user input to electronic devices. A touch panel is formed by attaching a transparent sheet-like contact detection sensor to the display surface of a display device such as a liquid crystal display. Guided by the image shown on the display device, the user of the electronic device can make various inputs through an intuitive and easily understood operation: touching the surface of the transparent contact detection sensor (hereinafter, the operation detection surface) with a finger, a pen tip, or the like. When a touch panel is used as the user interface of an electronic device, data representing the user's operation, that is, data representing the user's touch position, must be converted into data suited to the application executed on that device. For example, a singing synthesizer that synthesizes a singing voice converts the user's touch position into data representing the pronunciation of the singing voice to be synthesized (for example, data representing a phoneme, pitch, or volume) and synthesizes the singing voice according to the conversion result.

Non-Patent Document 1: “Angry Bird”, Internet, <URL: http://www.angrybirds.jp/>

Most conventional touch panels could detect only a single point even when the user touched several locations on the operation detection surface at once, or could not detect the contact locations correctly. In recent years, however, multi-touch panels capable of detecting contact at multiple locations have become widespread. Using a multi-touch panel as the user interface of an electronic device is expected to enable applications that were not previously possible, for example a collaborative application in which several people operate the panel at the same time. Furthermore, if new information could be associated with the positional relationship among multiple touch positions, or if the processing performed by an application could be changed as those touch positions change, the range of applications to which such a user interface applies would widen. Until now, however, no technique has existed for associating new information with the positional relationship of multiple touch positions, or for changing an application's processing according to changes in those touch positions over time.

The present invention has been made in view of the problem described above. Its object is to provide a technique that makes it possible to associate information with the relationship among a plurality of touch positions on the operation detection surface, and to realize an application whose processing changes according to changes in that positional relationship over time.

To solve the above problem, the present invention provides an input device comprising: operation detection means for outputting operation content data indicating each of a plurality of touch positions produced by touch operations that a user performs on an operation detection surface; and conversion means for converting the operation content data into data representing other information and outputting it. Taking one of the touch positions indicated by the operation content data as a start point and one of the remaining touch positions as an end point, the conversion means converts (i) the position of a predetermined one of the start point and the end point, (ii) the direction of the other as seen from the predetermined one, and (iii) the distance between the two into three kinds of information according to predetermined rules, and outputs data representing a plurality of pieces of information including at least those three kinds.
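As an illustration of the three quantities the conversion means works from, the following minimal sketch computes them for one start/end pair. It assumes the start point is the "predetermined one" and that positions are plain (x, y) pairs; the names `VectorFeatures` and `vector_features` are hypothetical and do not appear in the patent.

```python
import math
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class VectorFeatures(NamedTuple):
    position: Point        # position of the predetermined point (assumed: the start point)
    direction_rad: float   # direction of the end point as seen from the start point
    distance: float        # distance between the start point and the end point


def vector_features(start: Point, end: Point) -> VectorFeatures:
    """Derive position, direction, and distance from a start/end pair."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    # atan2 returns an angle in (-pi, pi]; with a screen origin at the upper
    # left, positive y points downward, so angles grow clockwise on screen.
    return VectorFeatures(start, math.atan2(dy, dx), math.hypot(dx, dy))
```

Each of the three fields would then be mapped to application-specific information by whatever rule the application defines.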

A multi-touch panel is a specific example of the operation detection means. When the input device of the present invention is used as the input means of an electronic device, it can output, in response to operations on the operation detection surface, not only information determined by the position of the start point or the end point but also information determined by the direction of the other point as seen from the predetermined one of the two, and information determined by the distance between them (that is, information determined by their positional relationship). Moreover, if the operation detection means outputs the operation content data every time at least one touch position is updated, and the conversion means performs the conversion and output every time it receives operation content data, then the information is updated over time in response to an operation that changes the positional relationship between the start point and the end point, that is, an operation in which the fingertip designating the start point, the fingertip designating the end point, or both are moved so as to trace the operation detection surface. An application that performs some processing using this information can then change its processing as the information is updated. In the computer game disclosed in Non-Patent Document 1, dragging a character while touching it specifies the initial speed and direction with which the character is thrown, but changes in that initial speed and direction over time cannot be specified; that technique therefore differs from the present invention.

In a more preferable aspect, the input device is provided with notification means for notifying the user of a reference value or a recommended value for at least one of the position of the predetermined one of the start point and the end point, the direction of the other as seen from the predetermined one, and the distance between the two. With this aspect, the user can designate the start point or the end point while referring to the reference value or recommended value presented by the notification means.

In yet another preferable aspect, the operation detection means generates operation content data representing the touch start time in addition to each touch position, and the conversion means converts the time difference between the touch start time of the start point and that of the end point into fourth information, distinct from the three kinds of information above, and outputs it. This aspect makes it possible to associate information with both the positional relationship and the temporal relationship of a plurality of touch positions on the operation detection surface, allowing even more varied information input. Instead of mapping the time difference to fourth information, the time difference may be used to determine the start point. For example, when one of two touch positions designated with a time difference is to be the start point and the other the end point, the position designated first may be taken as the start point if the time difference is less than a predetermined threshold, and the position designated later may be taken as the start point if the time difference is equal to or greater than the threshold.
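The threshold rule for choosing the start point can be sketched as follows. This is only an illustration: the `Touch` record, the function name, and the 0.2-second default threshold are assumptions, since the text says only that the threshold is predetermined.

```python
from typing import NamedTuple, Tuple


class Touch(NamedTuple):
    x: float
    y: float
    start_time: float  # touch start time in seconds (unit is an assumption)


def assign_start_and_end(a: Touch, b: Touch,
                         threshold_s: float = 0.2) -> Tuple[Touch, Touch]:
    """Return (start, end) for two touches using the time-difference rule."""
    first, second = (a, b) if a.start_time <= b.start_time else (b, a)
    time_difference = second.start_time - first.start_time
    if time_difference < threshold_s:
        # Below the threshold: the touch designated first is the start point.
        return first, second
    # At or above the threshold: the touch designated later is the start point.
    return second, first
```

The same `time_difference` value could instead be passed on as the fourth information described above.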

In the present invention, the kind of information associated with the position of the predetermined one of the start point and the end point, and with the other quantities, depends on the kind of electronic device for which the input device of the present invention serves as input means. Electronic devices to which the input device can be applied include sound synthesizers, such as singing synthesizers, and computer game machines. The input device may of course also be used as an input device for a map application or for a computer game. For example, consider an aspect in which the input device is used as the input means of a sound synthesizer such as a singing synthesizer, in other words, an aspect that provides a sound synthesizer having the input device of the present invention and sound synthesis means for synthesizing a singing voice or an instrument performance sound according to the output data of the input device. In this aspect, the position of the predetermined one of the start point and the end point, the direction of the other as seen from the predetermined one, and the distance between the two may be associated with a plurality of pieces of information that define the sound to be synthesized (at least three: timbre or pronunciation, pitch, and volume). In an aspect in which the time difference between the touch times of the start point and the end point is associated with fourth information, that fourth information may be associated with the velocity used when outputting the synthesized sound.

In an aspect in which the input device of the present invention is used as the input means of a sound synthesizer, it is preferable to associate pitch with the direction of the other point as seen from the predetermined one of the start point and the end point, and to define the correspondence between direction and pitch so that the conversion means converts the data indicating the positions of the start point and the end point into data indicating a pitch that changes by one octave each time the other point rotates through a predetermined angle about the predetermined one. The direction of the other point as seen from the predetermined one can be expressed as an angle in polar coordinates centered on the predetermined one, and an angular representation of this kind has a high affinity with the representation of pitch in music theory, such as equal temperament.
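One way to realize "one octave per predetermined angle" is an exponential mapping from angle to frequency, sketched below. The base frequency of 220 Hz, the default of one octave per full turn, and both function names are illustrative assumptions. The sketch also assumes the caller supplies an angle that has already been unwrapped (accumulated across full turns), because a raw `atan2` result is confined to a single turn.

```python
import math


def angle_to_frequency(angle_rad: float,
                       octave_angle_rad: float = 2.0 * math.pi,
                       base_hz: float = 220.0) -> float:
    """Frequency that doubles each time the angle grows by octave_angle_rad."""
    return base_hz * 2.0 ** (angle_rad / octave_angle_rad)


def angle_to_semitones(angle_rad: float,
                       octave_angle_rad: float = 2.0 * math.pi) -> int:
    """Nearest equal-tempered semitone offset (12 semitones per octave)."""
    return round(12.0 * angle_rad / octave_angle_rad)
```

Under 12-tone equal temperament each semitone then corresponds to one twelfth of the octave angle, which is the affinity with angular representation that the paragraph refers to.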

As another aspect for solving the above problem, a program may be provided that causes a computer such as a CPU (Central Processing Unit) to function as the conversion means and the sound synthesis means, that is, a program that causes the computer to function as the sound synthesizer of the present invention. Similarly, a program may be provided that causes a computer to function as the conversion means, that is, as the input device of the present invention. Such programs may be distributed by writing them to a computer-readable recording medium such as a CD-ROM (Compact Disk-Read Only Memory), a DVD (registered trademark: Digital Versatile Disc), or a flash ROM, or by download over a telecommunication line such as the Internet.

A block diagram showing a configuration example of a singing synthesizer 10, which is one embodiment of the sound synthesizer of the present invention.
A diagram showing an example of the screen displayed on the display means 110a of the user interface unit 110 of the singing synthesizer 10.
A diagram for explaining input operations on the singing synthesizer 10.
A diagram for explaining the state transitions realized when the control unit 100 of the singing synthesizer 10 operates according to the singing synthesis program 124a.
A flowchart showing the flow of the processing that the control unit 100 executes according to the singing synthesis program 124a when a touch start is detected.
A flowchart showing the flow of the processing that the control unit 100 executes according to the singing synthesis program 124a when a touch end is detected.
A diagram showing an example of the source code of a program that implements angle unwrapping.
A diagram for explaining modification (3).
A diagram for explaining modification (4).
A diagram for explaining modification (8).
A diagram for explaining modification (8).
A diagram for explaining modification (8).
A diagram for explaining modification (8).

Embodiments of the present invention are described below with reference to the drawings.
(A: Configuration)
FIG. 1 is a diagram showing a configuration example of a singing synthesizer 10, which is one embodiment of the sound synthesizer of the present invention. The singing synthesizer 10 is a tablet computer of a size that a user can hold and operate (for example, B5 size) on which a singing synthesis program has been installed. The user can hold the singing synthesizer 10 in one hand and perform various operations with the other. The singing synthesizer 10 synthesizes a singing voice according to the user's operations and emits it as sound. As shown in FIG. 1, the singing synthesizer 10 has a control unit 100, a user interface (hereinafter "I/F") unit 110, a storage unit 120, and a bus 130 that mediates data exchange among these components.

The control unit 100 is, for example, a CPU. By executing the singing synthesis program 124a stored in the storage unit 120, the control unit 100 functions as the control center of the singing synthesizer 10. The processing that the control unit 100 executes according to the singing synthesis program 124a is described in detail later.

The user I/F unit 110 has display means 110a, sound output means 110b, and operation detection means 110c. The display means 110a is a liquid crystal panel of substantially the same size as the housing of the singing synthesizer 10. The sound output means 110b outputs sound corresponding to sound data supplied from the control unit 100 (for example, a sample sequence representing a sampled sound waveform; in this embodiment, a sample sequence of a singing voice). Although omitted from FIG. 1, the sound output means 110b includes a D/A converter that converts the sound data into an analog signal and a speaker driven by the output signal of the D/A converter.

The operation detection means 110c is a transparent sheet-like contact detection sensor attached so as to cover the entire display surface of the display means 110a (that is, a transparent contact detection sensor slightly smaller than B5 size). Because the operation detection means 110c is transparent, the user can see the content displayed on the display means 110a through it. As described in detail later, under the control of the control unit 100 the display means 110a displays a screen that prompts the user to use the singing synthesizer 10. Guided by that screen, the user operates the operation detection surface. When a touch start or touch end operation is performed on its surface (that is, the operation detection surface) with the user's fingertip or the like, the operation detection means 110c outputs operation content data representing that operation. A touch end operation means lifting the touching fingertip or the like off the operation detection surface.

The operation content data includes position data indicating the touch start position or touch end position, and operation type data indicating the operation type (touch start or touch end). In this embodiment, the position data indicates the coordinates of the touch start position or the like in a two-dimensional coordinate system whose origin is the upper left corner of the operation detection surface, with the horizontal scanning direction of the display means 110a as one coordinate axis and the vertical scanning direction as the other. When a touch on another position begins while one position on the operation detection surface is already being touched, the operation detection means 110c outputs operation content data containing position data indicating the latter position and operation type data indicating touch start. In other words, the operation detection means 110c and the display means 110a together form a multi-touch panel.
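The operation content data described above might be modeled as a small record like the following. The class and field names are hypothetical; the text specifies only that the data carries a position and an operation type.

```python
from dataclasses import dataclass
from enum import Enum


class OperationType(Enum):
    TOUCH_START = "touch_start"
    TOUCH_END = "touch_end"


@dataclass(frozen=True)
class OperationContentData:
    # Coordinates with the origin at the upper-left corner of the operation
    # detection surface: x along the horizontal scanning direction,
    # y along the vertical scanning direction.
    x: float
    y: float
    operation_type: OperationType


# A second touch that begins while a first one is still held is simply
# reported as another TOUCH_START event carrying its own position.
events = [
    OperationContentData(120.0, 80.0, OperationType.TOUCH_START),
    OperationContentData(300.0, 210.0, OperationType.TOUCH_START),
    OperationContentData(300.0, 210.0, OperationType.TOUCH_END),
]
```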

As shown in FIG. 1, the storage unit 120 has a volatile storage unit 122 and a nonvolatile storage unit 124. The volatile storage unit 122 is, for example, a RAM (Random Access Memory), and the control unit 100 uses it as a work area when executing various programs. The nonvolatile storage unit 124 is a flash ROM or a hard disk. In addition to the singing synthesis program 124a, the nonvolatile storage unit 124 stores OS software (not shown in FIG. 1) that causes the control unit 100 to implement an OS (Operating System).

When the singing synthesizer 10 is powered on (the power supply is not shown in FIG. 1) or reset, the control unit 100 reads the OS software from the nonvolatile storage unit 124 into the volatile storage unit 122 and begins executing it. While operating according to the OS software and thereby implementing the OS, the control unit 100 can execute other programs in response to user instructions. When an instruction to execute the singing synthesis program 124a is given via the user I/F unit 110, the control unit 100 reads the singing synthesis program 124a from the nonvolatile storage unit 124 into the volatile storage unit 122 and begins executing it.

The control unit 100 operating according to the singing synthesis program 124a functions as voice synthesis means that synthesizes waveform data of a singing voice according to synthesis instruction data specifying the pronunciation (in this embodiment, the phoneme), pitch, and volume of the singing voice to be synthesized, and outputs the waveform data to the sound output means 110b. In addition, the control unit 100 causes the display means 110a to display a screen that lets the user specify, by operating the operation detection means 110c, the plurality of pieces of information that define the singing voice to be synthesized (the phoneme, pitch, and volume mentioned above).

More specifically, the control unit 100 operating according to the singing synthesis program 124a causes the display means 110a to display the screen shown in FIG. 2 so that the user can specify the plurality of pieces of information that define the singing voice to be synthesized. As shown in FIG. 2, the screen contains small circles A01 to A05, arranged so that each lies at a vertex of a regular pentagon, and a large circle A06 passing through the centers of the small circles A01 to A05. A character representing a vowel (“a”, “e”, “i”, “o”, or “u”) is drawn inside each of the small circles A01 to A05. A user viewing the screen of FIG. 2 can specify the plurality of pieces of information that define the singing voice to be synthesized by performing an operation that designates one vector on the operation detection surface (that is, an operation that designates the start point and the end point of the vector).

In this embodiment, a plurality of vectors cannot be designated simultaneously on the operation detection means 110c. That is, if a further touch operation is performed on another position while one vector is designated (while the user is holding down the start point and the end point of the vector with fingertips), that touch operation is ignored. In this embodiment, when the user performs a touch operation while no vector is designated at all, or while only the end point is designated (a state that arises when the designation of the start point is released from a state in which both the start point and the end point were designated), the touch position is interpreted as the start point of a vector.

More specifically, by designating a point inside one of the small circles A01 to A05 as the start point, the user can specify the vowel corresponding to that small circle as the phoneme of the singing voice to be synthesized. In this embodiment, small circles to which vowels are assigned serve as the symbols prompting the user to specify the phoneme, but other symbols may of course be used. The user can also specify the pitch by the direction of the end point as seen from the start point, and the volume by the distance between the start point and the end point. For example, to instruct synthesis of a singing voice that pronounces the vowel “a” at a certain pitch and volume, the user designates a vector such as that represented by arrow B01 in FIG. 3 (that is, designates its start point and end point); to instruct synthesis of a louder singing voice, the user designates a vector such as that represented by arrow B02 in FIG. 3. To instruct synthesis of a singing voice with a higher pitch than the one indicated by the vector of arrow B01, the user designates a vector such as that represented by arrow B03 in FIG. 3.
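The phoneme and volume selection just described could look roughly like the sketch below. The geometry is entirely assumed: which vowel sits at which pentagon vertex, the radii, and the linear, clamped distance-to-volume rule are all illustrative choices, not details taken from FIG. 2 or FIG. 3.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def pentagon_centers(center: Point, radius: float) -> Dict[str, Point]:
    """Place five vowel circles on the vertices of a regular pentagon."""
    centers = {}
    for k, vowel in enumerate(["a", "e", "i", "o", "u"]):
        # Assumed layout: first vowel at the top, the rest at 72-degree steps
        # (screen y grows downward, so -pi/2 points up).
        theta = -math.pi / 2 + 2 * math.pi * k / 5
        centers[vowel] = (center[0] + radius * math.cos(theta),
                          center[1] + radius * math.sin(theta))
    return centers


def vowel_at(point: Point, centers: Dict[str, Point],
             small_radius: float) -> Optional[str]:
    """Return the vowel whose small circle contains the point, if any."""
    for vowel, c in centers.items():
        if math.dist(point, c) <= small_radius:
            return vowel
    return None


def volume_from_distance(distance: float, max_distance: float) -> float:
    """Map start-to-end distance linearly to a volume clamped to [0, 1]."""
    return max(0.0, min(1.0, distance / max_distance))
```

A longer vector (arrow B02 versus B01) then yields a larger volume, while the start point alone decides the vowel.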

When some operation is performed on the operation detection means 110c while the screen shown in FIG. 2 is displayed on the display means 110a, the control unit 100 writes data corresponding to the operation into the volatile storage unit 122, and the internal state of the singing synthesizer 10 transitions accordingly. The singing synthesizer 10 has the four internal states shown in FIG. 4: an initial state S01, in which neither the start point nor the end point is designated; a start-point-stored state S02, in which only the start point is designated; an end-point-stored state S03, in which only the end point is designated; and a start-and-end-point-stored state S04, in which both are designated. The control unit 100 realizes the transitions among these four states by executing the processing shown in FIG. 5 when it detects a touch start on the operation detection surface of the operation detection means 110c, and the processing shown in FIG. 6 when it detects a touch end.
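Because each of the four states is determined by whether a start point and an end point are currently stored, the state can be derived from two flags. The sketch below illustrates this; the enum and function names are hypothetical, and only the state labels S01 to S04 come from the text.

```python
from enum import Enum


class State(Enum):
    S01 = "initial: neither point designated"
    S02 = "start point stored"
    S03 = "end point stored"
    S04 = "start and end points stored"


def state_of(has_start: bool, has_end: bool) -> State:
    """Derive the internal state from which points are currently stored."""
    if has_start and has_end:
        return State.S04
    if has_start:
        return State.S02
    if has_end:
        return State.S03
    return State.S01
```

Deriving the state this way means no separate state variable has to be kept in sync with the stored position data.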

FIG. 5 is a flowchart showing the flow of the processing executed when a touch start is detected. Having detected a touch start by analyzing the operation content data received from the operation detection means 110c, the control unit 100 first determines whether start point position data indicating the start point is stored in the volatile storage unit 122 (step SA100). In this embodiment, the control unit 100 detects a touch start, and makes the determination of step SA100, when the operation type data contained in the operation content data indicates “touch start”.

If the determination result of step SA100 is “No”, the control unit 100 takes the position indicated by the position data contained in the operation content data as the position of the start point, stores start point position data indicating that position, that is, writes the start point position data into the volatile storage unit 122 (step SA110), and ends the processing. This realizes transition T01 from the initial state S01 to the start-point-stored state S02, or transition T07 from the end-point-stored state S03 to the start-and-end-point-stored state S04. If the determination result of step SA100 is “Yes”, the control unit 100 determines whether end point position data indicating the position of the end point is stored in the volatile storage unit 122 (step SA120).

If the determination result in step SA120 is "No", the control unit 100 takes the position indicated by the position data included in the operation content data as the position of the end point, stores end point position data indicating that position (step SA130), and ends the process. This realizes the transition T02 from the start point stored state S02 to the start point / end point stored state S04. If the determination result in step SA120 is "Yes", the control unit 100 ends the process without storing the touch start position. This is so that a touch on yet another position is ignored while both the start point and the end point are specified.

FIG. 6 is a flowchart showing the process executed when the end of a touch is detected. Having detected the end of a touch by analyzing the operation content data received from the operation detection unit 110c, the control unit 100 first determines whether it is the touch on the start point that has been released (step SB100). If the determination result in step SB100 is "Yes", the control unit 100 deletes the start point position data from the volatile storage unit 122 (step SB110) and ends the process. This realizes the transition T04 from the start point stored state S02 to the initial state S01, or the transition T05 from the start point / end point stored state S04 to the end point stored state S03.

If, on the other hand, the determination result in step SB100 is "No", the control unit 100 determines whether it is the touch on the end point that has been released (step SB120). If the determination result in step SB120 is "Yes", the control unit 100 deletes the end point position data from the volatile storage unit 122 (step SB130) and ends the process. This realizes the transition T03 from the start point / end point stored state S04 to the start point stored state S02, or the transition T06 from the end point stored state S03 to the initial state S01. If the determination result in step SB120 is "No", the control unit 100 ends the process. Note that in this embodiment a touch operation designating any point other than the start point and the end point is ignored, so the determination result in step SB120 never becomes "No".
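The touch-start process of FIG. 5 and the touch-end process of FIG. 6 can be sketched as a small state holder. This is only an illustrative sketch, not the patented implementation: the class name `TouchTracker`, the use of a per-touch identifier `touch_id` to decide which point was released, and the in-memory fields standing in for the volatile storage unit 122 are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class TouchTracker:
    """Hypothetical stand-in for the start/end point data held in storage."""
    start: Optional[Tuple[int, Point]] = None  # (touch_id, position)
    end: Optional[Tuple[int, Point]] = None    # (touch_id, position)

    def state(self) -> str:
        """Name the current state as in FIG. 4 (S01 to S04)."""
        if self.start is not None and self.end is not None:
            return "S04"
        if self.start is not None:
            return "S02"
        if self.end is not None:
            return "S03"
        return "S01"

    def on_touch_start(self, touch_id: int, pos: Point) -> None:
        if self.start is None:            # step SA100: "No"
            self.start = (touch_id, pos)  # step SA110 (T01 or T07)
        elif self.end is None:            # step SA120: "No"
            self.end = (touch_id, pos)    # step SA130 (T02)
        # Both points already stored: the extra touch is ignored.

    def on_touch_end(self, touch_id: int) -> None:
        if self.start is not None and self.start[0] == touch_id:  # SB100
            self.start = None                                     # SB110
        elif self.end is not None and self.end[0] == touch_id:    # SB120
            self.end = None                                       # SB130
```

A caller would feed touch events into `on_touch_start` and `on_touch_end` and issue a synthesis instruction whenever `state()` becomes `"S04"`.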

When the transition T02 from the start point stored state S02 to the start point / end point stored state S04, or the transition T07 from the end point stored state S03 to the start point / end point stored state S04, occurs, one vector is determined. Accordingly, when a transition to the start point / end point stored state S04 occurs, the control unit 100 converts the data representing that vector into synthesis instruction data according to a predetermined rule and supplies it to the sound synthesis means. In other words, the control unit 100 operating according to the singing synthesis program 124a functions as a conversion means that converts the operation content data into synthesis instruction data according to a predetermined rule and outputs it.

The conversion means of this embodiment converts the position of the start point stored in the start point / end point stored state S04 into the phoneme of the singing voice to be synthesized, the direction of the end point as seen from the start point into the pitch of that singing voice, and the distance between the start point and the end point into the volume of that singing voice, and outputs to the sound synthesis means synthesis instruction data instructing synthesis of a singing voice with that phoneme, pitch, and volume. As a result, a singing voice with that phoneme, pitch, and volume is output from the sound output means. The control unit 100 stops the output of a singing sound started in this way when a transition from the start point / end point stored state S04 to another state occurs.

Various data conversion rules are conceivable for the conversion means; specific examples follow. First, the rule for converting the start point position into a phoneme can be realized by building into the singing synthesis program 124a a conversion rule table that stores, in association with range data indicating the circumference and interior of each of the small circles A01 to A05 on the screen shown in FIG. 2 (data indicating that circumference and that region in the two-dimensional coordinate system described above), vowel data indicating the vowel corresponding to that small circle, and by comparing the contents of this table with the start point position indicated by the start point position data. For example, when the start point lies on the circumference of, or inside, one of the small circles, the conversion means takes the vowel represented by the vowel data associated with that small circle in the conversion rule table as the phoneme of the singing voice to be synthesized. When the start point does not lie on the circumference of or inside any of the small circles A01 to A05, it may be interpreted as designating the vowel corresponding to the nearest small circle, or as designating a phoneme intermediate between the vowels corresponding to the nearest and second-nearest small circles. Alternatively, in that case the start point may be interpreted as instructing the pronunciation of a phoneme obtained by morphing the vowels, using as the weight of each vowel the reciprocal of the distance between the start point and the center of the small circle corresponding to that vowel, normalized so that the weights sum to 1, and the position of the start point may be converted into that morphed phoneme.
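The circle lookup and the inverse-distance weighting described above can be sketched as follows. The circle centers, radii, and vowel labels in `CIRCLES` are made-up placeholder values, and the function name `vowel_weights` is hypothetical; the actual layout of FIG. 2 and the morphing itself are not modeled here.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]

# Hypothetical layout: vowel -> (center, radius) of its small circle.
CIRCLES: Dict[str, Tuple[Point, float]] = {
    "a": ((100.0, 100.0), 30.0),
    "i": ((300.0, 100.0), 30.0),
    "u": ((100.0, 300.0), 30.0),
    "e": ((300.0, 300.0), 30.0),
    "o": ((200.0, 200.0), 30.0),
}


def vowel_weights(start: Point,
                  circles: Dict[str, Tuple[Point, float]] = CIRCLES
                  ) -> Dict[str, float]:
    """Return morphing weights per vowel; the weights sum to 1."""
    dists = {v: math.dist(start, center) for v, (center, _) in circles.items()}
    # On the circumference of, or inside, a circle: that vowel alone.
    for vowel, (_, radius) in circles.items():
        if dists[vowel] <= radius:
            return {vowel: 1.0}
    # Otherwise weight each vowel by the reciprocal of its distance,
    # normalized so that the weights add up to 1.
    inverse = {v: 1.0 / d for v, d in dists.items()}
    total = sum(inverse.values())
    return {v: w / total for v, w in inverse.items()}
```

Taking the vowel with the largest weight reproduces the simpler "nearest small circle" interpretation.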

For the conversion of the direction of the end point as seen from the start point into a pitch, and of the distance between the start point and the end point into a volume, the conversion means first converts the position of the end point into polar coordinates (r, θ) with the start point as the origin. When the coordinates of the start point are (x1, y1) and those of the end point are (x2, y2), where x2 ≠ x1 and y2 ≠ y1, the conversion means performs the polar coordinate conversion according to Equations 1 and 2. In Equation 2, atan() is the arctangent function, x = x2 − x1, y = y2 − y1, and π is the ratio of a circle's circumference to its diameter. The conversion means then converts the value of r calculated according to Equation 1 into a volume using a conversion rule table that associates values of r with volumes, and converts the value of θ calculated according to Equation 2 into a pitch using a conversion rule table that associates values of θ with pitches. These conversion rule tables may likewise be built into the singing synthesis program 124a in advance.
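Equations 1 and 2 are not reproduced in this text, so the following is only a sketch of a polar conversion consistent with the description. It assumes the usual Euclidean distance for r and uses `math.atan2` in place of the atan()-based expression, which yields θ in the range −π to π without a separate case analysis. The volume thresholds in `VOLUME_TABLE` are invented placeholders for the conversion rule table, and the function names are hypothetical.

```python
import bisect
import math
from typing import List, Tuple

Point = Tuple[float, float]


def to_polar(start: Point, end: Point) -> Tuple[float, float]:
    """Polar coordinates (r, theta) of `end` with `start` as the origin."""
    x = end[0] - start[0]
    y = end[1] - start[1]
    r = math.hypot(x, y)       # distance between start and end
    theta = math.atan2(y, x)   # direction of end as seen from start
    return r, theta


# Hypothetical rule table: (minimum r, volume on a 0..127 scale).
VOLUME_TABLE: List[Tuple[float, int]] = [
    (0.0, 0), (50.0, 32), (100.0, 64), (200.0, 96), (400.0, 127),
]


def radius_to_volume(r: float) -> int:
    """Look up the volume for the largest threshold not exceeding r."""
    thresholds = [t for t, _ in VOLUME_TABLE]
    index = max(bisect.bisect_right(thresholds, r) - 1, 0)
    return VOLUME_TABLE[index][1]
```

On a screen coordinate system whose y axis points downward, the sense of rotation of θ is mirrored, so the sign of y may need to be flipped before calling `to_polar`.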

The value of θ calculated according to Equation 2 is limited to the range from −π to π. If the time interval between touch position detections is made sufficiently short, the angle can be assumed not to change by the threshold θth or more between the previous detection and the current one. Under this assumption, applying so-called unwrapping with the threshold θth = π extends the range of θ in both the positive and negative directions. One concrete unwrapping algorithm is the one given by the source code in FIG. 7. With such unwrapping, defining the conversion rule between the angle θ and the pitch so that the pitch changes by one octave each time the end point rotates about the start point by a predetermined angle (for example, the pitch rises one octave for a counterclockwise rotation by the predetermined angle and falls one octave for a clockwise rotation by that angle) makes possible a pitch representation that fits well with music theory such as equal temperament. Considering that playing an ordinary piece of music requires a range of about two octaves, it is preferable to set the predetermined angle to π radians, because the intuitive operation of carrying the end point once around the start point then covers the two-octave range needed for ordinary music. Needless to say, for pieces that require a wider range, the predetermined angle may be set to π/2 radians (a quarter turn) or π/4 radians (one eighth of a turn).
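The source code of FIG. 7 is not reproduced in this text. As a generic illustration of the unwrapping idea, and not the algorithm of FIG. 7, the sketch below keeps a running continuous angle and adds to it the smallest signed change implied by each new wrapped reading, which is valid under the stated assumption that the angle changes by less than π between detections. The function names and the semitone output unit are assumptions.

```python
import math


def unwrap_step(prev_unwrapped: float, new_theta: float) -> float:
    """Extend a continuous angle with a new reading in the range -pi..pi."""
    delta = new_theta - prev_unwrapped
    # Reduce the change to the equivalent value in [-pi, pi).
    delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
    return prev_unwrapped + delta


def angle_to_semitones(theta: float,
                       angle_per_octave: float = math.pi) -> float:
    """Pitch offset in semitones: 12 semitones per `angle_per_octave`."""
    return 12.0 * theta / angle_per_octave
```

With the default of π radians per octave, one full turn of the end point around the start point spans 24 semitones, that is, two octaves. Whether a positive angle corresponds to a counterclockwise rotation depends on the orientation of the coordinate system.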

As described above, this embodiment makes it possible to associate a plurality of pieces of information that define a singing voice (the phoneme of the lyrics, and the pitch and volume at which they are sung) with the positional relationship among a plurality of touch positions on the operation detection surface (in this embodiment, the two positions of the start point and the end point). Furthermore, if an operation detection unit 110c that outputs the operation content data each time at least one touch position is updated is used, and the control unit 100 is made to update the position of the start point or the end point according to that data and to generate and output synthesis instruction data according to the updated positions, the plurality of pieces of information can be updated over time in response to sliding the fingertip designating the start point, the fingertip designating the end point, or both across the operation detection surface. That is, the singing voice can be varied over time, and acoustic effects such as vibrato can be applied easily.

This embodiment has described the application of the present invention to a singing voice synthesizing apparatus, but the invention may also be applied to an apparatus that synthesizes speech reading out text or the like, or to an apparatus that synthesizes the performance sound of a musical instrument. For example, in a sound synthesizer that synthesizes instrument performance sounds, a timbre may be associated with the position of the start point. This is because, in such apparatuses as well, it remains desirable that the timbre or phoneme, the pitch, and the volume that define the sound to be synthesized can be specified and updated over time by intuitive, easy-to-understand operations.

(B: Modifications)
Although an embodiment of the present invention has been described above, the following modifications may be made.
(1) In the above embodiment, the sounding start time is the time at which the transition from the start point stored state S02 to the start point / end point stored state S04, or from the end point stored state S03 to the start point / end point stored state S04, occurs. However, the time at which the transition from the initial state S01 to the start point stored state S02 occurs may be used as the sounding start time instead. In the start point stored state S02 no end point is specified, so the pitch and volume of the singing voice to be synthesized cannot be determined. In this variant, therefore, the pitch and volume may be set to predetermined values, or to values determined using pseudo-random numbers or the like.

When the singing voice synthesizing apparatus 10 has other operation detection means, such as an acceleration sensor or other sensor, or operating controls, the pitch and volume may be determined according to data obtained through those means. Likewise, the sounding end time need not be the time at which the transition from the start point / end point stored state S04 to the start point stored state S02 or the end point stored state S03 occurs; it may instead be the time at which the apparatus reaches the initial state S01 from the start point / end point stored state S04 by way of either S02 or S03. When the designation of the start point is released in the start point / end point stored state S04, the former end point may be treated as the new start point and the apparatus may transition to the start point stored state S02.

(2) In the above embodiment, the position of the start point designated by a touch operation on the operation detection unit 110c corresponds to the phoneme of the singing voice to be synthesized, the direction of the end point as seen from the start point to the pitch of the singing voice, and the distance between the start point and the end point to the volume of the singing voice, but the invention is not limited to this. For example, the position of the start point may correspond to the volume, the direction of the end point as seen from the start point to the phoneme, and the distance between the start point and the end point to the pitch; the roles of the start point and the end point may also be exchanged. The point designated earlier in time may be the end point and the point designated later the start point. Designating a start point and an end point specifies one vector, so any arrangement suffices in which the pieces of information that define the vector (its magnitude, direction, and position) correspond to the pieces of information that define the singing voice to be synthesized. In addition to the start point and the end point, the user may be made to designate a third point different from both, and the positional relationship among these three points may correspond to the plurality of pieces of information. Information other than that defining the vector may also be used for synthesizing the singing voice; for example, the time interval between the successive designations of the start point and the end point may correspond to the velocity of the singing voice (for instance, the shorter the interval between designating the start point and the end point, the larger the velocity).

In general, a singing voice takes a certain amount of time from the start of sounding until it reaches an acoustically steady state, and likewise takes a certain amount of time from leaving the acoustically steady state until sounding stops completely. Velocity is information used in the MIDI standard, which is used for communicating music performance information, mainly to convey the transient characteristics at the start and end of sounding. Conventionally, musical instrument applications for tablet-type terminals such as smartphones have often used the output value of an acceleration sensor as the velocity when the terminal has such a sensor. With this approach, however, the intended velocity cannot be input when the user plays while moving the terminal itself, and the input value changes depending on how the terminal is held. By contrast, if the time interval between the successive designations of the start point and the end point of the vector is made to correspond to the velocity, the apparatus can be used while it is being moved without any problem, and the influence of how the apparatus is held is reduced.
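One way to map the interval between the two designations onto a velocity value is a clamped linear mapping, sketched below. The text does not specify any particular mapping; the function name `interval_to_velocity`, the limits `fastest` and `slowest`, and the linear shape are all assumptions. The output range 1 to 127 follows the MIDI note-on velocity range.

```python
def interval_to_velocity(interval_s: float,
                         fastest: float = 0.02,
                         slowest: float = 0.5) -> int:
    """Map the start-to-end designation interval (seconds) to 1..127.

    Shorter intervals give larger velocities. `fastest` and `slowest`
    are hypothetical tuning limits outside which the value saturates.
    """
    clamped = min(max(interval_s, fastest), slowest)
    # 1.0 at the fastest interval, 0.0 at the slowest.
    fraction = (slowest - clamped) / (slowest - fastest)
    return 1 + round(fraction * 126)
```

The interval itself would be obtained by timestamping the touch-start events for the start point and the end point and taking their difference.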

(3) In the above embodiment, the pronunciation of the singing voice to be synthesized is designated within the range of vowels, but a wider variety of pronunciations (or timbres) may be made designatable by using buttons or the like in combination. For example, the screen shown in FIG. 8 may be displayed on the display unit 110a, and when a touch operation designating a point inside the small circle A01 corresponding to the vowel [a] as the start point is performed while the [S] button A08 is touched, the consonant /s/ may be added so that a sound pronounced /sa/ is synthesized. To enable sound synthesis with a variety of lyrics (or timbres), the phoneme (or timbre) candidates displayed on the display unit 110a may be switched each time a transition to the start point / end point stored state S04 occurs. When the present invention is applied to a singing voice synthesizing apparatus, the pronunciation of the singing voice to be synthesized may be designated by phonetic symbols rather than phonemes. In this case, phonetic symbols are displayed inside the small circle A01 and the others instead of characters indicating phonemes. In short, any arrangement suffices in which pronunciation characters (characters representing phonemes, or phonetic symbols) are displayed in the small circle A01 and the others to prompt the user to designate a pronunciation. The pronunciation of the singing voice to be synthesized may also be designated in units of syllables, or in units of morae if pronunciations are designated within the range of Japanese.

(4) The sound synthesizer of the above embodiment (that is, the singing voice synthesizing apparatus 10) has an operation detection means that detects operations designating a plurality of pieces of information defining a singing voice, a conversion means that converts data representing the content of the operations detected by the operation detection means into synthesis instruction data, a sound synthesis means that synthesizes a singing voice according to the synthesis instruction data, and a sound output means. However, the sound output means may be an external element, as shown in FIG. 9(a), for example when headphones are used as the sound output means. In other words, the sound output means is not an essential component of the sound synthesizer of the present invention. As for the operation detection means, the conversion means, and the sound synthesis means, any one of them may be implemented in a device different from the other two, or all three may be implemented in different devices. For example, as shown in FIG. 9(b), the operation detection means and the conversion means may form an input device that converts data representing the user's operations into synthesis instruction data and outputs it, and the synthesis instruction data may be supplied from this input device to the sound synthesis means by wired or wireless communication. The input device of FIG. 9(b) may also be manufactured and sold on its own.

As shown in FIG. 9(c), a first computer device having the operation detection unit 110c and the sound output means 110b may be connected to a telecommunication line such as the Internet, together with a second computer device that functions as the conversion means and the sound synthesis means, and the second computer device may be made to convert the operation content data sent from the first computer device over the telecommunication line into a synthesis instruction, synthesize waveform data according to that synthesis instruction, and return the waveform data to the first computer device. FIG. 9(c) illustrates the case where two first computer devices are connected to the telecommunication line. This arrangement makes it possible to provide real-time synthesis of singing voices as a communication service in the so-called ASP format. The same effect is obtained when a plurality of singing voice synthesizing apparatuses 10 are connected to the telecommunication line and one of them is made to function as both a first computer device and the second computer device while the others function as first computer devices.

A plurality of the first computer devices may also be connected to the telecommunication line and played simultaneously, with the vector input to each first computer device displayed on the other first computer devices as well, and with the reply from the second computer device sent to the plurality of computer devices by multicast. This allows the user of each first computer device to grasp visually, through the positional relationship of the vectors and the like, the musical relationship among the singing voices designated by the users.

For example, if the pitch and the angle of the vector are associated so that a rotation of 2π radians corresponds to a pitch change of one octave (that is, 1200 cents), a visual elegance results: (i) vectors of equal pitch are parallel, (ii) two vectors whose pitches are in an octave relationship, that is, differ by a multiple of 1200 cents, are also parallel, and (iii) vectors with the same timbre share the same start point. More generally with regard to pitch, the effect is that a special relationship between pitches can be recognized visually from the angle the vectors form. The second computer device may detect the occurrence of any of the events (i) to (iii) and notify each of the plurality of first computer devices, which may then give a corresponding indication (a visual indication such as changing the display of the vectors, an indication by sound, an indication by vibration, or the like).
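Events (i) and (ii) can be checked numerically. The sketch below assumes the mapping stated in the text, 2π radians per 1200 cents, applied to an unwrapped angle; the function names and the tolerance `tol_cents` are hypothetical.

```python
import math


def angle_to_cents(theta: float) -> float:
    """Pitch offset in cents when 2*pi radians equals one octave."""
    return 1200.0 * theta / (2.0 * math.pi)


def octave_related(cents_a: float, cents_b: float,
                   tol_cents: float = 1.0) -> bool:
    """True when the pitches differ by a multiple of 1200 cents.

    Under the 2*pi-per-octave mapping this is exactly the condition for
    the two vectors to point in the same direction (events (i) and (ii)).
    """
    diff = (cents_a - cents_b) % 1200.0
    # Distance to the nearest multiple of 1200, from either side.
    return min(diff, 1200.0 - diff) <= tol_cents
```

A second computer device could run such a check on each pair of received vectors before sending the notification described above.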

(6) In the above embodiment, a sheet-like contact detection sensor of approximately B5 size is used as the operation detection unit 110c. However, a larger contact detection sensor, for example the size of one tatami mat, may be used. With such a large contact detection sensor, the sensor may be laid on the floor or the like, and the start point and the end point may be designated by stepping on the operation detection surface. In this arrangement the contact detection sensor need not be transparent; it is sufficient that a plurality of symbols each indicating a different pronunciation (the small circles A01 to A05 in FIG. 2) are drawn on it. This arrangement makes it possible to designate the plurality of pieces of information defining the singing voice to be synthesized by stepping or dancing on the contact detection sensor, so that singing synthesis can be performed in the manner of a game. This arrangement may also be combined with that of the above embodiment, so that, among the phonemes making up the lyrics, vowels are designated by this arrangement and consonants by that of the above embodiment.

(7) In the above embodiment, the input device of the present invention (that is, the input device having the operation detection means and the conversion means) is used as the information input means for a sound synthesizer, but it may also be used as the information input means for a portable or stationary game machine, or for a map application. For example, when the input device of the present invention is used as the information input means for a map application or a role-playing game, the designation of a start point and an end point on a map can trigger the display of an image of the scenery in the direction of the end point as seen from the start point, at a zoom ratio corresponding to the distance between the start point and the end point, thereby providing route guidance.

(8) Information serving as a reference for designating a vector may be displayed according to the type of application. Specifically, a symbol based on some rule, such as a grid as shown in FIG. 10 or a combination of crosshairs and concentric circles as shown in FIG. 11, is displayed as the information indicating the reference. According to this aspect, the user can perform input operations while grasping the reference, or the reference intervals, for the position, magnitude, and angle of the vector. As shown in FIG. 12, the display position of the symbol indicating the reference for magnitude and angle may follow the position of the vector being input, for example by placing the center of the combination of crosshairs and concentric circles at the start point of the vector being input. The radius of each of the concentric circles may also be gradually decreased (or increased) according to the time elapsed since the start point was designated.

As shown in FIG. 13, a symbol indicating a recommended value for any one (or more) of the position, direction, and length of the vector may also be displayed, and the user may be notified when the vector being input has become sufficiently close to the vector represented by the symbol. Specifically, conceivable aspects include changing the display mode of either (or both) of the vector being input and the vector represented by the symbol according to the degree of approximation between them, and notifying the user by sound or vibration. The display mode of the symbol representing the recommended value may also be changed according to the time elapsed since the start point of the vector being input was designated.
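One possible way to evaluate the "degree of approximation" between the vector being input and the recommended vector is sketched below. The scoring formula, the function names `closeness` and `should_notify`, and the parameters `pos_scale`, `len_scale`, and `threshold` are assumptions made for illustration; the embodiment does not specify how the degree of approximation is computed.

```python
import math

def vector_params(start, end):
    """Return (direction in radians, length) of the vector start -> end."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)

def closeness(input_vec, recommended_vec, pos_scale=100.0, len_scale=100.0):
    """Score in (0, 1]; 1.0 means the two vectors coincide.

    Each vector is a (start, end) pair of (x, y) points. The score combines
    normalized errors in start position, length, and direction (assumed rule).
    """
    (s1, e1), (s2, e2) = input_vec, recommended_vec
    th1, r1 = vector_params(s1, e1)
    th2, r2 = vector_params(s2, e2)
    pos_err = math.hypot(s1[0] - s2[0], s1[1] - s2[1]) / pos_scale
    len_err = abs(r1 - r2) / len_scale
    # Wrap the angular difference into [-pi, pi] before normalizing.
    ang = math.atan2(math.sin(th1 - th2), math.cos(th1 - th2))
    ang_err = abs(ang) / math.pi
    return 1.0 / (1.0 + pos_err + len_err + ang_err)

def should_notify(input_vec, recommended_vec, threshold=0.9):
    """True when the input vector is 'sufficiently close' (assumed threshold)."""
    return closeness(input_vec, recommended_vec) >= threshold
```

The continuous score could drive a gradual change of display mode (for example color or line width), while `should_notify` could trigger the sound or vibration notification.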

(9) In the above embodiment, the conversion means and the sound synthesis means are realized by software modules, but they may be realized by hardware such as electronic circuits. Also, in the above embodiment, the singing synthesis program 124a includes a program that causes the control unit 100 to function as the conversion means, but the program that causes the control unit 100 to function as the conversion means may be provided on its own. This is because operating the control unit of a computer device having a multi-touch panel according to that program makes it possible for the computer device to function as the input device of the present invention. Conceivable modes of providing the program include distributing it recorded on a computer-readable recording medium such as a CD-ROM or DVD, and distributing it by download via a telecommunication line.

DESCRIPTION OF SYMBOLS: 10 ... singing synthesis device, 100 ... control unit, 110 ... user I/F unit, 110a ... display means, 110b ... sound output means, 110c ... operation detection means, 120 ... storage unit, 122 ... volatile storage unit, 124 ... non-volatile storage unit, 124a ... singing synthesis program, 130 ... bus.

Claims

An input device comprising: operation detection means for outputting operation content data indicating each of a plurality of touch positions of touch operations performed by a user on an operation detection surface; and
conversion means for converting the operation content data output at the predetermined time interval into data representing other information and outputting the data, wherein, among the plurality of touch positions indicated by the operation content data, a touch position whose touch is started in a state where the start point of a vector has not been designated is taken as the start point of the vector, and a touch position, other than the start point, whose touch is started in a state where the start point of the vector has been designated is taken as the end point of the vector, and wherein the conversion means obtains the position of a predetermined one of the start point and the end point, the direction θ of the other as seen from the predetermined one, and the distance r between the two, unwraps the direction θ so that the value of the direction θ extends in both the positive and negative directions, converts these into three types of information according to a predetermined rule, and outputs data representing a plurality of pieces of information including at least the three types of information.

The input device according to claim 1, further comprising informing means for informing the user of a reference value or a recommended value relating to at least one of the position of the predetermined one of the start point and the end point, the direction of the other as seen from the predetermined one, and the distance between the two.

The input device according to claim 1, wherein the operation detection means outputs operation content data representing a touch start time in addition to the touch position, and
the conversion means converts the time difference between the touch start time of the start point and the touch start time of the end point into fourth information different from the three types of information.

The input device according to claim 3, wherein the conversion means converts, according to a predetermined rule, the position of the predetermined one, the direction of the other as seen from the predetermined one, and the distance between the two into a timbre or pronunciation, a pitch, and a volume, and converts the time difference between the touch start time of the start point and the touch start time of the end point into a velocity used when outputting the sound of that timbre or pronunciation.

The input device according to claim 4, wherein a pitch is associated with the direction of the other as seen from the predetermined one of the start point and the end point, and
the conversion means converts data indicating the positions of the start point and the end point into data indicating the pitch such that the pitch changes by one octave each time the other of the start point and the end point rotates through a predetermined angle.

A sound synthesizer that synthesizes and outputs a timbre or pronunciation corresponding to a user's operation as a sound having a pitch and a volume corresponding to the operation, the sound synthesizer comprising:
operation detection means for outputting operation content data indicating each of a plurality of touch positions of touch operations performed by the user on an operation detection surface;
conversion means for converting the operation content data into data representing other information and outputting the data, wherein, with one of the plurality of touch positions indicated by the operation content data taken as a start point and one of the touch positions other than the start point taken as an end point, the conversion means converts the position of a predetermined one of the start point and the end point, the direction of the other as seen from the predetermined one, and the distance between the two into a timbre or pronunciation, a pitch, and a volume, respectively, and outputs synthesis instruction data instructing synthesis of the sound of that timbre or pronunciation at that pitch and volume; and
sound synthesis means for synthesizing sound according to the synthesis instruction data output by the conversion means.
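
The unwrapping of the direction θ recited in claim 1 and the octave-per-rotation pitch mapping recited in claim 5 can be illustrated with the following minimal sketch. It is not the claimed implementation: the class name `DirectionUnwrapper`, the shortest-signed-step rule applied between successive samples, the use of MIDI-style note numbers, and the defaults `base_pitch=60.0` and `angle_per_octave=2π` are all assumptions introduced for illustration.

```python
import math

class DirectionUnwrapper:
    """Accumulate the direction theta across successive samples so that its
    value is not confined to (-pi, pi] but extends in both the positive and
    negative directions as the end point circles the start point."""

    def __init__(self):
        self.prev_raw = None  # last wrapped angle from atan2
        self.theta = 0.0      # unwrapped angle in radians

    def update(self, start, end):
        raw = math.atan2(end[1] - start[1], end[0] - start[0])
        if self.prev_raw is None:
            self.theta = raw
        else:
            # Assumed rule: take the shortest signed step between samples,
            # which is valid when the sampling interval is short enough.
            delta = (raw - self.prev_raw + math.pi) % (2.0 * math.pi) - math.pi
            self.theta += delta
        self.prev_raw = raw
        return self.theta

def theta_to_pitch(theta, base_pitch=60.0, angle_per_octave=2.0 * math.pi):
    """Map unwrapped theta to a note number: 12 semitones (one octave) per
    angle_per_octave of rotation, above or below base_pitch."""
    return base_pitch + 12.0 * theta / angle_per_octave
```

For example, feeding `update` with the end point at (1, 0), (0, 1), (-1, 0), (0, -1), and (1, 0) around a start point at the origin accumulates one full counterclockwise turn, so `theta_to_pitch` yields a value one octave above `base_pitch`; a clockwise turn lowers the pitch by the same amount.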