TWI440017B

TWI440017B - Voice recognition function activation systems and methods, and machine readable medium and computer program products thereof

Info

Publication number: TWI440017B
Application number: TW97137691A
Authority: TW
Inventors: Fu Chiang Chou; Chu Yen-Lee
Original assignee: HTC Corp
Priority date: 2008-10-01
Filing date: 2008-10-01
Publication date: 2014-06-01
Also published as: TW201015539A

Description

Voice recognition function activation system and method, and machine-readable medium and computer program product thereof

The present invention relates to a system and method for activating a voice recognition function, and more particularly to a system and method that determine whether to activate a voice recognition function according to the variability of sound.

In recent years, electronic devices such as computers and portable devices have become increasingly advanced and multifunctional. Because these devices and their applications are so convenient, they have gradually become necessities of daily life.

To provide more convenient input and operation, some electronic devices offer a voice recognition system, which lets the user enter input and operate the device by voice. When the user is in an environment unsuited to manual input and operation, such as while driving, a voice recognition system also gives the user a more convenient and safer way to do so. Although electronic devices and/or automotive systems can be operated by voice, how to activate the voice recognition system is a key issue for designers.

Because the environment is full of all kinds of sounds, a voice recognition system that recognizes sound continuously often produces many false recognitions. An additional button is therefore usually provided to activate the voice recognition system. Having to press this button manually is inconvenient for the user and, in certain situations such as driving, unsafe.

To overcome these problems, a conventional technique was developed for activating a voice recognition system. In this technique, the system continuously detects whether the sound contains a keyword. When the keyword is detected, the voice recognition system is activated to perform full voice recognition. The user can thus activate the voice recognition system without pressing any button. However, the system still checks the sound for the keyword continuously, so in an environment with mixed sound sources or considerable noise, the number of false recognitions can be very large. Keyword-based activation of voice recognition systems is therefore rarely implemented in products.

In view of this, the present invention provides systems and methods for activating a voice recognition function.

A voice recognition function activation system according to an embodiment of the present invention includes a sound receiving unit and a processing module. The processing module obtains a first sound of a first period detected by the sound receiving unit and calculates a first variability of the first sound within the first period. The processing module determines whether the first variability is less than a first set value. When the first variability is less than the first set value, the processing module obtains a second sound of a second period detected by the sound receiving unit and determines whether the second sound includes a keyword. When the second sound includes the keyword, the processing module activates a voice recognition function. When the voice recognition function is activated, every word in a third sound detected by the sound receiving unit is detected.

In a voice recognition function activation method according to an embodiment of the present invention, a first sound of a first period is first obtained, and a first variability of the first sound within the first period is calculated. It is determined whether the first variability is less than a first set value. When the first variability is less than the first set value, a second sound of a second period is obtained. It is determined whether the second sound includes a keyword. When the second sound includes the keyword, a voice recognition function is activated. When the voice recognition function is activated, every word in a third sound is detected.
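The method summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: it assumes each sound is a sequence of numeric samples, uses population variance as the variability measure, and takes the keyword detector as a caller-supplied function. The names `variability`, `should_activate`, and `contains_keyword` are hypothetical.

```python
from statistics import pvariance
from typing import Callable, Sequence


def variability(samples: Sequence[float]) -> float:
    """Population variance of the samples (assumed variability measure).

    Returns 0.0 when fewer than two samples are available.
    """
    return pvariance(samples) if len(samples) > 1 else 0.0


def should_activate(
    first_sound: Sequence[float],
    second_sound: Sequence[float],
    first_set_value: float,
    contains_keyword: Callable[[Sequence[float]], bool],
) -> bool:
    """Decide whether to activate the voice recognition function.

    The keyword check on the second sound runs only when the first
    variability is below the first set value.
    """
    if variability(first_sound) >= first_set_value:
        return False  # environment not quiet enough; skip keyword detection
    return contains_keyword(second_sound)
```

Passing the keyword detector as a parameter keeps the sketch independent of any particular keyword-spotting technique, which the description does not specify.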

The above method of the present invention may exist in the form of program code. When the program code is loaded and executed by a machine, the machine becomes an apparatus for practicing the invention.

To make the above objects, features, and advantages of the present invention more readily understood, embodiments are described in detail below with reference to the accompanying drawings.

FIG. 1 shows a voice recognition function activation system according to an embodiment of the present invention.

The voice recognition function activation system 100 may be an electronic device such as a computer system, an automotive system, or a portable device, for example a handheld device such as a multimedia player, a personal digital assistant, a global satellite positioning device, a touch-screen phone, a smartphone, or a mobile phone. The voice recognition function activation system 100 includes a sound receiving unit 110, a display unit 120, and a processing module 130. The sound receiving unit 110 may be a microphone for receiving sound from the environment. The display unit 120 may be a screen or an indicator light for displaying a keyword detection icon. The processing module 130 performs the voice recognition function activation method of the present application according to the sound received by the sound receiving unit 110; the details are described below.

FIG. 2 shows a voice recognition function activation method according to an embodiment of the present invention.

In step S202, sound over a period is received through the sound receiving unit 110, and in step S204 the variability (variance) of the sound within the period is calculated. Methods of calculating variance are well known in numerical analysis and are not described here. In step S206, it is determined whether the variability of the sound during this period is less than a first set value and has remained so for a predetermined time. The first set value and the predetermined time can be chosen flexibly according to different requirements. When the variability is not less than the first set value, or has not remained so for the predetermined time (No in step S206), the flow returns to step S202. When the variability is less than the first set value and has remained so for the predetermined time (Yes in step S206), in step S208 a keyword detection icon is displayed on the display unit 120. The icon prompts the user to speak the keyword. Requiring in step S206 that the variability stay below the first set value for the predetermined time avoids misjudgments caused by momentary sound changes and/or different sound sources. In some embodiments, however, step S206 may only determine whether the variability is less than the first set value.
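One way to read steps S202 to S206 is as a run-length check over successive periods. The sketch below is an assumption-laden illustration rather than the embodiment itself: it expresses the predetermined time as a count of consecutive periods (`hold_periods`), uses population variance as the variability, and requires every period to contain at least one sample. The function name `first_quiet_index` is hypothetical.

```python
from statistics import pvariance
from typing import Iterable, Optional, Sequence


def first_quiet_index(
    periods: Iterable[Sequence[float]],
    first_set_value: float,
    hold_periods: int = 3,
) -> Optional[int]:
    """Return the index of the period at which the variability has stayed
    below first_set_value for hold_periods consecutive periods, or None
    if that never happens.

    Each period must hold at least one sample (pvariance requirement).
    """
    run = 0  # consecutive periods seen so far with low variability
    for index, samples in enumerate(periods):
        if pvariance(samples) < first_set_value:  # S204 and S206
            run += 1
            if run >= hold_periods:
                return index  # "Yes" at S206: proceed to S208
        else:
            run = 0  # a loud period restarts the wait (back to S202)
    return None
```

Setting `hold_periods=1` corresponds to the variant in which step S206 only compares the variability with the first set value.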

In step S210, sound over another period continues to be received through the sound receiving unit 110, and in step S212 the variability of the sound within this period is calculated. In step S214, it is determined whether the variability of the sound during this period is greater than a second set value. When it is not greater than the second set value (No in step S214), the flow returns to step S210. When it is greater than the second set value (Yes in step S214), in step S216 it is determined whether the sound includes a predetermined keyword. Similarly, calculating the variability in step S212 and comparing it with the second set value in step S214 avoids misjudgments caused by momentary sound changes and/or different sound sources. In some embodiments, however, steps S212 and S214 may be omitted and the determination of step S216 performed directly. If the sound does not include the predetermined keyword (No in step S216), in step S218 the keyword detection icon is removed from the display unit 120 and the flow returns to step S202. If the sound includes the predetermined keyword (Yes in step S216), in step S220 a voice recognition function is activated. Once the voice recognition function is activated, every word in the received sound is detected.
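The three possible outcomes of steps S210 to S220 can be made explicit with a small decision function. This is a hedged sketch under stated assumptions: population variance stands in for the variability, the period contains at least one sample, and the keyword detector is a caller-supplied placeholder. `Outcome`, `keyword_stage`, and `contains_keyword` are hypothetical names.

```python
from enum import Enum, auto
from statistics import pvariance
from typing import Callable, Sequence


class Outcome(Enum):
    KEEP_LISTENING = auto()  # "No" at S214: return to S210
    CANCEL_ICON = auto()     # "No" at S216: S218, then return to S202
    ACTIVATE = auto()        # "Yes" at S216: S220


def keyword_stage(
    samples: Sequence[float],
    second_set_value: float,
    contains_keyword: Callable[[Sequence[float]], bool],
) -> Outcome:
    """Evaluate one period of sound received in step S210."""
    # S212 and S214: run keyword detection only when the sound varies enough.
    if pvariance(samples) <= second_set_value:
        return Outcome.KEEP_LISTENING
    # S216: check for the predetermined keyword.
    if contains_keyword(samples):
        return Outcome.ACTIVATE
    return Outcome.CANCEL_ICON
```

A caller would loop on `KEEP_LISTENING`, hide the keyword detection icon and restart the quiet check on `CANCEL_ICON`, and start full recognition on `ACTIVATE`.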

Therefore, the voice recognition function activation system and method of the present application can automatically activate the voice recognition function according to the variability of sound in the environment. When the variability of the sound within a period is less than the set value, keyword detection is started, and the voice recognition function is activated automatically once the keyword is detected, so that the voice recognition function is activated conveniently and safely.

The method of the present invention, or particular aspects or portions thereof, may exist in the form of program code. The program code may be embodied in tangible media such as floppy disks, optical discs, hard disks, or any other machine-readable (for example, computer-readable) storage medium, or in a computer program product not limited to any external form, wherein, when the program code is loaded and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. The program code may also be transmitted over a transmission medium such as electrical wire or cable, optical fiber, or any other form of transmission, wherein, when the program code is received, loaded, and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processing unit, the program code combines with the processing unit to provide a unique apparatus that operates analogously to application-specific logic circuits.

Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention; the scope of protection of the invention is therefore defined by the appended claims.

100‧‧‧Voice recognition function activation system

110‧‧‧Sound receiving unit

120‧‧‧Display unit

130‧‧‧Processing module

S202, S204, ..., S220‧‧‧Steps

FIG. 1 is a schematic diagram showing a voice recognition function activation system according to an embodiment of the present invention.

FIG. 2 is a flowchart showing a voice recognition function activation method according to an embodiment of the present invention.


Claims

A voice recognition function activation system, comprising: a sound receiving unit; and a processing module configured to obtain a first sound of a first period detected by the sound receiving unit, calculate a first variability of the first sound within the first period, determine whether the first variability is less than a first set value, obtain, when the first variability is less than the first set value, a second sound of a second period detected by the sound receiving unit, determine whether the second sound includes a keyword, and activate a voice recognition function when the second sound includes the keyword, wherein, when the voice recognition function is activated, every word in a third sound detected by the sound receiving unit is detected.

The voice recognition function activation system of claim 1, wherein the processing module further determines whether the first variability has remained less than the first set value for a predetermined time, and obtains the second sound when the first variability has remained less than the first set value for the predetermined time.

The voice recognition function activation system of claim 1, further comprising a display unit configured to display a keyword detection icon when the first variability is less than the first set value.

The voice recognition function activation system of claim 3, wherein the display unit cancels the display of the keyword detection icon when the keyword is not included in the second sound.

The voice recognition function activation system of claim 1, wherein the processing module further calculates a second variability of the second sound, determines whether the second variability is greater than a second set value, and, when the second variability is greater than the second set value, determines whether the second sound includes the keyword.

A voice recognition function activation method, comprising the following steps: obtaining a first sound of a first period; calculating a first variability of the first sound within the first period; determining whether the first variability is less than a first set value; when the first variability is less than the first set value, obtaining a second sound of a second period; determining whether the second sound includes a keyword; and when the second sound includes the keyword, activating a voice recognition function, wherein, when the voice recognition function is activated, every word in a third sound is detected.

The voice recognition function activation method of claim 6, further comprising the steps of: determining whether the first variability has remained less than the first set value for a predetermined time; and obtaining the second sound when the first variability has remained less than the first set value for the predetermined time.

The voice recognition function activation method of claim 6, further comprising displaying a keyword detection icon when the first variability is less than the first set value.

The voice recognition function activation method of claim 8, further comprising canceling the display of the keyword detection icon when the second sound does not include the keyword.

The voice recognition function activation method of claim 6, further comprising the steps of: calculating a second variability of the second sound; determining whether the second variability is greater than a second set value; and when the second variability is greater than the second set value, determining whether the second sound includes the keyword.

A machine-readable medium storing program code for causing a device to perform a voice recognition function activation method, the method comprising the steps of: obtaining a first sound of a first period; calculating a first variability of the first sound within the first period; determining whether the first variability is less than a first set value; when the first variability is less than the first set value, obtaining a second sound of a second period; determining whether the second sound includes a keyword; and when the second sound includes the keyword, activating a voice recognition function, wherein, when the voice recognition function is activated, every word in a third sound is detected.

A computer program product to be loaded by a machine to execute a voice recognition function activation method, comprising: a first code for obtaining a first sound of a first period; a second code for calculating a first variability of the first sound within the first period; a third code for determining whether the first variability is less than a first set value; a fourth code for obtaining a second sound of a second period when the first variability is less than the first set value; a fifth code for determining whether the second sound includes a keyword; and a sixth code for activating a voice recognition function when the second sound includes the keyword, wherein, when the voice recognition function is activated, every word in a third sound is detected.