TWI440017B - Voice recognition function activation systems and methods, and machine readable medium and computer program products thereof
- Publication number
- TWI440017B (application TW97137691A)
- Authority
- TW
- Taiwan
- Prior art keywords
- sound
- variability
- voice recognition
- recognition function
- set value
- Prior art date
Landscapes
- Telephone Function (AREA)
- User Interface Of Digital Computer (AREA)
Description
The present invention relates to a system and method for activating a voice recognition function, and more particularly to a system and method that decide whether to activate the voice recognition function according to the variability of sound.
In recent years, electronic devices such as computers and portable devices have become increasingly advanced and multifunctional. Because these devices and their applications are so convenient, they have gradually become necessities of daily life.
To provide more convenient input and operation, some electronic devices offer a voice recognition system, which lets the user enter input and operate the device by voice. When the user is in a situation unsuited to manual input and operation, such as driving, a voice recognition system also gives the user a more convenient and safer way to do so. Although electronic devices and/or automotive systems can be operated by voice, how to activate the voice recognition system is a key question for designers.
Because the environment is full of all kinds of sounds, a voice recognition system that recognizes sound continuously often produces many false recognitions. A dedicated button is therefore usually provided to activate the voice recognition system. Having to press this button manually is inconvenient for the user and, in special circumstances such as driving, unsafe.
To overcome this problem, a conventional technique was developed for activating the voice recognition system. In this technique, the system continuously checks whether the sound contains a keyword. When the keyword is detected, the voice recognition system is activated to perform full voice recognition, so the user can activate it without manually pressing any button. However, the system is still continuously checking the sound for the keyword. In an environment with mixed sound sources or heavy noise, the number of false recognitions can be very large. As a result, keyword-based activation of a voice recognition system is rarely implemented in products.
In view of this, the present invention provides systems and methods for activating a voice recognition function.
A voice recognition function activation system according to an embodiment of the present invention includes a sound receiving unit and a processing module. The processing module obtains a first sound detected by the sound receiving unit during a first period and calculates a first variability of the first sound within the first period. The processing module determines whether the first variability is less than a first set value. When the first variability is less than the first set value, the processing module obtains a second sound detected by the sound receiving unit during a second period and determines whether the second sound includes a keyword. When the second sound includes the keyword, the processing module activates a voice recognition function. Once the voice recognition function is activated, every word in a third sound detected by the sound receiving unit is detected.
In a voice recognition function activation method according to an embodiment of the present invention, a first sound of a first period is first obtained, and a first variability of the first sound within the first period is calculated. It is determined whether the first variability is less than a first set value. When the first variability is less than the first set value, a second sound of a second period is obtained. It is determined whether the second sound includes a keyword. When the second sound includes the keyword, a voice recognition function is activated. Once the voice recognition function is activated, every word in a third sound is detected.
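The method summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the embodiment: it assumes that a sound is available as a sequence of amplitude samples, that the variability is the population variance of those samples, and that keyword detection is supplied by the caller. The names `should_activate` and `contains_keyword` are hypothetical.

```python
from statistics import pvariance
from typing import Callable, Sequence


def should_activate(
    first_sound: Sequence[float],
    second_sound: Sequence[float],
    first_set_value: float,
    contains_keyword: Callable[[Sequence[float]], bool],
) -> bool:
    """Return True when the voice recognition function should be activated.

    All parameter names are hypothetical; `contains_keyword` stands in for
    whatever keyword detector a real system would provide.
    """
    # Variability of the first sound over the first period (assumed: variance).
    first_variability = pvariance(first_sound)

    # Keyword detection is attempted only when the first variability is
    # less than the first set value.
    if not first_variability < first_set_value:
        return False

    # Activate only if the second sound includes the keyword.
    return contains_keyword(second_sound)
```

Passing the keyword detector in as a function keeps the sketch independent of any particular recognition engine.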
The above method of the present invention may take the form of program code. When the program code is loaded into and executed by a machine, the machine becomes an apparatus for practicing the invention.
To make the above objects, features, and advantages of the present invention more readily understood, embodiments are described in detail below with reference to the accompanying drawings.
FIG. 1 shows a voice recognition function activation system according to an embodiment of the present invention.
The voice recognition function activation system 100 may be an electronic device, such as a computer system, an automotive system, or a portable device, for example a handheld device such as a multimedia player, a personal digital assistant, a global positioning system (GPS) device, a touch-screen phone, a smartphone, or a mobile phone. The voice recognition function activation system 100 includes a sound receiving unit 110, a display unit 120, and a processing module 130. The sound receiving unit 110 may be a microphone for receiving sound in the environment. The display unit 120 may be a screen or an indicator light for displaying a keyword detection icon. The processing module 130 performs the voice recognition function activation method of the present application according to the sound received by the sound receiving unit 110; the details are described below.
FIG. 2 shows a voice recognition function activation method according to an embodiment of the present invention.
In step S202, sound of a period is received through the sound receiving unit 110, and in step S204, the variability (variance) of the sound within the period is calculated. Note that methods of calculating variability are well known in numerical analysis and are not described further here. In step S206, it is determined whether the variability of the sound in this period is less than a first set value and remains so for a predetermined time. The first set value and the predetermined time can be designed flexibly according to different requirements. When the variability of the sound in this period is not less than the first set value, or has not remained so for the predetermined time (No in step S206), the flow returns to step S202. When the variability of the sound in this period is less than the first set value and has remained so for the predetermined time (Yes in step S206), in step S208 a keyword detection icon is displayed through the display unit 120. Displaying the keyword detection icon prompts the user to speak the keyword. Note that requiring the variability to stay below the first set value for the predetermined time in step S206 avoids misjudgments caused by momentary sound changes and/or different sound sources. In some embodiments, however, step S206 may simply determine whether the variability is less than the first set value.
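The check in step S206 can be illustrated as follows. This is a minimal sketch under stated assumptions: the received sound is assumed to arrive as consecutive windows of amplitude samples, the variability of each window is taken to be its population variance, and the predetermined time is expressed as a number of consecutive windows. The function name and parameters are hypothetical and do not appear in the embodiment.

```python
from statistics import pvariance
from typing import Iterable, Sequence


def steady_long_enough(
    windows: Iterable[Sequence[float]],
    first_set_value: float,
    required_windows: int,
) -> bool:
    """Sketch of S202-S206: True once the variance has stayed below
    `first_set_value` for `required_windows` consecutive windows.

    `required_windows` is a hypothetical stand-in for the predetermined time.
    """
    consecutive = 0
    for window in windows:                      # S202: receive a period of sound
        variability = pvariance(window)         # S204: calculate its variability
        if variability < first_set_value:       # S206: below the first set value?
            consecutive += 1
            if consecutive >= required_windows:
                return True                     # would proceed to S208
        else:
            consecutive = 0                     # a brief disturbance restarts the count
    return False
```

Resetting the counter on any window at or above the threshold is what filters out momentary sound changes; setting `required_windows` to 1 gives the simpler variant in which only the threshold is checked.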
In step S210, sound of another period is continuously received through the sound receiving unit 110, and in step S212, the variability of the sound within this period is calculated. In step S214, it is determined whether the variability of the sound in this period is greater than a second set value. When the variability is not greater than the second set value (No in step S214), the flow returns to step S210. When the variability is greater than the second set value (Yes in step S214), in step S216 it is determined whether the sound includes a preset keyword. Similarly, calculating the variability and comparing it with the second set value in steps S212 and S214 avoids misjudgments caused by momentary sound changes and/or different sound sources. In some embodiments, however, steps S212 and S214 may be omitted and the determination of step S216 performed directly. If the sound does not include the preset keyword (No in step S216), in step S218 the keyword detection icon is removed from the display unit 120, and the flow returns to step S202. If the sound includes the preset keyword (Yes in step S216), in step S220 a voice recognition function is activated. Note that once the voice recognition function is activated, every word in the received sound is detected.
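The flow of FIG. 2 as a whole can be viewed as a small state machine, sketched below. The sketch is illustrative only. It assumes sound arrives as windows of amplitude samples, uses population variance as the variability, adopts the simplified form of step S206 that checks only the threshold, and takes the keyword detector as a caller-supplied function. The names `State`, `step`, and `contains_keyword` are hypothetical.

```python
from enum import Enum, auto
from statistics import pvariance
from typing import Callable, Sequence


class State(Enum):
    WAIT_FOR_STEADY = auto()   # S202-S206: waiting for low variability
    WAIT_FOR_SPEECH = auto()   # S208-S216: icon shown, waiting for the keyword
    RECOGNIZING = auto()       # S220: voice recognition function active


def step(
    state: State,
    window: Sequence[float],
    first_set_value: float,
    second_set_value: float,
    contains_keyword: Callable[[Sequence[float]], bool],
) -> State:
    """Advance the hypothetical state machine by one window of sound."""
    variability = pvariance(window)
    if state is State.WAIT_FOR_STEADY:
        # S206 (simplified): a steady environment opens keyword detection (S208).
        if variability < first_set_value:
            return State.WAIT_FOR_SPEECH
        return State.WAIT_FOR_STEADY
    if state is State.WAIT_FOR_SPEECH:
        # S214: keep listening until the variability exceeds the second set value.
        if variability <= second_set_value:
            return State.WAIT_FOR_SPEECH
        # S216: keyword found -> S220; otherwise S218 hides the icon, back to S202.
        if contains_keyword(window):
            return State.RECOGNIZING
        return State.WAIT_FOR_STEADY
    return state  # RECOGNIZING: activation has already happened
```

A caller would feed each newly received window to `step` and show or hide the keyword detection icon whenever the state enters or leaves `WAIT_FOR_SPEECH`.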
Therefore, the voice recognition function activation system and method of the present application can automatically activate the voice recognition function according to the variability of sound in the environment. When the variability of the sound within a period is less than the set value, keyword detection is started, and the voice recognition function is activated automatically once the keyword is detected, so that the voice recognition function is activated in a convenient and safe manner.
The method of the present invention, or particular aspects or portions thereof, may take the form of program code. The program code may be embodied in tangible media, such as floppy disks, optical discs, hard disks, or any other machine-readable (for example, computer-readable) storage medium, or in a computer program product of any external form, wherein, when the program code is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. The program code may also be transmitted over a transmission medium, such as electrical wire or cable, optical fiber, or any other form of transmission, wherein, when the program code is received, loaded into, and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processing unit, the program code combines with the processing unit to provide a unique apparatus that operates analogously to application-specific logic circuits.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention, and the scope of protection of the invention is therefore defined by the appended claims.
100‧‧‧Voice recognition function activation system
110‧‧‧Sound receiving unit
120‧‧‧Display unit
130‧‧‧Processing module
S202, S204, ..., S220‧‧‧Steps
FIG. 1 is a schematic diagram showing a voice recognition function activation system according to an embodiment of the present invention.
FIG. 2 is a flowchart showing a voice recognition function activation method according to an embodiment of the present invention.
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW97137691A TWI440017B (en) | 2008-10-01 | 2008-10-01 | Voice recognition function activation systems and methods, and machine readable medium and computer program products thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW97137691A TWI440017B (en) | 2008-10-01 | 2008-10-01 | Voice recognition function activation systems and methods, and machine readable medium and computer program products thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201015539A TW201015539A (en) | 2010-04-16 |
TWI440017B TWI440017B (en) | 2014-06-01 |
Family
ID=44830091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW97137691A TWI440017B (en) | 2008-10-01 | 2008-10-01 | Voice recognition function activation systems and methods, and machine readable medium and computer program products thereof |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI440017B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI412019B (en) | 2010-12-03 | 2013-10-11 | Ind Tech Res Inst | Sound event detecting module and method thereof |
- 2008-10-01: TW application TW97137691A (patent TWI440017B), not active, IP right cessation
Also Published As
Publication number | Publication date |
---|---|
TW201015539A (en) | 2010-04-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |