CN112927685B - Dynamic voice recognition method and device thereof - Google Patents
- Publication number
- CN112927685B (application CN201911242880.2A)
- Authority
- CN
- China
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G10L15/22 — Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/08 — Speech classification or search
- G10L25/21 — Speech or voice analysis techniques characterised by the type of extracted parameters, the extracted parameters being power information
- G10L2015/088 — Word spotting
- G10L2015/223 — Execution procedure of a spoken command
Abstract
The invention provides a dynamic voice recognition method and a device therefor. The method comprises executing a first stage: sound data is detected by a digital microphone and stored in a first memory; voice is detected in the sound data to generate a voice detection signal; and a first processing circuit selectively determines to execute a second stage or a third stage according to a total effective data amount, the transmission bit rate of the digital microphone and a recognition interval time. In the second stage, the first processing circuit outputs a first instruction to a second processing circuit, and the second processing circuit causes a memory access circuit to transfer the sound data to a second memory and store it as voice data according to the first instruction. In the third stage, the first processing circuit outputs a second instruction to the second processing circuit, the second processing circuit causes the memory access circuit to transfer the sound data to the second memory and store it as voice data according to the second instruction, and the second processing circuit confirms whether the voice data matches a preset voice instruction.
Description
Technical Field
The present invention relates to voice detection and recognition technology, and more particularly to a dynamic voice recognition method and an apparatus thereof.
Background
In existing electronic devices, voice assistant technology is widely used in various fields and commonly supports a voice wake-up function. Even in standby mode, the voice assistant must keep listening for hot words and give a corresponding response when one appears. The assistant therefore wakes up periodically: its processing system starts in standby mode and uses a voice activity detection circuit to detect whether voice is present; when voice appears, it performs voice recognition to confirm whether a hot word exists in the voice, and accordingly decides whether to execute the system start-up of the electronic device or a corresponding operation.
However, waking the voice assistant at a fixed frequency yields poor detection sensitivity. At the same time, the processing system of the voice assistant must operate at low power in order to meet the relevant energy-efficiency specifications.
Disclosure of Invention
In view of the above, the present invention provides a dynamic voice recognition method, comprising executing a first stage: detecting sound data with a digital microphone and storing the sound data in a first memory; detecting voice in the sound data to generate a voice detection signal; and selectively determining, by a first processing circuit, to execute a second stage or a third stage according to a total effective data amount, the transmission bit rate of the digital microphone and a recognition interval time. In the second stage, the first processing circuit outputs a first instruction to a second processing circuit, and the second processing circuit causes a memory access circuit to transfer the sound data to a second memory and store it as voice data according to the first instruction. In the third stage, the first processing circuit outputs a second instruction, the second processing circuit causes the memory access circuit to transfer the sound data to the second memory and store it as voice data according to the second instruction, and the second processing circuit confirms whether the voice data in the second memory matches a preset voice instruction.
The invention further provides a dynamic voice recognition device, which comprises a digital microphone, a first memory, a voice activity detection circuit, a memory access circuit, a second memory, a first processing circuit and a second processing circuit. The digital microphone detects sound data. The first memory is electrically connected to the digital microphone and stores the sound data. The voice activity detection circuit is electrically connected to the digital microphone, detects voice in the sound data and generates a voice detection signal. The memory access circuit is electrically connected to the first memory and transfers the sound data to the second memory according to the first instruction, storing it as voice data. The first processing circuit is electrically connected to the voice activity detection circuit. The second processing circuit is electrically connected to the first processing circuit, the second memory and the memory access circuit. The dynamic voice recognition device executes the dynamic voice recognition method described above.
According to some embodiments, when the first processing circuit receives the voice detection signal, it outputs the first instruction or the second instruction after the recognition interval time has elapsed.
According to some embodiments, the recognition interval time is determined by a budget relationship value: when the budget relationship value is less than or equal to 1/3 of the target average power consumption multiplied by the previous period time, the recognition interval time is 2 seconds; when the budget relationship value is greater than 1/3 but less than or equal to 2/3 of the target average power consumption multiplied by the previous period time, the recognition interval time is 1.5 seconds; and when the budget relationship value is greater than 2/3 of the target average power consumption multiplied by the previous period time, the recognition interval time is 1 second.
According to some embodiments, the budget relationship value equals the target average power consumption multiplied by the previous period time, minus (the first average power consumption of the first stage multiplied by the first time of the first stage + the second average power consumption of the second stage multiplied by the second time of the second stage + the third average power consumption of the third stage multiplied by the third time of the third stage), wherein the previous period time is equal to the sum of the first time, the second time and the third time.
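As a concrete illustration, the budget relationship value above can be computed as follows. This is an informal sketch, not code from the patent; all function and variable names are assumptions, with powers in watts and times in seconds.

```python
def budget_value(p_target, t_prev, p1, ta, p2, tb, p3, tc):
    """Budget = P_target * T - (P1*Ta + P2*Tb + P3*Tc), where the previous
    period time T must equal Ta + Tb + Tc."""
    assert abs(t_prev - (ta + tb + tc)) < 1e-9, "T must equal Ta + Tb + Tc"
    return p_target * t_prev - (p1 * ta + p2 * tb + p3 * tc)
```

A positive value means the previous period consumed less energy than the target allows; a negative value means it overshot the target.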
According to some embodiments, the third average power consumption is greater than the second average power consumption, and the second average power consumption is greater than the first average power consumption.
According to some embodiments, after generating the voice detection signal, the first processing circuit determines whether the first memory is full of sound data, and proceeds to the next step when the first memory is full.
In summary, the present invention takes the user experience into consideration when performing dynamic voice recognition and reduces the average power consumption when triggering the search for a preset voice command (hot word) in standby mode, thereby providing a method with better sensitivity.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the invention.
FIG. 2 is a flow chart of a dynamic speech recognition method according to an embodiment of the invention.
FIG. 3 is a waveform diagram of a dynamic speech recognition device according to an embodiment of the present invention.
FIG. 4 is a flow chart of a dynamic speech recognition method according to another embodiment of the invention.
Reference numerals illustrate:
10 electronic device
20 Dynamic voice recognition device
21 Digital microphone
22 First memory
23 Voice activity detection circuit
24 Memory access circuit
25 First processing circuit
26 Second processing circuit
27 Second memory
30 Audio/video processing circuit
31-33 Core processing circuit
34-36 Third memories
C1 first instruction
C2 second instruction
SD1 sound data
SD2 speech data
SS Voice detection signal
ST1 first stage
ST2 second stage
ST3 third stage
T cycle time
T1, T2 Time points
Ti Recognition interval time
S10 to S28 steps
S30 to S36 steps
Detailed Description
Fig. 1 is a block diagram of an electronic device according to an embodiment of the invention. Referring to Fig. 1, the electronic device 10 includes a dynamic voice recognition device 20, an audio/video processing circuit 30, a plurality of core processing circuits 31-33 and a plurality of third memories 34-36, where the core processing circuits 31-33 are electrically connected to the third memories 34-36. When the dynamic voice recognition device 20 recognizes the preset voice command in standby mode, the electronic device 10 executes the system start-up procedure, so that the audio/video processing circuit 30, the core processing circuits 31-33 and the third memories 34-36 cooperate to play the audio/video signal received by the electronic device 10. In one embodiment, the electronic device 10 may be a television, but is not limited thereto.
The dynamic voice recognition device 20 comprises a digital microphone 21, a first memory 22, a voice activity detection circuit 23, a memory access circuit 24, a first processing circuit 25, a second processing circuit 26 and a second memory 27. The digital microphone 21 is used to detect a sound data SD1. The first memory 22 is electrically connected to the digital microphone 21 for storing the audio data SD1. In one embodiment, the first memory 22 may be, but is not limited to, a Static Random Access Memory (SRAM).
The voice activity detection circuit 23 is electrically connected to the digital microphone 21 for detecting the sound data SD1 and generating a voice detection signal SS. In one embodiment, the voice activity detection circuit 23 may be, but is not limited to, a voice recognition chip or voice recognition processing circuit.
The memory access circuit 24 is electrically connected to the first memory 22 and the second memory 27, and transfers the sound data SD1 to the second memory 27 according to a first command, storing it as voice data SD2. In one embodiment, the memory access circuit 24 may be, but is not limited to, a direct memory access (DMA) circuit, and the second memory 27 may be, but is not limited to, a dynamic random access memory (DRAM).
The first processing circuit 25 is electrically connected to the voice activity detection circuit 23 and generates a first command C1 or a second command C2 according to the voice detection signal SS. The second processing circuit 26 is electrically connected to the first processing circuit 25, the second memory 27 and the memory access circuit 24. According to the first command C1, the second processing circuit 26 causes the memory access circuit 24 to transfer the sound data SD1 to the second memory 27 and store it as voice data SD2; or, according to the second command C2, the second processing circuit 26 causes the memory access circuit 24 to transfer the sound data SD1 to the second memory 27, store it as voice data SD2, and confirms whether the voice data SD2 in the second memory 27 matches a predetermined voice command. In an embodiment, the first processing circuit 25 may be a low-power microcontroller, for example an 8051 microcontroller, but the invention is not limited thereto. The second processing circuit 26 may be any of various types of processing circuits, such as a general-purpose microprocessor, a microcontroller or a central processing unit, but the invention is not limited thereto.
In one embodiment, the first instruction C1 or the second instruction C2 is an instruction that modifies a shared state.
Fig. 2 is a flow chart of a dynamic voice recognition method according to an embodiment of the invention, and Fig. 3 is a waveform diagram of a dynamic voice recognition device according to an embodiment of the invention. Referring to Figs. 1, 2 and 3, the dynamic voice recognition method uses the dynamic voice recognition device 20 to perform a first stage ST1 (steps S10-S18, S22) and then a second stage ST2 (step S20) or a third stage ST3 (steps S24-S26). Each stage is described in detail below.
In the first stage ST1 (pure standby stage), as shown in step S10, the sound data SD1 is detected by the digital microphone 21 and stored in the first memory 22. In step S12, the voice activity detection circuit 23 detects whether the sound data SD1 contains voice; when voice is detected, it is triggered to generate the voice detection signal SS and outputs it to the first processing circuit 25. In step S14, the first processing circuit 25 determines whether the first memory 22 is full of the sound data SD1 and proceeds to step S16 when it is full, ensuring there is enough sound data SD1 for the next step. In step S16, the first processing circuit 25 selectively determines to execute the second stage ST2 (DMA stage) or the third stage ST3 (voice recognition stage) according to the total effective data amount, the transmission bit rate of the digital microphone 21 and a recognition interval time Ti.
In one embodiment, the target average power consumption, the first average power consumption of the first stage ST1, the second average power consumption of the second stage ST2 and the third average power consumption of the third stage ST3 are known, and the time occupied by each stage in the previous period time T is obtained, including the first time Ta of the first stage ST1, the second time Tb of the second stage ST2 and the third time Tc of the third stage ST3, where the previous period time T equals the sum of the three, i.e. T = Ta + Tb + Tc. In one embodiment, the period time T may be, but is not limited to, 16 seconds. A budget relationship value (budget) for power usage can then be obtained from these parameters: the budget relationship value equals the target average power consumption multiplied by the previous period time T, minus (the first average power consumption of the first stage ST1 multiplied by Ta + the second average power consumption of the second stage ST2 multiplied by Tb + the third average power consumption of the third stage ST3 multiplied by Tc).
After the budget relationship value is obtained, the recognition interval time Ti can be dynamically determined from it. In detail, when the budget relationship value is less than or equal to 1/3 of the target average power consumption multiplied by the previous period time T, the recognition interval time Ti is set to 2 seconds. When the budget relationship value is greater than 1/3 but less than or equal to 2/3 of that product, Ti is set to 1.5 seconds. When the budget relationship value is greater than 2/3 of that product, Ti is set to 1 second. The total effective data amount is the sum of the effective data amount of the first memory 22 and the effective data amount of the second memory 27, and the transmission bit rate of the digital microphone 21 is known. Accordingly, when the total effective data amount is smaller than the product of the transmission bit rate of the digital microphone 21 and the recognition interval time, the first processing circuit 25 determines to execute the DMA stage of the second stage ST2; when the total effective data amount is greater than or equal to that product, the first processing circuit 25 determines to execute the voice recognition stage of the third stage ST3.
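The interval selection and the stage decision described above can be sketched as follows. This is an illustrative rendering, not code from the patent; names are assumptions, data amounts are in bits and the bit rate in bits per second.

```python
def recognition_interval(budget, p_target, t_prev):
    """Map the budget relationship value to Ti: a tight budget defers
    recognition (2 s); a generous budget allows faster response (1 s)."""
    third = p_target * t_prev / 3.0
    if budget <= third:
        return 2.0
    if budget <= 2.0 * third:
        return 1.5
    return 1.0

def choose_stage(total_valid_bits, mic_bitrate_bps, ti_seconds):
    """Stage 2 (DMA only) while the buffered data is still less than
    bitrate * Ti; otherwise stage 3 (DMA plus voice recognition)."""
    return 2 if total_valid_bits < mic_bitrate_bps * ti_seconds else 3
```

The comparison against bitrate × Ti effectively checks whether enough audio has accumulated to make a recognition attempt worthwhile.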
When the first processing circuit 25 determines to execute the second stage ST2, as shown in step S18, the first processing circuit 25 first wakes up the second processing circuit 26 and then proceeds to the second stage ST2. In the second stage ST2, as shown in step S20, the first processing circuit 25 outputs the first command C1 to the second processing circuit 26, and the second processing circuit 26 causes the memory access circuit 24 to transfer the sound data SD1 in the first memory 22 to the second memory 27 and store it as voice data SD2. In the second stage ST2, the data is only transferred to the second memory 27 through the memory access circuit 24; no voice recognition is performed.
When the first processing circuit 25 determines to execute the third stage ST3, as shown in step S22, the first processing circuit 25 first wakes up the second processing circuit 26 and then proceeds to the third stage ST3. In the third stage ST3, as shown in step S24, the first processing circuit 25 outputs the second command C2 to the second processing circuit 26, and the second processing circuit 26 causes the memory access circuit 24 to transfer the sound data SD1 in the first memory 22 to the second memory 27 and store it as voice data SD2. In step S26, the second processing circuit 26 determines whether the voice data SD2 in the second memory 27 matches the preset voice command; if it matches, the system start-up procedure is executed to wake up the other circuits, including the audio/video processing circuit 30, the core processing circuits 31-33 and the third memories 34-36, to boot the system.
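The overall flow of Fig. 2 can be summarized as a single pass like the sketch below, with the hardware blocks stubbed out as callables. Every name here is illustrative, not from the patent.

```python
def run_cycle(read_mic, has_voice, mem_full, decide_stage,
              dma_transfer, matches_hotword):
    """One pass through the flow: returns "boot" only when stage 3 runs
    and the transferred voice data matches the preset voice command."""
    sound = read_mic()                 # S10: microphone -> first memory
    if not has_voice(sound):           # S12: voice activity detection
        return "standby"
    if not mem_full():                 # S14: enough sound data buffered?
        return "standby"
    stage = decide_stage()             # S16: stage 2 (DMA) or stage 3
    dma_transfer(sound)                # S20 / S24: copy to second memory
    if stage == 3 and matches_hotword(sound):   # S26: hotword check
        return "boot"                  # S28: system start-up procedure
    return "standby"
```

In this sketch a stage-2 pass returns to standby after the DMA transfer, mirroring the description: data is buffered but not yet recognized.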
Fig. 4 is a flowchart of a dynamic voice recognition method according to another embodiment of the invention. Referring to Figs. 1, 3 and 4, the dynamic voice recognition method uses the dynamic voice recognition device 20 to perform a first stage ST1 (steps S10-S16) and then a second stage ST2 (step S30) or a third stage ST3 (steps S32-S34). Each stage is described in detail below.
In the first stage ST1 (pure standby stage), as shown in step S10, the sound data SD1 is detected by the digital microphone 21 and stored in the first memory 22. In step S12, the voice activity detection circuit 23 detects whether the sound data SD1 contains voice and, when voice is detected, is triggered to generate the voice detection signal SS and transmit it to the first processing circuit 25. In step S14, the first processing circuit 25 determines whether the first memory 22 is full of the sound data SD1 and proceeds to step S16 when it is full, ensuring there is enough sound data SD1 for the next step. In step S16, the first processing circuit 25 selectively determines to execute the second stage ST2 (DMA stage) or the third stage ST3 (voice recognition stage) according to the total effective data amount, the transmission bit rate of the digital microphone 21 and a recognition interval time Ti.
When the first processing circuit 25 determines to execute the second stage ST2, as shown in step S30, the first processing circuit 25 outputs the first command C1 and wakes up the second processing circuit 26, and the second processing circuit 26 causes the memory access circuit 24 to transfer the sound data SD1 in the first memory 22 to the second memory 27 and store it as voice data SD2.
When the first processing circuit 25 determines to execute the third stage ST3, as shown in step S32, the first processing circuit 25 outputs the second command C2 and wakes up the second processing circuit 26; the second processing circuit 26 causes the memory access circuit 24 to transfer the sound data SD1 in the first memory 22 to the second memory 27 and store it as voice data SD2, and confirms whether the voice data SD2 in the second memory 27 matches the predetermined voice command. In step S34, the second processing circuit 26 determines whether the voice data SD2 matches the preset voice command; if it matches, the system boot program is executed in step S28 to wake up all the circuits and boot the system.
The steps (S10-S26 and S30-S34) of the dynamic voice recognition method are merely examples and are not limited to the order described above. The operations of the dynamic voice recognition method may be added, replaced, omitted or performed in a different order as appropriate without departing from the spirit and scope of the present invention.
In an embodiment, when the first processing circuit 25 receives the voice detection signal SS, it outputs the first command C1 or the second command C2 after the recognition interval time Ti has elapsed. As shown in Figs. 1 and 3, when the first processing circuit 25 receives the voice detection signal SS at time T1, it outputs the first command C1 or the second command C2 at time T2, after the recognition interval time Ti. Because Ti is dynamically determined as described above, the received sound data SD1 is ensured to be sufficient to reflect the predetermined voice command before the second processing circuit 26 and the second memory 27 are enabled, so that low-power operation satisfies the relevant energy-efficiency specifications.
In one embodiment, suppose the keyword set by the preset voice command is "Hi, TV". As shown in Figs. 1 and 3, at time T1 the digital microphone 21 detects an external sound and generates the sound data SD1, which the first memory 22 stores; for example, the digital microphone 21 detects a voice command such as "Hi, TV …" spoken by the user to the dynamic voice recognition device 20. Meanwhile, the voice activity detection circuit 23 determines that the sound data SD1 contains voice and outputs the voice detection signal SS. At time T2, the first processing circuit 25 outputs the first command C1 or the second command C2, and the second processing circuit 26 and the second memory 27 are enabled; the second processing circuit 26 then causes the memory access circuit 24, according to the first command C1 or the second command C2, to transfer the sound data SD1 to the second memory 27 and store it as voice data SD2. The second processing circuit 26 can therefore analyze the voice data SD2 to determine whether it matches the predetermined voice command ("Hi, TV"), and wake up the other circuits to execute the system boot process once the match is confirmed.
In one embodiment, the first stage ST1 uses the digital microphone 21, the first memory 22, the voice activity detection circuit 23 and the first processing circuit 25 of the dynamic voice recognition device 20. The second stage ST2 uses the digital microphone 21, the first memory 22, the voice activity detection circuit 23, the memory access circuit 24, the first processing circuit 25, part of the second processing circuit 26 (only some of its functions are activated) and the second memory 27. The third stage ST3 uses all of the digital microphone 21, the first memory 22, the voice activity detection circuit 23, the memory access circuit 24, the first processing circuit 25, the second processing circuit 26 and the second memory 27. Therefore, the third average power consumption of the third stage ST3 is greater than the second average power consumption of the second stage ST2, which in turn is greater than the first average power consumption of the first stage ST1. For example, the power consumption of the first stage ST1 is about 0.5 watts, that of the third stage ST3 is about 4 watts, and that of the second stage ST2 lies between the two.
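Using the 0.5 W and 4 W figures above, one previous cycle can be accounted for as in the sketch below. The 2 W stage-2 power, the 1 W target and the stage durations are assumptions chosen purely for illustration, not values from the patent.

```python
# Per-stage average power (W): stage-1 and stage-3 values from the text,
# stage-2 value assumed to lie between them.
P1, P2, P3 = 0.5, 2.0, 4.0
Ta, Tb, Tc = 12.0, 2.0, 2.0      # assumed seconds spent in each stage
T = Ta + Tb + Tc                 # previous period time: 16 s
P_TARGET = 1.0                   # assumed target average power (W)

consumed = P1 * Ta + P2 * Tb + P3 * Tc   # energy used last cycle (J)
budget = P_TARGET * T - consumed         # negative here: cycle overshot
# A budget at or below P_TARGET * T / 3 maps to Ti = 2 s, deferring
# recognition in the next cycle and pulling consumption back to target.
```

This shows the feedback loop: heavy stage-3 use in one cycle shrinks the budget, which lengthens Ti and reduces recognition activity in the next cycle.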
Therefore, the present invention determines the budget relationship value from the time occupied by each stage in the previous period time T (the first, second and third times) and the average power consumption of each stage, dynamically determines the length of the recognition interval time Ti from the budget relationship value, and then decides whether recognition of the voice data is required (executing the second stage ST2 or the third stage ST3), so that voice recognition is performed dynamically according to the power consumed in actual operation. The invention can thus take the user experience into account during dynamic voice recognition while reducing the average power consumption of searching for the preset voice command in standby mode, thereby providing a method with better sensitivity.
The above embodiments are only for illustrating the technical spirit and features of the present invention, and it is intended to enable those skilled in the art to understand the present invention and to implement it according to the present invention, but not limit the scope of the present invention, i.e. the scope of the present invention shall be covered by the appended claims.
Claims (9)
1. A dynamic speech recognition method, comprising:
A first phase is performed:
detecting sound data by a digital microphone and storing the sound data in a first memory;
detecting voice in the sound data to generate a voice detection signal; and
selectively determining, by a first processing circuit, to execute a second stage or a third stage according to a total effective data amount, a transmission bit rate of the digital microphone and a recognition interval time;
the second phase is performed:
the first processing circuit outputs a first instruction to a second processing circuit, and the second processing circuit enables a memory access circuit to transfer the sound data to a second memory and store the sound data as voice data according to the first instruction; and
The third phase is performed:
The first processing circuit outputs a second instruction to the second processing circuit, the second processing circuit causes the memory access circuit to transfer the sound data to the second memory and store it as the voice data according to the second instruction, and the second processing circuit confirms whether the voice data in the second memory matches a preset voice instruction, wherein the first processing circuit determines to execute the second stage when the total effective data amount is smaller than the product of the transmission bit rate of the digital microphone and the recognition interval time, and determines to execute the third stage when the total effective data amount is greater than or equal to that product, wherein the total effective data amount is the sum of the effective data amount of the first memory and the effective data amount of the second memory.
2. The method of claim 1, wherein the first processing circuit outputs the first command or the second command after the recognition interval time when the first processing circuit receives the voice detection signal.
3. The method of claim 2, wherein the recognition interval time is determined by a budget relation value: the recognition interval time is 2 seconds when the budget relation value is less than or equal to 1/3 of the product of the target average power consumption and a previous period time; the recognition interval time is 1.5 seconds when the budget relation value is greater than 1/3 and less than or equal to 2/3 of that product; and the recognition interval time is 1 second when the budget relation value is greater than 2/3 of that product.
4. The method of claim 3, wherein the budget relation value is the target average power consumption multiplied by the previous period time, minus (a first average power consumption of the first stage multiplied by a first time + a second average power consumption of the second stage multiplied by a second time + a third average power consumption of the third stage multiplied by a third time), wherein the previous period time is equal to the sum of the first time, the second time, and the third time.
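Claims 3 and 4 together define a power-budget feedback loop: the energy left over from the previous period sets how often recognition runs. A Python sketch of the arithmetic, assuming power in watts and time in seconds (all names are illustrative assumptions, not from the patent):

```python
def budget_relation_value(target_avg_w: float,
                          p1_w: float, t1_s: float,
                          p2_w: float, t2_s: float,
                          p3_w: float, t3_s: float) -> float:
    """Claim 4: energy budget left over from the previous period,
    i.e. target energy minus energy actually spent in the three stages."""
    previous_period = t1_s + t2_s + t3_s
    consumed = p1_w * t1_s + p2_w * t2_s + p3_w * t3_s
    return target_avg_w * previous_period - consumed

def recognition_interval(budget: float, target_avg_w: float,
                         previous_period_s: float) -> float:
    """Claim 3: map the budget surplus to a recognition interval time."""
    reference = target_avg_w * previous_period_s
    if budget <= reference / 3:
        return 2.0   # little headroom -> recognize less often
    if budget <= 2 * reference / 3:
        return 1.5
    return 1.0       # plenty of headroom -> recognize more often
```

A larger surplus yields a shorter interval, so the device recognizes more aggressively only when its power budget allows it.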
5. The method of claim 4, wherein the third average power consumption is greater than the second average power consumption, and the second average power consumption is greater than the first average power consumption.
6. The method of claim 1, further comprising, after the step of generating the voice detection signal: determining whether the first memory is full of the sound data, and proceeding to the next step when the first memory is full.
7. The method of claim 1, wherein the step of executing the first stage further comprises, after selectively determining to execute the second stage or the third stage: waking up the second processing circuit by the first processing circuit.
8. The method of claim 1, wherein the first processing circuit wakes up the second processing circuit when the first processing circuit outputs the first instruction or the second instruction.
9. A dynamic speech recognition device, comprising:
a digital microphone for detecting a sound data;
a first memory electrically connected to the digital microphone for storing the sound data;
a voice activity detection circuit electrically connected to the digital microphone for detecting voice in the sound data and generating a voice detection signal;
a memory access circuit electrically connected to the first memory for transferring the sound data to a second memory to be stored as a voice data;
a first processing circuit electrically connected to the voice activity detection circuit; and
a second processing circuit electrically connected to the first processing circuit, the second memory, and the memory access circuit;
the dynamic voice recognition device is used for executing the following steps:
executing a first stage:
detecting the sound data by the digital microphone and storing the sound data in the first memory;
detecting, by the voice activity detection circuit, voice in the sound data to generate the voice detection signal; and
selectively determining, by the first processing circuit, to execute a second stage or a third stage according to a total effective data amount, a transmission bit rate of the digital microphone, and a recognition interval time;
executing the second stage:
outputting, by the first processing circuit, a first instruction to the second processing circuit, wherein the second processing circuit enables the memory access circuit to transfer the sound data to the second memory and store the sound data as the voice data according to the first instruction; and
executing the third stage:
outputting, by the first processing circuit, a second instruction to the second processing circuit, wherein the second processing circuit enables the memory access circuit to transfer the sound data to the second memory and store the sound data as the voice data according to the second instruction, and the second processing circuit confirms whether the voice data in the second memory matches a preset voice instruction; wherein the first processing circuit determines to execute the second stage when the total effective data amount is smaller than the product of the transmission bit rate of the digital microphone and the recognition interval time, and determines to execute the third stage when the total effective data amount is greater than or equal to that product, the total effective data amount being the sum of the effective data amount of the first memory and the effective data amount of the second memory.
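The three-stage flow of the device claim can be modeled end to end in software. The following sketch is a hypothetical illustration only: the token-based "audio", the fixed bits-per-token assumption, and the substring matcher stand in for the real microphone, DMA engine, and recognition hardware described in the claims.

```python
from collections import deque
from typing import Optional

WAKE_WORD = "hello device"  # illustrative preset voice instruction
BITS_PER_TOKEN = 16000      # illustrative: each token models 16 kbit of audio

class DynamicRecognizer:
    """Software model of the claimed three-stage flow:
    microphone -> first memory -> transfer -> second memory -> matcher."""

    def __init__(self, bit_rate_bps: float, interval_s: float):
        self.bit_rate_bps = bit_rate_bps
        self.interval_s = interval_s
        self.first_mem = deque()   # small always-on buffer (first memory)
        self.second_mem = []       # larger recognition buffer (second memory)

    def _total_effective_bits(self) -> float:
        return BITS_PER_TOKEN * (len(self.first_mem) + len(self.second_mem))

    def stage1(self, token: str, voice_detected: bool) -> Optional[str]:
        """Stage 1: buffer audio; on a voice detection, pick stage 2 or 3."""
        self.first_mem.append(token)
        if not voice_detected:
            return None  # stay in the low-power listening stage
        threshold = self.bit_rate_bps * self.interval_s
        if self._total_effective_bits() < threshold:
            return self.stage2()
        return self.stage3()

    def stage2(self) -> str:
        """Stage 2: DMA-style transfer from first to second memory."""
        while self.first_mem:
            self.second_mem.append(self.first_mem.popleft())
        return "buffering"

    def stage3(self) -> str:
        """Stage 3: transfer, then match against the preset instruction."""
        self.stage2()
        heard = " ".join(self.second_mem)
        return "matched" if WAKE_WORD in heard else "no match"
```

The model shows why stage 2 exists at all: when too little audio has accumulated to fill one recognition interval, the device only moves data and defers the (more expensive) matching step.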
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911242880.2A CN112927685B (en) | 2019-12-06 | 2019-12-06 | Dynamic voice recognition method and device thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112927685A (en) | 2021-06-08 |
CN112927685B (en) | 2024-09-03 |
Family
ID=76161669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911242880.2A Active CN112927685B (en) | 2019-12-06 | 2019-12-06 | Dynamic voice recognition method and device thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112927685B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201311379D0 (en) * | 2013-06-26 | 2013-08-14 | Wolfson Microelectronics Plc | Speech Recognition |
CN109285540A (en) * | 2017-07-21 | 2019-01-29 | 致伸科技股份有限公司 | The operating system of digital speech assistant |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9142215B2 (en) * | 2012-06-15 | 2015-09-22 | Cypress Semiconductor Corporation | Power-efficient voice activation |
US9361885B2 (en) * | 2013-03-12 | 2016-06-07 | Nuance Communications, Inc. | Methods and apparatus for detecting a voice command |
US9703350B2 (en) * | 2013-03-15 | 2017-07-11 | Maxim Integrated Products, Inc. | Always-on low-power keyword spotting |
US20140358552A1 (en) * | 2013-05-31 | 2014-12-04 | Cirrus Logic, Inc. | Low-power voice gate for device wake-up |
US9613626B2 (en) * | 2015-02-06 | 2017-04-04 | Fortemedia, Inc. | Audio device for recognizing key phrases and method thereof |
US11189273B2 (en) * | 2017-06-29 | 2021-11-30 | Amazon Technologies, Inc. | Hands free always on near field wakeword solution |
2019-12-06: CN application CN201911242880.2A granted as patent CN112927685B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN112927685A (en) | 2021-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11862173B2 (en) | Always-on audio control for mobile device | |
US11217256B2 (en) | Voice interaction method, device and terminal | |
KR101994569B1 (en) | Clock Switching on Constant-On Components | |
US8533510B2 (en) | Power management method for a multi-microprocessor system | |
US11373637B2 (en) | Processing system and voice detection method | |
US8015329B2 (en) | Data transfer coherency device and methods thereof | |
TWI727521B (en) | Dynamic speech recognition method and apparatus therefor | |
CN112927685B (en) | Dynamic voice recognition method and device thereof | |
CN1937075A (en) | Data transfer operation completion detection circuit and semiconductor memory device provided therewith | |
US20040027882A1 (en) | Semiconductor memory device and control method therefor | |
CN111414071B (en) | Processing system and voice detection method | |
US6643732B1 (en) | Delayed read/write scheme for SRAM interface compatible DRAM | |
US20030145240A1 (en) | System, method and computer program product for selecting a power management mode in an information handling system | |
JP2002245794A (en) | Sdram refresh circuit | |
JP2003132012A (en) | System for bus control and method therefor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||