WO2020125522A1

WO2020125522A1 - Automatic calibration method, device and apparatus for microphone array and storage medium

Info

Publication number: WO2020125522A1
Application number: PCT/CN2019/124639
Authority: WO
Inventors: 孙铭
Original assignee: 深圳Tcl新技术有限公司
Priority date: 2018-12-17
Filing date: 2019-12-11
Publication date: 2020-06-25
Also published as: CN109451415A

Abstract

The present application discloses an automatic calibration method, device and apparatus for a microphone array, and a storage medium. The method comprises the following steps: obtaining digital audio data produced after the respective channels of a microphone array preprocess a picked-up reference audio signal; calculating audio signal energy values corresponding to the respective channels according to the digital audio data; detecting whether the audio signal energy values satisfy a preset consistency condition; and, if the audio signal energy values do not satisfy the preset consistency condition, adjusting gain values of the respective channels so that the audio signal energy values satisfy the preset consistency condition.

Description

Automatic calibration method, device, equipment and storage medium for a microphone array

This application claims priority to Chinese patent application No. 201811542125.1, filed with the Chinese Patent Office on December 17, 2018 and titled "Microphone array automatic calibration method, device, equipment and storage medium", the entire content of which is incorporated herein by reference.

Technical field

The present application relates to the technical field of microphones, in particular to a method, device, equipment and storage medium for automatic calibration of microphone arrays.

Background

With the development of artificial intelligence technology, intelligent voice interaction is used in more and more scenarios, and the requirements for far-field voice recognition keep rising. Picking up sound with a microphone array is currently the most important far-field pickup method: multi-channel synchronous voice data is collected, processed by noise reduction, sound source localization, beamforming and similar measures, and then sent to the back-end voice recognition module. The pickup performance of the microphone array is therefore a prerequisite for the quality of the entire far-field speech recognition system. Microphone arrays come in different spatial layouts, such as linear and circular, but in every layout the main requirement is that all microphones maintain the same pickup gain. Because of individual differences in microphone array hardware, however, the pickup consistency of the microphone channels may fail to meet the technical requirements. This causes deviations in the algorithmic calculations of the back-end speech recognition module and thereby degrades far-field sound pickup and speech recognition.

Summary of the invention

The main purpose of the present application is to provide an automatic calibration method, device and computer-readable storage medium for a microphone array, aiming to solve the technical problem that, because of individual differences in microphone array hardware, the sound pickup consistency of the microphone channels cannot meet the technical requirements.

In order to achieve the above object, the present application provides an automatic calibration method of a microphone array, which includes the steps of:

Obtaining digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal;

Calculating the energy value of the audio signal corresponding to each channel according to the digital audio data;

Detecting whether the energy value of the audio signal corresponding to each channel satisfies a preset consistency condition;

If the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, adjusting the gain value of each channel so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

Optionally, the step of obtaining digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal includes:

Acquiring digital audio data produced after each channel of the microphone array performs gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal.

Optionally, the step of calculating the energy value of the audio signal corresponding to each channel according to the digital audio data includes:

Separately framing the digital audio data corresponding to each channel;

Acquiring audio data of a preset frame in the digital audio data corresponding to each channel;

Calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.

Optionally, the step of detecting whether the energy value of the audio signal corresponding to each channel meets a preset consistency condition includes:

Calculating the average energy value of the energy values of the audio signals corresponding to the respective channels;

Separately calculating, for each channel, the absolute value of the energy difference between the energy value of the audio signal corresponding to that channel and the average energy value;

Detecting whether the absolute value of the energy difference of each channel is less than a preset difference;

If the absolute value of the energy difference of each channel is less than the preset difference, it is determined that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

Optionally, the step of adjusting the gain value of each channel includes:

Determine the channel to be adjusted in each channel;

The gain value of the channel to be adjusted is adjusted according to a preset adjustment mode.

Optionally, the step of determining the channel to be adjusted in each channel includes:

Determine the channel with the largest absolute value of the energy difference among all channels as the channel to be adjusted;

The step of adjusting the gain value of the channel to be adjusted according to a preset adjustment mode includes:

Determine whether the energy value of the audio signal of the channel to be adjusted is greater than the average energy value;

If the energy value of the audio signal of the channel to be adjusted is greater than the average energy value, the gain value of the channel to be adjusted is reduced by a preset value;

If the energy value of the audio signal of the channel to be adjusted is less than the average energy value, the gain value of the channel to be adjusted is increased by a preset value.
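The single-channel adjustment rule above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `adjust_largest_deviation`, the use of decibel-like gain units, and the sample numbers are assumptions made for the example.

```python
from typing import List


def adjust_largest_deviation(energies: List[float], gains: List[float], step: float) -> int:
    """Adjust, in place, the gain of the channel that deviates most from the mean energy.

    `step` plays the role of the "preset value"; its unit is an assumption.
    Returns the index of the channel that was selected for adjustment.
    """
    mean_energy = sum(energies) / len(energies)
    deviations = [abs(e - mean_energy) for e in energies]
    idx = deviations.index(max(deviations))  # channel to be adjusted
    if energies[idx] > mean_energy:
        gains[idx] -= step  # too loud: reduce the gain by the preset value
    elif energies[idx] < mean_energy:
        gains[idx] += step  # too quiet: increase the gain by the preset value
    return idx


# Hypothetical four-channel array with one channel that is clearly too loud.
gains = [20.0, 20.0, 20.0, 20.0]
energies = [1.00, 1.02, 1.40, 0.98]
adjusted = adjust_largest_deviation(energies, gains, step=0.5)
```

Only one channel changes per call, so the rule is meant to be applied repeatedly, with the energies re-measured after each adjustment.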

Optionally, after the step of adjusting the gain value of the channel to be adjusted according to a preset adjustment mode, the method further includes:

Detecting whether distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value;

If it is detected that distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value, outputting a prompt message to modify the preset adjustment mode.
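The application does not say how distortion is detected. One plausible reading, shown here purely as an assumption, is clipping at the full scale of the analog-to-digital converter. The names `is_clipped` and `check_after_adjustment`, the 16-bit sample size and the 0.1% threshold are all hypothetical.

```python
from typing import Optional, Sequence


def is_clipped(samples: Sequence[int], bit_depth: int = 16,
               max_clipped_ratio: float = 0.001) -> bool:
    """Return True if too many samples sit at the converter's full scale (assumed criterion)."""
    if not samples:
        return False
    full_scale = 2 ** (bit_depth - 1) - 1
    clipped = sum(1 for s in samples if abs(s) >= full_scale)
    return clipped / len(samples) > max_clipped_ratio


def check_after_adjustment(samples: Sequence[int]) -> Optional[str]:
    """Return a prompt message when distortion is detected, otherwise None."""
    if is_clipped(samples):
        return "Distortion detected: please modify the preset adjustment mode."
    return None


clean = [0, 1000, -1000, 2000]
saturated = [32767, -32768, 100, 32767]
message = check_after_adjustment(saturated)
```

A real device might use a different distortion measure (for example harmonic distortion of a test tone); the clipping test is only the simplest stand-in.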

Optionally, before the step of adjusting the gain value of each channel, the method further includes:

Detecting whether the current number of adjustments made to the gain values of the channels is greater than a preset number of times;

If the current number of adjustments made to the gain values of the channels is greater than the preset number of times, performing preset error reporting.
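The adjustment-count guard can be sketched as below. The form of the "preset error reporting" is not specified in the application, so raising an exception is an assumption, as are the names `CalibrationError` and `guard_adjustment_count`.

```python
class CalibrationError(RuntimeError):
    """Hypothetical error type standing in for the 'preset error reporting'."""


def guard_adjustment_count(current_adjustments: int, preset_times: int) -> None:
    """Raise if the number of gain adjustments already made exceeds the preset number."""
    if current_adjustments > preset_times:
        raise CalibrationError(
            f"gain adjusted {current_adjustments} times, limit is {preset_times}"
        )


# Called before each adjustment step; passes silently while within the limit.
guard_adjustment_count(current_adjustments=3, preset_times=10)
```

Such a guard keeps the calibration loop from running indefinitely when a channel cannot be brought into line, for example because of a faulty microphone.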

In addition, in order to achieve the above object, the present application also provides a microphone array automatic calibration device. The microphone array automatic calibration device includes:

An acquisition module configured to obtain digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal;

A calculation module configured to calculate the energy value of the audio signal corresponding to each channel according to the digital audio data;

A detection module configured to detect whether the energy value of the audio signal corresponding to each channel meets a preset consistency condition;

An adjustment module configured to adjust the gain value of each channel, if the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

In addition, in order to achieve the above object, the present application also provides a microphone array automatic calibration device that includes a memory, a processor, and a microphone array automatic calibration program stored on the memory and executable on the processor. When the microphone array automatic calibration program is executed by the processor, the steps of the automatic calibration method for the microphone array described above are implemented.

In addition, in order to achieve the above object, the present application also provides a computer-readable storage medium on which a microphone array automatic calibration program is stored. When the microphone array automatic calibration program is executed by a processor, the steps of the automatic calibration method for the microphone array described above are implemented.

This application obtains digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal; calculates the energy value of the audio signal corresponding to each channel according to the digital audio data; detects whether the energy value of the audio signal corresponding to each channel satisfies a preset consistency condition; and, if it does not, adjusts the gain value of each channel so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition. Each microphone channel of the microphone array is thereby calibrated automatically, compensating for the hardware differences between microphone channels so that the sound pickup of each microphone channel meets the consistency requirements.

Brief description of the drawings

FIG. 1 is a schematic structural diagram of a hardware operating environment involved in an embodiment of the present application;

FIG. 2 is a schematic flowchart of a first embodiment of the automatic calibration method of the microphone array of the present application;

FIG. 3 is a detailed flowchart of step S3 in the embodiment of the automatic calibration method of the microphone array of the present application.

The implementation, functional characteristics and advantages of the present application will be further described in conjunction with the embodiments and with reference to the drawings.

Detailed description

It should be understood that the specific embodiments described herein are only used to explain the present application, and are not used to limit the present application.

To address the technical problem that, because of individual differences in microphone array hardware, the pickup consistency of the microphone channels cannot meet the technical requirements, this application provides a solution: obtaining digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal; calculating the energy value of the audio signal corresponding to each channel according to the digital audio data; detecting whether the energy value of the audio signal corresponding to each channel satisfies a preset consistency condition; and, if it does not, adjusting the gain value of each channel so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition. Each microphone channel of the microphone array is thereby calibrated automatically, compensating for the hardware differences between microphone channels so that the sound pickup of each microphone channel meets the consistency requirements.

The present application provides a microphone array automatic calibration device. Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the hardware operating environment involved in an embodiment of the present application.

It should be noted that FIG. 1 is a schematic diagram of the hardware operating environment of the microphone array automatic calibration device.

As shown in FIG. 1, the microphone array automatic calibration device may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. The memory 1005 may optionally be a storage device independent of the foregoing processor 1001.

Optionally, the microphone array automatic calibration device may further include a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and so on. Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the microphone array automatic calibration device, which may include more or fewer components than illustrated, a combination of certain components, or a different arrangement of components.

As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a microphone array automatic calibration program.

In the microphone array automatic calibration device shown in FIG. 1, the network interface 1004 is mainly used to connect to the microphone array device and exchange data with it. The calibration operator can trigger a calibration instruction through the user interface 1003, so that the microphone array automatic calibration device automatically calibrates the microphone array device according to the calibration instruction. The user interface 1003 can also be used to display voice data and calibration results. The processor 1001 can be used to call the microphone array automatic calibration program stored in the memory 1005 and perform the following operations:

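Obtaining digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal;

Calculating the energy value of the audio signal corresponding to each channel according to the digital audio data;

Detecting whether the energy value of the audio signal corresponding to each channel satisfies a preset consistency condition;

If the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, adjusting the gain value of each channel so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

Further, the step of obtaining digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal includes:

Acquiring digital audio data produced after each channel of the microphone array performs gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal.

Further, the step of calculating the energy value of the audio signal corresponding to each channel according to the digital audio data includes:

Separately framing the digital audio data corresponding to each channel;

Acquiring audio data of a preset frame in the digital audio data corresponding to each channel;

Calculating the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.

Further, the step of detecting whether the energy value of the audio signal corresponding to each channel satisfies a preset consistency condition includes:

Calculating the average energy value of the energy values of the audio signals corresponding to the respective channels;

Separately calculating, for each channel, the absolute value of the energy difference between the energy value of the audio signal corresponding to that channel and the average energy value;

Detecting whether the absolute value of the energy difference of each channel is less than a preset difference;

If the absolute value of the energy difference of each channel is less than the preset difference, determining that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

Further, the step of adjusting the gain value of each channel includes:

Determining the channel to be adjusted among the channels;

Adjusting the gain value of the channel to be adjusted according to a preset adjustment mode.

Further, the step of determining the channel to be adjusted among the channels includes:

Determining the channel with the largest absolute value of the energy difference among all channels as the channel to be adjusted;

The step of adjusting the gain value of the channel to be adjusted according to a preset adjustment mode includes:

Determining whether the energy value of the audio signal of the channel to be adjusted is greater than the average energy value;

If the energy value of the audio signal of the channel to be adjusted is greater than the average energy value, reducing the gain value of the channel to be adjusted by a preset value;

If the energy value of the audio signal of the channel to be adjusted is less than the average energy value, increasing the gain value of the channel to be adjusted by a preset value.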

Further, after the step of adjusting the gain value of the channel to be adjusted according to the preset adjustment mode, the processor 1001 may call the microphone array automatic calibration program stored in the memory 1005, and also perform the following operations:

Detecting whether a distortion phenomenon occurs after the channel to be adjusted performs gain amplification on the reference audio signal according to the adjusted gain value;

If it is detected that distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value, a prompt message to modify the preset adjustment mode is output.

Based on the above hardware structure, various embodiments of the automatic calibration method for a microphone array of the present application are proposed. The method is mainly applied to the microphone array automatic calibration device described above. In the following embodiments, for convenience of description, the calibration device is taken as the executing entity.

Referring to FIG. 2, the first embodiment of the present application provides an automatic calibration method for a microphone array. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from the one presented here. The automatic calibration method of the microphone array includes:

Step S1: Obtain digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal;

The calibration device and the microphone array device are connected in a wired or wireless manner for data transmission. The calibration device may include a sound source device, which it controls to play a reference audio stream. The reference audio stream is a segment of audio whose energy value is stable enough to meet the stability requirements. The calibration environment is preferably noise-free, which gives the best calibration result. Each microphone channel of the microphone array converts the reference audio stream into a reference audio signal, in the same way that existing microphones convert sound into electrical signals. After picking up the reference audio signal, each channel preprocesses it to obtain digital audio data. The preprocessing may consist of gain amplification followed by analog-to-digital conversion, or of any other processing capable of converting the reference audio signal into digital audio data. The calibration device obtains the digital audio data produced after each channel of the microphone array preprocesses the reference audio signal.

Further, step S1 includes:

Step S11: Acquire digital audio data produced after each channel of the microphone array performs gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal.

After picking up the reference audio signal, each channel of the microphone array performs gain amplification processing on it, initially using an initial gain value. The initial gain value of each channel can be set in advance and should be the same for all channels. In an ideal state, microphone channels that amplify the same reference audio signal with the same initial gain value would produce audio signals with the same gain amplification characteristic. In practice, the microphone channels of the microphone array differ in hardware, and such differences are difficult to avoid in production. The initial gain value of each channel therefore needs to be adjusted so that the gain amplification characteristics of the channels are consistent. After gain amplification, each channel performs analog-to-digital conversion on the amplified audio signal to obtain digital audio data. The analog-to-digital conversion samples and quantizes the signal according to the sampling specifications preset in the microphone array device, such as the sampling rate and sample size. The resulting digital audio data consists of the sample values of the sampling points, recorded in sampling order. The calibration device acquires the digital audio data corresponding to each channel from the microphone array device.
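To make the preprocessing chain concrete, the sketch below simulates two channels that receive the same reference tone with the same initial gain but differ in hardware sensitivity. Everything here is an illustrative assumption: the function names, expressing gain in decibels, modelling the hardware difference as a single `sensitivity` factor, and the 16-bit, 16 kHz sampling specification.

```python
import math
from typing import List, Sequence


def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in decibels to a linear factor."""
    return 10.0 ** (gain_db / 20.0)


def preprocess_channel(analog: Sequence[float], gain_db: float,
                       sensitivity: float = 1.0, bit_depth: int = 16) -> List[int]:
    """Gain amplification followed by quantization to signed integers.

    `analog` holds samples normalized to [-1.0, 1.0]; `sensitivity` is a
    hypothetical per-channel factor that models hardware differences.
    """
    full_scale = 2 ** (bit_depth - 1) - 1
    scale = sensitivity * db_to_linear(gain_db)
    digital = []
    for value in analog:
        quantized = int(round(value * scale * full_scale))
        # Saturate at the converter limits, as a real ADC would.
        digital.append(max(-full_scale - 1, min(full_scale, quantized)))
    return digital


# Reference signal: 10 ms of a 1 kHz tone sampled at 16 kHz (assumed values).
sample_rate = 16000
reference = [0.01 * math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(160)]

ch_a = preprocess_channel(reference, gain_db=20.0, sensitivity=1.0)
ch_b = preprocess_channel(reference, gain_db=20.0, sensitivity=0.9)
```

Although both channels use the same initial gain, `ch_b` comes out quieter than `ch_a`, which is the kind of mismatch the gain adjustment is meant to remove.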

Step S2: Calculate the energy value of the audio signal corresponding to each channel according to the digital audio data;

After acquiring the digital audio data corresponding to each channel, the calibration device calculates the energy value of the audio signal corresponding to each channel from that data. Since the reference audio stream is an audio stream whose energy value is stable enough to meet the stability requirements, the amplified audio signals that the channels produce from the picked-up reference audio signal should ideally have equal energy values. However, hardware differences between the microphone channels of the microphone array may cause the energy value of one channel's amplified audio signal to differ too much from that of the other channels. The gain value of each channel therefore needs to be adjusted so that the energy values of the amplified audio signals of the channels meet the consistency requirements. The calibration device calculates, from the digital audio data acquired from each channel, the energy value of the audio signal collected by each channel at the same time, which gives the energy value of the audio signal corresponding to each channel.

Further, step S2 includes:

Step S21: Separately frame the digital audio data corresponding to each channel;

The short-term average energy $E_n$ of a speech signal at time $n$ is defined as:

Formula 1:

$$E_n = \sum_{m=n-N+1}^{n} \left[ x(m)\,\omega(n-m) \right]^2$$

where N is the window length, m is a sampling point within the window, x(m) is the sample value at that sampling point, and ω(n-m) is the window function. The short-term average energy is thus the weighted sum of squares of the sample values of one frame. In particular, when the window function is a rectangular window, Formula 2 applies:

Formula 2:

$$E_n = \sum_{m=n-N+1}^{n} x(m)^2$$

In this case, the short-term average energy is the sum of squares of the sample values in one frame. The calibration device first performs windowing and framing on the digital audio data corresponding to each channel; the window function is a rectangular window, and the frame length can be set in advance according to specific needs.
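The relationship between the windowed energy and the rectangular-window special case can be checked numerically. The sketch assumes the usual textbook form of short-time energy, in which each sample in the window ending at time n is multiplied by the window weight before squaring; the function names and the Hamming window used for comparison are illustrative choices, not part of the application.

```python
import math
from typing import Callable, Sequence


def rectangular(k: int, length: int) -> float:
    """Rectangular window: every sample in the frame has weight 1."""
    return 1.0


def hamming(k: int, length: int) -> float:
    """Hamming window, included only to contrast with the rectangular case."""
    if length == 1:
        return 1.0
    return 0.54 - 0.46 * math.cos(2 * math.pi * k / (length - 1))


def short_time_energy(x: Sequence[float], n: int, N: int,
                      window: Callable[[int, int], float] = rectangular) -> float:
    """Weighted sum of squares over the N samples ending at index n."""
    if n - N + 1 < 0 or n >= len(x):
        raise ValueError("window does not fit inside the signal")
    return sum((x[m] * window(n - m, N)) ** 2 for m in range(n - N + 1, n + 1))


x = [1, -2, 3, -4, 5, -6]
e_rect = short_time_energy(x, n=5, N=4)                  # plain sum of squares
e_hamm = short_time_energy(x, n=5, N=4, window=hamming)  # tapered frame
```

With the rectangular window the result is simply 3² + 4² + 5² + 6², which is why the method can compute each frame's energy with a plain sum of squares.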

Step S22: Acquire audio data of a preset frame in the digital audio data corresponding to each channel;

After windowing and framing the digital audio data corresponding to each channel, the calibration device obtains the audio data of a preset frame from the framed digital audio data of each channel. The preset frame may be the N-th frame of the framed digital audio data, where the frame index N (not to be confused with the window length N in Formula 1) may be set in advance according to specific needs.

Step S23: Calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.

After acquiring the audio data of the preset frame of each channel, the calibration device substitutes the audio data of the preset frame into Formula 2 above to obtain the energy value of the audio signal corresponding to each channel.
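Steps S21 to S23 can be sketched end to end as follows. The helper names, the choice of non-overlapping frames, the dropping of a trailing partial frame and the zero-based frame index are assumptions for illustration; the application only requires rectangular-window framing with a preset frame length.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[int], frame_length: int) -> List[List[int]]:
    """Step S21: cut a channel's samples into non-overlapping frames (assumed layout)."""
    count = len(samples) // frame_length  # a trailing partial frame is dropped
    return [list(samples[i * frame_length:(i + 1) * frame_length]) for i in range(count)]


def frame_energy(frame: Sequence[int]) -> float:
    """Step S23: rectangular-window energy, the sum of squared sample values."""
    return float(sum(s * s for s in frame))


def channel_energies(channels: Sequence[Sequence[int]], frame_length: int,
                     preset_frame: int) -> List[float]:
    """Energy of the same preset frame (zero-based index) in every channel."""
    energies = []
    for samples in channels:
        frames = split_into_frames(samples, frame_length)
        energies.append(frame_energy(frames[preset_frame]))  # step S22 picks the frame
    return energies


# Two toy channels of eight samples each, framed into two frames of four samples.
channels = [
    [1, 1, 1, 1, 2, 2, 2, 2],
    [1, 1, 1, 1, 3, 3, 3, 3],
]
energies = channel_energies(channels, frame_length=4, preset_frame=1)
```

Using the same frame index for every channel is what ensures that the compared energies come from audio collected at the same time.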

Step S3, detecting whether the energy value of the audio signal corresponding to each channel meets the preset consistency condition;

After calculating the energy value of the audio signal corresponding to each channel, the calibration device determines whether these energy values satisfy the preset consistency condition. Ideally, the energy values of the audio signals corresponding to the microphone channels would be exactly the same, so that a far-field speech recognition system using the microphone array for pickup achieves the best recognition result. Because of individual hardware differences between the microphone channels, however, the energy value of one channel's amplified audio signal may differ too much from that of the other channels. It is therefore necessary to detect whether the energy values of the audio signals corresponding to the channels satisfy the preset consistency condition, that is, whether any channel's amplified audio signal energy value differs too much from the others. The preset consistency condition may be that the difference between the energy values of the audio signals of any two channels is less than a preset energy difference, or that the difference between the energy value of the audio signal of each channel and the average energy value is less than a preset energy difference. The preset energy difference can be set according to specific needs. The smaller the preset energy difference, the smaller the remaining differences between the energy values of the microphone channels and the better the calibration result, but the more computation the calibration device may need. The preset energy difference should therefore take a reasonable value, so that calibration does not take too long while the calibration result is still ensured.
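The first form of the consistency condition, read here as a bound on the difference between any two channels, reduces to comparing the largest and smallest energy values. The function name and the example threshold are assumptions.

```python
from typing import Sequence


def pairwise_consistent(energies: Sequence[float], preset_difference: float) -> bool:
    """True if every pair of channels differs by less than the preset energy difference.

    The largest pairwise difference is max - min, so one comparison is enough.
    """
    return max(energies) - min(energies) < preset_difference


ok = pairwise_consistent([1.00, 1.02, 0.98], preset_difference=0.05)
bad = pairwise_consistent([1.00, 1.40, 0.98], preset_difference=0.05)
```

This variant is stricter than comparing each channel with the mean using the same threshold, because two channels can sit on opposite sides of the mean.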

Step S4, if the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, adjust the gain value of each channel so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

If the calibration device detects that the energy values of the audio signals corresponding to the channels do not satisfy the preset consistency condition, it adjusts the gain values of the channels. Each microphone channel then performs gain amplification on the picked-up reference audio signal according to the adjusted gain value, followed by analog-to-digital conversion. The calibration device obtains the digital audio data corresponding to each channel again, recalculates the energy value of the audio signal corresponding to each channel, and detects whether the energy values satisfy the preset consistency condition. If they do not, it adjusts the gain values again, and this loop repeats until the energy values of the audio signals corresponding to the channels are detected to satisfy the preset consistency condition. At that point the gain values of the channels no longer change, and calibration is complete.
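The measure, check, adjust loop of steps S1 to S4 can be sketched with a simulated microphone array. This is a sketch under stated assumptions rather than the claimed implementation: `calibrate`, `simulated_measure`, the decibel gain units, the per-channel `SENSITIVITIES`, the mean-based consistency check and the largest-deviation adjustment rule are all illustrative choices.

```python
from typing import Callable, List


def calibrate(measure: Callable[[List[float]], List[float]],
              initial_gains_db: List[float],
              preset_difference: float,
              step_db: float,
              preset_times: int) -> List[float]:
    """Repeat measure -> check -> adjust until the channel energies are consistent."""
    gains = list(initial_gains_db)
    adjustments = 0
    while True:
        energies = measure(gains)                        # steps S1 and S2
        mean_energy = sum(energies) / len(energies)
        deviations = [abs(e - mean_energy) for e in energies]
        if all(d < preset_difference for d in deviations):   # step S3
            return gains                                 # calibration complete
        if adjustments > preset_times:
            raise RuntimeError("calibration did not converge")
        worst = deviations.index(max(deviations))        # step S4
        gains[worst] += -step_db if energies[worst] > mean_energy else step_db
        adjustments += 1


# Hypothetical hardware: three channels with slightly different sensitivities.
SENSITIVITIES = [1.0, 0.9, 1.1]


def simulated_measure(gains_db: List[float]) -> List[float]:
    """Energy each channel would report for a unit-energy reference signal."""
    return [(s * 10.0 ** (g / 20.0)) ** 2 for s, g in zip(SENSITIVITIES, gains_db)]


final_gains = calibrate(simulated_measure, [0.0, 0.0, 0.0],
                        preset_difference=0.05, step_db=0.1, preset_times=50)
final_energies = simulated_measure(final_gains)
```

In this simulation the quiet channel ends up with a raised gain and the loud channel with a lowered one. On real hardware, `measure` would replay the reference audio and read back the digital audio data after every adjustment.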

In this embodiment, digital audio data produced after each channel of the microphone array preprocesses the picked-up reference audio signal is obtained; the energy value of the audio signal corresponding to each channel is calculated according to the digital audio data; whether the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition is detected; and, if it does not, the gain value of each channel is adjusted so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition. Each microphone channel of the microphone array is thereby calibrated automatically, compensating for the hardware differences between microphone channels so that the sound pickup of each microphone channel meets the consistency requirements.

Further, based on the above first embodiment, a second embodiment of the automatic calibration method for a microphone array of the present application is provided. Referring to FIG. 3, in this embodiment, step S3 includes:

Step S31, calculating an average energy value of the energy values of the audio signals corresponding to the channels;

Step S32, separately calculating, for each channel, the absolute value of the energy difference between the energy value of the audio signal corresponding to that channel and the average energy value;

After calculating the energy value of the audio signal corresponding to each channel, the calibration device calculates the average of these energy values. It then calculates the absolute value of the difference between the energy value of the audio signal corresponding to each channel and the average energy value, that is, the absolute value of the energy difference of each channel.

Step S33: Detect whether the absolute value of the energy difference of each channel is less than a preset difference;

After calculating the absolute value of the energy difference of each channel, the calibration device detects whether the absolute value of the energy difference of each channel is less than the preset difference. The preset difference can be set according to specific needs. When the consistency required of the channels of the microphone array is high, the preset difference can be set smaller, so that the energy values of the audio signals corresponding to the channels end up closer together.

Step S34: If the absolute value of the energy difference of each channel is less than the preset difference, it is determined that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

If the absolute value of the energy difference of every channel is detected to be less than the preset difference, the calibration device determines that the energy values of the audio signals corresponding to the channels satisfy the preset consistency condition. If the absolute value of the energy difference of at least one channel is not less than the preset difference, the energy value of that channel's audio signal differs too much from those of the other channels, and the calibration device determines that the energy values of the audio signals corresponding to the channels do not satisfy the preset consistency condition.
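Steps S31 to S34 amount to a few lines of arithmetic, sketched below. The function names and the example energies are assumptions made for illustration.

```python
from typing import List, Sequence


def energy_deviations(energies: Sequence[float]) -> List[float]:
    """Steps S31 and S32: absolute difference of each channel from the average energy."""
    mean_energy = sum(energies) / len(energies)
    return [abs(e - mean_energy) for e in energies]


def satisfies_consistency(energies: Sequence[float], preset_difference: float) -> bool:
    """Steps S33 and S34: every channel must deviate by less than the preset difference."""
    return all(d < preset_difference for d in energy_deviations(energies))


consistent = satisfies_consistency([1.00, 1.02, 0.98, 1.00], preset_difference=0.05)
inconsistent = satisfies_consistency([1.00, 1.02, 1.40, 0.98], preset_difference=0.05)
```

A single outlying channel is enough to fail the check, as the second call illustrates with a third channel that is far above the average.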

In this embodiment, the average of the audio signal energy values corresponding to the channels is calculated, and it is then detected whether the absolute value of the energy difference between each channel's audio signal energy value and the average energy value is less than the preset difference. If so, the energy values of the audio signals corresponding to the channels are determined to satisfy the preset consistency condition. This makes it possible to determine quickly whether the energy values of the audio signals corresponding to the channels satisfy the preset consistency condition.

Further, based on the above first or second embodiment, a third embodiment of the automatic calibration method for a microphone array of the present application is provided. In this embodiment, the step of adjusting the gain value of each channel includes:

Step S41, determining the channel to be adjusted in each channel;

When the proofreading device detects that the energy value of the audio signal corresponding to each channel does not meet the preset consistency condition, the gain value of each channel needs to be adjusted. At this time, the channel to be adjusted, that is, the channel whose gain value is to be adjusted, needs to be determined. One way is to select the channel with the lowest audio signal energy value as the reference channel and determine the other channels as the channels to be adjusted; another is to select the channel with the highest audio signal energy value as the reference channel and determine the other channels as the channels to be adjusted. There may also be only one channel to be adjusted; for example, the channel with the largest absolute value of the energy difference from the average energy value is determined as the channel to be adjusted.

Step S42: Adjust the gain value of the channel to be adjusted according to a preset adjustment method.

After the channel to be adjusted is determined, its gain value is adjusted according to a preset adjustment mode, where the preset adjustment mode corresponds to the method used to determine the channel to be adjusted. For example, when the channel with the lowest audio signal energy value is used as the reference channel and the other channels are determined as channels to be adjusted, the gain values of the other channels are reduced by a preset value. The preset value can be set according to specific needs. If the preset value is set too large, the audio signal energy value of the channel to be adjusted changes too much between adjustments, so that the audio signal energy values of the channels can hardly ever meet the preset consistency condition. If the preset value is set too small, the number of adjustments and the amount of calculation increase. The preset value should therefore be set appropriately. Further, the preset value can also be set to correspond to the magnitude of the energy difference: the larger the difference between the audio signal energy value of the channel to be adjusted and that of the reference channel, the larger the corresponding preset value; the smaller the difference, the smaller the corresponding preset value.
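
One possible way to make the preset value follow the magnitude of the energy difference is a clamped linear mapping, sketched below. The text only requires that a larger difference yields a larger preset value; the linear form, the function name, and the `scale`, `min_step` and `max_step` parameters are assumptions made for illustration.

```python
def step_for_difference(energy_difference, scale=0.01, min_step=0.5, max_step=3.0):
    """Map an energy difference to a gain step: larger difference, larger step.

    scale, min_step and max_step are hypothetical tuning parameters.
    """
    step = abs(energy_difference) * scale
    # Clamp so the step is neither uselessly small nor large enough to overshoot.
    return max(min_step, min(step, max_step))


# Hypothetical differences between a channel to be adjusted and the reference channel.
for difference in (10.0, 100.0, 1000.0):
    print(difference, step_for_difference(difference))
```

The lower bound limits the number of adjustment rounds, and the upper bound limits how far a single adjustment can move the energy value, which reflects the two failure modes described above.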

Further, step S41 includes:

Step S411: Determine the channel with the largest absolute value of the energy difference among all channels as the channel to be adjusted;

Specifically, the channel to be adjusted may be determined as follows: after the proofreading device calculates the absolute value of the energy difference between each channel's audio signal energy value and the average energy value, it compares the absolute values of the energy differences of the channels and determines the channel with the largest absolute value of the energy difference among all channels as the channel to be adjusted. If several channels share the largest absolute value of the energy difference, either all of them are determined as channels to be adjusted, or one of them is selected as the channel to be adjusted.

Step S42 includes:

Step S421: Determine whether the energy value of the audio signal of the channel to be adjusted is greater than the average energy value;

Step S422, if the energy value of the audio signal of the channel to be adjusted is greater than the average energy value, the gain value of the channel to be adjusted is reduced by a preset value;

Step S423: If the energy value of the audio signal of the channel to be adjusted is less than the average energy value, increase the gain value of the channel to be adjusted by a preset value.

After the proofreading device determines the channel with the largest absolute value of energy difference as the channel to be adjusted, it is determined whether the energy value of the audio signal of the channel to be adjusted is greater than the average energy value. If it is determined that the energy value of the audio signal of the channel to be adjusted is greater than the average energy value, the gain value of the channel to be adjusted is reduced by a preset value. The preset values are the same as above, and can be set according to specific needs. If it is determined that the energy value of the audio signal of the channel to be adjusted is less than the average energy value, the gain value of the channel to be adjusted is increased by a preset value.

In this embodiment, the channel with the largest absolute value of the energy difference among all channels is determined as the channel to be adjusted, and its gain value is then adjusted by the preset value. In this way, the audio signal energy values of the channels can be brought to meet the requirement quickly, which speeds up calibration by the proofreading device.
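
Steps S411 and S421 to S423 can be sketched as a single function. This is an illustrative sketch only: the function name, the list-based representation of gains, and the tie-breaking rule (the first channel with the largest deviation is chosen) are assumptions, and the text equally allows adjusting all tied channels.

```python
from statistics import fmean


def adjust_largest_deviation(energies, gains, preset_value):
    """Nudge the gain of the channel that deviates most from the average energy.

    Returns the index of the adjusted channel and a new list of gains.
    """
    mean_energy = fmean(energies)
    deviations = [abs(e - mean_energy) for e in energies]
    # S411: pick the channel with the largest absolute energy difference.
    # On ties, max() keeps the first such channel (an assumption of this sketch).
    channel = max(range(len(energies)), key=deviations.__getitem__)
    new_gains = list(gains)
    if energies[channel] > mean_energy:
        new_gains[channel] -= preset_value  # S422: energy too high, reduce gain
    elif energies[channel] < mean_energy:
        new_gains[channel] += preset_value  # S423: energy too low, increase gain
    return channel, new_gains


# Hypothetical four-channel example: channel 1 is well above the average.
print(adjust_largest_deviation([100.0, 120.0, 98.0, 100.0], [10.0] * 4, 1.0))
```

Returning a new list rather than modifying `gains` in place keeps the previous gain values available in case an adjustment has to be rolled back.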

Further, after step S42, it also includes:

Step S43: Detect whether distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value;

After the proofreading device adjusts the gain value of the channel to be adjusted according to the preset adjustment mode, the channel to be adjusted performs gain amplification processing and analog-to-digital conversion on the picked-up reference audio signal with the adjusted gain value. At this time, the proofreading device may first detect whether distortion occurs after the channel to be adjusted performs gain amplification processing on the picked-up reference audio signal according to the adjusted gain value. Because the gain value is adjusted according to the preset adjustment mode, the preset value in that mode may have been set too large, making the adjusted gain value too large, which may distort the audio signal after gain amplification. A distorted audio signal cannot serve as the basis for speech recognition by the downstream speech recognition module, so an undistorted audio signal is a necessary condition.

Step S44: If it is detected that distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value, a prompt message to modify the preset adjustment mode is output.

If the proofreading device detects that distortion occurs after the channel to be adjusted performs gain amplification processing on the picked-up reference audio signal according to the adjusted gain value, it may output a prompt message prompting the operation and maintenance personnel to modify the preset adjustment mode, that is, to modify the preset value in the preset adjustment mode so that the adjusted gain value will not be too large and distort the audio signal. If the proofreading device detects that no distortion occurs after the channel to be adjusted performs gain amplification on the picked-up reference audio signal according to the adjusted gain value, it acquires the digital audio data corresponding to each channel again, calculates and analyzes the audio signal energy value corresponding to each channel, and detects whether the audio signal energy values meet the preset consistency condition. If they do not, it adjusts again, repeating this loop until it detects that the energy value of the audio signal corresponding to each channel meets the preset consistency condition.

In this embodiment, by detecting whether the audio signal is distorted after gain amplification by the channel to be adjusted, distortion of the audio signal during the calibration process can be avoided, ensuring the accuracy of the microphone array calibration.
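
The application does not specify how distortion is detected. One common heuristic, assumed here purely for illustration, is to treat clipping as the sign of distortion: if too large a share of the digitized samples sits at full scale, the gain is taken to be too high. The 16-bit full-scale value, the ratio threshold, and the function name are all assumptions.

```python
def looks_clipped(samples, full_scale=32767, max_clipped_ratio=0.001):
    """Heuristic distortion check: flag the signal when too many samples clip.

    Assumes signed 16-bit PCM samples; both thresholds are hypothetical.
    """
    if not samples:
        return False
    clipped = sum(1 for s in samples if abs(s) >= full_scale)
    return clipped / len(samples) > max_clipped_ratio


# Hypothetical frames: one with headroom, one driven into full scale.
clean_frame = [0, 1200, -3400, 2500, -800]
hot_frame = [32767, 32767, -32768, 12000, 32767]
print(looks_clipped(clean_frame), looks_clipped(hot_frame))
```

In a calibration loop, a positive result would correspond to step S44: stop adjusting and prompt the operation and maintenance personnel to choose a smaller preset value.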

Further, based on the foregoing first, second, or third embodiment, the fourth embodiment of the automatic microphone array calibration method of the present application provides an automatic microphone array calibration method. In this embodiment, before the step of adjusting the gain value of each channel, the method further includes:

Step S51, detecting whether the current adjustment times for adjusting the gain values of the channels are greater than the preset times;

If the proofreading device detects that the audio signal energy values of the channels do not meet the preset consistency condition, it needs to adjust the gain value of each channel, check again after the adjustment whether the preset consistency condition is met, and repeat this loop until it detects that the preset consistency condition is met. However, the channel gain values may be adjusted many times during the proofreading process while the audio signal energy values of the channels still do not meet the preset consistency condition. In this case, there may be a problem in the hardware of the microphone array device. Therefore, before adjusting the gain value of each channel, the proofreading device can first detect whether the current number of adjustments to the gain values is greater than the preset number of times. The proofreading device can set a counter to record the number of gain adjustments, incrementing the counter by one each time the gain value is adjusted. The preset number of times can be set according to specific needs.

Step S52, if the current adjustment times for adjusting the gain value of each channel are greater than the preset number of times, the step of adjusting the gain value of each channel includes: performing a preset error reporting operation.

If the proofreading device detects that the current number of adjustments to the gain value of each channel is greater than the preset number of times, it no longer adjusts the channel gain values, suspends the proofreading, and performs the preset error reporting operation. The preset error reporting operation may be outputting prompt information that informs the operation and maintenance personnel that the number of calibration adjustments is too high, or issuing an error tone. Through the error reporting operation performed by the proofreading device, the operation and maintenance personnel can be prompted to inspect or repair the fault in the microphone array device.

In this embodiment, when it is detected that the number of adjustments to the gain value of each channel is greater than the preset number of times, a preset error reporting operation is performed, so that a hardware problem in the microphone array device can be detected and the operation and maintenance personnel can carry out inspection or repair.
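
The overall loop, including the adjustment counter of steps S51 and S52, might look like the following sketch. It is not the applicant's implementation: `measure_energies` is a hypothetical callback standing in for re-acquiring digital audio data and computing per-channel energy, `toy_measure` is a made-up model in which energy is simply sensitivity times gain, and the error is reported by raising an exception.

```python
from statistics import fmean


class CalibrationError(RuntimeError):
    """Raised when the adjustment count exceeds the preset number of times."""


def calibrate(measure_energies, gains, preset_difference, preset_value, preset_times):
    """Adjust gains until all channel energies are within preset_difference of the mean."""
    gains = list(gains)
    adjustments = 0  # counter, incremented by one on every gain adjustment
    while True:
        energies = measure_energies(gains)
        mean_energy = fmean(energies)
        deviations = [abs(e - mean_energy) for e in energies]
        if all(d < preset_difference for d in deviations):
            return gains, adjustments
        # S51/S52: before adjusting, check the counter against the preset number.
        if adjustments > preset_times:
            raise CalibrationError("too many gain adjustments; check the array hardware")
        channel = max(range(len(energies)), key=deviations.__getitem__)
        if energies[channel] > mean_energy:
            gains[channel] -= preset_value
        else:
            gains[channel] += preset_value
        adjustments += 1


def toy_measure(gains, sensitivities=(1.0, 1.2, 0.9)):
    """Hypothetical stand-in for the hardware: energy = sensitivity * gain."""
    return [s * g for s, g in zip(sensitivities, gains)]


final_gains, count = calibrate(toy_measure, [10.0, 10.0, 10.0], 0.5, 0.25, 50)
print(final_gains, count)
```

A channel whose energy does not respond to gain changes, for example because of a hardware fault, would never satisfy the condition; the counter ensures the loop ends with an error report instead of running indefinitely.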

In addition, an embodiment of the present application also provides an automatic calibration device for a microphone array. The automatic calibration device for a microphone array includes:

An acquisition module configured to acquire digital audio data after each channel of the microphone array pre-processes the picked-up reference audio signal;

A calculation module configured to calculate the energy value of the audio signal corresponding to each channel according to the digital audio data;

A detection module configured to detect whether the energy value of the audio signal corresponding to each channel meets a preset consistency condition;

An adjustment module configured to adjust the gain value of each channel, if the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.

Optionally, the acquisition module is further configured to acquire digital audio data obtained by performing gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal by each channel of the microphone.

Optionally, the calculation module includes:

The frame processing unit is configured to perform frame processing on the digital audio data corresponding to the respective channels;

An obtaining unit configured to obtain audio data of a preset frame in the digital audio data corresponding to each channel;

The first calculation unit is configured to calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.

Optionally, the detection module includes:

The second calculation unit is configured to calculate the average energy value of the audio signal energy values corresponding to the channels, and is further configured to separately calculate, for each channel, the absolute value of the energy difference between its audio signal energy value and the average energy value;

A detection unit configured to detect whether the absolute value of the energy difference of each channel is less than a preset difference;

The first determining unit is configured to determine that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition if the absolute value of the energy difference of each channel is less than the preset difference.

Optionally, the adjustment module includes:

A second determination unit configured to determine the channel to be adjusted in each channel;

The adjustment unit is configured to adjust the gain value of the channel to be adjusted according to a preset adjustment mode.

Optionally, the second determining unit is further configured to determine the channel with the largest absolute value of the energy difference among all channels as the channel to be adjusted;

The adjusting unit is further configured to determine whether the energy value of the audio signal of the channel to be adjusted is greater than the average energy value; if the energy value of the audio signal of the channel to be adjusted is greater than the average energy value, the gain value of the channel to be adjusted is reduced by a preset value; if the energy value of the audio signal of the channel to be adjusted is less than the average energy value, the gain value of the channel to be adjusted is increased by a preset value.

Optionally, the detection module is further configured to detect whether distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value;

The automatic calibration device of the microphone array further includes:

The output module is configured to output a prompt message to modify the preset adjustment mode if distortion occurs after the channel to be adjusted performs gain amplification processing on the reference audio signal according to the adjusted gain value.

Optionally, the detection module is further configured to detect whether the current number of adjustments to adjust the gain value of each channel is greater than a preset number of times;

The automatic calibration device of the microphone array further includes:

The error reporting module is configured to perform a preset error reporting operation if the current number of adjustments to adjust the gain value of each channel is greater than the preset number of times.

It should be noted that the embodiments of the microphone array automatic proofreading device are basically the same as the embodiments of the microphone array automatic proofreading method described above, and details are not repeated here.

In addition, an embodiment of the present application also provides a computer-readable storage medium. A microphone array automatic calibration program is stored on the computer-readable storage medium, and when the microphone array automatic calibration program is executed by a processor, the steps of the microphone array automatic proofreading method described above are implemented. The specific implementations of the microphone array automatic proofreading device and of the storage medium (i.e., the computer-readable storage medium) of the present application are basically the same as the above embodiments of the microphone array automatic proofreading method, and details are not described herein.

It should be noted that in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to this process, method, article, or system. Without further restrictions, an element defined by the phrase "includes a..." does not exclude the existence of other identical elements in the process, method, article or system that includes the element.

The sequence numbers of the above embodiments of the present application are for description only, and do not represent the advantages and disadvantages of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods in the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, magnetic disk, or optical disc) and includes several instructions to make a terminal device (which can be a mobile phone, computer, server, air conditioner, or network equipment, etc.) perform the method described in each embodiment of the present application.

The above are only preferred embodiments of the present application and do not limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the patent protection scope of this application.

Claims

An automatic calibration method for a microphone array, wherein the automatic calibration method for a microphone array includes the following steps:

Obtain digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array;

Calculating the energy value of the audio signal corresponding to each channel according to the digital audio data;

Detecting whether the energy value of the audio signal corresponding to each channel meets the preset consistency condition; and,

If the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, the gain value of each channel is adjusted so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.
The automatic calibration method for a microphone array according to claim 1, wherein the step of acquiring digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array includes:

Acquire digital audio data after each channel of the microphone performs gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal.
The method for automatically calibrating a microphone array according to claim 1, wherein the step of calculating the energy value of the audio signal corresponding to each channel according to the digital audio data includes:

Separately framing the digital audio data corresponding to each channel;

Acquiring audio data of a preset frame in the digital audio data corresponding to each channel; and,

Calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.
The automatic calibration method for a microphone array according to claim 1, wherein the step of detecting whether the energy value of the audio signal corresponding to each channel satisfies a preset consistency condition includes:

Calculating the average energy value of the energy values of the audio signals corresponding to the respective channels;

Separately calculating the absolute value of the energy difference between the energy value of the audio signal and the average energy value corresponding to each channel;

Detecting whether the absolute value of the energy difference of each channel is less than a preset difference; and,

If the absolute value of the energy difference of each channel is less than the preset difference, it is determined that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.
The method for automatically calibrating a microphone array according to claim 4, wherein the step of adjusting the gain value of each channel includes:

Determine the channel to be adjusted in each channel;

The gain value of the channel to be adjusted is adjusted according to a preset adjustment mode.
The method for automatically calibrating a microphone array according to claim 5, wherein the step of determining the channel to be adjusted in each channel includes:

Determine the channel with the largest absolute value of the energy difference among all channels as the channel to be adjusted;

The step of adjusting the gain value of the channel to be adjusted according to a preset adjustment method includes:

Determining that the energy value of the audio signal of the channel to be adjusted is greater than the average energy value, and reducing the gain value of the channel to be adjusted by a preset value;

It is determined that the energy value of the audio signal of the channel to be adjusted is less than the average energy value, and the gain value of the channel to be adjusted is increased by a preset value.
The method for automatically calibrating a microphone array according to claim 1, wherein before the step of adjusting the gain value of each channel, further comprising:

Detecting whether the current adjustment times for adjusting the gain values of the channels are greater than the preset times;

If the current number of adjustments to adjust the gain value of each channel is greater than the preset number of times, the step of adjusting the gain value of each channel includes:

Perform preset error reporting.
A microphone array automatic proofreading device, wherein the microphone array automatic proofreading device includes:

The acquisition module is configured to acquire digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array;

A calculation module configured to calculate the energy value of the audio signal corresponding to each channel according to the digital audio data;

A detection module configured to detect whether the energy value of the audio signal corresponding to each channel meets a preset consistency condition; and,

An adjustment module configured to adjust the gain value of each channel, if the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, so that the energy value of the audio signal corresponding to each channel meets the preset consistency condition.
The automatic microphone array calibration device according to claim 8, wherein the acquisition module is further configured to acquire digital audio data after gain amplification processing and analog-to-digital conversion processing are performed on the picked-up reference audio signal by each channel of the microphone.
The microphone array automatic proofreading device according to claim 8, wherein the calculation module comprises:

The frame processing unit is configured to perform frame processing on the digital audio data corresponding to the respective channels;

An obtaining unit configured to obtain audio data of a preset frame in the digital audio data corresponding to each channel;

The first calculation unit is configured to calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.
The microphone array automatic proofreading device according to claim 8, wherein the detection module comprises:

The second calculation unit is configured to calculate the average energy value of the audio signal energy values corresponding to the channels, and is further configured to separately calculate, for each channel, the absolute value of the energy difference between its audio signal energy value and the average energy value;

A detection unit configured to detect whether the absolute value of the energy difference of each channel is less than a preset difference;

The first determining unit is configured to determine that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition if the absolute value of the energy difference of each channel is less than the preset difference.
The microphone array automatic proofreading device according to claim 11, wherein the adjustment module comprises:

A second determination unit configured to determine the channel to be adjusted in each channel;

The adjustment unit is configured to adjust the gain value of the channel to be adjusted according to a preset adjustment mode.
A microphone array automatic proofreading device, wherein the microphone array automatic proofreading device includes a memory, a processor, and a microphone array automatic proofreading program stored on the memory and operable on the processor, and when the microphone array automatic proofreading program is executed by the processor, the following steps are implemented:

Obtain digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array;

Calculating the energy value of the audio signal corresponding to each channel according to the digital audio data;

Detecting whether the energy value of the audio signal corresponding to each channel meets the preset consistency condition; and,

If the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, the gain value of each channel is adjusted so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.
The microphone array automatic proofreading device according to claim 13, wherein the step of acquiring digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array includes:

Acquire digital audio data after each channel of the microphone performs gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal.
The microphone array automatic proofreading device according to claim 13, wherein the step of calculating the energy value of the audio signal corresponding to each channel according to the digital audio data includes:

Separately framing the digital audio data corresponding to each channel;

Acquiring audio data of a preset frame in the digital audio data corresponding to each channel; and,

Calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.
The microphone array automatic proofreading device according to claim 13, wherein the step of detecting whether the energy value of the audio signal corresponding to each channel meets a preset consistency condition includes:

Calculating the average energy value of the energy values of the audio signals corresponding to the respective channels;

Separately calculating the absolute value of the energy difference between the energy value of the audio signal and the average energy value corresponding to each channel;

Detecting whether the absolute value of the energy difference of each channel is less than a preset difference; and,

If the absolute value of the energy difference of each channel is less than the preset difference, it is determined that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.
A computer-readable storage medium, wherein the computer-readable storage medium stores a microphone array automatic calibration program, and when the microphone array automatic calibration program is executed by a processor, the following steps are implemented:

Obtain digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array;

Calculating the energy value of the audio signal corresponding to each channel according to the digital audio data;

Detecting whether the energy value of the audio signal corresponding to each channel meets the preset consistency condition; and,

If the energy value of the audio signal corresponding to each channel does not satisfy the preset consistency condition, the gain value of each channel is adjusted so that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.
The computer-readable storage medium of claim 17, wherein the step of acquiring digital audio data after pre-processing the picked up reference audio signal by each channel of the microphone array includes:

Acquire digital audio data after each channel of the microphone performs gain amplification processing and analog-to-digital conversion processing on the picked-up reference audio signal.
The computer-readable storage medium of claim 17, wherein the step of calculating the energy value of the audio signal corresponding to each channel according to the digital audio data includes:

Separately framing the digital audio data corresponding to each channel;

Acquiring audio data of a preset frame in the digital audio data corresponding to each channel; and,

Calculate the energy value of the audio signal corresponding to each channel according to the audio data of the preset frame.
The computer-readable storage medium of claim 17, wherein the step of detecting whether the energy value of the audio signal corresponding to each channel meets a preset consistency condition includes:

Calculating the average energy value of the energy values of the audio signals corresponding to the respective channels;

Separately calculating the absolute value of the energy difference between the energy value of the audio signal and the average energy value corresponding to each channel;

Detecting whether the absolute value of the energy difference of each channel is less than a preset difference; and,

If the absolute value of the energy difference of each channel is less than the preset difference, it is determined that the energy value of the audio signal corresponding to each channel satisfies the preset consistency condition.