
CN111402913B - Noise reduction method, device, equipment and storage medium - Google Patents

Publication number
CN111402913B
Authority
CN
China
Prior art keywords
sound
target
audio signals
noise
audio signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010111706.0A
Other languages
Chinese (zh)
Other versions
CN111402913A (en
Inventor
冯大航
陈孝良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SoundAI Technology Co Ltd
Priority to CN202010111706.0A
Publication of CN111402913A
Application granted
Publication of CN111402913B
Legal status: Active

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The application provides a noise reduction method, apparatus, device, and storage medium, and belongs to the field of speech technology. The noise reduction method comprises: acquiring the audio signals obtained when two sound collection devices collect the same sound emitted by a target sound source; filtering the audio signals collected by the two sound collection devices separately to obtain an audio signal corresponding to each device; and then filtering the audio signals corresponding to the two sound collection devices together. The audio signals are thus noise-reduced a second time, so that noise makes up a smaller proportion of the resulting target audio signal, its signal-to-noise ratio is higher, and the noise reduction effect is better.

Description

Noise reduction method, device, equipment and storage medium
Technical Field
The present application relates to the field of speech technologies, and in particular, to a noise reduction method, apparatus, device, and storage medium.
Background
Wireless earphone products are popular with users because of their convenience. A wireless earphone integrates a microphone with the earpiece: to make a call, the user wears the earphone, which is connected to a mobile phone, and talks through it instead of holding the phone, which is more convenient. However, noise is inevitably present in the environment during a call and degrades call quality.
In the related art, a noise reduction function is generally provided for a wireless earphone, so that the wireless earphone can process collected voice signals, strengthen human voice, weaken noise and achieve the effect of noise reduction.
Before TWS (True Wireless Stereo) earphones appeared, there was usually only one wireless earphone, which could collect only one set of voice input signals, so the noise reduction achieved by processing those signals was not good enough. Current TWS earphones generally consist of two wireless earphones that can each collect a set of voice input signals, but the related art offers no noise reduction method that makes use of both sets. A noise reduction method applicable to TWS earphones is therefore needed.
Disclosure of Invention
In view of this, embodiments of the present application provide a noise reduction method, apparatus, device, and storage medium, which can reduce noise of audio signals of two sound collection devices, and can improve noise reduction effect.
In one aspect, a noise reduction method is provided, the method comprising:
acquiring audio signals obtained by acquiring the same sound emitted by a target sound source by two sound acquisition devices;
filtering the audio signals acquired by the two sound acquisition devices respectively to obtain audio signals corresponding to each sound acquisition device;
taking the two sound collection devices as two microphones in the same microphone array, and filtering the audio signals corresponding to the two sound collection devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, filtering the audio signals corresponding to the two sound collecting devices to obtain the target audio signal after noise reduction of the same sound sent by the target sound source includes:
performing weighted summation on the audio signals corresponding to the two sound collection devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, the filtering the audio signals corresponding to the two sound collecting devices by using the two sound collecting devices as two microphones in the same microphone array to obtain the target audio signal after noise reduction of the same sound sent by the target sound source includes:
weighting and summing the audio signals corresponding to the two sound collecting devices to obtain candidate target audio signals after noise reduction of the same sound emitted by the target sound source;
and filtering out the audio signals except the target direction in the candidate target audio signals to obtain the target audio signals after noise reduction of the same sound emitted by the target sound source.
Optionally, the filtering the audio signals except the target direction in the candidate target audio signals to obtain the target audio signal after noise reduction of the same sound sent by the target sound source includes:
respectively filtering out an audio signal in a target direction in the audio signal corresponding to each sound collecting device to obtain a noise signal corresponding to each sound collecting device, wherein the target direction is the direction in which the target sound source points to the middle position of the two sound collecting devices;
carrying out weighted summation on noise signals corresponding to the two sound collection devices to obtain noise signals;
and removing the noise signals in the candidate target audio signals to obtain the target audio signals after noise reduction of the same sound emitted by the target sound source.
Optionally, performing weighted summation on the noise signals corresponding to the two sound collection devices to obtain a noise signal, and removing the noise signal from the candidate target audio signal to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source, includes:
performing weighted summation on the noise signals corresponding to the two sound collection devices according to the weight corresponding to each sound collection device to obtain a noise signal;
removing the noise signal from the candidate target audio signal to obtain a target audio signal after noise reduction of the same sound emitted by the target sound source;
adjusting the weights according to the correlation between the target audio signal and a desired audio signal obtained from the candidate target audio signal, and repeating the steps of obtaining and removing the noise signal with the adjusted weights until the target audio signal meets a target condition, thereby obtaining the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, the filtering is performed on the audio signals collected by the two sound collecting devices to obtain audio signals corresponding to each sound collecting device, including:
according to the time delay of the sound collected by different collection units in each sound collection device, performing time delay compensation on the audio signals collected by each collection unit in each sound collection device to obtain candidate audio signals of each collection unit;
and filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signals corresponding to each sound acquisition device.
Optionally, the method further comprises:
detecting the states of the two sound collection devices;
when both sound collection devices are in a working state, performing the steps of acquiring the audio signals of the two sound collection devices, filtering them separately, and filtering them as one microphone array;
when either sound collection device is not in a working state, acquiring the audio signal that the sound collection device in the working state collects from the sound emitted by the target sound source, and filtering that audio signal to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
In one aspect, there is provided a noise reduction device, the device comprising:
the acquisition module is used for acquiring audio signals acquired by acquiring the same sound emitted by the target sound source by the two sound acquisition devices;
the first filtering module is used for respectively filtering the audio signals acquired by the two sound acquisition devices to obtain audio signals corresponding to each sound acquisition device;
and the second filtering module is used for filtering the audio signals corresponding to the two sound collection devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, the second filtering module includes:
the second weighting module is used for performing weighted summation on the audio signals corresponding to the two sound collection devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, the second filtering module includes:
the second weighting module is used for carrying out weighted summation on the audio signals corresponding to the two sound collecting devices to obtain candidate target audio signals after noise reduction of the same sound sent by the target sound source;
and the second filtering module is used for filtering out the audio signals other than those from the target direction in the candidate target audio signal to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, the second filtering module includes:
the second filtering sub-module is used for filtering the audio signals in the target direction in the audio signals corresponding to each sound collecting device respectively to obtain the noise signals corresponding to each sound collecting device, wherein the target direction is the direction in which the target sound source points to the middle position of the two sound collecting devices;
the second weighting submodule is used for carrying out weighted summation on the noise signals corresponding to the two sound acquisition devices to obtain noise signals;
and the second removing module is used for removing the noise signal from the candidate target audio signal to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
Optionally, the second weighting submodule is specifically configured to perform weighted summation on noise signals corresponding to the two sound collection devices according to the weight corresponding to each sound collection device to obtain noise signals;
the second filtering module further comprises a second adjusting module, configured to adjust the weight according to a correlation between the target audio signal and a desired audio signal obtained based on the candidate target audio signal;
and the second filtering module is further used for continuously executing the steps of obtaining and removing the noise signal based on the adjusted weight until the target condition is met, and stopping to obtain the target audio signal after noise reduction of the same sound sent by the target sound source.
Optionally, the first filtering module includes:
the time delay module is used for carrying out time delay compensation on the audio signals acquired by each acquisition unit in each sound acquisition device according to the time delay of the sound acquired by different acquisition units in each sound acquisition device to obtain candidate audio signals of each acquisition unit;
and the first filtering sub-module is used for filtering the candidate audio signals of the collection units in each sound collection device to obtain the audio signal corresponding to each sound collection device.
Optionally, the apparatus further includes:
the detection module is used for detecting the states of the two sound collection devices;
the device is also for:
when the two sound collection devices are in a working state, executing the steps of acquiring the audio signals corresponding to the two sound collection devices, respectively filtering and filtering as the same microphone array;
when any sound collecting equipment is not in a working state, acquiring an audio signal collected by the sound collecting equipment in the working state on the sound sent by the target sound source, filtering the audio signal, and obtaining the target audio signal after noise reduction of the same sound sent by the target sound source.
In one aspect, a computer device is provided that includes one or more processors and one or more memories having at least one instruction stored therein that is loaded and executed by the one or more processors to implement operations performed by the noise reduction method.
In one aspect, a computer-readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement operations performed by the noise reduction method is provided.
The technical solutions provided by the embodiments of the present application have at least the following beneficial effects:
in the embodiments of the present application, the audio signals obtained when two sound collection devices collect the same sound emitted by a target sound source are first filtered separately, and the audio signals corresponding to the two devices are then filtered again together. The audio signals are thus noise-reduced a second time, so that noise makes up a smaller proportion of the resulting target audio signal, its signal-to-noise ratio is higher, and the noise reduction effect is better.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an implementation environment of a noise reduction method according to an embodiment of the present application;
FIG. 2 is a flow chart of a noise reduction method according to an embodiment of the present application;
FIG. 3 is a flow chart of another noise reduction method provided by an embodiment of the present application;
FIG. 4 is a block diagram of a noise reduction method according to an embodiment of the present application;
FIG. 5 is a block diagram of another noise reduction method according to an embodiment of the present application;
FIG. 6 is a block diagram of another noise reduction method according to an embodiment of the present application;
FIG. 7 is a block diagram of another noise reduction method according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a noise reduction device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
It should be noted that the embodiments described below are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the present application, the terms "first," "second," and the like are used to distinguish identical or similar items that have substantially the same role and function. It should be understood that "first," "second," and "nth" imply no logical or chronological dependency and place no limit on the number or order of execution.
Fig. 1 is a schematic view of an implementation environment of a noise reduction method according to an embodiment of the present application, and referring to fig. 1, the implementation environment may include two sound collection devices 101 and a target sound source 102.
The target sound source 102 may be a human mouth, or may be a sound-producing element such as a speaker, and the sound collecting device 101 may be a wireless earphone, or may be a terminal device capable of collecting sound. The target sound source 102 emits sound, and the two sound collection devices 101 collect the same sound emitted by the target sound source 102 to obtain an audio signal emitted by the target sound source 102.
In the embodiment of the present application, two sound collection devices 101 may collect sound emitted by a target sound source, and perform noise reduction processing on the collected audio signal to obtain a target audio signal.
Fig. 2 is a flowchart of a noise reduction method according to an embodiment of the present application, referring to fig. 2, the method includes:
201. Acquire the audio signals obtained when two sound collection devices collect the same sound emitted by a target sound source.
202. Filter the audio signals collected by the two sound collection devices separately to obtain the audio signal corresponding to each sound collection device.
203. Taking the two sound collection devices as two microphones in the same microphone array, filter the audio signals corresponding to the two sound collection devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
In a possible implementation manner, the filtering the audio signals corresponding to the two sound collecting devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source includes:
performing weighted summation on the audio signals corresponding to the two sound collection devices to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
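The weighted summation of the two device signals can be illustrated with a minimal sketch. It assumes the two signals are already time-aligned and of equal length, and it uses equal weights of 0.5; the function name `weighted_sum` and the weight values are hypothetical, since the application does not fix them.

```python
def weighted_sum(first, second, w_first=0.5, w_second=0.5):
    """Combine two equal-length, time-aligned signals sample by sample."""
    if len(first) != len(second):
        raise ValueError("signals must have the same length")
    return [w_first * a + w_second * b for a, b in zip(first, second)]


# Toy data (assumed): the target sound is identical in both channels,
# while the noise has opposite sign in the two channels.
speech = [1.0, -1.0, 1.0, -1.0]
first = [s + n for s, n in zip(speech, [0.2, 0.2, -0.2, -0.2])]
second = [s + n for s, n in zip(speech, [-0.2, -0.2, 0.2, 0.2])]

# The common target component is kept and the opposing noise cancels.
target = weighted_sum(first, second)
```

In this toy case the noise cancels completely only because it was constructed with opposite sign in the two channels; with real noise the summation attenuates it rather than removing it.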
In one possible implementation manner, the filtering the audio signals corresponding to the two sound collecting devices with the two sound collecting devices as two microphones in the same microphone array to obtain the target audio signal after the same sound emitted by the target sound source is noise reduced includes:
carrying out weighted summation on the audio signals corresponding to the two sound collecting devices to obtain candidate target audio signals after noise reduction of the same sound sent by the target sound source;
and filtering out the audio signals except the target direction in the candidate target audio signals to obtain the target audio signals after noise reduction of the same sound sent by the target sound source.
In a possible implementation manner, the filtering the audio signal other than the target direction in the candidate target audio signal to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source includes:
respectively filtering out an audio signal in a target direction in the audio signal corresponding to each sound collecting device to obtain a noise signal corresponding to each sound collecting device, wherein the target direction is the direction in which the target sound source points to the middle position of the two sound collecting devices;
carrying out weighted summation on noise signals corresponding to the two sound collection devices to obtain noise signals;
and removing the noise signal in the candidate target audio signals to obtain the target audio signals after noise reduction of the same sound emitted by the target sound source.
In one possible implementation manner, the noise signals corresponding to the two sound collection devices are weighted and summed to obtain a noise signal; removing the noise signal in the candidate target audio signal to obtain a target audio signal after noise reduction of the same sound emitted by the target sound source, including:
according to the weight corresponding to each sound collection device, weighting and summing the noise signals corresponding to the two sound collection devices to obtain noise signals;
removing the noise signal in the candidate target audio signal to obtain a target audio signal after noise reduction of the same sound emitted by the target sound source;
and adjusting the weight according to the correlation between the target audio signal and the expected audio signal obtained based on the candidate target audio signal, and continuously executing the steps of obtaining and removing the noise signal based on the adjusted weight until the target audio signal meets the target condition, and obtaining the target audio signal after noise reduction of the same sound sent by the target sound source.
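One way to picture the steps above (candidate signal, per-device noise signals, weighted noise sum, removal, and weight adjustment) is the sketch below. It rests on several assumptions that are not in the application: the target component arrives identically in both channels, so subtracting the two-channel average from each channel blocks it; the weights are adapted sample by sample instead of in an explicit loop with a stopping condition; and a plain LMS step stands in for the correlation-based adjustment, which the application does not specify in detail. The names `adaptive_noise_cancel`, `mu`, and `residual_energy` are hypothetical.

```python
import math
import random


def adaptive_noise_cancel(x1, x2, mu=0.05):
    """Two-channel sketch: fixed sum, blocked noise references, LMS weights."""
    w1 = w2 = 0.0  # weights of the two per-device noise signals
    out = []
    for a, b in zip(x1, x2):
        candidate = 0.5 * (a + b)   # weighted sum of the two device signals
        n1 = a - candidate          # target component removed (assumed blocking)
        n2 = b - candidate
        noise = w1 * n1 + w2 * n2   # weighted sum of the noise signals
        y = candidate - noise       # remove the noise estimate
        w1 += mu * y * n1           # LMS step: reduce the output power
        w2 += mu * y * n2
        out.append(y)
    return out


# Toy data (assumed): a sine as the target sound, uniform noise on device 1 only.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * k / 50) for k in range(4000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
x1 = [s + n for s, n in zip(speech, noise)]
x2 = list(speech)

y = adaptive_noise_cancel(x1, x2)
candidate = [0.5 * (a + b) for a, b in zip(x1, x2)]


def residual_energy(signal):
    """Energy of the deviation from the clean target over the second half."""
    return sum((v - s) ** 2 for v, s in zip(signal[2000:], speech[2000:]))
```

Comparing `residual_energy(y)` with `residual_energy(candidate)` shows how much noise the adaptive stage removes beyond the fixed summation once the weights have settled.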
In one possible implementation manner, the filtering the audio signals collected by the two sound collecting devices respectively to obtain the audio signal corresponding to each sound collecting device includes:
according to the time delay of the sound collected by different collection units in each sound collection device, performing time delay compensation on the audio signals collected by each collection unit in each sound collection device to obtain candidate audio signals of each collection unit;
and filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signals corresponding to each sound acquisition device.
In one possible embodiment, the method further comprises:
detecting states of the two sound collection devices;
when both sound collection devices are in a working state, performing the steps of acquiring the audio signals of the two sound collection devices, filtering them separately, and filtering them as one microphone array;
when any sound collecting equipment is not in a working state, acquiring an audio signal collected by the sound collecting equipment in the working state for the sound emitted by the target sound source, and filtering the audio signal to obtain a target audio signal after noise reduction of the same sound emitted by the target sound source.
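The state-dependent choice between the two processing paths can be sketched as a small dispatch function. The function name `choose_mode` and the returned labels are hypothetical, and the case in which neither device works is an added assumption, because the application does not discuss it.

```python
def choose_mode(first_working, second_working):
    """Pick the processing path from the detected states of the two devices."""
    if first_working and second_working:
        # Both devices work: filter each device, then filter jointly as one array.
        return "dual"
    if first_working or second_working:
        # Only one device works: filter the signal of that device alone.
        return "single"
    # Neither device works; not covered by the application (assumption).
    return "idle"
```

For example, `choose_mode(True, False)` selects the single-device path, which corresponds to a user wearing only one earphone.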
Fig. 3 is a flowchart of a noise reduction method according to an embodiment of the present application. Referring to fig. 3, the method includes:
301. The computer device acquires the audio signals obtained when two sound collection devices collect the same sound emitted by a target sound source.
The embodiment of the present application provides a method for collecting sound with two sound collection devices and reducing its noise. Each sound collection device may comprise at least two collection units for collecting the audio signal of the target sound source. For example, the sound collection device may be an earphone and the collection unit a microphone. In a specific scenario, two microphones may be mounted on each earphone, so that sound is collected by the four microphones of the two earphones.
In each sound collection device, the collection units may be arranged linearly or circularly, which is not limited in the present application; a linear arrangement is used as the example below. For example, as shown in fig. 1, the two sound collection devices may be two earphones with two microphones each: one earphone includes mic (microphone) 1 and mic2, and the other includes mic3 and mic4. The target sound source may be a human mouth: when the user makes a call with the earphones, one earphone is worn on each ear, and the four microphones on the two earphones collect the sound emitted by the mouth.
The audio signal collected by each sound collection device may be an audio signal set, where the audio signal set may include an audio signal collected by each collection unit in the sound collection device. Therefore, more audio signals can be obtained by utilizing the two sound collecting devices to collect the sound emitted by the target sound source, and enough data is provided for the subsequent noise reduction of the audio signals.
In one possible implementation, the computer device may be a device to which the two sound collection devices are connected: the two sound collection devices collect sound, obtain audio signals, and send them to the connected computer device, which processes them. For example, the computer device may be the mobile phone the user is calling on: after the earphones collect the audio signals, they send them to the mobile phone, which performs noise reduction and then transmits the resulting audio signal.
In another possible implementation manner, the computer device may also be the two sound collecting devices, and after the two sound collecting devices collect the audio signals, the audio signals may be sent or played after noise reduction.
302. The computer equipment filters the audio signals acquired by the two sound acquisition equipment respectively to obtain the audio signals corresponding to each sound acquisition equipment.
The audio signal collected by each sound collection device contains both the sound signal of the target sound source and noise signals from the environment. Filtering the audio signals collected by the two sound collection devices means filtering out the noise signals in them, so the audio signal obtained for each sound collection device can be understood as a noise-reduced audio signal that consists mainly of the sound signal, with the noise reduced. For example, with earphones as the sound collection devices and microphones as the collection units, the earphone containing mic1 and mic2 may be taken as the first earphone and the earphone containing mic3 and mic4 as the second earphone. The audio signals collected by mic1 and mic2 are filtered to remove their noise signals, giving the noise-reduced audio signal corresponding to the first earphone. Likewise, the audio signals collected by mic3 and mic4 are filtered to remove their noise signals, giving the noise-reduced audio signal corresponding to the second earphone.
In one possible embodiment, the filtering may remove the noise signal by exploiting the time delays with which the collection units on a sound collection device pick up the sound. Specifically, the computer device may perform delay compensation on the audio signal collected by each collection unit in each sound collection device, according to the delay with which that unit collects the sound, to obtain a candidate audio signal for each collection unit, and then filter the candidate audio signals of the collection units in each sound collection device to obtain the audio signal corresponding to that device.
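Delay compensation followed by filtering can be sketched for a single sound collection device as follows. This is a minimal illustration, assuming integer sample delays that are already known and assuming the filtering of the candidate signals is a plain average of the aligned channels (a delay-and-sum beamformer); the application does not specify the filter, and the function names `delay_compensate` and `delay_and_sum` are hypothetical.

```python
def delay_compensate(signal, delay):
    """Advance `signal` by `delay` samples (delay >= 0), zero-padding the tail."""
    return list(signal[delay:]) + [0.0] * delay


def delay_and_sum(channels, delays):
    """Align every channel to the reference and average the aligned samples."""
    aligned = [delay_compensate(ch, d) for ch, d in zip(channels, delays)]
    count = len(aligned)
    return [sum(samples) / count for samples in zip(*aligned)]


# Toy data (assumed): mic2 is the reference; mic1 hears the pulse 2 samples later.
mic2 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# After compensation both pulses line up at index 1 and add coherently.
device_signal = delay_and_sum([mic1, mic2], delays=[2, 0])
```

Sound from the target direction adds coherently after alignment, while sound from other directions stays misaligned and is attenuated by the averaging.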
For each acquisition unit, the time delay may be a time delay of an audio signal acquired by the acquisition unit relative to an audio signal acquired by a reference element. Specifically, the computer device may analyze the audio signal collected by each sound collecting device, and obtain a time delay of the audio signal of the target sound source collected by each collecting unit in each sound collecting device relative to the reference array element.
In each sound collection device, the distance between each collection unit and the target sound source may be different, so that the time when each collection unit collects the audio signal of the target sound source may be different, and the phase of the collected audio signal may be different. The acquisition unit closest to the target sound source acquires the audio signal of the target sound source first, and the acquisition unit farthest from the target sound source acquires the audio signal of the target sound source last. For example, as shown in fig. 1, when a user uses two headphones to make a call, the mic2 and the mic4 collect sounds emitted by the mouth first, and the mic1 and the mic3 collect sounds emitted by the mouth after a certain time delay. For the first earphone, the mic2 closer to the human mouth may be used as a reference element, and for the second earphone, the mic4 closer to the human mouth may be used as a reference element. And analyzing the audio signals acquired by the mic1 and the mic2, so that the time delay of the mic1 relative to the time delay of the sound generated by the human mouth acquired by the mic2 can be obtained. And analyzing the audio signals acquired by the mic3 and the mic4, so that the time delay of the mic3 relative to the time delay of the mic4 acquired by the mic4 to the sound emitted by the human mouth can be obtained.
Specifically, the computer device may obtain a cross-correlation coefficient between the audio signals collected by each collecting unit, and according to the cross-correlation coefficient, obtain the relative time of each collecting unit, and determine the time delay of the sound collected by each collecting unit relative to the reference array element. Of course, other methods may be adopted in the delay acquiring process, and the embodiment of the present application does not limit a specific method of acquiring the delay. By analyzing the audio signals acquired by each acquisition unit, the time delay of the audio signals of the target sound source acquired by each acquisition unit relative to the reference array element can be acquired, and a basis is provided for compensating the time delay in the follow-up process.
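The cross-correlation approach described above can be illustrated with a minimal sketch. It uses an unnormalized cross-correlation over integer sample lags, which is only one of several possible choices; the function name `estimate_delay`, the parameter `max_lag` and the sample data are hypothetical and are not taken from the application.

```python
def estimate_delay(reference, signal, max_lag):
    """Return the lag (in samples) at which `signal` best matches `reference`.

    A positive result means `signal` is a delayed copy of `reference`.
    The score is an unnormalized cross-correlation (an assumption of this sketch).
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, ref_sample in enumerate(reference):
            k = n + lag
            if 0 <= k < len(signal):
                score += ref_sample * signal[k]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical data: mic2 is the reference array element,
# and mic1 receives the same pulse 3 samples later.
mic2 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
delay_mic1 = estimate_delay(mic2, mic1, max_lag=5)
```

In this sketch the reference array element has a delay of zero by construction, and the delay of every other acquisition unit is expressed in samples relative to it.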
After the computer equipment obtains the time delay, the time delay compensation can be respectively carried out on the audio signals collected by each collection unit according to the time delay, so that the candidate audio signals of each collection unit are obtained.
For each sound collection device, if the acquisition units collect the audio signal of the target sound source at different times, then at any given moment the phases of the audio signal of the target sound source collected by the acquisition units may differ. Delay compensation aligns these phases.
The delay compensation process for the plurality of acquisition units may be as shown in fig. 4. Let $x_{i,m}(t)$ denote the audio signal collected by the m-th acquisition unit in the i-th sound collection device; the candidate audio signal of the m-th acquisition unit may then be expressed as $x'_{i,m}(t) = x_{i,m}(t + \tau_{i,m})$, where $\tau_{i,m}$ is the time delay corresponding to the m-th acquisition unit in the i-th sound collection device, and i and m are positive integers. After obtaining the candidate audio signal of each acquisition unit, the computer device may perform weighted summation on the candidate audio signals of the acquisition units in each sound collection device to obtain the audio signal corresponding to each sound collection device.
Wherein by this weighted summation process, the audio signal from the direction of the target sound source can be emphasized, thereby realizing beam forming of the direction of the target sound source, whereby the noise signal is attenuated with respect to the sound signal. The time delay compensation enables the phases of the audio signals of the target sound source collected by each collecting unit at the same moment to be the same, and among the audio signals of the target sound source collected by each collecting unit, the audio signals of the target sound source are coherent, the noise signals can be incoherent, the candidate audio signals are weighted and summed on the basis, the audio signals from the direction of the target sound source can be overlapped and enhanced, the incoherent noise signals can not be overlapped, and further the enhancement of the audio signals of the direction of the target sound source is realized. For example, after the computer device obtains the candidate audio signals of the mic1 and the mic2 of the first earphone, the candidate audio signals of the mic1 and the mic2 may be weighted and summed, and among the obtained audio signals corresponding to the first earphone, the audio signals from the direction of the human mouth are reinforced, while the noise signals in other directions are not reinforced, so that noise reduction of the audio signals collected by the mic1 and the mic2 is realized.
In one possible implementation, the computer device may perform weighted summation on the candidate audio signals of the plurality of acquisition units of each sound collection device by the following formula:
$s_i(t) = W_a^T X'_i(t)$
where $W_a = [w_{a1}, w_{a2}, \ldots, w_{am}]^T$ is the weight vector corresponding to the acquisition units, $w_{am}$ is the weight of the m-th acquisition unit, $X'_i(t) = [x'_{i,1}(t), x'_{i,2}(t), \ldots, x'_{i,m}(t)]^T$ is the vector of delay-compensated candidate audio signals of the m acquisition units in the i-th sound collection device, and $s_i(t)$ is the audio signal corresponding to the i-th sound collection device.
It should be noted that, when the above weighted summation is performed on the candidate audio signals of the acquisition units, the weight of each acquisition unit may be a fixed weight, where the fixed weight may be preset by a related technician according to an actual requirement, self experience or experimental result, for example, each sound acquisition device may include two acquisition units, and correspondingly, the weight of each acquisition unit may be 0.5. The weight may also be adjusted according to the filtering result, for example, according to the quality of the audio signal corresponding to each obtained sound collection device.
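The delay compensation and weighted summation described above can be sketched as follows. This is a minimal illustration that assumes integer-sample delays, zero padding at the edges and the fixed weight of 0.5 mentioned in the text; the helper names `compensate_delay` and `weighted_sum` and the sample data are hypothetical.

```python
def compensate_delay(signal, delay):
    """Advance `signal` by `delay` samples: x'[n] = x[n + delay], zero-padded."""
    length = len(signal)
    return [signal[n + delay] if 0 <= n + delay < length else 0.0
            for n in range(length)]


def weighted_sum(channels, weights):
    """Compute s[n] = sum over m of weights[m] * channels[m][n]."""
    return [sum(w * ch[n] for w, ch in zip(weights, channels))
            for n in range(len(channels[0]))]


# Hypothetical first earphone: mic2 is the reference (delay 0),
# mic1 receives the same pulse 2 samples later.
mic2 = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

aligned = [compensate_delay(mic1, 2), compensate_delay(mic2, 0)]
s_first = weighted_sum(aligned, [0.5, 0.5])
```

After compensation the two channels carry the pulse at the same sample positions, so the equal-weight sum reproduces the pulse at full amplitude, whereas components that are not aligned across the channels would only be averaged.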
In one possible implementation manner, as shown in fig. 5, the computer device may perform weighted summation on the candidate audio signals of the acquisition units in each sound collection device, according to a first weight corresponding to each acquisition unit, to obtain the candidate audio signal corresponding to each sound collection device. It may also filter out the audio signal from the direction of the target sound source from the candidate audio signal of each acquisition unit to obtain the noise signal corresponding to each acquisition unit, and perform weighted summation on the noise signals of the acquisition units in each sound collection device, according to a second weight corresponding to each acquisition unit, to obtain the noise signal corresponding to each sound collection device. The noise signal is then subtracted from the candidate audio signal of each sound collection device to obtain the audio signal corresponding to that sound collection device.
The candidate audio signals corresponding to each sound collecting device are obtained by strengthening the audio signals from the direction of the target sound source, which is equivalent to weakening the noise signals from other directions, but still comprises the audio signals and the noise signals of the target sound source, so that the candidate audio signals and the noise signals corresponding to each sound collecting device are subjected to difference, noise signals in the candidate audio signals are filtered, and further noise reduction is realized. The noise signal corresponding to each sound collecting device can be obtained by filtering the audio signal from the direction of the target sound source in the audio signal of each collecting unit, obtaining the corresponding noise signal of each collecting unit, and carrying out weighted summation on the noise signals of the collecting units in each sound collecting device.
For example, the computer device performs weighted summation on the candidate audio signals of the mic1 and the mic2 in the first earphone to obtain candidate audio signals corresponding to the first earphone, and then can filter out audio signals from the direction of the human mouth from the audio signals collected by the mic1 and the mic2 to obtain corresponding noise signals of the mic1 and the mic2, and then performs weighted summation on the noise signals of the mic1 and the mic2 to obtain the noise signals corresponding to the first earphone. And the candidate audio signals corresponding to the first earphone are subjected to difference with the noise signals, so that the noise signals in the candidate audio signals corresponding to the first earphone can be filtered out, and the audio signals corresponding to the first earphone are obtained.
The computer equipment obtains the noise signal corresponding to each sound collecting equipment by filtering the audio signal from the direction of the target sound source, and makes the difference between the candidate audio signal corresponding to each sound collecting equipment and the noise signal, so that the noise signal in the candidate audio signal of each sound collecting equipment is filtered, and a better noise reduction effect is achieved.
In one possible implementation, as shown in fig. 5, the computer device may implement a filtering process for the audio signal from the direction of the target sound source in the audio signal of each acquisition unit by using the blocking matrix, so as to obtain, in the audio signal of each acquisition unit, an audio signal from a direction other than the direction of the target sound source, that is, a noise signal corresponding to each acquisition unit.
For example, the computer device may obtain the audio signal corresponding to each sound collection device through the following formulas.
$s'_i(t) = W_a^T X'_i(t)$
$U_i(t) = B X'_i(t)$
$z_i(t) = W_b^T U_i(t)$
$s_i(t) = s'_i(t) - z_i(t)$
where $s'_i(t)$ is the candidate audio signal corresponding to the i-th sound collection device; $W_a = [w_{a1}, w_{a2}, \ldots, w_{am}]^T$ is the first weight vector corresponding to the acquisition units; $X'_i(t) = [x'_{i,1}(t), x'_{i,2}(t), \ldots, x'_{i,m}(t)]^T$ is the vector of delay-compensated candidate audio signals of the m acquisition units in the i-th sound collection device; $U_i(t)$ is the array signal obtained by processing that vector with the blocking matrix $B$; $W_b = [w_{b1}, w_{b2}, \ldots, w_{bm}]^T$ is the second weight vector corresponding to the acquisition units, and $w_{bm}$ is the adaptive weight of the m-th acquisition unit; $z_i(t)$ is the noise signal corresponding to the i-th sound collection device; and i and m are positive integers. In order to filter out the audio signal from the direction of the target sound source from the audio signal of each acquisition unit, the blocking matrix $B$ may be [matrix omitted in the source text].
It should be noted that, when the foregoing weighted summations are performed, the first weights and the second weights may be fixed weights preset by a technician according to actual requirements, experience or experimental results; for example, the first weight and the second weight of each acquisition unit may be 0.5. The first weights and the second weights may also be adjusted according to the filtering result, for example according to the quality of the audio signal obtained for each sound collection device.
In one possible implementation, the second weight may be an updatable weight. After obtaining the audio signal corresponding to each sound collection device, the computer device may adjust the second weight according to the correlation between that audio signal and an expected audio signal obtained based on the audio signal corresponding to each sound collection device, and continue to perform the steps of obtaining and removing the noise signal with the adjusted second weight. The iteration stops when the target condition is met, yielding the audio signal corresponding to each sound collection device.
The audio signal obtained by subtracting the noise signal from the candidate audio signal of each sound collection device may still contain residual noise that has not been filtered out. Accordingly, the second weight used in the weighted summation of the noise signals of the acquisition units may be an updatable weight, so that the resulting audio signal meets the target condition.
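The structure of fig. 5 (weighted sum, blocking matrix, noise estimate, subtraction) can be sketched for a single time sample of a two-unit device. The blocking matrix used here is an assumption: a square matrix whose rows each sum to zero, so that a component identical in both delay-compensated channels is cancelled. The second weights are hand-picked for this example rather than adapted, and the names `gsc_sample`, `mat_vec` and `dot` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


# Assumed blocking matrix: every row sums to zero, so a signal that is
# identical in both aligned channels (the target) is blocked.
B = [[1.0, -1.0],
     [-1.0, 1.0]]


def gsc_sample(x_aligned, w_a, w_b, blocking=B):
    """Process one sample vector of delay-compensated channel values."""
    s_candidate = dot(w_a, x_aligned)   # candidate audio signal: W_a^T X'
    u = mat_vec(blocking, x_aligned)    # target-free reference: B X'
    z = dot(w_b, u)                     # noise estimate: W_b^T U
    return s_candidate - z              # output: candidate minus noise


# Hypothetical sample: target value 1.0 in both channels, noise 0.4 in mic1 only.
clean = gsc_sample([1.4, 1.0], w_a=[0.5, 0.5], w_b=[0.25, -0.25])
target_only_reference = mat_vec(B, [1.0, 1.0])
```

With these hand-picked second weights the noise that survives the first weighted sum is removed and the output is close to the target value; in practice the second weights would have to be found by some adaptation procedure.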
Through the iterative process, the second weight can be adaptively adjusted to improve the noise reduction effect of the audio signal. For the target condition, the target condition may be that the correlation converges, or the correlation is greater than a target threshold, or a difference between the correlation and the target threshold is smaller than a difference threshold, or the iteration number reaches a target number, which is not limited in the embodiment of the present application.
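The alternative target conditions listed above can be sketched as a single stopping check. Combining them with a logical OR, the tolerance used to detect convergence, and all parameter names are assumptions made for illustration; the text leaves the choice of condition open.

```python
def should_stop(correlations, target_threshold, diff_threshold,
                max_iterations, tol=1e-6):
    """Return True if any of the example target conditions holds.

    `correlations` holds the correlation value computed at each iteration so far.
    """
    if not correlations:
        return False
    current = correlations[-1]
    # Condition 1: the correlation has converged (change below `tol`).
    converged = (len(correlations) >= 2
                 and abs(current - correlations[-2]) < tol)
    # Condition 2: the correlation exceeds the target threshold.
    above_target = current > target_threshold
    # Condition 3: the correlation is within `diff_threshold` of the target.
    close_to_target = abs(current - target_threshold) < diff_threshold
    # Condition 4: the iteration count has reached the target number.
    out_of_budget = len(correlations) >= max_iterations
    return converged or above_target or close_to_target or out_of_budget
```

A caller would append the newest correlation value after each weight update and stop iterating as soon as the function returns True.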
For example, the candidate audio signals and the noise signals of the first earphone are subjected to difference, and after the audio signals corresponding to the first earphone are obtained, the second weights of the mic1 and the mic2 can be updated, so that the noise signals of the first earphone obtained according to the noise signals of the mic1 and the mic2 can be better filtered.
The computer device may update the second weights using a modified least mean square algorithm to minimize the output power of the noise signal of each sound collection device.
In the update formula, $W_b(n+1)$ denotes the updated second weight vector.
The second weight of each acquisition unit is updated according to the correlation between the audio signal corresponding to each sound acquisition device and the expected audio signal obtained based on the audio signal corresponding to each sound acquisition device, so that the noise signal of each sound acquisition device can be adaptively updated, the filtering effect of the noise signal in the candidate audio signal corresponding to each sound acquisition device is better and better, and the noise reduction effect is improved.
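Since the exact update rule is not reproduced in this text, the following sketch substitutes a standard normalized least mean square (NLMS) step as a stand-in. It is an assumption and not necessarily the modified algorithm the application refers to. The step size `step`, the regularizer `eps`, the function names and the test data are all hypothetical.

```python
def nlms_update(w_b, u, output, step=0.5, eps=1e-8):
    """One NLMS step: w_b + step * output * u / (eps + ||u||^2)."""
    norm = eps + sum(v * v for v in u)
    return [w + step * output * v / norm for w, v in zip(w_b, u)]


def adapt(candidates, references, step=0.5):
    """Run the update over a sample sequence; return final weights and outputs.

    `candidates[n]` is the candidate audio sample and `references[n]` the
    blocked (target-free) sample vector at time n.
    """
    w_b = [0.0] * len(references[0])
    outputs = []
    for s_candidate, u in zip(candidates, references):
        z = sum(w * v for w, v in zip(w_b, u))  # current noise estimate
        out = s_candidate - z                   # output after subtraction
        outputs.append(out)
        w_b = nlms_update(w_b, u, out, step)    # reduce output power
    return w_b, outputs


# Hypothetical noise-only test: the candidate contains 0.5 times the reference.
noise = [1.0, -1.0] * 15
final_w, outs = adapt([0.5 * v for v in noise], [[v] for v in noise])
```

In this noise-only example the weight approaches 0.5 and the output power shrinks at every step, which is the behaviour expected of an update that minimizes the output power.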
In one possible implementation manner, after the audio signal is collected by the sound collection device, the computer device may also convert the collected audio signal into a frequency domain signal first, and then perform the next data processing.
Any sound is essentially an acoustic wave generated by vibration of an object, and the acquisition unit is capable of performing acquisition of an audio signal by converting the acoustic wave generated by such vibration into an electrical signal, the acquired electrical signal being in fact a change in voltage over time, where the voltage may represent the change in sound to some extent. The time domain is the situation that the description variable changes with time, and the audio signal acquired by the acquisition unit is obviously located in the time domain. The audio signals in the time domain are formed by superposition of various signals, and are difficult to split in the time domain, but can be split into signals with different frequencies in the frequency domain, so that complex audio signals can be split into a plurality of simple audio signals, and the analysis of the signals can be more convenient. The present application is not limited to a specific conversion mode.
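As one possible conversion, a discrete Fourier transform separates a frame of samples into frequency components. The sketch below uses a naive DFT for clarity (a fast Fourier transform would normally be used in practice); the choice of transform, the frame length of 8 and the test tone are assumptions, since the application does not fix a conversion mode.

```python
import cmath
import math


def dft(samples):
    """Naive DFT: X[k] = sum over n of x[n] * exp(-2j*pi*k*n/N)."""
    n_total = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / n_total)
                for n, x in enumerate(samples))
            for k in range(n_total)]


# Hypothetical frame: two superimposed tones falling in frequency bins 1 and 3.
frame = [math.sin(2 * math.pi * 1 * n / 8)
         + 0.5 * math.sin(2 * math.pi * 3 * n / 8)
         for n in range(8)]
magnitudes = [abs(c) for c in dft(frame)]
```

The two tones, which overlap in the time domain, appear as separate peaks in bins 1 and 3 (and their mirror bins 7 and 5), while the remaining bins are essentially zero.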
303. The computer equipment filters the audio signals corresponding to the two sound collecting equipment by taking the two sound collecting equipment as two microphones in the same microphone array to obtain a target audio signal after noise reduction of the same sound emitted by the target sound source.
After the computer equipment respectively carries out noise reduction on the audio signals acquired by the two sound acquisition equipment, the two sound acquisition equipment can be used as two microphones in the same microphone array to further carry out noise reduction.
Specifically, the noise reduction process may be as follows: the audio signals corresponding to the two sound collection devices are filtered to remove noise signals from directions other than the target direction, yielding the noise-reduced target audio signal for the sound emitted by the target sound source. For example, the audio signals corresponding to the two earphones may be regarded as the audio signals collected by mic2 and mic4 respectively, so that mic2 and mic4 serve as the two microphones of the same microphone array. Correspondingly, the target direction may be the direction from the human mouth to the midpoint between mic2 and mic4; that is, the incidence angle of the sound waves emitted by the mouth on the microphone array formed by mic2 and mic4 may be regarded as 90 degrees. During filtering, audio signals from other directions are filtered out, thereby achieving noise reduction.
Further noise reduction of the audio signals corresponding to the sound collection devices makes the proportion of noise in the final target audio signal smaller, that is, the signal-to-noise ratio of the target audio signal is higher and the noise reduction effect is better. Moreover, when the two earphones are worn on the two ears, the times at which the sound waves emitted by the mouth reach the two earphones can be regarded as equal; that is, the two sound collection devices can be regarded as collecting the sound waves emitted by the target sound source at the same time. The time delay therefore need not be considered when filtering the audio signals corresponding to the two sound collection devices, which simplifies the noise reduction method. Of course, the computer device may also calculate the time delay of the sound collected by the two sound collection devices and perform delay compensation and filtering based on it. The embodiment of the application does not limit which implementation is adopted.
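The equal-arrival-time assumption can be made concrete with a short worked relation. The symbols $r_1$ and $r_2$ (distances from the mouth to the two earphones), $d$ (spacing between the earphones), $\theta$ (incidence angle measured from the line joining them) and $c$ (speed of sound) are introduced here for illustration only, and the second line additionally assumes a far-field plane wave.

```latex
\[
\Delta\tau = \frac{r_1 - r_2}{c},
\qquad r_1 = r_2 \;\Longrightarrow\; \Delta\tau = 0
\]
\[
\Delta\tau \approx \frac{d\cos\theta}{c},
\qquad \theta = 90^\circ \;\Longrightarrow\; \Delta\tau = 0
\]
```

Under either view the relative delay between the two devices vanishes, which is why the weighted summation can be applied without a preceding delay compensation step.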
In one possible implementation, as shown in fig. 6, the computer device may perform weighted summation on the audio signals corresponding to the two sound collecting devices to obtain a target audio signal after the same sound emitted by the target sound source is noise-reduced, so as to implement a filtering process on the audio signals corresponding to the two sound collecting devices.
Through this weighted summation, the audio signal from the direction of the target sound source can be enhanced, realizing beam forming toward the target sound source, so that the noise signal is attenuated relative to the sound signal. In the audio signals corresponding to the sound collection devices, the audio signals of the target sound source are coherent, whereas the noise signals may be incoherent. When the audio signals corresponding to the two sound collection devices are weighted and summed on this basis, the audio signals from the target direction add up and are enhanced, while the incoherent noise signals do not, so the audio signal of the target sound source is enhanced. For example, after obtaining the audio signals of the first earphone and the second earphone, the computer device may perform weighted summation on them; in the resulting target audio signal, the audio signal from the direction of the mouth is enhanced while noise signals from other directions are not, so the audio signals of the first earphone and the second earphone are further noise-reduced.
In one possible implementation, the computer device may perform weighted summation on the audio signals corresponding to the sound collection devices by the following formula:
$y(t) = \sum_{i} w_{ai}\, s_i(t)$
where $W_a = [w_{a1}, w_{a2}, \ldots, w_{ai}]^T$ is the weight vector corresponding to the sound collection devices, $w_{ai}$ is the weight of the i-th sound collection device, $s_i(t)$ is the audio signal corresponding to the i-th sound collection device, and $y(t)$ is the target audio signal.
It should be noted that, when the above-mentioned audio signals corresponding to the two sound collecting devices are weighted and summed, the weight of each sound collecting device may be a fixed weight, and the fixed weight may be preset by a relevant technician according to the actual requirement, self experience or experimental result, for example, the fixed weight of each sound collecting device may be 0.5. The weight may also be adjusted according to the filtering result, for example, according to the quality of the obtained target audio signal.
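The benefit of the equal-weight sum can be illustrated with a short worked derivation. It assumes that each device signal is the same target component $d(t)$ plus a zero-mean noise term $n_i(t)$, and that the two noise terms are uncorrelated with equal variance $\sigma^2$; these modelling assumptions and the symbols $d$, $n_i$ and $\sigma$ are introduced here for illustration and are not stated in the application.

```latex
\[
s_i(t) = d(t) + n_i(t), \quad i = 1, 2
\]
\[
y(t) = 0.5\,s_1(t) + 0.5\,s_2(t) = d(t) + \tfrac{1}{2}\bigl(n_1(t) + n_2(t)\bigr)
\]
\[
\operatorname{Var}\!\left[\tfrac{1}{2}(n_1 + n_2)\right]
  = \tfrac{1}{4}\left(\sigma^2 + \sigma^2\right) = \tfrac{1}{2}\sigma^2
\]
```

Under these assumptions the target component passes unchanged while the noise power is halved, which corresponds to a signal-to-noise ratio improvement of about 3 dB.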
In one possible implementation manner, as shown in fig. 7, the computer device may perform weighted summation on the audio signals corresponding to the two sound collecting devices to obtain candidate target audio signals after noise reduction of the same sound sent by the target sound source, and then filter out the audio signals except for the target direction in the candidate target audio signals to obtain the target audio signals after noise reduction of the same sound sent by the target sound source.
The computer device filters out the audio signals except the target direction in the candidate target audio signals, namely, filters out the noise signals in the candidate target audio signals, and can be divided into two cases, wherein one case is to directly filter out the noise signals in the candidate audio signals to obtain the target audio signals, and the other case is to obtain the noise signals according to the audio signals corresponding to the two sound collecting devices, and subtract the noise signals from the candidate target audio signals to obtain the target audio signals.
In one possible implementation manner, the computer device may directly filter the candidate audio signal again, and filter the audio signal in a direction other than the target direction to obtain the target audio signal in the target direction. The application is not limited to a specific filtering mode.
In one possible implementation manner, as shown in fig. 7, the computer device may perform weighted summation on the audio signals corresponding to the two sound collecting devices according to the first weight corresponding to each sound collecting device, so as to obtain candidate target audio signals after noise reduction of the same sound emitted by the target sound source. The computer equipment filters the audio signals in the target direction in the audio signals corresponding to each sound collecting equipment respectively to obtain noise signals corresponding to each sound collecting equipment, and performs weighted summation on the noise signals of the two sound collecting equipment according to the second weight corresponding to each sound collecting equipment to obtain the noise signals. And the candidate target audio signal and the noise signal are subjected to difference, so that the target audio signal after noise reduction of the same sound emitted by the target sound source can be obtained.
The target direction may be a direction in which the target sound source points to a middle position of the two sound collection apparatuses. The candidate target audio signal is obtained by enhancing the audio signal from the target direction, but still comprises the audio signal of the target sound source and the noise signal, so that the candidate target audio signal and the noise signal can be subjected to difference, and the noise signal in the candidate target audio signal can be filtered, thereby realizing further noise reduction. The noise signals can be obtained by filtering the audio signals in the target direction in the audio signals corresponding to each sound collecting device, obtaining the noise signals corresponding to each sound collecting device, and carrying out weighted summation on the noise signals of the two sound collecting devices.
For example, after the audio signals corresponding to the first earphone and the second earphone are weighted and summed to obtain the candidate target audio signal, the audio signal in the direction from the human mouth to the midpoint between the first earphone and the second earphone may be filtered out of the audio signal corresponding to each earphone, giving the noise signals corresponding to the first earphone and the second earphone. These noise signals are then weighted and summed to obtain the noise signal. Subtracting the noise signal from the candidate target audio signal filters out the noise in the candidate target audio signal and yields the target audio signal.
The computer equipment obtains the noise signal by filtering the audio signal in the target direction, and makes the difference between the candidate target audio signal and the noise signal, so that the noise signal in the candidate target audio signal is filtered, and a better noise reduction effect is achieved.
In one possible implementation manner, as shown in fig. 7, the computer device may utilize the blocking matrix to implement filtering of the audio signal in the target direction in the audio signal corresponding to each sound collecting device, so as to obtain, in the audio signal of each sound collecting device, an audio signal in a direction other than the target direction, that is, a noise signal of each sound collecting device.
For example, the target audio signal may be obtained by the following formulas.
$y'(t) = W_a^T S_i(t)$
$U(t) = B S_i(t)$
$z(t) = W_b^T U(t)$
$y(t) = y'(t) - z(t)$
where $y'(t)$ is the candidate target audio signal; $W_a = [w_{a1}, w_{a2}, \ldots, w_{ai}]^T$ is the first weight vector corresponding to the sound collection devices; $S_i(t) = [s_1(t), s_2(t), \ldots, s_i(t)]^T$ is the vector of audio signals of the i sound collection devices; $U(t)$ is the array signal obtained by processing that vector with the blocking matrix $B$; $W_b = [w_{b1}, w_{b2}, \ldots, w_{bi}]^T$ is the second weight vector corresponding to the sound collection devices, and $w_{bi}$ is the second weight of the i-th sound collection device; $z(t)$ is the noise signal; and $y(t)$ is the target audio signal. In order to filter out the audio signal in the target direction from the audio signal of each sound collection device, the blocking matrix $B$ may be [matrix omitted in the source text].
It should be noted that, when the foregoing weighted summation is performed on the audio signals corresponding to the two sound collecting devices, the first weight and the second weight may be fixed weights, and the first weight and the second weight may be preset by related technicians according to actual requirements, self experience or experimental results, for example, the first weight and the second weight of each sound collecting device may be 0.5. The first weight and the second weight may also be adjusted according to the filtering result, for example, according to the quality of the obtained target audio signal.
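Why blocking the target direction leaves only noise can be seen in a short worked example for two devices. It assumes, for illustration only, a single-row blocking matrix $B = [1, -1]$ (so that the second weight reduces to a scalar $w_b$) and device signals of the form $s_i(t) = d(t) + n_i(t)$ with the target component $d(t)$ arriving at both devices at the same time; neither the matrix nor this signal model is taken from the application.

```latex
\[
U(t) = B\,S(t) = s_1(t) - s_2(t)
     = \bigl(d(t) + n_1(t)\bigr) - \bigl(d(t) + n_2(t)\bigr)
     = n_1(t) - n_2(t)
\]
\[
z(t) = w_b\,U(t), \qquad y(t) = y'(t) - z(t)
\]
```

Because $U(t)$ contains no target component under these assumptions, subtracting a scaled copy of it from the candidate target audio signal can only change the noise content, not the target sound.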
In one possible implementation, the second weight may be an updatable weight. The computer device performs weighted summation on the noise signals corresponding to the two sound collection devices, according to the second weight corresponding to each sound collection device, to obtain the noise signal, and removes the noise signal from the candidate target audio signal to obtain the noise-reduced target audio signal for the same sound emitted by the target sound source. It then adjusts the second weight according to the correlation between the target audio signal and an expected audio signal obtained based on the candidate target audio signal, and continues to perform the steps of obtaining and removing the noise signal with the adjusted second weight. The iteration stops when the target condition is met, yielding the noise-reduced target audio signal for the same sound emitted by the target sound source.
In the target audio signal obtained by subtracting the candidate target audio signal and the noise signal, there may be residual noise signals that are not filtered, and correspondingly, the second weight for performing weighted summation on the noise signals of the two sound collecting devices may be an updatable weight, so that the obtained target audio signal meets the target condition.
Through the iterative process, the second weight can be adaptively adjusted to improve the noise reduction effect of the audio signal. For the target condition, the target condition may be that the correlation converges, or the correlation is greater than a target threshold, or a difference between the correlation and the target threshold is smaller than a difference threshold, or the iteration number reaches a target number, which is not limited in the embodiment of the present application.
For example, the computer device performs a difference between the candidate target audio signal and the noise signal, and after obtaining the target audio signal, the second weight may be updated, so that the noise signal obtained according to the noise signals corresponding to the first earphone and the second earphone may be better filtered out of the candidate target audio signal.
The computer device may update the second weight using a modified least mean square algorithm to minimize the output power of the noise signal.
In the update formula, $W_b(n+1)$ denotes the updated second weight vector.
The second weight is updated according to the correlation between the target audio signal and the expected audio signal obtained based on the candidate target audio signal, so that the noise signal can be adaptively updated, the filtering effect of the noise signal in the candidate target audio signal is better and better, and the noise reduction effect is improved.
In one possible embodiment, the noise reduction method further includes:
the computer equipment detects the states of the two sound collecting equipment, and when the two sound collecting equipment are in the working state, the filtering steps of obtaining the audio signals corresponding to the two sound collecting equipment, respectively filtering and serving as the same microphone array are executed; when any sound collecting equipment is not in a working state, acquiring an audio signal collected by the sound collecting equipment in the working state for sound emitted by a target sound source, filtering the audio signal, and obtaining a target audio signal after noise reduction of the same sound emitted by the target sound source.
Before the noise reduction method is implemented, the computer equipment detects whether the two sound collection equipment are simultaneously started or not, and then different noise reduction methods are implemented, so that when only one sound collection equipment is started, the collected audio signals can be noise reduced, the noise reduction effect is achieved to a certain extent, and the situation that the audio signals cannot be noise reduced due to the fact that necessary data are deleted is avoided.
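The state-dependent choice of processing path can be sketched as a small dispatcher. Representing a device that is not in the working state by `None`, passing the two processing paths in as callables, and raising an error when neither device is working are all assumptions of this sketch; the names `reduce_noise`, `single_stage` and `dual_stage` are hypothetical.

```python
def reduce_noise(first, second, single_stage, dual_stage):
    """Choose the processing path from the working state of the two devices.

    `first` and `second` are the signals from each sound collection device,
    or None when that device is not in the working state.
    """
    if first is not None and second is not None:
        # Both devices working: per-device filtering, then joint filtering.
        return dual_stage(first, second)
    active = first if first is not None else second
    if active is None:
        raise ValueError("no sound collection device is in the working state")
    # Only one device working: filter its signal on its own.
    return single_stage(active)


# Hypothetical stand-ins for the two processing paths.
def demo_single(signal):
    return ("single", signal)


def demo_dual(sig_a, sig_b):
    return ("dual", sig_a, sig_b)


both = reduce_noise([0.1], [0.2], demo_single, demo_dual)
only_second = reduce_noise(None, [0.2], demo_single, demo_dual)
```

Keeping the decision in one place means the single-device fallback is exercised through the same entry point as the two-device path.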
The embodiment of the application provides a noise reduction method in which a computer device acquires the audio signals that two sound collection devices collected for the same sound emitted by a target sound source, filters the audio signals collected by each sound collection device separately, and then further filters the audio signals corresponding to the two sound collection devices. This two-stage processing lowers the proportion of noise in the resulting target audio signal and raises its signal-to-noise ratio, so the noise reduction effect is better.
Fig. 8 is a schematic structural diagram of a noise reduction device according to an embodiment of the present application. Referring to Fig. 8, the device includes:
an acquiring module 801, configured to acquire the audio signals that two sound collection devices collected for the same sound emitted by a target sound source;
a first filtering module 802, configured to filter the audio signals collected by the two sound collection devices separately, to obtain an audio signal corresponding to each sound collection device; and
a second filtering module 803, configured to filter the audio signals corresponding to the two sound collection devices, to obtain the noise-reduced target audio signal for the same sound emitted by the target sound source.
In one possible implementation, the second filtering module 803 includes:
a second weighting module, configured to perform weighted summation on the audio signals corresponding to the two sound collection devices, to obtain the noise-reduced target audio signal for the same sound emitted by the target sound source.
In one possible implementation, the second filtering module 803 includes:
a second weighting module, configured to perform weighted summation on the audio signals corresponding to the two sound collection devices, to obtain a noise-reduced candidate target audio signal for the same sound emitted by the target sound source; and
a second filtering module, configured to filter out, from the candidate target audio signal, the audio signals that do not come from the target direction, to obtain the noise-reduced target audio signal for the same sound emitted by the target sound source.
In one possible implementation, the second filtering module includes:
a second filtering sub-module, configured to filter out the audio signal in the target direction from the audio signal corresponding to each sound collection device, to obtain a noise signal corresponding to each sound collection device, where the target direction is the direction in which the target sound source points toward the midpoint between the two sound collection devices;
a second weighting submodule, configured to perform weighted summation on the noise signals corresponding to the two sound collection devices, to obtain a noise signal; and
a second removing module, configured to remove the noise signal from the candidate target audio signal, to obtain the noise-reduced target audio signal for the same sound emitted by the target sound source.
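The three sub-steps above (obtain per-device noise signals, combine them by weighted summation, remove the result from the candidate target audio signal) can be sketched as follows. This is only an illustration under assumptions: the target direction is blocked by subtracting two time-aligned channels of the same device, which is one common way to build a noise reference and is not stated in the text; the names `block_target`, `weighted_sum`, and `remove_noise` are hypothetical, and the weights are fixed rather than adapted.

```python
from typing import List, Sequence

Signal = List[float]


def block_target(aligned: Sequence[Signal]) -> Signal:
    """Noise reference for one device (assumed approach).

    A target that is identical in two time-aligned channels cancels in
    their difference, so only non-target components remain.
    """
    first, second = aligned[0], aligned[1]
    return [a - b for a, b in zip(first, second)]


def weighted_sum(signals: Sequence[Signal], weights: Sequence[float]) -> Signal:
    """Sample-by-sample weighted sum of equally long signals."""
    return [sum(w * s[n] for w, s in zip(weights, signals))
            for n in range(len(signals[0]))]


def remove_noise(device_outputs: Sequence[Signal],
                 noise_refs: Sequence[Signal],
                 first_weights: Sequence[float],
                 second_weights: Sequence[float]) -> Signal:
    """Candidate target signal minus the weighted noise signal."""
    candidate = weighted_sum(device_outputs, first_weights)
    noise = weighted_sum(noise_refs, second_weights)
    return [c - v for c, v in zip(candidate, noise)]
```

When the input contains only the target sound, the noise references are all zero and the output equals the candidate target audio signal, so the target itself is not attenuated by the removal step.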
In one possible implementation, the second weighting submodule is specifically configured to perform weighted summation on the noise signals corresponding to the two sound collection devices according to the weight corresponding to each sound collection device, to obtain the noise signal.
The second filtering module further includes a second adjusting module, configured to adjust the weight according to the correlation between the target audio signal and a desired audio signal obtained based on the candidate target audio signal.
The second filtering module is further configured to continue performing the steps of obtaining and removing the noise signal based on the adjusted weight, and to stop when the noise signal meets a target condition, thereby obtaining the noise-reduced target audio signal for the same sound emitted by the target sound source.
In one possible implementation, the first filtering module 802 includes:
a time delay module, configured to perform time delay compensation on the audio signal collected by each collection unit in each sound collection device, according to the time delays with which the different collection units in that sound collection device collect the sound, to obtain a candidate audio signal for each collection unit; and
a first filtering sub-module, configured to filter the candidate audio signals of the collection units in each sound collection device, to obtain the audio signal corresponding to each sound collection device.
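A minimal sketch of delay compensation followed by a simple filter is given below. It assumes non-negative integer sample delays that are already known, and it uses plain averaging of the aligned channels (delay-and-sum) as the filter; the text does not specify the filter, and the names `compensate_delay` and `delay_and_sum` are hypothetical.

```python
from typing import List, Sequence

Signal = List[float]


def compensate_delay(signal: Signal, delay: int) -> Signal:
    """Advance a channel by `delay` samples, zero-padding the end."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    # Clamp so the output always has the same length as the input.
    delay = min(delay, len(signal))
    return signal[delay:] + [0.0] * delay


def delay_and_sum(channels: Sequence[Signal], delays: Sequence[int]) -> Signal:
    """Align every collection unit's channel, then average them."""
    aligned = [compensate_delay(ch, d) for ch, d in zip(channels, delays)]
    count = len(aligned)
    return [sum(ch[n] for ch in aligned) / count
            for n in range(len(aligned[0]))]
```

After alignment the target sound adds coherently across the collection units while uncorrelated noise does not, which is why even plain averaging improves the signal-to-noise ratio.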
In one possible embodiment, the device further includes:
a detection module, configured to detect the states of the two sound collection devices.
The device is further configured to:
when both sound collection devices are in a working state, perform the steps of acquiring the audio signals collected by the two sound collection devices, filtering them separately, and then filtering the resulting audio signals with the two devices treated as a single microphone array; and
when either sound collection device is not in a working state, acquire the audio signal that the sound collection device in the working state collected for the sound emitted by the target sound source, and filter that audio signal to obtain the noise-reduced target audio signal for the sound emitted by the target sound source.
The embodiment of the application provides a noise reduction device that acquires the audio signals that two sound collection devices collected for the same sound emitted by a target sound source, filters the audio signals collected by each sound collection device separately, and then further filters the audio signals corresponding to the two sound collection devices. This two-stage processing lowers the proportion of noise in the resulting target audio signal and raises its signal-to-noise ratio, so the noise reduction effect is better.
It should be noted that the division into functional modules in the noise reduction device of the above embodiment is only an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the noise reduction device and the noise reduction method provided in the foregoing embodiments belong to the same concept; the detailed implementation of the device is described in the method embodiments and is not repeated here.
The sound collection device in the above noise reduction method may be the terminal shown in Fig. 9, which is a schematic structural diagram of a terminal according to an embodiment of the present application. Referring to Fig. 9, the terminal 900 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 900 may also be referred to as user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.
In general, the terminal 900 includes: one or more processors 901 and one or more memories 902.
The processor 901 may include one or more processing cores, such as a 4-core processor, a 9-core processor, and the like. The processor 901 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor. The main processor, also referred to as a CPU (Central Processing Unit), is a processor for processing data in an awake state; the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 901 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 901 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one instruction for execution by processor 901 to implement the noise reduction methods provided by the method embodiments of the present application.
In some embodiments, the terminal 900 may further optionally include: a peripheral interface 903, and at least one peripheral. The processor 901, memory 902, and peripheral interface 903 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 903 via buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 904, a display 905, a camera assembly 906, audio circuitry 907, a positioning assembly 908, and a power source 909.
The peripheral interface 903 may be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, memory 902, and peripheral interface 903 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 901, the memory 902, and the peripheral interface 903 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 904 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 904 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 904 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 904 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 904 may also include NFC (Near Field Communication) related circuits, which is not limited in the present application.
The display 905 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display 905 is a touch display, the display 905 also has the ability to capture touch signals at or above its surface. The touch signal may be input to the processor 901 as a control signal for processing. In this case, the display 905 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 905, disposed on the front panel of the terminal 900; in other embodiments, there may be at least two displays 905, disposed on different surfaces of the terminal 900 or in a folded design; in still other embodiments, the display 905 may be a flexible display disposed on a curved surface or a folded surface of the terminal 900. The display 905 may even be arranged in a non-rectangular irregular shape, that is, a shaped screen. The display 905 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 906 is used to capture images or video. Optionally, the camera assembly 906 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each of which is any one of a main camera, a depth camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and virtual reality (VR) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 906 may also include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm light flash and a cold light flash, and can be used for light compensation under different color temperatures.
The audio circuit 907 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals, and input them to the processor 901 for processing or to the radio frequency circuit 904 for voice communication. For stereo acquisition or noise reduction, there may be multiple microphones, disposed at different parts of the terminal 900. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert an electrical signal not only into sound waves audible to humans but also into sound waves inaudible to humans, for ranging and other purposes. In some embodiments, the audio circuit 907 may also include a headphone jack.
The positioning component 908 is used to determine the current geographic location of the terminal 900 to enable navigation or LBS (Location Based Service). The positioning component 908 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 909 is used to supply power to the various components in the terminal 900. The power supply 909 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power supply 909 includes a rechargeable battery, the rechargeable battery can support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 900 can further include one or more sensors 910. The one or more sensors 910 include, but are not limited to: acceleration sensor 911, gyroscope sensor 912, pressure sensor 913, fingerprint sensor 914, optical sensor 915, and proximity sensor 916.
The acceleration sensor 911 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 900. For example, the acceleration sensor 911 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 901 may control the display 905 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 911. The acceleration sensor 911 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 912 may detect a body direction and a rotation angle of the terminal 900, and the gyro sensor 912 may collect a 3D motion of the user on the terminal 900 in cooperation with the acceleration sensor 911. The processor 901 may implement the following functions according to the data collected by the gyro sensor 912: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 913 may be provided at a side frame of the terminal 900 and/or at a lower layer of the display 905. When the pressure sensor 913 is provided at a side frame of the terminal 900, it can detect the user's grip signal on the terminal 900, and the processor 901 performs left-hand or right-hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 913. When the pressure sensor 913 is provided at the lower layer of the display 905, the processor 901 controls the operable controls on the UI according to the user's pressure operation on the display 905. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 914 is used to collect the user's fingerprint. The processor 901 identifies the user according to the fingerprint collected by the fingerprint sensor 914, or the fingerprint sensor 914 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 901 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 914 may be provided on the front, back, or side of the terminal 900. When a physical key or a vendor logo is provided on the terminal 900, the fingerprint sensor 914 may be integrated with the physical key or the vendor logo.
The optical sensor 915 is used to collect the intensity of ambient light. In one embodiment, the processor 901 may control the display brightness of the display 905 based on the ambient light intensity collected by the optical sensor 915. Specifically, when the ambient light intensity is high, the display brightness of the display 905 is turned up; when the ambient light intensity is low, the display brightness of the display 905 is turned down. In another embodiment, the processor 901 may also dynamically adjust the shooting parameters of the camera assembly 906 based on the ambient light intensity collected by the optical sensor 915.
The proximity sensor 916, also referred to as a distance sensor, is typically provided on the front panel of the terminal 900. The proximity sensor 916 is used to collect the distance between the user and the front of the terminal 900. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front of the terminal 900 gradually decreases, the processor 901 controls the display 905 to switch from the screen-on state to the screen-off state; when the proximity sensor 916 detects that the distance between the user and the front of the terminal 900 gradually increases, the processor 901 controls the display 905 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the structure shown in fig. 9 is not limiting and that more or fewer components than shown may be included or certain components may be combined or a different arrangement of components may be employed.
In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions that are executable by a processor to perform the noise reduction method of the above embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
Those skilled in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disk, or the like.
The foregoing describes only preferred embodiments of the present application and is not intended to limit the application; any modification, equivalent replacement, or improvement made within the spirit and principles of the application shall fall within its scope of protection.

Claims (6)

1. A noise reduction method, the method comprising:
acquiring audio signals obtained by two sound collection devices collecting the same sound emitted by a target sound source, wherein each sound collection device comprises at least two collection units, and each collection unit is a microphone;
filtering the audio signals collected by the two sound collection devices separately to obtain an audio signal corresponding to each sound collection device;
performing, according to a first weight corresponding to each sound collection device, weighted summation on the audio signals corresponding to the two sound collection devices to obtain a candidate target audio signal after noise reduction of the same sound emitted by the target sound source;
filtering out the audio signal in a target direction from the audio signal corresponding to each sound collection device to obtain a noise signal corresponding to each sound collection device, wherein the target direction is the direction in which the target sound source points toward the midpoint between the two sound collection devices;
performing, according to a second weight corresponding to each collection unit in each sound collection device, weighted summation on the noise signals of the collection units in each sound collection device to obtain a noise signal;
removing the noise signal from the candidate target audio signal to obtain a target audio signal after noise reduction of the same sound emitted by the target sound source;
and adjusting the second weight according to the correlation between the target audio signal and an expected audio signal obtained based on the candidate target audio signal, and continuing, based on the adjusted second weight, to perform the steps of obtaining and removing the noise signal until the target audio signal meets a target condition, to obtain the target audio signal after noise reduction of the same sound emitted by the target sound source.
2. The method according to claim 1, wherein filtering the audio signals collected by the two sound collection devices separately to obtain the audio signal corresponding to each sound collection device comprises:
performing, according to the time delays with which the different collection units in each sound collection device collect the sound, time delay compensation on the audio signal collected by each collection unit in each sound collection device to obtain a candidate audio signal of each collection unit;
and filtering the candidate audio signals of the collection units in each sound collection device to obtain the audio signal corresponding to each sound collection device.
3. The method according to claim 1, wherein the method further comprises:
detecting states of the two sound collection devices;
when both sound collection devices are in a working state, performing the steps of acquiring the audio signals collected by the two sound collection devices, filtering them separately, and filtering the resulting audio signals with the two devices treated as a single microphone array;
when either sound collection device is not in a working state, acquiring the audio signal that the sound collection device in the working state collected for the sound emitted by the target sound source, and filtering the audio signal to obtain the target audio signal after noise reduction of the sound emitted by the target sound source.
4. A noise reduction device, characterized in that the device comprises a plurality of functional modules for performing the noise reduction method of any one of claims 1 to 3.
5. A computer device comprising one or more processors and one or more memories having stored therein at least one instruction that is loaded and executed by the one or more processors to implement the operations performed by the noise reduction method of any of claims 1 to 3.
6. A computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the operations performed by the noise reduction method of any one of claims 1 to 3.
CN202010111706.0A 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium Active CN111402913B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010111706.0A CN111402913B (en) 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010111706.0A CN111402913B (en) 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111402913A CN111402913A (en) 2020-07-10
CN111402913B true CN111402913B (en) 2023-09-12

Family

ID=71413851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010111706.0A Active CN111402913B (en) 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111402913B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112185336A (en) * 2020-09-28 2021-01-05 苏州臻迪智能科技有限公司 Noise reduction method, device and equipment
CN112785998B (en) * 2020-12-29 2022-11-15 展讯通信(上海)有限公司 Signal processing method, equipment and device
CN114697812B (en) * 2020-12-29 2023-06-20 华为技术有限公司 Sound collection method, electronic equipment and system
CN112837703B (en) * 2020-12-30 2024-08-23 深圳市联影高端医疗装备创新研究院 Method, device, equipment and medium for acquiring voice signal in medical imaging equipment
CN113539291B (en) * 2021-07-09 2024-06-25 北京声智科技有限公司 Noise reduction method and device for audio signal, electronic equipment and storage medium
CN113766385B (en) * 2021-09-24 2023-12-22 维沃移动通信有限公司 Earphone noise reduction method and device
CN115132220B (en) * 2022-08-25 2023-02-28 深圳市友杰智新科技有限公司 Method, device, equipment and storage medium for restraining double-microphone awakening of television noise

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9741360B1 (en) * 2016-10-09 2017-08-22 Spectimbre Inc. Speech enhancement for target speakers
CN108091344A (en) * 2018-02-28 2018-05-29 科大讯飞股份有限公司 A kind of noise-reduction method, apparatus and system
WO2018127483A1 (en) * 2017-01-03 2018-07-12 Koninklijke Philips N.V. Audio capture using beamforming
CN108922554A (en) * 2018-06-04 2018-11-30 南京信息工程大学 The constant Wave beam forming voice enhancement algorithm of LCMV frequency based on logarithm Power estimation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10438605B1 (en) * 2018-03-19 2019-10-08 Bose Corporation Echo control in binaural adaptive noise cancellation systems in headsets

Also Published As

Publication number Publication date
CN111402913A (en) 2020-07-10

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant