CN109087659A - Audio optimization method and apparatus - Google Patents

Info

Publication number
CN109087659A
Authority
CN
China
Prior art keywords
environmental noise
audio data
noise model
model
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810878268.3A
Other languages
Chinese (zh)
Inventor
叶韵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics China R&D Center
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics China R&D Center
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics China R&D Center and Samsung Electronics Co Ltd
Priority to CN201810878268.3A
Publication of CN109087659A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

An audio optimization method and apparatus are provided. The audio optimization method includes: obtaining multiple environmental noise models suited to different environments; obtaining audio data; selecting one environmental noise model from the multiple environmental noise models for the audio data; and optimizing the audio data using the selected environmental noise model. The audio optimization method and apparatus of the invention can select the optimal environmental noise model from multiple environmental noise models and use it to optimize the audio data.

Description

Audio optimization method and apparatus
Technical field
The following description relates to an audio optimization method and apparatus, and more particularly to a method and apparatus that select one environmental noise model from multiple environmental noise models and optimize audio data using the selected environmental noise model.
Background
Voice optimization is a processing method that targets the interference factors in voice recording and playback. These interference factors include background noise, blurred speech, accents, mispronunciation, distortion, and other conditions that make speech unclear. A voice optimization system reduces or eliminates these interference factors, improving the comfort, convenience, and adaptability of voice communication in every respect.
When a smart device receives a call or sends a voice message, it often encounters noisy conditions. When the interference is strong, the sound quality of the recording suffers, and the voice needs audio optimization. Current audio optimization technology relies mainly on hardware and must be paired with external devices and interfaces; the external devices need to be charged separately, which makes them inconvenient to use. In addition, hardware audio optimization schemes are relatively fixed, usually applying fixed methods such as high-pass filtering, low-pass filtering, or Gaussian filtering. They cannot adapt to the environment, so the optimization effect is usually unstable.
Summary of the invention
The present invention is proposed to address at least the above disadvantages and to provide the following advantages.
One aspect of the invention is to enable a user to select a desired environmental noise model from multiple preconfigured environmental noise models to optimize audio data.
Another aspect of the invention is to automatically select the optimal environmental noise model from the preconfigured environmental noise models by pre-optimizing the audio data, and to optimize the audio data with it.
Another aspect of the invention is to reselect the optimal environmental noise model according to the audio data and to optimize the audio data using the newly selected environmental noise model, so that the audio data is always optimized with the optimal environmental noise model throughout the optimization process.
Another aspect of the invention is to continuously collect data while obtaining audio data and optimizing the obtained audio data, to store the data in separate data sets by category, and to process the collected data to further optimize the multiple environmental noise models configured on the audio optimization device, so that each environmental noise model reflects the noise characteristics of its specific environment more accurately.
Another aspect of the invention is to enable the user to establish a new environmental noise model, so that a user who is dissatisfied with how the existing environmental noise models optimize the audio data can establish a new environmental noise model to optimize the audio data.
According to one aspect of the invention, an audio optimization method is provided. The method includes: obtaining multiple environmental noise models suited to different environments; obtaining audio data; selecting one environmental noise model from the multiple environmental noise models for the audio data; and optimizing the audio data using the selected environmental noise model.
The step of selecting one environmental noise model from the multiple environmental noise models may include: in a manual selection mode, receiving a user input that specifies one of the multiple environmental noise models, and determining the environmental noise model specified by the user as the selected environmental noise model.
The step of selecting one environmental noise model from the multiple environmental noise models may include: in an adaptive selection mode, pre-optimizing the audio data using each of the multiple environmental noise models, and selecting one environmental noise model from the multiple environmental noise models according to the pre-optimization results.
The step of pre-optimizing the audio data using each of the multiple environmental noise models and selecting one environmental noise model according to the pre-optimization results may include: extracting audio data of a predetermined length from the audio data; optimizing the audio data of the predetermined length using each of the multiple environmental noise models to obtain multiple optimization results, each corresponding to one of the environmental noise models; and evaluating each of the optimization results and selecting, from the multiple environmental noise models, the environmental noise model with the best evaluation result.
The step of evaluating each of the optimization results may include: calculating the signal-to-noise ratio, subjective speech quality assessment, and segmental signal-to-noise ratio of each optimization result; calculating the weighted average of these three values for each optimization result; and determining the environmental noise model corresponding to the optimization result with the highest weighted average as the environmental noise model with the best evaluation result.
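The evaluation step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Metrics` type, the weight values, and the model names are assumptions, since the source gives neither the weights nor how the three metrics are measured.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Metrics:
    snr: float             # signal-to-noise ratio of an optimization result
    speech_quality: float  # subjective speech quality assessment score
    seg_snr: float         # segmental signal-to-noise ratio


# Hypothetical weights; the source does not specify any values.
WEIGHTS: Tuple[float, float, float] = (0.4, 0.3, 0.3)


def weighted_score(m: Metrics, weights: Tuple[float, float, float] = WEIGHTS) -> float:
    """Weighted average of the three metrics of one optimization result."""
    w_snr, w_quality, w_seg = weights
    total = w_snr + w_quality + w_seg
    return (w_snr * m.snr + w_quality * m.speech_quality + w_seg * m.seg_snr) / total


def select_best_model(results: Dict[str, Metrics]) -> str:
    """Return the name of the model whose result has the highest weighted average."""
    return max(results, key=lambda name: weighted_score(results[name]))
```

In practice the three metrics sit on different scales, so an implementation would probably normalize them before averaging; the source does not discuss this.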
In the adaptive selection mode, the step of optimizing the audio data using the selected environmental noise model may include: at every predetermined time interval, pre-optimizing the audio data once and selecting an environmental noise model from the multiple environmental noise models according to the pre-optimization result; and using the selected environmental noise model to optimize the portion of the audio data from the moment that model is selected until the moment an environmental noise model is next selected.
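The interval-based reselection described above might look like the sketch below. The callables `select_model` and `optimize` are hypothetical stand-ins for the pre-optimization and optimization steps, and the interval is expressed in samples for simplicity. Here the selector sees the whole segment, whereas the source pre-optimizes only a clip of predetermined length.

```python
from typing import Callable, List, Sequence


def optimize_with_reselection(
    samples: Sequence[float],
    interval: int,
    select_model: Callable[[Sequence[float]], str],
    optimize: Callable[[Sequence[float], str], List[float]],
) -> List[float]:
    """Reselect a model at every interval and apply it until the next selection."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    out: List[float] = []
    for start in range(0, len(samples), interval):
        segment = samples[start:start + interval]
        model = select_model(segment)          # selection made at this interval boundary
        out.extend(optimize(segment, model))   # model stays in force for this segment only
    return out
```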
The step of optimizing the audio data may include: obtaining the optimized audio data by subtracting the spectrum of the noise feature corresponding to the selected environmental noise model from the spectrum of the audio data.
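A minimal sketch of the subtraction for a single frame, assuming the magnitude spectra of the audio and of the model's noise feature have already been computed with the same number of frequency bins. Clamping negative results to a floor is an assumption added here; the source only states the subtraction.

```python
from typing import List, Sequence


def spectral_subtract(
    audio_mag: Sequence[float],
    noise_mag: Sequence[float],
    floor: float = 0.0,
) -> List[float]:
    """Subtract the noise magnitude spectrum from the audio magnitude spectrum, bin by bin."""
    if len(audio_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    # Clamp at `floor` so that no bin ends up with a negative magnitude (assumption).
    return [max(a - n, floor) for a, n in zip(audio_mag, noise_mag)]
```

Turning the result back into a waveform would also need the phase of the original frame and an inverse transform, which are omitted here.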
The optimization method may further include: after obtaining the audio data, determining whether the audio data needs to be optimized; if the audio data does not need to be optimized, adding the audio data to a clean data set in a preset training sample library; if the audio data needs to be optimized, then after the audio data is optimized using the selected environmental noise model, adding the original audio data corresponding to the optimized audio data to the noise data set in the training sample library that corresponds to the selected environmental noise model; and optimizing the selected environmental noise model based on the audio data in the clean data set and the audio data in the noise data set corresponding to the selected environmental noise model.
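The routing of collected audio into the training sample library can be sketched as below. The class name, the use of raw `bytes` for audio, and the per-model dictionary are illustrative assumptions; how the device decides that audio needs optimization is not specified in the source and is passed in as a flag.

```python
from collections import defaultdict
from typing import Dict, List


class TrainingSampleLibrary:
    """Holds one clean data set and one noise data set per environmental noise model."""

    def __init__(self) -> None:
        self.clean: List[bytes] = []
        self.noisy: Dict[str, List[bytes]] = defaultdict(list)

    def add(self, original_audio: bytes, needs_optimization: bool, model_name: str = "") -> None:
        if not needs_optimization:
            # Audio that needs no optimization is treated as clean data.
            self.clean.append(original_audio)
            return
        if not model_name:
            raise ValueError("a model name is required for audio that was optimized")
        # The original (unoptimized) audio goes to the set of the model that was used.
        self.noisy[model_name].append(original_audio)
```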
The step of adding, after the audio data is optimized using the selected environmental noise model, the original audio data corresponding to the optimized audio data to the noise data set corresponding to the selected environmental noise model may include: in the adaptive selection mode, determining whether the selected environmental noise model satisfies a predetermined condition; if it does not, prompting the user whether to establish a new environmental noise model; and if the user chooses to establish a new environmental noise model, establishing the new environmental noise model and adding the original audio data corresponding to the optimized audio data to the noise data set in the training sample library that corresponds to the new environmental noise model.
The step of determining whether the selected environmental noise model satisfies the predetermined condition may include: calculating a first weighted average of the signal-to-noise ratio, subjective speech quality assessment, and segmental signal-to-noise ratio of the audio data of the predetermined length; calculating a second weighted average of the same three values for the audio data obtained by optimizing the audio data of the predetermined length with the selected environmental noise model; calculating the ratio of the second weighted average to the first weighted average; and, if the ratio does not reach a predetermined value, determining that the selected environmental noise model does not satisfy the predetermined condition.
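The ratio test above reduces to a short check. In this sketch the two weighted averages are assumed to be already computed, and the threshold is a hypothetical parameter, since the source only calls it a predetermined value.

```python
def meets_condition(score_before: float, score_after: float, threshold: float) -> bool:
    """True if the ratio of the optimized score to the original score reaches the threshold."""
    if score_before <= 0:
        # The ratio is only meaningful for a positive baseline (assumption of this sketch).
        raise ValueError("score_before must be positive")
    return score_after / score_before >= threshold
```

When `meets_condition` returns `False`, the device would prompt the user about establishing a new environmental noise model.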
The step of optimizing the selected environmental noise model may include: determining the difference between the audio features of the audio data in the noise data set corresponding to the selected environmental noise model and the audio features of the audio data in the clean data set as the noise feature corresponding to the selected environmental noise model.
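One way to read the model update above is as a difference of average feature vectors. The sketch below assumes each recording has already been reduced to a fixed-length feature vector (for example an averaged magnitude spectrum); averaging over each data set is an assumption, as the source only says the difference between the two sets' features is taken.

```python
from typing import List, Sequence


def mean_feature(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized feature vectors."""
    if not vectors:
        raise ValueError("at least one feature vector is required")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def estimate_noise_feature(
    noisy: Sequence[Sequence[float]],
    clean: Sequence[Sequence[float]],
) -> List[float]:
    """Noise feature = mean feature of the noise data set minus mean feature of the clean data set."""
    noisy_mean = mean_feature(noisy)
    clean_mean = mean_feature(clean)
    return [a - b for a, b in zip(noisy_mean, clean_mean)]
```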
According to another aspect of the invention, an audio optimization device is provided. The device includes: a model acquisition module configured to obtain multiple environmental noise models suited to different environments; a data acquisition module configured to obtain audio data; a model selection module configured to select one environmental noise model from the multiple environmental noise models for the audio data; and an audio optimization module configured to optimize the audio data using the selected environmental noise model.
The model selection module may be configured to: in a manual selection mode, receive a user input that specifies one of the multiple environmental noise models, and determine the environmental noise model specified by the user as the selected environmental noise model.
The model selection module may be configured to: in an adaptive selection mode, pre-optimize the audio data using each of the multiple environmental noise models, and select one environmental noise model from the multiple environmental noise models according to the pre-optimization results.
The model selection module may be configured to: extract audio data of a predetermined length from the audio data; optimize the audio data of the predetermined length using each of the multiple environmental noise models to obtain multiple optimization results, each corresponding to one of the environmental noise models; and evaluate each of the optimization results and select, from the multiple environmental noise models, the environmental noise model with the best evaluation result.
The model selection module may be configured to: calculate the signal-to-noise ratio, subjective speech quality assessment, and segmental signal-to-noise ratio of each optimization result; calculate the weighted average of these three values for each optimization result; and determine the environmental noise model corresponding to the optimization result with the highest weighted average as the environmental noise model with the best evaluation result.
The model selection module may be configured to: in the adaptive selection mode, at every predetermined time interval, pre-optimize the audio data once and select an environmental noise model from the multiple environmental noise models according to the pre-optimization result; the selected environmental noise model is used to optimize the portion of the audio data from the moment that model is selected until the moment an environmental noise model is next selected.
The audio optimization module may be configured to: obtain the optimized audio data by subtracting the spectrum of the noise feature corresponding to the selected environmental noise model from the spectrum of the audio data.
The audio optimization device may further include a model optimization module configured to: after the audio data is obtained, determine whether the audio data needs to be optimized; if the audio data does not need to be optimized, add the audio data to a clean data set in a preset training sample library; if the audio data needs to be optimized, then after the audio optimization module optimizes the audio data using the environmental noise model selected by the model selection module, add the original audio data corresponding to the optimized audio data to the noise data set in the training sample library that corresponds to the selected environmental noise model; and optimize the selected environmental noise model based on the audio data in the clean data set and the audio data in the noise data set corresponding to the selected environmental noise model.
The audio optimization device may further include a model building module configured to: in the adaptive selection mode, determine whether the selected environmental noise model satisfies a predetermined condition; if it does not, prompt the user whether to establish a new environmental noise model; and if the user chooses to establish a new environmental noise model, establish the new environmental noise model and add the original audio data corresponding to the optimized audio data to the noise data set in the training sample library that corresponds to the new environmental noise model.
The model building module may be configured to: calculate a first weighted average of the signal-to-noise ratio, subjective speech quality assessment, and segmental signal-to-noise ratio of the audio data of the predetermined length; calculate a second weighted average of the same three values for the audio data obtained by optimizing the audio data of the predetermined length with the selected environmental noise model; calculate the ratio of the second weighted average to the first weighted average; and, if the ratio does not reach a predetermined value, determine that the selected environmental noise model does not satisfy the predetermined condition.
The model optimization module may be configured to: determine the difference between the audio features of the audio data in the noise data set corresponding to the selected environmental noise model and the audio features of the audio data in the clean data set as the noise feature corresponding to the selected environmental noise model.
A computer-readable storage medium stores a program, wherein the program includes code for executing the above audio optimization method.
A computer includes a readable medium storing a computer program, wherein the computer program includes code for executing the above audio optimization method.
The invention can optimize audio data by selecting the optimal environmental noise model from multiple environmental noise models. In addition, the invention can establish a new environmental noise model depending on whether the user is satisfied with the audio optimization effect, so that as the audio optimization device is used over time, more environmental noise models become available in the device and the models fit the audio data progressively better. Furthermore, while the audio optimization device is in use, the invention can store clean data without noise and noise data with noise, and continuously optimize all environmental noise models using the clean data and noise data, so that each environmental noise model represents the noise characteristics of its environment more accurately and the optimization of audio data improves the longer the device is used.
Brief description of the drawings
The above and other objects and features of exemplary embodiments of the invention will become clearer from the following description, taken in conjunction with the accompanying drawings that illustrate the embodiments by way of example, in which:
Fig. 1 is a flowchart of an audio optimization method according to an exemplary embodiment;
Fig. 2 is a flowchart of selecting one environmental noise model from multiple environmental noise models in the adaptive selection mode according to an exemplary embodiment;
Fig. 3 is a schematic diagram of optimizing audio data in the adaptive selection mode according to an exemplary embodiment;
Fig. 4 is a flowchart of a method of training and optimizing an environmental noise model according to an exemplary embodiment;
Fig. 5 is a block diagram of an audio optimization device according to an exemplary embodiment.
Detailed description
Reference will now be made in detail to embodiments of the invention, examples of which are shown in the accompanying drawings, where like reference numerals denote like components throughout. The embodiments are described below with reference to the drawings in order to explain the invention.
When a user speaks or listens to speech, factors such as the user's accent, the language, and interference in the user's environment may disturb the voice the user sends or receives and degrade voice quality, so the voice needs to be optimized. Because the factors that interfere with voice quality can change in real time while the user sends or receives voice, continuously optimizing the voice with a single optimization approach may optimize some segments well and others poorly. A method is therefore proposed that optimizes the user's voice by selecting an optimization scheme. Hereinafter, noise in audio data may refer to any factor that makes speech unclear, such as accent, halting speech, dialect, and environmental noise.
Fig. 1 is a flowchart of an audio optimization method according to an embodiment of the present disclosure.
In step 101, multiple environmental noise models suited to different environments are obtained. When the user starts the audio optimization device for the first time after purchasing it, the device may prompt the user to configure it with multiple environmental noise models suited to different environments.
When the user uses the audio optimization device for the first time, the device may ask whether the user wants to custom-configure multiple environmental noise models. According to an embodiment of the invention, if the user does not need a custom configuration, the audio optimization device may provide multiple default environmental noise models.
According to another embodiment of the invention, the user may custom-configure the multiple environmental noise models. For example, the user may select multiple environmental noise models from all the environmental noise models stored in the audio optimization device according to the device's usage environment, scenario, and so on, as the models configured for the device. Optionally, the user may also download a desired environmental noise model from a server as one of the models configured for the device. The above examples are merely illustrative and the invention is not limited to them; the user may also obtain environmental noise models for the audio optimization device by other methods.
Each of the environmental noise models configured for the audio optimization device is suited to a different environment and has noise features that represent that environment. For example, the audio optimization device may store an English voice model, a Korean voice model, and a Chinese voice model suited to voice environments, a TV background sound model suited to indoor environments, a street background sound model suited to outdoor environments, a general-purpose standard noise model, and so on. When the user does not need to custom-configure environmental noise models, the English voice model and the standard noise model may be provided by default. Optionally, when the user chooses a custom configuration and the audio optimization device is a speaker device for home use, the user may select the Chinese voice model and the TV background sound model from all the locally stored environmental noise models. Optionally, when the user chooses a custom configuration, the user may also download from a server an environmental noise model that is not stored in the speaker device. For example, the user may download from the server a music background sound model suited to a music background environment. The above examples are merely illustrative, and the invention is not limited to them.
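The configuration choices in step 101 can be illustrated with a small sketch. The model identifiers below are made-up names for the example models mentioned above, and `downloadable` stands in for whatever the server offers; none of these names come from the source.

```python
from typing import Iterable, List, Optional, Set

# Defaults named in the example above: English voice model and standard noise model.
DEFAULT_MODELS: List[str] = ["english_voice", "standard_noise"]

# Hypothetical identifiers for the models stored locally on the device.
LOCAL_MODELS: Set[str] = {
    "english_voice", "korean_voice", "chinese_voice",
    "tv_background", "street_background", "standard_noise",
}


def configure_models(
    custom: Optional[Iterable[str]] = None,
    downloadable: Iterable[str] = (),
) -> List[str]:
    """Return the configured model list: the defaults, or the user's custom choice."""
    if not custom:
        return list(DEFAULT_MODELS)
    available = LOCAL_MODELS | set(downloadable)
    configured = []
    for name in custom:
        if name not in available:
            raise KeyError(f"unknown environmental noise model: {name}")
        configured.append(name)
    return configured
```

A home speaker might call `configure_models(["chinese_voice", "tv_background"])`, matching the example in the text.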
In step 102, audio data is obtained. The audio optimization device may record audio data input through a microphone or audio data played through a speaker.
According to an embodiment of the invention, when the user inputs voice through the microphone or plays voice through the speaker, the audio optimization device may collect data and thereby record the voice. The user may decide, in response to a prompt, whether to record voice, or may choose to have voice recorded automatically in specific scenarios. While voice is being recorded, other smart devices on the same network may also serve as recording devices and record the user's voice at the same time, to obtain more accurate and richer audio data sources.
According to another embodiment of the invention, the user may make a traditional voice call, or may make a voice call or send and receive voice messages through an application installed on the audio optimization device (for example, WeChat). According to an embodiment of the invention, when the user makes a traditional voice call, the voice the user inputs through the microphone may be captured as an analog signal. After the analog audio data is captured, it is converted to digital audio data, which is saved in a digital file format, for example WAV or PCM. In addition, after the analog audio data is converted to digital audio data, blank data in the audio data may be filtered out before the data is saved in the digital file format, to make the audio data more efficient. Furthermore, when the user makes a traditional voice call, voice played through a speaker, Bluetooth headset, or similar is recorded directly in a digital file format, for example WAV or PCM.
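Saving digitized samples as WAV after removing blank data could be sketched with Python's standard `wave` module as below. Treating "blank data" as leading and trailing silence, and the choice of 16 kHz mono 16-bit PCM, are assumptions made for illustration; the source does not specify either.

```python
import io
import struct
import wave
from typing import List, Sequence


def trim_silence(samples: Sequence[int], threshold: int = 0) -> List[int]:
    """Drop leading and trailing samples whose magnitude is at or below the threshold."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return list(samples[start:end])


def to_wav_bytes(samples: Sequence[int], sample_rate: int = 16000) -> bytes:
    """Encode signed 16-bit mono samples as an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)           # mono
        w.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()
```

A call such as `to_wav_bytes(trim_silence(samples))` would then produce the bytes to store on the device.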
According to another embodiment of the invention, when the user makes a voice call or sends and receives voice messages through an application installed on the audio optimization device (for example, WeChat), the voice the user inputs through the microphone, the voice played through the speaker, and the voice messages sent and received are all digital audio data. In this case, the user's voice in the voice call or voice message is saved directly as audio data in a digital file format, for example WAV or PCM. In addition, when the user sends and receives voice messages within an application, a text message may precede or follow a voice message; that is, the voice message is a reply to the preceding text message, or the message that replies to the voice message is a text message. In that case, the text in the adjacent text messages may be used as a label for the voice message and saved as text data in txt format.
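Labeling a voice message with its neighboring text messages can be sketched as follows. The `Message` type and the newline-joined label are hypothetical; the source only says that the adjacent text is used as a label and stored in txt format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    kind: str     # "voice" or "text"
    content: str  # file name for a voice message, body for a text message


def context_label(messages: List[Message], index: int) -> Optional[str]:
    """Return the text of the messages directly before and after a voice message, if any."""
    if messages[index].kind != "voice":
        raise ValueError("the message at this index is not a voice message")
    parts = []
    if index > 0 and messages[index - 1].kind == "text":
        parts.append(messages[index - 1].content)
    if index + 1 < len(messages) and messages[index + 1].kind == "text":
        parts.append(messages[index + 1].content)
    return "\n".join(parts) if parts else None
```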
In step 103, one environmental noise model is selected from the multiple environmental noise models for the audio data. The audio optimization device may offer several modes for selecting an environmental noise model, for example a manual selection mode and an adaptive selection mode.
According to an embodiment of the invention, the user may set the audio optimization device to the manual selection mode, in which one environmental noise model is selected manually from the multiple environmental noise models. In the manual selection mode, when the audio optimization device obtains audio data that needs to be optimized, it may receive a user input specifying one of the multiple environmental noise models and determine the model specified by the user as the selected environmental noise model. For example, in the manual selection mode, the audio optimization device may display a list of all selectable environmental noise models, and the user may specify a desired model from the displayed list by touch screen, key press, voice input, or similar. Optionally, in the manual selection mode, the audio optimization device may recommend an environmental noise model to the user according to the device's current environment, for example by highlighting the recommended model at the top of the list; the user may specify the recommended model or any other model. The above examples are merely illustrative, and the invention is not limited to them.
According to another embodiment of the invention, the user may set the audio optimization device to the adaptive selection mode, in which one environmental noise model is selected adaptively from the multiple environmental noise models. In the adaptive selection mode, the audio optimization device may pre-optimize the audio data using each of the multiple environmental noise models and select one model according to the pre-optimization results. The process of optimizing audio data in the adaptive selection mode is described in detail below with reference to Fig. 2.
In step 104, after one environmental noise model has been selected from the multiple environmental noise models, the audio data is optimized using the selected environmental noise model. According to an embodiment of the invention, the optimized audio data may be obtained by subtracting the spectrum of the noise feature corresponding to the selected environmental noise model from the spectrum of the audio data to be optimized, and the optimized audio data is provided to the user in place of the original, unoptimized audio data. Besides subtracting the spectrum of the noise feature from the spectrum of the audio data, those skilled in the art may also use other methods from the field of audio optimization to optimize the audio data.
Fig. 2 is a flow chart of selecting one environmental noise model from multiple environmental noise models in the adaptive selection mode according to an exemplary embodiment.
In step 201, audio data of a predetermined length is extracted from the obtained audio data.
According to an embodiment of the invention, if the audio data is audio data of a real-time voice call that the user is conducting with the audio optimization equipment, audio data of a predetermined duration, starting from the beginning of the audio data, can be extracted from the audio data. For example, the first 5 seconds of audio data can be extracted. The above example is merely exemplary, and the invention is not limited thereto.
According to another embodiment of the present invention, if the audio data is a voice message sent or received by the user within an application, audio data of a predetermined duration, starting from the beginning of the audio data, can be extracted from the audio data. For example, the first 5 seconds of audio data can be extracted, or audio data starting from the beginning of the audio data and having a length equal to 1/10 of the length of the whole audio data can be extracted.
According to another embodiment of the present invention, after the audio data of the predetermined length has been extracted, it can be processed so as to convert it into a digital file format. For example, if the digital file format in which the audio optimization equipment stores audio data is the WAV format, the extracted audio data of the predetermined length can be converted to WAV format. In addition, the extracted audio data of the predetermined length can also undergo processing such as sampling and adjustment of encoding parameters.
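The two extraction rules mentioned above (a fixed leading duration, or a fixed fraction of the whole message) can be sketched as follows. This is a minimal illustration under the assumption that the audio data is held as a sequence of samples with a known sample rate; the function `pre_optimization_clip` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def pre_optimization_clip(samples: Sequence[float],
                          sample_rate: int,
                          seconds: float = 5.0,
                          fraction: Optional[float] = None) -> Sequence[float]:
    """Return the leading segment of `samples` used for pre-optimization.

    If `fraction` is given (for example 0.1 for 1/10 of the message), it
    takes precedence over the fixed duration in `seconds`.
    """
    if fraction is not None:
        count = int(len(samples) * fraction)
    else:
        count = int(seconds * sample_rate)
    # Never request more samples than the audio data actually contains.
    return samples[:min(count, len(samples))]


# Illustrative 10-second signal at a toy sample rate of 10 samples per second.
signal = list(range(100))
first_five_seconds = pre_optimization_clip(signal, sample_rate=10)
first_tenth = pre_optimization_clip(signal, sample_rate=10, fraction=0.1)
```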
In step 202, the extracted audio data of the predetermined length is optimized using each of the multiple environmental noise models separately, and multiple optimization results are obtained. According to an embodiment of the invention, the extracted audio data of the predetermined length can be optimized by subtracting the spectrum of the noise characteristic of an environmental noise model from the spectrum of the extracted audio data. After the extracted audio data has been optimized with each environmental noise model separately, an optimization result corresponding to each environmental noise model is obtained.
In step 203, the multiple optimization results are evaluated separately. According to an embodiment of the invention, after the multiple optimization results (that is, the multiple optimized versions of the audio data of the predetermined length) have been obtained, the signal-to-noise ratio (SNR), the perceptual evaluation of speech quality (PESQ) score, and the segmental SNR of each optimization result can be calculated, and then the weighted average of the SNR, PESQ, and segmental SNR of each optimization result can be calculated. For example, the weight ratio of SNR, PESQ, and segmental SNR can be 1:1:1 or 2:1:1, but the invention is not limited thereto; the user can also set the weight ratio according to the optimization effect obtained on the audio data or the desired optimization effect. After the evaluation of the multiple optimization results is complete, the environmental noise model corresponding to the optimization result with the highest weighted average can be determined as the environmental noise model with the best evaluation result.
In step 204, the environmental noise model with the best evaluation result is selected from the multiple environmental noise models. According to an embodiment of the invention, when the weighted average of a first environmental noise model among the multiple environmental noise models is greater than the weighted averages of the other environmental noise models, the first environmental noise model is selected from the multiple environmental noise models. For example, suppose that audio optimization equipment used in an indoor environment has an English voice model, a Chinese voice model, a TV background sound model, and a music background sound model. If, when the audio data of the user's voice call is pre-optimized with each environmental noise model, it is determined that the weighted average of SNR, PESQ, and segmental SNR is highest for the extracted audio data optimized with the TV background sound model, the TV background sound model is selected from all the environmental noise models.
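Steps 203 and 204 can be illustrated with the following minimal sketch. It assumes that the SNR, PESQ, and segmental SNR of each optimization result have already been computed by some other means; the metric values below are invented for illustration only, and the names `weighted_score` and `select_best_model` are hypothetical. The embodiment does not say whether the three metrics are normalized to a common scale before averaging, so none is applied here.

```python
from typing import Dict, Sequence, Tuple

# (SNR, PESQ, segmental SNR) for one optimization result.
Metrics = Tuple[float, float, float]


def weighted_score(metrics: Metrics,
                   weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted average of SNR, PESQ and segmental SNR."""
    return sum(m * w for m, w in zip(metrics, weights)) / sum(weights)


def select_best_model(results: Dict[str, Metrics],
                      weights: Sequence[float] = (1.0, 1.0, 1.0)) -> str:
    """Return the name of the model whose result has the highest score."""
    return max(results, key=lambda name: weighted_score(results[name], weights))


# Made-up evaluation results for the four models of the indoor example.
results = {
    "english_voice": (10.0, 2.0, 9.0),
    "chinese_voice": (11.0, 2.0, 8.0),
    "tv_background": (16.0, 3.0, 14.0),
    "music_background": (12.0, 2.5, 9.5),
}
best = select_best_model(results)                      # weight ratio 1:1:1
best_weighted = select_best_model(results, (2, 1, 1))  # weight ratio 2:1:1
```

With these illustrative numbers the TV background sound model scores highest under both weight ratios, mirroring the indoor example in the text.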
In addition, according to a preferred embodiment of the present invention, in the adaptive selection mode, after acquisition of the audio data has started, the audio data can be pre-optimized once every predetermined time interval, and one environmental noise model can be selected from the multiple environmental noise models according to the pre-optimization result. The selected environmental noise model is then used to optimize the audio data from the moment that model is selected until the moment the next environmental noise model is selected. This process is described in detail below with reference to Fig. 3.
Fig. 3 is a schematic diagram of optimizing audio data in the adaptive selection mode according to an exemplary embodiment.
Referring to Fig. 3, starting from the beginning of the audio data, 5 seconds of audio data can be extracted from the audio data every 30 seconds. That is, the audio data from the 0th second to the 5th second can be extracted starting at the 0th second, the audio data from the 30th second to the 35th second can be extracted starting at the 30th second, and the audio data from the 60th second to the 65th second can be extracted starting at the 60th second.
As shown in Fig. 3, within 3 seconds after each 5-second segment of audio data has been extracted, the extracted 5-second segment can be optimized using each of the multiple environmental noise models to obtain multiple optimization results, and the multiple optimization results can be evaluated. That is, optimization of the audio data extracted from the 0th second to the 5th second can start at the 5th second, and the evaluation result of each optimization result can be obtained at the 8th second; optimization of the audio data extracted from the 30th second to the 35th second can start at the 35th second, and the evaluation results can be obtained at the 38th second; optimization of the audio data extracted from the 60th second to the 65th second can start at the 65th second, and the evaluation results can be obtained at the 68th second.
Then, as shown in Fig. 3, at the 8th second the evaluation results of the optimization results obtained by optimizing the audio data from the 0th second to the 5th second with each environmental noise model are available, the environmental noise model corresponding to the optimization result with the best evaluation result can be selected from the multiple environmental noise models, and the selected environmental noise model can be used to optimize the audio data from the start of the 8th second to the end of the 37th second. Likewise, at the 38th second the evaluation results for the audio data from the 30th second to the 35th second are available, the environmental noise model corresponding to the optimization result with the best evaluation result can be selected, and the selected environmental noise model can be used to optimize the audio data from the start of the 38th second to the end of the 67th second. The above example is merely exemplary, and the invention is not limited thereto. In addition, while the audio data is being optimized, the optimized audio data can be provided to the user in real time.
In addition, the description given with reference to Fig. 3 is merely exemplary, and the invention is not limited thereto.
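The timing described with reference to Fig. 3 can be reproduced with a short sketch. The values 30, 5, and 3 seconds are taken from the example above; the function `selection_schedule` and its tuple layout are hypothetical and serve only to make the timeline explicit.

```python
from typing import List, Tuple


def selection_schedule(total_seconds: int,
                       interval: int = 30,
                       clip: int = 5,
                       eval_time: int = 3) -> List[Tuple[int, int, int, int]]:
    """List (clip_start, clip_end, apply_from, apply_until) in seconds.

    `apply_until` is exclusive: the model selected at `apply_from` is used
    until the next model is selected or the audio data ends.
    """
    windows = []
    start = 0
    while start + clip <= total_seconds:
        apply_from = start + clip + eval_time
        if apply_from >= total_seconds:
            break  # the audio ends before the evaluation would finish
        apply_until = min(apply_from + interval, total_seconds)
        windows.append((start, start + clip, apply_from, apply_until))
        start += interval
    return windows


for window in selection_schedule(90):
    print(window)
```

For 90 seconds of audio data this yields the windows 0 to 5, 30 to 35, and 60 to 65 seconds, with the selected models applied from the 8th, 38th, and 68th second respectively.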
The method of optimizing audio data in the adaptive selection mode has been described above with reference to Fig. 2 and Fig. 3. According to another embodiment of the present invention, in the manual selection mode, the user can manually change the environmental noise model used to optimize the audio data in response to a change in the environment in which the audio optimization equipment is located, that is, a change in the environment in which the audio data is obtained. Alternatively, after the audio data has been optimized for a period of time using a manually selected environmental noise model, the user can change the environmental noise model used to optimize the audio data according to the optimization effect. In addition, according to another embodiment of the present invention, the user can switch from the manual selection mode to the adaptive selection mode while the audio is being optimized, and the audio data then continues to be optimized according to the method for the adaptive selection mode described above.
In addition, in order to improve the optimization effect of the environmental noise models on audio data, clean data without noise and noise data with noise can be used to optimize each environmental noise model, so that each environmental noise model expresses the noise characteristic of its specific environment more accurately. The method of optimizing an environmental noise model is described in detail below with reference to Fig. 4.
Fig. 4 is a flow chart of a method of optimizing an environmental noise model according to an exemplary embodiment.
In step 401, after audio data is obtained, it is determined whether the obtained audio data needs to be optimized. When the audio optimization equipment obtains audio data, it can determine according to the quality of the audio data whether the audio data needs to be optimized. For example, when the user considers the audio quality to be good, the user may choose not to optimize the audio data. When the user considers the audio quality to be poor, the user may choose to optimize the audio data to improve its audio quality. In addition, the user can also manually specify whether the obtained audio data needs to be optimized. Optionally, before starting a voice call, the user can specify that the upcoming voice call needs to be optimized. Optionally, the user can preset that audio data obtained by the audio optimization equipment in a specific environment does not need to be optimized by default, or that audio data obtained in a specific environment needs to be optimized by default. The above examples are merely exemplary, and the invention is not limited thereto.
In step 402, if it is determined that the obtained audio data does not need to be optimized, that is, the obtained audio data is regarded as clean data with no noise interference or with little noise interference, the obtained audio data can be added to the clean data set in the training sample database preset in the audio optimization equipment, as a training sample of clean audio data.
In step 403, if it is determined that the obtained audio data needs to be optimized, then after the audio data has been optimized using the environmental noise model selected from the multiple environmental noise models, the original audio data corresponding to the optimized audio data (that is, the noisy audio data before optimization) can be added to the noise data set in the training sample database that corresponds to the selected environmental noise model, as a sample of noisy audio data.
According to an embodiment of the invention, in the manual selection mode, the audio optimization equipment optimizes the audio data using the environmental noise model selected by the user, and after the optimization of the audio data is complete, the user can manually specify whether the original audio data (that is, the noisy audio data before optimization) is to be added to the noise data set in the training sample database that corresponds to the manually selected environmental noise model. Optionally, after the audio optimization equipment has optimized the audio data using the environmental noise model selected by the user, it can ask the user whether the optimization result is satisfactory; if the user indicates satisfaction, the original audio data (that is, the noisy audio data before optimization) can by default be added to the noise data set in the training sample database that corresponds to the manually selected environmental noise model. The above examples are merely exemplary, and the invention is not limited thereto.
According to another embodiment of the present invention, in the adaptive selection mode, after the audio data has been optimized using the environmental noise model selected from the multiple environmental noise models, the original audio data corresponding to the audio data optimized with the selected environmental noise model (that is, the noisy audio data before optimization) is added to the noise data set in the training sample database that corresponds to that environmental noise model. For example, the audio data from the 8th second to the 37th second shown in Fig. 3 can be added to the noise data set in the training sample database that corresponds to the environmental noise model determined at the 8th second to have the best evaluation result, and the audio data from the 38th second to the 67th second shown in Fig. 3 can be added to the noise data set that corresponds to the environmental noise model determined at the 38th second to have the best evaluation result.
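The routing of samples in steps 402 and 403 can be sketched as a small in-memory structure. This is only an illustration under the assumption that one clean data set and one noise data set per environmental noise model are kept; the class `TrainingSampleDB`, its method names, and the string placeholders standing in for audio data are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional


class TrainingSampleDB:
    """Toy training sample database: one clean set, one noise set per model."""

    def __init__(self) -> None:
        self.clean: List[str] = []
        self.noisy: DefaultDict[str, List[str]] = defaultdict(list)

    def add(self, original_audio: str, needs_optimization: bool,
            model_name: Optional[str] = None) -> None:
        if not needs_optimization:
            # Step 402: audio that needs no optimization is treated as clean.
            self.clean.append(original_audio)
            return
        if model_name is None:
            raise ValueError("a noisy sample must name the model used on it")
        # Step 403: store the original (pre-optimization) audio under the
        # environmental noise model that was selected to optimize it.
        self.noisy[model_name].append(original_audio)


db = TrainingSampleDB()
db.add("call_quiet_room", needs_optimization=False)
db.add("call_seconds_8_to_37", needs_optimization=True, model_name="tv_background")
db.add("call_seconds_38_to_67", needs_optimization=True, model_name="music_background")
```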
According to another embodiment of the present invention, in the adaptive selection mode, after one environmental noise model has been selected from the multiple environmental noise models, it can be determined whether the selected environmental noise model satisfies a predetermined condition. According to an embodiment of the invention, when the audio data of the predetermined length is extracted from the audio data, the weighted average of the SNR, PESQ, and segmental SNR of the extracted audio data can be calculated. After the environmental noise model with the best evaluation result has been selected from the multiple environmental noise models according to the calculated weighted averages, the weighted average of the SNR, PESQ, and segmental SNR of the optimization result obtained by optimizing the extracted audio data with that model is compared with the weighted average of the SNR, PESQ, and segmental SNR of the extracted audio data itself. If the comparison result is greater than or equal to a predetermined value, for example 10%, it is confirmed that the selected environmental noise model satisfies the predetermined condition. If the comparison result is less than the predetermined value, for example 10%, it is confirmed that the selected environmental noise model does not satisfy the predetermined condition. The above example is merely exemplary, and the invention is not limited thereto.
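The predetermined-condition check can be illustrated as follows. The text does not define how the "comparison result" is computed; the sketch assumes it is the relative improvement of the weighted average after optimization over the weighted average before optimization, which is one plausible reading of the 10% example. The function name and parameters are hypothetical.

```python
def meets_predetermined_condition(score_before: float,
                                  score_after: float,
                                  threshold: float = 0.10) -> bool:
    """Check whether optimization improved the weighted score enough.

    Assumption: the comparison result is the relative improvement
    (score_after - score_before) / score_before.
    """
    if score_before <= 0:
        raise ValueError("score_before must be positive for a relative comparison")
    improvement = (score_after - score_before) / score_before
    return improvement >= threshold


# Illustrative scores: a 15% improvement passes, a 5% improvement does not.
good_fit = meets_predetermined_condition(10.0, 11.5)
poor_fit = meets_predetermined_condition(10.0, 10.5)
```

A model that fails this check is the case in which the equipment may prompt the user to establish a new environmental noise model.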
If the selected environmental noise model satisfies the predetermined condition, then after the audio data has been optimized using the selected environmental noise model, the original audio data corresponding to the audio data optimized with that environmental noise model (that is, the noisy audio data before optimization) is added to the noise data set in the training sample database that corresponds to that environmental noise model. If the selected environmental noise model does not satisfy the predetermined condition, then after the optimization of the audio data is complete, the audio optimization equipment can ask the user whether to establish a new environmental noise model, and the user may choose to establish or not to establish a new environmental noise model.
For example, when the user considers that the audio data optimized with the environmental noise model that does not satisfy the predetermined condition still meets the user's needs, the user may choose not to establish a new environmental noise model. If the user chooses not to establish a new environmental noise model, the original audio data corresponding to the audio data optimized with that environmental noise model (that is, the noisy audio data before optimization) can be added to the noise data set in the training sample database that corresponds to that environmental noise model. If the user chooses to establish a new environmental noise model, a new environmental noise model and a noise data set corresponding to the new environmental noise model can be established, and the original audio data corresponding to the audio data optimized with the selected environmental noise model (that is, the noisy audio data before optimization) can be added to the noise data set in the training sample database that corresponds to the newly established environmental noise model, rather than to the noise data set that corresponds to the selected environmental noise model. In addition, if the noise data set of the newly established environmental noise model contains little audio data, then when the user is again in the environment corresponding to the newly established environmental noise model, the audio optimization equipment can prompt the user to record a segment of audio data in the current environment and add it to the noise data set in the training sample database that corresponds to the newly established environmental noise model, so as to enrich that noise data set and facilitate subsequent training and optimization of the newly established environmental noise model.
In addition, according to another embodiment of the present invention, in either the manual selection mode or the adaptive selection mode, if the user is dissatisfied with the optimized audio data, the user can actively establish a new environmental noise model, and the original noisy audio data is added to the noise data set in the training sample database that corresponds to the newly established environmental noise model.
According to another embodiment of the present invention, if the clean data set contains little clean data, optimized audio data can be added to the clean data set according to a user instruction or a system setting. For example, if the clean data set contains little data, then after the audio data of a voice call has been optimized, the audio optimization equipment can ask the user whether the optimized audio data is satisfactory; when the user indicates satisfaction, it can further ask whether to add the optimized audio data to the clean data set, and add it according to the user's choice. Optionally, the audio optimization equipment can automatically add the optimized audio data to the clean data set after the user has confirmed that the optimized audio data is satisfactory. Optionally, after a segment of audio data has been optimized, the audio optimization equipment can directly ask whether to add the optimized audio data to the clean data set, and add it according to the user's choice.
In step 404, the noise characteristic corresponding to the selected environmental noise model is optimized based on the audio data in the clean data set and the audio data in the noise data set corresponding to the selected environmental noise model.
According to an embodiment of the invention, the audio optimization equipment can be set by default to optimize the environmental noise models while the equipment is idle, for example between 2:00 and 6:00 at night, or when the audio optimization equipment has not been used for more than a certain time, for example 30 minutes. In addition, the audio optimization equipment can also be set by default to optimize the environmental noise models only when the accumulated duration of the audio data in the clean data set exceeds, for example, 30 minutes. The above examples are merely exemplary, and the invention is not limited thereto.
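The default triggers described above can be combined in a simple check. The text presents the idle-time rule and the clean-data rule as separate default settings; requiring both at once, as the sketch does, is an assumption, as are the function name and parameters. The thresholds reuse the example values of 2:00 to 6:00 and 30 minutes.

```python
def should_optimize_models(hour: int,
                           idle_minutes: float,
                           clean_minutes: float,
                           idle_threshold: float = 30.0,
                           clean_threshold: float = 30.0) -> bool:
    """Decide whether to start optimizing the environmental noise models.

    hour          -- current local hour, 0 to 23
    idle_minutes  -- time since the equipment was last used
    clean_minutes -- accumulated duration of the clean data set
    """
    in_night_window = 2 <= hour < 6
    device_idle = in_night_window or idle_minutes > idle_threshold
    enough_clean_data = clean_minutes > clean_threshold
    return device_idle and enough_clean_data


night_run = should_optimize_models(hour=3, idle_minutes=0, clean_minutes=45)
busy_afternoon = should_optimize_models(hour=14, idle_minutes=10, clean_minutes=45)
too_little_data = should_optimize_models(hour=14, idle_minutes=40, clean_minutes=20)
```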
According to another embodiment of the present invention, when the data in the clean data set has changed, then while the audio optimization equipment is idle, if the accumulated duration of the audio data in the clean data set exceeds 30 minutes, the noise data set corresponding to each environmental noise model can be compared with the clean data set: the audio features of the audio data in each noise data set are trained against the audio features of the audio data in the clean data set, and the difference between the audio features of the audio data in the noise data set of each environmental noise model and the audio features of the audio data in the clean data set is determined as the noise characteristic corresponding to that environmental noise model. Alternatively, when the audio data in the clean data set has not changed, but the audio data in the noise data set corresponding to a particular environmental noise model among the multiple environmental noise models has changed, then while the audio optimization equipment is idle, if the accumulated duration of the audio data in the clean data set exceeds 30 minutes, the audio features of the audio data in the changed noise data set can be trained against the audio features of the audio data in the clean data set, and the difference between the audio features of the audio data in the noise data set of the particular environmental noise model and the audio features of the audio data in the clean data set is determined as the noise characteristic corresponding to the particular environmental noise model. In addition to the above method of optimizing an environmental noise model, those skilled in the art may also use other methods to optimize an environmental noise model.
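A minimal sketch of deriving a noise characteristic as a feature difference is given below. The text does not specify the audio features or the training procedure; the sketch assumes that each audio sample is represented by a fixed-length feature vector (for example, an averaged magnitude spectrum) and that the noise characteristic is the per-dimension difference between the mean noisy feature and the mean clean feature, clamped at zero. All names are hypothetical.

```python
from typing import List, Sequence


def mean_feature(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Per-dimension mean of equally long feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def noise_characteristic(noisy_features: Sequence[Sequence[float]],
                         clean_features: Sequence[Sequence[float]]) -> List[float]:
    """Estimate a noise characteristic as mean(noisy) - mean(clean).

    Negative differences are clamped to zero (an assumption made here so
    the result can be used directly as a magnitude spectrum).
    """
    noisy_mean = mean_feature(noisy_features)
    clean_mean = mean_feature(clean_features)
    return [max(n - c, 0.0) for n, c in zip(noisy_mean, clean_mean)]


# Illustrative two-dimensional features for one environmental noise model.
noisy_set = [[1.0, 2.0], [3.0, 4.0]]
clean_set = [[1.0, 1.0], [1.0, 3.0]]
characteristic = noise_characteristic(noisy_set, clean_set)
```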
With the audio optimization method described above, the optimal environmental noise model can be selected from multiple environmental noise models to optimize audio data. In addition, a new environmental noise model can be established depending on whether the user is satisfied with the audio optimization effect, so that as the audio optimization equipment is used over time, more environmental noise models become available in the equipment and the environmental noise models become increasingly well suited to the audio data. Furthermore, while the audio optimization equipment continues to be used, clean data without noise and noise data with noise can be stored, and all environmental noise models can be optimized continually using the clean data and the noise data, so that each environmental noise model expresses the noise characteristic of its environment more accurately and the optimization effect on audio data improves as the audio optimization equipment is used over time.
Fig. 5 is a block diagram of audio optimization equipment according to an exemplary embodiment.
The audio optimization equipment 500 may include a model obtaining module 501, a data acquisition module 502, a model selection module 503, and an audio optimization module 504.
The model obtaining module 501 can obtain multiple environmental noise models suitable for different environments. When the user starts the audio optimization equipment 500 for the first time after purchasing it, the model obtaining module 501 can prompt the user to configure, for the audio optimization equipment 500, multiple environmental noise models suitable for different environments.
When the user uses the audio optimization equipment 500 for the first time, the model obtaining module 501 can ask the user whether the multiple environmental noise models need to be custom-configured. According to an embodiment of the invention, if the user does not need a custom configuration, the model obtaining module 501 can provide multiple default environmental noise models.
According to another embodiment of the present invention, if the user needs to custom-configure the multiple environmental noise models, the model obtaining module 501 can determine the environmental noise models selected by the user as the environmental noise models configured for the audio optimization equipment 500. For example, according to the usage environment, scenario, and the like of the audio optimization equipment 500, the user can select multiple environmental noise models from all the environmental noise models stored in the audio optimization equipment 500 as the multiple environmental noise models configured for the audio optimization equipment 500. Optionally, the user can also download a desired environmental noise model from a server as one of the multiple environmental noise models configured for the audio optimization equipment 500. The above examples are merely exemplary, and the invention is not limited thereto; the user can also obtain environmental noise models for the audio optimization equipment 500 by other methods.
Each of the multiple environmental noise models configured for the audio optimization equipment is suitable for a different environment and has a noise characteristic representing that environment. For example, the audio optimization equipment 500 can store an English voice model, a Korean voice model, and a Chinese voice model suitable for voice environments, a TV background sound model suitable for indoor environments, a street background sound model suitable for outdoor environments, a general standard noise model, and so on. When the user does not need to custom-configure the environmental noise models, the model obtaining module 501 can provide the English voice model and the standard noise model by default. Optionally, when the user chooses to custom-configure the environmental noise models, if the audio optimization equipment 500 is a speaker device intended for home use, the user can select the Chinese voice model and the TV background sound model from all the locally stored environmental noise models. Optionally, when the user chooses to custom-configure the environmental noise models, the user can also download from a server an environmental noise model that is not stored in the audio optimization equipment. For example, the user can download from a server a music background sound model suitable for a music background environment. After the user has selected the environmental noise models, the model obtaining module 501 can determine the environmental noise models selected by the user as the multiple environmental noise models configured for the audio optimization equipment 500. The above examples are merely exemplary, and the invention is not limited thereto.
The data acquisition module 502 can obtain audio data. The data acquisition module 502 can record audio data input through a microphone or audio data played through a loudspeaker. The microphone and the loudspeaker are not shown in Fig. 5.
According to an embodiment of the invention, when the user inputs voice through the microphone or plays voice through the loudspeaker, the data acquisition module 502 can collect the data and thereby record the voice. The user can choose, in response to a prompt, whether to record the voice, or can choose to have the voice recorded automatically in specific scenarios. While the voice is being recorded, other devices on the same network as the smart device can also serve as recording devices and record the user's voice at the same time, so as to obtain more accurate and richer sources of audio data.
According to another embodiment of the present invention, the user can make a traditional voice call, or can make a voice call or send and receive voice messages through an application (for example, WeChat) installed in the audio optimization equipment 500. According to an embodiment of the invention, when the user makes a traditional voice call, the data acquisition module 502 can collect the voice input by the user through the microphone as an analog signal, convert the collected analog audio data into digital audio data, and save the digital audio data in a digital file format, for example WAV format or PCM format. In addition, after the analog audio data has been converted into digital audio data, blank data in the audio data can be filtered out before the audio data is saved in the digital file format, so as to increase the proportion of useful audio data. In addition, when the user makes a traditional voice call, the data acquisition module 502 can directly record the voice played through a loudspeaker, a Bluetooth headset, or the like in a digital file format, for example WAV format or PCM format.
According to another embodiment of the present invention, when the user makes a voice call or sends and receives voice messages through an application (for example, WeChat) installed in the audio optimization equipment 500, the voice input by the user through the microphone, the voice played through the loudspeaker, and the voice messages sent and received are all digital audio data. In this case, the data acquisition module 502 directly saves the user's voice in the voice call or voice message as audio data in a digital file format, for example WAV format or PCM format. In addition, when the user sends and receives voice messages within an application, if a text message exists before or after a voice message, that is, the voice message is a response to a preceding text message, or the next message responding to the voice message is a text message, the data acquisition module 502 can use the text information in the text messages surrounding the voice message as a label for that voice message, and save the text information as text data in TXT format.
Model selection module 503 can select an ambient noise mould for audio data from multiple environmental noise models Type.Audio optimization equipment 500 can have the mode of multiple choices environmental noise model, for example, manual selection modes and adaptive Selection mode.
According to an embodiment of the invention, user can set audio optimization equipment 500 to manually from multiple ambient noises The manual selection modes of an environmental noise model are selected in model.In the manual selection mode, when data acquisition module 502 obtains It gets when needing optimised audio data, Model selection module 503 can receive user to one in multiple environmental noise models The specified input of a environmental noise model, and the environmental noise model that user is specified is determined as the ambient noise mould of selection Type.For example, in the manual selection mode, audio optimization equipment 500 can show the list of all optional environmental noise models, use Family can specify a desired ambient noise mould by touching the modes such as screen, key or voice input from the list of display Type.Optionally, in a manual mode, Model selection module 503 can be that user recommends an environment to make an uproar according to the environment locating for it Acoustic model, for example, the environmental noise model of recommendation is highlighted on first of list, user may specify that the environment of recommendation is made an uproar Acoustic model, or may specify other environmental noise models.The environmental noise model that Model selection module 503 can specify user is true It is set to the environmental noise model of user's selection.Above example is merely exemplary, and the invention is not limited thereto.
According to another embodiment of the present invention, the user may set the audio optimization equipment 500 to the adaptive selection mode, in which one environmental noise model is selected adaptively from the multiple environmental noise models. In the adaptive selection mode, the model selection module 503 may perform pre-optimization on the audio data using each of the multiple environmental noise models separately, and select one environmental noise model from the multiple environmental noise models according to the pre-optimization results.
According to an embodiment of the present invention, in the adaptive selection mode, the model selection module 503 may extract audio data of a predetermined length from the acquired audio data, optimize the extracted audio data using each of the multiple environmental noise models separately to obtain multiple optimization results corresponding respectively to the multiple environmental noise models, then evaluate each of the multiple optimization results, and select from the multiple environmental noise models the environmental noise model with the best evaluation result.
According to an embodiment of the present invention, if the audio data is audio data of a real-time voice call that the user conducts using the audio optimization equipment 500, the model selection module 503 may extract, from the audio data, audio data of a predetermined time length counted from the moment the audio data starts. For example, the model selection module 503 may extract the first 5 seconds of the audio data. The above example is merely illustrative, and the present invention is not limited thereto.
According to another embodiment of the present invention, if the audio data is a voice message that the user sends or receives within an application, the model selection module 503 may likewise extract, from the audio data, audio data of a predetermined time length counted from the moment the audio data starts. For example, the model selection module 503 may extract the first 5 seconds of the audio data, or may extract, counted from the moment the audio data starts, audio data whose length is 1/10 of the length of the whole audio data.
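The two extraction rules mentioned above (a fixed leading duration, or a fixed fraction of the whole recording) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `extract_leading_excerpt` and its parameters are hypothetical, and the audio is assumed to be a plain sequence of samples at a known sample rate.

```python
def extract_leading_excerpt(samples, sample_rate, seconds=5.0, fraction=None):
    """Return the leading excerpt used for pre-optimization.

    If `fraction` is given (for example 0.1 for one tenth of a voice
    message), the excerpt length is that fraction of the whole recording.
    Otherwise the excerpt covers the first `seconds` seconds.
    All names here are illustrative assumptions.
    """
    if fraction is not None:
        count = int(len(samples) * fraction)
    else:
        count = int(sample_rate * seconds)
    # Never read past the end of a short recording.
    return samples[:min(count, len(samples))]


# Hypothetical 10-second recording at 16 kHz.
recording = list(range(160000))
first_five_seconds = extract_leading_excerpt(recording, 16000)
one_tenth = extract_leading_excerpt(recording, 16000, fraction=0.1)
```

Clamping to the recording length is a design choice of this sketch so that messages shorter than 5 seconds are handled without error.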
According to another embodiment of the present invention, after extracting the audio data of the predetermined length from the audio data, the model selection module 503 may process the extracted audio data so as to convert it into a digital file format. For example, if the digital file format in which the audio optimization equipment 500 stores audio data is WAV, the model selection module 503 may convert the extracted audio data of the predetermined length into WAV format. In addition, the model selection module 503 may also perform processing such as sampling and adjusting coding parameters on the extracted audio data of the predetermined length.
After extracting the audio data of the predetermined length from the acquired audio data, the model selection module 503 may optimize the extracted audio data using each of the multiple environmental noise models separately and obtain multiple optimization results. According to an embodiment of the present invention, the model selection module 503 may optimize the extracted audio data of the predetermined length by subtracting the spectrum of the noise feature of an environmental noise model from the spectrum of the extracted audio data. After optimizing the extracted audio data with each environmental noise model separately, the model selection module 503 obtains one optimization result for each environmental noise model.
After obtaining the optimization result corresponding to each environmental noise model, the model selection module 503 may evaluate the multiple optimization results separately. According to an embodiment of the present invention, after obtaining the multiple optimization results (that is, multiple optimized versions of the audio data of the predetermined length), the model selection module 503 may calculate the signal-to-noise ratio (SNR), the perceptual evaluation of speech quality (PESQ) score, and the segmental SNR of each optimization result, and then calculate the weighted average of the SNR, PESQ score, and segmental SNR of each optimization result. For example, the weight ratio of SNR, PESQ, and segmental SNR may be 1:1:1 or 2:1:1. However, the present invention is not limited thereto, and the user may also set the weight ratio according to the observed optimization effect or the desired effect on the audio data. After the evaluation of the multiple optimization results is complete, the model selection module 503 may determine the environmental noise model corresponding to the optimization result with the highest weighted average as the environmental noise model with the best evaluation result, and select that environmental noise model from the multiple environmental noise models.
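The evaluation step above can be illustrated with a short sketch. It assumes the three metrics have already been computed for each optimization result (computing PESQ itself requires an implementation of ITU-T P.862, which is outside this sketch). The function names and the model names `street`, `office`, and `subway` are hypothetical.

```python
def weighted_score(snr, pesq, seg_snr, weights=(1.0, 1.0, 1.0)):
    """Weighted average of SNR, PESQ score, and segmental SNR.

    `weights` is the SNR:PESQ:segmental-SNR ratio, for example
    (1, 1, 1) or (2, 1, 1) as mentioned in the text.
    """
    w_snr, w_pesq, w_seg = weights
    total = w_snr + w_pesq + w_seg
    return (w_snr * snr + w_pesq * pesq + w_seg * seg_snr) / total


def select_best_model(results, weights=(1.0, 1.0, 1.0)):
    """Pick the model whose optimization result scores highest.

    `results` maps a model name to its precomputed
    (snr, pesq, seg_snr) tuple; the names are illustrative.
    """
    return max(
        results,
        key=lambda name: weighted_score(*results[name], weights=weights),
    )


# Hypothetical metric values for three pre-optimization results.
pre_optimization_results = {
    "street": (12.0, 2.5, 9.5),
    "office": (15.0, 3.0, 12.0),
    "subway": (10.0, 2.0, 8.0),
}
best = select_best_model(pre_optimization_results)
```

Note that the three metrics live on different scales (SNR in dB, PESQ roughly between 1 and 4.5), so in practice the weight ratio also controls how strongly each scale dominates the average.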
In addition, according to a preferred embodiment of the present invention, in the adaptive selection mode, after acquisition of the audio data starts, the model selection module 503 may perform pre-optimization on the audio data once every predetermined time interval and select one environmental noise model from the multiple environmental noise models according to the pre-optimization result. The selected environmental noise model is then used to optimize the portion of the audio data that runs from the moment that environmental noise model is selected until the moment an environmental noise model is next selected. Since this process has been described in detail above with reference to Fig. 3, the description is not repeated here.
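The periodic reselection described above means that every moment of the audio is optimized with the model chosen at the most recent selection time. A minimal sketch of that lookup follows. The selection times 8 s and 38 s come from the Fig. 3 example in this description; the third time (68 s) and the model names are assumptions added for illustration.

```python
import bisect


def model_for_time(selection_times, selected_models, t):
    """Return the model chosen at the latest selection time <= t.

    `selection_times` must be sorted in ascending order and aligned
    with `selected_models`. Returns None for audio that precedes the
    first selection.
    """
    if len(selection_times) != len(selected_models):
        raise ValueError("each selection time needs exactly one model")
    index = bisect.bisect_right(selection_times, t) - 1
    return selected_models[index] if index >= 0 else None


# Selections at the 8th and 38th seconds follow the Fig. 3 example;
# the 68th second and the model names are hypothetical.
selection_times = [8, 38, 68]
selected_models = ["office", "street", "office"]
```

With these values, audio at the 20th second is handled by the model chosen at the 8th second, and audio at the 38th second by the model chosen at the 38th second.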
After the model selection module 503 selects one environmental noise model from the multiple environmental noise models, the audio optimization module 504 may optimize the audio data using the selected environmental noise model. According to an embodiment of the present invention, when optimizing the audio data, the audio optimization module 504 may subtract the spectrum of the noise feature corresponding to the selected environmental noise model from the spectrum of the audio data to be optimized, thereby obtaining optimized audio data, and provide the optimized audio data to the user in place of the original, unoptimized audio data. In addition to optimizing audio data by subtracting the spectrum of the noise feature from the spectrum of the audio data, those skilled in the art may also optimize the audio data using other methods in the field of audio optimization.
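The spectrum subtraction described above can be sketched on a single frame of magnitude spectra. This is an illustrative simplification: the source does not specify how the spectra are computed or what happens when the noise estimate exceeds the signal, so the per-bin list representation and the clamping at a floor value are assumptions of this sketch.

```python
def spectral_subtract(magnitudes, noise_profile, floor=0.0):
    """Subtract a noise magnitude spectrum from a signal magnitude spectrum.

    Both arguments are per-frequency-bin magnitudes for one frame.
    Bins where the noise estimate exceeds the signal are clamped to
    `floor` (an assumption; negative magnitudes are not meaningful).
    """
    if len(magnitudes) != len(noise_profile):
        raise ValueError("spectra must have the same number of bins")
    return [max(m - n, floor) for m, n in zip(magnitudes, noise_profile)]


# Hypothetical three-bin frame and noise feature spectrum.
frame = [1.0, 0.5, 0.2]
noise = [0.25, 0.5, 0.5]
cleaned = spectral_subtract(frame, noise)
```

In a full pipeline, the audio would first be split into frames and transformed to the frequency domain, and the cleaned magnitudes would be recombined with the original phase before the inverse transform. Those steps are omitted here.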
According to a preferred embodiment of the present invention, the audio optimization equipment 500 may further include a model optimization module (not shown). After acquisition of audio data starts, the model optimization module may determine whether the acquired audio data needs to be optimized. When the data acquisition module 502 obtains audio data, the model optimization module may determine, according to the quality of the audio data, whether the audio data needs to be optimized. For example, when the user considers the audio quality to be good, the model optimization module may choose not to optimize the audio data. When the user considers the audio quality to be poor, the model optimization module may choose to optimize the audio data so as to improve its audio quality. In addition, the user may also manually specify whether the acquired audio data needs to be optimized. For example, before starting a voice call, the user may specify that the upcoming voice call needs to be optimized. Optionally, the user may set in advance that audio data acquired by the audio optimization equipment 500 in a specific environment does not need to be optimized by default, or that audio data acquired by the audio optimization equipment 500 in a specific environment needs to be optimized by default. The above examples are merely illustrative, and the present invention is not limited thereto.
If the model optimization module determines that the acquired audio data does not need to be optimized, that is, the acquired audio data is considered to be clean data with no noise interference or only slight noise interference, the model optimization module may add the acquired audio data to the clean dataset in the training sample library preset in the audio optimization equipment 500, as a training sample of clean audio data.
If the model optimization module determines that the acquired audio data needs to be optimized, then after the audio optimization module 504 has optimized the audio data using the environmental noise model selected by the model selection module 503, the original audio data corresponding to the optimized audio data (that is, the noisy audio data before optimization) may be added to the noise dataset in the training sample library that corresponds to the selected environmental noise model, as a sample of noisy audio data.
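The routing of acquired audio into the training sample library, as described in the two paragraphs above, can be sketched as follows. The dictionary layout (one clean dataset, plus one noise dataset per environmental noise model) and all names are assumptions made for illustration.

```python
def new_training_library():
    """Create an empty library: one clean dataset, one noise dataset per model."""
    return {"clean": [], "noise": {}}


def store_training_sample(library, original_audio, needs_optimization,
                          selected_model=None):
    """Store the ORIGINAL (pre-optimization) audio in the matching dataset.

    Audio that needs no optimization is treated as clean data.
    Audio that was optimized is stored, in its original noisy form,
    under the noise dataset of the model that was used on it.
    """
    if not needs_optimization:
        library["clean"].append(original_audio)
        return
    if selected_model is None:
        raise ValueError("a selected model is required for noisy audio")
    library["noise"].setdefault(selected_model, []).append(original_audio)


# Hypothetical identifiers standing in for stored audio clips.
library = new_training_library()
store_training_sample(library, "call_a.wav", needs_optimization=False)
store_training_sample(library, "call_b.wav", True, selected_model="street")
```

Storing the noisy original rather than the optimized output matters here, because the noise dataset is later compared against the clean dataset to refine the noise feature of the model.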
According to an embodiment of the present invention, in the manual selection mode, after the audio optimization module 504 has optimized the audio data using the environmental noise model selected by the model selection module 503 from the multiple environmental noise models, the user may manually specify whether to add the original audio data (that is, the noisy audio data before optimization) to the noise dataset in the training sample library that corresponds to the environmental noise model manually selected by the user. Optionally, after the audio optimization module 504 has optimized the audio data using the environmental noise model selected by the model selection module 503, the model optimization module may ask the user whether the user is satisfied with the optimization result. If the user indicates satisfaction, the model optimization module may, by default, add the original audio data (that is, the noisy audio data before optimization) to the noise dataset in the training sample library that corresponds to the environmental noise model manually selected by the user. The above examples are merely illustrative, and the present invention is not limited thereto.
According to another embodiment of the present invention, in the adaptive selection mode, after the audio optimization module 504 has optimized the audio data using the environmental noise model selected by the model selection module 503 from the multiple environmental noise models, the model optimization module adds the original audio data corresponding to the optimized audio data (that is, the noisy audio data before optimization) to the noise dataset in the training sample library that corresponds to that environmental noise model. For example, the model optimization module may add the audio data from the 8th second to the 37th second shown in Fig. 3 to the noise dataset in the training sample library that corresponds to the environmental noise model determined at the 8th second to have the best evaluation result, and may add the audio data from the 38th second to the 67th second shown in Fig. 3 to the noise dataset in the training sample library that corresponds to the environmental noise model determined at the 38th second to have the best evaluation result.
According to another embodiment of the present invention, if the clean dataset contains little clean data, the model optimization module may add optimized audio data to the clean dataset according to a user instruction or a system setting. For example, if the clean dataset contains little data, then after the audio data of a voice call has been optimized, the audio optimization equipment 500 may ask the user whether the user is satisfied with the optimized audio data. If the user indicates satisfaction, the equipment may go on to ask whether the optimized audio data should be added to the clean dataset, and the model optimization module may add the optimized audio data to the clean dataset according to the user's choice. Optionally, the model optimization module may automatically add the optimized audio data to the clean dataset once the user has confirmed satisfaction with the optimized audio data. Optionally, after optimizing a piece of audio data, the audio optimization equipment 500 may directly ask whether the optimized audio data should be added to the clean dataset, and the model optimization module may add the optimized audio data to the clean dataset according to the user's choice.
The model optimization module may optimize the noise feature corresponding to the selected environmental noise model based on the audio data in the clean dataset and the audio data in the noise dataset corresponding to the selected environmental noise model. For example, the model optimization module may be set by default to optimize environmental noise models while the equipment is idle, for example, between 2:00 and 6:00 at night, or after the audio optimization equipment 500 has been out of use for more than a certain time, for example, 30 minutes. In addition, the model optimization module may also be set by default to optimize an environmental noise model only when the accumulated duration of the audio data in the clean dataset exceeds, for example, 30 minutes. The above examples are merely illustrative, and the present invention is not limited thereto.
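The default trigger conditions described above can be sketched as a single predicate. The thresholds (the 2:00 to 6:00 window and the two 30-minute limits) are the example values from the text. Combining the idle check and the clean-data check with a logical AND is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
def should_optimize_models(hour, idle_minutes, clean_minutes,
                           night_hours=(2, 6), idle_threshold=30,
                           min_clean_minutes=30):
    """Decide whether environmental noise models may be re-optimized now.

    hour          -- current hour of day, 0 to 23
    idle_minutes  -- minutes since the equipment was last used
    clean_minutes -- accumulated duration of the clean dataset
    """
    in_night_window = night_hours[0] <= hour < night_hours[1]
    equipment_idle = in_night_window or idle_minutes > idle_threshold
    enough_clean_data = clean_minutes > min_clean_minutes
    return equipment_idle and enough_clean_data
```

For instance, `should_optimize_models(3, 0, 45)` holds because 3:00 falls in the night window and 45 minutes of clean audio have accumulated, while the same call at 14:00 with only 10 idle minutes does not.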
For example, when the data in the clean dataset changes, then while the audio optimization equipment 500 is idle, if the accumulated duration of the audio data in the clean dataset exceeds 30 minutes, the model optimization module may compare the noise dataset corresponding to each environmental noise model with the clean dataset, perform comparative training on the audio features of the audio data in each noise dataset against the audio features of the audio data in the clean dataset, and determine the difference between the audio features of the audio data in the noise dataset of each environmental noise model and the audio features of the audio data in the clean dataset as the noise feature corresponding to that environmental noise model. Alternatively, when the audio data in the clean dataset has not changed but the audio data in the noise dataset corresponding to a specific environmental noise model among the multiple environmental noise models has changed, then while the audio optimization equipment 500 is idle, if the accumulated duration of the audio data in the clean dataset exceeds 30 minutes, the model optimization module may perform comparative training on the audio features of the audio data in the changed noise dataset against the audio features of the audio data in the clean dataset, and determine the difference between the audio features of the audio data in the noise dataset of the specific environmental noise model and the audio features of the audio data in the clean dataset as the noise feature corresponding to the specific environmental noise model. In addition to the above operation for optimizing an environmental noise model, those skilled in the art may also use other methods to optimize an environmental noise model.
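One simple way to read "the difference between the audio features of the noise dataset and those of the clean dataset" is as a difference of dataset averages. The sketch below takes that reading. It is an assumption, since the text does not define the audio features or the training procedure, and the function names and the fixed-length feature vectors are hypothetical.

```python
def mean_feature(feature_vectors):
    """Average a list of equal-length feature vectors element by element."""
    if not feature_vectors:
        raise ValueError("at least one feature vector is required")
    count = len(feature_vectors)
    return [sum(column) / count for column in zip(*feature_vectors)]


def estimate_noise_feature(noise_features, clean_features):
    """Estimate a model's noise feature as mean(noisy) - mean(clean).

    Each argument is a list of feature vectors (for example, average
    magnitude spectra), one per audio clip in the dataset.
    """
    noisy_mean = mean_feature(noise_features)
    clean_mean = mean_feature(clean_features)
    return [n - c for n, c in zip(noisy_mean, clean_mean)]


# Hypothetical two-bin features for two noisy clips and two clean clips.
noise_feature = estimate_noise_feature(
    [[4.0, 6.0], [6.0, 8.0]],
    [[1.0, 2.0], [3.0, 2.0]],
)
```

Because the estimate depends only on the two datasets, it can be recomputed whenever either dataset changes, which matches the two update cases described above.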
According to another preferred embodiment of the present invention, the audio optimization equipment 500 may further include a model building module (not shown). The model building module may establish a new environmental noise model. According to an embodiment of the present invention, in the adaptive selection mode, after the model selection module 503 selects one environmental noise model from the multiple environmental noise models, the model building module may determine whether the selected environmental noise model satisfies a predetermined condition. According to an embodiment of the present invention, when the model selection module 503 extracts audio data of a predetermined length from the audio data, the model building module may calculate the weighted average of the SNR, PESQ score, and segmental SNR of the extracted audio data. After the environmental noise model with the best evaluation result has been selected from the multiple environmental noise models according to the calculated weighted averages, the model building module compares the weighted average of the SNR, PESQ score, and segmental SNR of the optimization result obtained by optimizing the extracted audio data with that environmental noise model against the weighted average of the SNR, PESQ score, and segmental SNR of the extracted audio data itself. If the comparison result is greater than or equal to a predetermined value, for example, 10%, the model building module confirms that the selected environmental noise model satisfies the predetermined condition. If the comparison result is less than the predetermined value, for example, 10%, the model building module confirms that the selected environmental noise model does not satisfy the predetermined condition. The above examples are merely illustrative, and the present invention is not limited thereto.
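The predetermined-condition check above can be sketched as a comparison of the weighted score before and after optimization. The text does not state exactly how the "comparison result" is formed, so this sketch interprets it as the relative improvement of the score, with 10% as the example threshold. That interpretation, the function name, and the parameters are all assumptions.

```python
def meets_predetermined_condition(score_before, score_after, threshold=0.10):
    """Check whether the selected model improved the excerpt enough.

    score_before -- weighted average of SNR, PESQ score, and segmental
                    SNR for the extracted audio before optimization
    score_after  -- the same weighted average after optimization with
                    the best-scoring environmental noise model
    The condition holds when the relative improvement reaches `threshold`.
    """
    if score_before <= 0:
        raise ValueError("baseline score must be positive for a relative comparison")
    improvement = (score_after - score_before) / score_before
    return improvement >= threshold
```

Under this reading, a model that raises the weighted score from 10.0 to 12.0 satisfies the condition, whereas a rise from 10.0 to 10.5 does not, in which case the user could be prompted about establishing a new environmental noise model.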
If the model building module confirms that the environmental noise model selected by the model selection module 503 satisfies the predetermined condition, then after the audio optimization module 504 has optimized the audio data using that environmental noise model, the model building module adds the original audio data corresponding to the optimized audio data (that is, the noisy audio data before optimization) to the noise dataset in the training sample library that corresponds to that environmental noise model. If the environmental noise model selected by the model selection module 503 does not satisfy the predetermined condition, then after the audio optimization module 504 has finished optimizing the audio data, the model building module may prompt the user as to whether to establish a new environmental noise model, and the user may choose to establish or not to establish one. For example, if the user considers that the audio data optimized with the environmental noise model that does not satisfy the predetermined condition nevertheless meets the user's needs, the user may choose not to establish a new environmental noise model.
If the user chooses not to establish a new environmental noise model, the model building module may add the original audio data corresponding to the audio data optimized with the selected environmental noise model (that is, the noisy audio data before optimization) to the noise dataset in the training sample library that corresponds to that environmental noise model. If the user chooses to establish a new environmental noise model, the model building module may establish the new environmental noise model together with a noise training set corresponding to it, and may add the original audio data corresponding to the optimized audio data (that is, the noisy audio data before optimization) to the noise dataset in the training sample library that corresponds to the newly established environmental noise model. In addition, if the noise dataset of the newly established environmental noise model contains little audio data, then when the user is again in the environment corresponding to the newly established environmental noise model, the model building module may prompt the user to record a piece of audio data in that environment and add it to the noise dataset in the training sample library that corresponds to the newly established environmental noise model, so as to enrich the data in that noise dataset and facilitate subsequent training and optimization of the newly established environmental noise model.
In addition, according to another embodiment of the present invention, in either the manual selection mode or the adaptive selection mode, if the user is dissatisfied with the optimized audio data, the model building module may proactively establish a new environmental noise model and add the noisy original audio data to the noise dataset in the training sample library that corresponds to the newly established environmental noise model.
Therefore, with the above audio optimization equipment, the optimal environmental noise model can be selected from multiple environmental noise models to optimize the audio data. In addition, a new environmental noise model can be established according to whether the user is satisfied with the audio optimization effect, so that as the audio optimization equipment is used over time, more environmental noise models become selectable in the equipment and the match between the environmental noise models and the audio data gradually improves. Furthermore, as the audio optimization equipment continues to be used, clean data without noise and noise data with noise can be stored, and all environmental noise models can be continuously optimized using the clean data and the noise data, so that each environmental noise model represents the noise features of its environment more accurately and the optimization effect on audio data improves the longer the equipment is used.
An embodiment of the present invention also provides a computer-readable storage medium storing a computer program. The computer-readable storage medium stores a computer program that, when executed by a processor, causes the processor to execute the above audio optimization method. The computer-readable storage medium is any data storage device that can store data readable by a computer system. Examples of the computer-readable storage medium include read-only memory, random access memory, CD-ROM, magnetic tape, floppy disk, optical data storage devices, and carrier waves (such as data transmission over the Internet through a wired or wireless transmission path).
An embodiment of the present invention also provides a computing device. The computing device includes a processor and a memory. The memory is used to store a computer program. The computer program, when executed by the processor, causes the processor to execute the audio optimization method described above.
Although some exemplary embodiments of the present invention have been shown and described, those skilled in the art will understand that these embodiments may be modified without departing from the principle and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (14)

1. An audio optimization method, the method comprising:
obtaining multiple environmental noise models suitable for different environments;
obtaining audio data;
selecting one environmental noise model from the multiple environmental noise models for the audio data;
optimizing the audio data using the selected environmental noise model.
2. The audio optimization method of claim 1, wherein the step of selecting one environmental noise model from the multiple environmental noise models comprises:
in a manual selection mode, receiving a user's input specifying one environmental noise model among the multiple environmental noise models, and determining the environmental noise model specified by the user as the selected environmental noise model.
3. The audio optimization method of claim 1, wherein the step of selecting one environmental noise model from the multiple environmental noise models comprises:
in an adaptive selection mode, performing pre-optimization on the audio data using each of the multiple environmental noise models separately, and selecting one environmental noise model from the multiple environmental noise models according to the pre-optimization results.
4. The audio optimization method of claim 3, wherein the step of performing pre-optimization on the audio data using each of the multiple environmental noise models separately and selecting one environmental noise model from the multiple environmental noise models according to the pre-optimization results comprises:
extracting audio data of a predetermined length from the audio data;
optimizing the audio data of the predetermined length using each of the multiple environmental noise models separately, to obtain multiple optimization results corresponding respectively to the multiple environmental noise models;
evaluating each of the multiple optimization results separately, and selecting from the multiple environmental noise models the environmental noise model with the best evaluation result.
5. The audio optimization method of claim 4, wherein the step of evaluating each of the multiple optimization results separately comprises:
calculating the signal-to-noise ratio, the perceptual evaluation of speech quality score, and the segmental signal-to-noise ratio of each optimization result;
calculating the weighted average of the signal-to-noise ratio, the perceptual evaluation of speech quality score, and the segmental signal-to-noise ratio of each optimization result;
determining the environmental noise model corresponding to the optimization result with the highest weighted average as the environmental noise model with the best evaluation result.
6. The audio optimization method of claim 3, wherein, in the adaptive selection mode, the step of optimizing the audio data using the selected environmental noise model comprises:
performing pre-optimization on the audio data once every predetermined time interval, selecting an environmental noise model from the multiple environmental noise models according to the pre-optimization result, and using the selected environmental noise model to optimize the portion of the audio data from the moment that environmental noise model is selected until the moment an environmental noise model is next selected.
7. The audio optimization method of claim 1, wherein the step of optimizing the audio data comprises:
obtaining optimized audio data by subtracting the spectrum of the noise feature corresponding to the selected environmental noise model from the spectrum of the audio data.
8. The audio optimization method of claim 1 or claim 4, further comprising:
after obtaining the audio data, determining whether the audio data needs to be optimized;
if the audio data does not need to be optimized, adding the audio data to a clean dataset in a preset training sample library;
if the audio data needs to be optimized, then after the audio data has been optimized using the selected environmental noise model, adding the original audio data corresponding to the optimized audio data to the noise dataset in the training sample library that corresponds to the selected environmental noise model;
optimizing the selected environmental noise model based on the audio data in the clean dataset and the audio data in the noise dataset corresponding to the selected environmental noise model.
9. The audio optimization method of claim 8, wherein the step of, after the audio data has been optimized using the selected environmental noise model, adding the original audio data corresponding to the optimized audio data to the noise dataset in the training sample library that corresponds to the selected environmental noise model comprises:
in the adaptive selection mode, determining whether the selected environmental noise model satisfies a predetermined condition;
if the selected environmental noise model does not satisfy the predetermined condition, prompting the user as to whether to establish a new environmental noise model;
if the user chooses to establish a new environmental noise model, establishing the new environmental noise model, and adding the original audio data corresponding to the optimized audio data to the noise dataset in the training sample library that corresponds to the new environmental noise model.
10. The audio optimization method of claim 9, wherein the step of determining whether the selected environmental noise model satisfies the predetermined condition comprises:
calculating a first weighted average of the signal-to-noise ratio, the perceptual evaluation of speech quality score, and the segmental signal-to-noise ratio of the audio data of the predetermined length;
calculating a second weighted average of the signal-to-noise ratio, the perceptual evaluation of speech quality score, and the segmental signal-to-noise ratio of the audio data obtained by optimizing the audio data of the predetermined length using the selected environmental noise model;
calculating the ratio of the second weighted average to the first weighted average;
if the ratio does not reach a predetermined value, determining that the selected environmental noise model does not satisfy the predetermined condition.
11. The audio optimization method of claim 8, wherein the step of optimizing the selected environmental noise model comprises: determining the difference between the audio features of the audio data in the noise dataset corresponding to the selected environmental noise model and the audio features of the audio data in the clean dataset as the noise feature corresponding to the selected environmental noise model.
12. Audio optimization equipment, the equipment comprising:
a model obtaining module, configured to obtain multiple environmental noise models suitable for different environments;
a data acquisition module, configured to obtain audio data;
a model selection module, configured to select one environmental noise model from the multiple environmental noise models for the audio data;
an audio optimization module, configured to optimize the audio data using the selected environmental noise model.
13. A computer-readable storage medium storing a program, wherein the program includes code for executing the audio optimization method of any one of claims 1-11.
14. A computer, including a readable medium storing a computer program, wherein the computer program includes code for executing the audio optimization method of any one of claims 1-11.
CN201810878268.3A 2018-08-03 2018-08-03 Audio optimization method and apparatus Pending CN109087659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810878268.3A CN109087659A (en) 2018-08-03 2018-08-03 Audio optimization method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810878268.3A CN109087659A (en) 2018-08-03 2018-08-03 Audio optimization method and apparatus

Publications (1)

Publication Number Publication Date
CN109087659A true CN109087659A (en) 2018-12-25

Family

ID=64833622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810878268.3A Pending CN109087659A (en) 2018-08-03 2018-08-03 Audio optimization method and apparatus

Country Status (1)

Country Link
CN (1) CN109087659A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111341322A (en) * 2020-04-15 2020-06-26 厦门快商通科技股份有限公司 Voiceprint model training method, device and equipment
CN111739549A (en) * 2020-08-17 2020-10-02 北京灵伴即时智能科技有限公司 Sound optimization method and sound optimization system
CN112634932A (en) * 2021-03-09 2021-04-09 南京涵书韵信息科技有限公司 Audio signal processing method and device, server and related equipment
CN112669867A (en) * 2020-12-15 2021-04-16 北京百度网讯科技有限公司 Debugging method and device of noise elimination algorithm and electronic equipment
CN114374924A (en) * 2022-01-07 2022-04-19 上海纽泰仑教育科技有限公司 Recording quality detection method and related device

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5970446A (en) * 1997-11-25 1999-10-19 At&T Corp Selective noise/channel/coding models and recognizers for automatic speech recognition
US20020087306A1 (en) * 2000-12-29 2002-07-04 Lee Victor Wai Leung Computer-implemented noise normalization method and system
US6438513B1 (en) * 1997-07-04 2002-08-20 Sextant Avionique Process for searching for a noise model in noisy audio signals
US6778959B1 (en) * 1999-10-21 2004-08-17 Sony Corporation System and method for speech verification using out-of-vocabulary models
CN1542737A (en) * 2003-03-12 2004-11-03 株式会社Ntt都科摩 Noise adaptation system of speech model, noise adaptation method, and noise adaptation program for speech recognition
CN1595497A (en) * 2003-09-12 2005-03-16 古井贞熙 Noise adaptation system and method for speech model, noise adaptation program for speech recognition
CN1965218A (en) * 2004-06-04 2007-05-16 皇家飞利浦电子股份有限公司 Performance prediction for an interactive speech recognition system
CN101583996A (en) * 2006-12-30 2009-11-18 摩托罗拉公司 A method and noise suppression circuit incorporating a plurality of noise suppression techniques
CN101710490A (en) * 2009-11-20 2010-05-19 安徽科大讯飞信息科技股份有限公司 Method and device for compensating noise for voice assessment
CN102918591A (en) * 2010-04-14 2013-02-06 谷歌公司 Geotagged environmental audio for enhanced speech recognition accuracy
CN103069480A (en) * 2010-06-14 2013-04-24 谷歌公司 Speech and noise models for speech recognition
CN103632666A (en) * 2013-11-14 2014-03-12 华为技术有限公司 Voice recognition method, voice recognition equipment and electronic equipment
CN104575510A (en) * 2015-02-04 2015-04-29 深圳酷派技术有限公司 Noise reduction method, noise reduction device and terminal
CN104575509A (en) * 2014-12-29 2015-04-29 乐视致新电子科技(天津)有限公司 Voice enhancement processing method and device
US20150199964A1 (en) * 2011-10-17 2015-07-16 Nuance Communications, Inc. System and Method for Dynamic Noise Adaptation for Robust Automatic Speech Recognition
CN104966517A (en) * 2015-06-02 2015-10-07 华为技术有限公司 Voice frequency signal enhancement method and device
CN106297779A (en) * 2016-07-28 2017-01-04 块互动(北京)科技有限公司 A kind of background noise removing method based on positional information and device
CN106328152A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Automatic identification and monitoring system for indoor noise pollution
CN106663446A (en) * 2014-07-02 2017-05-10 微软技术许可有限责任公司 User environment aware acoustic noise reduction
CN106992002A (en) * 2016-01-21 2017-07-28 福特全球技术公司 Dynamic acoustic models switching for improving noisy speech identification
CN107197388A (en) * 2017-06-29 2017-09-22 广州华多网络科技有限公司 A kind of method and system of live noise reduction
US20180040318A1 (en) * 2007-05-29 2018-02-08 Nuance Communications, Inc. Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6438513B1 (en) * 1997-07-04 2002-08-20 Sextant Avionique Process for searching for a noise model in noisy audio signals
US5970446A (en) * 1997-11-25 1999-10-19 At&T Corp Selective noise/channel/coding models and recognizers for automatic speech recognition
US6778959B1 (en) * 1999-10-21 2004-08-17 Sony Corporation System and method for speech verification using out-of-vocabulary models
US20020087306A1 (en) * 2000-12-29 2002-07-04 Lee Victor Wai Leung Computer-implemented noise normalization method and system
CN1542737A (en) * 2003-03-12 2004-11-03 株式会社Ntt都科摩 Noise adaptation system of speech model, noise adaptation method, and noise adaptation program for speech recognition
CN1595497A (en) * 2003-09-12 2005-03-16 古井贞熙 Noise adaptation system and method for speech model, noise adaptation program for speech recognition
CN1965218A (en) * 2004-06-04 2007-05-16 皇家飞利浦电子股份有限公司 Performance prediction for an interactive speech recognition system
CN101583996A (en) * 2006-12-30 2009-11-18 摩托罗拉公司 A method and noise suppression circuit incorporating a plurality of noise suppression techniques
US20180040318A1 (en) * 2007-05-29 2018-02-08 Nuance Communications, Inc. Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition
CN101710490A (en) * 2009-11-20 2010-05-19 安徽科大讯飞信息科技股份有限公司 Method and device for compensating noise for voice assessment
CN102918591A (en) * 2010-04-14 2013-02-06 谷歌公司 Geotagged environmental audio for enhanced speech recognition accuracy
CN103069480A (en) * 2010-06-14 2013-04-24 谷歌公司 Speech and noise models for speech recognition
US20150199964A1 (en) * 2011-10-17 2015-07-16 Nuance Communications, Inc. System and Method for Dynamic Noise Adaptation for Robust Automatic Speech Recognition
CN103632666A (en) * 2013-11-14 2014-03-12 华为技术有限公司 Voice recognition method, voice recognition equipment and electronic equipment
CN106663446A (en) * 2014-07-02 2017-05-10 微软技术许可有限责任公司 User environment aware acoustic noise reduction
CN104575509A (en) * 2014-12-29 2015-04-29 乐视致新电子科技(天津)有限公司 Voice enhancement processing method and device
CN104575510A (en) * 2015-02-04 2015-04-29 深圳酷派技术有限公司 Noise reduction method, noise reduction device and terminal
CN104966517A (en) * 2015-06-02 2015-10-07 华为技术有限公司 Voice frequency signal enhancement method and device
CN106328152A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Automatic identification and monitoring system for indoor noise pollution
CN106992002A (en) * 2016-01-21 2017-07-28 福特全球技术公司 Dynamic acoustic models switching for improving noisy speech identification
CN106297779A (en) * 2016-07-28 2017-01-04 块互动(北京)科技有限公司 A kind of background noise removing method based on positional information and device
CN107197388A (en) * 2017-06-29 2017-09-22 广州华多网络科技有限公司 A kind of method and system of live noise reduction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘斌: "联合长短时记忆递归神经网络和非负矩阵分解的语音混响消除方法", 《信号处理》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111341322A (en) * 2020-04-15 2020-06-26 厦门快商通科技股份有限公司 Voiceprint model training method, device and equipment
CN111739549A (en) * 2020-08-17 2020-10-02 北京灵伴即时智能科技有限公司 Sound optimization method and sound optimization system
CN112669867A (en) * 2020-12-15 2021-04-16 北京百度网讯科技有限公司 Debugging method and device of noise elimination algorithm and electronic equipment
CN112669867B (en) * 2020-12-15 2023-04-11 阿波罗智联(北京)科技有限公司 Debugging method and device of noise elimination algorithm and electronic equipment
US11804236B2 (en) 2020-12-15 2023-10-31 Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. Method for debugging noise elimination algorithm, apparatus and electronic device
CN112634932A (en) * 2021-03-09 2021-04-09 南京涵书韵信息科技有限公司 Audio signal processing method and device, server and related equipment
CN112634932B (en) * 2021-03-09 2021-06-22 赣州柏朗科技有限公司 Audio signal processing method and device, server and related equipment
CN114374924A (en) * 2022-01-07 2022-04-19 上海纽泰仑教育科技有限公司 Recording quality detection method and related device
CN114374924B (en) * 2022-01-07 2024-01-19 上海纽泰仑教育科技有限公司 Recording quality detection method and related device

Similar Documents

Publication Publication Date Title
CN109087659A (en) Audio optimization method and apparatus
JP6538846B2 (en) Method and apparatus for processing voice information
CN104394491B (en) A kind of intelligent earphone, Cloud Server and volume adjusting method and system
AU2011261756B2 (en) User-specific noise suppression for voice quality improvements
US8378964B2 (en) System and method for automatically producing haptic events from a digital audio signal
CN101242597B (en) Method and device for automatically selecting scenario mode according to environmental noise method on mobile phone
CN108847215B (en) Method and device for voice synthesis based on user timbre
JP4640463B2 (en) Playback apparatus, display method, and display program
CN107454508A (en) The television set and television system of microphone array
CN107533848B (en) The system and method restored for speech
CN110019931A (en) Audio frequency classification method, device, smart machine and storage medium
CN109785859A (en) The method, apparatus and computer equipment of management music based on speech analysis
CN106302678A (en) A kind of music recommends method and device
MX2011012749A (en) System and method of receiving, analyzing, and editing audio to create musical compositions.
CN107360507A (en) A kind of play parameter Automatic adjustment method, intelligent sound box and storage medium
CN107609034A (en) A kind of audio frequency playing method of intelligent sound box, audio playing apparatus and storage medium
CN106168958B (en) A kind of the recommendation method and server of audio-frequency information
CN109658935A (en) The generation method and system of multichannel noisy speech
US11462236B2 (en) Voice recordings using acoustic quality measurement models and actionable acoustic improvement suggestions
US20190254572A1 (en) Auditory training device, auditory training method, and program
CN103297581A (en) Mobile terminal and method for adjusting equalizer thereof
Manzanares Mena et al. Songbird community structure changes with noise in an urban reserve
CN106980487A (en) Audio control method and audio control apparatus
CN112420015A (en) Audio synthesis method, device, equipment and computer readable storage medium
CN108370457A (en) Bother noise suppressed

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181225)