CN101067930A - Intelligent audio frequency identifying system and identifying method - Google Patents
- Publication number
- CN101067930A (application CN200710075008A)
- Authority
- CN
- China
- Prior art keywords
- audio data
- feature vector
- data
- identification
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
Abstract
This invention relates to an intelligent audio identification system and method. The system comprises an audio data set for collecting and storing various sample audio data, a training unit and an identification unit. The training unit extracts feature vectors from the sample audio data and finds and establishes a mapping from the sample feature vectors to their classes. The identification unit stores the data of this mapping, extracts the feature vector of the audio data to be identified, and outputs an identification result based on that feature vector.
Description
Technical field
The present invention relates to a system and method capable of automatically identifying audio data.
Background art
Hearing is one of the important channels through which humans obtain information about the outside world, and also an important means of judging what is happening nearby: on hearing barking, one can conclude that a dog is close by; on hearing a scream, one can conclude that someone nearby is being harmed. Analysis of audio can therefore provide much important information. At present, most audio-based analysis systems mainly preprocess the collected raw audio, for example by denoising or by extracting or enhancing audio with specific characteristics, but the final identification of the audio still requires human participation. In many practical scenarios, however, different sounds need to be identified automatically. For example, a wildlife researcher working in the field may need to spend a great deal of time tracking rare wild animals; if an automatic audio identification system could recognize the call of a particular animal and emit a signal whenever that call is detected, it would greatly assist the tracking work. As another example, an automatic audio identification system installed in an elevator or a home could automatically recognize abnormal noises such as screams, the sound of quarrels and fights, impact sounds, breaking glass, explosions or gunshots, and send an alarm signal to monitoring personnel, thereby shortening the time needed to respond to abnormal situations. Automatic audio identification therefore has important and wide-ranging application value.
Summary of the invention
The technical problem to be solved by the present invention is to provide an intelligent audio identification system and an automatic identification method that can identify audio data automatically.
The technical solution adopted by the present invention to solve the above technical problem is as follows:
An intelligent audio identification method comprises the following steps:
A. collecting various sample audio data and labeling the collected sample audio data;
B. extracting, one by one from the sample audio data, feature vectors that reflect their essential characteristics;
C. dividing class regions according to the feature vectors so that each class region after division contains as many feature vectors of that class of samples as possible, thereby building a classifier that maps feature vectors to their classes;
D. processing the audio data to be identified and extracting its feature vector;
E. inputting the feature vector of the audio data to be identified into the classifier, which discriminates according to this feature vector and produces an identification result for the audio data.
In the above method, step B comprises the following steps:
B1. preprocessing the sample audio data to obtain training data;
B2. extracting from the training data feature components that reflect its essential characteristics;
B3. combining the feature components to obtain the feature vector.
In the above method, step D comprises the following steps:
D1. preprocessing the audio data to be identified to obtain identification data;
D2. extracting from the identification data feature components that reflect its essential characteristics;
D3. combining the feature components to obtain the feature vector.
In the above method, the feature components of step B2 or D2 include the center frequency of the audio, the energy features of the audio in certain characteristic frequency bands, or the energy distribution of the audio over a number of time periods.
In the above method, the feature vector of step B3 or D3 is the vector formed by the center frequency of the audio together with the sums of the audio energy spectrum over certain characteristic frequency bands.
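As a purely illustrative sketch of how such a feature vector could be computed (the band edges, the helper name feature_vector and the use of NumPy are assumptions of this sketch, not part of the claimed method):

```python
import numpy as np

def feature_vector(signal, sr, bands=((0, 300), (300, 2000), (2000, 8000))):
    """Center frequency plus per-band energy-spectrum sums (illustrative bands)."""
    power = np.abs(np.fft.rfft(signal)) ** 2                      # energy spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    centroid = np.sum(freqs * power) / (np.sum(power) + 1e-12)    # center frequency
    band_energy = [power[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]
    return np.concatenate(([centroid], band_energy))
```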
In the above method, the class regions in step C are divided according to the numerical values of the feature vectors and are bounded by curves or curved surfaces.
In the above method, step E comprises the following processing:
E1. inputting the feature vector of the audio data to be identified into the classifier, which discriminates according to this feature vector and produces both a classification result giving the class of the audio data and a rejection index, the rejection index being a parameter used to measure the confidence of the classification result;
E2. judging the confidence of the classification result according to the rejection index: when the rejection index is above a preset threshold, the classification result is judged credible and the classifier outputs the class of the audio data to be identified; when the rejection index is below the preset threshold, the classifier still outputs the class but indicates that the classification result is not credible.
In the above method, step A comprises identifying the collected sample audio data in order to determine and label what sound each sample is.
An intelligent audio identification system comprises an audio data set for collecting and storing various kinds of sample audio data, a training unit and an identification unit. The training unit extracts the feature vectors of the sample audio data and finds and establishes a mapping from the sample feature vectors to their classes. The identification unit stores the data of the established mapping between feature vectors and classes, extracts the feature vector of the audio data to be identified, and outputs an identification result according to that feature vector.
In the above system, the training unit comprises a first preprocessing module, a first feature extraction module and a training module. The first preprocessing module denoises the sample audio data to obtain training data; the first feature extraction module extracts the feature vectors of the sample audio data from the training data; and the training module finds and establishes the mapping from the sample feature vectors to their classes.
In the above system, the identification unit comprises a second preprocessing module, a second feature extraction module and a classifier. The second preprocessing module denoises the audio data to be identified to obtain identification data; the second feature extraction module extracts the feature vector of the audio data to be identified from the identification data; and the classifier stores the data of the mapping between feature vectors and classes output by the training module and, according to the input feature vector of the audio data to be identified, outputs the identification result.
The beneficial effect of the present invention is that the intelligent audio identification system and method can identify audio data automatically, and the system has good real-time performance and extensibility.
Brief description of the drawings
Fig. 1 is a block diagram of the system of the present invention;
Fig. 2 is a block diagram of the training unit of the present invention;
Fig. 3 is a block diagram of the identification unit of the present invention;
Fig. 4 is a schematic diagram of the mapping established from feature vectors to classes when the sample audio data belong to four classes;
Fig. 5 is a schematic diagram of the mapping established from feature vectors to classes when the sample audio data belong to two classes.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, an intelligent audio identification system comprises at least an audio data set 1 for collecting and storing various kinds of sample audio data, a training unit 2 and an identification unit 3. The training unit 2 extracts the feature vectors of the sample audio data and finds and establishes the mapping from the sample feature vectors to their classes. The identification unit 3 stores the data of the established mapping between feature vectors and classes, extracts the feature vector of the audio data to be identified and, according to that feature vector, outputs an identification result. As shown in Fig. 2, the training unit comprises a first preprocessing module 21, a first feature extraction module 22 and a training module 23. As shown in Fig. 3, the identification unit comprises a second preprocessing module 31, a second feature extraction module 32 and a classifier 33.
The audio data set 1 is built to provide the learning samples needed by the subsequent training unit 2. Audio data are collected according to the classes of audio the user wishes to identify. The data set can be built by recording audio oneself, collecting audio material from the network, purchasing audio material discs, and similar means. In general, several samples need to be collected for each class of audio, and during collection the samples must be labeled manually, i.e. a person listens to each collected sample and determines what sound it is. To guarantee the identification performance of the system, as many samples as possible should be collected.
In the training unit 2, the collected sample audio data are first preprocessed: the preprocessing module 21 removes noise from the sample audio data taken from the audio data set 1 and separates the sample audio to be identified from the complex audio background, yielding processed training data. The feature extraction module 22 then extracts from the training data the components that reflect the essential characteristics of the sample audio data, such as the center frequency of the audio, the energy features in certain frequency bands (which can be obtained by applying a Fourier transform to the audio signal), or the energy distribution over several time periods, and combines these features into the corresponding feature vector. For example, if the center frequency of a sample is 33 and the sum of the energy spectrum of an audio segment is 1000, the resulting feature vector is the vector (33, 1000) formed by the center frequency and the energy-spectrum sum. Next, the training module 23 uses the extracted feature components to train the classifier 33 used for identifying audio. Training the classifier means that the training module 23, based on the feature vectors of the N classes of sample audio data, searches for classification curves or surfaces that partition the feature space into N class regions such that the feature vectors of each class of sample audio data fall into their own region; the class regions are divided according to the numerical values of the feature vectors, which establishes a mapping from the feature-vector space to the classes. For example, when the sample audio data consist of only four classes and the feature vectors are two-dimensional, the four-class problem amounts to finding two straight lines such that the feature vectors of the four classes of samples are distributed in the four regions into which the two lines divide the plane, as shown in Fig. 4: the triangles are the class-1 feature vectors obtained during training, the circles class 2, the pentagrams class 3 and the pentagons class 4; straight lines 1 and 2 are the classification boundaries obtained from these four classes of feature vectors, dividing the feature space into four subspaces, each of which should contain the feature vectors of one class. Finally, the training module stores the trained data, i.e. the data of the just-established mapping between feature vectors and classes, in the classifier 33.
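The training step described above can be pictured with the following sketch, which fits a one-against-one linear classifier to hypothetical labeled feature vectors of the (center frequency, energy-spectrum sum) form; the sample values, labels and use of scikit-learn are assumptions of the sketch rather than the patented implementation.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical labeled feature vectors (center frequency, energy-spectrum sum)
# and their manually assigned classes, standing in for the audio data set.
X = np.array([[33.0, 1000.0], [35.0, 980.0], [120.0, 40.0], [118.0, 55.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", decision_function_shape="ovo")  # one-against-one
clf.fit(X, y)              # learns the mapping from feature vectors to classes
print(clf.predict([[34.0, 990.0]]))   # expected to fall in the class-0 region
```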
The principle for dividing class regions in the method of the invention is: by partitioning the feature-vector space, each class region after division should contain only feature vectors of samples of that class, or contain as many feature vectors of that class as possible while containing as few feature vectors of other classes as possible.
The role of the identification unit 3 is to obtain the identification result for the audio data to be identified using the classifier 33 produced by the training of training module 23. The second preprocessing module 31 and the second feature extraction module 32 in the identification unit perform the same functions as the first preprocessing module 21 and the first feature extraction module 22 in the training unit, respectively.
After the audio sample to be identified is obtained, it is first preprocessed by the preprocessing module 31 to obtain the processed identification data. Then the same feature extraction method as in feature extraction module 22 is applied to the audio data to be identified to obtain its feature vector. The extracted feature vector is then fed as input to the classifier 33 (obtained from training module 23), which outputs the identification result according to the input feature vector. For example, when the feature vector to be classified lies in the region enclosed by the upper half of straight line 1 and the lower half of straight line 2 (the hexagon in Fig. 4), the invention judges that feature vector to belong to class 1. If the feature vector to be classified lies in the region enclosed by the upper halves of straight lines 1 and 2 (the heptagon in Fig. 4), the invention judges it to belong to class 2, the class of the circular feature vectors, and so on: the octagon in the figure is assigned to class 3 and the six-pointed star to class 4.
It can thus be seen that the classifier gives the identification result for the class of the audio to be identified according to the input feature vector. The more sample audio is collected in the audio data set, the more class regions are divided, the finer the classification of the audio data to be identified becomes, and the closer the classification result approaches the true sound class.
Commonly used classifiers in pattern classification systems include neural networks, support vector machines and AdaBoost. The process by which a classifier based on a linear support vector machine obtains its linear classification surfaces is described below.
Consider first the two-class problem as an example.
Given the feature vectors of the two classes and their class labels (x_1, y_1), …, (x_l, y_l) ∈ R^n × {±1}, the separating hyperplane w of the linear support vector machine can be obtained by solving the following optimization problem:

  min over w, b, η of  (1/2)‖w‖² + C Σ_{i=1}^{l} η_i
  s.t.  y_i [w·x_i − b] + η_i ≥ 1,  η_i ≥ 0,  i = 1, …, l

where C > 0 is a fixed penalty parameter. When a new feature vector x is obtained, if w·x − b ≥ 0 the feature vector is judged to belong to class +1; if w·x − b < 0 it is judged to belong to class −1. The absolute value |w·x − b| can serve as the rejection index: when |w·x − b| is greater than a certain threshold θ, the classification is considered reliable.
The one-against-one method is then used to extend this to the multi-class case. For a classification problem with k classes, the one-against-one method constructs k(k−1)/2 classification surfaces: every combination of two classes is taken from the k classes, and the two-class construction above is applied to each pair, giving k(k−1)/2 classification surfaces. Voting is used to determine the class of a feature vector x. Let the classification surface between class i and class j be w_ij (with bias b_ij): if w_ij·x − b_ij ≥ 0, one vote is cast for class i; if w_ij·x − b_ij < 0, one vote is cast for class j. After voting over all k(k−1)/2 classification surfaces, the class that receives the most votes is taken as the final classification result. In addition, every time a class receives a vote, the corresponding |w_ij·x − b_ij| is added to that class's total; this accumulated sum serves as the rejection index, and when the sum is greater than a certain threshold θ the classification is considered reliable.
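A minimal sketch of the one-against-one voting with the accumulated rejection index described above is given below; fitting each pairwise classification surface with scikit-learn's LinearSVC (which uses the convention w·x + b), the helper names and the threshold theta are assumptions of the sketch.

```python
from itertools import combinations
import numpy as np
from sklearn.svm import LinearSVC

def train_pairwise(X, y):
    """Fit one linear classification surface per unordered class pair (i, j)."""
    models = {}
    for i, j in combinations(np.unique(y), 2):
        mask = (y == i) | (y == j)
        clf = LinearSVC()                               # hyperplane w·x + b = 0
        clf.fit(X[mask], (y[mask] == i).astype(int))    # class i on the positive side
        models[(i, j)] = clf
    return models

def classify_with_rejection(models, x, theta):
    """Vote over all k(k-1)/2 surfaces and accumulate |decision value| per vote."""
    votes, accum = {}, {}
    for (i, j), clf in models.items():
        d = clf.decision_function(x.reshape(1, -1))[0]  # signed decision value
        winner = i if d >= 0 else j                     # vote for class i or j
        votes[winner] = votes.get(winner, 0) + 1
        accum[winner] = accum.get(winner, 0.0) + abs(d) # accumulated rejection index
    best = max(votes, key=votes.get)
    return best, accum[best], accum[best] > theta       # class, index, reliable?
```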
The identification result output by the classifier thus comprises the classification result giving the class of the audio to be identified and the rejection index. Taking the hexagon in Fig. 4 as the feature vector of the audio data to be identified: since the hexagon lies in the class-1 region enclosed by the upper half of straight line 1 and the lower half of straight line 2, its class is class 1; the closer the hexagon falls to the middle of the class-1 region, the more similar it is to the feature vectors of the class-1 samples and the higher the confidence of the classification result; the closer it falls to a classification boundary, the lower the confidence of the classification result.
The rejection index is a parameter used to measure the confidence of the classification result. For a classifier based on probabilities, the output classification result is the probability of belonging to each class, and this probability can serve as the rejection index: if the probabilities of belonging to all classes in the output are below a certain value, the classifier refuses to assign a class to the sample. For a classifier based on classification surfaces, the distance from the sample's feature vector to the nearest classification surface can serve as the rejection index: if this distance is below a certain value, the classifier refuses to assign a class to the sample.
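For instance, the probability-based rejection just mentioned could be sketched as follows (the helper name and the minimum probability p_min are illustrative assumptions):

```python
import numpy as np

def reject_by_probability(probs, p_min):
    """probs: class-membership probabilities output by a probabilistic classifier."""
    best = int(np.argmax(probs))
    if probs[best] < p_min:      # every class probability is below the cutoff
        return None              # refuse to assign a class to this sample
    return best
```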
The rejection index is used to judge the confidence of the classification result. In practical applications a threshold can be set experimentally (for example, the invention can build a small test set and then search for a threshold that rejects most of the unreliable samples in the test set; that value is then taken as the preset threshold). When the rejection index is greater than the preset threshold, the classification result given by the classifier is credible; when the rejection index is less than the preset threshold, the confidence of the classifier's result is low, and the classifier indicates that the classification result is not credible while still giving the class of the audio data to be identified. For example, as shown in Fig. 5, for a two-class problem with two-dimensional feature vectors, training a linear classifier amounts to finding a straight line such that the feature vectors of one class of samples lie on one side of the line and the feature vectors of the other class lie on the other side. The triangles in Fig. 5 are the feature vectors of one class obtained during training, the circles those of the other class, and the straight line is the classification boundary obtained from these two classes of feature vectors. When the feature vector of the audio data to be identified lies on the left of the line (the square in Fig. 5), the invention judges it to belong to the class of the triangular feature vectors; if the feature vector to be classified lies on the right of the line (the pentagram in Fig. 5), the invention judges it to belong to the class of the circular feature vectors. Here the sign (positive or negative) of the dot product between the linear classification surface parameters (which can be obtained from the normal vector of the linear classification surface) and the feature vector to be classified is used to discriminate the class of that feature vector, while the absolute value of the dot product serves as the rejection index used to measure the confidence of the classification: the larger the rejection index (the absolute value of the dot product), the higher the confidence of the classification, and when the absolute value of the dot product is greater than the preset threshold the classification is considered reliable.
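A minimal sketch of the two-class decision with rejection just described, assuming the classification-surface parameters w and b and the threshold theta have already been obtained from training and threshold selection (the function name is illustrative):

```python
import numpy as np

def identify_two_class(w, b, x, theta):
    """Sign of w·x − b decides the class; |w·x − b| is the rejection index."""
    d = np.dot(w, x) - b
    label = 1 if d >= 0 else -1       # which side of the line the vector falls on
    reliable = abs(d) > theta         # above the preset threshold: result credible
    return label, reliable
```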
In practical applications, the method and system can be used to identify the various sounds that occur in nature; alternatively, the system can first be trained to identify several specific kinds of audio and subsequent functions can be built on the identification result. By training a classifier that is fast and extensible, the system is guaranteed good real-time performance and extensibility.
The intelligent audio identification system of the present invention can be used for intelligent monitoring in many settings. For example, the system can be installed in an elevator to automatically identify abnormal sounds such as screams, the noise of quarrels and fights, and impact sounds, and to send an alarm signal to the monitoring personnel, thereby shortening the time needed to respond to abnormal situations in the elevator and reducing the workload of the elevator monitoring staff. The system can also be used for home monitoring: once installed indoors, it can identify abnormal noises that may occur in the home, such as breaking glass, impacts at the doorway, explosions and gunshots, and send an alarm signal immediately after recognizing such noises, thus effectively preventing criminal offences such as burglary by breaking doors and windows. The system can also be installed outdoors to automatically identify weather-related sounds such as thunder, wind and rain, and thereby monitor weather conditions in real time. In addition, the system can help wildlife researchers working in the field. Researchers often need to spend weeks or even months tracking rare wild animals; by scattering wireless sensors equipped with this system over a designated area, the call of a particular wild animal can be identified, and a signal is sent when that animal's call is recognized, helping the researcher with tracking. The system can also be used for the diagnosis of mechanical faults. When a machine fails, it emits sounds that differ from those of normal operation, and different faults produce different fault sounds. The system can learn from audio of several different faults, be installed near the machine, and identify the operating sound of the machine in real time; once a fault sound is recognized, it raises an alarm and reports the possible fault category, helping people discover mechanical faults in time and providing a basis for fault diagnosis. The system can also be applied to Internet-based audio retrieval and audio-based scene analysis.
It should be understood that those of ordinary skill in the art may make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the protection scope of the appended claims of the present invention.
Claims (11)
1. An intelligent audio identification method, comprising the following steps:
A. collecting various sample audio data and labeling the collected sample audio data;
B. extracting, one by one from the sample audio data, feature vectors that reflect their essential characteristics;
C. dividing class regions according to the feature vectors so that each class region after division contains as many feature vectors of that class of samples as possible, thereby building a classifier that maps feature vectors to their classes;
D. processing the audio data to be identified and extracting its feature vector;
E. inputting the feature vector of the audio data to be identified into the classifier, which discriminates according to this feature vector and produces an identification result for the audio data.
2. The method according to claim 1, characterized in that step B comprises the steps of:
B1. preprocessing the sample audio data to obtain training data;
B2. extracting from the training data feature components that reflect its essential characteristics;
B3. combining the feature components to obtain the feature vector.
3. The method according to claim 1, characterized in that step D comprises the steps of:
D1. preprocessing the audio data to be identified to obtain identification data;
D2. extracting from the identification data feature components that reflect its essential characteristics;
D3. combining the feature components to obtain the feature vector.
4. The method according to claim 2 or 3, characterized in that the feature components of step B2 or D2 include the center frequency of the audio, the energy features of the audio in certain characteristic frequency bands, or the energy distribution of the audio over a number of time periods.
5. The method according to claim 4, characterized in that the feature vector of step B3 or D3 is the vector formed by the center frequency of the audio together with the sums of the audio energy spectrum over certain characteristic frequency bands.
6. The method according to claim 5, characterized in that the class regions in step C are divided according to the numerical values of the feature vectors and are bounded by curves or curved surfaces.
7. The method according to claim 6, characterized in that step E comprises the following processing:
E1. inputting the feature vector of the audio data to be identified into the classifier, which discriminates according to this feature vector and produces both a classification result giving the class of the audio data and a rejection index, the rejection index being a parameter used to measure the confidence of the classification result;
E2. judging the confidence of the classification result according to the rejection index: when the rejection index is above a preset threshold, the classification result is judged credible and the classifier outputs the class of the audio data to be identified; when the rejection index is below the preset threshold, the classifier outputs the class but indicates that the classification result is not credible.
8. The method according to claim 7, characterized in that step A comprises identifying the collected sample audio data in order to determine and label what sound each sample is.
9. An intelligent audio identification system, characterized by comprising an audio data set for collecting and storing various kinds of sample audio data, a training unit and an identification unit; the training unit extracts the feature vectors of the sample audio data and finds and establishes a mapping from the sample feature vectors to their classes; the identification unit stores the data of the established mapping between feature vectors and classes, extracts the feature vector of the audio data to be identified and, according to that feature vector, outputs an identification result.
10. The system according to claim 9, characterized in that the training unit comprises a first preprocessing module, a first feature extraction module and a training module; the first preprocessing module denoises the sample audio data to obtain training data; the first feature extraction module extracts the feature vectors of the sample audio data from the training data; and the training module finds and establishes the mapping from the sample feature vectors to their classes.
11. The system according to claim 9 or 10, characterized in that the identification unit comprises a second preprocessing module, a second feature extraction module and a classifier; the second preprocessing module denoises the audio data to be identified to obtain identification data; the second feature extraction module extracts the feature vector of the audio data to be identified from the identification data; and the classifier stores the data of the mapping between feature vectors and classes output by the training module and, according to the input feature vector of the audio data to be identified, outputs the identification result.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710075008 CN101067930B (en) | 2007-06-07 | 2007-06-07 | Intelligent audio frequency identifying system and identifying method |
PCT/CN2008/000765 WO2008148289A1 (en) | 2007-06-07 | 2008-04-15 | An intelligent audio identifying system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710075008 CN101067930B (en) | 2007-06-07 | 2007-06-07 | Intelligent audio frequency identifying system and identifying method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101067930A true CN101067930A (en) | 2007-11-07 |
CN101067930B CN101067930B (en) | 2011-06-29 |
Family
ID=38880462
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200710075008 Active CN101067930B (en) | 2007-06-07 | 2007-06-07 | Intelligent audio frequency identifying system and identifying method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN101067930B (en) |
WO (1) | WO2008148289A1 (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008148289A1 (en) * | 2007-06-07 | 2008-12-11 | Shenzhen Institute Of Advanced Technology | An intelligent audio identifying system and method |
CN101587710B (en) * | 2009-07-02 | 2011-12-14 | 北京理工大学 | Multiple-codebook coding parameter quantification method based on audio emergent event |
CN102623007A (en) * | 2011-01-30 | 2012-08-01 | 清华大学 | Audio characteristic classification method based on variable duration |
CN102664004A (en) * | 2012-03-22 | 2012-09-12 | 重庆英卡电子有限公司 | Forest theft behavior identification method |
CN103198838A (en) * | 2013-03-29 | 2013-07-10 | 苏州皓泰视频技术有限公司 | Abnormal sound monitoring method and abnormal sound monitoring device used for embedded system |
CN103743477A (en) * | 2013-12-27 | 2014-04-23 | 柳州职业技术学院 | Mechanical failure detecting and diagnosing method and apparatus |
CN104464733A (en) * | 2014-10-28 | 2015-03-25 | 百度在线网络技术(北京)有限公司 | Multi-scene managing method and device of voice conversation |
CN105138696A (en) * | 2015-09-24 | 2015-12-09 | 深圳市冠旭电子有限公司 | Method and device for pushing music |
CN105679313A (en) * | 2016-04-15 | 2016-06-15 | 福建新恒通智能科技有限公司 | Audio recognition alarm system and method |
CN106531191A (en) * | 2015-09-10 | 2017-03-22 | 百度在线网络技术(北京)有限公司 | Method and device for providing danger report information |
CN107801090A (en) * | 2017-11-03 | 2018-03-13 | 北京奇虎科技有限公司 | Utilize the method, apparatus and computing device of audio-frequency information detection anomalous video file |
CN108764304A (en) * | 2018-05-11 | 2018-11-06 | Oppo广东移动通信有限公司 | scene recognition method, device, storage medium and electronic equipment |
CN108764114A (en) * | 2018-05-23 | 2018-11-06 | 腾讯音乐娱乐科技(深圳)有限公司 | A kind of signal recognition method and its equipment, storage medium, terminal |
CN108764341A (en) * | 2018-05-29 | 2018-11-06 | 中国矿业大学 | A kind of adaptive deep neural network model of operating mode and variable working condition method for diagnosing faults |
CN110658006A (en) * | 2018-06-29 | 2020-01-07 | 杭州萤石软件有限公司 | Sweeping robot fault diagnosis method and sweeping robot |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102184732A (en) * | 2011-04-28 | 2011-09-14 | 重庆邮电大学 | Fractal-feature-based intelligent wheelchair voice identification control method and system |
CN104700833A (en) * | 2014-12-29 | 2015-06-10 | 芜湖乐锐思信息咨询有限公司 | Big data speech classification method |
CN111370025A (en) * | 2020-02-25 | 2020-07-03 | 广州酷狗计算机科技有限公司 | Audio recognition method and device and computer storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6553342B1 (en) * | 2000-02-02 | 2003-04-22 | Motorola, Inc. | Tone based speech recognition |
JP3537727B2 (en) * | 2000-03-01 | 2004-06-14 | 日本電信電話株式会社 | Signal detection method, signal search method and recognition method, and recording medium |
CN1258170C (en) * | 2004-09-29 | 2006-05-31 | 上海交通大学 | Quick refusing method for non-command in inserted speech command identifying system |
CN101067930B (en) * | 2007-06-07 | 2011-06-29 | 深圳先进技术研究院 | Intelligent audio frequency identifying system and identifying method |
- 2007-06-07: CN application CN 200710075008 filed, granted as CN101067930B (active)
- 2008-04-15: PCT application PCT/CN2008/000765 filed, published as WO2008148289A1
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008148289A1 (en) * | 2007-06-07 | 2008-12-11 | Shenzhen Institute Of Advanced Technology | An intelligent audio identifying system and method |
CN101587710B (en) * | 2009-07-02 | 2011-12-14 | 北京理工大学 | Multiple-codebook coding parameter quantification method based on audio emergent event |
CN102623007A (en) * | 2011-01-30 | 2012-08-01 | 清华大学 | Audio characteristic classification method based on variable duration |
CN102623007B (en) * | 2011-01-30 | 2014-01-01 | 清华大学 | Audio characteristic classification method based on variable duration |
CN102664004A (en) * | 2012-03-22 | 2012-09-12 | 重庆英卡电子有限公司 | Forest theft behavior identification method |
CN102664004B (en) * | 2012-03-22 | 2013-10-23 | 重庆英卡电子有限公司 | Forest theft behavior identification method |
CN103198838A (en) * | 2013-03-29 | 2013-07-10 | 苏州皓泰视频技术有限公司 | Abnormal sound monitoring method and abnormal sound monitoring device used for embedded system |
CN103743477B (en) * | 2013-12-27 | 2016-01-13 | 柳州职业技术学院 | A kind of mechanical fault detection diagnostic method and equipment thereof |
CN103743477A (en) * | 2013-12-27 | 2014-04-23 | 柳州职业技术学院 | Mechanical failure detecting and diagnosing method and apparatus |
CN104464733B (en) * | 2014-10-28 | 2019-09-20 | 百度在线网络技术(北京)有限公司 | A kind of more scene management method and devices of voice dialogue |
CN104464733A (en) * | 2014-10-28 | 2015-03-25 | 百度在线网络技术(北京)有限公司 | Multi-scene managing method and device of voice conversation |
CN106531191A (en) * | 2015-09-10 | 2017-03-22 | 百度在线网络技术(北京)有限公司 | Method and device for providing danger report information |
CN105138696A (en) * | 2015-09-24 | 2015-12-09 | 深圳市冠旭电子有限公司 | Method and device for pushing music |
CN105138696B (en) * | 2015-09-24 | 2019-11-19 | 深圳市冠旭电子股份有限公司 | A kind of music method for pushing and device |
CN105679313A (en) * | 2016-04-15 | 2016-06-15 | 福建新恒通智能科技有限公司 | Audio recognition alarm system and method |
CN107801090A (en) * | 2017-11-03 | 2018-03-13 | 北京奇虎科技有限公司 | Utilize the method, apparatus and computing device of audio-frequency information detection anomalous video file |
CN108764304A (en) * | 2018-05-11 | 2018-11-06 | Oppo广东移动通信有限公司 | scene recognition method, device, storage medium and electronic equipment |
CN108764304B (en) * | 2018-05-11 | 2020-03-06 | Oppo广东移动通信有限公司 | Scene recognition method and device, storage medium and electronic equipment |
CN108764114A (en) * | 2018-05-23 | 2018-11-06 | 腾讯音乐娱乐科技(深圳)有限公司 | A kind of signal recognition method and its equipment, storage medium, terminal |
CN108764341B (en) * | 2018-05-29 | 2019-07-19 | 中国矿业大学 | A kind of Fault Diagnosis of Roller Bearings under the conditions of variable working condition |
CN108764341A (en) * | 2018-05-29 | 2018-11-06 | 中国矿业大学 | A kind of adaptive deep neural network model of operating mode and variable working condition method for diagnosing faults |
CN110658006A (en) * | 2018-06-29 | 2020-01-07 | 杭州萤石软件有限公司 | Sweeping robot fault diagnosis method and sweeping robot |
Also Published As
Publication number | Publication date |
---|---|
WO2008148289A1 (en) | 2008-12-11 |
CN101067930B (en) | 2011-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101067930B (en) | Intelligent audio frequency identifying system and identifying method | |
Heinicke et al. | Assessing the performance of a semi‐automated acoustic monitoring system for primates | |
Ektefa et al. | Intrusion detection using data mining techniques | |
CN107527617A (en) | Monitoring method, apparatus and system based on voice recognition | |
Carletti et al. | Audio surveillance using a bag of aural words classifier | |
Dugan et al. | North Atlantic right whale acoustic signal processing: Part I. Comparison of machine learning recognition algorithms | |
CN101546556A (en) | Classification system for identifying audio content | |
CN110017991A (en) | Rolling bearing fault classification method and system based on spectrum kurtosis and neural network | |
CN1302456C (en) | Sound veins identifying method | |
CN101079109A (en) | Identity identification method and system based on uniform characteristic | |
Brabant et al. | Comparing the results of four widely used automated bat identification software programs to identify nine bat species in coastal Western Europe | |
CN111460940A (en) | Stranger foot drop point studying and judging method and system | |
CN114023354A (en) | Guidance type acoustic event detection model training method based on focusing loss function | |
CN112270633A (en) | Public welfare litigation clue studying and judging system and method based on big data drive | |
Wang et al. | A novel underground pipeline surveillance system based on hybrid acoustic features | |
Xie et al. | Detecting frog calling activity based on acoustic event detection and multi-label learning | |
CN118347564A (en) | Vehicle overload detection method based on acceleration vibration sensor | |
García-de-la-Puente et al. | Deep Learning Models for Gunshot Detection in the Albufera Natural Park | |
CN117692588A (en) | Intelligent visual noise monitoring and tracing device | |
Xie et al. | Acoustic feature extraction using perceptual wavelet packet decomposition for frog call classification | |
CN115659056A (en) | Accurate matching system of user service based on big data | |
CN115409114A (en) | Two-stage early classification method for data stream | |
US6243671B1 (en) | Device and method for analysis and filtration of sound | |
Diez Gaspon et al. | Deep learning for natural sound classification | |
Jaafar et al. | Effect of natural background noise and man-made noise on automated frog calls identification system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |