CN109243492A - Speech emotion recognition system and recognition method - Google Patents
Speech emotion recognition system and recognition method
- Publication number
- CN109243492A (application CN201811263371.3A)
- Authority
- CN
- China
- Prior art keywords
- module
- feature
- voice
- extraction module
- speech
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Abstract
The present invention discloses a speech emotion recognition system comprising a speech preprocessing module, an emotional feature extraction module, and an emotion analysis module. The input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the emotion analysis module; and the output of the emotion analysis module delivers the analysis and recognition result. The speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts the acoustic parameters closely associated with emotion in the speech signal; these are finally fed to the emotion analysis module, which completes the emotion judgment. The present invention also proposes a speech emotion recognition method. The invention adds a detection means to telephone-fraud systems, enables multi-dimensional analysis of voice data, and improves the detection accuracy of the system by 5%.
Description
Technical field
The present invention relates to a speech emotion recognition system and recognition method, and belongs to the technical field of speech analysis.
Background art
At present, existing telephone-fraud prevention and interception systems rest mainly on three technical routes: early warning based on signaling data, fraudulent-call warning based on comparison against known harmful recordings, and natural-person fraudulent-call warning based on intelligent speech technology. These routes share the following problems: simple signaling analysis and recording comparison yield little usable characteristic information, and it is difficult to balance accuracy against comprehensiveness. In addition, subject analysis of speech requires accumulating audio of a certain duration, which places high demands on the system's speech access and processing capacity, so operating such a system online is costly.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the existing technologies by providing a speech emotion recognition system and recognition method. Through mood and emotion recognition processing, the system identifies abnormal emotional features of the target speaker, which helps assess abnormal behavior and intent in phone calls and effectively assists the early-warning detection of fraudulent calls. It overcomes the technical defect of conventional solutions, in which basic intent understanding obtains only the literal message and cannot mine the abnormal information brought about by changes in mood and emotion.
To solve the above technical problems, the present invention provides a speech emotion recognition system, characterized by comprising a speech preprocessing module, an emotional feature extraction module, and an emotion analysis module. The input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the emotion analysis module; and the output of the emotion analysis module delivers the analysis and recognition result. The speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts the acoustic parameters closely associated with emotion in the speech signal; these are finally fed to the emotion analysis module, which completes the emotion judgment.
As a preferred embodiment, the emotional feature extraction module comprises a characteristic parameter extraction module and a characteristic parameter selection and processing module; the output of the characteristic parameter extraction module is connected to the input of the characteristic parameter selection and processing module.
As a preferred embodiment, the characteristic parameter extraction module comprises, connected in sequence, a temporal feature extraction module, a fundamental frequency feature extraction module, a voiced/unvoiced sound judgment module, a speech rate extraction module, and a formant extraction module. The temporal feature extraction module extracts the short-time energy feature of the speech signal; the fundamental frequency feature extraction module extracts the fundamental frequency (pitch) feature; the voiced/unvoiced sound judgment module extracts the zero-crossing rate feature; the speech rate extraction module extracts the speech rate feature; and the formant extraction module extracts the formant features.
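The patent does not specify how these parameters are computed, so the following Python sketch is illustrative only: the use of librosa, the frame sizes, the onset-density speech-rate proxy, and the LPC-root formant estimate are all assumptions rather than the patent's method.

```python
# Hypothetical feature extraction sketch (not from the patent); assumes librosa is available.
import numpy as np
import librosa

def extract_frame_features(y, sr, frame_length=2048, hop_length=512):
    """Extract the five parameter types named above from one speech signal."""
    # Short-time energy, approximated here by per-frame RMS.
    energy = librosa.feature.rms(y=y, frame_length=frame_length, hop_length=hop_length)[0]
    # Zero-crossing rate, the cue used by the voiced/unvoiced judgment.
    zcr = librosa.feature.zero_crossing_rate(y, frame_length=frame_length, hop_length=hop_length)[0]
    # Fundamental frequency (pitch) via the YIN estimator.
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr, frame_length=frame_length, hop_length=hop_length)
    # Crude speech-rate proxy: onset density per second (the patent does not define its measure).
    onsets = librosa.onset.onset_detect(y=y, sr=sr, hop_length=hop_length)
    speech_rate = len(onsets) / (len(y) / sr)
    # Formant estimates from the angles of LPC roots, computed once over the whole signal
    # for brevity (a per-frame computation would be more faithful).
    lpc = librosa.lpc(y, order=8)
    roots = [r for r in np.roots(lpc) if np.imag(r) > 0]
    formants = sorted(np.angle(r) * sr / (2 * np.pi) for r in roots)[:3]
    n = min(len(energy), len(zcr), len(f0))
    return {"energy": energy[:n], "zcr": zcr[:n], "f0": f0[:n],
            "speech_rate": speech_rate, "formants": formants}
```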
As a preferred embodiment, the characteristic parameter selection and processing module completes data conversion and transfer: it performs selection processing on the individual characteristic parameters extracted by the characteristic parameter extraction module, such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features, and brings the final characteristic parameters together. Each sound segment of each speech signal forms one feature vector, and the feature vectors ultimately form a feature vector set, which constitutes the classifier training input file used for the training or recognition of the emotion analysis module.
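A minimal sketch of this selection-and-assembly step follows; the per-segment statistics, the hypothetical `extract_frame_features` helper from the previous sketch, and the `.npz` file format are all assumptions, since the patent does not define the layout of the classifier training input file.

```python
# Hypothetical assembly of the feature vector set; statistics and file format are assumed.
import numpy as np

def segment_feature_vector(feats):
    """Collapse the frame-level features of one sound segment into one fixed-length vector."""
    stats = []
    for key in ("energy", "zcr", "f0"):
        x = np.asarray(feats[key], dtype=float)
        stats.extend([x.mean(), x.std(), x.max(), x.min()])  # simple per-segment statistics
    stats.append(feats["speech_rate"])
    stats.extend((list(feats["formants"]) + [0.0] * 3)[:3])  # pad/truncate to three formant slots
    return np.array(stats)

def build_training_input(segments, labels, path="classifier_train_input.npz"):
    """Stack one vector per segment into a feature vector set and save the training input file."""
    X = np.vstack([segment_feature_vector(f) for f in segments])
    y = np.asarray(labels)
    np.savez(path, X=X, y=y)
    return X, y
```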
As a preferred embodiment, the emotion analysis module comprises a classifier module. On the basis of successfully extracting the characteristic parameters of a voice file, the classifier module predicts, by machine-learning methods, the mood category to which the recording belongs.
As a preferred embodiment, the classifier module builds on a deep neural network combined with a contribution-analysis-based PCA algorithm, yielding a deep-neural-network speech mood recognition model based on PCA contribution analysis. PCA contribution analysis extracts the principal components of the class features that carry the speech mood information and feeds them to the deep neural network as input for network training, which effectively reduces redundant parameters, improves training efficiency, and realizes mood classification.
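The patent names this combination but no architecture, so the sketch below is only one plausible reading: scikit-learn's PCA keeps the high-contribution components and a small MLP stands in for the deep neural network; the 95% component threshold, layer sizes, and iteration budget are assumptions.

```python
# Hypothetical PCA + deep-neural-network pipeline; all hyperparameters are assumptions.
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def make_emotion_model():
    """PCA keeps components up to 95% cumulative contribution; an MLP classifies moods."""
    return make_pipeline(
        StandardScaler(),                              # acoustic features differ widely in scale
        PCA(n_components=0.95),                        # contribution-based dimensionality reduction
        MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500),
    )
```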
The present invention also proposes a speech emotion recognition method, characterized by specifically comprising the following steps: a deep-neural-network speech mood recognition model training step, and a deep-neural-network speech mood recognition model prediction step.
As a preferred embodiment, the deep-neural-network speech mood recognition model training step specifically comprises: a labeled speech mood database is fed into the characteristic parameter extraction module, which processes it to obtain the individual characteristic parameters such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features; the characteristic parameter selection and processing module then performs selection processing and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector; the feature vectors ultimately form a feature vector set constituting the classifier training input file, which is fed to the classifier module of the emotion analysis module for training, yielding the deep-neural-network speech mood recognition model.
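Under the same assumptions as the earlier sketches, the training step could be wired together as follows; the database layout (one labeled recording per file) and the helpers `extract_frame_features`, `build_training_input`, and `make_emotion_model` are hypothetical, not from the patent.

```python
# Hypothetical end-to-end training step tying the earlier sketches together.
import librosa

def train_emotion_model(labeled_wavs):
    """labeled_wavs: iterable of (wav_path, mood_label) pairs from the labeled mood database."""
    segments, labels = [], []
    for path, label in labeled_wavs:
        y, sr = librosa.load(path, sr=None)             # keep the native sample rate
        segments.append(extract_frame_features(y, sr))  # one segment per file, for simplicity
        labels.append(label)
    X, y = build_training_input(segments, labels)       # forms the classifier training input
    model = make_emotion_model()
    model.fit(X, y)                                     # network training on the feature set
    return model
```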
As a preferred embodiment, the deep-neural-network speech emotion model prediction step specifically comprises: a speech mood database of unknown classification is fed into the characteristic parameter extraction module, which processes it to obtain the individual characteristic parameters such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features; the characteristic parameter selection and processing module then performs selection processing and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector; the feature vectors ultimately form a feature vector set constituting the classifier input file; the classifier module of the emotion analysis module then uses the deep-neural-network speech mood recognition model obtained in the training step to predict the mood category to which the speech signal belongs, and outputs the emotion recognition dimension result.
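A matching prediction sketch under the same hypothetical helpers; the point the sketch makes explicit is that recordings of unknown classification pass through exactly the same extraction and selection pipeline as the training data.

```python
# Hypothetical prediction step; reuses the training-time feature pipeline unchanged.
import numpy as np
import librosa

def predict_moods(model, wav_paths):
    """Return the predicted mood category for each recording of unknown classification."""
    segments = []
    for path in wav_paths:
        y, sr = librosa.load(path, sr=None)
        segments.append(extract_frame_features(y, sr))
    X = np.vstack([segment_feature_vector(f) for f in segments])
    return model.predict(X)
```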
Advantageous effects of the invention: through mood and emotion recognition processing, the speech emotion recognition system and recognition method proposed by the present invention identify the abnormal emotional features of the target speaker, which helps assess abnormal behavior and intent in phone calls and effectively assists the early-warning detection of fraudulent calls. They overcome the technical defect of conventional solutions, in which basic intent understanding obtains only the literal message and cannot mine the abnormal information brought about by changes in mood and emotion; they add a detection means to telephone-fraud systems, enable multi-dimensional analysis of voice data, and improve the detection accuracy of the system by 5%.
Brief description of the drawings
Fig. 1 is a structural block diagram of the speech emotion recognition system of the invention.
Fig. 2 is a flowchart of the speech emotion recognition method of the invention.
Fig. 3 is a structural block diagram of the emotional characteristic parameter extraction module of the invention.
Fig. 4 is a flowchart of the deep-neural-network speech mood recognition model training step of the invention.
Fig. 5 is a flowchart of the deep-neural-network speech emotion model prediction step of the invention.
Specific embodiment
The invention is further described below in conjunction with the accompanying drawings. The following embodiments are only intended to illustrate the technical solution of the present invention clearly and are not intended to limit its protection scope.
Fig. 1 shows a structural block diagram of the speech emotion recognition system of the invention. The present invention provides a speech emotion recognition system characterized by comprising a speech preprocessing module, an emotional feature extraction module, and an emotion analysis module. The input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the emotion analysis module; and the output of the emotion analysis module delivers the analysis and recognition result. The speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts the acoustic parameters closely associated with emotion in the speech signal; these are finally fed to the emotion analysis module, which completes the emotion judgment.
As a preferred embodiment, the emotional feature extraction module comprises a characteristic parameter extraction module and a characteristic parameter selection and processing module; the output of the characteristic parameter extraction module is connected to the input of the characteristic parameter selection and processing module.
Fig. 3 shows a structural block diagram of the emotional characteristic parameter extraction module of the invention. As a preferred embodiment, the characteristic parameter extraction module comprises, connected in sequence, a temporal feature extraction module, a fundamental frequency feature extraction module, a voiced/unvoiced sound judgment module, a speech rate extraction module, and a formant extraction module. The temporal feature extraction module extracts the short-time energy feature of the speech signal; the fundamental frequency feature extraction module extracts the fundamental frequency feature; the voiced/unvoiced sound judgment module extracts the zero-crossing rate feature; the speech rate extraction module extracts the speech rate feature; and the formant extraction module extracts the formant features.
As a preferred embodiment, the characteristic parameter selection and processing module completes data conversion and transfer: it performs selection processing on the individual characteristic parameters extracted by the characteristic parameter extraction module, such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features, and brings the final characteristic parameters together. Each sound segment of each speech signal forms one feature vector, and the feature vectors ultimately form a feature vector set, which constitutes the classifier training input file used for the training or recognition of the emotion analysis module.
As a preferred embodiment, the emotion analysis module comprises a classifier module. On the basis of successfully extracting the characteristic parameters of a voice file, the classifier module predicts, by machine-learning methods, the mood category to which the recording belongs.
As a preferred embodiment, the classifier module builds on a deep neural network combined with a contribution-analysis-based PCA algorithm, yielding a deep-neural-network speech mood recognition model based on PCA contribution analysis. PCA contribution analysis extracts the principal components of the class features that carry the speech mood information and feeds them to the deep neural network as input for network training, which effectively reduces redundant parameters, improves training efficiency, and realizes mood classification.
Fig. 2 shows a flowchart of the speech emotion recognition method of the invention. The present invention also proposes a speech emotion recognition method, characterized by specifically comprising the following steps: a deep-neural-network speech mood recognition model training step, and a deep-neural-network speech mood recognition model prediction step.
Fig. 4 shows a flowchart of the deep-neural-network speech mood recognition model training step of the invention. As a preferred embodiment, the training step specifically comprises: a labeled speech mood database is fed into the characteristic parameter extraction module, which processes it to obtain the individual characteristic parameters such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features; the characteristic parameter selection and processing module then performs selection processing and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector; the feature vectors ultimately form a feature vector set constituting the classifier training input file, which is fed to the classifier module of the emotion analysis module for training, yielding the deep-neural-network speech mood recognition model.
Fig. 5 shows a flowchart of the deep-neural-network speech emotion model prediction step of the invention. As a preferred embodiment, the prediction step specifically comprises: a speech mood database of unknown classification is fed into the characteristic parameter extraction module, which processes it to obtain the individual characteristic parameters such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features; the characteristic parameter selection and processing module then performs selection processing and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector; the feature vectors ultimately form a feature vector set constituting the classifier input file; the classifier module of the emotion analysis module then uses the deep-neural-network speech mood recognition model obtained in the training step to predict the mood category to which the speech signal belongs, and outputs the emotion recognition dimension result.
It should be noted that there are many classes of raw speech features available for emotion recognition, and at the outset it may be impossible to say which speech characteristic parameters accurately reflect changes in human emotion, so as many parameters as possible that can characterize emotional change are usually extracted for mood recognition. Although these characteristic parameters all reflect mood changes to varying degrees, some of them are correlated with one another, and the information they carry overlaps to a certain extent; such overlapping parameters are redundant characteristic parameters. In addition, some characteristic parameters may have little or even no direct association with the mood to be classified; these are useless parameters. Both redundant and useless characteristic parameters can increase the complexity of the whole system and may even hurt the recognition efficiency of the classifier.
For these reasons, before mood classification of the signal is carried out, the correlation among the extracted characteristic parameters needs to be eliminated and the useless parameters removed. This requires an appropriate method for selecting, from the many extracted parameters, the effective parameters with a significant contribution, i.e. for performing feature selection.
Research on parameter selection in mood recognition is relatively mature; the parameter selection methods currently used by many scholars include linear discriminant analysis (LDA), principal component analysis (PCA), fuzzy entropy methods, suboptimal search methods, and linear regression models. Among these, PCA is at present the most common feature selection and dimensionality reduction method. With the goal of losing as little important information as possible, it linearly combines the many extracted initial parameters into a small number of parameters; these transformed parameters are mutually uncorrelated and are called the principal components of the original characteristic parameters. They contain most of the information of the original parameters in fewer dimensions, which is their advantage over the initial parameters. On this basis, the PCA method is used to perform contribution analysis on the above class features and realize the dimensionality reduction.
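As a concrete illustration of the contribution analysis just described (the patent gives no formula), the per-component contribution can be read off scikit-learn's explained variance ratios; the 95% threshold below is an assumed choice, not a figure from the source.

```python
# Hypothetical contribution analysis: rank principal components by explained variance ratio.
import numpy as np
from sklearn.decomposition import PCA

def contribution_analysis(X, threshold=0.95):
    """Keep the leading components whose cumulative contribution first reaches the threshold."""
    pca = PCA().fit(X)
    contributions = pca.explained_variance_ratio_            # per-component contribution
    cumulative = np.cumsum(contributions)
    k = int(np.searchsorted(cumulative, threshold)) + 1      # smallest k reaching the threshold
    print(f"keeping {k} of {X.shape[1]} components ({cumulative[k - 1]:.1%} of variance)")
    return PCA(n_components=k).fit_transform(X)
```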
The classifier module is the core module of the mood recognition system: on the basis of successfully extracting the characteristic parameters of a voice file, it predicts, by machine-learning methods, the mood category to which the recording belongs. Before the module can predict successfully, it must first be trained. On the basis of a deep neural network, and in combination with the contribution-analysis-based PCA algorithm, this scheme proposes a deep-neural-network speech emotion model based on PCA contribution analysis: PCA extracts the principal components of the class features that carry the speech mood information as the input of the deep neural network, and the network training effectively reduces redundant parameters, improves training efficiency, and realizes mood classification.
The above is only a preferred embodiment of the present invention. It should be noted that persons of ordinary skill in the art may make several improvements and variations without departing from the technical principles of the invention, and these improvements and variations should also be regarded as falling within the protection scope of the present invention.
Claims (9)
1. A speech emotion recognition system, characterized by comprising a speech preprocessing module, an emotional feature extraction module, and an emotion analysis module, wherein the input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the emotion analysis module; and the output of the emotion analysis module delivers the analysis and recognition result; the speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts the acoustic parameters closely associated with emotion in the speech signal; these are finally fed to the emotion analysis module, which completes the emotion judgment.
2. The speech emotion recognition system according to claim 1, characterized in that the emotional feature extraction module comprises a characteristic parameter extraction module and a characteristic parameter selection and processing module, the output of the characteristic parameter extraction module being connected to the input of the characteristic parameter selection and processing module.
3. The speech emotion recognition system according to claim 2, characterized in that the characteristic parameter extraction module comprises, connected in sequence, a temporal feature extraction module, a fundamental frequency feature extraction module, a voiced/unvoiced sound judgment module, a speech rate extraction module, and a formant extraction module; the temporal feature extraction module extracts the short-time energy feature of the speech signal, the fundamental frequency feature extraction module extracts the fundamental frequency feature, the voiced/unvoiced sound judgment module extracts the zero-crossing rate feature, the speech rate extraction module extracts the speech rate feature, and the formant extraction module extracts the formant features.
4. The speech emotion recognition system according to claim 2, characterized in that the characteristic parameter selection and processing module completes data conversion and transfer: it performs selection processing on the individual characteristic parameters extracted by the characteristic parameter extraction module, such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features, and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector, the feature vectors ultimately forming a feature vector set that constitutes the classifier training input file used for the training or recognition of the emotion analysis module.
5. The speech emotion recognition system according to claim 1, characterized in that the emotion analysis module comprises a classifier module which, on the basis of successfully extracting the characteristic parameters of a voice file, predicts by machine-learning methods the mood category to which the recording belongs.
6. The speech emotion recognition system according to claim 5, characterized in that the classifier module builds on a deep neural network combined with a contribution-analysis-based PCA algorithm to form a deep-neural-network speech mood recognition model based on PCA contribution analysis; PCA contribution analysis extracts the principal components of the class features that carry the speech mood information as the input of the deep neural network, and the network training effectively reduces redundant parameters, improves training efficiency, and realizes mood classification.
7. A recognition method based on the speech emotion recognition system according to claim 1, characterized by specifically comprising the following steps: a deep-neural-network speech mood recognition model training step; and a deep-neural-network speech mood recognition model prediction step.
8. The speech emotion recognition method according to claim 7, characterized in that the deep-neural-network speech mood recognition model training step specifically comprises: feeding a labeled speech mood database into the characteristic parameter extraction module, which processes it to obtain the individual characteristic parameters such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features; the characteristic parameter selection and processing module then performs selection processing and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector and the feature vectors ultimately forming a feature vector set that constitutes the classifier training input file; the classifier module of the emotion analysis module is trained on this file, yielding the deep-neural-network speech mood recognition model.
9. The speech emotion recognition method according to claim 8, characterized in that the deep-neural-network speech emotion model prediction step specifically comprises: feeding a speech mood database of unknown classification into the characteristic parameter extraction module, which processes it to obtain the individual characteristic parameters such as the short-time energy, zero-crossing rate, fundamental frequency, speech rate, and formant features; the characteristic parameter selection and processing module then performs selection processing and brings the final characteristic parameters together, each sound segment of each speech signal forming one feature vector and the feature vectors ultimately forming a feature vector set that constitutes the classifier input file; the classifier module of the emotion analysis module uses the deep-neural-network speech mood recognition model obtained in the training step to predict the mood category to which the speech signal belongs, and outputs the emotion recognition dimension result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811263371.3A CN109243492A (en) | 2018-10-28 | 2018-10-28 | Speech emotion recognition system and recognition method
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811263371.3A CN109243492A (en) | 2018-10-28 | 2018-10-28 | Speech emotion recognition system and recognition method
Publications (1)
Publication Number | Publication Date |
---|---|
CN109243492A true CN109243492A (en) | 2019-01-18 |
Family
ID=65078554
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811263371.3A Pending CN109243492A (en) | 2018-10-28 | 2018-10-28 | A kind of speech emotion recognition system and recognition methods |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109243492A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140257820A1 (en) * | 2013-03-10 | 2014-09-11 | Nice-Systems Ltd | Method and apparatus for real time emotion detection in audio interactions |
CN104200814A (en) * | 2014-08-15 | 2014-12-10 | 浙江大学 | Speech emotion recognition method based on semantic cells |
CN107705807A (en) * | 2017-08-24 | 2018-02-16 | 平安科技(深圳)有限公司 | Voice quality detecting method, device, equipment and storage medium based on Emotion identification |
CN108039181A (en) * | 2017-11-02 | 2018-05-15 | 北京捷通华声科技股份有限公司 | The emotion information analysis method and device of a kind of voice signal |
CN108305639A (en) * | 2018-05-11 | 2018-07-20 | 南京邮电大学 | Speech-emotion recognition method, computer readable storage medium, terminal |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020211820A1 (en) * | 2019-04-16 | 2020-10-22 | 华为技术有限公司 | Method and device for speech emotion recognition |
US11900959B2 (en) | 2019-04-16 | 2024-02-13 | Huawei Technologies Co., Ltd. | Speech emotion recognition method and apparatus |
CN110379441A (en) * | 2019-07-01 | 2019-10-25 | 特斯联(北京)科技有限公司 | A kind of voice service method and system based on countering type smart network |
CN110349586A (en) * | 2019-07-23 | 2019-10-18 | 北京邮电大学 | Telecommunication fraud detection method and device |
CN110349586B (en) * | 2019-07-23 | 2022-05-13 | 北京邮电大学 | Telecommunication fraud detection method and device |
CN112735431A (en) * | 2020-12-29 | 2021-04-30 | 三星电子(中国)研发中心 | Model training method and device and artificial intelligence dialogue recognition method and device |
CN112735431B (en) * | 2020-12-29 | 2023-12-22 | 三星电子(中国)研发中心 | Model training method and device and artificial intelligent dialogue recognition method and device |
CN112735479A (en) * | 2021-03-31 | 2021-04-30 | 南方电网数字电网研究院有限公司 | Speech emotion recognition method and device, computer equipment and storage medium |
CN112735479B (en) * | 2021-03-31 | 2021-07-06 | 南方电网数字电网研究院有限公司 | Speech emotion recognition method and device, computer equipment and storage medium |
CN113314151A (en) * | 2021-05-26 | 2021-08-27 | 中国工商银行股份有限公司 | Voice information processing method and device, electronic equipment and storage medium |
CN113314103A (en) * | 2021-05-31 | 2021-08-27 | 中国工商银行股份有限公司 | Illegal information identification method and device based on real-time speech emotion analysis |
CN113314103B (en) * | 2021-05-31 | 2023-03-03 | 中国工商银行股份有限公司 | Illegal information identification method and device based on real-time speech emotion analysis |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190118 |