
CN109243492A - Speech emotion recognition system and recognition method - Google Patents

Speech emotion recognition system and recognition method

Info

Publication number
CN109243492A
CN109243492A (application CN201811263371.3A)
Authority
CN
China
Prior art keywords
module
feature
voice
extraction module
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811263371.3A
Other languages
Chinese (zh)
Inventor
张震
李鹏
黄远
高圣翔
殷兵
刘冠男
倪江帆
冯向雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xun Feizhi Metamessage Science And Technology Ltd
National Computer Network and Information Security Management Center
Original Assignee
Xun Feizhi Metamessage Science And Technology Ltd
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xun Feizhi Metamessage Science And Technology Ltd and National Computer Network and Information Security Management Center
Priority to CN201811263371.3A
Publication of CN109243492A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques specially adapted for particular use, for comparison or discrimination
    • G10L25/63 - Speech or voice analysis techniques for comparison or discrimination, for estimating an emotional state
    • G10L25/27 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the analysis technique
    • G10L25/30 - Speech or voice analysis techniques characterised by the analysis technique, using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Hospice & Palliative Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Child & Adolescent Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The present invention discloses a speech emotion recognition system comprising a speech preprocessing module, an emotional feature extraction module, and a sentiment analysis module. The input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the sentiment analysis module; and the output of the sentiment analysis module delivers the analysis and recognition result. The speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts from the speech signal the acoustic parameters closely associated with emotion; these are finally fed to the sentiment analysis module, which completes the emotion judgment. The present invention also proposes a speech emotion recognition method. The invention adds a detection means to telephone fraud detection systems, enables multi-dimensional analysis of voice data, and improves the detection accuracy of such systems by 5%.

Description

Speech emotion recognition system and recognition method
Technical field
The present invention relates to a speech emotion recognition system and recognition method, and belongs to the technical field of speech analysis.
Background technique
At present, existing telephone fraud prevention and interception systems are mainly based on early-warning technology built on signaling data, fraudulent-call early warning based on comparison against known harmful recordings, and natural-person fraudulent-call early warning based on intelligent speech technology. These technical approaches share the following problems: simple signaling analysis and recording comparison yield little usable feature information, making it difficult to achieve both accuracy and comprehensiveness. In addition, topic analysis of speech requires accumulating audio of a certain duration, which places high demands on the system's speech access capacity and on its processing and analysis capability, so operating such a system online is relatively costly.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the existing technologies by providing a speech emotion recognition system and recognition method that, through mood and emotion recognition processing, identifies abnormal emotional features of a target speaker, helps assess abnormal behavior and intent during a phone call, and effectively assists the early-warning detection of fraudulent calls. It overcomes the technical defect of conventional solutions, in which basic intent understanding can only obtain the literal message and cannot mine the abnormal information carried by changes in mood and emotion.
To solve the above technical problems, the present invention provides a speech emotion recognition system characterized by comprising a speech preprocessing module, an emotional feature extraction module, and a sentiment analysis module. The input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the sentiment analysis module; and the output of the sentiment analysis module delivers the analysis and recognition result. The speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts from the speech signal the acoustic parameters closely associated with emotion; these are finally fed to the sentiment analysis module, which completes the emotion judgment.
In a preferred embodiment, the emotional feature extraction module includes a feature parameter extraction module and a feature parameter selection and processing module, the output of the feature parameter extraction module being connected to the input of the feature parameter selection and processing module.
In a preferred embodiment, the feature parameter extraction module includes, connected in sequence, a temporal feature extraction module, a fundamental frequency feature extraction module, a voiced/unvoiced sound judgment module, a speech rate extraction module, and a formant extraction module. The temporal feature extraction module extracts the short-time energy feature of the speech signal; the fundamental frequency feature extraction module extracts the fundamental frequency feature; the voiced/unvoiced sound judgment module extracts the zero-crossing rate feature; the speech rate extraction module extracts the speech rate feature; and the formant extraction module extracts the formant feature.
In a preferred embodiment, the feature parameter selection and processing module completes data conversion and transfer: it performs selection processing on the single feature parameters extracted by the feature parameter extraction module, such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature, and splices the final feature parameters together so that each speech segment of each speech signal forms one feature vector. The feature vectors ultimately form a feature vector set, from which a classifier training input file is formed for use in training or recognition by the sentiment analysis module.
In a preferred embodiment, the sentiment analysis module includes a classifier module which, on the basis of the feature parameters successfully extracted from a voice file, predicts the emotion category of the recording by means of machine learning.
In a preferred embodiment, the classifier module builds on a deep neural network combined with a contribution-analysis-based PCA algorithm, yielding a deep neural network speech emotion recognition model based on PCA contribution analysis. PCA contribution analysis extracts the principal components of the features that carry speech emotion and uses them as the deep neural network input for network training, which effectively reduces redundant parameters, improves training efficiency, and achieves emotion classification.
The present invention also proposes a speech emotion recognition method characterized by comprising the following steps: a deep neural network speech emotion recognition model training step; and a deep neural network speech emotion recognition model prediction step.
In a preferred embodiment, the deep neural network speech emotion recognition model training step specifically includes: feeding a labeled speech emotion database into the feature parameter extraction module, which processes it to obtain single feature parameters such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature; feeding these into the feature parameter selection and processing module for selection processing, and splicing the final feature parameters together so that each speech segment of each speech signal forms one feature vector; ultimately forming a feature vector set and a classifier training input file; and feeding this file into the classifier module of the sentiment analysis module for training, which yields the deep neural network speech emotion recognition model.
In a preferred embodiment, the deep neural network speech emotion model prediction step specifically includes: feeding a speech emotion database of unknown category into the feature parameter extraction module, which processes it to obtain single feature parameters such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature; feeding these into the feature parameter selection and processing module for selection processing, and splicing the final feature parameters together so that each speech segment of each speech signal forms one feature vector; ultimately forming a feature vector set and a classifier input file; and feeding this file into the classifier module of the sentiment analysis module, where the deep neural network speech emotion recognition model obtained in the training step predicts the emotion category of the speech signal and outputs the emotion recognition dimension result.
Advantageous effects of the invention: the speech emotion recognition system and recognition method proposed by the present invention identify the abnormal emotional features of a target speaker through mood and emotion recognition processing, help assess abnormal behavior and intent during a phone call, and effectively assist the early-warning detection of fraudulent calls. They overcome the technical defect of conventional solutions, in which basic intent understanding can only obtain the literal message and cannot mine the abnormal information carried by changes in mood and emotion. The invention adds a detection means to telephone fraud detection systems, enables multi-dimensional analysis of voice data, and improves the detection accuracy of such systems by 5%.
Detailed description of the invention
Fig. 1 is a structural block diagram of a speech emotion recognition system of the present invention.
Fig. 2 is a flowchart of a speech emotion recognition method of the present invention.
Fig. 3 is a structural block diagram of the emotional feature parameter extraction module of the present invention.
Fig. 4 is a flowchart of the deep neural network speech emotion recognition model training step of the present invention.
Fig. 5 is a flowchart of the deep neural network speech emotion model prediction step of the present invention.
Specific embodiment
The invention is further described below in conjunction with the accompanying drawings. The following embodiments are only intended to illustrate the technical solution of the present invention clearly and are not intended to limit its protection scope.
Fig. 1 shows a structural block diagram of a speech emotion recognition system of the present invention. The present invention provides a speech emotion recognition system comprising a speech preprocessing module, an emotional feature extraction module, and a sentiment analysis module. The input of the speech preprocessing module receives voice data; the output of the speech preprocessing module is connected to the input of the emotional feature extraction module; the output of the emotional feature extraction module is connected to the input of the sentiment analysis module; and the output of the sentiment analysis module delivers the analysis and recognition result. The speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts from the speech signal the acoustic parameters closely associated with emotion; these are finally fed to the sentiment analysis module, which completes the emotion judgment.
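To make the three-stage module chaining concrete, it can be sketched as below. This is a minimal illustrative sketch in Python, not the disclosed implementation; every function here is a hypothetical placeholder.

    import numpy as np

    def preprocess(voice_data: np.ndarray) -> np.ndarray:
        # Speech preprocessing module (placeholder): peak-normalize the raw voice data.
        return voice_data / (np.abs(voice_data).max() + 1e-9)

    def extract_emotional_features(signal: np.ndarray) -> np.ndarray:
        # Emotional feature extraction module (placeholder): two crude acoustic cues,
        # mean energy and an overall zero-crossing measure.
        zc = np.mean(np.abs(np.diff(np.sign(signal)))) / 2.0
        return np.array([np.mean(signal ** 2), zc])

    def analyze_emotion(features: np.ndarray) -> str:
        # Sentiment analysis module (placeholder): a threshold rule standing in
        # for the trained classifier described later.
        return "abnormal emotion" if features[0] > 0.1 else "neutral"

    voice_data = np.random.randn(16000)  # stand-in for one second of telephone audio
    result = analyze_emotion(extract_emotional_features(preprocess(voice_data)))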
In a preferred embodiment, the emotional feature extraction module includes a feature parameter extraction module and a feature parameter selection and processing module, the output of the feature parameter extraction module being connected to the input of the feature parameter selection and processing module.
Fig. 3 shows a structural block diagram of the emotional feature parameter extraction module of the present invention. In a preferred embodiment, the feature parameter extraction module includes, connected in sequence, a temporal feature extraction module, a fundamental frequency feature extraction module, a voiced/unvoiced sound judgment module, a speech rate extraction module, and a formant extraction module. The temporal feature extraction module extracts the short-time energy feature of the speech signal; the fundamental frequency feature extraction module extracts the fundamental frequency feature; the voiced/unvoiced sound judgment module extracts the zero-crossing rate feature; the speech rate extraction module extracts the speech rate feature; and the formant extraction module extracts the formant feature.
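Two of the frame-level parameters named above, short-time energy and zero-crossing rate, can be sketched as follows (fundamental frequency, speech rate, and formants are omitted). This is a minimal sketch, assuming 16 kHz audio with 25 ms frames and a 10 ms hop; none of these choices come from the original disclosure.

    import numpy as np

    def frame_signal(x: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
        # Split a 1-D speech signal into overlapping frames (25 ms / 10 ms at 16 kHz, assumed).
        n_frames = 1 + max(0, (len(x) - frame_len) // hop)
        return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

    def short_time_energy(frames: np.ndarray) -> np.ndarray:
        # Short-time energy: sum of squared samples in each frame.
        return np.sum(frames.astype(np.float64) ** 2, axis=1)

    def zero_crossing_rate(frames: np.ndarray) -> np.ndarray:
        # Zero-crossing rate per frame, the cue used by the voiced/unvoiced judgment.
        signs = np.sign(frames)
        signs[signs == 0] = 1
        return np.mean(np.abs(np.diff(signs, axis=1)) / 2.0, axis=1)

    x = np.random.randn(16000)  # stand-in for one second of speech
    frames = frame_signal(x)
    energy, zcr = short_time_energy(frames), zero_crossing_rate(frames)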
In a preferred embodiment, the feature parameter selection and processing module completes data conversion and transfer: it performs selection processing on the single feature parameters extracted by the feature parameter extraction module, such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature, and splices the final feature parameters together so that each speech segment of each speech signal forms one feature vector. The feature vectors ultimately form a feature vector set, from which a classifier training input file is formed for use in training or recognition by the sentiment analysis module.
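The splicing of the selected single parameters into one feature vector per speech segment might look like the sketch below; using the per-segment mean and standard deviation of each parameter track is an assumed aggregation, since the text does not fix one.

    import numpy as np

    def segment_feature_vector(energy, zcr, f0, speech_rate, formants) -> np.ndarray:
        # Each argument is a 1-D array of that parameter over the frames of one
        # speech segment (speech_rate may be a scalar). Mean and standard deviation
        # of every track are spliced into a single fixed-length vector (assumption).
        tracks = [np.atleast_1d(t).astype(np.float64)
                  for t in (energy, zcr, f0, speech_rate, np.ravel(formants))]
        return np.concatenate([[t.mean(), t.std()] for t in tracks]).astype(np.float32)

    # The feature vector set is then the stack of all segment vectors, one row per
    # speech segment, ready to be written out as the classifier training input file:
    # X = np.stack([segment_feature_vector(*seg) for seg in segments])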
In a preferred embodiment, the sentiment analysis module includes a classifier module which, on the basis of the feature parameters successfully extracted from a voice file, predicts the emotion category of the recording by means of machine learning.
In a preferred embodiment, the classifier module builds on a deep neural network combined with a contribution-analysis-based PCA algorithm, yielding a deep neural network speech emotion recognition model based on PCA contribution analysis. PCA contribution analysis extracts the principal components of the features that carry speech emotion and uses them as the deep neural network input for network training, which effectively reduces redundant parameters, improves training efficiency, and achieves emotion classification.
Fig. 2 shows a flowchart of a speech emotion recognition method of the present invention. The present invention also proposes a speech emotion recognition method, characterized by comprising the following steps: a deep neural network speech emotion recognition model training step; and a deep neural network speech emotion recognition model prediction step.
Fig. 4 shows a flowchart of the deep neural network speech emotion recognition model training step of the present invention. In a preferred embodiment, this training step specifically includes: feeding a labeled speech emotion database into the feature parameter extraction module, which processes it to obtain single feature parameters such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature; feeding these into the feature parameter selection and processing module for selection processing, and splicing the final feature parameters together so that each speech segment of each speech signal forms one feature vector; ultimately forming a feature vector set and a classifier training input file; and feeding this file into the classifier module of the sentiment analysis module for training, which yields the deep neural network speech emotion recognition model.
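A hedged sketch of this training step follows, using scikit-learn's PCA and MLPClassifier as stand-ins for the PCA contribution analysis and the deep neural network; the patent names no library, and the layer sizes, variance threshold, and placeholder data are assumptions.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # X: labeled feature vector set (one row per speech segment); y: emotion labels.
    X = np.random.randn(500, 20).astype(np.float32)  # placeholder for the real database
    y = np.random.randint(0, 4, size=500)            # e.g. four emotion categories

    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=0.95),  # keep the components explaining 95% of the variance
        MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
    )
    model.fit(X, y)  # yields the trained speech emotion recognition model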
Fig. 5 shows a flowchart of the deep neural network speech emotion model prediction step of the present invention. In a preferred embodiment, this prediction step specifically includes: feeding a speech emotion database of unknown category into the feature parameter extraction module, which processes it to obtain single feature parameters such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature; feeding these into the feature parameter selection and processing module for selection processing, and splicing the final feature parameters together so that each speech segment of each speech signal forms one feature vector; ultimately forming a feature vector set and a classifier input file; and feeding this file into the classifier module of the sentiment analysis module, where the deep neural network speech emotion recognition model obtained in the training step predicts the emotion category of the speech signal and outputs the emotion recognition dimension result.
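Continuing the training sketch above, prediction reuses the trained pipeline on feature vectors of unknown category; reading the per-category probabilities as one possible form of the "emotion recognition dimension result" is likewise an assumption.

    # X_new: feature vector set extracted from speech of unknown emotion category.
    X_new = np.random.randn(10, 20).astype(np.float32)

    predicted = model.predict(X_new)     # one emotion category per speech segment
    scores = model.predict_proba(X_new)  # per-category scores, one row per segment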
It should be noted that there are many categories of raw speech features available for emotion recognition, and at the initial stage it may be impossible to specify which speech feature parameters accurately reflect changes in human emotion, so as many emotion-characterizing feature parameters as possible are usually extracted for emotion recognition. Although these feature parameters reflect mood changes to varying degrees, some of them are correlated with each other and the information they carry overlaps to some extent; such overlapping parameters are redundant feature parameters. In addition, some feature parameters may have little or even no direct association with the emotions to be classified; these are useless parameters. Both redundant and useless feature parameters may increase the complexity of the whole system and even degrade the recognition efficiency of the classifier.
For these reasons, before emotion recognition classification of the signal is performed, the correlations among the extracted feature parameters need to be eliminated and the useless parameters removed. This requires choosing an appropriate method to select, from the many extracted parameters, the effective parameters with a significant degree of contribution, that is, to perform feature selection.
Relatively mature parameter selection methods currently used by many researchers in emotion recognition include linear discriminant analysis (LDA), principal component analysis (PCA), the fuzzy entropy method, suboptimal search methods, and linear regression models (Regression Model). Among these, PCA is at present the most common feature selection and dimensionality reduction method. Its objective is to lose as little important information as possible while linearly combining the many extracted original parameters into a small number of parameters. These transformed parameters are mutually uncorrelated and are called the principal components of the original feature parameters; they contain most of the information of the original parameters while having lower dimensionality. Based on this advantage over the original parameters, the PCA method is applied here to perform contribution analysis on the above features and achieve dimensionality reduction.
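One common way to read the "contribution" of each principal component is its explained variance ratio. The sketch below keeps only the leading components up to an assumed 95% cumulative threshold; the threshold is illustrative, not taken from the text.

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.randn(500, 20)             # placeholder feature vector set

    pca = PCA().fit(X)
    contrib = pca.explained_variance_ratio_  # contribution of each principal component
    k = int(np.searchsorted(np.cumsum(contrib), 0.95)) + 1
    X_reduced = PCA(n_components=k).fit_transform(X)  # uncorrelated, lower-dimensional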
The classifier module is the core module of the emotion recognition system: on the basis of the feature parameters successfully extracted from a voice file, it predicts the emotion category of the recording by means of machine learning. Before the module can predict successfully, it must first be trained. This scheme builds on a deep neural network combined with the contribution-analysis-based PCA algorithm, proposing a deep neural network speech emotion model based on PCA contribution analysis. PCA extracts the principal components of the features that carry speech emotion and uses them as the deep neural network input for network training, which effectively reduces redundant parameters, improves training efficiency, and achieves emotion classification.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the technical principles of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (9)

1. A speech emotion recognition system, characterized by comprising a speech preprocessing module, an emotional feature extraction module, and a sentiment analysis module, wherein the input of the speech preprocessing module receives voice data, the output of the speech preprocessing module is connected to the input of the emotional feature extraction module, the output of the emotional feature extraction module is connected to the input of the sentiment analysis module, and the output of the sentiment analysis module delivers the analysis and recognition result; the speech preprocessing module processes the voice data to obtain a speech signal and passes it to the emotional feature extraction module, which extracts from the speech signal the acoustic parameters closely associated with emotion, which are finally fed to the sentiment analysis module to complete the emotion judgment.
2. The speech emotion recognition system according to claim 1, characterized in that the emotional feature extraction module includes a feature parameter extraction module and a feature parameter selection and processing module, the output of the feature parameter extraction module being connected to the input of the feature parameter selection and processing module.
3. The speech emotion recognition system according to claim 2, characterized in that the feature parameter extraction module includes, connected in sequence, a temporal feature extraction module, a fundamental frequency feature extraction module, a voiced/unvoiced sound judgment module, a speech rate extraction module, and a formant extraction module, wherein the temporal feature extraction module extracts the short-time energy feature of the speech signal, the fundamental frequency feature extraction module extracts the fundamental frequency feature, the voiced/unvoiced sound judgment module extracts the zero-crossing rate feature, the speech rate extraction module extracts the speech rate feature, and the formant extraction module extracts the formant feature.
4. The speech emotion recognition system according to claim 2, characterized in that the feature parameter selection and processing module completes data conversion and transfer: it performs selection processing on the single feature parameters extracted by the feature parameter extraction module, such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature, and splices the final feature parameters together so that each speech segment of each speech signal forms one feature vector; the feature vectors ultimately form a feature vector set, from which a classifier training input file is formed for use in training or recognition by the sentiment analysis module.
5. The speech emotion recognition system according to claim 1, characterized in that the sentiment analysis module includes a classifier module which, on the basis of the feature parameters successfully extracted from a voice file, predicts the emotion category of the recording by means of machine learning.
6. The speech emotion recognition system according to claim 5, characterized in that the classifier module builds on a deep neural network combined with a contribution-analysis-based PCA algorithm, yielding a deep neural network speech emotion recognition model based on PCA contribution analysis; PCA contribution analysis extracts the principal components of the features that carry speech emotion and uses them as the deep neural network input for network training, which effectively reduces redundant parameters, improves training efficiency, and achieves emotion classification.
7. A recognition method based on the speech emotion recognition system according to claim 1, characterized by comprising the following steps: a deep neural network speech emotion recognition model training step; and a deep neural network speech emotion recognition model prediction step.
8. The speech emotion recognition method according to claim 7, characterized in that the deep neural network speech emotion recognition model training step specifically includes: feeding a labeled speech emotion database into the feature parameter extraction module to obtain single feature parameters such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature; feeding these into the feature parameter selection and processing module for selection processing, and splicing the final feature parameters together so that each speech segment of each speech signal forms one feature vector; ultimately forming a feature vector set and a classifier training input file; and feeding this file into the classifier module of the sentiment analysis module for training, which yields the deep neural network speech emotion recognition model.
9. The speech emotion recognition method according to claim 8, characterized in that the deep neural network speech emotion model prediction step specifically includes: feeding a speech emotion database of unknown category into the feature parameter extraction module to obtain single feature parameters such as the short-time energy feature, zero-crossing rate feature, fundamental frequency feature, speech rate feature, and formant feature; feeding these into the feature parameter selection and processing module for selection processing, and splicing the final feature parameters together so that each speech segment of each speech signal forms one feature vector; ultimately forming a feature vector set and a classifier input file; and feeding this file into the classifier module of the sentiment analysis module, where the deep neural network speech emotion recognition model obtained in the training step predicts the emotion category of the speech signal and outputs the emotion recognition dimension result.
CN201811263371.3A 2018-10-28 2018-10-28 Speech emotion recognition system and recognition method Pending CN109243492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811263371.3A CN109243492A (en) 2018-10-28 2018-10-28 Speech emotion recognition system and recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811263371.3A CN109243492A (en) 2018-10-28 2018-10-28 Speech emotion recognition system and recognition method

Publications (1)

Publication Number Publication Date
CN109243492A 2019-01-18

Family

ID=65078554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811263371.3A Pending CN109243492A (en) Speech emotion recognition system and recognition method

Country Status (1)

Country Link
CN (1) CN109243492A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349586A (en) * 2019-07-23 2019-10-18 北京邮电大学 Telecommunication fraud detection method and device
CN110379441A (en) * 2019-07-01 2019-10-25 特斯联(北京)科技有限公司 A kind of voice service method and system based on countering type smart network
WO2020211820A1 (en) * 2019-04-16 2020-10-22 华为技术有限公司 Method and device for speech emotion recognition
CN112735431A (en) * 2020-12-29 2021-04-30 三星电子(中国)研发中心 Model training method and device and artificial intelligence dialogue recognition method and device
CN112735479A (en) * 2021-03-31 2021-04-30 南方电网数字电网研究院有限公司 Speech emotion recognition method and device, computer equipment and storage medium
CN113314103A (en) * 2021-05-31 2021-08-27 中国工商银行股份有限公司 Illegal information identification method and device based on real-time speech emotion analysis
CN113314151A (en) * 2021-05-26 2021-08-27 中国工商银行股份有限公司 Voice information processing method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140257820A1 (en) * 2013-03-10 2014-09-11 Nice-Systems Ltd Method and apparatus for real time emotion detection in audio interactions
CN104200814A (en) * 2014-08-15 2014-12-10 浙江大学 Speech emotion recognition method based on semantic cells
CN107705807A (en) * 2017-08-24 2018-02-16 平安科技(深圳)有限公司 Voice quality detecting method, device, equipment and storage medium based on Emotion identification
CN108039181A (en) * 2017-11-02 2018-05-15 北京捷通华声科技股份有限公司 The emotion information analysis method and device of a kind of voice signal
CN108305639A (en) * 2018-05-11 2018-07-20 南京邮电大学 Speech-emotion recognition method, computer readable storage medium, terminal

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140257820A1 (en) * 2013-03-10 2014-09-11 Nice-Systems Ltd Method and apparatus for real time emotion detection in audio interactions
CN104200814A (en) * 2014-08-15 2014-12-10 浙江大学 Speech emotion recognition method based on semantic cells
CN107705807A (en) * 2017-08-24 2018-02-16 平安科技(深圳)有限公司 Voice quality detecting method, device, equipment and storage medium based on Emotion identification
CN108039181A (en) * 2017-11-02 2018-05-15 北京捷通华声科技股份有限公司 The emotion information analysis method and device of a kind of voice signal
CN108305639A (en) * 2018-05-11 2018-07-20 南京邮电大学 Speech-emotion recognition method, computer readable storage medium, terminal

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020211820A1 (en) * 2019-04-16 2020-10-22 华为技术有限公司 Method and device for speech emotion recognition
US11900959B2 (en) 2019-04-16 2024-02-13 Huawei Technologies Co., Ltd. Speech emotion recognition method and apparatus
CN110379441A (en) * 2019-07-01 2019-10-25 特斯联(北京)科技有限公司 A kind of voice service method and system based on countering type smart network
CN110349586A (en) * 2019-07-23 2019-10-18 北京邮电大学 Telecommunication fraud detection method and device
CN110349586B (en) * 2019-07-23 2022-05-13 北京邮电大学 Telecommunication fraud detection method and device
CN112735431A (en) * 2020-12-29 2021-04-30 三星电子(中国)研发中心 Model training method and device and artificial intelligence dialogue recognition method and device
CN112735431B (en) * 2020-12-29 2023-12-22 三星电子(中国)研发中心 Model training method and device and artificial intelligent dialogue recognition method and device
CN112735479A (en) * 2021-03-31 2021-04-30 南方电网数字电网研究院有限公司 Speech emotion recognition method and device, computer equipment and storage medium
CN112735479B (en) * 2021-03-31 2021-07-06 南方电网数字电网研究院有限公司 Speech emotion recognition method and device, computer equipment and storage medium
CN113314151A (en) * 2021-05-26 2021-08-27 中国工商银行股份有限公司 Voice information processing method and device, electronic equipment and storage medium
CN113314103A (en) * 2021-05-31 2021-08-27 中国工商银行股份有限公司 Illegal information identification method and device based on real-time speech emotion analysis
CN113314103B (en) * 2021-05-31 2023-03-03 中国工商银行股份有限公司 Illegal information identification method and device based on real-time speech emotion analysis

Similar Documents

Publication Publication Date Title
CN109243492A (en) Speech emotion recognition system and recognition method
CN102737629B (en) Embedded type speech emotion recognition method and device
CN103258535A (en) Identity recognition method and system based on voiceprint recognition
CN108269133A (en) A kind of combination human bioequivalence and the intelligent advertisement push method and terminal of speech recognition
CN102623009B (en) Abnormal emotion automatic detection and extraction method and system on basis of short-time analysis
CN109243446A (en) A kind of voice awakening method based on RNN network
CN107731233A (en) A kind of method for recognizing sound-groove based on RNN
CN101364408A (en) Sound image combined monitoring method and system
CN109256150A (en) Speech emotion recognition system and method based on machine learning
CN110211594B (en) Speaker identification method based on twin network model and KNN algorithm
CN101393660A (en) Intelligent gate inhibition system based on footstep recognition
CN109961794A (en) A kind of layering method for distinguishing speek person of model-based clustering
CN108985776A (en) Credit card security monitoring method based on multiple Information Authentication
CN112259104A (en) Training device of voiceprint recognition model
CN109493882A (en) A kind of fraudulent call voice automatic marking system and method
CN109887510A (en) Voiceprint recognition method and device based on empirical mode decomposition and MFCC
CN108804669A (en) A kind of fraudulent call method for detecting based on intention understanding technology
CN109448756A (en) A kind of voice age recognition methods and system
CN105679323B (en) A kind of number discovery method and system
CN112992155A (en) Far-field voice speaker recognition method and device based on residual error neural network
CN103778917A (en) System and method for detecting identity impersonation in telephone satisfaction survey
Jin et al. Speaker verification based on single channel speech separation
CN109817223A (en) Phoneme marking method and device based on audio fingerprints
CN108665901A (en) A kind of phoneme/syllable extracting method and device
Kalinli Tone and pitch accent classification using auditory attention cues

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190118