
CN102184732A - Fractal-feature-based intelligent wheelchair voice identification control method and system - Google Patents


Info

Publication number
CN102184732A
CN102184732A CN2011101091682A CN201110109168A CN102184732A CN 102184732 A CN102184732 A CN 102184732A CN 2011101091682 A CN2011101091682 A CN 2011101091682A CN 201110109168 A CN201110109168 A CN 201110109168A CN 102184732 A CN102184732 A CN 102184732A
Authority
CN
China
Prior art keywords
parameter
voice signal
mfcc
fractal
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011101091682A
Other languages
Chinese (zh)
Inventor
张毅
罗元
李敏
蔡军
谢颖
林海波
黄璜
李艳花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN2011101091682A
Publication of CN102184732A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a fractal-feature-based intelligent wheelchair voice identification control method, and relates to a voice identification method. The method comprises the following steps: first, inputting a voice signal; then, performing preprocessing and feature parameter extraction; and finally, performing matching judgment against the templates in a template library to obtain a command that controls an intelligent wheelchair. A mixed feature parameter, obtained by combining a fractal feature parameter of the voice signal with the conventional Mel frequency cepstral coefficient (MFCC) feature parameter, improves the identification rate of the system. It is used for voice identification in a voice control system, realizes precise control of the intelligent wheelchair, and achieves voice interaction between a user and the wheelchair; the mixed feature parameter extraction method is also suitable for other voice identification systems. In addition, the invention discloses a fractal-feature-based intelligent wheelchair voice identification control system. The system comprises a voice signal input module, a preprocessing module, a feature parameter extraction module, a matching module, a judgment module, a command conversion module and a control module.

Description

Fractal-feature-based intelligent wheelchair speech recognition control method and system
Technical field
The present invention relates to the field of computer speech recognition control, and in particular to a fractal-feature-based speech recognition method for controlling an intelligent wheelchair.
Background
As society develops and living standards rise, demand for assistive services grows steadily, particularly among the elderly and people with disabilities, who increasingly rely on modern technology to improve their quality of life and independence. Populations are aging faster both in China and worldwide, and every year thousands of people lose one or more abilities (such as walking or manual dexterity) through traffic accidents, natural and man-made disasters, and disease. This social reality has promoted the use of intelligent service robots for assisting the elderly and the disabled, and the intelligent wheelchair, an important member of this family of service robots, has gradually become a research focus at home and abroad. Wheelchairs serve as mobility aids for many elderly, frail, sick, and disabled people, but nearly all of them are joystick-controlled electric wheelchairs intended for outdoor use, which are inconvenient for elderly or disabled users with limited limb mobility. We therefore apply voice control to the intelligent wheelchair, forming a new mobility aid that combines the intelligent wheelchair with speech recognition technology. It retains all the functions of an ordinary wheelchair and, more importantly, can be controlled by voice commands, which makes control simpler, more convenient, and more comfortable. A practical voice-controlled intelligent wheelchair robot would thus open up a new way of life for the elderly and the disabled, and is of considerable practical significance.
Researchers at home and abroad have carried out a large number of related projects. The SIAMO project, funded in Spain in 1996 by the ONCE foundation, aimed to build a multifunctional system adapted to the user's degree of disability and specific needs. To meet this goal, the project paid particular attention to the modularity and flexibility of the system, designed a distributed architecture, and emphasized the human-machine interface, including speech recognition control that makes the wheelchair easier to operate. Researchers at an industrial design institute in Hokkaido, Japan, developed a voice-controlled wheelchair that needs no manual operation. They installed a speech-sensing chip in the control mechanism of the wheelchair; after the user speaks a request into a microphone, the system starts to operate accordingly and can move forward, backward, left, and right and change speed, and the backrest can also recline to let the user rest. The Institute of Automation of the Chinese Academy of Sciences undertook the intelligent wheelchair project of the "863" intelligent robot program and developed a robotic wheelchair, NLPR, that has vision and spoken-command navigation functions and can interact with people by voice. This work paid close attention to the design of the wheelchair's human-machine control interface and integrated the latest results of the pattern recognition laboratory in image processing, computer vision, and speech recognition, so that a person can steer the wheelchair freely by voice and the wheelchair can hold a simple human-machine dialogue. Shanghai Jiao Tong University developed a voice-controlled wheelchair designed mainly for disabled users who have completely lost the function of their limbs; the user only needs to say commands such as "open", "forward", "back", "left", "right", "fast", "slow", and "stop", and the wheelchair executes the command within 1.2 seconds. However, because the speech signal is a complex nonlinear process, the performance of speech recognition technology built on traditional linear system theory is difficult to improve further.
A method that controls an intelligent wheelchair with a speech recognition system of high recognition rate is therefore urgently needed.
Summary of the invention
In view of this, to solve the above problem, the present invention proposes a method for controlling an intelligent wheelchair with a speech recognition system of high recognition rate. It is a feature extraction method based on nonlinear theory: the fractal feature parameter of the voice signal is merged into the traditional Mel frequency cepstral coefficients (MFCC) to form a mixed feature parameter, which improves the recognition rate of the speech recognition system.
The first object of the invention is to provide a fractal-feature-based intelligent wheelchair speech recognition control method; the second object is to provide a fractal-feature-based intelligent wheelchair speech recognition control system.
The first object of the invention is achieved through the following technical solution:
The fractal-feature-based intelligent wheelchair speech recognition control method provided by the invention comprises the following steps:
S1: input a voice command word;
S2: preprocess the voice signal;
S3: extract feature parameters from the preprocessed voice signal;
S4: pattern-match the feature parameters against the templates in the template library;
S5: select the template with the highest matching similarity as the recognition result;
S6: convert the recognition result into a motion command for the intelligent wheelchair;
S7: call the corresponding control function to drive the intelligent wheelchair according to the voice signal.
Further, the preprocessing in step S2 comprises pre-emphasis filtering, windowing and framing, and double-threshold endpoint detection of the speech;
Further, the feature extraction in step S3 comprises the following steps:
S31: extract the MFCC (Mel frequency cepstral coefficient) parameters of the voice signal;
S311: first determine the number of points in each frame of the speech sample sequence, and apply pre-emphasis filtering to each frame sequence s(n);
S312: then apply the discrete FFT (fast Fourier transform) and take the squared modulus to obtain the discrete power spectrum S(n);
S313: place a number of band-pass filters within the spectral range of speech:
$H_m(n),\quad m = 0, 1, \dots, M-1,\quad n = 0, 1, \dots, N/2-1$
where M is the number of filters, usually 24, and N is the number of points in one frame of the voice signal;
S314: convert the discrete power spectrum S(n) to a power spectrum on the Mel frequency scale:
compute the power of S(n) after it passes through each of the M filters $H_m(n)$, that is, the sum of the products of S(n) and $H_m(n)$ over all discrete frequency points, which gives M parameters $P_m$, $m = 0, 1, \dots, M-1$;
S315: compute the natural logarithm of $P_m$ to obtain $L_m$, $m = 0, 1, \dots, M-1$;
S316: compute the discrete cosine transform of $L_0, L_1, \dots, L_{M-1}$ to obtain $D_m$, $m = 0, 1, \dots, M-1$;
S317: discard $D_0$, which represents the DC component, and take $D_1, D_2, \dots, D_K$ as the MFCC parameters;
S32: extract the dynamic features of the speech as feature parameters of a frame of the voice signal.
The dynamic features of the speech are described by difference cepstrum parameters, computed as:
$$d(n) = \frac{1}{\sum_{i=-k}^{k} i^2} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d each denote the parameters of one frame of speech and k is a constant, usually 2, so that the difference parameter is a linear combination of the parameters of the two frames before and the two frames after the current frame. The difference parameter given by this formula is the first-order MFCC difference parameter; in practice, the MFCC parameters and the MFCC difference parameters of each order are merged into one vector;
S33: extract the fractal dimension of the voice signal as the fractal feature;
S331: normalize the voice signal to the unit square to obtain the normalized signal x(t);
S332: divide the square into a grid of cells with side length s and compute log N(s) and log(1/s), where N(s) is the minimum number of grid cells of side s needed to cover x(t); vary s and compute the corresponding log N(s) and log(1/s);
S333: let $x_i = \log(1/s_i)$, $y_i = \log N(s_i)$, $i = 1, 2, \dots, M$, and fit the line y = kx + b to the points $(x_i, y_i)$ by least squares; the slope k is the box-counting dimension $D_B$, computed as:
$$D_B = \frac{\left(\sum_{i=1}^{M} y_i\right)\left(\sum_{i=1}^{M} x_i\right) - M \sum_{i=1}^{M} y_i x_i}{\left(\sum_{i=1}^{M} x_i\right)^2 - M \sum_{i=1}^{M} x_i^2}$$
The fractal character of the voice signal is quantified by the fractal dimension; the fractal feature value obtained in this way serves as a feature parameter of the voice signal;
S34: extract the mixed feature parameter:
merge the fractal dimension $D_B$ with the MFCC parameters and the first-order MFCC difference parameters to form the mixed feature parameter MFCC + ΔMFCC + D, where ΔMFCC is the first-order MFCC difference parameter and D is the fractal dimension;
Further, the template library in step S4 is formed by feature training: feature parameters are extracted from the preprocessed voice signal to obtain a feature parameter template for each voice command word, which is stored in the template library as the reference template of that command word;
Further, step S5 comprises the following steps:
S51: extract feature parameters from the voice signal to generate a test template;
S52: pattern-match the test template against the reference templates in the template library;
S53: select the reference template with the highest matching similarity as the recognition result;
Further, the feature training of the template library and the pattern matching both use the hidden Markov model (HMM) method;
The second object of the invention is achieved through the following technical solution:
The fractal-feature-based intelligent wheelchair speech recognition control system provided by the invention comprises:
a voice signal input module, for inputting a voice command word;
a voice signal preprocessing module, for preprocessing the voice signal, namely pre-emphasis filtering, windowing and framing, and double-threshold endpoint detection of the speech;
a voice signal feature parameter extraction module, for extracting feature parameters from the preprocessed voice signal;
a matching module, for pattern-matching the feature parameters against the templates in the template library;
a judgment module, for selecting the template with the highest matching similarity as the recognition result;
a command conversion module, for converting the recognition result into a motion command for the intelligent wheelchair;
a control module, for calling the corresponding control function to drive the intelligent wheelchair according to the voice signal.
Further, the voice signal feature parameter extraction module comprises an MFCC parameter extraction module, a dynamic feature extraction module, a fractal feature extraction module, and a mixed feature parameter extraction module;
The MFCC parameter extraction module carries out the following steps:
first determine the number of points in each frame of the speech sample sequence, and apply pre-emphasis filtering to each frame sequence s(n);
then apply the discrete FFT and take the squared modulus to obtain the discrete power spectrum S(n);
place a number of band-pass filters within the spectral range of speech:
$H_m(n),\quad m = 0, 1, \dots, M-1,\quad n = 0, 1, \dots, N/2-1$
where M is the number of filters, usually 24, and N is the number of points in one frame of the voice signal;
convert the discrete power spectrum S(n) to a power spectrum on the Mel frequency scale:
compute the power of S(n) after it passes through each of the M filters $H_m(n)$, that is, the sum of the products of S(n) and $H_m(n)$ over all discrete frequency points, which gives M parameters $P_m$, $m = 0, 1, \dots, M-1$;
compute the natural logarithm of $P_m$ to obtain $L_m$, $m = 0, 1, \dots, M-1$;
compute the discrete cosine transform of $L_0, L_1, \dots, L_{M-1}$ to obtain $D_m$, $m = 0, 1, \dots, M-1$;
discard $D_0$, which represents the DC component, and take $D_1, D_2, \dots, D_K$ as the MFCC parameters;
The dynamic feature extraction module computes the following formula:
$$d(n) = \frac{1}{\sum_{i=-k}^{k} i^2} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d each denote the parameters of one frame of speech and k is a constant, usually 2, so that the difference parameter is a linear combination of the parameters of the two frames before and the two frames after the current frame. The difference parameter given by this formula is the first-order MFCC difference parameter; in practice, the MFCC parameters and the MFCC difference parameters of each order are merged into one vector;
The fractal feature extraction module extracts the fractal dimension of the voice signal as the fractal feature by carrying out the following steps:
normalize the voice signal to the unit square to obtain the normalized signal x(t);
divide the square into a grid of cells with side length s and compute log N(s) and log(1/s), where N(s) is the minimum number of grid cells of side s needed to cover x(t); vary s and compute the corresponding log N(s) and log(1/s);
let $x_i = \log(1/s_i)$, $y_i = \log N(s_i)$, $i = 1, 2, \dots, M$, and fit the line y = kx + b to the points $(x_i, y_i)$ by least squares; the slope k is the box-counting dimension $D_B$, computed as:
$$D_B = \frac{\left(\sum_{i=1}^{M} y_i\right)\left(\sum_{i=1}^{M} x_i\right) - M \sum_{i=1}^{M} y_i x_i}{\left(\sum_{i=1}^{M} x_i\right)^2 - M \sum_{i=1}^{M} x_i^2}$$
The fractal character of the voice signal is quantified by the fractal dimension; the fractal feature value obtained in this way serves as a feature parameter of the voice signal;
The mixed feature parameter extraction module forms the mixed feature parameter: it merges the fractal dimension $D_B$ with the MFCC parameters and the first-order MFCC difference parameters to form the mixed feature parameter MFCC + ΔMFCC + D, where ΔMFCC is the first-order MFCC difference parameter and D is the fractal dimension;
Further, the template library in the matching module is formed by feature training: feature parameters are extracted from the preprocessed voice signal to obtain a feature parameter template for each voice command word, which is stored in the template library as the reference template of that command word;
Further, the system also comprises a speech input device, a signal processing device, a wireless communication device, and an intelligent wheelchair body. The speech input device passes the voice command signal to the signal processing device, which processes it to obtain a command for controlling the intelligent wheelchair body; the wireless communication device transmits this control command to the intelligent wheelchair body, which moves accordingly.
The advantages of the invention are as follows: the mixed feature parameter, obtained by combining the fractal feature parameter of the voice signal with the traditional MFCC feature parameter, improves the recognition rate of the system. Used for speech recognition in the intelligent wheelchair voice control system, it realizes accurate control of the intelligent wheelchair and achieves voice interaction between the user and the wheelchair. The mixed feature parameter extraction method is also applicable to other speech recognition systems.
Other advantages, objects, and features of the invention will be set forth in part in the description that follows, and in part will be apparent to those skilled in the art upon examination of the following, or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by the structure particularly pointed out in the following description, the claims, and the accompanying drawings.
Brief description of the drawings
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of the intelligent wheelchair speech recognition principle in an embodiment of the invention;
Fig. 2 is a flowchart of the MFCC parameter computation in an embodiment of the invention;
Fig. 3 is a flowchart of an embodiment of the invention;
Fig. 4 is a structural block diagram of an embodiment of the invention.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments only illustrate the invention and do not limit its scope of protection.
Fig. 3 is a flowchart of an embodiment of the invention. The fractal-feature-based intelligent wheelchair speech recognition control method provided by the invention comprises the following steps:
S1: input a voice command word;
S2: preprocess the voice signal;
S3: extract feature parameters from the preprocessed voice signal;
S4: pattern-match the feature parameters against the templates in the template library;
S5: select the template with the highest matching similarity as the recognition result;
S6: convert the recognition result into a motion command for the intelligent wheelchair;
S7: call the corresponding control function to drive the intelligent wheelchair according to the voice signal.
As a further improvement of the above embodiment, the preprocessing in step S2 comprises pre-emphasis filtering, windowing and framing, and double-threshold endpoint detection of the speech.
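The three preprocessing operations of step S2 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patent's implementation: the pre-emphasis coefficient 0.97, the Hamming window, the frame length and hop, and the threshold ratios are all assumed values, and the endpoint detector uses two thresholds on short-time energy only (classical double-threshold detectors often also use the zero-crossing rate).

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len=256, hop=128):
    """Split the signal into overlapping frames and apply a Hamming window to each."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def detect_endpoints(frames, high_ratio=0.25, low_ratio=0.05):
    """Simplified double-threshold endpoint detection on short-time energy.

    Frames above the high threshold seed the speech segment; the segment is
    then extended outwards while the energy stays above the low threshold.
    Returns (start_frame, end_frame), inclusive, or None if no speech is found.
    """
    energy = np.sum(frames ** 2, axis=1)
    high, low = high_ratio * energy.max(), low_ratio * energy.max()
    seeds = np.flatnonzero(energy > high)
    if seeds.size == 0:
        return None
    start, end = seeds[0], seeds[-1]
    while start > 0 and energy[start - 1] > low:
        start -= 1
    while end < len(energy) - 1 and energy[end + 1] > low:
        end += 1
    return int(start), int(end)
```

A typical call chain would be `frames = frame_signal(pre_emphasis(samples))` followed by `detect_endpoints(frames)`; only the frames between the returned endpoints would then be passed on to feature extraction.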
Fig. 1 is a block diagram of the intelligent wheelchair speech recognition principle in an embodiment of the invention, and Fig. 2 is a flowchart of the MFCC parameter computation. As shown in the figures, as a further improvement of the above embodiment, the feature extraction in step S3 comprises the following steps:
S31: extract the MFCC parameters of the voice signal;
S311: first determine the number of points in each frame of the speech sample sequence, and apply pre-emphasis filtering to each frame sequence s(n);
S312: then apply the discrete FFT and take the squared modulus to obtain the discrete power spectrum S(n);
S313: place a number of band-pass filters within the spectral range of speech:
$H_m(n),\quad m = 0, 1, \dots, M-1,\quad n = 0, 1, \dots, N/2-1$
where M is the number of filters, usually 24, and N is the number of points in one frame of the voice signal;
S314: convert the discrete power spectrum S(n) to a power spectrum on the Mel frequency scale:
compute the power of S(n) after it passes through each of the M filters $H_m(n)$, that is, the sum of the products of S(n) and $H_m(n)$ over all discrete frequency points, which gives M parameters $P_m$, $m = 0, 1, \dots, M-1$;
S315: compute the natural logarithm of $P_m$ to obtain $L_m$, $m = 0, 1, \dots, M-1$;
S316: compute the discrete cosine transform of $L_0, L_1, \dots, L_{M-1}$ to obtain $D_m$, $m = 0, 1, \dots, M-1$;
S317: discard $D_0$, which represents the DC component, and take $D_1, D_2, \dots, D_K$ as the MFCC parameters.
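Steps S312 to S317 can be illustrated for a single frame with the NumPy sketch below. Several details are assumptions rather than part of the source: the filters are taken to be triangular and evenly spaced on the Mel scale using the common conversion mel = 2595 log10(1 + f/700), the DCT is the unnormalized DCT-II, K = 12 coefficients are kept, and the frame is assumed to be already pre-emphasized and windowed.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters H_m(n), shape (M, N/2), evenly spaced on the Mel scale."""
    n_bins = n_fft // 2
    bin_freqs = np.arange(n_bins) * sample_rate / n_fft
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                                  n_filters + 2))
    H = np.zeros((n_filters, n_bins))
    for m in range(n_filters):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        H[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return H

def mfcc_frame(frame, sample_rate, n_filters=24, n_ceps=12):
    """MFCC of one pre-emphasized, windowed frame (steps S312 to S317)."""
    N = len(frame)
    S = np.abs(np.fft.rfft(frame)[: N // 2]) ** 2          # S312: power spectrum S(n)
    P = mel_filterbank(n_filters, N, sample_rate) @ S       # S313/S314: P_m
    L = np.log(P + 1e-12)                                   # S315: L_m (epsilon avoids log 0)
    k = np.arange(n_filters)[:, None]
    m = np.arange(n_filters)[None, :]
    D = (np.cos(np.pi * k * (m + 0.5) / n_filters) * L).sum(axis=1)  # S316: DCT-II
    return D[1 : n_ceps + 1]                                # S317: drop D_0, keep D_1..D_K
```

For example, `mfcc_frame(frame, 8000)` on a 256-point frame returns a vector of 12 coefficients; stacking these vectors over all frames gives the per-utterance MFCC matrix used by the later steps.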
S32: extract the dynamic features of the speech as feature parameters of a frame of the voice signal.
The dynamic features of the speech are described by difference cepstrum parameters, computed as:
$$d(n) = \frac{1}{\sum_{i=-k}^{k} i^2} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d each denote the parameters of one frame of speech and k is a constant, usually 2, so that the difference parameter is a linear combination of the parameters of the two frames before and the two frames after the current frame. The difference parameter given by this formula is the first-order MFCC difference parameter; in practice, the MFCC parameters and the MFCC difference parameters of each order are merged into one vector;
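The first-order difference parameters of step S32 can be sketched as below. The sketch assumes that the normalizing factor is the plain sum of i² over i = -k..k (some texts use its square root instead) and that frames beyond the ends of the utterance are filled by repeating the first and last frame; neither choice is stated in the source.

```python
import numpy as np

def delta(c, k=2):
    """First-order difference of per-frame parameters.

    c has shape (T, K): T frames of K cepstral coefficients.
    Implements d(n) = sum_i i * c(n + i) / sum_i i^2 for i = -k..k,
    with edge frames repeated as padding (an assumption).
    """
    c = np.asarray(c, dtype=float)
    T = len(c)
    padded = np.pad(c, ((k, k), (0, 0)), mode="edge")
    norm = sum(i * i for i in range(-k, k + 1))
    d = np.zeros_like(c)
    for i in range(-k, k + 1):
        d += i * padded[k + i : k + i + T]
    return d / norm
```

On a sequence that grows linearly from frame to frame, the interior values of `delta` equal the slope, which is a quick sanity check of the normalization. Applying `delta` to its own output would give second-order difference parameters.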
S33: extract the fractal dimension of the voice signal as the fractal feature;
S331: normalize the voice signal to the unit square to obtain the normalized signal x(t).
S332: divide the square into a grid of cells with side length s and compute log N(s) and log(1/s), where N(s) is the minimum number of grid cells of side s needed to cover x(t); vary s and compute the corresponding log N(s) and log(1/s);
S333: let $x_i = \log(1/s_i)$, $y_i = \log N(s_i)$, $i = 1, 2, \dots, M$, and fit the line y = kx + b to the points $(x_i, y_i)$ by least squares; the slope k is the box-counting dimension $D_B$, computed as:
$$D_B = \frac{\left(\sum_{i=1}^{M} y_i\right)\left(\sum_{i=1}^{M} x_i\right) - M \sum_{i=1}^{M} y_i x_i}{\left(\sum_{i=1}^{M} x_i\right)^2 - M \sum_{i=1}^{M} x_i^2}$$
The fractal character of the voice signal is quantified by the fractal dimension; the fractal feature value obtained in this way serves as a feature parameter of the voice signal;
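A minimal sketch of steps S331 to S333 follows. The set of grid sizes and the way N(s) is approximated are assumptions: within each grid column the curve is taken to cover every cell between its lowest and highest sample, which approximates the covering of a continuous waveform. The slope is computed with the closed-form least-squares expression given above.

```python
import numpy as np

def box_counting_dimension(signal, scales=(1/4, 1/8, 1/16, 1/32, 1/64)):
    """Estimate the box-counting dimension D_B of a sampled waveform."""
    x = np.asarray(signal, dtype=float)
    # S331: normalize time and amplitude into the unit square.
    t = np.linspace(0.0, 1.0, len(x))
    span = x.max() - x.min()
    y = (x - x.min()) / span if span > 0 else np.zeros_like(x)

    xs, ys = [], []
    for s in scales:
        # S332: count the grid cells of side s that the curve passes through.
        n_cells = int(round(1.0 / s))
        col = np.minimum((t / s).astype(int), n_cells - 1)
        row = np.minimum((y / s).astype(int), n_cells - 1)
        count = 0
        for c in np.unique(col):
            rows = row[col == c]
            count += rows.max() - rows.min() + 1
        xs.append(np.log(1.0 / s))
        ys.append(np.log(count))

    # S333: least-squares slope of log N(s) against log(1/s).
    X, Y, M = np.array(xs), np.array(ys), len(xs)
    num = Y.sum() * X.sum() - M * (X * Y).sum()
    den = X.sum() ** 2 - M * (X ** 2).sum()
    return num / den
```

As a plausibility check, a straight ramp yields a dimension of 1, while an irregular signal such as white noise yields a value closer to 2. The signal should contain clearly more samples than the finest grid has columns, otherwise the counts at small s are unreliable.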
S34: extract the mixed feature parameter:
merge the fractal dimension $D_B$ with the MFCC parameters and the first-order MFCC difference parameters to form the mixed feature parameter MFCC + ΔMFCC + D, where ΔMFCC is the first-order MFCC difference parameter and D is the fractal dimension. Fractal dimension has many definitions, such as the similarity dimension, Hausdorff dimension, information dimension, correlation dimension, capacity dimension, and box-counting dimension. Of these, the Hausdorff dimension is the oldest and the most important; it is defined for any set, and is given by:
$$D = \lim_{\delta \to 0} \frac{\ln M_\delta(F)}{\ln \delta^{-1}}$$
where $M_\delta(F)$ is the number of elements of size δ needed to cover the subset F.
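The layout of the mixed feature parameter from step S34 can be illustrated as a simple concatenation. The sketch assumes one fractal dimension value per frame, so that each frame yields a vector of 2K + 1 values; the source does not state whether D is computed per frame or per utterance, and the function name is hypothetical.

```python
import numpy as np

def mixed_features(mfcc, delta_mfcc, fractal_dims):
    """Concatenate MFCC (T, K), delta-MFCC (T, K) and D (T,) into (T, 2K + 1)."""
    mfcc = np.asarray(mfcc, dtype=float)
    delta_mfcc = np.asarray(delta_mfcc, dtype=float)
    fractal_dims = np.asarray(fractal_dims, dtype=float)
    if not (len(mfcc) == len(delta_mfcc) == len(fractal_dims)):
        raise ValueError("all inputs must cover the same number of frames")
    return np.hstack([mfcc, delta_mfcc, fractal_dims[:, None]])
```

With K = 12 cepstral coefficients this produces 25-dimensional frame vectors, which would then serve as the observation sequence for template training and matching.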
As a further improvement of the above embodiment, the template library in step S4 is formed by feature training: feature parameters are extracted from the preprocessed voice signal to obtain a feature parameter template for each voice command word, which is stored in the template library as the reference template of that command word.
As a further improvement of the above embodiment, step S5 comprises the following steps:
S51: extract feature parameters from the voice signal to generate a test template;
S52: pattern-match the test template against the reference templates in the template library;
S53: select the reference template with the highest matching similarity as the recognition result.
As a further improvement of the above embodiment, the feature training of the template library and the pattern matching both use the hidden Markov model (HMM) method.
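The matching of steps S51 to S53 with hidden Markov models can be sketched as scoring the test feature sequence against one model per command word and choosing the best score. The sketch below is an illustration only: it assumes diagonal-Gaussian state emissions and models stored as plain dictionaries with hypothetical keys `pi`, `A`, `means`, and `variances`, and it omits training (for example Baum-Welch re-estimation), which the source does not detail.

```python
import numpy as np

def log_gauss(obs, means, variances):
    """Log density of every frame under every state's diagonal Gaussian: shape (T, S)."""
    diff = obs[:, None, :] - means[None, :, :]
    return -0.5 * (np.log(2.0 * np.pi * variances)[None] + diff ** 2 / variances[None]).sum(axis=2)

def log_likelihood(obs, model):
    """Forward algorithm in the log domain: log P(obs | model)."""
    log_b = log_gauss(obs, model["means"], model["variances"])
    with np.errstate(divide="ignore"):          # log(0) = -inf for forbidden transitions
        log_pi, log_A = np.log(model["pi"]), np.log(model["A"])
    alpha = log_pi + log_b[0]
    for t in range(1, len(obs)):
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

def recognize(obs, models):
    """Return the command word whose model gives the highest likelihood, plus all scores."""
    scores = {word: log_likelihood(obs, m) for word, m in models.items()}
    return max(scores, key=scores.get), scores
```

Here "matching similarity" is interpreted as the log-likelihood of the test sequence under each reference model. A practical system would probably also reject utterances whose best score falls below some threshold, so that noise is not mapped to a motion command.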
Fig. 4 is a structural block diagram of an embodiment of the invention. As shown in the figure, the fractal-feature-based intelligent wheelchair speech recognition control system provided by the invention comprises:
a voice signal input module 41, for inputting a voice command word;
a voice signal preprocessing module 42, for preprocessing the voice signal, namely pre-emphasis filtering, windowing and framing, and double-threshold endpoint detection of the speech;
a voice signal feature parameter extraction module 43, for extracting feature parameters from the preprocessed voice signal;
a matching module 44, for pattern-matching the feature parameters against the templates in the template library;
a judgment module 45, for selecting the template with the highest matching similarity as the recognition result;
a command conversion module 46, for converting the recognition result into a motion command for the intelligent wheelchair;
a control module 48, for calling the corresponding control function to drive the intelligent wheelchair according to the voice signal.
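The role of the command conversion module can be illustrated with a lookup table from recognized words to motion commands. Everything in this sketch is hypothetical: the command vocabulary, the `MotionCommand` fields, the velocity values, and the fail-safe fallback to "stop" are assumptions chosen for illustration and are not specified in the source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MotionCommand:
    linear: float   # forward speed in m/s (negative = backward); assumed unit
    angular: float  # turn rate in rad/s (positive = left); assumed unit

# Hypothetical vocabulary and speeds, for illustration only.
COMMAND_TABLE = {
    "forward": MotionCommand(0.5, 0.0),
    "back": MotionCommand(-0.3, 0.0),
    "left": MotionCommand(0.0, 0.5),
    "right": MotionCommand(0.0, -0.5),
    "stop": MotionCommand(0.0, 0.0),
}

def to_motion_command(word):
    """Map a recognition result to a motion command; unknown words stop the chair."""
    return COMMAND_TABLE.get(word, COMMAND_TABLE["stop"])
```

Falling back to "stop" for unrecognized words is a conservative design choice for a mobility device; the control module would then pass the resulting command to whatever drive function the wheelchair exposes.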
As a further improvement of the above embodiment, the voice signal feature parameter extraction module 43 comprises an MFCC parameter extraction module, a dynamic feature extraction module, a fractal feature extraction module, and a mixed feature parameter extraction module;
The MFCC parameter extraction module carries out the following steps:
first determine the number of points in each frame of the speech sample sequence, and apply pre-emphasis filtering to each frame sequence s(n);
then apply the discrete FFT and take the squared modulus to obtain the discrete power spectrum S(n);
place a number of band-pass filters within the spectral range of speech:
$H_m(n),\quad m = 0, 1, \dots, M-1,\quad n = 0, 1, \dots, N/2-1$
where M is the number of filters, usually 24, and N is the number of points in one frame of the voice signal;
convert the discrete power spectrum S(n) to a power spectrum on the Mel frequency scale:
compute the power of S(n) after it passes through each of the M filters $H_m(n)$, that is, the sum of the products of S(n) and $H_m(n)$ over all discrete frequency points, which gives M parameters $P_m$, $m = 0, 1, \dots, M-1$;
compute the natural logarithm of $P_m$ to obtain $L_m$, $m = 0, 1, \dots, M-1$;
compute the discrete cosine transform of $L_0, L_1, \dots, L_{M-1}$ to obtain $D_m$, $m = 0, 1, \dots, M-1$;
discard $D_0$, which represents the DC component, and take $D_1, D_2, \dots, D_K$ as the MFCC parameters.
The dynamic feature extraction module computes the following formula:
$$d(n) = \frac{1}{\sum_{i=-k}^{k} i^2} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d each denote the parameters of one frame of speech and k is a constant, usually 2, so that the difference parameter is a linear combination of the parameters of the two frames before and the two frames after the current frame. The difference parameter given by this formula is the first-order MFCC difference parameter; in practice, the MFCC parameters and the MFCC difference parameters of each order are merged into one vector;
The fractal feature extraction module extracts the fractal dimension of the voice signal as the fractal feature by carrying out the following steps:
normalize the voice signal to the unit square to obtain the normalized signal x(t).
divide the square into a grid of cells with side length s and compute log N(s) and log(1/s), where N(s) is the minimum number of grid cells of side s needed to cover x(t); vary s and compute the corresponding log N(s) and log(1/s);
let $x_i = \log(1/s_i)$, $y_i = \log N(s_i)$, $i = 1, 2, \dots, M$, and fit the line y = kx + b to the points $(x_i, y_i)$ by least squares; the slope k is the box-counting dimension $D_B$, computed as:
$$D_B = \frac{\left(\sum_{i=1}^{M} y_i\right)\left(\sum_{i=1}^{M} x_i\right) - M \sum_{i=1}^{M} y_i x_i}{\left(\sum_{i=1}^{M} x_i\right)^2 - M \sum_{i=1}^{M} x_i^2}$$
The fractal character of the voice signal is quantified by the fractal dimension; the fractal feature value obtained in this way serves as a feature parameter of the voice signal;
The mixed feature parameter extraction module forms the mixed feature parameter: it merges the fractal dimension $D_B$ with the MFCC parameters and the first-order MFCC difference parameters to form the mixed feature parameter MFCC + ΔMFCC + D, where ΔMFCC is the first-order MFCC difference parameter and D is the fractal dimension. Fractal dimension has many definitions, such as the similarity dimension, Hausdorff dimension, information dimension, correlation dimension, capacity dimension, and box-counting dimension. Of these, the Hausdorff dimension is the oldest and the most important; it is defined for any set, and is given by:
$$D = \lim_{\delta \to 0} \frac{\ln M_\delta(F)}{\ln \delta^{-1}}$$
where $M_\delta(F)$ is the number of elements of size δ needed to cover the subset F.
As a further improvement of the above embodiment, the template library in the matching module is formed by feature training: feature parameters are extracted from the preprocessed voice signal to obtain a feature parameter template for each voice command word, which is stored in the template library as the reference template of that command word.
As a further improvement of the above embodiment, the system also comprises a speech input device, a signal processing device, a wireless communication device 47, and an intelligent wheelchair body 49. The speech input device passes the voice command signal to the signal processing device, which processes it to obtain a command for controlling the intelligent wheelchair body; the wireless communication device transmits this control command to the intelligent wheelchair body, which moves accordingly. In this embodiment, the speech input device is a microphone, which serves as the input of the whole voice control system; the signal processing device is a notebook computer acting as the host computer of the voice control system and processing the voice signal; the wireless communication device is a router, which handles communication between the host computer and the lower-level controller; and the intelligent wheelchair body is the lower-level controller of the voice control system, which carries out the corresponding control actions.
The above are only preferred embodiments of the invention and do not limit it. Those skilled in the art can obviously make various changes and modifications to the invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to cover them as well.

Claims (10)

1. An intelligent wheelchair speech recognition control method based on fractal features, characterized by comprising the following steps:
S1: inputting a voice signal command word;
S2: preprocessing the voice signal;
S3: extracting feature parameters from the preprocessed voice signal;
S4: performing pattern matching between the feature parameters and the templates in the template library;
S5: selecting the template with the highest matching similarity as the recognition result;
S6: converting the recognition result into a motion command for the intelligent wheelchair;
S7: calling the corresponding control function to drive the intelligent wheelchair to move according to the voice signal.
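Steps S4 to S6 can be sketched as below. Everything specific here is hypothetical: the command words, the velocity set-points, and the use of negative Euclidean distance as the similarity measure are stand-ins chosen for illustration, not values or methods taken from this document.

```python
import math

# Hypothetical command words mapped to (linear m/s, angular rad/s) set-points.
MOTION_COMMANDS = {
    "forward": (0.3, 0.0),
    "backward": (-0.3, 0.0),
    "left": (0.0, 0.5),
    "right": (0.0, -0.5),
    "stop": (0.0, 0.0),
}

def similarity(test_template, reference_template):
    """Stand-in similarity: larger is better (negative Euclidean distance)."""
    return -math.dist(test_template, reference_template)

def recognise(test_template, template_library):
    """S4/S5: match against every reference template and keep the most similar word."""
    return max(template_library,
               key=lambda word: similarity(test_template, template_library[word]))

def to_motion_command(word):
    """S6: convert the recognised command word into a motion command."""
    return MOTION_COMMANDS[word]
```

A call such as `to_motion_command(recognise(features, library))` then yields the set-point that the control function in S7 would act on.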
2. The fractal-feature-based intelligent wheelchair speech recognition control method according to claim 1, characterized in that the preprocessing in step S2 comprises pre-emphasis filtering of the speech, windowing and framing, and double-threshold endpoint detection.
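The three preprocessing stages can be sketched as follows. The pre-emphasis coefficient, frame length, hop size, Hamming window and threshold ratios are assumptions, and the endpoint detector uses only short-time energy with a high and a low threshold; a zero-crossing-rate stage, which double-threshold detectors often add, is omitted here.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - alpha * x[:-1])

def frame_signal(x, frame_len=256, hop=128):
    """Split the signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def endpoint_detect(frames, high_ratio=0.25, low_ratio=0.05):
    """Return (first_frame, last_frame) of the speech segment, or None if silent."""
    energy = np.sum(frames ** 2, axis=1)
    high, low = high_ratio * energy.max(), low_ratio * energy.max()
    above = np.flatnonzero(energy > high)
    if above.size == 0:
        return None
    start, end = above[0], above[-1]
    # Extend outwards from the confident region while energy stays above the low threshold.
    while start > 0 and energy[start - 1] > low:
        start -= 1
    while end < len(energy) - 1 and energy[end + 1] > low:
        end += 1
    return int(start), int(end)
```

Only the frames between the two returned indices would be passed on to feature extraction.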
3. The fractal-feature-based intelligent wheelchair speech recognition control method according to claim 1, characterized in that the feature extraction in step S3 comprises the following steps:
S31: extracting the Mel frequency cepstral coefficient (MFCC) parameters of the voice signal;
S311: first determining the number of points in each frame's speech sample sequence, and applying pre-emphasis filtering to each frame sequence s(n);
S312: then applying the discrete Fourier transform (computed with the FFT) and taking the squared magnitude to obtain the discrete power spectrum S(n);
S313: setting several band-pass filters within the spectral range of speech; speech within a given spectral range passes through the corresponding band-pass filter, while the speech spectrum outside that range is attenuated to a very low value. The band-pass filters are as follows:
$H_m(n),\; m = 0, 1, \ldots, M-1,\; n = 0, 1, \ldots, N/2-1$
where $H_m(n)$ is the transfer function of each band-pass filter, M is the number of filters, usually 24, and N is the number of points in one frame of the voice signal;
S314: converting the discrete power spectrum into the power spectrum under the Mel frequency;
the power obtained when S(n) passes through each of the M filters $H_m(n)$ is calculated, that is, the sum of the products of S(n) and $H_m(n)$ over the discrete frequency points, giving M parameters $P_m$, $m = 0, 1, \ldots, M-1$, where $P_m$ is the centre frequency;
S315: calculating the natural logarithm of $P_m$ to obtain $L_m$, $m = 0, 1, \ldots, M-1$, where $L_m$ is the log spectrum;
S316: calculating the discrete cosine transform of $L_0, L_1, \ldots, L_{M-1}$ to obtain $D_m$, $m = 0, 1, \ldots, M-1$;
S317: discarding $D_0$, which represents the DC component, and taking $D_1, D_2, \ldots, D_K$ as the MFCC parameters;
S32: extracting the dynamic features of the speech as feature parameters of a frame of the voice signal;
the dynamic features of the speech are described by differential cepstrum parameters, computed as:
$$d(n) = \frac{1}{\sum_{i=-k}^{k} i^2} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d each denote the parameters of one frame of speech and k is a constant, usually 2, so that the differential parameter is a linear combination of the parameters of the two frames before and the two frames after the current frame; the differential parameter computed by this formula is the first-order MFCC differential parameter, and in practice the MFCC parameters and the MFCC differential parameters of each order are merged into one vector;
S33: extracting the fractal dimension of the voice signal as the fractal feature;
S331: normalizing the voice signal to the unit square region to obtain the normalized signal x(t);
S332: dividing the square region into a grid of cells with side length s, and calculating log N(s) and log(1/s), where N(s) denotes the minimum number of grid squares of side length s needed to cover x(t); varying the size of s and calculating the corresponding log N(s) and log(1/s);
S333: letting $x_i = \log(1/s_i)$, $y_i = \log N(s_i)$, $i = 1, 2, \ldots, M$, and fitting a straight line $y = kx + b$ to the points $(x_i, y_i)$ by least squares, the slope k being the box-counting dimension $D_B$, which is computed as:
$$D_B = \frac{\left(\sum_{i=1}^{M} y_i\right)\left(\sum_{i=1}^{M} x_i\right) - M\left(\sum_{i=1}^{M} y_i x_i\right)}{\left(\sum_{i=1}^{M} x_i\right)^2 - M\left(\sum_{i=1}^{M} x_i^2\right)},$$
the fractal characteristic of the voice signal is quantified by the fractal dimension; the fractal feature value obtained in this way is used as a feature parameter of the voice signal;
S34: extracting the mixed feature parameter,
the fractal dimension $D_B$ being merged with the MFCC parameters and the first-order MFCC differential parameters to form the mixed feature parameter MFCC+ΔMFCC+D;
where ΔMFCC is the first-order MFCC differential parameter and D is the fractal dimension.
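The MFCC steps S312 to S317 can be sketched for a single frame as follows. This is an illustration under stated assumptions rather than the claimed implementation: the triangular filter shape, the common 2595·log10(1 + f/700) Mel mapping, the 8 kHz sampling rate and K = 12 are not given in the text, and the frame is assumed to be already pre-emphasised and windowed with an even number of points.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular band-pass filters H_m(n), m = 0..M-1, n = 0..N/2-1, spaced on the Mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor(mel_to_hz(mel_pts) / fs * n_fft).astype(int)
    H = np.zeros((n_filters, n_fft // 2))
    for m in range(n_filters):
        left, centre, right = bins[m], bins[m + 1], bins[m + 2]
        for n in range(left, centre):                      # rising edge
            H[m, n] = (n - left) / (centre - left)
        for n in range(centre, min(right, n_fft // 2)):    # falling edge
            H[m, n] = (right - n) / (right - centre)
    return H

def mfcc(frame, fs=8000, n_filters=24, n_coeffs=12):
    """Return D_1..D_K for one frame."""
    n_fft = len(frame)
    S = np.abs(np.fft.rfft(frame))[: n_fft // 2] ** 2      # S312: discrete power spectrum S(n)
    P = mel_filterbank(n_filters, n_fft, fs) @ S           # S314: P_m = sum_n S(n) H_m(n)
    L = np.log(P + 1e-12)                                  # S315: natural log (small offset avoids log 0)
    m = np.arange(n_filters)
    k = np.arange(n_filters)[:, None]
    D = (np.cos(np.pi * k * (2 * m + 1) / (2 * n_filters)) * L).sum(axis=1)  # S316: DCT-II
    return D[1 : n_coeffs + 1]                             # S317: drop D_0, keep D_1..D_K
```

With M = 24 filters and a 256-point frame, the filter bank is a 24 × 128 matrix and each frame yields 12 coefficients.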
4. The fractal-feature-based intelligent wheelchair speech recognition control method according to claim 1, characterized in that the template library in step S4 is formed by feature training: feature parameters are extracted from the voice signal after preprocessing to obtain a feature parameter template for each command word, and this template is stored in the template library as the reference template of that command word.
5. The fractal-feature-based intelligent wheelchair speech recognition control method according to claim 1, characterized in that step S5 comprises the following steps:
S51: extracting feature parameters from the voice signal to generate a test template;
S52: performing pattern matching between the test template and the reference templates in the template library;
S53: selecting the reference template with the highest matching similarity as the recognition result.
6. The fractal-feature-based intelligent wheelchair speech recognition control method according to claim 1, characterized in that the feature training and the pattern matching of the template library use the hidden Markov model (HMM) method.
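How an HMM can serve as the matching step may be illustrated with the forward algorithm. The sketch below assumes discrete observation symbols (for example, vector-quantised feature vectors) and already-trained models; the text does not specify the HMM type or the training procedure, so both are assumptions, and the function names are hypothetical.

```python
import numpy as np

def log_forward(log_pi, log_A, log_B, obs):
    """Log-likelihood log P(obs | model) of a discrete-observation HMM.

    log_pi: (n_states,) log initial probabilities.
    log_A:  (n_states, n_states) log transition probabilities, row = from-state.
    log_B:  (n_states, n_symbols) log emission probabilities.
    obs:    sequence of integer symbol indices.
    """
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Sum over previous states in the log domain, then emit symbol o.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def classify(models, obs):
    """Pick the command word whose model gives the highest likelihood."""
    return max(models, key=lambda word: log_forward(*models[word], obs))
```

Here `models` maps each command word to its `(log_pi, log_A, log_B)` triple, so the highest-likelihood model plays the role of the template with the highest matching similarity.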
7. An intelligent wheelchair speech recognition control system based on fractal features, characterized by comprising:
a voice signal input module, used to input a voice signal command word;
a voice signal preprocessing module, used to preprocess the voice signal, the preprocessing comprising pre-emphasis filtering of the speech, windowing and framing, and double-threshold endpoint detection;
a voice signal feature parameter extraction module, used to extract feature parameters from the preprocessed voice signal;
a matching module, used to perform pattern matching between the feature parameters and the templates in the template library;
a judgment module, used to select the template with the highest matching similarity as the recognition result;
a command conversion module, used to convert the recognition result into a motion command for the intelligent wheelchair;
a control module, used to call the corresponding control function to drive the intelligent wheelchair to move according to the voice signal.
8. The fractal-feature-based intelligent wheelchair speech recognition control system according to claim 7, characterized in that the voice signal feature parameter extraction module comprises an MFCC parameter extraction module, a dynamic feature extraction module, a fractal feature extraction module and a mixed feature parameter extraction module;
the MFCC parameter extraction module is used to carry out the following steps:
first determining the number of points in each frame's speech sample sequence, and applying pre-emphasis filtering to each frame sequence s(n);
then applying the discrete FFT and taking the squared magnitude to obtain the discrete power spectrum S(n);
setting several band-pass filters within the spectral range of speech;
$H_m(n),\; m = 0, 1, \ldots, M-1,\; n = 0, 1, \ldots, N/2-1$
where M is the number of filters, usually 24, and N is the number of points in one frame of the voice signal;
converting the discrete power spectrum into the power spectrum under the Mel frequency;
calculating the power obtained when S(n) passes through each of the M filters $H_m(n)$, that is, the sum of the products of S(n) and $H_m(n)$ over the discrete frequency points, giving M parameters $P_m$, $m = 0, 1, \ldots, M-1$;
calculating the natural logarithm of $P_m$ to obtain $L_m$, $m = 0, 1, \ldots, M-1$;
calculating the discrete cosine transform of $L_0, L_1, \ldots, L_{M-1}$ to obtain $D_m$, $m = 0, 1, \ldots, M-1$;
discarding $D_0$, which represents the DC component, and taking $D_1, D_2, \ldots, D_K$ as the MFCC parameters;
the dynamic feature extraction module is used to calculate with the following formula:
$$d(n) = \frac{1}{\sum_{i=-k}^{k} i^2} \sum_{i=-k}^{k} i \cdot c(n+i)$$
where c and d each denote the parameters of one frame of speech and k is a constant, usually 2, so that the differential parameter is a linear combination of the parameters of the two frames before and the two frames after the current frame; the differential parameter computed by this formula is the first-order MFCC differential parameter, and in practice the MFCC parameters and the MFCC differential parameters of each order are merged into one vector;
the fractal feature extraction module is used to extract the fractal dimension of the voice signal as the fractal feature, carrying out the following steps:
normalizing the voice signal to the unit square region to obtain the normalized signal x(t);
dividing the square region into a grid of cells with side length s, and calculating log N(s) and log(1/s), where N(s) denotes the minimum number of grid squares of side length s needed to cover x(t); varying the size of s and calculating the corresponding log N(s) and log(1/s);
letting $x_i = \log(1/s_i)$, $y_i = \log N(s_i)$, $i = 1, 2, \ldots, M$, and fitting a straight line $y = kx + b$ to the points $(x_i, y_i)$ by least squares, the slope k being the box-counting dimension $D_B$, which is computed as:
$$D_B = \frac{\left(\sum_{i=1}^{M} y_i\right)\left(\sum_{i=1}^{M} x_i\right) - M\left(\sum_{i=1}^{M} y_i x_i\right)}{\left(\sum_{i=1}^{M} x_i\right)^2 - M\left(\sum_{i=1}^{M} x_i^2\right)},$$
the fractal characteristic of the voice signal is quantified by the fractal dimension; the fractal feature value obtained in this way is used as a feature parameter of the voice signal;
the mixed feature parameter extraction module is used to form the mixed feature parameter, the fractal dimension $D_B$ being merged with the MFCC parameters and the first-order MFCC differential parameters to form the mixed feature parameter MFCC+ΔMFCC+D;
where ΔMFCC is the first-order MFCC differential parameter and D is the fractal dimension.
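The first-order differential (ΔMFCC) computation of the dynamic feature extraction module can be sketched as below. Two points are assumptions: the normalisation factor is taken to be $1/\sum_{i=-k}^{k} i^2$, and frames beyond the start and end of the utterance are filled by repeating the edge frames, which the text does not specify.

```python
import numpy as np

def delta(c, k=2):
    """First-order differential parameters d(n) from per-frame parameters c(n).

    c: array of shape (n_frames, n_coeffs); k: number of frames used on each side.
    """
    c = np.asarray(c, dtype=float)
    n = c.shape[0]
    # Repeat the first and last frames so that c(n + i) exists at the boundaries.
    padded = np.pad(c, ((k, k), (0, 0)), mode="edge")
    norm = sum(i * i for i in range(-k, k + 1))
    d = np.zeros_like(c)
    for i in range(-k, k + 1):
        # padded[k + i + n0] corresponds to c(n0 + i).
        d += i * padded[k + i : k + i + n]
    return d / norm
```

The MFCC and ΔMFCC parts of the feature vector are then obtained with `np.hstack([c, delta(c)])`.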
9. The fractal-feature-based intelligent wheelchair speech recognition control system according to claim 7, characterized in that:
the template library in the matching module is formed by feature training: feature parameters are extracted from the voice signal after preprocessing to obtain a feature parameter template for each command word, and this template is stored in the template library as the reference template of that command word.
10. The fractal-feature-based intelligent wheelchair speech recognition control system according to claim 7, characterized by also comprising a speech input device, a signal processing device, a wireless communication device and an intelligent wheelchair body, wherein the voice command signal is transmitted by the speech input device to the signal processing device, which processes the signal to obtain a command for controlling the intelligent wheelchair body, and this control command is transmitted through the wireless communication device to the intelligent wheelchair body, which moves accordingly.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011101091682A CN102184732A (en) 2011-04-28 2011-04-28 Fractal-feature-based intelligent wheelchair voice identification control method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011101091682A CN102184732A (en) 2011-04-28 2011-04-28 Fractal-feature-based intelligent wheelchair voice identification control method and system

Publications (1)

Publication Number Publication Date
CN102184732A true CN102184732A (en) 2011-09-14

Family

ID=44570898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011101091682A Pending CN102184732A (en) 2011-04-28 2011-04-28 Fractal-feature-based intelligent wheelchair voice identification control method and system

Country Status (1)

Country Link
CN (1) CN102184732A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102800316A (en) * 2012-08-30 2012-11-28 重庆大学 Optimal codebook design method for voiceprint recognition system based on nerve network
CN104306118A (en) * 2014-11-07 2015-01-28 重庆邮电大学 Smartphone based family monitoring system on intelligent wheelchair
CN104538029A (en) * 2014-12-16 2015-04-22 重庆邮电大学 Robust speech recognition method and system based on speech enhancement and improved PNSC
CN104766607A (en) * 2015-03-05 2015-07-08 广州视源电子科技股份有限公司 Television program recommendation method and system
CN105250084A (en) * 2015-10-24 2016-01-20 陈丹 External command compiler
CN105334819A (en) * 2015-10-24 2016-02-17 陈丹 Wireless signal transmission butting device
CN106028217A (en) * 2016-06-20 2016-10-12 咻羞科技(深圳)有限公司 Intelligent device interacting system and method based on audio identification technology based
CN106448659A (en) * 2016-12-19 2017-02-22 广东工业大学 Speech endpoint detection method based on short-time energy and fractal dimensions
CN106557164A (en) * 2016-11-18 2017-04-05 北京光年无限科技有限公司 It is applied to the multi-modal output intent and device of intelligent robot
CN107331386A (en) * 2017-06-26 2017-11-07 上海智臻智能网络科技股份有限公司 End-point detecting method, device, processing system and the computer equipment of audio signal
CN110047480A (en) * 2019-04-22 2019-07-23 哈尔滨理工大学 Added Management robot head device and control for the inquiry of department, community hospital
CN110060697A (en) * 2019-04-14 2019-07-26 湖南检信智能科技有限公司 A kind of emotional characteristic extraction method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101154385A (en) * 2006-09-28 2008-04-02 北京远大超人机器人科技有限公司 Control method for robot voice motion and its control system
WO2008148289A1 (en) * 2007-06-07 2008-12-11 Shenzhen Institute Of Advanced Technology An intelligent audio identifying system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
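Master's thesis, Chongqing University of Posts and Telecommunications (《重庆邮电大学硕士学位论文》) 20101115 Li Yanhua (李艳花) Research and Implementation of Intelligent Wheelchair Speech Recognition Control Technology Based on Feature Extraction (基于特征提取的智能轮椅语音识别控制技术的研究与实现) 1-10 , 2 *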

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102800316A (en) * 2012-08-30 2012-11-28 重庆大学 Optimal codebook design method for voiceprint recognition system based on nerve network
CN102800316B (en) * 2012-08-30 2014-04-30 重庆大学 Optimal codebook design method for voiceprint recognition system based on nerve network
CN104306118A (en) * 2014-11-07 2015-01-28 重庆邮电大学 Smartphone based family monitoring system on intelligent wheelchair
CN104538029A (en) * 2014-12-16 2015-04-22 重庆邮电大学 Robust speech recognition method and system based on speech enhancement and improved PNSC
CN104766607A (en) * 2015-03-05 2015-07-08 广州视源电子科技股份有限公司 Television program recommendation method and system
CN105334819A (en) * 2015-10-24 2016-02-17 陈丹 Wireless signal transmission butting device
CN105250084A (en) * 2015-10-24 2016-01-20 陈丹 External command compiler
CN106028217A (en) * 2016-06-20 2016-10-12 咻羞科技(深圳)有限公司 Intelligent device interacting system and method based on audio identification technology based
CN106028217B (en) * 2016-06-20 2020-01-21 咻羞科技(深圳)有限公司 Intelligent equipment interaction system and method based on audio recognition technology
CN106557164A (en) * 2016-11-18 2017-04-05 北京光年无限科技有限公司 It is applied to the multi-modal output intent and device of intelligent robot
CN106448659A (en) * 2016-12-19 2017-02-22 广东工业大学 Speech endpoint detection method based on short-time energy and fractal dimensions
CN107331386A (en) * 2017-06-26 2017-11-07 上海智臻智能网络科技股份有限公司 End-point detecting method, device, processing system and the computer equipment of audio signal
CN110060697A (en) * 2019-04-14 2019-07-26 湖南检信智能科技有限公司 A kind of emotional characteristic extraction method
CN110047480A (en) * 2019-04-22 2019-07-23 哈尔滨理工大学 Added Management robot head device and control for the inquiry of department, community hospital

Similar Documents

Publication Publication Date Title
CN102184732A (en) Fractal-feature-based intelligent wheelchair voice identification control method and system
Wang et al. Wavelet packet analysis for speaker-independent emotion recognition
CN103310788A (en) Voice information identification method and system
CN105342769A (en) Intelligent electric wheelchair
CN110309503A (en) A kind of subjective item Rating Model and methods of marking based on deep learning BERT--CNN
CN106228977A (en) The song emotion identification method of multi-modal fusion based on degree of depth study
CN103092329A (en) Lip reading technology based lip language input method
CN105919591A (en) Surface myoelectrical signal based sign language recognition vocal system and method
CN110718234A (en) Acoustic scene classification method based on semantic segmentation coding and decoding network
CN107393554A (en) In a kind of sound scene classification merge class between standard deviation feature extracting method
CN104900229A (en) Method for extracting mixed characteristic parameters of voice signals
CN103413113A (en) Intelligent emotional interaction method for service robot
CN106340298A (en) Voiceprint unlocking method integrating content recognition and speaker recognition
Guo et al. Speech Emotion Recognition by Combining Amplitude and Phase Information Using Convolutional Neural Network.
CN109977258A (en) Cross-module state searching classification device model, searching system and the search method of image and voice
CN102592593B (en) Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech
Noroozi et al. Supervised vocal-based emotion recognition using multiclass support vector machine, random forests, and adaboost
CN103294199A (en) Silent information identifying system based on facial muscle sound signals
CN116092497A (en) Semantic cloud brain robot based on knowledge graph and artificial intelligence
Pham et al. A method upon deep learning for speech emotion recognition
CN103971676B (en) A kind of Rapid Speech isolated word recognition algorithm and application thereof, speech recognition system
Zhou et al. A hybrid speech emotion recognition system based on spectral and prosodic features
Qian et al. Target Classification in Unattended Ground Sensors With a Two-Stream Convolutional Network
Bhushan et al. A Self-Attention Based Hybrid CNN-LSTM for Speaker-Independent Speech Emotion Recognition
CN1242377C (en) Guangdong Language print identifying method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110914