
CA2090205C - Speech coding system - Google Patents

Speech coding system

Info

Publication number
CA2090205C
CA2090205C CA002090205A CA2090205A
Authority
CA
Canada
Prior art keywords
speech
vector
adaptive
weighted
weighting
Prior art date
Legal status
Expired - Fee Related
Application number
CA002090205A
Other languages
French (fr)
Other versions
CA2090205A1 (en)
Inventor
Masahiro Serizawa
Current Assignee
NEC Corp
Original Assignee
NEC Corp
Priority date
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Publication of CA2090205A1 publication Critical patent/CA2090205A1/en
Application granted granted Critical
Publication of CA2090205C publication Critical patent/CA2090205C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

An input speech signal is split into time sections, and the split signal for each time section is generated as a speech vector. LPC coefficient sets are developed by linear prediction analysis for every time section of the speech vector. The speech vector is weighted based on the developed LPC coefficient sets; then a plurality of the weighted speech vectors are connected, and the connected speech vector having a predetermined frame length is generated.
An excitation codevector whose weighted synthesized signal is most similar to the weighted speech vector is determined from among a plurality of excitation codevectors, each having the frame length, which are previously stored as a sound source. A plurality of adaptive codevectors, each having the frame length and obtained by cutting out a sound source signal produced from the determined excitation vectors at predetermined timing points, are stored in an adaptive codebook. An adaptive codevector whose weighted synthesized signal is most similar to the weighted speech vector is determined from among the plurality of adaptive codevectors.

Description

SPEECH CODING SYSTEM
BACKGROUND OF THE INVENTION
The present invention relates to a speech coding system for high quality coding of a speech signal at a low bit rate, e.g., 8 kbps or lower.
One example of a conventional speech coding system is known as CELP (Code Excited LPC Coding), disclosed in the technical paper entitled "CODE-EXCITED LINEAR PREDICTION (CELP): HIGH QUALITY SPEECH AT VERY LOW BIT RATES", IEEE Proc. ICASSP-85, pp. 937-940, 1985, by M. Schroeder and B. Atal. In this system, linear predictive analysis is first performed on the speech (voice) signal in a frame of a predetermined period at the transmission side to obtain the linear predictive coefficient sets. In an adaptive codebook are stored a plurality of adaptive codevectors produced by cutting out the previously synthesized sound source signal at predetermined timings. From the adaptive codebook, the adaptive codevector having the smallest square distance is searched based upon the perceptually weighted speech vector and the perceptually weighted synthesized adaptive codevectors in the adaptive codebook. The searched synthesized adaptive codevector is subtracted from the weighted speech vector to obtain the residual vector. In an excitation codebook are stored a plurality of excitation codevectors obtained from, for example, a noise signal in a predetermined frame. An excitation codevector having the smallest square distance is searched based upon the perceptually weighted synthesized excitation codevectors and the residual vector. Optimum gains are calculated for the searched adaptive codevector and excitation codevector. Indexes of the searched adaptive codevector, the excitation codevector, the gains, and the linear predictive coefficient set are transmitted. At the receiving side, the speech signal is synthesized based on these indexes.
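For orientation, the index-and-gain search described above reduces to minimizing a squared distance over a table of weighted synthesized codevectors. The following is a minimal sketch of that step only, written in Python; the function and variable names are illustrative, and the candidates are assumed to have already been passed through the weighted synthesis filter.

```python
import numpy as np

def search_min_square_distance(target, weighted_candidates):
    """Return the index and optimum gain of the candidate whose gain-scaled
    version is closest to the target vector in squared distance."""
    best_index, best_gain, best_dist = -1, 0.0, np.inf
    for j, y in enumerate(weighted_candidates):
        energy = float(np.dot(y, y))
        if energy == 0.0:
            continue                                      # skip empty candidates
        gain = float(np.dot(target, y)) / energy          # optimum gain for entry j
        dist = float(np.sum((target - gain * y) ** 2))    # squared distance
        if dist < best_dist:
            best_index, best_gain, best_dist = j, gain, dist
    return best_index, best_gain
```

In CELP the adaptive codebook is searched first against the weighted speech vector, and the excitation codebook is then searched against the residual left after subtracting the scaled adaptive contribution.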
A disadvantage of the conventional speech coding system is degradation in speech quality. The reason is that, in the codebook searching, the square distance on the weighted vectors is calculated using the same LPC coefficient set throughout the vector. Accordingly, if the vector length is long, changes in the frequency characteristics of the speech signal within the vector cannot be sufficiently approximated.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to overcome the above disadvantage and to provide a speech coding system whose codebook searching is capable of efficiently quantizing the speech signal.
According to the present invention, there is provided a speech coding system comprising: (a) means for splitting an input speech signal and generating a first speech vector; (b) means for further splitting the first split speech vector and generating a second speech vector; (c) means for calculating a weighting filter coefficient for each speech sub-vector obtained by further splitting the second speech vector by using the input speech signal; (d) adaptive codebook means for storing adaptive codevectors each having the same length as the second speech vector, generated on the basis of sound source signal reproduced in the past; (e) means for storing a plurality of excitation codevectors each having the same length as the second speech vector; (f) means for weighting the second speech vector for each time section corresponding to each speech sub-vector by using the weighting filter coefficients; (g) means for weighting and synthesizing the adaptive codevectors for each time section corresponding to each speech sub-vector by using the weighting filter coefficients; (h) means for selecting an adaptive codevector by comparing the weighted second speech vector and the weighted synthesized adaptive codevectors; (i) means for weighting and synthesizing the excitation codevectors for each time section corresponding to each speech sub-vector by using the weighting filter coefficients; and (j) means for selecting an excitation codevector by using the weighted second speech vector, and the vectors obtained by weighting and synthesizing the selected adaptive codevector and the selected excitation codevector.
The LPC coefficient set may be developed by linear prediction analysis over a predetermined time period longer than the time section, and the LPC coefficient set for each time section may then be developed by interpolation of the frame LPC coefficient set for that predetermined time period.
Other objects and features will be clarified from the following description with reference to attached drawing.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of one embodiment of the speech coding system according to the present invention.

PREFERRED EMBODIMENTS
In operation of the speech coding system according to the present invention, in order to determine the adaptive codevector and the excitation codevector, first a speech vector synthesized by a filter is calculated for each speech vector in a given time length of the input speech signal, using the LPC coefficient set obtained through linear prediction of the input speech. Then, linear predictive analysis is performed in each of the predetermined sections (e.g., two sections of 0 to N/2-1 and N/2 to N-1, where N is the vector length = frame length) within the vector to develop the section LPC coefficient sets. The developed LPC coefficient sets are used for perceptual weighting of the speech vector. The squared sum of the weighted synthesized signal is developed in accordance with the following expression (1). The codevector having the smallest distance is searched based upon the developed squared sum.
D_j = \sum_{k=0}^{N-1} \left( x_k^w - g_{ac,j}\, x_{ac,k}^w(j) - g_{ec,j}\, x_{ec,k}^w(j) \right)^2     ... (1)

where x_k^w represents the k-th element of the perceptually weighted speech vector for the speech vector element x_k and is given by:

x_k^w = \frac{W_i(\gamma_2 q^{-1})}{W_i(\gamma_1 q^{-1})}\, x_k

where the weighting function W_i(q^{-1}) is given by:

W_i(q^{-1}) = 1 + \sum_{l=1}^{L_i} w_i(l)\, q^{-l}

and q^{-1} is a shift operator representing a time delay, satisfying:

q^{-1} x_k = x_{k-1}, \qquad q^{-j} q^{-i} = q^{-(j+i)}

Here i denotes the section number, w_i(l) is the l-th order value of the LPC coefficients for the i-th section of the speech vector (obtained through linear predictive analysis of the input speech signal in the analysis window including that section), and L_i is the order of the analysis. \gamma_1 and \gamma_2 are coefficients for adjusting the perceptual weighting.

x_{ac,k}^w(j) is the k-th element of the weighted synthesized adaptive vector for the adaptive codevector element C_{ac,k}(j) of index j and is given by:

x_{ac,k}^w(j) = \frac{W_i(\gamma_2 q^{-1})}{W_i(\gamma_1 q^{-1})\, H_i(q^{-1})}\, C_{ac,k}(j)

where:

H_i(q^{-1}) = 1 + \sum_{l=1}^{L_i} a_i(l)\, q^{-l}

and a_i(l) is the l-th order value of the LPC coefficients obtained through, for example, quantizing and decoding corresponding to the speech vector (by way of, for example, linear predictive analysis of the input speech signal in the analysis window including the split frame).

x_{ec,k}^w(j) represents the k-th element of the weighted synthesized excitation vector for the excitation codevector element C_{ec,k}(j) of index j and is given by:

x_{ec,k}^w(j) = \frac{W_i(\gamma_2 q^{-1})}{W_i(\gamma_1 q^{-1})\, H_i(q^{-1})}\, C_{ec,k}(j)

g_{ac,j} and g_{ec,j} are the optimum gains when the adaptive codevector and the excitation codevector of index j are searched. Expression (1) is utilized in the codebook searching so as to follow changes in the frequency response of the speech signal within the vector, thereby improving the quality of the coded speech.
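As a concrete reading of expression (1), the sketch below applies the section-wise weighting filter W_i(\gamma_2 q^{-1})/W_i(\gamma_1 q^{-1}) to a vector and then evaluates the squared distance for one candidate pair. The \gamma values, the use of SciPy's direct-form filter, and the carrying of filter state across sections are assumptions made for the sketch, not values prescribed by the patent.

```python
import numpy as np
from scipy.signal import lfilter

def perceptual_weight(x, section_lpc, gamma1=0.6, gamma2=0.9):
    """Filter each section of x with W_i(gamma2 q^-1) / W_i(gamma1 q^-1),
    where W_i(q^-1) = 1 + sum_l w_i(l) q^-l uses that section's LPC set.
    Assumes every section uses the same LPC order; gamma values are assumed."""
    x = np.asarray(x, dtype=float)
    n_sections = len(section_lpc)
    sec_len = len(x) // n_sections
    out, state = [], None
    for i, w in enumerate(section_lpc):
        w = np.asarray(w, dtype=float)
        taps = np.arange(1, len(w) + 1)
        num = np.concatenate(([1.0], w * gamma2 ** taps))   # W_i(gamma2 q^-1)
        den = np.concatenate(([1.0], w * gamma1 ** taps))   # W_i(gamma1 q^-1)
        seg = x[i * sec_len:(i + 1) * sec_len]
        if state is None:
            state = np.zeros(len(w))
        y, state = lfilter(num, den, seg, zi=state)         # carry state across sections
        out.append(y)
    return np.concatenate(out)

def square_distance(x_w, x_ac_w, x_ec_w, g_ac, g_ec):
    """Expression (1): squared distance between the weighted speech vector and
    the gain-scaled weighted synthesized adaptive and excitation vectors."""
    return float(np.sum((x_w - g_ac * x_ac_w - g_ec * x_ec_w) ** 2))
```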
Now, one embodiment of the speech coding system according to the present invention will be described by reference to FIG. 1.
Illustrated in FIG. 1 is a block diagram of one preferred embodiment of the present invention. A speech signal is received at an input terminal 10 and is applied to a frame splitter 100, a perceptual weighting splitter 120 and a synthesis filter splitter 130. The frame splitter 100 splits the speech signal at every frame length (e.g., 5 ms), and the split speech signal is supplied to an in-frame splitter 105 as the speech vector. The in-frame splitter 105 further splits the speech vector supplied from the frame splitter 100 into halves, for example, and supplies the fine (in-frame) split speech vectors to a weighting filter 110.
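A minimal sketch of the two splitting stages follows, assuming an 8 kHz sampling rate so that a 5 ms frame is 40 samples; the sampling rate is not stated in the patent and the names are illustrative.

```python
import numpy as np

def split_frames(speech, frame_len):
    """Frame splitter 100: cut the input into consecutive frame-length vectors."""
    n_frames = len(speech) // frame_len
    return speech[:n_frames * frame_len].reshape(n_frames, frame_len)

def split_in_frame(frame, n_sub=2):
    """In-frame splitter 105: further split one frame, here into halves."""
    return np.split(np.asarray(frame), n_sub)

speech = np.random.randn(8000)                  # 1 s of placeholder input
frames = split_frames(speech, frame_len=40)     # 5 ms frames at an assumed 8 kHz
halves = split_in_frame(frames[0])              # the two fine split speech vectors
```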
The perceptual weighting splitter 120 splits the speech signal from the input terminal 10 into window lengths of, for example, 20 ms, from which an LPC analyzer 125 develops, through linear prediction analysis, the LPC coefficient sets to be used for perceptual weighting. The LPC interpolator 127 calculates an interpolation set of the LPC coefficient sets supplied from the LPC analyzer 125 for each split speech vector. The interpolation set is then sent to the weighting filter 110, a weighting filter 160 and a weighting filter 195.
The synthesis filter splitter 130 splits the speech signal from the input terminal 10 into window lengths of, for example, 20 ms, from which an LPC analyzer 135 develops, through linear prediction analysis, the LPC coefficient sets to be used for synthesis. An LPC coefficient quantizer 140 quantizes the LPC coefficient set from the LPC analyzer 135, supplies the quantization index to a multiplexer 300, and supplies the decoded LPC coefficient set to an LPC interpolator 142. The LPC interpolator 142 calculates, by a known method, an interpolation set of the LPC coefficient sets received from the LPC analyzer 135 corresponding to each fine split speech vector. The calculated interpolation set is sent to a synthesis filter 155 and a synthesis filter 190.
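The patent leaves the interpolation to a known method. As one hedged possibility, the sketch below interpolates linearly between the previous and current analysis results to obtain one coefficient set per fine split speech vector; practical coders more often interpolate in the LSP domain rather than directly on LPC coefficients.

```python
import numpy as np

def interpolate_lpc(prev_lpc, curr_lpc, n_sub=2):
    """Illustrative stand-in for LPC interpolators 127 and 142: produce one
    coefficient set per sub-vector by linear interpolation between the
    previous frame's and the current frame's LPC sets."""
    prev_lpc = np.asarray(prev_lpc, dtype=float)
    curr_lpc = np.asarray(curr_lpc, dtype=float)
    sets = []
    for i in range(n_sub):
        alpha = (i + 1) / n_sub        # later sub-vectors lean more on the new set
        sets.append((1.0 - alpha) * prev_lpc + alpha * curr_lpc)
    return sets
```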
The weighting filter 110 performs perceptual weighting on the fine split speech vectors received from the in-frame splitter 105 using the interpolated LPC coefficient sets received from the LPC interpolator 127 and sends the perceptually weighted split speech vectors to a connector 115. The connector 115 connects the fine split speech vectors received from the weighting filter 110 and sends them to a subtractor 175 and a least square error index searcher 170.
An adaptive codebook 145 stores the pitch information of the speech signal, such as the past synthesized sound source signal of a predetermined several frames received from an adder 205 (which will be described hereinafter). The adaptive codevector of a given frame length, cut out at predetermined timings, is sent to an in-frame splitter 150. The in-frame splitter 150 splits the adaptive codevector received from the adaptive codebook 145 into, for example, halves and sends the fine split adaptive codevector to a synthesis filter 155. The synthesis filter 155 filters the fine split adaptive codevector received from the in-frame splitter 150 using the interpolated LPC coefficient set received from the LPC interpolator 142.
The weighting filter 160 performs perceptual weighting of the signal vector synthesized by the synthesis filter 155 in accordance with the interpolated LPC coefficient set received from the LPC interpolator 127. The connector 165 includes a similar buffer memory and connects the perceptually weighted split adaptive codevectors received from the weighting filter 160.
The least square error index searcher 170 calculates the square distance between the weighted synthesized adaptive codevector received from the connector 165 and the weighted speech vector received from the connector 115. When the square distance is minimum, the weighted synthesized adaptive codevector is sent to the subtractor 175, the adaptive codevector received from the adaptive codebook 145 is sent to the adder 205, and its index is sent to the multiplexer 300.
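The chain formed by the in-frame splitter 150, synthesis filter 155, weighting filter 160 and connector 165 can be sketched as below: each part of a candidate codevector is passed through 1/H_i(q^{-1}) with the interpolated quantized LPC set, then through the perceptual weighting filter, and the parts are reconnected before the squared-distance test. The filter forms and \gamma values repeat the assumptions of the earlier sketches, and filter state is reset per part for brevity.

```python
import numpy as np
from scipy.signal import lfilter

def weighted_synthesis(codevector, synth_lpc, weight_lpc, gamma1=0.6, gamma2=0.9):
    """Split a frame-length codevector into as many parts as there are LPC
    sets, run each part through the synthesis filter 1/H_i(q^-1) and the
    perceptual weighting filter, then reconnect the parts."""
    parts = np.split(np.asarray(codevector, dtype=float), len(synth_lpc))
    out = []
    for seg, a, w in zip(parts, synth_lpc, weight_lpc):
        a = np.asarray(a, dtype=float)
        w = np.asarray(w, dtype=float)
        syn = lfilter([1.0], np.concatenate(([1.0], a)), seg)   # 1 / H_i(q^-1)
        taps = np.arange(1, len(w) + 1)
        num = np.concatenate(([1.0], w * gamma2 ** taps))
        den = np.concatenate(([1.0], w * gamma1 ** taps))
        out.append(lfilter(num, den, syn))                      # perceptual weighting
    return np.concatenate(out)
```

The least square error index searcher 170 then applies the same minimum squared distance test shown earlier to these reconnected vectors; blocks 185, 190, 195 and 200 perform the corresponding operations for the excitation codebook.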
The subtractor 175 develops the adaptive codebook residual vector by subtracting the weighted synthesized adaptive codevector received from the least square error index searcher 170 from the weighted speech vector received from the connector 115. The adaptive codebook residual vector is then sent to a least square error index searcher 207.
An excitation codebook 180 sends the excitation codevector to an in-frame splitter 185. The in-frame splitter 185 further splits (fine splits) the excitation codevector received from the excitation codebook 180 into, for example, halves and sends the fine split excitation codevector to a synthesis filter 190. The synthesis filter 190 filters the split excitation codevector received from the in-frame splitter 185 using the interpolated LPC coefficient set received from the LPC interpolator 142.
The weighting filter 195 performs perceptual weighting of the synthesized vector received from the synthesis filter 190 using the interpolated LPC coefficient set received from the LPC interpolator 127. The connector 200 connects the weighted synthesized and fine split excitation codevectors received from the weighting filter 195 and sends the connected vector to the least square error index searcher 207. The least square error index searcher 207 develops the square distance between the adaptive codebook residual vector received from the subtractor 175 and the weighted synthesized excitation codevector received from the connector 200. When the minimum square distance is found, the excitation codevector received from the excitation codebook 180 is sent to the adder 205 and its index is sent to the multiplexer 300.
The adder 205 adds the adaptive codevector received from the least square error index searcher 170 and the excitation codevector received from the least square error index searcher 207 and supplies the added result to the adaptive codebook 145.
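A sketch of this feedback path: the selected adaptive and excitation codevectors are summed to form the reproduced sound source signal, which is appended to the history buffer from which the adaptive codebook 145 later cuts out candidates. The buffer length, the lag-based cut-out, and the explicit gain scaling are assumptions made for the sketch (the patent's adder text mentions only the sum of the two codevectors).

```python
import numpy as np

class AdaptiveCodebookBuffer:
    """Minimal sound source history behind the adaptive codebook 145 (sketch)."""

    def __init__(self, history_len=1024):
        self.history_len = history_len
        self.buffer = np.zeros(history_len)

    def update(self, adaptive_vec, excitation_vec, g_ac=1.0, g_ec=1.0):
        """Adder 205: store the reproduced sound source for use in later frames."""
        source = g_ac * np.asarray(adaptive_vec) + g_ec * np.asarray(excitation_vec)
        self.buffer = np.concatenate((self.buffer, source))[-self.history_len:]
        return source

    def cut_out(self, lag, frame_len):
        """Cut out a frame-length adaptive codevector at a given timing (lag),
        assuming lag >= frame_len so the slice stays inside the history."""
        start = self.history_len - lag
        return self.buffer[start:start + frame_len]
```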
The multiplexer 300 combines the outputs from the LPC quantizer/decoder 140 and the least square error index searcher 170, and also the indexes from the least square error index searcher 207, and sends the combined data to an output terminal 305.
In the above coding system, the perceptual weighting LPC coefficient sets may be the LPC coefficient sets from the LPC analyzer 135 or the quantized LPC coefficient sets. In this case, the perceptual weighting splitter 120 and the LPC analyzer 125 are unnecessary. The LPC interpolator 127 can be eliminated if the LPC coefficient sets for perceptual weighting are obtained by performing linear prediction analyses equal in number to the number of splits in the frame. The number of in-frame splits may be 1. Also, the LPC analyzer may be modified to perform linear prediction analysis of the speech signal over the predetermined window length (e.g., 20 ms) at every period (e.g., 20 ms) equal to a multiple of the frame length.
In the foregoing embodiment, the determination of the searched excitation codevector and adaptive codevector corresponds to the determination of the sound source information and the pitch information of the input speech signal.

As understood from the above description, the speech coding apparatus according to the present invention performs the codebook search using the weighted synthesized square distance split within a frame, thereby providing improved quality as compared to the conventional method.

Claims (3)

1. A speech coding system comprising:
(a) means for splitting an input speech signal and generating a first speech vector;
(b) means for further splitting the first split speech vector and generating a second speech vector;
(c) means for calculating a weighting filter coefficient for each speech sub-vector obtained by further splitting the second speech vector by using the input speech signal;
(d) adaptive codebook means for storing adaptive codevectors each having the same length as the second speech vector, generated on the basis of sound source signal reproduced in the past;
(e) means for storing a plurality of excitation codevectors each having the same length as the second speech vector;
(f) means for weighting the second speech vector for each time section corresponding to each speech sub-vector by using the weighting filter coefficients;
(g) means for weighting and synthesizing the adaptive codevectors for each time section corresponding to each speech sub-vector by using the weighting filter coefficients;
(h) means for selecting an adaptive codevector by comparing the weighted second speech vector and the weighted synthesized adaptive codevectors;
(i) means for weighting and synthesizing the excitation codevectors for each time section corresponding to each speech sub-vector by using the weighting filter coefficients;
and (j) means for selecting an excitation codevector by using the weighted second speech vector, and the vectors obtained by weighting and synthesizing the selected adaptive codevector and the selected excitation codevector.
2. The speech coding system according to claim 1, wherein the weighting filter coefficients for each speech sub-vector is calculated by using, as a weighting filter coefficient corresponding to a speech sub-vector other than the last part of the speech vector, coefficients obtained by interpolating weighting filter coefficients corresponding to the speech sub-vector in the last part of a speech vector in the past and weighting filter coefficients corresponding to the speech sub-vector in the last part of the pertinent speech vector.
3. The speech coding system according to claim 1, wherein LPC coefficients are used as the weighting filter coefficients.
CA002090205A 1992-02-24 1993-02-23 Speech coding system Expired - Fee Related CA2090205C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP035881/1992 1992-02-24
JP03588192A JP3248215B2 (en) 1992-02-24 1992-02-24 Audio coding device

Publications (2)

Publication Number Publication Date
CA2090205A1 CA2090205A1 (en) 1993-08-25
CA2090205C true CA2090205C (en) 1998-08-04

Family

ID=12454351

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002090205A Expired - Fee Related CA2090205C (en) 1992-02-24 1993-02-23 Speech coding system

Country Status (4)

Country Link
EP (1) EP0557940B1 (en)
JP (1) JP3248215B2 (en)
CA (1) CA2090205C (en)
DE (1) DE69329476T2 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2591430B2 (en) * 1993-06-30 1997-03-19 日本電気株式会社 Vector quantizer
DE69623903T2 (en) 1995-05-10 2003-05-15 Nintendo Co Ltd ACTUATING DEVICE WITH ANALOG STICK COVER
JP3524247B2 (en) 1995-10-09 2004-05-10 任天堂株式会社 Game machine and game machine system using the same
KR100371456B1 (en) 1995-10-09 2004-03-30 닌텐도가부시키가이샤 Three-dimensional image processing system
JP3544268B2 (en) 1995-10-09 2004-07-21 任天堂株式会社 Three-dimensional image processing apparatus and image processing method using the same
US6267673B1 (en) 1996-09-20 2001-07-31 Nintendo Co., Ltd. Video game system with state of next world dependent upon manner of entry from previous world via a portal
US6190257B1 (en) 1995-11-22 2001-02-20 Nintendo Co., Ltd. Systems and method for providing security in a video game system
US6022274A (en) 1995-11-22 2000-02-08 Nintendo Co., Ltd. Video game system using memory module
TW419645B (en) * 1996-05-24 2001-01-21 Koninkl Philips Electronics Nv A method for coding Human speech and an apparatus for reproducing human speech so coded
US6241610B1 (en) 1996-09-20 2001-06-05 Nintendo Co., Ltd. Three-dimensional image processing system having dynamically changing character polygon number
US6139434A (en) 1996-09-24 2000-10-31 Nintendo Co., Ltd. Three-dimensional image processing apparatus with enhanced automatic and user point of view control
JP3655438B2 (en) 1997-07-17 2005-06-02 任天堂株式会社 Video game system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE68922134T2 (en) * 1988-05-20 1995-11-30 Nippon Electric Co Coded speech transmission system with codebooks for synthesizing low amplitude components.
US4975956A (en) * 1989-07-26 1990-12-04 Itt Corporation Low-bit-rate speech coder using LPC data reduction processing
EP0443548B1 (en) * 1990-02-22 2003-07-23 Nec Corporation Speech coder
JPH05108096A (en) * 1991-10-18 1993-04-30 Sanyo Electric Co Ltd Vector drive type speech encoding device

Also Published As

Publication number Publication date
CA2090205A1 (en) 1993-08-25
EP0557940A2 (en) 1993-09-01
EP0557940B1 (en) 2000-09-27
EP0557940A3 (en) 1994-03-23
JP3248215B2 (en) 2002-01-21
DE69329476D1 (en) 2000-11-02
DE69329476T2 (en) 2001-02-08
JPH05232997A (en) 1993-09-10

Similar Documents

Publication Publication Date Title
US5142584A (en) Speech coding/decoding method having an excitation signal
US5140638A (en) Speech coding system and a method of encoding speech
US6594626B2 (en) Voice encoding and voice decoding using an adaptive codebook and an algebraic codebook
US6023672A (en) Speech coder
EP0957472B1 (en) Speech coding apparatus and speech decoding apparatus
EP1162604B1 (en) High quality speech coder at low bit rates
CA2090205C (en) Speech coding system
EP1096476A2 (en) Speech decoding gain control for noisy signals
EP1005022B1 (en) Speech encoding method and speech encoding system
US6009388A (en) High quality speech code and coding method
US5873060A (en) Signal coder for wide-band signals
US5797119A (en) Comb filter speech coding with preselected excitation code vectors
JPH0944195A (en) Voice encoding device
US5884252A (en) Method of and apparatus for coding speech signal
US5708756A (en) Low delay, middle bit rate speech coder
JP2613503B2 (en) Speech excitation signal encoding / decoding method
JP3299099B2 (en) Audio coding device
JP3319396B2 (en) Speech encoder and speech encoder / decoder
JPH08185199A (en) Voice coding device
JP3471542B2 (en) Audio coding device
CA2325322A1 (en) Voice coding and decoding apparatus and method thereof
JP3192051B2 (en) Audio coding device
JP3092654B2 (en) Signal encoding device

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed