CN101192409A - Method and device for selecting self-adapting codebook excitation signal - Google Patents
- Publication number
- CN101192409A
- Authority
- CN
- China
- Prior art keywords
- adaptive codebook
- excitation signal
- codebook excitation
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a method and a device for selecting an adaptive codebook excitation signal. The method calculates a high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook and the residual signal of the input speech signal. If the calculated high-frequency excitation correlation is higher than a preset correlation threshold, the searched adaptive codebook signal is selected as the current adaptive codebook excitation signal; otherwise, the low-frequency part of the searched adaptive codebook, obtained by low-pass filtering, is selected as the current adaptive codebook excitation signal. The invention can improve wideband speech coding performance: tests show that the technical scheme improves the coding signal-to-noise ratio (SNR) compared with the prior art. In addition, the complexity of the selection calculation is greatly reduced: statistical analysis suggests that the complexity of the calculation method is only 60 percent of that of the prior art.
Description
Technical Field
The present invention relates to the field of digital speech coding, and more particularly, to a method and apparatus for selecting an adaptive codebook excitation signal.
Background
With the increasing popularity of multimedia applications, there is a great need for efficient wideband digital speech and audio coding techniques. Current narrowband speech is usually limited to a bandwidth of 200 Hz to 3400 Hz, and its naturalness, intelligibility and handling of music are unsatisfactory. In recent years, with the rapid development of broadband digital networks, third-generation mobile systems, the high-speed broadband Internet and similar networks have provided environments capable of delivering higher quality, close to that of face-to-face communication. Wideband speech codecs are therefore of greater practical significance.
Code Excited Linear Prediction (CELP) is widely used in narrowband speech coding because of its high coding efficiency and good coding quality. It extracts vocal tract parameters using linear prediction and uses a codebook containing many typical excitation vectors as excitation parameters. Each time coding is performed, the codebook is searched for a vector to serve as the excitation vector, which consists of two parts: one part comes from the past excitation, i.e. the adaptive codebook; the other part comes from the update (innovation) vector, i.e. the fixed codebook. The index of the excitation vector in the codebook is encoded and transmitted to the decoding end, which looks up the table to obtain the excitation vector and synthesizes the speech through a synthesis filter.
Compared with narrowband speech, wideband speech varies over a wider range. For weakly periodic signals such as unvoiced sounds and transition sounds, the periodicity does not extend over the whole frequency range. For such signals, the adaptive codebook excitation signal should be suitably low-pass filtered to remove the weakly periodic high-frequency part, so that the harmonic structure of the wideband speech spectrum is modeled better and the wideband speech coding performance improves.
The AMR-WB+ coding standard (3GPP TS 26.290, "Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec", Dec 2004) adopts the scheme of U.S. publication No. US20050108005, entitled "Method and device for adaptive bandwidth pitch search in coding wideband signals", which discloses a multi-path closed-loop method for selecting the adaptive codebook excitation signal. The specific scheme is as follows:
1. Calculating the target signal and impulse response:
The input speech signal speech(n) is perceptually weighted. Let w(n) be the response of the perceptual weighting filter; the weighted-domain signal wsp(n) is:
wsp(n)=speech(n)*w(n) (where * denotes convolution)
Calculate the zero-input response xn2(n) of the weighted synthesis filter.
Let the target signal used for the adaptive codebook search be xn(n):
xn(n)=wsp(n)-xn2(n)
The target signal is used to select among the candidate values during the closed-loop pitch search.
Let the transfer function of the weighted synthesis filter be H(z); its impulse response is then h(n)=Z^{-1}(H(z)), where Z^{-1} denotes the inverse Z-transform.
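The target-signal computation of step 1 can be sketched as follows. This is a minimal illustration, not the codec's implementation: it assumes the weighting filter can be represented by a short finite impulse response w(n) and that the zero-input response xn2(n) has already been computed elsewhere. The helper names `convolve` and `target_signal` are hypothetical.

```python
def convolve(x, h):
    """Causal convolution truncated to len(x) samples: y(n) = sum_i x(i) * h(n - i)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i in range(max(0, n - len(h) + 1), n + 1):
            acc += x[i] * h[n - i]
        y.append(acc)
    return y


def target_signal(speech, w, xn2):
    """xn(n) = wsp(n) - xn2(n), with wsp(n) = speech(n) * w(n)."""
    wsp = convolve(speech, w)
    return [a - b for a, b in zip(wsp, xn2)]


# Toy 4-sample "subframe", a 2-tap weighting response (assumption), zero ZIR.
xn = target_signal([1.0, 2.0, 3.0, 4.0], [1.0, 0.5], [0.0, 0.0, 0.0, 0.0])
```

In a real encoder the weighting and synthesis filters are recursive and carry state between subframes, which is why the zero-input response is subtracted separately.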
2. Adaptive codebook search
The adaptive codebook parameters are the pitch lag and the gain. In the search stage, the excitation is extended with the linear prediction residual to simplify the closed-loop search. The adaptive codebook search is performed once per subframe.
In the 12.65, 14.25, 15.85, 18.25, 19.85, 23.05 and 23.85 kbit/s modes, the pitch lag T1 of the first and third subframes of each frame is searched with fractional resolution in the two lower sub-ranges of the lag range, each with its own resolution, and with integer resolution only in the range [160, 231]. The pitch lag T2 of the second and fourth subframes of each frame is searched with fractional resolution in a range around int(T1). Here int(T1) is the integer part of the fractional pitch lag T1 of the nearest first or third subframe, and the range is adapted when T1 straddles the boundaries of the delay range.
In the 8.85 kbit/s mode, the pitch lag T1 of the first and third subframes of each frame is searched with fractional resolution in the lower part of the lag range and with integer resolution only in the interval [92, 231]. The pitch lag T2 of the second and fourth subframes is searched with fractional resolution in a range around int(T1).
In the lowest mode, 6.60 kbit/s, the pitch lag T1 of the first subframe of each frame is searched with fractional resolution in the lower part of the lag range and with integer resolution only in the interval [92, 231]. The pitch lag T2 of the second, third and fourth subframes is searched with fractional resolution in a range around int(T1).
The closed-loop pitch search minimizes the mean-squared weighted error between the original speech and the reconstructed speech, which is equivalent to maximizing the normalized correlation R(k):
$$R(k) = \frac{\sum_{n=0}^{63} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{63} y_k(n)\, y_k(n)}}$$
where x(n) is the target signal and y_k(n) is the past excitation filtered at delay k, i.e. the convolution of the past excitation with h(n): y_k(n)=h(n)*exc(n-k). The search range is limited to the neighborhood of a preselected value: the open-loop pitch T_op for the first and third subframes of each frame, and int(T1), the integer part of the fractional delay T1 of the nearest first or third subframe, for the second and fourth subframes. The convolution y_k(n) is computed for the delay t_min; for the other integer delays k = t_min+1, ..., t_max in the search range it is updated by the following relation:
y_k(n)=y_{k-1}(n-1)+exc(-k)h(n), n=63, ..., 0 (1)
where exc(n), n = -231, ..., 63, is the excitation buffer and y_{k-1}(-1)=0. In the search stage the samples exc(n), n = 0, ..., 63, are not yet known, and they are needed only when the pitch lag is less than 64. To simplify the search, the linear prediction residual is stored in exc(n) so that equation (1) is valid for all delays.
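The recursive update above avoids recomputing the full convolution for every candidate delay. The sketch below checks the recursion against a direct convolution on toy data. The buffer layout (exc(n) stored at list index `origin + n`), the toy excitation and impulse response, and the function names are assumptions made for illustration only.

```python
def direct_filtered_excitation(exc, origin, h, k, length=64):
    """y_k(n) = sum_{i=0..n} exc(i - k) * h(n - i), computed directly."""
    return [
        sum(exc[origin + i - k] * h[n - i] for i in range(n + 1))
        for n in range(length)
    ]


def update_filtered_excitation(y_prev, exc, origin, h, k):
    """Apply y_k(n) = y_{k-1}(n-1) + exc(-k) * h(n), with y_{k-1}(-1) = 0."""
    length = len(y_prev)
    y = [0.0] * length
    for n in range(length - 1, -1, -1):  # n = 63, ..., 0 as in the text
        shifted = y_prev[n - 1] if n > 0 else 0.0
        y[n] = shifted + exc[origin - k] * h[n]
    return y


# Hypothetical toy data: exc(n) for n = -231..63 is stored at index origin + n.
origin = 231
exc = [((7 * i) % 11 - 5) / 5.0 for i in range(origin + 64)]
h = [0.9 ** n for n in range(64)]

t_min, t_max = 40, 45
y = direct_filtered_excitation(exc, origin, h, t_min)
max_diff = 0.0
for k in range(t_min + 1, t_max + 1):
    y = update_filtered_excitation(y, exc, origin, h, k)
    ref = direct_filtered_excitation(exc, origin, h, k)
    max_diff = max(max_diff, max(abs(a - b) for a, b in zip(y, ref)))
```

Each update costs one multiply-add per sample, whereas the direct form costs a full convolution per delay.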
Once the optimum integer closed-loop delays T1 and T2 are determined, the fractions around the best integer delay are tested. The normalized correlation R(k) is interpolated and searched for its maximum to obtain the fractional pitch lag. The interpolation for the search uses an FIR filter b16 based on a Hamming-windowed sinc function, truncated at ±15 and padded with zeros at ±16 (i.e. b16(16)=0); the cut-off frequency (-3 dB) of the filter is 5.063 kHz.
After the pitch lag is determined, the adaptive codebook vector v(n) is calculated by interpolating the past excitation exc(n) at the given integer lag k and fractional lag t, for n = 0, ..., 63 and t = 0, 1, 2, 3. The interpolation filter b64 is based on a Hamming-windowed sinc function, truncated at ±63 and padded with zeros at ±64 (i.e. b64(64)=0); the cut-off frequency (-3 dB) of the filter is 6.016 kHz.
3. Selection of adaptive codebook excitation signal
Referring to fig. 1, this step searches for the best adaptive codebook and its gain along two paths and selects the better one by comparison. The two excitation signals are:
a) the adaptive codebook v(n) obtained by the search in step 2;
b) a new adaptive codebook v'(n) obtained by low-pass filtering v(n) with the filter F(z)=0.18z^{-1}+0.64+0.18z, i.e. v'(n)=v(n)*f(n), where f(n) is the inverse Z-transform of F(z).
The excitation signals v(n) and v'(n) are processed identically; the processing of v(n) is described as an example:
i) Calculate the weighted synthesis signal synth(n) by convolving the excitation signal v(n) with the impulse response h(n) of the weighted synthesis filter:
synth(n)=v(n)*h(n)
ii) Calculate the gain that matches the synthesized signal to the target signal:
$$\mathrm{gain} = \frac{\sum_{n=0}^{63} x(n)\,\mathrm{synth}(n)}{\sum_{n=0}^{63} \mathrm{synth}(n)\,\mathrm{synth}(n)}$$
where x(n) is the target signal, synth(n) is the weighted synthesis signal, and gain is the prediction gain.
iii) Remove the long-term correlation from the target signal and calculate the energy of the error. The long-term correlation is removed using:
error(n)=x(n)-gain×synth(n)
which yields the error signal error(n). The energy ener of the error signal is:
$$\mathrm{ener} = \sum_{n=0}^{63} \mathrm{error}(n)^2$$
The low-pass filtered adaptive codebook v'(n) is processed through the same three steps to obtain the prediction gain gain' and the error signal energy ener'.
The two error signal energies obtained on the two paths are compared, and the path with the smaller error energy is selected as the processing mode of the excitation signal:
If ener ≤ ener', v(n) is selected as the excitation signal and gain is transmitted to the decoding end; if ener > ener', v'(n) is selected as the excitation signal and gain' is transmitted to the decoding end. Information indicating which adaptive codebook was selected is also transmitted to the decoding end, which applies the same processing to the synthesized excitation signal.
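The two-path closed-loop selection described in this step can be sketched as below. This is an illustration under stated assumptions, not the AMR-WB+ reference code: the gain is taken to be the least-squares gain sum(x·synth)/sum(synth²), samples outside the subframe are treated as zero when filtering, and all function names are hypothetical.

```python
def convolve(x, h):
    """Causal convolution truncated to len(x) samples."""
    return [
        sum(x[i] * h[n - i] for i in range(max(0, n - len(h) + 1), n + 1))
        for n in range(len(x))
    ]


def lowpass(v, side=0.18, center=0.64):
    """v'(n) = side*v(n-1) + center*v(n) + side*v(n+1), i.e. F(z) applied to v.
    Samples outside the subframe are treated as zero (assumption)."""
    m = len(v)
    out = []
    for n in range(m):
        left = v[n - 1] if n > 0 else 0.0
        right = v[n + 1] if n < m - 1 else 0.0
        out.append(side * left + center * v[n] + side * right)
    return out


def gain_and_error_energy(x, v, h):
    """Steps i) to iii): synthesis, least-squares gain (assumed form), error energy."""
    synth = convolve(v, h)
    den = sum(s * s for s in synth)
    gain = sum(a * s for a, s in zip(x, synth)) / den if den > 0.0 else 0.0
    ener = sum((a - gain * s) ** 2 for a, s in zip(x, synth))
    return gain, ener


def closed_loop_select(x, v, h):
    """Run both paths and keep the one with the smaller error energy."""
    gain, ener = gain_and_error_energy(x, v, h)
    v_filtered = lowpass(v)
    gain_f, ener_f = gain_and_error_energy(x, v_filtered, h)
    if ener <= ener_f:
        return "unfiltered", v, gain
    return "filtered", v_filtered, gain_f
```

Both paths require a synthesis convolution and an error-energy computation for every subframe, which is the cost the closed-loop approach pays for its selection.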
This closed-loop multi-path selection of the adaptive codebook excitation signal has high algorithmic complexity and yields only a limited coding performance gain.
Disclosure of Invention
To reduce the complexity of the selection algorithm, the present invention provides a method and apparatus for selecting an adaptive codebook excitation signal.
A method of selecting an adaptive codebook excitation signal, comprising:
calculating a target signal and an impulse response, and searching an adaptive codebook excitation signal according to the target signal and the impulse response;
and calculating high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook excitation signal and a residual signal of the input voice signal, judging whether the calculated high-frequency excitation correlation is larger than a preset correlation threshold value, and if so, selecting the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal.
An apparatus for selecting an adaptive codebook excitation signal, comprising: an adaptive codebook excitation signal searching unit, a processing unit and a comparison selection unit, wherein,
the adaptive codebook excitation signal searching unit is used for searching an adaptive codebook excitation signal according to the calculated target signal and the impulse response;
the processing unit is used for calculating the high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook excitation signal and the residual signal of the input voice signal and sending the calculated high-frequency excitation correlation value to the comparison selection unit;
and the comparison selection unit is used for selecting the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is greater than a preset correlation threshold.
The method and apparatus described above calculate a high-frequency excitation correlation value between the high-frequency part of the searched adaptive codebook and the residual signal of the input speech signal, and decide which signal to use as the current adaptive codebook excitation signal by comparing the calculated high-frequency excitation correlation with a preset correlation threshold. If the calculated high-frequency excitation correlation is greater than the preset correlation threshold, the searched adaptive codebook signal is selected as the current adaptive codebook excitation signal; otherwise, the low-frequency part of the searched adaptive codebook is used as the current adaptive codebook excitation signal.
Drawings
FIG. 1 is a schematic flow chart of a prior art method for selecting an adaptive codebook excitation signal;
FIG. 2 is a flow chart illustrating the selection of an adaptive codebook excitation signal according to an embodiment of the present invention;
FIG. 3 is a block diagram of an apparatus for selecting an adaptive codebook excitation signal according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be described with reference to specific examples.
1. Calculating the target signal and impulse response:
The input speech signal speech(n) is perceptually weighted. Let w(n) be the response of the perceptual weighting filter; the weighted-domain signal wsp(n) is:
wsp(n)=speech(n)*w(n) (where * denotes convolution)
Calculate the zero-input response xn2(n) of the weighted synthesis filter.
Let the target signal used for the adaptive codebook search be xn(n):
xn(n)=wsp(n)-xn2(n)
The target signal is used to select among the candidate values during the closed-loop pitch search.
Let the transfer function of the weighted synthesis filter be H(z); its impulse response is then h(n)=Z^{-1}(H(z)), where Z^{-1} denotes the inverse Z-transform.
2. Adaptive codebook search
The adaptive codebook parameters are the pitch lag and the gain. In the search stage, the excitation is extended with the linear prediction residual to simplify the closed-loop search. The adaptive codebook search is performed once per subframe.
In the 12.65, 14.25, 15.85, 18.25, 19.85, 23.05 and 23.85 kbit/s modes, the pitch lag T1 of the first and third subframes of each frame is searched with fractional resolution in the two lower sub-ranges of the lag range, each with its own resolution, and with integer resolution only in the range [160, 231]. The pitch lag T2 of the second and fourth subframes of each frame is searched with fractional resolution in a range around int(T1). Here int(T1) is the integer part of the fractional pitch lag T1 of the nearest first or third subframe, and the range is adapted when T1 straddles the boundaries of the delay range.
In the 8.85 kbit/s mode, the pitch lag T1 of the first and third subframes of each frame is searched with fractional resolution in the lower part of the lag range and with integer resolution only in the interval [92, 231]. The pitch lag T2 of the second and fourth subframes is searched with fractional resolution in a range around int(T1).
In the lowest mode, 6.60 kbit/s, the pitch lag T1 of the first subframe of each frame is searched with fractional resolution in the lower part of the lag range and with integer resolution only in the interval [92, 231]. The pitch lag T2 of the second, third and fourth subframes is searched with fractional resolution in a range around int(T1).
The closed-loop pitch search minimizes the mean-squared weighted error between the original speech and the reconstructed speech, which is equivalent to maximizing the normalized correlation R(k):
$$R(k) = \frac{\sum_{n=0}^{63} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{63} y_k(n)\, y_k(n)}}$$
where x(n) is the target signal and y_k(n) is the past excitation filtered at delay k, i.e. the convolution of the past excitation with h(n): y_k(n)=h(n)*exc(n-k). The search range is limited to the neighborhood of a preselected value: the open-loop pitch value T_op for the first and third subframes of each frame, and int(T1), the integer part of the fractional delay T1 of the nearest first or third subframe, for the second and fourth subframes. The convolution y_k(n) is computed for the delay t_min; for the other integer delays k = t_min+1, ..., t_max in the search range it is updated by the following relation:
y_k(n)=y_{k-1}(n-1)+exc(-k)h(n), n=63, ..., 0 (2)
where exc(n), n = -231, ..., 63, is the excitation buffer and y_{k-1}(-1)=0. In the search stage the samples exc(n), n = 0, ..., 63, are not yet known, and they are needed only when the pitch lag is less than 64. To simplify the search, the linear prediction residual is stored in exc(n) so that equation (2) is valid for all delays.
Once the optimum integer closed-loop delays T1 and T2 are determined, the fractions around the best integer delay are tested. The normalized correlation R(k) is interpolated and searched for its maximum to obtain the fractional pitch lag. The interpolation for the search uses an FIR filter b16 based on a Hamming-windowed sinc function, truncated at ±15 and padded with zeros at ±16 (i.e. b16(16)=0); the cut-off frequency (-3 dB) of the filter is 5.063 kHz.
After the pitch lag is determined, the adaptive codebook vector v(n) is calculated by interpolating the past excitation exc(n) at the given integer lag k and fractional lag t, for n = 0, ..., 63 and t = 0, 1, 2, 3. The interpolation filter b64 is based on a Hamming-windowed sinc function, truncated at ±63 and padded with zeros at ±64 (i.e. b64(64)=0); the cut-off frequency (-3 dB) of the filter is 6.016 kHz.
3. Open loop adaptive codebook excitation signal selection
Referring to fig. 2, the steps in this embodiment are as follows:
S1) Obtain the adaptive codebook excitation signal v(n) calculated in step 2.
S2) Low-pass filter the adaptive codebook excitation v(n) to obtain its low-frequency part v_low(n), calculated as:
$$v\_low(n) = \sum_{i=-1}^{1} b(i)\, v(n+i)$$
where b(-1)=b(1)=0.28, b(0)=0.44, and n = 0, ..., m-1.
The low-pass filter has the form F(z)=αz^{-1}+β+αz with 2α+β=1, for example F(z)=0.18z^{-1}+0.64+0.18z.
S3) Calculate the high-frequency part v_high(n) of the adaptive codebook excitation v(n):
v_high(n)=v(n)-v_low(n)
where n = 0, ..., m-1 and m is the length of the adaptive codebook excitation signal; m = 64 in this embodiment.
S4) Calculate the residual signal r(n) of the input speech signal, which is a weighted-domain speech residual signal, using the analysis filter $\hat{A}(z) = 1 - \sum_{i=1}^{p} \hat{a}_i z^{-i}$, where $\hat{a}_i$ are the quantized linear prediction coefficients (LPC coefficients) and p is the linear prediction order.
S5) Calculate the cross-correlation of the residual signal r(n) with the high-frequency part v_high(n) of the adaptive codebook excitation v(n), i.e. the normalized high-frequency excitation cross-correlation corr_exc_high, referred to below as equation (3).
The number 63 in equation (3) is set by the correlation length and can be changed for a different adaptive codebook excitation length.
S6) Compare the cross-correlation corr_exc_high with the given correlation threshold γ (γ = 0.19 in this embodiment). If corr_exc_high > γ, execute step S7); if corr_exc_high ≤ γ, execute step S8).
The given correlation threshold γ is determined according to the final coding effect.
S7) corr_exc_high > γ: the final adaptive codebook excitation signal is v(n).
Let the weighted synthesis signal be synth(n); then
synth(n)=h(n)*v(n)
The gain is then calculated as (where x(n) is the target signal):
$$\mathrm{gain} = \frac{\sum_{n=0}^{63} x(n)\,\mathrm{synth}(n)}{\sum_{n=0}^{63} \mathrm{synth}(n)\,\mathrm{synth}(n)} \qquad (4)$$
S8) corr_exc_high ≤ γ: the final adaptive codebook excitation signal is v_low(n).
Let the weighted synthesis signal be synth'(n); then
synth'(n)=h(n)*v_low(n)
The gain gain' is then calculated as:
$$\mathrm{gain}' = \frac{\sum_{n=0}^{63} x(n)\,\mathrm{synth}'(n)}{\sum_{n=0}^{63} \mathrm{synth}'(n)\,\mathrm{synth}'(n)} \qquad (5)$$
Similarly, the 63 in equations (4) and (5) may be modified according to the actual adaptive codebook excitation length.
This completes the selection of the adaptive codebook excitation signal and the calculation of the long-term prediction gain.
The gain or gain' is encoded and transmitted to the decoding end, together with information indicating which signal was selected as the final adaptive codebook excitation signal, and the decoding end applies the same processing to the synthesized excitation signal.
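Steps S2) to S8) can be sketched end to end as follows. This is a hedged illustration rather than the patented implementation: the normalized correlation is assumed to be sum(r·v_high)/sqrt(sum(r²)·sum(v_high²)), the gain is assumed to be the least-squares gain, samples outside the subframe are treated as zero in the 3-tap filter, and all function names are hypothetical.

```python
import math


def convolve(x, h):
    """Causal convolution truncated to len(x) samples."""
    return [
        sum(x[i] * h[n - i] for i in range(max(0, n - len(h) + 1), n + 1))
        for n in range(len(x))
    ]


def split_bands(v, b_side=0.28, b_center=0.44):
    """S2/S3: 3-tap low-pass, then v_high(n) = v(n) - v_low(n).
    Samples outside the subframe are treated as zero (assumption)."""
    m = len(v)
    v_low = []
    for n in range(m):
        left = v[n - 1] if n > 0 else 0.0
        right = v[n + 1] if n < m - 1 else 0.0
        v_low.append(b_side * left + b_center * v[n] + b_side * right)
    v_high = [a - b for a, b in zip(v, v_low)]
    return v_low, v_high


def normalized_correlation(r, v_high):
    """S5 (assumed form): sum(r*v_high) / sqrt(sum(r^2) * sum(v_high^2))."""
    den = math.sqrt(sum(a * a for a in r) * sum(b * b for b in v_high))
    return sum(a * b for a, b in zip(r, v_high)) / den if den > 0.0 else 0.0


def pitch_gain(x, excitation, h):
    """S7/S8 (assumed least-squares form): sum(x*synth) / sum(synth^2)."""
    synth = convolve(excitation, h)
    den = sum(s * s for s in synth)
    return sum(a * s for a, s in zip(x, synth)) / den if den > 0.0 else 0.0


def open_loop_select(x, v, r, h, gamma=0.19):
    """Return (use_full_band, excitation, gain) for one subframe."""
    v_low, v_high = split_bands(v)
    use_full_band = normalized_correlation(r, v_high) > gamma  # S6
    excitation = v if use_full_band else v_low
    return use_full_band, excitation, pitch_gain(x, excitation, h)
```

Only one synthesis convolution and one gain are computed per subframe, since the decision is made before synthesis; this is consistent with the complexity reduction the text attributes to the open-loop scheme.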
In addition, equation (3) is only one example; the high-frequency excitation cross-correlation corr_exc_high may also be obtained by the following equation (6):
$$corr\_exc\_high = \sum_{n=0}^{m-1} r(n)\, v\_high(n) \qquad (6)$$
In this case corr_exc_high is not normalized; m is the length of the adaptive codebook excitation signal.
The subsequent steps are still performed as in S6), S7) and S8), except that the given correlation threshold γ must be re-tuned according to the final coding effect.
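The need to re-tune γ for the unnormalized correlation can be illustrated numerically. The sketch below assumes the normalized variant divides by the product of the signal norms (an assumption, since the exact normalization is not reproduced here) and uses hypothetical toy signals and function names.

```python
import math


def raw_correlation(r, v_high):
    """Unnormalized form: sum over the subframe of r(n) * v_high(n)."""
    return sum(a * b for a, b in zip(r, v_high))


def normalized_correlation(r, v_high):
    """Assumed normalized form: raw correlation divided by the signal norms."""
    den = math.sqrt(sum(a * a for a in r) * sum(b * b for b in v_high))
    return raw_correlation(r, v_high) / den if den > 0.0 else 0.0


r = [1.0, -0.5, 0.25, -1.0]
v_high = [0.5, -0.5, 0.5, -0.5]
loud_r = [2.0 * a for a in r]          # same signals, twice the amplitude
loud_v = [2.0 * b for b in v_high]

raw_ratio = raw_correlation(loud_r, loud_v) / raw_correlation(r, v_high)
norm_diff = abs(
    normalized_correlation(loud_r, loud_v) - normalized_correlation(r, v_high)
)
```

Doubling both amplitudes multiplies the unnormalized value by four while leaving the normalized value unchanged, so a threshold chosen for one form does not carry over to the other.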
In addition, the following alternative scheme is possible for "3. Open loop adaptive codebook excitation signal selection":
S1') Obtain the adaptive codebook excitation signal v(n) calculated in step 2.
S2') High-pass filter the adaptive codebook excitation v(n) to obtain its high-frequency part v_high(n).
Then calculate the weighted-domain speech residual signal, i.e. the residual signal r(n) of the input speech signal; calculate the high-frequency excitation cross-correlation corr_exc_high between r(n) and the high-frequency part v_high(n) of the adaptive codebook excitation v(n); and compare corr_exc_high with the given correlation threshold γ. These steps are exactly the same as steps S4) to S6).
If corr_exc_high > γ, the searched adaptive codebook excitation signal v(n) is selected as the current adaptive codebook excitation signal, and the gain can then be calculated; the processing is exactly the same as in step S7).
If corr_exc_high ≤ γ, the low-pass filtered adaptive codebook excitation signal is selected as the current adaptive codebook excitation signal, and the gain can then be calculated in the same way as in step S8). The low-frequency part of the adaptive codebook excitation signal to be selected can be obtained in either of two ways:
One is: low-pass filter the searched adaptive codebook excitation v(n) to obtain its low-frequency part v_low(n).
The other is: subtract the calculated high-frequency part v_high(n) from the searched adaptive codebook excitation v(n) to obtain its low-frequency part v_low(n), i.e.
v_low(n)=v(n)-v_high(n)
In either case, the low-frequency part v_low(n) is then selected as the current adaptive codebook excitation signal.
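The text does not specify the high-pass filter used in S2'). As a worked illustration only, assume it is the complement of the low-pass filter F(z)=αz^{-1}+β+αz with 2α+β=1; the symbol G(z) is introduced here for that hypothetical filter. Under this assumption the two ways of obtaining v_low(n) give the same signal:

```latex
\begin{align*}
G(z) &= 1 - F(z) = -\alpha z^{-1} + (1-\beta) - \alpha z
      = \alpha\left(-z^{-1} + 2 - z\right)
      && \text{(using } 2\alpha + \beta = 1\text{)} \\
G(1) &= 0, \qquad G(-1) = 4\alpha
      && \text{(zero gain at DC, maximum gain at the highest frequency)} \\
v\_high(n) &= v(n) * g(n)
  \;\Longrightarrow\;
  v(n) - v\_high(n) = v(n) * f(n) = v\_low(n)
\end{align*}
```

With α = 0.28 and β = 0.44, G(-1) = 1.12 and F(-1) = -0.12, so F(-1) + G(-1) = 1 as expected. If a different high-pass filter is used, the subtraction method and the direct low-pass method generally produce different low-frequency parts.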
The invention also discloses a device for selecting an adaptive codebook excitation signal, which is shown in fig. 3 and comprises: an adaptive codebook excitation signal searching unit 310, a processing unit 320, and a comparison selecting unit 330, wherein,
an adaptive codebook excitation signal searching unit 310 for searching an adaptive codebook excitation signal based on the target signal and the impulse response that have been calculated;
the processing unit 320 is configured to calculate a high-frequency excitation correlation between a high-frequency portion of the searched adaptive codebook excitation signal and a residual signal of the input speech signal, and send the calculated high-frequency excitation correlation value to the comparison selection unit 330;
the comparison selection unit 330 is configured to select the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is greater than a preset correlation threshold.
The processing unit 320 may have either of the following two structures:
One is: the processing unit 320 includes a low-pass filtering unit and a calculating unit, wherein
the low-pass filtering unit is configured to low-pass filter the received adaptive codebook excitation signal;
the calculating unit is configured to subtract the low-pass filtered low-frequency part from the searched adaptive codebook excitation signal to obtain the high-frequency part of the adaptive codebook excitation signal, then calculate the high-frequency excitation correlation between the high-frequency part and the residual signal of the input speech signal, and send the calculated high-frequency excitation correlation value to the comparison selection unit.
The other is: the processing unit 320 includes a high-pass filtering unit and a calculating unit, wherein
the high-pass filtering unit is configured to high-pass filter the received adaptive codebook excitation signal;
the calculating unit is configured to calculate the high-frequency excitation correlation between the high-pass filtered high-frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal, and send the calculated high-frequency excitation correlation value to the comparison selection unit. In this case, the calculating unit may be further configured to subtract the high-pass filtered high-frequency part from the searched adaptive codebook excitation signal to obtain the low-frequency part of the adaptive codebook excitation signal, and send the low-frequency part to the comparison selection unit.
In the latter case, if the calculating unit does not include the function of calculating the low-frequency part of the adaptive codebook excitation signal, the processing unit 320 may further include a low-pass filtering unit configured to low-pass filter the received adaptive codebook excitation signal and send the low-pass filtered adaptive codebook excitation signal to the comparison selection unit.
The comparison selection unit 330 is further configured to select the low-frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is less than or equal to the preset correlation threshold.
The above apparatus may further comprise: a gain calculating unit 340 for calculating a gain according to the currently selected adaptive codebook excitation signal.
The calculation method of each unit is the same as that described above, and is not described again.
The method and the device of the invention improve the broadband voice coding performance: experiments show that compared with the prior art, the scheme improves the coding signal-to-noise ratio (SNR); and, the complexity of the selection operation is greatly reduced: statistical analysis shows that the complexity of the algorithm is only 60% of the prior art.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (17)
1. A method of selecting an adaptive codebook excitation signal, comprising:
calculating a target signal and an impulse response, and searching an adaptive codebook excitation signal according to the target signal and the impulse response;
and calculating high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook excitation signal and a residual signal of the input voice signal, judging whether the calculated high-frequency excitation correlation is larger than a preset correlation threshold value, and if so, selecting the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal.
2. The method of claim 1, wherein if the calculated high frequency excitation correlation is less than or equal to a predetermined correlation threshold, the method further comprises: the low frequency part of the searched adaptive codebook excitation signal is selected as the current adaptive codebook excitation signal.
3. The method of claim 1 or 2, further comprising:
calculating a gain based on the currently selected adaptive codebook excitation signal.
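Claim 3 does not give a gain formula in this text. As a rough sketch under stated assumptions, the following Python code uses the conventional CELP adaptive codebook gain: the selected excitation is filtered through the impulse response h(n), and the gain is the ratio of the cross-correlation with the target signal x(n) to the energy of the filtered excitation. This formula, the zero-state filtering, and all function names are assumptions for illustration, not text from the patent.

```python
from typing import List, Sequence


def filter_through_impulse_response(
    v: Sequence[float], h: Sequence[float]
) -> List[float]:
    """Zero-state convolution y(n) = sum_{k=0}^{n} v(k) * h(n - k),
    truncated to the length of v (assumed filtering convention)."""
    m = len(v)
    y = [0.0] * m
    for n in range(m):
        acc = 0.0
        for k in range(n + 1):
            if n - k < len(h):
                acc += v[k] * h[n - k]
        y[n] = acc
    return y


def adaptive_codebook_gain(
    x: Sequence[float], v_selected: Sequence[float], h: Sequence[float]
) -> float:
    """Assumed gain: g = sum x(n) y(n) / sum y(n) y(n), with y = v_selected * h."""
    y = filter_through_impulse_response(v_selected, h)
    energy = sum(val * val for val in y)
    if energy == 0.0:
        # An all-zero filtered excitation carries no information; use zero gain.
        return 0.0
    return sum(xn * yn for xn, yn in zip(x, y)) / energy
```

Practical codecs usually also bound the gain to a fixed range; that step is omitted here because the text does not specify it.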
4. The method of claim 1, wherein obtaining the high-frequency part of the searched adaptive codebook excitation signal comprises:
i) low-pass filtering the searched adaptive codebook excitation signal by using a low-pass filter to obtain a low-frequency part v_low(n) of the adaptive codebook excitation signal:
wherein b(-1) = b(1) = 0.28, b(0) = 0.44, and n = 0, ..., m-1; v(n) is the adaptive codebook excitation signal obtained by searching; and m is the length of the adaptive codebook excitation signal;
ii) calculating a high-frequency part v_high(n) of the adaptive codebook excitation v(n) from the low-frequency part v_low(n) of the adaptive codebook excitation signal:
v_high(n)=v(n)-v_low(n)
wherein n = 0, ..., m-1, and m is the length of the adaptive codebook excitation signal.
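The low-pass filter equation of step i) is not reproduced in this text. As a hedged illustration, the sketch below assumes the conventional symmetric three-tap form v_low(n) = b(-1)·v(n-1) + b(0)·v(n) + b(1)·v(n+1) with the coefficients stated in claim 4, and treats samples outside 0..m-1 as zero. Both the filter form and the boundary handling are assumptions, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

# Filter coefficients from claim 4: b(-1) = b(1) = 0.28, b(0) = 0.44.
B = {-1: 0.28, 0: 0.44, 1: 0.28}


def split_excitation(v: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split v(n) into (v_low, v_high) with v_high(n) = v(n) - v_low(n).

    Assumed filter: v_low(n) = sum_{i=-1}^{1} b(i) * v(n + i),
    with v(n) taken as zero outside 0..m-1 (assumption).
    """
    m = len(v)
    v_low = []
    for n in range(m):
        acc = 0.0
        for i, b_i in B.items():
            idx = n + i
            if 0 <= idx < m:
                acc += b_i * v[idx]
        v_low.append(acc)
    # Step ii): the high-frequency part is what the low-pass filter removed.
    v_high = [v[n] - v_low[n] for n in range(m)]
    return v_low, v_high
```

Because the three coefficients sum to 1, a constant signal passes through the low-pass branch unchanged away from the block edges, so its high-frequency part is zero there, while a sample-to-sample alternating signal ends up almost entirely in v_high(n).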
5. The method of claim 1, wherein the high-frequency excitation correlation corr_exc_high between the high-frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal is calculated by:
where r(n) is the residual signal of the input speech signal, v_high(n) is the high-frequency part of the adaptive codebook excitation v(n), and m is the length of the adaptive codebook excitation signal.
6. The method of claim 1, wherein the high-frequency excitation correlation corr_exc_high between the high-frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal is calculated by:
where r(n) is the residual signal of the input speech signal, v_high(n) is the high-frequency part of the adaptive codebook excitation v(n), and m is the length of the adaptive codebook excitation signal.
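The formulas for corr_exc_high in claims 5 and 6 are not reproduced in this text, so they cannot be restated here. Purely to illustrate what a correlation between r(n) and v_high(n) over n = 0, ..., m-1 can look like, the sketch below uses a standard normalized cross-correlation. This form is an assumption used as a stand-in, not the claimed formula, and the function name is hypothetical.

```python
import math
from typing import Sequence


def normalized_correlation(r: Sequence[float], v_high: Sequence[float]) -> float:
    """Stand-in correlation (assumption, not the patent's formula):

        sum r(n) v_high(n) / sqrt(sum r(n)^2 * sum v_high(n)^2)

    Returns 0.0 when either signal has zero energy.
    """
    if len(r) != len(v_high):
        raise ValueError("r and v_high must have the same length m")
    cross = sum(a * b for a, b in zip(r, v_high))
    energy_r = sum(a * a for a in r)
    energy_v = sum(b * b for b in v_high)
    denom = math.sqrt(energy_r * energy_v)
    return cross / denom if denom > 0.0 else 0.0
```

With this stand-in the value lies between -1 and 1, which makes a fixed preset threshold independent of signal level; whether the patent's formulas share that property cannot be confirmed from this text.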
7. The method according to claim 5 or 6, wherein the residual signal r(n) of the input speech signal is calculated by:
wherein $\hat{A}(z) = 1 - \sum_{i=1}^{p} \hat{a}_i z^{-i}$ is the analysis filter, $\hat{a}_i$ are the quantized linear prediction coefficients, and p is the linear prediction order.
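The residual equation of claim 7 is not reproduced in this text. Assuming that the residual is obtained by passing the input speech through the analysis filter defined in claim 7 (1 minus the sum of the quantized coefficients times z^-i), the time-domain form would be r(n) = s(n) - Σ_{i=1}^{p} â_i·s(n-i). The sketch below implements that reading; the symbol s(n) for the input speech, the zero filter history, and the function name are assumptions.

```python
from typing import List, Sequence


def lp_residual(s: Sequence[float], a_hat: Sequence[float]) -> List[float]:
    """Assumed residual: r(n) = s(n) - sum_{i=1}^{p} a_hat[i-1] * s(n - i).

    s      input speech samples (symbol assumed for illustration)
    a_hat  quantized linear prediction coefficients a_hat_1 .. a_hat_p
    Samples before n = 0 are taken as zero (assumption; a real coder
    would carry filter memory over from the previous frame).
    """
    p = len(a_hat)
    r = []
    for n in range(len(s)):
        prediction = 0.0
        for i in range(1, p + 1):
            if n - i >= 0:
                prediction += a_hat[i - 1] * s[n - i]
        r.append(s[n] - prediction)
    return r
```

For example, a first-order predictor with coefficient 0.5 applied to the decaying sequence 1, 0.5, 0.25, 0.125 leaves only the first sample in the residual, since every later sample is predicted exactly from its predecessor.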
8. The method of claim 1, wherein obtaining the high-frequency part of the searched adaptive codebook excitation signal comprises:
performing high-pass filtering on the searched adaptive codebook excitation signal by using a high-pass filter to obtain the high-frequency part v_high(n) of the adaptive codebook excitation signal.
9. The method of claim 8, wherein selecting the low-frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal comprises:
subtracting the calculated high-frequency part v_high(n) from the searched adaptive codebook excitation v(n) to obtain the low-frequency part v_low(n) of the searched adaptive codebook excitation, and then selecting v_low(n) as the current adaptive codebook excitation signal.
10. The method of claim 2, wherein selecting the low-frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal comprises:
performing low-pass filtering on the searched adaptive codebook excitation signal by using a low-pass filter to obtain the low-frequency part v_low(n) of the adaptive codebook excitation signal, and selecting the low-pass filtered adaptive codebook excitation signal as the current adaptive codebook excitation signal.
11. An apparatus for selecting an adaptive codebook excitation signal, comprising: an adaptive codebook excitation signal searching unit, a processing unit and a comparison selection unit, wherein,
the adaptive codebook excitation signal searching unit is used for searching an adaptive codebook excitation signal according to the calculated target signal and the impulse response;
the processing unit is used for calculating the high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook excitation signal and the residual signal of the input speech signal, and for sending the calculated high-frequency excitation correlation value to the comparison selection unit;
and the comparison selection unit is used for selecting the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is greater than a preset correlation threshold.
12. The apparatus of claim 11, wherein the processing unit comprises: a low-pass filtering unit and a calculating unit, wherein,
the low-pass filtering unit is used for performing low-pass filtering on the received adaptive codebook excitation signal;
the calculating unit is used for subtracting the low-pass filtered low-frequency part from the searched adaptive codebook excitation signal to obtain the high-frequency part of the adaptive codebook excitation signal, then calculating the high-frequency excitation correlation between the high-frequency part and the residual signal of the input speech signal, and sending the calculated high-frequency excitation correlation value to the comparison selection unit.
13. The apparatus of claim 11, wherein the processing unit comprises: a high-pass filtering unit and a calculating unit, wherein,
the high-pass filtering unit is used for performing high-pass filtering on the received adaptive codebook excitation signal;
the calculating unit is used for calculating the high-frequency excitation correlation between the high-pass filtered high-frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal, and sending the calculated high-frequency excitation correlation value to the comparison selection unit.
14. The apparatus of claim 13, wherein the calculating unit is further configured to subtract the high-pass filtered high-frequency part from the searched adaptive codebook excitation signal to obtain the low-frequency part of the adaptive codebook excitation signal, and to send the calculated low-frequency part of the adaptive codebook excitation signal to the comparison selection unit.
15. The apparatus of claim 13, wherein the processing unit further comprises: a low-pass filtering unit for performing low-pass filtering on the received adaptive codebook excitation signal and sending the low-pass filtered adaptive codebook excitation signal to the comparison selection unit.
16. The apparatus according to claim 14 or 15, wherein the comparison selection unit is further configured to select the low-frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is less than or equal to the preset correlation threshold.
17. The apparatus of claim 16, further comprising: a gain calculation unit for calculating a gain according to the currently selected adaptive codebook excitation signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006101457857A CN100487790C (en) | 2006-11-21 | 2006-11-21 | Method and device for selecting self-adapting codebook excitation signal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006101457857A CN100487790C (en) | 2006-11-21 | 2006-11-21 | Method and device for selecting self-adapting codebook excitation signal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101192409A (en) | 2008-06-04 |
CN100487790C (en) | 2009-05-13 |
Family
ID=39487357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2006101457857A Active CN100487790C (en) | 2006-11-21 | 2006-11-21 | Method and device for selecting self-adapting codebook excitation signal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100487790C (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103165134A (en) * | 2013-04-02 | 2013-06-19 | 武汉大学 | Coding and decoding device of audio signal high frequency parameter |
CN103165134B (en) * | 2013-04-02 | 2015-01-14 | 武汉大学 | Coding and decoding device of audio signal high frequency parameter |
CN105830153A (en) * | 2013-12-16 | 2016-08-03 | 高通股份有限公司 | High-band signal modeling |
CN113196387A (en) * | 2019-01-13 | 2021-07-30 | 华为技术有限公司 | High resolution audio coding and decoding |
Also Published As
Publication number | Publication date |
---|---|
CN100487790C (en) | 2009-05-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4662673B2 (en) | Gain smoothing in wideband speech and audio signal decoders. | |
JP4550289B2 (en) | CELP code conversion | |
EP0747882B1 (en) | Pitch delay modification during frame erasures | |
EP0422232B1 (en) | Voice encoder | |
AU763471B2 (en) | A method and device for adaptive bandwidth pitch search in coding wideband signals | |
EP1619664B1 (en) | Speech coding apparatus, speech decoding apparatus and methods thereof | |
CN101180676B (en) | Methods and apparatus for quantization of spectral envelope representation | |
JP3481390B2 (en) | How to adapt the noise masking level to a synthetic analysis speech coder using a short-term perceptual weighting filter | |
US6334105B1 (en) | Multimode speech encoder and decoder apparatuses | |
JPH0635500A (en) | Voice compressor using celp | |
US9972325B2 (en) | System and method for mixed codebook excitation for speech coding | |
JPH10124088A (en) | Device and method for expanding voice frequency band width | |
CN101542599A (en) | Method, apparatus, and system for encoding and decoding broadband voice signal | |
US20070016417A1 (en) | Method and apparatus to quantize/dequantize frequency amplitude data and method and apparatus to audio encode/decode using the method and apparatus to quantize/dequantize frequency amplitude data | |
US6804639B1 (en) | Celp voice encoder | |
JP2002544551A (en) | Multipulse interpolation coding of transition speech frames | |
EP0747884B1 (en) | Codebook gain attenuation during frame erasures | |
WO1994019790A1 (en) | Method for generating a spectral noise weighting filter for use in a speech coder | |
CN100487790C (en) | Method and device for selecting self-adapting codebook excitation signal | |
KR100300964B1 (en) | Speech coding/decoding device and method therof | |
JP2003044099A (en) | Pitch cycle search range setting device and pitch cycle searching device | |
JP3462464B2 (en) | Audio encoding method, audio decoding method, and electronic device | |
KR100718487B1 (en) | Harmonic noise weighting in digital speech coders | |
JP3249144B2 (en) | Audio coding device | |
JP3749838B2 (en) | Acoustic signal encoding method, acoustic signal decoding method, these devices, these programs, and recording medium thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |