CN101192409A

CN101192409A - Method and device for selecting self-adapting codebook excitation signal

Info

Publication number: CN101192409A
Application number: CNA2006101457857A
Authority: CN
Inventors: 胡瑞敏; 张勇; 刘霖; 杨玉红; 高戈; 王庭红; 马付伟
Original assignee: Huawei Technologies Co Ltd; Wuhan University WHU
Current assignee: Huawei Technologies Co Ltd; Wuhan University WHU
Priority date: 2006-11-21
Filing date: 2006-11-21
Publication date: 2008-06-04
Anticipated expiration: 2026-11-21
Also published as: CN100487790C

Abstract

The invention discloses a method and a device for selecting an adaptive codebook excitation signal. The method calculates a high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook and the residual signal of the input speech signal. If the calculated high-frequency excitation correlation is higher than a preset correlation threshold, the searched adaptive codebook signal is selected as the current adaptive codebook excitation signal; otherwise, the low-frequency part of the searched adaptive codebook, obtained by low-pass filtering, is selected as the current adaptive codebook excitation signal. The invention can improve wideband speech coding performance: tests show that the technical scheme improves the coding signal-to-noise ratio (SNR) compared with the prior art. In addition, the complexity of the selection calculation is greatly reduced: statistical analysis suggests that the complexity of the calculation is only 60 percent of that of the prior art.

Description

Method and apparatus for selecting adaptive codebook excitation signal

Technical Field

The present invention relates to the field of digital speech coding, and more particularly, to a method and apparatus for selecting an adaptive codebook excitation signal.

Background

With the increasing popularity of multimedia applications, there is a great need for efficient wideband digital speech and audio coding techniques. The bandwidth of current narrowband speech is usually limited to 200 Hz-3400 Hz, and its naturalness, intelligibility and handling of music are unsatisfactory. In recent years, the rapid development of broadband digital networks, third-generation mobile systems, high-speed broadband internet and the like has provided network environments capable of supporting higher quality, close to that of face-to-face communication. Wideband speech codecs are therefore of growing practical significance.

Code Excited Linear Prediction (CELP) is widely used in narrowband speech coding because of its high coding efficiency and good coding quality. It extracts the vocal tract parameters by linear prediction and uses a codebook containing many typical excitation vectors as the excitation parameters; each time coding is performed, the codebook is searched for a vector to use as the excitation vector. The excitation vector has two parts: one comes from the past excitation, i.e., the adaptive codebook; the other comes from the innovation vector, i.e., the fixed codebook. The index of the excitation vector in the codebook is coded and transmitted to the decoder, which looks up the excitation vector in its table and synthesizes the speech through a synthesis filter.

Compared with narrowband speech, wideband speech covers a wider frequency range. For weakly periodic signals such as unvoiced sounds and transitions, the periodicity does not extend over the whole frequency range. For such signals, the adaptive codebook excitation signal therefore needs suitable low-pass filtering to remove the weakly periodic high-frequency part, so that the harmonic structure of the wideband speech spectrum is modelled better and the wideband speech coding performance improves.

The AMR-WB+ coding standard (3GPP TS 26.290, "Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec", Dec 2004) adopts the scheme disclosed in U.S. publication No. US20050108005, entitled "Method and device for adaptive bandwidth pitch search in coding wideband signals", which discloses a multi-path closed-loop method for selecting the adaptive codebook excitation signal. The specific scheme is as follows:

1. calculating the target signal and impulse response:

The input speech signal speech(n) is perceptually weighted. With w(n) denoting the response of the perceptual weighting filter, the weighted-domain signal wsp(n) is calculated as:

wsp(n) = speech(n) * w(n) (* denotes convolution)

Calculate the zero-input response xn2(n) of the weighted synthesis filter

H (z) = \frac{W (z)}{\hat{A} (z)}

Let the target signal used for the adaptive codebook search be xn(n):

xn(n) = wsp(n) - xn2(n)

The target signal is used to evaluate the candidate pitch periods in the closed-loop pitch search.

Let h(n) be the impulse response of the weighted synthesis filter H(z); then h(n) = Z^{-1}(H(z)), where Z^{-1} denotes the inverse Z-transform.

2. Adaptive codebook search

The adaptive codebook parameters are the pitch lag and the gain. In the search stage, the excitation is extended with the linear prediction residual to simplify the closed-loop search. The adaptive codebook search is performed once per subframe.

In the 12.65, 14.25, 15.85, 18.25, 19.85, 23.05 and 23.85 kbit/s modes, in the first and third subframes of each frame the pitch lag T₁ is searched with a fractional resolution of […] in the range […] and with a fractional resolution of […] in the range […]; in the range [160, 231] only integer pitch lags are searched. For the second and fourth subframes of each frame, the pitch lag T₂ is searched in the range […] with a fractional resolution of […]. Here int(T₁) is the integer part of the fractional pitch lag T₁ of the nearest first or third subframe, and the range is adapted when T₁ straddles the boundaries of the lag range.

In the 8.85 kbit/s mode, in the first and third subframes of each frame the pitch lag T₁ is searched with a fractional resolution of […] in the range […]; in the range [92, 231] only integer pitch lags are searched. For the second and fourth subframes, the pitch lag T₂ is searched in the range […] with a fractional resolution of […].

In the lowest mode, 6.60 kbit/s, in the first subframe of each frame the pitch lag T₁ is searched with a fractional resolution of […] in the range […]; in the range [92, 231] only integer pitch lags are searched. For the second, third and fourth subframes, the pitch lag T₂ is searched in the range […] with a fractional resolution of […].

The closed-loop pitch search minimizes the mean-squared weighted error between the original speech and the reconstructed speech, which is equivalent to maximizing the normalized correlation R(k).

Here x(n) denotes the target signal and y_k(n) is the past excitation filtered at lag k, i.e., the convolution of the past excitation with h(n): y_k(n) = h(n) * exc(n-k). The search is restricted to the neighbourhood of a preselected value: the open-loop pitch T_op for the first and third subframes of each frame, and int(T₁), the integer part of the fractional lag T₁ of the nearest first or third subframe, for the second and fourth subframes. The convolution y_k(n) is computed for the lag t_min; for the other integer lags k = t_min+1, ..., t_max in the search range it is updated with the following relation:

y_k(n) = y_{k-1}(n-1) + exc(-k) h(n), n = 63, ..., 0

where exc(n), n = -231, ..., 63, is the excitation buffer and y_{k-1}(-1) = 0. During the search, exc(n) for n = 0, ..., 63 is not yet known, and it is needed only when the pitch period is less than 64. To simplify the search, the linear prediction residual is stored in exc(n) so that the relation above (equation (1)) is valid for all integer lags.
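The recursive update can be checked against a direct convolution. The following is a minimal sketch, not the codec's implementation: the impulse response, the contents of the excitation buffer and the lag range are hypothetical test data, and the function names are introduced here for illustration only.

```python
def filtered_excitation(exc, h, k, n_sub=64):
    """Direct form: y_k(n) = sum_{i=0..n} h(i) * exc(n - i - k)."""
    return [sum(h[i] * exc(n - i - k) for i in range(n + 1))
            for n in range(n_sub)]


def update_filtered_excitation(y_prev, exc, h, k):
    """Recursion from the text: y_k(n) = y_{k-1}(n-1) + exc(-k) * h(n),
    with y_{k-1}(-1) taken as 0."""
    n_sub = len(y_prev)
    y = [0.0] * n_sub
    for n in range(n_sub - 1, -1, -1):       # n = 63, ..., 0 as in the text
        shifted = y_prev[n - 1] if n > 0 else 0.0
        y[n] = shifted + exc(-k) * h[n]
    return y


# Hypothetical test data: a decaying impulse response and a toy buffer.
h = [0.9 ** n for n in range(64)]
buffer = [((7 * j) % 11 - 5) / 5.0 for j in range(231 + 64)]


def exc(n):
    """exc(n) for n = -231, ..., 63, stored with an offset of 231."""
    return buffer[n + 231]


t_min, t_max = 40, 48
y = filtered_excitation(exc, h, t_min)        # full convolution only once
max_dev = 0.0
for k in range(t_min + 1, t_max + 1):
    y = update_filtered_excitation(y, exc, h, k)
    direct = filtered_excitation(exc, h, k)
    max_dev = max(max_dev, max(abs(a - b) for a, b in zip(y, direct)))
```

Each update costs one multiply-add per sample, whereas recomputing the convolution costs on the order of 64 × 64 / 2 operations per lag, which is why only the first lag is convolved in full.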

Once the optimum integer closed-loop lags T₁ and T₂ have been determined, the fractions around the best integer closed-loop lag are tested. The normalized correlation R(k) is interpolated and searched for its maximum, which gives the fractional pitch period. The search uses an FIR filter b₁₆ based on a Hamming-windowed sinc function truncated at ±15 and padded with zeros at ±16 (i.e., b₁₆(16) = 0); the cut-off frequency (-3 dB) of the filter is 5.063 kHz.

After the pitch lag has been determined, the adaptive codebook vector v(n) is calculated by interpolating the past excitation exc(n) at the given integer lag k and fraction t, for n = 0, ..., 63 and t = 0, 1, 2, 3.

The interpolation filter b₆₄ is based on a Hamming-windowed sinc function truncated at ±63 and padded with zeros at ±64 (i.e., b₆₄(64) = 0); the cut-off frequency (-3 dB) of the filter is 6.016 kHz.

3. Selection of adaptive codebook excitation signal

Referring to fig. 1, this step searches for the best adaptive codebook and its gain in two paths, and selects the best scheme by comparison, where the two excitation signals are:

a) the adaptive codebook v(n) obtained by the search in step 2;

b) a new adaptive codebook v'(n) obtained by low-pass filtering the adaptive codebook v(n) from step 2 with the filter F(z) = 0.18z^{-1} + 0.64 + 0.18z, i.e., v'(n) = v(n) * f(n), where f(n) is the inverse Z-transform of F(z).

The following processes are the same for the excitation signals v (n) and v' (n), and the present specification will explain the processing of v (n) as an example:

i) Calculate the weighted synthesis signal synth(n) by convolving the excitation signal v(n) with the impulse response h(n) of the weighted synthesis filter, as shown in the following formula:

synth(n)＝v(n)*h(n)

ii) Calculate the gain that matches the synthesized signal to the target signal, using the following formula:

\mathrm{gain} = \frac{\sum_{n=0}^{63} x(n) \times \mathrm{synth}(n)}{\sum_{n=0}^{63} \mathrm{synth}^2(n)}

where x (n) represents the target signal, synth (n) represents the weighted composite signal, and gain represents the prediction gain.

iii) Remove the long-term correlation from the target signal and calculate the energy of the error. The long-term correlation is removed using the following formula:

error(n) = x(n) - gain × synth(n)

which gives the error signal error(n). The energy ener of the error signal is then:

\mathrm{ener} = \sum_{n=0}^{63} \mathrm{error}(n) \times \mathrm{error}(n)

The low-pass-filtered adaptive codebook v'(n) is processed through the same three steps to obtain the prediction gain gain' and the error-signal energy ener'.

The two error-signal energies are then compared, and the path with the smaller error-signal energy determines the excitation signal:

If ener ≤ ener', v(n) is selected as the excitation signal and gain is transmitted to the decoder; if ener > ener', v'(n) is selected as the excitation signal and gain' is transmitted to the decoder. Information indicating which adaptive codebook was selected is also transmitted to the decoder, which applies the same processing when synthesizing the excitation signal.
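The two-path closed-loop selection described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, samples outside the 64-sample subframe are treated as zero when filtering (an assumption), and the gain is computed in the least-squares form with the energy of synth(n) in the denominator.

```python
def convolve_truncated(v, h):
    """synth(n) = sum_{i=0..n} v(i) * h(n - i), n = 0..len(v)-1."""
    return [sum(v[i] * h[n - i] for i in range(n + 1))
            for n in range(len(v))]


def lowpass_3tap(v, side=0.18, center=0.64):
    """F(z) = side*z^-1 + center + side*z; out-of-frame samples are
    treated as zero (assumption of this sketch)."""
    m = len(v)
    out = []
    for n in range(m):
        prev = v[n - 1] if n > 0 else 0.0
        nxt = v[n + 1] if n < m - 1 else 0.0
        out.append(side * prev + center * v[n] + side * nxt)
    return out


def gain_and_error_energy(x, v, h):
    """Steps i) to iii): synthesis, gain, energy of the error signal."""
    synth = convolve_truncated(v, h)
    energy = sum(s * s for s in synth)
    gain = sum(a * s for a, s in zip(x, synth)) / energy if energy > 0.0 else 0.0
    ener = sum((a - gain * s) ** 2 for a, s in zip(x, synth))
    return gain, ener


def closed_loop_select(x, v, h):
    """Run both paths and keep the one with the smaller error energy."""
    v_filt = lowpass_3tap(v)
    gain, ener = gain_and_error_energy(x, v, h)
    gain_f, ener_f = gain_and_error_energy(x, v_filt, h)
    if ener <= ener_f:
        return "unfiltered", v, gain
    return "filtered", v_filt, gain_f
```

Both paths require a full convolution with h(n) and an error-energy computation, which is the cost the open-loop method of the invention avoids.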

This selection of the adaptive codebook excitation signal uses a closed-loop multi-path method, so the algorithm complexity is high while the gain in coding performance is limited.

Disclosure of Invention

To reduce the complexity of the selection algorithm, the present invention provides a method and apparatus for selecting an adaptive codebook excitation signal.

A method of selecting an adaptive codebook excitation signal, comprising:

calculating a target signal and an impulse response, and searching an adaptive codebook excitation signal according to the target signal and the impulse response;

and calculating high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook excitation signal and a residual signal of the input voice signal, judging whether the calculated high-frequency excitation correlation is larger than a preset correlation threshold value, and if so, selecting the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal.

An apparatus for selecting an adaptive codebook excitation signal, comprising: an adaptive codebook excitation signal searching unit, a processing unit and a comparison selection unit, wherein,

the adaptive codebook excitation signal searching unit is used for searching an adaptive codebook excitation signal according to the calculated target signal and the impulse response;

the processing unit is used for calculating the high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook excitation signal and the residual signal of the input voice signal and sending the calculated high-frequency excitation correlation value to the comparison selection unit;

and the comparison selection unit is used for selecting the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is greater than a preset correlation threshold.

The method and apparatus described above calculate a high-frequency excitation correlation between the high-frequency part of the searched adaptive codebook and the residual signal of the input speech signal, and decide which signal to select as the current adaptive codebook excitation signal by comparing the calculated high-frequency excitation correlation with a preset correlation threshold. If the calculated high-frequency excitation correlation is greater than the preset correlation threshold, the searched adaptive codebook signal is selected as the current adaptive codebook excitation signal; otherwise, the low-frequency part of the searched adaptive codebook is used as the current adaptive codebook excitation signal.

Drawings

FIG. 1 is a schematic flow chart of a prior art method for selecting an adaptive codebook excitation signal;

FIG. 2 is a flow chart illustrating the selection of an adaptive codebook excitation signal according to an embodiment of the present invention;

FIG. 3 is a block diagram of an apparatus for selecting an adaptive codebook excitation signal according to an embodiment of the present invention.

Detailed Description

The technical solution of the present invention will be described with reference to specific examples.

1. Calculating the target signal and impulse response:

The input speech signal speech(n) is perceptually weighted. With w(n) denoting the response of the perceptual weighting filter, the weighted-domain signal wsp(n) is calculated as:

wsp(n) = speech(n) * w(n) (* denotes convolution)

Calculate the zero-input response xn2(n) of the weighted synthesis filter

H (z) = \frac{W (z)}{\hat{A} (z)}

Let the target signal used for the adaptive codebook search be xn(n):

xn(n) = wsp(n) - xn2(n)

The target signal is used to evaluate the candidate pitch periods in the closed-loop pitch search.

Let h(n) be the impulse response of the weighted synthesis filter H(z); then h(n) = Z^{-1}(H(z)), where Z^{-1} denotes the inverse Z-transform.
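The impulse response h(n) can be obtained by driving the filter with a unit impulse. A minimal sketch, assuming H(z) is available as a ratio of polynomials B(z)/A(z) normalized so that a[0] = 1; the function name is hypothetical, and the source does not give the coefficients of W(z) or \hat{A}(z).

```python
def impulse_response(b, a, length=64):
    """First `length` samples of h(n) for H(z) = B(z)/A(z), with a[0] == 1.

    Implements the difference equation
        h(n) = b(n) - sum_{i>=1} a(i) * h(n - i)
    which is the filter output for a unit impulse at n = 0.
    """
    if a[0] != 1.0:
        raise ValueError("denominator must be normalized so that a[0] == 1")
    h = []
    for n in range(length):
        acc = b[n] if n < len(b) else 0.0
        for i in range(1, len(a)):
            if n - i >= 0:
                acc -= a[i] * h[n - i]
        h.append(acc)
    return h
```

For example, `impulse_response([1.0], [1.0, -0.5], length=4)` describes a one-pole filter whose response halves at every sample.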

2. Adaptive codebook search

In the 12.65, 14.25, 15.85, 18.25, 19.85, 23.05 and 23.85 kbit/s modes, in the first and third subframes of each frame the pitch lag T₁ is searched with a fractional resolution of […] in the range […] and with a fractional resolution of […] in the range […]; in the range [160, 231] only integer pitch lags are searched. For the second and fourth subframes of each frame, the pitch lag T₂ is searched in the range […] with a fractional resolution of […].

In the 8.85 kbit/s mode, in the first and third subframes of each frame the pitch lag T₁ is searched with a fractional resolution of […] in the range […]; in the range [92, 231] only integer pitch lags are searched. For the second and fourth subframes, the pitch lag T₂ is searched in the range […] with a fractional resolution of […].

In the lowest mode, 6.60 kbit/s, in the first subframe of each frame the pitch lag T₁ is searched with a fractional resolution of […] in the range […]; in the range [92, 231] only integer pitch lags are searched. For the second, third and fourth subframes, the pitch lag T₂ is searched in the range […] with a fractional resolution of […].

The closed-loop pitch search minimizes the mean-squared weighted error between the original speech and the reconstructed speech, which is equivalent to maximizing the normalized correlation R(k).

Here x(n) denotes the target signal and y_k(n) is the past excitation filtered at lag k, i.e., the convolution of the past excitation with h(n): y_k(n) = h(n) * exc(n-k). The search is restricted to the neighbourhood of a preselected value: the open-loop pitch value T_op for the first and third subframes of each frame, and int(T₁), the integer part of the fractional lag T₁ of the nearest first or third subframe, for the second and fourth subframes. The convolution y_k(n) is computed for the lag t_min; for the other integer lags k = t_min+1, ..., t_max in the search range it is updated with the following relation:

y_k(n) = y_{k-1}(n-1) + exc(-k) h(n), n = 63, ..., 0

where exc(n), n = -231, ..., 63, is the excitation buffer and y_{k-1}(-1) = 0. During the search, exc(n) for n = 0, ..., 63 is not yet known, and it is needed only when the pitch period is less than 64. To simplify the search, the linear prediction residual is stored in exc(n) so that the relation above (equation (2)) is valid for all integer lags.
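The integer-lag stage of the search can be sketched as below. The formula for R(k) is not reproduced in this text, so the sketch assumes the usual CELP form R(k) = Σ x(n) y_k(n) / sqrt(Σ y_k(n) y_k(n)); the function names, the pseudo-random excitation buffer and the lag range are hypothetical.

```python
import math


def filtered_excitation(exc, h, k, n_sub=64):
    """y_k(n) = sum_{i=0..n} h(i) * exc(n - i - k), n = 0..n_sub-1."""
    return [sum(h[i] * exc(n - i - k) for i in range(n + 1))
            for n in range(n_sub)]


def normalized_correlation(x, y):
    """Assumed criterion: sum(x*y_k) / sqrt(sum(y_k*y_k))."""
    energy = sum(b * b for b in y)
    if energy <= 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / math.sqrt(energy)


def search_integer_lag(x, exc, h, t_min, t_max):
    """Return the integer lag in [t_min, t_max] that maximizes R(k)."""
    best_lag, best_r = t_min, -math.inf
    for k in range(t_min, t_max + 1):
        r = normalized_correlation(x, filtered_excitation(exc, h, k))
        if r > best_r:
            best_lag, best_r = k, r
    return best_lag, best_r


# Hypothetical data: a pseudo-random excitation buffer for n = -231..63.
state = 12345
buffer = []
for _ in range(231 + 64):
    state = (1103515245 * state + 12345) % 2 ** 31
    buffer.append(state / 2 ** 30 - 1.0)


def exc(n):
    return buffer[n + 231]


h = [0.8 ** n for n in range(64)]
x = filtered_excitation(exc, h, 50)      # target built from lag 50
best_lag, best_r = search_integer_lag(x, exc, h, 42, 58)
```

For clarity the sketch recomputes the convolution for every lag; the recursion of equation (2) would replace that inner call in a real search.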

Once the optimum integer closed-loop lags T₁ and T₂ have been determined, the fractions around the best integer closed-loop lag are tested. The normalized correlation R(k) is interpolated and searched for its maximum, which gives the fractional pitch period. The search uses an FIR filter b₁₆ based on a Hamming-windowed sinc function truncated at ±15 and padded with zeros at ±16 (i.e., b₁₆(16) = 0); the cut-off frequency (-3 dB) of the filter is 5.063 kHz.

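After the pitch lag has been determined, the adaptive codebook vector v(n) is calculated by interpolating the past excitation exc(n) at the given integer lag k and fraction t, for n = 0, ..., 63 and t = 0, 1, 2, 3.

The interpolation filter b₆₄ is based on a Hamming-windowed sinc function truncated at ±63 and padded with zeros at ±64 (i.e., b₆₄(64) = 0); the cut-off frequency (-3 dB) of the filter is 6.016 kHz.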

3. Open loop adaptive codebook excitation signal selection

Referring to fig. 2, the steps in this embodiment are as follows:

S1) Obtain the adaptive codebook excitation signal v(n) calculated in step 2.

S2) Low-pass filter the adaptive codebook excitation v(n) to obtain its low-frequency part v_low(n), using the filter coefficients b(-1) = b(1) = 0.28 and b(0) = 0.44, for n = 0, ..., m-1.

The low-pass filter has the form F(z) = αz^{-1} + β + αz, where 2α + β = 1, for example F(z) = 0.18z^{-1} + 0.64 + 0.18z.

S3) Calculate the high-frequency part v_high(n) of the adaptive codebook excitation v(n):

v_high(n) = v(n) - v_low(n)

for n = 0, ..., m-1, where m is the length of the adaptive codebook excitation signal; m = 64 in this embodiment.
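Steps S2) and S3) can be sketched as follows. This is a minimal illustration using the coefficients b(-1) = b(1) = 0.28 and b(0) = 0.44 given in the text; the function names are hypothetical, and treating samples outside the subframe as zero is an assumption of the sketch (the text does not say how the boundaries are handled).

```python
def lowpass_3tap(v, b_side=0.28, b_center=0.44):
    """v_low(n) = b(-1)*v(n-1) + b(0)*v(n) + b(1)*v(n+1), n = 0..m-1.

    The filter is symmetric, so the direction of the index does not matter.
    Samples outside the frame are taken as zero (assumption).
    """
    m = len(v)
    out = []
    for n in range(m):
        prev = v[n - 1] if n > 0 else 0.0
        nxt = v[n + 1] if n < m - 1 else 0.0
        out.append(b_side * prev + b_center * v[n] + b_side * nxt)
    return out


def split_bands(v):
    """Return (v_low, v_high) with v_high(n) = v(n) - v_low(n)."""
    v_low = lowpass_3tap(v)
    v_high = [a - b for a, b in zip(v, v_low)]
    return v_low, v_high
```

Because 2 × 0.28 + 0.44 = 1, a constant signal passes through the low-pass filter unchanged away from the frame edges, while a signal alternating in sign at every sample ends up almost entirely in v_high(n).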

S4) Calculate the residual signal r(n) of the input speech signal, i.e., the weighted-domain speech residual signal:

r (n) = speech (n) * \hat{A} (z)

where \hat{A}(z) is the analysis filter formed from the quantized linear prediction (LPC) coefficients, and p is the linear prediction order.
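The residual computation of step S4) amounts to running the speech through the analysis filter. A minimal sketch, assuming the common convention \hat{A}(z) = 1 + Σ_{i=1..p} â_i z^{-i} (the sign convention and the coefficient values are not given in this text); the function and parameter names are hypothetical.

```python
def lp_residual(speech, a_hat, history=None):
    """r(n) = speech(n) + sum_{i=1..p} a_hat[i-1] * speech(n - i).

    `a_hat` holds the p quantized LPC coefficients (without the leading 1).
    `history` holds the p samples preceding the frame, oldest first;
    zeros are assumed when it is omitted.
    """
    p = len(a_hat)
    past = [0.0] * p if history is None else list(history)
    if len(past) != p:
        raise ValueError("history must hold exactly p past samples")
    ext = past + list(speech)
    return [ext[p + n] + sum(a_hat[i - 1] * ext[p + n - i]
                             for i in range(1, p + 1))
            for n in range(len(speech))]
```

As a quick check, a first-order signal speech(n) = 0.9^n is produced by a single unit pulse, so filtering it with a_hat = [-0.9] should return that pulse followed by zeros.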

S5) Calculate the cross-correlation between the residual signal r(n) and the high-frequency part v_high(n) of the adaptive codebook excitation v(n), i.e., the high-frequency excitation cross-correlation corr_exc_high (equation (3)).

The number 63 in equation (3) reflects the correlation length and can of course be changed to suit a different adaptive codebook excitation length.

S6) Compare the cross-correlation corr_exc_high with the given correlation threshold γ = 0.19: if corr_exc_high > γ, execute step S7); if corr_exc_high ≤ γ, execute step S8).

The given correlation threshold γ is chosen according to the resulting coding quality.
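Steps S5) and S6) can be sketched as below. Equation (3) itself is not reproduced in this text, so the normalization used here (dividing by the square root of the product of the two energies) is an assumption chosen to give a value between -1 and 1; the function names are hypothetical, and γ = 0.19 is the threshold quoted above.

```python
import math


def high_band_correlation(r, v_high):
    """Assumed normalized form of corr_exc_high over the subframe."""
    num = sum(a * b for a, b in zip(r, v_high))
    den = math.sqrt(sum(a * a for a in r) * sum(b * b for b in v_high))
    return num / den if den > 0.0 else 0.0


def keep_full_band(r, v_high, gamma=0.19):
    """True  -> step S7): use v(n) as the adaptive codebook excitation.
    False -> step S8): use the low-frequency part v_low(n)."""
    return high_band_correlation(r, v_high) > gamma
```

The decision needs only two energies and one inner product, with no synthesis filtering of a second candidate.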

S7) If corr_exc_high > γ, the final adaptive codebook excitation signal is v(n).

Let the weighted synthesis signal be synth(n); then

synth(n) = h(n) * v(n)

The gain is then calculated as follows, where x(n) is the target signal:

\mathrm{gain} = \frac{\sum_{n=0}^{63} x(n) \times \mathrm{synth}(n)}{\sum_{n=0}^{63} \mathrm{synth}^2(n)} \quad (4)

S8) If corr_exc_high ≤ γ, the final adaptive codebook excitation signal is v_low(n).

Let the weighted synthesis signal be synth'(n); then

synth'(n) = h(n) * v_low(n)

The gain gain' is then:

\mathrm{gain}' = \frac{\sum_{n=0}^{63} x(n) \times \mathrm{synth}'(n)}{\sum_{n=0}^{63} (\mathrm{synth}'(n))^2} \quad (5)

Similarly, the number 63 in equations (4) and (5) may be changed to match the actual adaptive codebook excitation length.
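The form of equations (4) and (5) is the standard least-squares result: it is the gain g that minimizes the energy of the error x(n) - g × synth(n). A short derivation, given here as supporting arithmetic rather than as part of the source text:

```latex
\begin{aligned}
E(g) &= \sum_{n=0}^{63} \bigl(x(n) - g\,\mathrm{synth}(n)\bigr)^2 \\
\frac{dE}{dg} &= -2 \sum_{n=0}^{63} \mathrm{synth}(n)\,\bigl(x(n) - g\,\mathrm{synth}(n)\bigr) = 0 \\
\Rightarrow\quad g &= \frac{\sum_{n=0}^{63} x(n)\,\mathrm{synth}(n)}{\sum_{n=0}^{63} \mathrm{synth}^2(n)}
\end{aligned}
```

The same steps with synth'(n) in place of synth(n) give gain'.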

This completes the selection of the adaptive codebook excitation signal and the calculation of the long-term prediction gain.

The coded gain (gain or gain') is transmitted to the decoder together with information indicating which signal was selected as the final adaptive codebook excitation signal, and the decoder applies the same processing when synthesizing the excitation signal.
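Steps S7) and S8) can be sketched together as one function that takes the already computed correlation, picks the excitation and evaluates equation (4) or (5). This is a hedged illustration: the function names and the returned flag are hypothetical, and quantization of the gain is left out.

```python
def convolve_truncated(v, h):
    """synth(n) = sum_{i=0..n} v(i) * h(n - i), n = 0..len(v)-1."""
    return [sum(v[i] * h[n - i] for i in range(n + 1))
            for n in range(len(v))]


def select_and_gain(x, v, v_low, h, corr_exc_high, gamma=0.19):
    """Open-loop selection followed by a single gain computation.

    Returns (use_full_band, chosen_excitation, gain). The flag stands for
    the selection information sent to the decoder alongside the gain.
    """
    use_full_band = corr_exc_high > gamma
    chosen = v if use_full_band else v_low
    synth = convolve_truncated(chosen, h)      # only one convolution needed
    energy = sum(s * s for s in synth)
    if energy > 0.0:
        gain = sum(a * s for a, s in zip(x, synth)) / energy
    else:
        gain = 0.0
    return use_full_band, chosen, gain
```

Unlike the closed-loop scheme of the prior art, only the selected candidate is convolved with h(n), which is consistent with the reduced complexity reported for the method.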

In addition, equation (3) is only an example; the high-frequency excitation cross-correlation corr_exc_high may also be obtained by equation (6), in which corr_exc_high is not normalized and m is the length of the adaptive codebook excitation signal.

The subsequent steps are still S6), S7) and S8), except that the given correlation threshold γ must be readjusted according to the resulting coding quality.

In addition, another scheme is possible for "3. Open loop adaptive codebook excitation signal selection":

S1') Obtain the adaptive codebook excitation signal v(n) calculated in step 2.

S2') High-pass filter the adaptive codebook excitation v(n) to obtain its high-frequency part v_high(n).

Then calculate the weighted-domain speech residual signal, i.e., the residual signal r(n) of the input speech signal, calculate the high-frequency excitation cross-correlation corr_exc_high between r(n) and the high-frequency part v_high(n) of the adaptive codebook excitation v(n), and compare corr_exc_high with the predetermined correlation threshold γ. These operations are exactly the same as steps S4) to S6).

If corr_exc_high > γ, the searched adaptive codebook excitation signal v(n) is selected as the current adaptive codebook excitation signal and the gain is then calculated; the processing is exactly the same as in step S7).

If corr_exc_high ≤ γ, the low-pass-filtered adaptive codebook excitation signal is selected as the current adaptive codebook excitation signal and the gain is then calculated in the same way as in step S8). The low-frequency part of the adaptive codebook excitation signal can be obtained in either of two ways:

The first: low-pass filter the searched adaptive codebook excitation v(n) to obtain the low-frequency part v_low(n), and select v_low(n) as the current adaptive codebook excitation signal.

The second: subtract the calculated high-frequency part v_high(n) from the searched adaptive codebook excitation v(n) to obtain its low-frequency part v_low(n), i.e.

v_low(n) = v(n) - v_high(n)

and then select v_low(n) as the current adaptive codebook excitation signal.
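The text does not specify the high-pass filter used in step S2'). As an illustration only, the sketch below assumes the filter complementary to the low-pass filter, 1 - F(z), with taps -0.28, 0.56, -0.28; under that assumption the two ways of obtaining v_low(n) give the same result. The function names are hypothetical, and out-of-frame samples are treated as zero.

```python
ALPHA, BETA = 0.28, 0.44     # low-pass F(z) = ALPHA*z^-1 + BETA + ALPHA*z


def fir_3tap(v, side, center):
    """Symmetric 3-tap FIR; samples outside the frame count as zero."""
    m = len(v)
    return [side * (v[n - 1] if n > 0 else 0.0)
            + center * v[n]
            + side * (v[n + 1] if n < m - 1 else 0.0)
            for n in range(m)]


def lowpass(v):
    """First way: v_low(n) obtained by low-pass filtering v(n)."""
    return fir_3tap(v, ALPHA, BETA)


def highpass(v):
    """Hypothetical complementary high-pass 1 - F(z) for step S2')."""
    return fir_3tap(v, -ALPHA, 1.0 - BETA)


def low_part_by_subtraction(v):
    """Second way: v_low(n) = v(n) - v_high(n)."""
    v_high = highpass(v)
    return [a - b for a, b in zip(v, v_high)]
```

With a high-pass filter that is not the exact complement of the low-pass filter, the two ways would yield slightly different low-frequency parts.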

The invention also discloses a device for selecting an adaptive codebook excitation signal, which is shown in fig. 3 and comprises: an adaptive codebook excitation signal searching unit 310, a processing unit 320, and a comparison selecting unit 330, wherein,

an adaptive codebook excitation signal searching unit 310 for searching an adaptive codebook excitation signal based on the target signal and the impulse response that have been calculated;

the processing unit 320 is configured to calculate a high-frequency excitation correlation between a high-frequency portion of the searched adaptive codebook excitation signal and a residual signal of the input speech signal, and send the calculated high-frequency excitation correlation value to the comparison selection unit 330;

the comparison selection unit 330 is configured to select the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high-frequency excitation correlation is greater than a preset correlation threshold.

The processing unit 320 may have the following two structures:

The first: the processing unit 320 includes a low-pass filtering unit and a calculating unit, wherein

the low-pass filtering unit is configured to low-pass filter the received adaptive codebook excitation signal;

the calculating unit is configured to subtract the low-pass-filtered low-frequency part from the searched adaptive codebook excitation signal to obtain the high-frequency part of the adaptive codebook excitation signal, to calculate the high-frequency excitation correlation between that high-frequency part and the residual signal of the input speech signal, and to send the calculated high-frequency excitation correlation value to the comparison selection unit.

The second: the processing unit 320 includes a high-pass filtering unit and a calculating unit, wherein

the high-pass filtering unit is configured to high-pass filter the received adaptive codebook excitation signal;

the calculating unit is configured to calculate the high-frequency excitation correlation between the high-pass-filtered high-frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal, and to send the calculated high-frequency excitation correlation value to the comparison selection unit. In this case, the calculating unit may further be configured to subtract the high-pass-filtered high-frequency part from the searched adaptive codebook excitation signal to obtain the low-frequency part of the adaptive codebook excitation signal, and to send that low-frequency part to the comparison selection unit.

In the latter case, if the calculating unit does not calculate the low-frequency part of the adaptive codebook excitation signal, the processing unit 320 may further include a low-pass filtering unit configured to low-pass filter the received adaptive codebook excitation signal and to send the low-pass-filtered adaptive codebook excitation signal to the comparison selection unit.

The comparison selection unit 330 is further configured to: and after the calculated high-frequency excitation correlation is determined to be less than or equal to a preset correlation threshold value, selecting the low-frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal.

The above apparatus may further comprise: a gain calculating unit 340 for calculating a gain according to the currently selected adaptive codebook excitation signal.

The calculation method of each unit is the same as that described above, and is not described again.

The method and the device of the invention improve the broadband voice coding performance: experiments show that compared with the prior art, the scheme improves the coding signal-to-noise ratio (SNR); and, the complexity of the selection operation is greatly reduced: statistical analysis shows that the complexity of the algorithm is only 60% of the prior art.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method of selecting an adaptive codebook excitation signal, comprising:

2. The method of claim 1, wherein if the calculated high frequency excitation correlation is less than or equal to a predetermined correlation threshold, the method further comprises: the low frequency part of the searched adaptive codebook excitation signal is selected as the current adaptive codebook excitation signal.

3. The method of claim 1 or 2, further comprising:

the gain is calculated based on the currently selected adaptive codebook excitation signal.

4. The method of claim 1, wherein obtaining the searched high frequency portion of the adaptive codebook excitation signal comprises:

i) low-pass filtering the searched adaptive codebook excitation signal with a low-pass filter to obtain the low-frequency part v_low(n) of the adaptive codebook excitation signal,

wherein b(-1) = b(1) = 0.28 and b(0) = 0.44; n = 0, ..., m-1; v(n) is the adaptive codebook signal obtained by the search; and m is the length of the adaptive codebook excitation signal;

ii) calculating the high-frequency part v_high(n) of the adaptive codebook excitation v(n) from the low-frequency part v_low(n) of the adaptive codebook excitation signal:

v_high(n) = v(n) - v_low(n)

n = 0, ..., m-1, m being the length of the adaptive codebook excitation signal.

5. The method of claim 1, wherein the high frequency excitation correlation corr _ exc _ high between the high frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal is calculated by:

where r (n) is a residual signal of the input speech signal, v _ high (n) is a high frequency portion of the adaptive codebook excitation v (n), and m is a length of the adaptive codebook excitation signal.

6. The method of claim 1, wherein the high frequency excitation correlation corr _ exc _ high between the high frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal is calculated by:

7. The method according to claim 5 or 6, wherein the residual signal r(n) of the input speech signal is calculated by:

r(n) = speech(n) * Â(z)

wherein Â(z) is the analysis filter built from the quantized linear prediction coefficients, and p is the linear prediction order.
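
As a rough sketch of this residual computation, the Python below passes a speech frame through an order-p analysis filter. The sign convention Â(z) = 1 + Σ â_i z^(-i) for i = 1, ..., p is an assumption, since the filter definition is not reproduced in this text. The zero history before the frame and the names `lp_residual` and `a_hat` are also hypothetical.

```python
from typing import List, Sequence


def lp_residual(speech: Sequence[float], a_hat: Sequence[float]) -> List[float]:
    """Compute r(n) = speech(n) + sum_{i=1..p} a_hat[i-1] * speech(n-i).

    Assumptions: a_hat holds the p quantized LP coefficients under the
    "1 + sum" sign convention; samples before the frame start are zero.
    """
    p = len(a_hat)
    residual = []
    for n in range(len(speech)):
        acc = speech[n]
        # Only past samples inside the frame contribute (zero history).
        for i in range(1, min(p, n) + 1):
            acc += a_hat[i - 1] * speech[n - i]
        residual.append(acc)
    return residual
```

For example, a geometrically decaying frame such as 1, 0.5, 0.25, 0.125 is predicted perfectly by a single coefficient of -0.5 under this convention, so only the first residual sample is nonzero.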

8. The method of claim 1, wherein obtaining the high frequency part of the searched adaptive codebook excitation signal comprises:

high-pass filtering the searched adaptive codebook excitation signal with a high-pass filter to obtain the high frequency part v_high(n) of the adaptive codebook excitation signal.

9. The method of claim 8, wherein selecting the low frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal comprises:

subtracting the calculated high frequency part v_high(n) from the searched adaptive codebook excitation signal v(n) to obtain the low frequency part v_low(n) of the searched adaptive codebook excitation signal, and then selecting v_low(n) as the current adaptive codebook excitation signal.

10. The method of claim 2, wherein selecting the low frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal comprises:

low-pass filtering the searched adaptive codebook excitation signal with a low-pass filter to obtain the low frequency part v_low(n) of the adaptive codebook excitation signal, and selecting the low-pass filtered adaptive codebook excitation signal as the current adaptive codebook excitation signal.
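
The threshold decision described in the abstract and in claim 2 can be sketched as follows. This is only an illustration: the function name `select_excitation` and its parameters are hypothetical, the correlation value is taken as an input rather than computed, and no threshold value is given in this text.

```python
from typing import List, Sequence


def select_excitation(
    v: Sequence[float],
    v_low: Sequence[float],
    corr_exc_high: float,
    threshold: float,
) -> List[float]:
    """Pick the current adaptive codebook excitation signal.

    Above the threshold the searched excitation v(n) is kept as is;
    otherwise (less than or equal) the low frequency part v_low(n) is used.
    """
    if len(v) != len(v_low):
        raise ValueError("v and v_low must have the same length m")
    if corr_exc_high > threshold:
        return list(v)
    return list(v_low)
```

A correlation exactly equal to the threshold falls on the low-pass side, matching the "less than or equal to" wording of claim 2.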

11. An apparatus for selecting an adaptive codebook excitation signal, comprising: an adaptive codebook excitation signal searching unit, a processing unit and a comparison selection unit, wherein,

12. The apparatus of claim 11, wherein the processing unit comprises: a low-pass filtering unit and a calculating unit, wherein,

the low-pass filtering unit is used for performing low-pass filtering on the received adaptive codebook excitation signal;

the calculating unit is used for subtracting the low-pass filtered low frequency part from the searched adaptive codebook excitation signal to obtain the high frequency part of the adaptive codebook excitation signal, calculating the high frequency excitation correlation between the high frequency part and the residual signal of the input speech signal, and sending the calculated high frequency excitation correlation value to the comparison selection unit.

13. The apparatus of claim 11, wherein the processing unit comprises: a high-pass filtering unit and a calculating unit,

the calculating unit calculates the high frequency excitation correlation between the high-pass filtered high frequency part of the adaptive codebook excitation signal and the residual signal of the input speech signal, and sends the calculated high frequency excitation correlation value to the comparison selection unit.

14. The apparatus of claim 13, wherein the calculating unit is further configured to: subtract the high-pass filtered high frequency part from the searched adaptive codebook excitation signal to obtain the low frequency part of the adaptive codebook excitation signal, and send the calculated low frequency part of the adaptive codebook excitation signal to the comparison selection unit.

15. The apparatus of claim 13, wherein the processing unit further comprises: a low-pass filtering unit configured to low-pass filter the received adaptive codebook excitation signal and to send the low-pass filtered adaptive codebook excitation signal to the comparison selection unit.

16. The apparatus according to claim 14 or 15, wherein the comparison selection unit is further configured to: select the low frequency part of the searched adaptive codebook excitation signal as the current adaptive codebook excitation signal after determining that the calculated high frequency excitation correlation is less than or equal to a preset correlation threshold.

17. The apparatus of claim 16, further comprising: a gain calculation unit configured to calculate a gain according to the currently selected adaptive codebook excitation signal.