US10170129B2 - Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain - Google Patents

Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain

Info

Publication number
US10170129B2
Authority
US
United States
Prior art keywords
matrix
vector
autocorrelation matrix
coding algorithm
speech coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US14/678,610
Other languages
English (en)
Other versions
US20180218743A9 (en)
US20150213810A1 (en)
Inventor
Tom BAECKSTROEM
Markus Multrus
Guillaume Fuchs
Christian Helmrich
Martin Dietz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Original Assignee
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority to US14/678,610 (patent US10170129B2)
Publication of US20150213810A1
Assigned to FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V. Assignment of assignors interest (see document for details). Assignors: DIETZ, MARTIN; BAECKSTROEM, TOM; FUCHS, GUILLAUME; Helmrich, Christian; MULTRUS, MARKUS
Publication of US20180218743A9
Priority to US16/209,610 (patent US11264043B2)
Application granted
Publication of US10170129B2
Priority to US17/576,797 (patent US12002481B2)
Priority to US18/680,606 (publication US20240321284A1)
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/038 Vector quantisation, e.g. TwinVQ audio
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G10L19/107 Sparse pulse excitation, e.g. by using algebraic codebook
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks

Definitions

  • the present invention relates to audio signal coding, and, in particular, to an apparatus for encoding a speech signal employing ACELP in the autocorrelation domain.
  • CELP Code-Excited Linear Prediction
  • LP linear predictive
  • LTP long-time predictor
  • a residual signal represented by a codebook also known as the fixed codebook
  • ACELP Algebraic Code-Excited Linear Prediction
  • ACELP is based on modeling the spectral envelope by a linear predictive (LP) filter, the fundamental frequency of voiced sounds by a long time predictor (LTP) and the prediction residual by an algebraic codebook.
  • LTP and algebraic codebook parameters are optimized by a least squares algorithm in a perceptual domain, where the perceptual domain is specified by a filter.
  • h(39) and the vector h(k) is the impulse response of the LP model.
  • the perceptual model (which usually corresponds to a weighted LP model) is omitted, but it is assumed that the perceptual model is included in the impulse response h(k). This omission has no impact on the generality of results, but simplifies notation.
  • the inclusion of the perceptual model is applied as in [1].
  • ZIR zero impulse response
  • the concept appears when considering the original domain synthesis signal in comparison to the synthesised residual.
  • the residual is encoded in blocks corresponding to the frame or sub-frame size.
  • the fixed length residual will have an infinite length “tail”, corresponding to the impulse response of the LP filter. That is, although the residual codebook vector is of finite length, it will have an effect on the synthesis signal far beyond the current frame or sub-frame.
  • the effect of a frame into the future can be calculated by extending the codebook vector with zeros and calculating the synthesis output of Equation 1 for this extended signal.
  • This extension of the synthesised signal is known as the zero impulse response. Then, to take into account the effect of prior frames in encoding the current frame, the ZIR of the prior frame is subtracted from the target of the current frame. In encoding the current frame, thus, only that part of the signal is considered, which was not already modelled by the previous frame.
  • the ZIR is taken into account as follows: When a (sub)frame N−1 has been encoded, the quantized residual is extended with zeros to the length of the next (sub)frame N. The extended quantized residual is filtered by the LP to obtain the ZIR of the quantized signal. The ZIR of the quantized signal is then subtracted from the original (not quantized) signal and this modified signal forms the target signal when encoding (sub)frame N. This way, all quantization errors made in (sub)frame N−1 will be taken into account when quantizing (sub)frame N. This practice improves the perceptual quality of the output signal considerably.
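  • The ZIR handling described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the LP filter is taken to be A(z) = 1 + a(1)z^-1 + ... + a(m)z^-m with synthesis filter 1/A(z), the filter memory is assumed to be zero at the start of (sub)frame N−1, and perceptual weighting is left out.

```python
def synthesis_filter(excitation, a):
    """All-pole LP synthesis: y(n) = x(n) - sum_{k=1..m} a(k) * y(n-k).

    `a` holds a(1)..a(m); the sign convention of A(z) is an assumption.
    The filter memory is assumed to be zero before the first sample.
    """
    m = len(a)
    history = [0.0] * m          # past outputs, most recent last
    out = []
    for x in excitation:
        y = x - sum(a[k] * history[-1 - k] for k in range(m))
        out.append(y)
        history.append(y)
    return out


def zero_input_response(quantized_residual, a, next_len):
    """Extend the quantized residual with zeros and keep only the filter tail."""
    extended = list(quantized_residual) + [0.0] * next_len
    return synthesis_filter(extended, a)[len(quantized_residual):]


def next_target(original_next, quantized_residual, a):
    """Subtract the ZIR of (sub)frame N-1 from the original signal of (sub)frame N."""
    zir = zero_input_response(quantized_residual, a, len(original_next))
    return [s - z for s, z in zip(original_next, zir)]


# One-pole example: y(n) = x(n) + 0.5 * y(n-1)
print(next_target([1.0, 1.0], [1.0, 0.0], [-0.5]))
```

  • In this toy case the tail of the previous (sub)frame is 0.25 and 0.125, so only the remainder of the next (sub)frame is left to be modelled by its own codebook vector.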
  • a decoder for decoding an encoded speech signal being encoded by an apparatus for encoding a speech signal by determining a codebook vector of a speech coding algorithm which apparatus may have:
  • a method for decoding an encoded speech signal being encoded according to the method for encoding a speech signal by determining a codebook vector of a speech coding algorithm which method for encoding may have the steps of:
  • a system may have:
  • a method may have the steps of:
  • Another embodiment may have a computer program for implementing, when being executed on a computer or signal processor, the method for encoding a speech signal by determining a codebook vector of a speech coding algorithm, which method may have the steps of:
  • Another embodiment may have a computer program for implementing, when being executed on a computer or signal processor, the method for decoding an encoded speech signal being encoded according to the method for encoding a speech signal by determining a codebook vector of a speech coding algorithm, which method for encoding may have the steps of:
  • Another embodiment may have a computer program for implementing, when being executed on a computer or signal processor, the method which may have the steps of:
  • the apparatus is configured to use the codebook vector to encode the speech signal.
  • the apparatus may generate the encoded speech signal such that the encoded speech signal comprises a plurality of Linear Prediction coefficients, an indication of the fundamental frequency of voiced sounds (e.g., pitch parameters), and an indication of the codebook vector, e.g., an index of the codebook vector.
  • a decoder for decoding an encoded speech signal being encoded by an apparatus according to the above-described embodiment to obtain a decoded speech signal is provided.
  • the system comprises an apparatus according to the above-described embodiment for encoding an input speech signal to obtain an encoded speech signal. Moreover, the system comprises a decoder according to the above-described embodiment for decoding the encoded speech signal to obtain a decoded speech signal.
  • Improved concepts for the objective function of the speech coding algorithm ACELP are provided, which take into account not only the effect of the impulse response of the previous frame to the current frame, but also the effect of the impulse response of the current frame into the next frame, when optimizing parameters of current frame.
  • Some embodiments realize these improvements by changing the correlation matrix, which is central to conventional ACELP optimisation, to an autocorrelation matrix, which has Hermitian Toeplitz structure. By employing this structure, it is possible to make ACELP optimisation more efficient in terms of both computational complexity and memory requirements. At the same time, the applied perceptual model becomes more consistent, and interframe dependencies can be avoided to improve performance under the influence of packet loss.
  • Speech coding with the ACELP paradigm is based on a least squares algorithm in a perceptual domain, where the perceptual domain is specified by a filter.
  • the computational complexity of the conventional definition of the least squares problem can be reduced by taking into account the impact of the zero impulse response into the next frame.
  • the provided modifications introduce a Toeplitz structure to a correlation matrix appearing in the objective function, which simplifies the structure and reduces computations.
  • the proposed concepts reduce computational complexity by up to 17% without reducing perceptual quality.
  • Embodiments are based on the finding that by a slight modification of the objective function, complexity in the optimization of the residual codebook can be further reduced. This reduction in complexity comes without reduction in perceptual quality.
  • Since ACELP residual optimization is based on iterative search algorithms, the presented modification makes it possible to increase the number of iterations without an increase in complexity, and in this way to obtain an improved perceptual quality.
  • the optimal solution to the conventional approach is not necessarily optimal with respect to the modified objective function and vice versa. This alone does not mean that one approach would be better than the other, but analytic arguments do show that the modified objective function is more consistent.
  • the provided concepts treat all samples within a sub-frame equally, with consistent and well-defined perceptual and signal models.
  • the proposed modifications can be applied such that they only change the optimization of the residual codebook. They therefore do not change the bit-stream structure and can be applied in a backward-compatible manner to existing ACELP codecs.
  • a method for encoding a speech signal by determining a codebook vector of a speech coding algorithm comprises:
  • Determining an autocorrelation matrix R comprises determining vector coefficients of a vector r.
  • the autocorrelation matrix R comprises a plurality of rows and a plurality of columns.
  • the method comprises:
  • FIG. 1 illustrates an apparatus for encoding a speech signal by determining a codebook vector of a speech coding algorithm according to an embodiment.
  • FIG. 2 illustrates a decoder according to an embodiment.
  • FIG. 3 illustrates a system comprising an apparatus for encoding a speech signal according to an embodiment and a decoder.
  • the apparatus comprises a matrix determiner ( 110 ) for determining an autocorrelation matrix R, and a codebook vector determiner ( 120 ) for determining the codebook vector depending on the autocorrelation matrix R.
  • the matrix determiner ( 110 ) is configured to determine the autocorrelation matrix R by determining vector coefficients of a vector r.
  • R(i, j) indicates the coefficients of the autocorrelation matrix R, wherein i is a first index indicating one of a plurality of rows of the autocorrelation matrix R, and wherein j is a second index indicating one of the plurality of columns of the autocorrelation matrix R.
  • the apparatus is configured to use the codebook vector to encode the speech signal.
  • the apparatus may generate the encoded speech signal such that the encoded speech signal comprises a plurality of Linear Prediction coefficients, an indication of the fundamental frequency of voiced sounds (e.g. pitch parameters), and an indication of the codebook vector.
  • the apparatus may be configured to determine a plurality of linear predictive coefficients (a(k)) depending on the speech signal. Moreover, the apparatus is configured to determine a residual signal depending on the plurality of linear predictive coefficients (a(k)). Furthermore, the matrix determiner 110 may be configured to determine the autocorrelation matrix R depending on the residual signal.
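  • As an illustration of how linear predictive coefficients a(k) and a residual might be obtained from a block of speech, the following sketch uses the textbook autocorrelation method with the Levinson-Durbin recursion. The patent text above does not prescribe this particular method; the function names, the convention A(z) = 1 + a(1)z^-1 + ... + a(m)z^-m and the zero-history assumption are choices made for this example only.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimates r(0)..r(max_lag) of a finite block."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations; returns [1, a(1), ..., a(order)] and the error."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


def lp_residual(signal, a):
    """e(n) = sum_k a(k) s(n-k), with samples before the block taken as zero."""
    return [sum(a[k] * signal[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(signal))]


# Synthetic first-order example: s(n) = 0.9 * s(n-1) driven by a unit impulse
s = [0.9 ** n for n in range(200)]
a, _ = levinson_durbin(autocorr(s, 1), 1)
e = lp_residual(s, a)
```

  • For this synthetic input the estimated a(1) is very close to −0.9 and the residual is essentially the driving impulse, which is the kind of signal the codebook vector then has to represent.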
  • The ACELP algorithm is centred around Equation 4, which in turn is based on Equation 3.
  • Equation 3 should thus be extended such that it takes into account the ZIR into the next frame. It should be noticed that here, inter alia, the difference to conventional technology is that both the ZIR from the previous frame and also the ZIR into the next frame are taken into account.
  • This objective function is very similar to Equation 4. The main difference is that instead of the correlation matrix B, here a Hermitian Toeplitz matrix R is in the denominator.
  • this novel formulation has the benefit that all samples of the residual e within a frame will receive the same perceptual weighting.
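  • The difference between the two matrices can be made concrete with a small numerical sketch. It assumes, for illustration only, that the conventional correlation matrix is B = H^T H with H the N×N lower-triangular convolution matrix of h(k) (the tail is cut at the frame boundary), and that R is obtained in the same way from a convolution matrix with enough rows to keep the whole tail. The impulse response values and function names are hypothetical.

```python
def conv_matrix(h, n_cols, n_rows):
    """Row i, column j holds h(i - j): column j is h delayed by j samples."""
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n_cols)]
            for i in range(n_rows)]


def gram(m):
    """Return M^T M for a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[sum(m[k][i] * m[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]


def is_toeplitz(m, tol=1e-12):
    """True if every diagonal of the square matrix is constant."""
    n = len(m)
    return all(abs(m[i][j] - m[i - 1][j - 1]) <= tol
               for i in range(1, n) for j in range(1, n))


h = [1.0, 0.6, 0.3, 0.1]   # hypothetical (truncated) impulse response
N = 4                      # (sub)frame length

B = gram(conv_matrix(h, N, N))               # tail cut at the frame boundary
R = gram(conv_matrix(h, N, N + len(h) - 1))  # tail into the next frame kept

print(is_toeplitz(B), is_toeplitz(R))
```

  • In B the diagonal entries shrink towards the end of the frame because later samples lose part of their tail, whereas in R every sample sees the same weighting; this is the equal-weighting property stated above.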
  • Since the objective function in Equation 10 is so similar to Equation 4, the structure of the general ACELP can be retained. Specifically, any of the following operations can be performed with either objective function, with only minor modifications to the algorithm:
  • Some embodiments employ the concepts of the present invention by replacing the correlation matrix B with the autocorrelation matrix R wherever B appears in the ACELP algorithm. If all instances of the matrix B are omitted, then calculating its value can be avoided.
  • the autocorrelation matrix R is determined by determining the coefficients of the first column r(0), . . . , r(N−1) of the autocorrelation matrix R.
  • sequence r(k) is the autocorrelation of h(k).
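  • Because R is fully described by its first column, it never has to be stored as a full matrix. The sketch below assumes a real-valued, finite-length h(k), so that R(i, j) = r(|i − j|); the function names are hypothetical.

```python
def autocorrelation(h, n):
    """r(k) = sum_l h(l) * h(l + k) for k = 0..n-1, over a finite-length h."""
    return [sum(h[l] * h[l + k] for l in range(len(h) - k)) if k < len(h) else 0.0
            for k in range(n)]


def toeplitz_entry(r, i, j):
    """R(i, j) = r(|i - j|) for a real symmetric Toeplitz matrix."""
    return r[abs(i - j)]


def quadratic_form(r, e):
    """e^T R e evaluated from the first column only, without forming R."""
    n = len(e)
    return sum(e[i] * toeplitz_entry(r, i, j) * e[j]
               for i in range(n) for j in range(n))


r = autocorrelation([1.0, 0.5], 3)
energy = quadratic_form(r, [1.0, 0.0, -1.0])
```

  • Only N values are kept instead of N×N, which reflects the memory saving attributed to the Toeplitz structure.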
  • r(k) can be obtained by even more effective means.
  • the sequence h(k) is the impulse response of a linear predictive filter A(z) filtered by a perceptual weighting function W(z), which is taken to include the pre-emphasis.
  • W(z) perceptual weighting function
  • a codebook vector of a codebook may then, e.g., be determined based on the autocorrelation matrix R.
  • Equation 10 may, according to some embodiments, be used to determine a codebook vector of the codebook.
  • Equation 10 defines the objective function in the form
  • the objective function is basically a normalized correlation between the target vector d and the codebook vector ê, and the best possible codebook vector is the one that gives the highest value of the normalized correlation f(ê), i.e., the one that maximizes f(ê).
  • Codebook vectors can thus be optimized with the same approaches as in the mentioned standards. Specifically, for example, the very simple algorithm for finding the best algebraic codebook (i.e., the fixed codebook) vector ê for the residual can be applied, as described below. It should, however, be noted that significant effort has been invested in the design of efficient search algorithms (cf. AMR and G.718), and this search algorithm is only an illustrative example of application.
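  • To make the idea of maximizing a normalized correlation concrete, here is a deliberately naive greedy pulse search. It is not the search procedure of the patent or of AMR/G.718. It assumes the objective has the form f(ê) = (dᵀê)² / (êᵀRê) with R(i, j) = r(|i − j|), real-valued signals, unit-amplitude pulses, at most one pulse per position, and pulse signs fixed in advance to the sign of d at that position. All names are hypothetical.

```python
def objective(d, r, e):
    """f(e) = (d^T e)^2 / (e^T R e), with R(i, j) = r(|i - j|)."""
    num = sum(di * ei for di, ei in zip(d, e)) ** 2
    idx = [i for i, v in enumerate(e) if v != 0.0]
    den = sum(e[i] * r[abs(i - j)] * e[j] for i in idx for j in idx)
    return num / den if den > 0.0 else 0.0


def greedy_pulse_search(d, r, n_pulses):
    """Place one signed unit pulse at a time at the position that maximizes f."""
    e = [0.0] * len(d)
    for _ in range(n_pulses):
        best = None
        for pos in range(len(d)):
            if e[pos] != 0.0:
                continue                      # one pulse per position in this sketch
            e[pos] = 1.0 if d[pos] >= 0.0 else -1.0   # sign preset from the target
            score = objective(d, r, e)
            if best is None or score > best[0]:
                best = (score, pos, e[pos])
            e[pos] = 0.0
        if best is None:
            break                             # no free position left
        e[best[1]] = best[2]
    return e


d = [0.1, -2.0, 0.3, 1.5]      # hypothetical target vector
r = [1.0, 0.0, 0.0, 0.0]       # first column of R (identity, for readability)
e_hat = greedy_pulse_search(d, r, 2)
```

  • Since the denominator only needs the first column r, each trial costs a handful of multiplications per already-placed pulse, which is where a Toeplitz R can help an iterative search.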
  • the target is modified such that it includes the ZIR into the following frame.
  • Equation 1 describes the linear predictive model used in ACELP-type codecs.
  • the Zero Impulse Response also sometimes known as the Zero Input Response
  • the ZIR can be readily calculated by defining the residual which is zero from position N forward as
  • the ZIR can be determined by filtering the past input signal as
  • This target is in principle exactly equal to the target in the AMR and G.718 standards.
  • the quantized signal d̂(n) is compared to d(n) for the duration of a frame K ≤ n < K+N.
  • the residual of the current frame has an influence on the following frames, whereby it is useful to consider its influence when quantizing the signal, that is, one thus may want to evaluate the difference d̂(n) − d(n) also beyond the current frame, n > K+N.
  • one may want to consider the influence of the residual of the current frame only by setting residuals of the following frames to zero. Therefore, the ZIR of d(n) into the next frame may be compared.
  • the modified target is obtained:
  • the long-time predictor (LTP) is actually also a linear predictor.
  • the matrix determiner 110 may be configured to determine the autocorrelation matrix R depending on a perceptually weighted linear predictor, for example, depending on the long-time predictor.
  • the LP and LTP can be convolved into one joint predictor, which includes both the spectral envelope shape as well as the harmonic structure.
  • the impulse response of such a predictor will be very long, whereby it is even more difficult to handle with conventional technology.
  • the autocorrelation of the linear predictor is already known, then the autocorrelation of the joint predictor can be calculated by simply filtering the autocorrelation with the LTP forward and backward, or with a similar process in the frequency domain.
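  • The forward and backward filtering of an autocorrelation sequence can be checked on finite-length sequences. The sketch assumes that both the LP impulse response h and the LTP impulse response g have been truncated to finite length (in practice both are recursive filters with long responses), and it stores autocorrelations two-sided, for lags −(L−1) to L−1. The names and values are hypothetical.

```python
def convolve(x, y):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def two_sided_autocorr(h):
    """Autocorrelation for lags -(L-1)..(L-1): h convolved with its time reverse."""
    return convolve(h, h[::-1])


def joint_autocorr(r_lp, g):
    """Filter a two-sided autocorrelation with g forward and then backward."""
    return convolve(convolve(r_lp, g), g[::-1])


h = [1.0, 0.5, 0.25]          # hypothetical truncated LP impulse response
g = [1.0, 0.0, 0.0, 0.8]      # hypothetical truncated LTP response (lag 3)

direct = two_sided_autocorr(convolve(h, g))       # via the joint impulse response
shortcut = joint_autocorr(two_sided_autocorr(h), g)
```

  • Both routes give the same sequence, so the joint impulse response itself never has to be formed once the autocorrelation of the linear predictor is available.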
  • ACELP systems are complex because filtering by LP causes complicated correlations between the residual samples, which are described by the matrix B or in the current context by matrix R. Since the samples of e(n) are correlated, it is not possible to just quantise e(n) with desired accuracy, but many combinations of different quantisations with a trial-and-error approach have to be tried, to find the best quantisation with respect to the objective function of Equation 3 or 10, respectively.
  • R has Hermitian Toeplitz structure
  • several efficient matrix decompositions can be applied, such as the singular value decomposition, Cholesky decomposition or Vandermonde decomposition of Hankel matrices (Hankel matrices are upside-down Toeplitz matrices, whereby the same decompositions can be applied to Toeplitz and Hankel matrices) (see [6] and [7]).
  • Let R = E D E^H be a decomposition of R such that D is a diagonal matrix of the same size and rank as R.
  • Some embodiments employ equation 12 to determine a codebook vector of the codebook.
  • Since the elements of f′ are orthogonal (as can be seen from Equation 12) and they have the same weight in the objective function of Equation 12, they can be quantized separately, and with the same quantization step size. That quantization will automatically find the optimal (the largest) value of the objective function in Equation 12, which is possible with that quantization accuracy. In other words, the quantization algorithms presented above will both return the optimal quantization with respect to Equation 12.
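  • The effect of such a decomposition can be illustrated with an LDLᵀ (Cholesky-type) factorization of a real symmetric positive definite R, taking E to be the unit lower-triangular factor. The sketch assumes the transformed vector is f′ = D^(1/2) E^H e, which is a reading of the text above rather than a quotation of Equation 12, and it only demonstrates that the quadratic form in the denominator becomes a plain sum of squares. The names and values are hypothetical.

```python
import math


def ldl_decompose(R):
    """Factor a symmetric positive definite R as L D L^T (L unit lower-triangular)."""
    n = len(R)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = R[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        for i in range(j + 1, n):
            L[i][j] = (R[i][j] - sum(L[i][k] * L[j][k] * D[k] for k in range(j))) / D[j]
    return L, D


def transform(L, D, e):
    """f' = D^(1/2) L^T e, so that e^T R e equals the squared norm of f'."""
    n = len(e)
    return [math.sqrt(D[k]) * sum(L[i][k] * e[i] for i in range(k, n))
            for k in range(n)]


r = [1.46, 0.81, 0.36, 0.1]                       # first column of a Toeplitz R
R = [[r[abs(i - j)] for j in range(4)] for i in range(4)]
e = [1.0, 0.0, -1.0, 0.0]

L, D = ldl_decompose(R)
f_prime = transform(L, D, e)
quad = sum(e[i] * R[i][j] * e[j] for i in range(4) for j in range(4))
```

  • In the transformed domain every element contributes to the denominator with the same weight and without cross terms, which is the property that allows the elements to be quantized separately.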
  • Vandermonde factorization of a Toeplitz matrix can be chosen such that the Vandermonde matrix is a Fourier transform matrix but with unevenly distributed frequencies.
  • the Vandermonde matrix corresponds to a frequency-warped Fourier transform. It follows that in this case the vector f corresponds to a frequency domain representation of the residual signal on a warped frequency scale (see the “root-exchange property” in [8]).
  • the path through which inter-frame dependency is generated can be quantified by the ZIR from the current frame into the next.
  • three modifications to the conventional ACELP need to be made.
  • Embodiments modify conventional ACELP algorithms by inclusion of the effect of the impulse response of the current frame into the next frame, into the objective function of the current frame.
  • this modification corresponds to replacing a correlation matrix with an autocorrelation matrix that has Hermitian Toeplitz structure. This modification has the following benefits:
  • FIG. 2 illustrates a decoder 220 for decoding an encoded speech signal being encoded by an apparatus according to the above-described embodiment to obtain a decoded speech signal.
  • the decoder 220 is configured to receive the encoded speech signal, wherein the encoded speech signal comprises an indication of the codebook vector, being determined by an apparatus for encoding a speech signal according to one of the above-described embodiments, for example, an index of the determined codebook vector. Furthermore, the decoder 220 is configured to decode the encoded speech signal to obtain a decoded speech signal depending on the codebook vector.
  • FIG. 3 illustrates a system according to an embodiment.
  • the system comprises an apparatus 210 according to one of the above-described embodiments for encoding an input speech signal to obtain an encoded speech signal.
  • the encoded speech signal comprises an indication of the determined codebook vector determined by the apparatus 210 for encoding a speech signal, e.g., it comprises an index of the codebook vector.
  • the system comprises a decoder 220 according to the above-described embodiment for decoding the encoded speech signal to obtain a decoded speech signal.
  • the decoder 220 is configured to receive the encoded speech signal.
  • the decoder 220 is configured to decode the encoded speech signal to obtain a decoded speech signal depending on the determined codebook vector.
  • Although some aspects have been described in the context of an apparatus, these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
  • the inventive decomposed signal can be stored on a digital storage medium or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.
  • embodiments of the invention can be implemented in hardware or in software.
  • the implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.
  • Some embodiments according to the invention comprise a non-transitory data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
  • embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer.
  • the program code may for example be stored on a machine readable carrier.
  • Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
  • an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
  • a further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
  • a further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein.
  • the data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
  • a further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
  • a further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
  • a programmable logic device for example a field programmable gate array
  • a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein.
  • the methods are advantageously performed by any hardware apparatus.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
US14/678,610 2012-10-05 2015-04-03 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain Active US10170129B2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US14/678,610 US10170129B2 (en) 2012-10-05 2015-04-03 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US16/209,610 US11264043B2 (en) 2012-10-05 2018-12-04 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US17/576,797 US12002481B2 (en) 2012-10-05 2022-01-14 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US18/680,606 US20240321284A1 (en) 2012-10-05 2024-05-31 Apparatus for encoding a speech signal employing acelp in the autocorrelation domain

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201261710137P 2012-10-05 2012-10-05
PCT/EP2013/066074 WO2014053261A1 (fr) 2012-10-05 2013-07-31 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US14/678,610 US10170129B2 (en) 2012-10-05 2015-04-03 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2013/066074 Continuation WO2014053261A1 (fr) 2012-10-05 2013-07-31 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/209,610 Continuation US11264043B2 (en) 2012-10-05 2018-12-04 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain

Publications (3)

Publication Number Publication Date
US20150213810A1 US20150213810A1 (en) 2015-07-30
US20180218743A9 US20180218743A9 (en) 2018-08-02
US10170129B2 US10170129B2 (en) 2019-01-01

Family

ID=48906260

Family Applications (4)

Application Number Title Priority Date Filing Date
US14/678,610 Active US10170129B2 (en) 2012-10-05 2015-04-03 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US16/209,610 Active US11264043B2 (en) 2012-10-05 2018-12-04 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US17/576,797 Active US12002481B2 (en) 2012-10-05 2022-01-14 Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US18/680,606 Pending US20240321284A1 (en) 2012-10-05 2024-05-31 Apparatus for encoding a speech signal employing acelp in the autocorrelation domain

Country Status (22)

Country Link
US (4) US10170129B2 (fr)
EP (3) EP2904612B1 (fr)
JP (1) JP6122961B2 (fr)
KR (1) KR101691549B1 (fr)
CN (1) CN104854656B (fr)
AR (1) AR092875A1 (fr)
AU (1) AU2013327192B2 (fr)
BR (1) BR112015007137B1 (fr)
CA (3) CA2887009C (fr)
ES (2) ES2701402T3 (fr)
FI (1) FI3444818T3 (fr)
HK (1) HK1213359A1 (fr)
MX (1) MX347921B (fr)
MY (1) MY194208A (fr)
PL (2) PL2904612T3 (fr)
PT (2) PT3444818T (fr)
RU (1) RU2636126C2 (fr)
SG (1) SG11201502613XA (fr)
TR (1) TR201818834T4 (fr)
TW (1) TWI529702B (fr)
WO (1) WO2014053261A1 (fr)
ZA (1) ZA201503025B (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160372128A1 (en) * 2014-03-14 2016-12-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoder, decoder and method for encoding and decoding

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
WO2014053261A1 (en) 2012-10-05 2014-04-10 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
CA2940657C (en) * 2014-04-17 2021-12-21 Voiceage Corporation Methods, encoder and decoder for linear predictive encoding and decoding of sound signals upon transition between frames having different sampling rates
CN110289008B (en) 2014-05-01 2022-10-21 日本电信电话株式会社 Periodic combined envelope sequence generation device, method, and recording medium
AU2016312404B2 (en) * 2015-08-25 2020-11-26 Dolby International Ab Audio decoder and decoding method

Citations (22)

Publication number Priority date Publication date Assignee Title
US4815135A (en) * 1984-07-10 1989-03-21 Nec Corporation Speech signal processor
US4868867A (en) * 1987-04-06 1989-09-19 Voicecraft Inc. Vector excitation speech or audio coder for transmission or storage
US5265167A (en) 1989-04-25 1993-11-23 Kabushiki Kaisha Toshiba Speech coding and decoding apparatus
JPH0720896A (en) 1993-07-05 1995-01-24 Nippon Telegr & Teleph Corp <Ntt> Excitation signal coding method for speech
WO1998005030A1 (en) 1996-07-31 1998-02-05 Qualcomm Incorporated Method and apparatus for searching an excitation codebook in a code excited linear prediction (CELP) coder
US5717825A (en) * 1995-01-06 1998-02-10 France Telecom Algebraic code-excited linear prediction speech coding method
US5854998A (en) * 1994-04-29 1998-12-29 Audiocodes Ltd. Speech processing system quantizer of single-gain pulse excitation in speech coder
US5963898A (en) * 1995-01-06 1999-10-05 Matra Communications Analysis-by-synthesis speech coding method with truncation of the impulse response of a perceptual weighting filter
US6055496A (en) * 1997-03-19 2000-04-25 Nokia Mobile Phones, Ltd. Vector quantization in celp speech coder
KR20000074365A (en) 1999-05-20 2000-12-15 윤종용 Method for searching algebraic codes in an algebraic codebook during speech coding
US6226604B1 (en) * 1996-08-02 2001-05-01 Matsushita Electric Industrial Co., Ltd. Voice encoder, voice decoder, recording medium on which program for realizing voice encoding/decoding is recorded and mobile communication apparatus
US20020153891A1 (en) * 1999-07-06 2002-10-24 Smith John Alec Sydney Methods of and apparatus for analysing a signal
US20040101048A1 (en) * 2002-11-14 2004-05-27 Paris Alan T Signal processing of multi-channel data
EP1833047A1 (en) 2006-03-10 2007-09-12 Matsushita Electric Industrial Co., Ltd. Fixed codebook searching apparatus and fixed codebook searching method
US20090281798A1 (en) * 2005-05-25 2009-11-12 Koninklijke Philips Electronics, N.V. Predictive encoding of a multi channel signal
US20100014692A1 (en) * 2008-07-17 2010-01-21 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for generating audio output signals using object based metadata
US20110002263A1 (en) * 2009-07-06 2011-01-06 Yuan Zhu Beamforming using base and differential codebooks
WO2011026231A1 (en) 2009-09-02 2011-03-10 Nortel Networks Limited Systems and methods of encoding using a reduced codebook with adaptive resetting
US8036887B2 (en) * 1996-11-07 2011-10-11 Panasonic Corporation CELP speech decoder modifying an input vector with a fixed waveform to transform a waveform of the input vector
US20110313777A1 (en) * 2009-01-21 2011-12-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus, method and computer program for obtaining a parameter describing a variation of a signal characteristic of a signal
RU2010151983A (en) 2008-06-19 2012-06-27 Панасоник Корпорейшн (Jp) Quantizer, encoder, and methods thereof
US20160225387A1 (en) * 2013-08-28 2016-08-04 Dolby Laboratories Licensing Corporation Hybrid waveform-coded and parametric-coded speech enhancement

Family Cites Families (20)

Publication number Priority date Publication date Assignee Title
US4910781A (en) * 1987-06-26 1990-03-20 At&T Bell Laboratories Code excited linear predictive vocoder using virtual searching
CA2010830C (en) * 1990-02-23 1996-06-25 Jean-Pierre Adoul Dynamic codebook for efficient speech coding based on algebraic codes
US5495555A (en) * 1992-06-01 1996-02-27 Hughes Aircraft Company High quality low bit rate celp-based speech codec
FR2700632B1 (en) * 1993-01-21 1995-03-24 France Telecom Predictive coding-decoding system for a digital speech signal by adaptive transform with embedded codes
US5924062A (en) * 1997-07-01 1999-07-13 Nokia Mobile Phones ACLEP codec with modified autocorrelation matrix storage and search
US6704703B2 (en) * 2000-02-04 2004-03-09 Scansoft, Inc. Recursively excited linear prediction speech coder
US7103537B2 (en) * 2000-10-13 2006-09-05 Science Applications International Corporation System and method for linear prediction
KR100464369B1 (en) * 2001-05-23 2005-01-03 삼성전자주식회사 Excitation codebook search method for a speech coding system
US6766289B2 (en) * 2001-06-04 2004-07-20 Qualcomm Incorporated Fast code-vector searching
DE10140507A1 (en) * 2001-08-17 2003-02-27 Philips Corp Intellectual Pty Method for the algebraic codebook search of a speech signal encoder
US7003461B2 (en) * 2002-07-09 2006-02-21 Renesas Technology Corporation Method and apparatus for an adaptive codebook search in a speech processing system
US7797156B2 (en) * 2005-02-15 2010-09-14 Raytheon Bbn Technologies Corp. Speech analyzing system with adaptive noise codebook
CN101401153B (en) * 2006-02-22 2011-11-16 法国电信公司 Improved coding/decoding of a digital audio signal in CELP technique
US8566106B2 (en) * 2007-09-11 2013-10-22 Voiceage Corporation Method and device for fast algebraic codebook search in speech and audio coding
US20100011041A1 (en) * 2008-07-11 2010-01-14 James Vannucci Device and method for determining signals
US20100153100A1 (en) * 2008-12-11 2010-06-17 Electronics And Telecommunications Research Institute Address generator for searching algebraic codebook
US9112591B2 (en) 2010-04-16 2015-08-18 Samsung Electronics Co., Ltd. Apparatus for encoding/decoding multichannel signal and method thereof
WO2014053261A1 (en) * 2012-10-05 2014-04-10 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
EP2916319A1 (en) * 2014-03-07 2015-09-09 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Concept for encoding of information
EP2919232A1 (en) * 2014-03-14 2015-09-16 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder and method for encoding and decoding

Patent Citations (26)

Publication number Priority date Publication date Assignee Title
US4815135A (en) * 1984-07-10 1989-03-21 Nec Corporation Speech signal processor
US4868867A (en) * 1987-04-06 1989-09-19 Voicecraft Inc. Vector excitation speech or audio coder for transmission or storage
US5265167A (en) 1989-04-25 1993-11-23 Kabushiki Kaisha Toshiba Speech coding and decoding apparatus
JPH0720896A (en) 1993-07-05 1995-01-24 Nippon Telegr & Teleph Corp <Ntt> Excitation signal coding method for speech
US5854998A (en) * 1994-04-29 1998-12-29 Audiocodes Ltd. Speech processing system quantizer of single-gain pulse excitation in speech coder
JPH10502191A (en) 1995-01-06 1998-02-24 フランス テレコム Algebraic code-excited linear prediction speech coding method
US5717825A (en) * 1995-01-06 1998-02-10 France Telecom Algebraic code-excited linear prediction speech coding method
US5963898A (en) * 1995-01-06 1999-10-05 Matra Communications Analysis-by-synthesis speech coding method with truncation of the impulse response of a perceptual weighting filter
US5751901A (en) 1996-07-31 1998-05-12 Qualcomm Incorporated Method for searching an excitation codebook in a code excited linear prediction (CELP) coder
JP2000515998A (en) 1996-07-31 2000-11-28 クゥアルコム・インコーポレイテッド Method and apparatus for searching an excitation codebook in a code excited linear prediction (CELP) coder
WO1998005030A1 (en) 1996-07-31 1998-02-05 Qualcomm Incorporated Method and apparatus for searching an excitation codebook in a code excited linear prediction (CELP) coder
US6226604B1 (en) * 1996-08-02 2001-05-01 Matsushita Electric Industrial Co., Ltd. Voice encoder, voice decoder, recording medium on which program for realizing voice encoding/decoding is recorded and mobile communication apparatus
US8036887B2 (en) * 1996-11-07 2011-10-11 Panasonic Corporation CELP speech decoder modifying an input vector with a fixed waveform to transform a waveform of the input vector
US6055496A (en) * 1997-03-19 2000-04-25 Nokia Mobile Phones, Ltd. Vector quantization in celp speech coder
KR20000074365A (en) 1999-05-20 2000-12-15 윤종용 Method for searching algebraic codes in an algebraic codebook during speech coding
US20020153891A1 (en) * 1999-07-06 2002-10-24 Smith John Alec Sydney Methods of and apparatus for analysing a signal
US20040101048A1 (en) * 2002-11-14 2004-05-27 Paris Alan T Signal processing of multi-channel data
US20090281798A1 (en) * 2005-05-25 2009-11-12 Koninklijke Philips Electronics, N.V. Predictive encoding of a multi channel signal
EP1833047A1 (en) 2006-03-10 2007-09-12 Matsushita Electric Industrial Co., Ltd. Fixed codebook searching apparatus and fixed codebook searching method
RU2010151983A (en) 2008-06-19 2012-06-27 Панасоник Корпорейшн (Jp) Quantizer, encoder, and methods thereof
RU2486609C2 (en) 2008-06-19 2013-06-27 Панасоник Корпорейшн Quantizer, encoder, and methods thereof
US20100014692A1 (en) * 2008-07-17 2010-01-21 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for generating audio output signals using object based metadata
US20110313777A1 (en) * 2009-01-21 2011-12-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus, method and computer program for obtaining a parameter describing a variation of a signal characteristic of a signal
US20110002263A1 (en) * 2009-07-06 2011-01-06 Yuan Zhu Beamforming using base and differential codebooks
WO2011026231A1 (en) 2009-09-02 2011-03-10 Nortel Networks Limited Systems and methods of encoding using a reduced codebook with adaptive resetting
US20160225387A1 (en) * 2013-08-28 2016-08-04 Dolby Laboratories Licensing Corporation Hybrid waveform-coded and parametric-coded speech enhancement

Non-Patent Citations (12)

Title
Backstrom et al, "Vandermonde Factorization of Toeplitz Matrices and Applications in Filtering and Warping," Dec. 2013, In Signal Processing, IEEE Transactions on, vol. 61, No. 24, pp. 6257-6263. *
Chen et al, "Frequency-selective techniques based on singular value decomposition (SVD), total least squares (TLS), and bandpass filtering", 1994, In Proc. SPIE 2296, Advanced Signal Processing: Algorithms, Architectures, and Implementations V, 601, pp. 1-11. *
Delprat et al, "Fractional excitation and other efficient transformed codebooks for CELP coding of speech," 1992, Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on, vol. 1, No., pp. 329-332 vol. 1. *
Demeure et al, "Linear Statistical Models for Stationary Sequences and Related Algorithms for Cholesky Factorization of Toeplitz Matrices" 1987, In IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-35, No. 1, pp. 29-42. *
Kumar, "High Computational Performance in Code Exited Linear Prediction Speech Model Using Faster Codebook Search Techniques," 2007, In Computing: Theory and Applications, 2007. ICCTA '07. International Conference on, Kolkata, 2007, pp. 458-462. *
Moriya, Takehiro, "Improvement of Search of Excited Vector 10.3.1 Correlation, Search of Frequency Domain, Audio Coding", Aggregate Corporation of Electronic Information Communication Society. First Edition, Oct. 20, 1998, pp. 96-99.
Mukherjee, "On some properties of positive definite Toeplitz matrices and their possible applications", 1988, In Linear Algebra Appl 102:211-240. *
Sanchez et al, "Low-delay wideband speech coding using a new frequency domain approach," 1993, In Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on, vol. 2, No., pp. 415-418, vol. 2. *
Srivastava, "Fundamentals of Linear Prediction," 1999, Department for Electrical and Computer Engineering Mississippi State University, pp. 1-13. *
Tismenetsky, Miron. "A decomposition of Toeplitz matrices and optimal circulant preconditioning.", 1991, Linear algebra and its applications 154 (1991): 105-121. *
Trancoso, "An Overview of Different Trends on CELP Coding", 1995, in Speech Recognition and Coding, New Advances and Trends. Edited by Rubio-Ayuso J. and Lopez-Soler J.M., NATO ASI Series, Springer 1995. *
Zhou, "A modified low-bit-rate ACELP speech coder and its implementation", 2003, Thesis Concordia University, pp. 1-98. *

Cited By (2)

Publication number Priority date Publication date Assignee Title
US20160372128A1 (en) * 2014-03-14 2016-12-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoder, decoder and method for encoding and decoding
US10586548B2 (en) * 2014-03-14 2020-03-10 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoder, decoder and method for encoding and decoding

Also Published As

Publication number Publication date
TWI529702B (zh) 2016-04-11
EP3444818A1 (fr) 2019-02-20
CA2979948C (fr) 2019-10-22
BR112015007137B1 (pt) 2021-07-13
US20190115035A1 (en) 2019-04-18
CA2979857A1 (fr) 2014-04-10
CN104854656A (zh) 2015-08-19
US20240321284A1 (en) 2024-09-26
US20220223163A1 (en) 2022-07-14
SG11201502613XA (en) 2015-05-28
CA2887009C (fr) 2019-12-17
EP2904612B1 (fr) 2018-09-19
PL3444818T3 (pl) 2023-08-21
BR112015007137A2 (pt) 2017-07-04
HK1213359A1 (zh) 2016-06-30
FI3444818T3 (fi) 2023-06-22
EP3444818B1 (fr) 2023-04-19
ES2948895T3 (es) 2023-09-21
AR092875A1 (es) 2015-05-06
CN104854656B (zh) 2017-12-19
TR201818834T4 (tr) 2019-01-21
MX347921B (es) 2017-05-17
KR20150070200A (ko) 2015-06-24
CA2979857C (fr) 2019-10-15
PT2904612T (pt) 2018-12-17
EP4213146A1 (fr) 2023-07-19
RU2015116458A (ru) 2016-11-27
AU2013327192A1 (en) 2015-04-30
KR101691549B1 (ko) 2016-12-30
US20180218743A9 (en) 2018-08-02
MY194208A (en) 2022-11-21
PT3444818T (pt) 2023-06-30
ZA201503025B (en) 2016-01-27
WO2014053261A1 (fr) 2014-04-10
JP2015532456A (ja) 2015-11-09
RU2636126C2 (ru) 2017-11-20
CA2979948A1 (fr) 2014-04-10
US20150213810A1 (en) 2015-07-30
ES2701402T3 (es) 2019-02-22
US11264043B2 (en) 2022-03-01
TW201415457A (zh) 2014-04-16
US12002481B2 (en) 2024-06-04
EP2904612A1 (fr) 2015-08-12
AU2013327192B2 (en) 2016-06-09
JP6122961B2 (ja) 2017-04-26
PL2904612T3 (pl) 2019-05-31
MX2015003927A (es) 2015-07-23
CA2887009A1 (fr) 2014-04-10

Similar Documents

Publication Publication Date Title
US12002481B2 (en) Apparatus for encoding a speech signal employing ACELP in the autocorrelation domain
US10586548B2 (en) Encoder, decoder and method for encoding and decoding

Legal Events

Date Code Title Description
AS Assignment

Owner name: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BAECKSTROEM, TOM;MULTRUS, MARKUS;FUCHS, GUILLAUME;AND OTHERS;SIGNING DATES FROM 20150608 TO 20150609;REEL/FRAME:037092/0293

FEPP Fee payment procedure

Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PTGR); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4