WO2004070706A1 - Procede de codage et de decodage audio a debit variable - Google Patents
Method for encoding and decoding audio at a variable rate (Procede de codage et de decodage audio a debit variable)
- Publication number: WO2004070706A1
- Application number: PCT/FR2003/003870
- Authority: WO, WIPO (PCT)
- Prior art keywords: parameters, subset, coding bits, bits, coding
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
Definitions
- The present invention relates to devices for coding and decoding audio signals, intended in particular for applications involving the transmission or storage of digitized and compressed audio signals (speech and/or sounds).
- More particularly, the invention relates to audio coding systems capable of providing various bit rates, also called multi-rate coding systems.
- Such systems are distinguished from fixed-rate coders by their ability to modify the coding rate, possibly during processing, which is particularly suitable for transmission over heterogeneous access networks: IP networks mixing fixed and mobile access, high bit rates (ADSL) and low bit rates (PSTN or GPRS modems), or networks involving terminals of varying capacities (mobile, PC, etc.).
- "Switchable" multi-rate coders are based on a coding structure belonging to one technological family (temporal or frequency coding, for example CELP, sinusoidal or transform coding), in which a bit rate indication is supplied simultaneously to the coder and to the decoder.
- The coder uses this information to select the parts of the algorithm and the tables relevant to the chosen bit rate.
- The decoder operates symmetrically. Numerous switchable multi-rate coding structures have been proposed for audio coding.
- In "hierarchical" coding systems, also called "scalable" systems, the binary data resulting from the coding operation are distributed in successive layers.
- A base layer, also called the "core", is formed of the binary elements that are absolutely necessary for decoding the bit stream, and determines a minimum decoding quality.
- The following layers progressively improve the quality of the signal resulting from the decoding operation: each new layer brings new information which, exploited by the decoder, yields an output signal of increasing quality.
- One particular feature of hierarchical coding is that part of the bit stream can be deleted at any point of the transmission or storage chain without any particular indication having to be given to the coder or the decoder.
- The decoder uses the binary information it receives and produces a signal of corresponding quality.
- Hierarchical coding structures operate from a single type of coder, designed to deliver hierarchical coded information.
- When the additional layers improve the quality of the output signal without modifying the bandwidth, one speaks rather of "embedded coders" (see for example R. D. Iacovo et al., "Embedded CELP Coding for Variable Bit-Rate Between 6.4 and 9.6 kbit/s", Proc. ICASSP 1991, pp. 681-686).
- This type of coder does not allow large differences between the lowest and the highest bit rates offered.
- The hierarchy is often used to gradually increase the bandwidth of the signal: the core provides a baseband signal, for example in the telephone band (300-3400 Hz), and the following layers allow the coding of additional frequency bands (for example, a widened band up to 7 kHz, a HiFi band up to 20 kHz, or intermediate bands).
- Sub-band coders, or coders using a time-frequency transformation as described in the documents "Subband/transform coding using filter bank designs based on time domain aliasing cancellation" by J.P. Princen et al. (Proc. IEEE ICASSP-87, pp. 2161-2164) and "High Quality Audio Transform Coding at 64 kbit/s" by Y. Mahieux et al. (IEEE Trans. Commun., Vol. 42, No. 11, November 1994, pp. 3010-3019), are particularly suitable for such operations.
- Each stage consists of a sub-coder.
- The sub-coder of a given stage can either encode parts of the signal not coded by the preceding stages, or encode the coding residue of the preceding stage; the residue is obtained by subtracting the decoded signal from the original signal.
- CELP and time-frequency transform coding are particularly effective for covering wide ranges of bit rates.
- Each layer corresponds to the encoding of certain parameters, and the granularity of the hierarchical bit stream depends on the bit rate assigned to these parameters. Typically a layer contains a few tens of bits per frame, a signal frame consisting of a certain number of signal samples over a given duration; the example described below considers a frame of 960 samples corresponding to 60 ms of signal.
- Since the bandwidth of the decoded signals can vary with the level of the layers of binary elements received, a change in the line bit rate can produce annoying artifacts during listening.
- The object of the present invention is in particular to propose a multi-rate coding solution which overcomes the drawbacks of existing switchable and hierarchical coding schemes.
- The invention thus provides a method of coding a digital audio signal frame into an output binary sequence, in which a maximum number Nmax of coding bits is defined for a set of parameters which can be calculated from the signal frame, the set being composed of a first and a second subset.
- The proposed method comprises the following steps:
- the parameters of the first subset are calculated, and these parameters are coded on a number N0 of coding bits such that N0 < Nmax;
- the allocation and/or the order of classification of the Nmax - N0 coding bits are determined as a function of the coded parameters of the first subset.
- In response to the indication of a number N of bits of the output binary sequence available for coding said set of parameters, with N0 < N ≤ Nmax, the coding method further comprises the following steps:
- the parameters of the second subset to which the N - N0 coding bits classified first in said order are allocated are selected;
- the selected parameters of the second subset are calculated, and these parameters are coded to produce the N - N0 coding bits classified first.
- The method according to the invention makes it possible to define a multi-rate coding scheme, which operates at least over a range corresponding, for each frame, to a number of bits from N0 to Nmax.
- The number N of bits of the output binary sequence may be strictly less than Nmax.
- The coder then has the notable property that the bit allocation it uses does not refer to the actual output rate of the coder, but to another number, Nmax, agreed with the decoder.
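The selection rule above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `select_parameters`, the `(name, bits)` representation and the rule that a parameter is kept only when all of its bits fit are assumptions. The point it shows is that the ranking and allocation are fixed against Nmax, so a lower N simply keeps a prefix of the same list.

```python
def select_parameters(ordered_alloc, n0, n, nmax):
    """Return the ranked parameters whose bits fall within the first n - n0 bits.

    ordered_alloc: list of (name, bits) pairs, already ranked, whose bit
    counts sum to at most nmax - n0 (allocation is done against nmax, not n).
    """
    if not n0 < n <= nmax:
        raise ValueError("expected N0 < N <= Nmax")
    if sum(bits for _, bits in ordered_alloc) > nmax - n0:
        raise ValueError("allocation exceeds Nmax - N0")
    budget = n - n0
    selected = []
    for name, bits in ordered_alloc:
        if bits > budget:
            break  # assumption: stop at the first parameter that no longer fits
        selected.append((name, bits))
        budget -= bits
    return selected


# Hypothetical ranked allocation over Nmax - N0 = 100 bits.
alloc = [("band_3", 40), ("band_0", 30), ("band_7", 20), ("band_1", 10)]
full = select_parameters(alloc, n0=100, n=200, nmax=200)
reduced = select_parameters(alloc, n0=100, n=175, nmax=200)
```

With these toy numbers, `reduced` is a prefix of `full`: lowering N drops the least important parameters without changing the structure of what remains.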
- Nmax N according to the instantaneous speed available on a transmission channel.
- The output sequence of such a switchable multi-rate coder could be processed by a decoder which does not receive the entire sequence, as long as it is able to recover the structure of the coding bits of the second subset thanks to the knowledge of Nmax.
- When reading N' bits of this content stored at a lower bit rate, the decoder will be able to recover the structure of the coding bits of the second subset as long as N' ≥ N0.
- The order of classification of the coding bits allocated to the parameters of the second subset can be a pre-established order.
- Alternatively, the order of classification of the coding bits allocated to the parameters of the second subset is variable. It can in particular be an order of decreasing importance determined as a function of at least the coded parameters of the first subset.
- The decoder which receives a binary sequence of N' bits for the frame, with N0 ≤ N' ≤ N ≤ Nmax, will then be able to deduce this order from the N0 bits received for the coding of the first subset.
- The allocation of the Nmax - N0 bits to the coding of the parameters of the second subset can be fixed (in this case, the order of classification of these bits depends at least on the coded parameters of the first subset).
- Alternatively, the allocation of the Nmax - N0 bits to the coding of the parameters of the second subset is a function of the coded parameters of the first subset.
- The order of classification of the coding bits allocated to the parameters of the second subset may be determined using at least one psychoacoustic criterion as a function of the coded parameters of the first subset.
- The parameters of the second subset may relate to spectral bands of the signal.
- The method advantageously comprises a step of estimating a spectral envelope of the coded signal from the coded parameters of the first subset, and a step of calculating a frequency masking curve by applying a model of auditory perception.
- The psychoacoustic criterion refers to the level of the estimated spectral envelope with respect to the masking curve in each spectral band.
- The coding bits in the output sequence are ordered so that the N0 coding bits of the first subset precede the N - N0 coding bits of the selected parameters of the second subset, and the respective coding bits of the selected parameters of the second subset appear in the order determined for said coding bits. If the binary sequence is truncated, the most important part is therefore still received.
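To illustrate why this ordering tolerates truncation, the sketch below assembles a toy sequence and cuts its tail. The helper names `build_sequence` and `truncate` and the bit values are hypothetical; only the layout (first-subset bits, then second-subset bits in ranked order) comes from the description above.

```python
def build_sequence(first_subset_bits, ranked_param_bits, n):
    """Concatenate the first-subset bits and the ranked second-subset bits, then keep n bits."""
    sequence = list(first_subset_bits)
    for bits in ranked_param_bits:  # most important parameter first
        sequence.extend(bits)
    return sequence[:n]


def truncate(sequence, n_prime):
    """Model a network node or storage reader that drops the tail of a frame."""
    return sequence[:n_prime]


# Toy frame: N0 = 4 first-subset bits, then three ranked parameters.
first_subset = [1, 0, 1, 1]
ranked_params = [[1, 1, 0], [0, 0], [1]]

full = build_sequence(first_subset, ranked_params, n=10)
received = truncate(full, 7)
```

Here the 7 received bits are exactly the first subset followed by the most important parameter, which is the property the ordering is designed to guarantee.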
- The number N can vary from one frame to another, depending for example on the available capacity of the transmission resource.
- The multi-rate audio coding according to the present invention may be used in a very flexible switchable or hierarchical mode, since any number of bits to be transmitted, chosen freely between N0 and Nmax, can be selected at any time, that is to say frame by frame.
- The coding of the parameters of the first subset can be at variable bit rate, which makes the number N0 vary from one frame to another. This makes it possible to best adjust the distribution of the bits as a function of the frames to be coded.
- The first subset includes parameters calculated by a coding core.
- The coding core has an operating frequency band narrower than the bandwidth of the signal to be coded, and the first subset further comprises energy levels of the audio signal associated with frequency bands above the operating band of the coding core.
- This type of structure is that of a two-level hierarchical coder, which delivers, for example via the coding core, a coded signal of a quality deemed sufficient and which, depending on the available bit rate, supplements the coding performed by the coding core with additional information from the coding method according to the invention.
- The coding bits of the first subset in the output sequence are then ordered so that the coding bits of the parameters calculated by the coding core are immediately followed by the coding bits of the energy levels associated with the higher frequency bands. This ensures the same bandwidth for successively coded frames as long as the decoder receives enough bits to have the information from the coding core and the coded energy levels associated with the higher frequency bands.
- A difference signal is estimated between the signal to be coded and a synthesis signal derived from the coded parameters produced by the coding core, and the first subset further comprises energy levels of the difference signal associated with frequency bands included in the operating band of the coding core.
- A second aspect of the invention relates to a method of decoding an input binary sequence in order to synthesize a digital audio signal, corresponding to the decoding of a frame coded according to the coding method of the invention.
- According to this method, a maximum number Nmax of coding bits is defined for a set of parameters describing a signal frame, the set being composed of a first and a second subset.
- The input sequence comprises, for a signal frame, a number N' of coding bits of the set of parameters, with N' ≤ Nmax.
- The decoding method according to the invention comprises the following steps:
- the allocation and/or the order of classification of the Nmax - N0 coding bits are determined as a function of the parameters recovered from the first subset.
- The decoding method further comprises the following steps:
- the selected parameters of the second subset are recovered on the basis of said N' - N0 coding bits extracted; and
- the signal frame is synthesized using the parameters recovered from the first and second subsets.
- This decoding method is advantageously associated with methods for regenerating the parameters which are lacking due to the truncation of the sequence of Nmax bits produced, virtually or not, by the coder.
- a third aspect of the invention relates to an audio coder, comprising digital signal processing means arranged to implement a coding method according to the invention.
- Another aspect of the invention relates to an audio decoder, comprising digital signal processing means arranged to implement a decoding method according to the invention.
- FIG. 1 is a block diagram of an example of audio coder according to the invention.
- FIG. 2 represents a binary output sequence of N bits in an embodiment of the invention.
- FIG. 3 is a block diagram of an audio decoder according to the invention.
- The coder shown in FIG. 1 has a hierarchical structure with two coding stages.
- A first coding stage 1 consists, for example, of a CELP-type coding core operating in the telephone band (300-3400 Hz).
- In the example considered, this coder is a G.723.1 coder standardized by the ITU-T (International Telecommunication Union), operating in fixed mode at 6.4 kbit/s. It computes the G.723.1 parameters according to the standard and quantizes them using 192 coding bits P1 per 30 ms frame.
- The second coding stage 2, which extends the bandwidth to the widened band (50-7000 Hz), operates on the coding residue E of the first stage, supplied by a subtractor 3 in the diagram of FIG. 1.
- A signal synchronization module 4 delays the audio signal frame S by the time taken by the processing of the coding core 1. Its output is fed to the subtractor 3, which subtracts from it the synthetic signal S', equal to the output of the decoder core operating on the quantized parameters represented by the output bits P1 of the coding core.
- The coder 1 incorporates a local decoder providing S'.
- the audio signal to be coded S has for example a bandwidth of
- a frame consists for example of 960 samples, that is to say 60 ms of signal or two elementary frames of the G.723.1 coding core. As the latter operates on signals sampled at
- The signal S is downsampled by a factor of 2 at the input of the coding core 1.
- The synthetic signal S' is upsampled to 16 kHz at the output of the coding core 1.
- The second stage 2 operates, for example, on elementary frames, or sub-frames, of 20 ms (320 samples at 16 kHz).
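The frame sizes quoted above are mutually consistent, as the following arithmetic check suggests. All figures are taken from the description (16 kHz sampling, 960-sample frames, a 6.4 kbit/s core with 30 ms frames, 20 ms sub-frames); the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 16_000   # sampling rate of the wideband signal
FRAME_SAMPLES = 960       # samples per frame of the two-stage coder
CORE_RATE_BPS = 6_400     # G.723.1 core bit rate
CORE_FRAME_MS = 30        # G.723.1 frame duration
SUBFRAME_MS = 20          # second-stage sub-frame duration

frame_ms = FRAME_SAMPLES * 1000 // SAMPLE_RATE_HZ                  # 60 ms per frame
core_frames_per_frame = frame_ms // CORE_FRAME_MS                  # 2 core frames
core_bits_per_core_frame = CORE_RATE_BPS * CORE_FRAME_MS // 1000   # 192 bits
core_bits_per_frame = core_frames_per_frame * core_bits_per_core_frame  # 384 bits
subframe_samples = SAMPLE_RATE_HZ * SUBFRAME_MS // 1000            # 320 samples
subframes_per_frame = frame_ms // SUBFRAME_MS                      # 3 sub-frames
```

The 384 core bits per 60 ms frame correspond to the 2 x 192 bits P1 of two G.723.1 frames, and each frame holds three 20 ms MDCT sub-frames.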
- The second stage 2 comprises a time-frequency transformation module 5, for example of the MDCT ("Modified Discrete Cosine Transform") type, to which the residue E obtained by the subtractor 3 is fed.
- The operation of modules 3 and 5 shown in FIG. 1 can be carried out by performing the following operations for each 20 ms sub-frame:
- MDCT transformation of the input signal S delayed by module 4, which provides 320 MDCT coefficients. The spectrum being limited to 7225 Hz, only the first 289 MDCT coefficients are non-zero;
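The figure of 289 non-zero coefficients follows from the coefficient spacing, assuming the 320 MDCT coefficients of a sub-frame uniformly cover the range from 0 Hz to half the 16 kHz sampling rate (an assumption stated here for the check, although it is the usual MDCT layout):

```python
SAMPLE_RATE_HZ = 16_000
COEFFS_PER_SUBFRAME = 320
CUTOFF_HZ = 7_225

# Assumption: coefficients are evenly spaced between 0 Hz and the Nyquist frequency.
bin_width_hz = (SAMPLE_RATE_HZ / 2) / COEFFS_PER_SUBFRAME   # 25 Hz per coefficient
nonzero_coeffs = int(CUTOFF_HZ / bin_width_hz)              # coefficients below the cutoff
```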
- The resulting spectrum is divided into several bands of different widths by a module 6.
- For example, the bandwidth of the G.723.1 coder can be subdivided into 21 bands, while the higher frequencies are distributed into 11 additional bands.
- In these 11 higher bands, the residue E is identical to the input signal S.
- The quantized scale factors are denoted FQ in FIG. 1.
- The difference Nmax - N0 = 1536 - N2(1) - N2(2) - N2(3) is available to quantize the spectra of the bands more finely.
- A module 8 normalizes the MDCT coefficients distributed into bands by module 6, dividing them by the quantized scale factors FQ determined for these bands respectively.
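As a rough sketch of this normalization and of the corresponding rescaling at the decoder, the code below divides each band by its scale factor and multiplies it back. The function names, the `(start, stop)` band-edge representation and the sample values are hypothetical; non-zero scale factors are assumed.

```python
def normalize_bands(coeffs, band_edges, scale_factors):
    """Split coeffs into bands and divide each band by its quantized scale factor."""
    normalized = []
    for (start, stop), fq in zip(band_edges, scale_factors):
        normalized.append([c / fq for c in coeffs[start:stop]])
    return normalized


def denormalize_bands(normalized, scale_factors):
    """Decoder-side counterpart: multiply each band by its scale factor."""
    return [[c * fq for c in band] for band, fq in zip(normalized, scale_factors)]


# Toy spectrum with two bands of two coefficients each.
coeffs = [2.0, 4.0, 9.0, 3.0]
edges = [(0, 2), (2, 4)]
fq = [2.0, 3.0]

normalized = normalize_bands(coeffs, edges, fq)
restored = denormalize_bands(normalized, fq)
```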
- the spectra thus normalized are supplied to the quantization module 9 which uses a vector quantization scheme of known type.
- the quantization bits from module 9 are denoted P3 in FIG. 1.
- An output multiplexer 10 collects the bits P1, P2 and P3 from modules 1, 7 and 9 to form the binary sequence Φ at the coder output.
- The total number N of bits of the output sequence representing a current frame is not necessarily equal to Nmax; it can be lower.
- The allocation of the quantization bits to the bands is nevertheless carried out on the basis of the number Nmax.
- This allocation is carried out for each sub-frame by module 12 from the number Nmax - N0, the quantized scale factors FQ and a spectral masking curve calculated by a module 11.
- Module 11 determines an approximate value of the original spectral envelope of the signal S from that of the difference signal, as quantized by module 7, and that which it determines with the same resolution for the synthetic signal S' resulting from the coding core. These last two envelopes can also be determined by a decoder which has only the parameters of the aforementioned first subset. The estimated spectral envelope of the signal S is thus also available at the decoder.
- Module 11 calculates a spectral masking curve by applying, in a manner known per se, a band-by-band auditory perception model to the estimated original spectral envelope. This curve gives a masking level for each band considered.
- Module 12 performs a dynamic allocation of the Nmax - N0 remaining bits of the sequence Φ among the 3 x 32 bands of the three MDCT transformations of the difference signal. In the implementation of the invention presented here, according to a criterion of psychoacoustic perceptual importance referring to the level of the estimated spectral envelope with respect to the masking curve in each band, a number of bits proportional to this level is allocated to each band. Other classification criteria could be used.
- Module 9 thus knows how many bits are to be considered for the quantization of each band in each sub-frame.
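A proportional allocation of this kind could look like the sketch below. The largest-remainder rounding used to obtain integer bit counts is an assumption made for illustration; the description only states that the allocation is proportional to the level of the envelope relative to the masking curve. The name `allocate_bits` is hypothetical.

```python
def allocate_bits(levels, total_bits):
    """Split total_bits across bands in proportion to non-negative perceptual levels."""
    total_level = sum(levels)
    if total_level <= 0:
        return [0] * len(levels)
    exact = [total_bits * level / total_level for level in levels]
    alloc = [int(x) for x in exact]  # round down first
    leftover = total_bits - sum(alloc)
    # Assumption: hand the remaining bits to the largest fractional parts.
    by_fraction = sorted(range(len(levels)),
                         key=lambda i: exact[i] - alloc[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc


# Toy example: three bands sharing 20 bits, then three equal bands sharing 10.
simple = allocate_bits([6.0, 3.0, 1.0], 20)
uneven = allocate_bits([1.0, 1.0, 1.0], 10)
```

Because the inputs are available to both ends (the levels derive from the first subset, and the budget is Nmax - N0), a decoder running the same function would obtain the same allocation.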
- The bits representing the bands are then ordered by a module 13 according to a criterion of perceptual importance.
- Module 13 ranks the 3 x 32 bands in an order of decreasing importance, which may be the decreasing order of the signal-to-mask ratios (the ratio between the estimated spectral envelope and the masking curve in each band). This order is used for the construction of the binary sequence Φ in accordance with the invention.
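The ranking performed by module 13 can be illustrated as follows, assuming linear-scale envelope and masking values per band (the function name `rank_bands` and the sample numbers are hypothetical):

```python
def rank_bands(envelope, masking):
    """Return band indices sorted by decreasing signal-to-mask ratio."""
    smr = [e / m for e, m in zip(envelope, masking)]
    return sorted(range(len(smr)), key=lambda i: smr[i], reverse=True)


# Toy values for three bands: ratios are 2.0, 9.0 and 0.5.
envelope = [4.0, 9.0, 1.0]
masking = [2.0, 1.0, 2.0]
order = rank_bands(envelope, masking)
```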
- The bands to be quantized by module 9 are determined by selecting the bands ranked first by module 13 and retaining, for each selected band, the number of bits determined by module 12.
- The MDCT coefficients of each selected band are quantized by module 9, for example using a vector quantizer, in accordance with the number of bits allocated, to produce a total number of bits equal to N - N0.
- The above coding method allows the frame to be decoded if the decoder receives N' bits with N0 ≤ N' ≤ N. This number N' will generally vary from one frame to another.
- a decoder according to the invention is illustrated in FIG. 3.
- A demultiplexer 20 separates the received bit sequence Φ' to extract the coding bits P1 and P2 from it.
- The 384 bits P1 are supplied to the decoder core 21, of G.723.1 type, so that it synthesizes two frames of the base signal S' in the telephone band.
- The bits P2 are decoded according to the Huffman algorithm by a module 22, which thus recovers the quantized scale factors FQ for each of the 3 sub-frames.
- A module 23 for calculating the masking curve receives the base signal S' and the quantized scale factors FQ and produces the spectral masking levels for each of the 96 bands. From these masking levels, the quantized scale factors FQ and the knowledge of the number Nmax (as well as that of the number N0, which is deduced from the Huffman decoding of the bits P2 by module 22), a module 24 determines an allocation of bits in the same way as module 12 of FIG. 1. In addition, a module 25 orders the bands according to the same classification criterion as module 13 described with reference to FIG. 1.
- The normalized MDCT coefficients relating to the missing bands can also be synthesized by interpolation or extrapolation as described below (module 27). These missing bands may have been eliminated by the coder due to a truncation at N < Nmax, or they may have been eliminated during transmission (N' < N).
- The normalized MDCT coefficients, synthesized by module 26 and/or module 27, are multiplied by their respective quantized scale factors (multiplier 28) before being presented to module 29, which performs the frequency-time transformation that is the inverse of the MDCT transformation performed by module 5 of the coder.
- The resulting correction time signal is added to the synthetic signal S' delivered by the decoder core 21.
- The decoder can synthesize a signal S even in cases where it does not receive the first N0 bits of the sequence.
- The decoder performs three MDCT analyses followed by three MDCT syntheses, allowing the memories of the MDCT transformation to be updated.
- The output signal then contains a telephone-band quality signal. If not even the first 2 x N1 bits are received, the decoder considers the corresponding frame as erased and can use a known algorithm for concealing erased frames.
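The bit layout implies a simple decision on the number of bits N' actually received for a frame. The sketch below is one hypothetical way to express it; the function name and the returned labels are illustrative, and the behaviour between 2 x N1 and N0 bits is summarized rather than specified.

```python
def classify_frame(n_received, n1, n0):
    """Label a frame by how much of the first subset has arrived.

    n1: bits per core frame (two core frames per signal frame).
    n0: total bits of the first subset for this frame.
    """
    core_bits = 2 * n1
    if n_received < core_bits:
        return "erased"                    # core incomplete: conceal the frame
    if n_received < n0:
        return "core_with_partial_envelope"  # telephone band at least
    return "first_subset_complete"         # allocation and ordering can be rebuilt


# Toy values: N1 = 192 as in the example, N0 = 500 chosen arbitrarily.
labels = [classify_frame(n, n1=192, n0=500) for n in (300, 384, 500)]
```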
- The decoder can start to synthesize a wideband signal. It can in particular proceed as follows.
- Module 22 recovers the parts of the three spectral envelopes received.
- The spectral envelope is corrected to regularize it by avoiding holes due to the bands not received: the zero values in the upper part of the spectral envelopes FQ are, for example, replaced by one hundredth of the value of the masking curve calculated previously, so that they remain inaudible.
- The full spectrum of the low bands and the spectral envelope of the high bands are known at this point.
- Module 27 then generates the high spectrum.
- The fine structure of these bands is generated by reflecting the fine structure of their known neighborhood, before weighting by the scale factors (multipliers 28).
- The "known neighborhood" corresponds to the spectrum of the signal S' produced by the decoder core.
- The synthesized signal is thereby obtained in the widened band.
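The envelope correction described in the steps above might be sketched like this. The function name, the band indexing and the `first_high_band` parameter are assumptions; only the replacement of zero values by one hundredth of the masking level comes from the text.

```python
def fill_envelope_holes(scale_factors, masking, first_high_band):
    """Replace zero scale factors in the upper bands by 1/100 of the masking level."""
    fixed = list(scale_factors)
    for band in range(first_high_band, len(fixed)):
        if fixed[band] == 0:
            fixed[band] = masking[band] / 100.0  # kept well below the masking level
    return fixed


# Toy envelope with four bands; bands 2 and 3 form the "upper part".
scale_factors = [5.0, 0.0, 3.0, 0.0]
masking = [10.0, 20.0, 30.0, 40.0]
corrected = fill_envelope_holes(scale_factors, masking, first_high_band=2)
```

Only the zero in the upper part is replaced; the zero in band 1 is left alone because it lies below `first_high_band`.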
- If the decoder also receives at least part of the low spectral envelope of the difference signal (part c), it may or may not take this information into account to refine the spectral envelope in step 3.
- Module 26 recovers some of the normalized MDCT coefficients according to the allocation and the ordering indicated by modules 24 and 25. These MDCT coefficients therefore do not need to be interpolated as in step 5 above.
- The process of steps 1 to 6 is applicable by module 27 in the same manner as previously, the knowledge of the MDCT coefficients received for certain bands allowing a more reliable interpolation in step 5.
- Unreceived bands may vary from one MDCT sub-frame to the next.
- The "known neighborhood" of a missing band can correspond to the same band in another sub-frame where it is not missing, and/or to one or more of the closest bands in the frequency domain during the same sub-frame. It is also possible to regenerate a missing MDCT spectrum in a band for a sub-frame by making a weighted sum of contributions evaluated from several bands/sub-frames of the "known neighborhood".
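A weighted sum of this kind can be sketched as below. The choice of weights is not specified in the text, so the equal weights in the example are purely illustrative, and `regenerate_band` is a hypothetical helper name. All neighbouring bands are assumed to have the same number of coefficients.

```python
def regenerate_band(neighbours, weights):
    """Estimate a missing band as a weighted sum of known normalized MDCT bands."""
    length = len(neighbours[0])
    estimate = [0.0] * length
    for band, weight in zip(neighbours, weights):
        for k in range(length):
            estimate[k] += weight * band[k]
    return estimate


# Toy neighbourhood: the same band in another sub-frame and an adjacent band.
same_band_other_subframe = [1.0, -1.0]
adjacent_band = [3.0, 1.0]
estimate = regenerate_band([same_band_other_subframe, adjacent_band], [0.5, 0.5])
```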
- the last coded parameter transmitted can, depending on the case, be transmitted completely or partially.
- Two cases can then arise: either the coding structure adopted makes it possible to use the partial information received (case of scalar quantifiers, or vector quantization with partitioned dictionaries),
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
Priority Applications (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP03799688A EP1581930B1 (fr) | 2003-01-08 | 2003-12-22 | Procede de codage et de decodage audio a debit variable |
CA2512179A CA2512179C (fr) | 2003-01-08 | 2003-12-22 | Procede de codage et de decodage audio a debit variable |
US10/541,340 US7457742B2 (en) | 2003-01-08 | 2003-12-22 | Variable rate audio encoder via scalable coding and enhancement layers and appertaining method |
MXPA05007356A MXPA05007356A (es) | 2003-01-08 | 2003-12-22 | Metodo para codificar y descodificar audio a una velocidad variable. |
CN2003801084396A CN1735928B (zh) | 2003-01-08 | 2003-12-22 | 用于可变速率音频编解码的方法 |
BR0317954-0A BR0317954A (pt) | 2003-01-08 | 2003-12-22 | Processo de codificação e decodificação áudio com taxa variável |
DE60319590T DE60319590T2 (de) | 2003-01-08 | 2003-12-22 | Verfahren zur codierung und decodierung von audio mit variabler rate |
AU2003299395A AU2003299395B2 (en) | 2003-01-08 | 2003-12-22 | Method for encoding and decoding audio at a variable rate |
JP2004567790A JP4390208B2 (ja) | 2003-01-08 | 2003-12-22 | 音声を可変レートで符号化および復号する方法 |
KR1020057012791A KR101061404B1 (ko) | 2003-01-08 | 2003-12-22 | 가변 레이트로 오디오를 인코딩 및 디코딩하는 방법 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0300164A FR2849727B1 (fr) | 2003-01-08 | 2003-01-08 | Procede de codage et de decodage audio a debit variable |
FR03/00164 | 2003-01-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004070706A1 (fr) | 2004-08-19 |
Family
ID=32524763
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FR2003/003870 WO2004070706A1 (fr) | 2003-01-08 | 2003-12-22 | Procede de codage et de decodage audio a debit variable |
Country Status (15)
Country | Link |
---|---|
US (1) | US7457742B2 (fr) |
EP (1) | EP1581930B1 (fr) |
JP (1) | JP4390208B2 (fr) |
KR (1) | KR101061404B1 (fr) |
CN (1) | CN1735928B (fr) |
AT (1) | ATE388466T1 (fr) |
AU (1) | AU2003299395B2 (fr) |
BR (1) | BR0317954A (fr) |
CA (1) | CA2512179C (fr) |
DE (1) | DE60319590T2 (fr) |
ES (1) | ES2302530T3 (fr) |
FR (1) | FR2849727B1 (fr) |
MX (1) | MXPA05007356A (fr) |
WO (1) | WO2004070706A1 (fr) |
ZA (1) | ZA200505257B (fr) |
Families Citing this family (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7921007B2 (en) * | 2004-08-17 | 2011-04-05 | Koninklijke Philips Electronics N.V. | Scalable audio coding |
US7930173B2 (en) | 2006-06-19 | 2011-04-19 | Sharp Kabushiki Kaisha | Signal processing method, signal processing apparatus and recording medium |
JP4827661B2 (ja) * | 2006-08-30 | 2011-11-30 | 富士通株式会社 | 信号処理方法及び装置 |
US20080243518A1 (en) * | 2006-11-16 | 2008-10-02 | Alexey Oraevsky | System And Method For Compressing And Reconstructing Audio Files |
EP1927981B1 (fr) * | 2006-12-01 | 2013-02-20 | Nuance Communications, Inc. | Affinement spectral de signaux audio |
JP4871894B2 (ja) * | 2007-03-02 | 2012-02-08 | パナソニック株式会社 | 符号化装置、復号装置、符号化方法および復号方法 |
JP4708446B2 (ja) * | 2007-03-02 | 2011-06-22 | パナソニック株式会社 | 符号化装置、復号装置およびそれらの方法 |
US7925783B2 (en) * | 2007-05-23 | 2011-04-12 | Microsoft Corporation | Transparent envelope for XML messages |
KR101290622B1 (ko) * | 2007-11-02 | 2013-07-29 | 후아웨이 테크놀러지 컴퍼니 리미티드 | 오디오 복호화 방법 및 장치 |
JP5520967B2 (ja) * | 2009-02-16 | 2014-06-11 | エレクトロニクス アンド テレコミュニケーションズ リサーチ インスチチュート | 適応的正弦波コーディングを用いるオーディオ信号の符号化及び復号化方法及び装置 |
EP2249333B1 (fr) * | 2009-05-06 | 2014-08-27 | Nuance Communications, Inc. | Procédé et appareil d'évaluation d'une fréquence fondamentale d'un signal vocal |
FR2947944A1 (fr) * | 2009-07-07 | 2011-01-14 | France Telecom | Codage/decodage perfectionne de signaux audionumeriques |
FR2947945A1 (fr) * | 2009-07-07 | 2011-01-14 | France Telecom | Allocation de bits dans un codage/decodage d'amelioration d'un codage/decodage hierarchique de signaux audionumeriques |
EP2490216B1 (fr) * | 2009-10-14 | 2019-04-24 | III Holdings 12, LLC | Codage de la parole par couches |
US20120029926A1 (en) | 2010-07-30 | 2012-02-02 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for dependent-mode coding of audio signals |
US9208792B2 (en) | 2010-08-17 | 2015-12-08 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for noise injection |
CN101950562A (zh) * | 2010-11-03 | 2011-01-19 | 武汉大学 | 基于音频关注度的分级编码方法及系统 |
NO2669468T3 (fr) * | 2011-05-11 | 2018-06-02 | ||
TWI606441B (zh) * | 2011-05-13 | 2017-11-21 | 三星電子股份有限公司 | 解碼裝置 |
WO2013142650A1 (fr) | 2012-03-23 | 2013-09-26 | Dolby International Ab | Diversité de taux d'échantillonnage dans un système de communication vocale |
ES2827278T3 (es) | 2014-04-17 | 2021-05-20 | Voiceage Corp | Método, dispositivo y memoria no transitoria legible por ordenador para codificación y decodificación predictiva linealde señales sonoras en la transición entre tramas que tienen diferentes tasas de muestreo |
CN106992786B (zh) * | 2017-03-21 | 2020-07-07 | 深圳三星通信技术研究有限公司 | 一种基带数据压缩方法、装置和系统 |
KR102258814B1 (ko) * | 2018-10-04 | 2021-07-14 | 주식회사 엘지에너지솔루션 | Bms 간 통신 시스템 및 방법 |
KR102352240B1 (ko) * | 2020-02-14 | 2022-01-17 | 국방과학연구소 | Amr 음성데이터의 압축포맷정보를 추정하는 방법 및 그 장치 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB8421498D0 (en) * | 1984-08-24 | 1984-09-26 | British Telecomm | Frequency domain speech coding |
DE19706516C1 (de) * | 1997-02-19 | 1998-01-15 | Fraunhofer Ges Forschung | Verfahren und Vorrichtungen zum Codieren von diskreten Signalen bzw. zum Decodieren von codierten diskreten Signalen |
US6016111A (en) * | 1997-07-31 | 2000-01-18 | Samsung Electronics Co., Ltd. | Digital data coding/decoding method and apparatus |
FR2813722B1 (fr) | 2000-09-05 | 2003-01-24 | France Telecom | Procede et dispositif de dissimulation d'erreurs et systeme de transmission comportant un tel dispositif |
US7620545B2 (en) * | 2003-07-08 | 2009-11-17 | Industrial Technology Research Institute | Scale factor based bit shifting in fine granularity scalability audio coding |
2003
- 2003-01-08 FR FR0300164A patent/FR2849727B1/fr not_active Expired - Fee Related
- 2003-12-22 AT AT03799688T patent/ATE388466T1/de not_active IP Right Cessation
- 2003-12-22 US US10/541,340 patent/US7457742B2/en active Active
- 2003-12-22 DE DE60319590T patent/DE60319590T2/de not_active Expired - Lifetime
- 2003-12-22 ZA ZA200505257A patent/ZA200505257B/en unknown
- 2003-12-22 BR BR0317954-0A patent/BR0317954A/pt not_active IP Right Cessation
- 2003-12-22 EP EP03799688A patent/EP1581930B1/fr not_active Expired - Lifetime
- 2003-12-22 ES ES03799688T patent/ES2302530T3/es not_active Expired - Lifetime
- 2003-12-22 MX MXPA05007356A patent/MXPA05007356A/es active IP Right Grant
- 2003-12-22 CA CA2512179A patent/CA2512179C/fr not_active Expired - Lifetime
- 2003-12-22 CN CN2003801084396A patent/CN1735928B/zh not_active Expired - Lifetime
- 2003-12-22 WO PCT/FR2003/003870 patent/WO2004070706A1/fr active IP Right Grant
- 2003-12-22 AU AU2003299395A patent/AU2003299395B2/en not_active Expired
- 2003-12-22 KR KR1020057012791A patent/KR101061404B1/ko active IP Right Grant
- 2003-12-22 JP JP2004567790A patent/JP4390208B2/ja not_active Expired - Lifetime
Non-Patent Citations (2)
Title |
---|
CHRISTOPH ERDMANN ET AL: "A candidate proposal for a 3GPP adaptive multi-rate wideband speech codec", INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP '01, vol. 2, 7 May 2001 (2001-05-07) - 11 May 2001 (2001-05-11), SALT LAKE CITY, pages 757 - 760, XP002253623 * |
YE SHEN ET AL: "A progressive algorithm for perceptual coding of digital audio signals", SIGNALS, SYSTEMS, AND COMPUTERS, 1999. CONFERENCE RECORD OF THE THIRTY-THIRD ASILOMAR CONFERENCE ON OCT. 24-27, 1999, PISCATAWAY, NJ, USA, IEEE, US, 24 October 1999 (1999-10-24), pages 1105 - 1109, XP010373807, ISBN: 0-7803-5700-0 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4859670B2 (ja) * | 2004-10-27 | 2012-01-25 | パナソニック株式会社 | 音声符号化装置および音声符号化方法 |
WO2007055507A1 (fr) * | 2005-11-08 | 2007-05-18 | Samsung Electronics Co., Ltd. | Dispositifs et procedes permettant de codage et de decodage audio adapte au temps et a la frequence |
US8548801B2 (en) | 2005-11-08 | 2013-10-01 | Samsung Electronics Co., Ltd | Adaptive time/frequency-based audio encoding and decoding apparatuses and methods |
US8862463B2 (en) | 2005-11-08 | 2014-10-14 | Samsung Electronics Co., Ltd | Adaptive time/frequency-based audio encoding and decoding apparatuses and methods |
JP5173795B2 (ja) * | 2006-03-17 | 2013-04-03 | パナソニック株式会社 | スケーラブル符号化装置およびスケーラブル符号化方法 |
Also Published As
Publication number | Publication date |
---|---|
CA2512179C (fr) | 2013-04-16 |
US20060036435A1 (en) | 2006-02-16 |
CN1735928B (zh) | 2010-05-12 |
CA2512179A1 (fr) | 2004-08-19 |
ZA200505257B (en) | 2006-09-27 |
ES2302530T3 (es) | 2008-07-16 |
US7457742B2 (en) | 2008-11-25 |
DE60319590D1 (de) | 2008-04-17 |
FR2849727B1 (fr) | 2005-03-18 |
BR0317954A (pt) | 2005-11-29 |
AU2003299395A1 (en) | 2004-08-30 |
CN1735928A (zh) | 2006-02-15 |
JP4390208B2 (ja) | 2009-12-24 |
KR20050092107A (ko) | 2005-09-20 |
EP1581930A1 (fr) | 2005-10-05 |
ATE388466T1 (de) | 2008-03-15 |
JP2006513457A (ja) | 2006-04-20 |
MXPA05007356A (es) | 2005-09-30 |
DE60319590T2 (de) | 2009-03-26 |
EP1581930B1 (fr) | 2008-03-05 |
AU2003299395B2 (en) | 2010-03-04 |
FR2849727A1 (fr) | 2004-07-09 |
KR101061404B1 (ko) | 2011-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1581930B1 (fr) | Procede de codage et de decodage audio a debit variable | |
EP2277172B1 (fr) | Dissimulation d'erreur de transmission dans un signal audionumerique dans une structure de decodage hierarchique | |
EP1905010B1 (fr) | Codage/décodage audio hiérarchique | |
EP1808684B1 (fr) | Appareil de decodage modulable | |
CA2347735C (fr) | Procede de recuperation du contenu a haute frequence et dispositif pour signal a large bande synthetise sur-echantillonne | |
CA2766864C (fr) | Codage/decodage perfectionne de signaux audionumeriques | |
EP1047045A2 (fr) | Matériel sain et méthode de synthesisation | |
EP0111612A1 (fr) | Procédé et dispositif de codage d'un signal vocal | |
EP2452337A1 (fr) | Allocation de bits dans un codage/décodage d'amélioration d'un codage/décodage hiérarchique de signaux audionumériques | |
CN113470667A (zh) | 语音信号的编解码方法、装置、电子设备及存储介质 | |
EP1692689A1 (fr) | Procede de codage multiple optimise | |
FR2875351A1 (fr) | Procede de traitement de donnees par passage entre domaines differents de sous-bandes | |
EP3175443B1 (fr) | Détermination d'un budget de codage d'une trame de transition lpd/fd | |
EP2652735B1 (fr) | Codage perfectionne d'un etage d'amelioration dans un codeur hierarchique | |
JP2006126826A (ja) | オーディオ信号符号化/復号化方法及びその装置 | |
EP2203915B1 (fr) | Dissimulation d'erreur de transmission dans un signal numerique avec repartition de la complexite | |
WO2005045808A1 (fr) | Ponderation du bruit d'une harmonique dans des codeurs vocaux numeriques | |
FR2737360A1 (fr) | Procedes de codage et de decodage de signaux audiofrequence, codeur et decodeur pour la mise en oeuvre de tels procedes | |
EP1192619A1 (fr) | Codage et decodage audio par interpolation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | ||
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | ||
WWE | WIPO information: entry into national phase | Ref document number: 2003799688 Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 1174/KOLNP/2005 Country of ref document: IN |
WWE | WIPO information: entry into national phase | Ref document number: 2005/05257 Country of ref document: ZA Ref document number: 2512179 Country of ref document: CA Ref document number: 200505257 Country of ref document: ZA |
WWE | WIPO information: entry into national phase | Ref document number: 2003299395 Country of ref document: AU |
ENP | Entry into the national phase | Ref document number: 2006036435 Country of ref document: US Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 10541340 Country of ref document: US |
WWE | WIPO information: entry into national phase | Ref document number: PA/A/2005/007356 Country of ref document: MX |
WWE | WIPO information: entry into national phase | Ref document number: 20038A84396 Country of ref document: CN |
WWE | WIPO information: entry into national phase | Ref document number: 2004567790 Country of ref document: JP Ref document number: 1020057012791 Country of ref document: KR |
WWP | WIPO information: published in national office | Ref document number: 1020057012791 Country of ref document: KR |
WWP | WIPO information: published in national office | Ref document number: 2003799688 Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: PI0317954 Country of ref document: BR |
WWP | WIPO information: published in national office | Ref document number: 10541340 Country of ref document: US |
WWG | WIPO information: grant in national office | Ref document number: 2003799688 Country of ref document: EP |