High-quality (20 Hz-15 kHz) subband audio coding at reduced bit rate (64 kbit/s) and low delay (5 ms)

Samir Saoudi¹,
Karine Hay¹ &
Laurent Mainard²


Abstract

This paper presents a new low-delay, low-bit-rate method for high-quality digital audio coding, aimed at emerging telecommunications applications such as audioconferencing and videoconferencing. The coder handles generic audio signals (speech, music, accompaniment sounds, ambient sounds, and so on) at a bit rate of 64 kbit/s, with a combined coding and decoding delay of about 5 ms, over the 20-15000 Hz band. The method draws on both speech coding and audio coding concepts: it combines a subband decomposition of the input signal with LD-CELP coding. A psychoacoustic model determines the optimal bit rate to allocate to each subband according to the perceptual properties of human hearing. To meet the bit-rate demands of the psychoacoustic model and to reduce the algorithmic complexity of this coding structure, we introduce a new vector quantization method based on regular point lattices. It quantizes the residual signal inside the LD-CELP coder without an overly complex full search for the best codevector. Objective and subjective tests were carried out on critical audio sequences used by ISO. Formal tests showed that the quality of the proposed coder is comparable to that of the best implementations of MPEG-1 Layer II, with the advantage of a much lower coding and decoding delay (5 ms).
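The abstract does not say which lattice the coder uses, so the following is only a minimal sketch of why lattice quantization avoids a full codebook search. It assumes the D_n lattice (integer vectors whose coordinates sum to an even number) and the well-known rounding-based nearest-point rule described by Conway and Sloane. The function names, the `step` scaling parameter, and the choice of D_n are illustrative assumptions, not the authors' method.

```python
import numpy as np


def nearest_dn(x):
    """Return the point of D_n closest to x.

    D_n is the set of integer vectors with an even coordinate sum.
    (Assumed lattice for illustration; the paper's lattice is not stated.)
    """
    x = np.asarray(x, dtype=float)
    f = np.rint(x)  # nearest point of the integer lattice Z^n
    if int(f.sum()) % 2 == 0:
        return f  # already a D_n point
    # Odd sum: take the coordinate with the largest rounding error
    # and round it the other way, which restores even parity at
    # the smallest possible extra distortion.
    err = x - f
    k = int(np.argmax(np.abs(err)))
    g = f.copy()
    g[k] += 1.0 if err[k] >= 0 else -1.0
    return g


def lattice_quantize(residual, step):
    """Quantize a residual vector on a D_n lattice scaled by `step`.

    `step` is a hypothetical scale factor; a real coder would tie it
    to the bit rate allocated to the subband.
    """
    residual = np.asarray(residual, dtype=float)
    return step * nearest_dn(residual / step)


if __name__ == "__main__":
    r = np.array([0.31, -0.12, 0.05, 0.48])
    print(lattice_quantize(r, step=0.25))
```

The cost of `nearest_dn` is a rounding pass plus one correction, so it grows linearly with the vector dimension, whereas an exhaustive search over a stored codebook grows with the number of codevectors.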

Author information

Authors and Affiliations

ENST-Bretagne, Département SC, BP. 832, F-29285, Brest cedex, France
Samir Saoudi & Karine Hay
France Télécom-CNET/DSM, rue du Clos Courtel, BP. 59, F. 35512, Cesson Sévigné cedex, France
Laurent Mainard

About this article

Cite this article

Saoudi, S., Hay, K. & Mainard, L. Codage audio haute qualité (20 Hz-15 kHz) en sous bandes à débit réduit (64 kbit/s) et à faible retard (5 ms). Ann. Télécommun. 54, 267–280 (1999). https://doi.org/10.1007/BF02995537

Received: 04 September 1998
Accepted: 23 April 1999
Issue Date: May 1999
DOI: https://doi.org/10.1007/BF02995537
