Towards improving ASR robustness for PSN and GSM telephone applications

Document ID: 9121208509955602764
Author: Mokbel C; Mauuary L; Karray L; Jouvet D; Monné J; Simonin J; Bartkova K
Publication year: 1997
Publication venue: Speech Communication

Snippet

In real-life applications, errors in the speech recognition system are mainly due to inefficient detection of speech segments, unreliable rejection of Out-Of-Vocabulary (OOV) words, and insufficient account of noise and transmission channel effects. In this paper, we review a set …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
