Mokbel et al., 1997 - Google Patents
Towards improving ASR robustness for PSN and GSM telephone applicationsMokbel et al., 1997
View PDF- Document ID
- 9121208509955602764
- Author
- Mokbel C
- Mauuary L
- Karray L
- Jouvet D
- Monné J
- Simonin J
- Bartkova K
- Publication year
- Publication venue
- Speech communication
External Links
Snippet
In real-life applications, errors in the speech recognition system are mainly due to inefficient detection of speech segments, unreliable rejection of Out-Of-Vocabulary (OOV) words, and insufficient account of noise and transmission channel effects. In this paper, we review a set …
- 230000004301 light adaptation 0 abstract description 96
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mokbel et al. | Towards improving ASR robustness for PSN and GSM telephone applications | |
Raj et al. | Reconstruction of missing features for robust speech recognition | |
Viikki et al. | Cepstral domain segmental feature vector normalization for noise robust speech recognition | |
Li et al. | An overview of noise-robust automatic speech recognition | |
EP0792503B1 (en) | Signal conditioned minimum error rate training for continuous speech recognition | |
DE69831288T2 (en) | Sound processing adapted to ambient noise | |
US6868380B2 (en) | Speech recognition system and method for generating phonotic estimates | |
Moreno et al. | Data-driven environmental compensation for speech recognition: A unified approach | |
Besacier et al. | Localization and selection of speaker-specific information with statistical modeling | |
Stern et al. | Signal processing for robust speech recognition | |
Chowdhury et al. | Bayesian on-line spectral change point detection: a soft computing approach for on-line ASR | |
Nakatani et al. | Dominance based integration of spatial and spectral features for speech enhancement | |
Soe Naing et al. | Discrete Wavelet Denoising into MFCC for Noise Suppressive in Automatic Speech Recognition System. | |
Haton | Automatic speech recognition: A Review | |
Nguyen et al. | Feature adaptation using linear spectro-temporal transform for robust speech recognition | |
Cui et al. | Adaptation of children’s speech with limited data based on formant-like peak alignment | |
Ozerov et al. | GMM-based classification from noisy features | |
Kalamani et al. | Continuous Tamil Speech Recognition technique under non stationary noisy environments | |
Fredouille et al. | AMIRAL: a block-segmental multirecognizer architecture for automatic speaker recognition | |
Pattanayak et al. | Pitch-robust acoustic feature using single frequency filtering for children’s KWS | |
de Veth et al. | Acoustic backing-off as an implementation of missing feature theory | |
Seltzer et al. | Training wideband acoustic models using mixed-bandwidth training data for speech recognition | |
CN114970695B (en) | Speaker segmentation clustering method based on non-parametric Bayesian model | |
Giuliani et al. | Hands free continuous speech recognition in noisy environment using a four microphone array | |
Surendran et al. | Transformation-based Bayesian prediction for adaptation of HMMs |