Dinkar et al., 2021 - Google Patents
From local hesitations to global impressions of a speaker's feeling of knowingDinkar et al., 2021
View PDF- Document ID
- 13048082839814338890
- Author
- Dinkar T
- Biancardi B
- Clavel C
- Publication year
- Publication venue
- Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021)
External Links
Snippet
The listener's interpretation of a speaker's utterance includes estimates about the speaker's commitment to what they are saying. Previous works have shown that fillers (eg “um”) are linked to both the speaker's metacognitive state, and the listener's impression of a speaker's …
- 239000000945 filler 0 abstract description 155
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10334384B2 (en) | Scheduling playback of audio in a virtual acoustic space | |
US8447604B1 (en) | Method and apparatus for processing scripts and related data | |
KR101990023B1 (en) | Method for chunk-unit separation rule and display automated key word to develop foreign language studying, and system thereof | |
US11076052B2 (en) | Selective conference digest | |
EP3254279B1 (en) | Conference word cloud | |
US20180027351A1 (en) | Optimized virtual scene layout for spatial meeting playback | |
US9311914B2 (en) | Method and apparatus for enhanced phonetic indexing and search | |
CN112233680B (en) | Speaker character recognition method, speaker character recognition device, electronic equipment and storage medium | |
Moore | Automated transcription and conversation analysis | |
US20210232776A1 (en) | Method for recording and outputting conversion between multiple parties using speech recognition technology, and device therefor | |
Kopparapu | Non-linguistic analysis of call center conversations | |
CN115485768A (en) | End-to-end multi-speaker overlapping speech recognition | |
CN114328867A (en) | Intelligent interruption method and device in man-machine conversation | |
Soe et al. | Evaluating AI assisted subtitling | |
CN111798871A (en) | Session link identification method, device and equipment and storage medium | |
US20100076747A1 (en) | Mass electronic question filtering and enhancement system for audio broadcasts and voice conferences | |
Bechet et al. | Adapting dependency parsing to spontaneous speech for open domain spoken language understanding | |
CN107886940B (en) | Voice translation processing method and device | |
Dinkar et al. | From local hesitations to global impressions of a speaker’s feeling of knowing | |
Yamasaki et al. | Transcribing And Aligning Conversational Speech: A Hybrid Pipeline Applied To French Conversations | |
Dinkar et al. | From local hesitations to global impressions of the listener | |
US10657202B2 (en) | Cognitive presentation system and method | |
US11632345B1 (en) | Message management for communal account | |
Kwak et al. | VoxMM: Rich Transcription of Conversations in the Wild | |
Mirzaei et al. | Adaptive Listening Difficulty Detection for L2 Learners Through Moderating ASR Resources. |