Sofianos et al., 2012 - Google Patents
H-Semantics: A hybrid approach to singing voice separation (Sofianos et al., 2012)
- Document ID
- 2028972945158869922
- Author
- Sofianos S
- Ariyaeeinia A
- Polfreman R
- Sotudeh R
- Publication year
- 2012
- Publication venue
- Journal of the Audio Engineering Society
Snippet
The singing voice is the most prominent content of music tracks that can be described as songs. Separation from its music accompaniment is considered highly desirable in the field of music information retrieval, as it facilitates such applications as melody extraction, lyrics …
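For orientation only, the sketch below shows the simplest conceivable baseline for the task the snippet describes: mid/side decomposition of a stereo mix, which treats centre-panned content as a rough vocal estimate. This is not the hybrid H-Semantics method of the paper; the file names and the assumption of a centre-panned lead vocal are purely illustrative.

```python
# Naive vocal/accompaniment split via mid/side decomposition.
# NOT the H-Semantics method -- only a rough baseline for illustration.
import numpy as np
import soundfile as sf  # assumed available; any WAV I/O library would do


def split_centre(stereo: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (mid, side) estimates from a (frames, 2) stereo array."""
    left, right = stereo[:, 0], stereo[:, 1]
    mid = 0.5 * (left + right)    # centre-panned content, where lead vocals usually sit
    side = 0.5 * (left - right)   # side content; centre-panned vocals largely cancel
    return mid, side


if __name__ == "__main__":
    mix, sr = sf.read("song.wav")  # hypothetical stereo input file
    vocal_est, accomp_est = split_centre(mix)
    sf.write("vocal_estimate.wav", vocal_est, sr)
    sf.write("accompaniment_estimate.wav", accomp_est, sr)
```

A mid/side split fails whenever instruments other than the voice are also centre-panned, which is precisely why dedicated separation approaches such as the one cited here were developed.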
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
Similar Documents
Publication | Title |
---|---|
Das et al. | Fundamentals, present and future perspectives of speech enhancement |
Sadjadi et al. | Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification |
CN102486920A (en) | Audio event detection method and device |
CN104183245A (en) | Method and device for recommending music stars with tones similar to those of singers |
Hoffmann et al. | Bass enhancement settings in portable devices based on music genre recognition |
Singh et al. | Countermeasures to replay attacks: A review |
Sofianos et al. | H-Semantics: A hybrid approach to singing voice separation |
Zouhir et al. | A bio-inspired feature extraction for robust speech recognition |
Chi et al. | Spectro-temporal modulation energy based mask for robust speaker identification |
Zhan et al. | Audio post-processing detection and identification based on audio features |
Saishu et al. | A CNN-based approach to identification of degradations in speech signals |
Uhle et al. | Speech enhancement of movie sound |
Lopatka et al. | Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks |
Liu et al. | Identification of fake stereo audio |
Sarria-Paja et al. | Strategies to enhance whispered speech speaker verification: A comparative analysis |
Goodwin et al. | Frequency-domain algorithms for audio signal enhancement based on transient modification |
Mohammed et al. | A system for semantic information extraction from mixed soundtracks deploying MARSYAS framework |
Saxena et al. | Extricate Features Utilizing Mel Frequency Cepstral Coefficient in Automatic Speech Recognition System |
Waghmare et al. | A Comparative Study of the Various Emotional Speech Databases |
Kumar et al. | Speech quality evaluation for different pitch detection algorithms in LPC speech analysis–synthesis system |
Kim et al. | Speech music discrimination using an ensemble of biased classifiers |
Revathi et al. | Emotion Recognition from Speech Using Multiple Features and Clusters |
Gergen et al. | Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals |
Sardar et al. | Use of Median Timbre Features for Speaker Identification of Whispering Sound |
Jain et al. | Feature extraction techniques based on human auditory system |