Sofianos et al., 2012 - Google Patents

H-Semantics: A hybrid approach to singing voice separation

Document ID
2028972945158869922
Author
Sofianos S
Ariyaeeinia A
Polfreman R
Sotudeh R
Publication year
2012
Publication venue
Journal of the Audio Engineering Society

Snippet

The singing voice is the most prominent content of music tracks that can be described as songs. Separation from its music accompaniment is considered highly desirable in the field of music information retrieval, as it facilitates such applications as melody extraction, lyrics …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3074 Audio data retrieval

Similar Documents

Publication | Title
Das et al. | Fundamentals, present and future perspectives of speech enhancement
Sadjadi et al. | Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification
CN102486920A (en) | Audio event detection method and device
CN104183245A (en) | Method and device for recommending music stars with tones similar to those of singers
Hoffmann et al. | Bass enhancement settings in portable devices based on music genre recognition
Singh et al. | Countermeasures to replay attacks: A review
Sofianos et al. | H-Semantics: A hybrid approach to singing voice separation
Zouhir et al. | A bio-inspired feature extraction for robust speech recognition
Chi et al. | Spectro-temporal modulation energy based mask for robust speaker identification
Zhan et al. | Audio post-processing detection and identification based on audio features
Saishu et al. | A CNN-based approach to identification of degradations in speech signals
Uhle et al. | Speech enhancement of movie sound
Lopatka et al. | Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks
Liu et al. | Identification of fake stereo audio
Sarria-Paja et al. | Strategies to enhance whispered speech speaker verification: A comparative analysis
Goodwin et al. | Frequency-domain algorithms for audio signal enhancement based on transient modification
Mohammed et al. | A system for semantic information extraction from mixed soundtracks deploying MARSYAS framework
Saxena et al. | Extricate Features Utilizing Mel Frequency Cepstral Coefficient in Automatic Speech Recognition System
Waghmare et al. | A Comparative Study of the Various Emotional Speech Databases
Kumar et al. | Speech quality evaluation for different pitch detection algorithms in LPC speech analysis–synthesis system
Kim et al. | Speech music discrimination using an ensemble of biased classifiers
Revathi et al. | Emotion Recognition from Speech Using Multiple Features and Clusters
Gergen et al. | Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals.
Sardar et al. | Use of Median Timbre Features for Speaker Identification of Whispering Sound
Jain et al. | Feature extraction techniques based on human auditory system