Sofianos et al., 2012 - Google Patents
H-Semantics: A hybrid approach to singing voice separation (Sofianos et al., 2012)
- Document ID
- 2028972945158869922
- Author
- Sofianos S
- Ariyaeeinia A
- Polfreman R
- Sotudeh R
- Publication year
- 2012
- Publication venue
- Journal of the Audio Engineering Society
Snippet
The singing voice is the most prominent content of music tracks that can be described as songs. Separation from its music accompaniment is considered highly desirable in the field of music information retrieval, as it facilitates such applications as melody extraction, lyrics …
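For orientation only, the sketch below shows the simplest conceivable baseline for the task the snippet describes: mid/side decomposition of a stereo mix, which treats centre-panned content as a rough vocal estimate. This is not the hybrid H-Semantics method of the paper; the file names and the assumption of a centre-panned lead vocal are purely illustrative.

```python
# Naive vocal/accompaniment split via mid/side decomposition.
# NOT the H-Semantics method -- only a rough baseline for illustration.
import numpy as np
import soundfile as sf  # assumed available; any WAV I/O library would do


def split_centre(stereo: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (mid, side) estimates from a (frames, 2) stereo array."""
    left, right = stereo[:, 0], stereo[:, 1]
    mid = 0.5 * (left + right)    # centre-panned content, where lead vocals usually sit
    side = 0.5 * (left - right)   # side content; centre-panned vocals largely cancel
    return mid, side


if __name__ == "__main__":
    mix, sr = sf.read("song.wav")  # hypothetical stereo input file
    vocal_est, accomp_est = split_centre(mix)
    sf.write("vocal_estimate.wav", vocal_est, sr)
    sf.write("accompaniment_estimate.wav", accomp_est, sr)
```

A mid/side split fails whenever instruments other than the voice are also centre-panned, which is precisely why dedicated separation approaches such as the one cited here were developed.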
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
Similar Documents
Publication | Title |
---|---|
Das et al. | Fundamentals, present and future perspectives of speech enhancement |
Sadjadi et al. | Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification |
CN102486920A (en) | Audio event detection method and device |
CN104183245A (en) | Method and device for recommending music stars with tones similar to those of singers |
Hoffmann et al. | Bass enhancement settings in portable devices based on music genre recognition |
Singh et al. | Countermeasures to replay attacks: A review |
Sofianos et al. | H-Semantics: A hybrid approach to singing voice separation |
Zouhir et al. | A bio-inspired feature extraction for robust speech recognition |
Chi et al. | Spectro-temporal modulation energy based mask for robust speaker identification |
Zhan et al. | Audio post-processing detection and identification based on audio features |
Saishu et al. | A CNN-based approach to identification of degradations in speech signals |
Uhle et al. | Speech enhancement of movie sound |
Lopatka et al. | Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks |
Liu et al. | Identification of fake stereo audio |
Sarria-Paja et al. | Strategies to enhance whispered speech speaker verification: A comparative analysis |
Goodwin et al. | Frequency-domain algorithms for audio signal enhancement based on transient modification |
Mohammed et al. | A system for semantic information extraction from mixed soundtracks deploying MARSYAS framework |
Saxena et al. | Extricate Features Utilizing Mel Frequency Cepstral Coefficient in Automatic Speech Recognition System |
Waghmare et al. | A Comparative Study of the Various Emotional Speech Databases |
Kumar et al. | Speech quality evaluation for different pitch detection algorithms in LPC speech analysis–synthesis system |
Kim et al. | Speech music discrimination using an ensemble of biased classifiers |
Revathi et al. | Emotion Recognition from Speech Using Multiple Features and Clusters |
Gergen et al. | Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals |
Sardar et al. | Use of Median Timbre Features for Speaker Identification of Whispering Sound |
Jain et al. | Feature extraction techniques based on human auditory system |