Hollosi et al., 2010 - Google Patents

Complexity Scalable Perceptual Tempo Estimation from HE-AAC Encoded Music

Hollosi et al., 2010

Document ID: 1158160505144676285
Author: Hollosi D; Biswas A
Publication year: 2010
Publication venue: Audio Engineering Society Convention 128

External Links

Cited by

Snippet

ABSTRACT A modulation frequency-based method for perceptual tempo estimation from HE- AAC encoded music is proposed. The method is designed to work on the fully-decoded PCM-domain; the intermediate HE-AAC transform-domain after partial decoding; and …

Continue reading at www.aes.org (other versions)

230000000051 modifying 0 abstract description 103

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition

Similar Documents

Publication	Publication Date	Title
KR101612768B1 (en)	2016-04-18	A System For Estimating A Perceptual Tempo And A Method Thereof
EP2659481B1 (en)	2016-07-27	Scene change detection around a set of seed points in media data
EP2786377B1 (en)	2016-03-02	Chroma extraction from an audio codec
Suied et al.	2014	Auditory gist: recognition of very short sounds from timbre cues
MX2012009787A (en)	2012-09-12	Apparatus and method for modifying an audio signal using envelope shaping.
Cunningham et al.	2019	Subjective evaluation of music compressed with the ACER codec compared to AAC, MP3, and uncompressed PCM
Dick et al.	2017	Generation and evaluation of isolated audio coding artifacts
Hollosi et al.	2010	Complexity Scalable Perceptual Tempo Estimation from HE-AAC Encoded Music
Heusdens et al.	2006	Bit-rate scalable intraframe sinusoidal audio coding based on rate-distortion optimization
Tolonen	2000	Object-based sound source modeling for musical signals
Fenton et al.	2015	Hybrid Multiresolution Analysis of “Punch” in Musical Signals
Etame et al.	2008	Characterization of the multidimensional perceptive space for current speech and sound codecs
Cunningham et al.	2014	Data reduction of audio by exploiting musical repetition
Fan et al.	2023	Transient Detection Methods for Audio Coding
Chookaszian	2022	Music Visualization Using Source Separated Stereophonic Music
Kurniawati et al.	2002	The Significance of Tonality Index and Non-linear Psychoacoustics Models for Masking Threshold Estimation
Hollomey et al.	2016	Using phase information to improve the reconstruction accuracy in sinusoidal modeling
Jacobson et al.	2008	The effects of lossy audio encoding on onset detection tasks
Umapathy et al.	2009	Audio Coding and Classification: Principles and Algorithms
Pires	2018	Computer Sound Transformations Guided by Perceptually Motivated Features
Rose et al.	2006	A Frame Loss Concealment Technique for MPEG-AAC
ACZÉL et al.	2009	Note-based sound source separation of polyphonic recordings
Van Nieuwenhuizen	2011	Comparison of two audio fingerprinting algorithms for advertisement identification
Coffin	2010	Separation of repeating and varying components in audio mixtures
Paul-Taiwo et al.	2004	Feature Extractors for Music Information Retrieval: Noise Robustness