PT2301011T - Method and discriminator for classifying different segments of an audio signal comprising speech and music segments - Google Patents

Method and discriminator for classifying different segments of an audio signal comprising speech and music segments

Info

Publication number
PT2301011T
Authority
PT
Portugal
Prior art keywords
segments
discriminator
speech
audio signal
classifying different
Prior art date
Application number
PT09776747T
Other languages
English (en)
Inventor
Hirschfeld Jens
Herre Jürgen
Wabnik Stefan
Bayer Stefan
Fuchs Guillaume
Rettelbach Nikolaus
Nagel Frederik
Yokotani Yoshikazu
Lecomte Jérémie
Original Assignee
Fraunhofer Ges Forschung
Priority date
Filing date
Publication date
Application filed by Fraunhofer Ges Forschung
Publication of PT2301011T

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/81 Detection of presence or absence of voice signals for discriminating voice from music
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/22 Mode decision, i.e. based on audio signal content versus external parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L2025/783 Detection of presence or absence of voice signals based on threshold decision
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Image Analysis (AREA)
PT09776747T 2008-07-11 2009-06-16 Method and discriminator for classifying different segments of an audio signal comprising speech and music segments PT2301011T (pt)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US7987508P 2008-07-11 2008-07-11

Publications (1)

Publication Number Publication Date
PT2301011T (pt) 2018-10-26

Family

ID=40851974

Family Applications (1)

Application Number Title Priority Date Filing Date
PT09776747T PT2301011T (pt) 2008-07-11 2009-06-16 Method and discriminator for classifying different segments of an audio signal comprising speech and music segments

Country Status (20)

Country Link
US (1) US8571858B2 (pt)
EP (1) EP2301011B1 (pt)
JP (1) JP5325292B2 (pt)
KR (2) KR101281661B1 (pt)
CN (1) CN102089803B (pt)
AR (1) AR072863A1 (pt)
AU (1) AU2009267507B2 (pt)
BR (1) BRPI0910793B8 (pt)
CA (1) CA2730196C (pt)
CO (1) CO6341505A2 (pt)
ES (1) ES2684297T3 (pt)
HK (1) HK1158804A1 (pt)
MX (1) MX2011000364A (pt)
MY (1) MY153562A (pt)
PL (1) PL2301011T3 (pt)
PT (1) PT2301011T (pt)
RU (1) RU2507609C2 (pt)
TW (1) TWI441166B (pt)
WO (1) WO2010003521A1 (pt)
ZA (1) ZA201100088B (pt)

Families Citing this family (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010003563A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding audio samples
CN101847412B (zh) * 2009-03-27 2012-02-15 华为技术有限公司 Method and device for classifying audio signals
KR101666521B1 (ko) * 2010-01-08 2016-10-14 삼성전자 주식회사 Method and apparatus for detecting the pitch period of an input signal
ES2530957T3 (es) 2010-10-06 2015-03-09 Fraunhofer Ges Forschung Apparatus and method for processing an audio signal and for providing a higher temporal granularity for a combined unified speech and audio codec (USAC)
US8521541B2 (en) * 2010-11-02 2013-08-27 Google Inc. Adaptive audio transcoding
CN103000172A (zh) * 2011-09-09 2013-03-27 中兴通讯股份有限公司 Signal classification method and device
US20130090926A1 (en) * 2011-09-16 2013-04-11 Qualcomm Incorporated Mobile device context information using speech detection
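JPWO2013061584A1 (ja) * 2011-10-28 2015-04-02 パナソニック株式会社 Sound signal hybrid decoder, sound signal hybrid encoder, sound signal decoding method, and sound signal encoding method
CN105163398B (zh) 2011-11-22 2019-01-18 华为技术有限公司 Connection establishment method and user equipment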
US9111531B2 (en) 2012-01-13 2015-08-18 Qualcomm Incorporated Multiple coding mode signal classification
ES2555136T3 (es) * 2012-02-17 2015-12-29 Huawei Technologies Co., Ltd. Parametric encoder for encoding a multi-channel audio signal
US20130317821A1 (en) * 2012-05-24 2013-11-28 Qualcomm Incorporated Sparse signal detection with mismatched models
ES2604652T3 (es) 2012-08-31 2017-03-08 Telefonaktiebolaget Lm Ericsson (Publ) Method and device for voice activity detection
US9589570B2 (en) * 2012-09-18 2017-03-07 Huawei Technologies Co., Ltd. Audio classification based on perceptual quality for low or medium bit rates
RU2656681C1 (ru) * 2012-11-13 2018-06-06 Самсунг Электроникс Ко., Лтд. Method and device for determining a coding mode, method and device for encoding audio signals, and method and device for decoding audio signals
EP2954635B1 (en) * 2013-02-19 2021-07-28 Huawei Technologies Co., Ltd. Frame structure for filter bank multi-carrier (fbmc) waveforms
ES2634621T3 (es) * 2013-02-20 2017-09-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating an encoded audio or image signal or for decoding an encoded audio or image signal in the presence of transients using a multiple overlap portion
CN106409310B (zh) 2013-08-06 2019-11-19 华为技术有限公司 Audio signal classification method and device
US9666202B2 (en) * 2013-09-10 2017-05-30 Huawei Technologies Co., Ltd. Adaptive bandwidth extension and apparatus for the same
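KR101498113B1 (ko) * 2013-10-23 2015-03-04 광주과학기술원 Apparatus and method for extending the bandwidth of a sound signal
WO2015126228A1 (ko) * 2014-02-24 2015-08-27 삼성전자 주식회사 Signal classification method and device, and audio encoding method and device using the same
CN107452390B (zh) 2014-04-29 2021-10-26 华为技术有限公司 Audio coding method and related device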
US9666210B2 (en) * 2014-05-15 2017-05-30 Telefonaktiebolaget Lm Ericsson (Publ) Audio signal classification and coding
CN105336338B (zh) * 2014-06-24 2017-04-12 华为技术有限公司 Audio coding method and device
US9886963B2 (en) * 2015-04-05 2018-02-06 Qualcomm Incorporated Encoder selection
ES2829413T3 (es) * 2015-05-20 2021-05-31 Ericsson Telefon Ab L M Coding of multi-channel audio signals
US10706873B2 (en) * 2015-09-18 2020-07-07 Sri International Real-time speaker state analytics platform
WO2017196422A1 (en) * 2016-05-12 2017-11-16 Nuance Communications, Inc. Voice activity detection feature based on modulation-phase differences
US10699538B2 (en) * 2016-07-27 2020-06-30 Neosensory, Inc. Method and system for determining and providing sensory experiences
WO2018048907A1 (en) 2016-09-06 2018-03-15 Neosensory, Inc. C/O Tmc+260 Method and system for providing adjunct sensory information to a user
CN107895580B (zh) * 2016-09-30 2021-06-01 华为技术有限公司 Audio signal reconstruction method and device
US10744058B2 (en) * 2017-04-20 2020-08-18 Neosensory, Inc. Method and system for providing information to a user
US10325588B2 (en) 2017-09-28 2019-06-18 International Business Machines Corporation Acoustic feature extractor selected according to status flag of frame of acoustic signal
EP3895164B1 (en) * 2018-12-13 2022-09-07 Dolby Laboratories Licensing Corporation Method of decoding audio content, decoder for decoding audio content, and corresponding computer program
RU2761940C1 (ru) * 2018-12-18 2021-12-14 Общество С Ограниченной Ответственностью "Яндекс" Methods and electronic devices for identifying a user utterance from a digital audio signal
EP3956890B1 (en) 2019-04-18 2024-02-21 Dolby Laboratories Licensing Corporation A dialog detector
CN110288983B (zh) * 2019-06-26 2021-10-01 上海电机学院 Voice processing method based on machine learning
US11467667B2 (en) 2019-09-25 2022-10-11 Neosensory, Inc. System and method for haptic stimulation
US11467668B2 (en) 2019-10-21 2022-10-11 Neosensory, Inc. System and method for representing virtual object information with haptic stimulation
WO2021142162A1 (en) 2020-01-07 2021-07-15 Neosensory, Inc. Method and system for haptic stimulation
CA3170065A1 (en) * 2020-04-16 2021-10-21 Vladimir Malenovsky Method and device for speech/music classification and core encoder selection in a sound codec
US11497675B2 (en) 2020-10-23 2022-11-15 Neosensory, Inc. Method and system for multimodal stimulation
WO2022147615A1 (en) * 2021-01-08 2022-07-14 Voiceage Corporation Method and device for unified time-domain / frequency domain coding of a sound signal
US11862147B2 (en) 2021-08-13 2024-01-02 Neosensory, Inc. Method and system for enhancing the intelligibility of information for a user
US20230147185A1 (en) * 2021-11-08 2023-05-11 Lemon Inc. Controllable music generation
US11995240B2 (en) 2021-11-16 2024-05-28 Neosensory, Inc. Method and system for conveying digital texture information to a user
CN116070174A (zh) * 2023-03-23 2023-05-05 长沙融创智胜电子科技有限公司 Multi-category target recognition method and system

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT1232084B (it) * 1989-05-03 1992-01-23 Cselt Centro Studi Lab Telecom Coding system for wideband audio signals
JPH0490600A (ja) * 1990-08-03 1992-03-24 Sony Corp Speech recognition device
JPH04342298A (ja) * 1991-05-20 1992-11-27 Nippon Telegr & Teleph Corp <Ntt> Instantaneous pitch analysis method and voiced/unvoiced determination method
RU2049456C1 (ru) * 1993-06-22 1995-12-10 Вячеслав Алексеевич Сапрыкин Method for transmitting speech signals
US6134518A (en) 1997-03-04 2000-10-17 International Business Machines Corporation Digital audio signal coding using a CELP coder and a transform coder
JP3700890B2 (ja) * 1997-07-09 2005-09-28 ソニー株式会社 Signal identification device and signal identification method
RU2132593C1 (ru) * 1998-05-13 1999-06-27 Академия управления МВД России Multichannel device for transmitting speech signals
SE0004187D0 (sv) 2000-11-15 2000-11-15 Coding Technologies Sweden Ab Enhancing the performance of coding systems that use high frequency reconstruction methods
DE60202881T2 (de) 2001-11-29 2006-01-19 Coding Technologies Ab Reconstruction of high-frequency components
US6785645B2 (en) * 2001-11-29 2004-08-31 Microsoft Corporation Real-time speech and music classifier
AUPS270902A0 (en) * 2002-05-31 2002-06-20 Canon Kabushiki Kaisha Robust detection and classification of objects in audio using limited training data
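JP4348970B2 (ja) * 2003-03-06 2009-10-21 ソニー株式会社 Information detection device and method, and program
JP2004354589A (ja) * 2003-05-28 2004-12-16 Nippon Telegr & Teleph Corp <Ntt> Acoustic signal discrimination method, acoustic signal discrimination device, and acoustic signal discrimination program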
EP1758274A4 (en) * 2004-06-01 2012-03-14 Nec Corp SYSTEM, METHOD AND PROGRAM PROVIDING INFORMATION
US7130795B2 (en) * 2004-07-16 2006-10-31 Mindspeed Technologies, Inc. Music detection with low-complexity pitch correlation algorithm
JP4587916B2 (ja) * 2005-09-08 2010-11-24 シャープ株式会社 Audio signal discrimination device, sound quality adjustment device, content display device, program, and recording medium
ATE463028T1 (de) 2006-09-13 2010-04-15 Ericsson Telefon Ab L M Methods and arrangements for a speech/audio sender and receiver
CN1920947B (zh) * 2006-09-15 2011-05-11 清华大学 Speech/music detector for low bit-rate audio coding
CA2663904C (en) * 2006-10-10 2014-05-27 Qualcomm Incorporated Method and apparatus for encoding and decoding audio signals
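KR101016224B1 (ko) * 2006-12-12 2011-02-25 프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우 Encoder, decoder, and methods for encoding and decoding data segments representing a time-domain data stream
KR100964402B1 (ko) * 2006-12-14 2010-06-17 삼성전자주식회사 Method and apparatus for determining the encoding mode of an audio signal, and method and apparatus for encoding/decoding an audio signal using the same
KR100883656B1 (ko) * 2006-12-28 2009-02-18 삼성전자주식회사 Method and apparatus for classifying an audio signal, and method and apparatus for encoding/decoding an audio signal using the same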
WO2010001393A1 (en) * 2008-06-30 2010-01-07 Waves Audio Ltd. Apparatus and method for classification and segmentation of audio content, based on the audio signal

Also Published As

Publication number Publication date
JP2011527445A (ja) 2011-10-27
KR20130036358A (ko) 2013-04-11
CN102089803B (zh) 2013-02-27
US8571858B2 (en) 2013-10-29
MX2011000364A (es) 2011-02-25
PL2301011T3 (pl) 2019-03-29
TWI441166B (zh) 2014-06-11
RU2011104001A (ru) 2012-08-20
ZA201100088B (en) 2011-08-31
CA2730196A1 (en) 2010-01-14
BRPI0910793B1 (pt) 2020-11-24
CO6341505A2 (es) 2011-11-21
AR072863A1 (es) 2010-09-29
BRPI0910793B8 (pt) 2021-08-24
HK1158804A1 (en) 2012-07-20
CN102089803A (zh) 2011-06-08
ES2684297T3 (es) 2018-10-02
JP5325292B2 (ja) 2013-10-23
EP2301011B1 (en) 2018-07-25
EP2301011A1 (en) 2011-03-30
CA2730196C (en) 2014-10-21
AU2009267507B2 (en) 2012-08-02
AU2009267507A1 (en) 2010-01-14
WO2010003521A1 (en) 2010-01-14
BRPI0910793A2 (pt) 2016-08-02
KR101281661B1 (ko) 2013-07-03
KR101380297B1 (ko) 2014-04-02
MY153562A (en) 2015-02-27
RU2507609C2 (ru) 2014-02-20
KR20110039254A (ko) 2011-04-15
US20110202337A1 (en) 2011-08-18
TW201009813A (en) 2010-03-01

Similar Documents

Publication Publication Date Title
PL2301011T3 (pl) Method and discriminator for classifying different segments of an audio signal comprising speech and music segments
HK1258592A1 (zh) Audio reproduction method and sound reproduction system
HK1248912A1 (zh) Apparatus and method for bandwidth extension of an audio signal
HUE047607T2 (hu) Method and device for perceptual spectral decoding of an audio signal, including filling of spectral holes
PL2186090T3 (pl) Transient detector and method for supporting the encoding of an audio signal
HK1217384A1 (zh) Apparatus for processing an audio signal and method for processing a time-domain audio signal
EP2259254A4 (en) METHOD AND APPARATUS FOR PROCESSING A SOUND SIGNAL
EP2191462A4 (en) METHOD AND DEVICE FOR DECODING A SOUND SIGNAL
IL209095A (en) A method for enhancing the ability to hear speech in a multi-channel audio signal and a device containing a circuit for improving the ability to hear speech in a multi-channel audio signal
PL2478519T3 (pl) Reverberator and method for reverberating an audio signal
EG26480A (en) Method and device to encode / decode an audio signal through spectrum bands
HK1250089A1 (zh) Apparatus and method for synthesizing a parameterized representation of an audio signal
EP2232487A4 (en) METHOD AND APPARATUS FOR PROCESSING AUDIO SIGNAL
PL2308244T3 (pl) Acoustic system and method of operating it
HK1149624A1 (en) Device and method for synchronizing multi-channel expansion data with an audio signal and for processing said audio signal
EP2413313A4 (en) METHOD AND DEVICE FOR CLASSIFYING AUDIO SIGNALS
EP2609589A4 (en) DEVICE AND METHOD FOR POST-EDITING DECODED MULTI-CHANNEL TONE SIGNALS OR DECODED STEREOSIGNALS
PL2559029T3 (pl) Method, encoder, and decoder for gapless playback of an audio signal
EP2522015A4 (en) APPARATUS AND METHOD FOR PROCESSING AUDIO SIGNAL
EP2612321A4 (en) DEVICE AND METHOD FOR POST-PROCESSING MULTICANAL AUDIO SIGNAL OR DECODED STEREO SIGNAL
EP2628322A4 (en) METHOD AND DEVICE FOR DOWNWARD MULTI-CHANNEL AUDIO SIGNALS
TWI340600B (en) Method for processing an audio signal, method of encoding an audio signal and apparatus thereof
EP2296143A4 (en) AUDIO SIGNAL DECODING DEVICE AND BALANCE ADJUSTMENT METHOD FOR AN AUDIO SIGNAL DECODING DEVICE
EP2557566B8 (en) Method and apparatus for processing an audio signal
GB2486855B (en) Apparatus and method for reproducing an audio signal