CN111669697B - A method and system for extracting coherent sound and ambient sound from multi-channel signals - Google Patents

Publication number
CN111669697B
Authority
CN
China
Prior art keywords
channel
sound
coherent
ambient
coherent sound
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010447863.9A
Other languages
Chinese (zh)
Other versions
CN111669697A (en)
Inventor
吴彦琴
桑晋秋
郑成诗
张芳杰
李晓东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Application filed by Institute of Acoustics CAS
Priority to CN202010447863.9A
Publication of CN111669697A
Application granted
Publication of CN111669697B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/06 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a method and system for extracting coherent sound and ambient sound from multi-channel signals. The method comprises: computing a weight expression for the coherent sound of N channel signals, estimating the coherent sound from the weight expression, and from it computing the coherent sound of each channel, wherein the ambient sound energy of every channel is the same; computing the ambient sound of each channel from the coherent sound of each channel; and applying an inverse Fourier transform to the N channels of coherent sound and the N channels of ambient sound to obtain time-domain coherent sound and ambient sound. Under the condition that the ambient sound energy is the same in every channel, the method derives a weight expression for estimating the coherent sound of signals with any number of channels, and uses the signal energy of each channel and the inter-channel correlation values to solve for the unknown parameters in the weight expression. This achieves extraction of coherent sound and ambient sound from multi-channel signals with high extraction accuracy.

Figure 202010447863

Description

A method and system for extracting coherent sound and ambient sound from multi-channel signals

Technical field

The invention relates to the field of spatial sound reproduction, and in particular to a method and system for extracting coherent sound and ambient sound from multi-channel signals.

Background

Spatial sound reproduction is widely used in entertainment media. When cinemas, home theaters, and portable electronic devices play films, reproducing spatial sound with a suitable sound-image width and good immersion over headphones or loudspeakers gives consumers a better audio-visual experience. In recent years, spatial sound reproduction has also shown important application prospects in cutting-edge research and practical engineering, for example in virtual reality, aviation, and aerospace.

Spatial sound mainly contains two components with different properties: a directional component, called coherent sound, and a diffuse component whose direction cannot be discerned, called ambient sound. To achieve better reproduction, the coherent and ambient sound must be extracted from the spatial sound (Primary-Ambient Extraction, PAE) and processed differently. For example, in an audio codec system, using PAE as the front end of audio encoding or decoding enables efficient and immersive spatial sound reproduction.

PAE methods for two-channel signals are relatively mature; the most widely used are principal component analysis (PCA) and the least-squares method. For multi-channel signals, PAE can be performed with the pairwise correlation method, but its extraction accuracy is low. Extending stereo PAE methods to multi-channel signals is therefore of considerable interest. PCA performs PAE on a stereo signal by computing the eigenvalues of the covariance matrix of the input signal, on the premise that coherent sound dominates the energy. PCA can also extract components from multi-channel signals, but its computational complexity grows with the number of channels, and it performs well only when coherent sound dominates the energy. The least-squares method performs PAE on a stereo signal by computing weights for estimating the coherent sound, on the premise that the ambient sound energy is equal in every channel. When the least-squares method is applied directly to multi-channel signals, however, the estimation weights are difficult to solve. Ambient sound mainly sets the atmosphere in spatial sound, and to achieve a better sense of envelopment its energy differs little between channels. Extracting coherent and ambient sound from multi-channel signals under the premise of equal ambient sound energy in every channel is therefore worthwhile.

Summary of the invention

The purpose of the invention is to overcome the above technical shortcomings. On the premise that the ambient sound has equal energy in every channel, the least-squares weights for estimating the coherent sound are computed for small numbers of channels, and from the regular way the weights change with the number of channels, a weight expression is obtained for coherent sound estimation on multi-channel signals with any number of channels.

To achieve this purpose, the invention proposes a method for extracting coherent sound and ambient sound from multi-channel signals, the method comprising:

computing a weight expression for the coherent sound of N channel signals, estimating the coherent sound from the weight expression, and from it computing the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;

computing the ambient sound of each channel from the coherent sound of each channel;

applying an inverse Fourier transform to the N channels of coherent sound and the N channels of ambient sound to obtain time-domain coherent sound and ambient sound.

As an improvement of the above method, computing the weight expression for the coherent sound of the N channel signals, estimating the coherent sound from the weight expression, and from it computing the coherent sound of each channel, wherein the ambient sound energy of every channel is the same, specifically comprises:

applying a Fourier transform to the time-domain multi-channel signal, the input signal $X_n$ of the n-th channel being expressed as:

$X_n = \beta_n S + A_n$

where $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, $1 \le n \le N$, $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;

computing the short-time energy of the n-th channel input signal $X_n$ (symbol shown in Figure BDA0002506583330000021):

Figure BDA0002506583330000022

computing the correlation value $\Phi_{12}$ between the first channel and the second channel:

Figure BDA0002506583330000023

computing the intermediate parameters $C$ and $D$ from the short-time energy of the first channel (Figure BDA0002506583330000024), the short-time energy of the second channel (Figure BDA0002506583330000025), and the inter-channel correlation value $\Phi_{12}$:

Figure BDA0002506583330000026

Figure BDA0002506583330000027

from which the short-time energy $P_S$ of the coherent sound, the short-time energy $P_A$ of the ambient sound, and $\beta_2$ are computed:

Figure BDA0002506583330000031

Figure BDA0002506583330000032

Figure BDA0002506583330000033

computing $\beta_n$:

Figure BDA0002506583330000034

the weight of the n-th channel being:

Figure BDA0002506583330000035

the estimate of the coherent sound (symbol shown in Figure BDA0002506583330000036) then being:

Figure BDA0002506583330000037

and the coherent sound $S_n$ of the n-th channel being:

Figure BDA0002506583330000038

As an improvement of the above method, computing the ambient sound of each channel from the coherent sound of each channel is specifically:

the ambient sound $A_n$ of the n-th channel is:

$A_n = X_n - S_n$.

Embodiment 2 of the invention proposes a system for extracting coherent sound and ambient sound from multi-channel signals, the system comprising:

a coherent sound extraction module, configured to compute a weight expression for the coherent sound of N channel signals, estimate the coherent sound from the weight expression, and from it compute the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;

an ambient sound extraction module, configured to compute the ambient sound of each channel from the coherent sound of each channel;

a frequency-domain to time-domain module, configured to apply an inverse Fourier transform to the N channels of coherent sound and the N channels of ambient sound to obtain time-domain coherent sound and ambient sound.

As an improvement of the above system, the coherent sound extraction module is specifically implemented by:

applying a Fourier transform to the time-domain multi-channel signal, the input signal $X_n$ of the n-th channel being expressed as:

$X_n = \beta_n S + A_n$

where $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, $1 \le n \le N$, $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;

computing the short-time energy of the n-th channel input signal $X_n$ (symbol shown in Figure BDA0002506583330000041):

Figure BDA0002506583330000042

computing the correlation value $\Phi_{12}$ between the first channel and the second channel:

Figure BDA0002506583330000043

computing the intermediate parameters $C$ and $D$ from the short-time energy of the first channel (Figure BDA0002506583330000044), the short-time energy of the second channel (Figure BDA0002506583330000045), and the inter-channel correlation value $\Phi_{12}$:

Figure BDA0002506583330000046

Figure BDA0002506583330000047

from which the short-time energy $P_S$ of the coherent sound, the short-time energy $P_A$ of the ambient sound, and $\beta_2$ are computed:

Figure BDA0002506583330000048

Figure BDA0002506583330000049

Figure BDA00025065833300000410

computing $\beta_n$:

Figure BDA00025065833300000411

the weight of the n-th channel being:

Figure BDA00025065833300000412

the estimate of the coherent sound (symbol shown in Figure BDA00025065833300000413) then being:

Figure BDA0002506583330000051

and the coherent sound $S_n$ of the n-th channel being:

Figure BDA0002506583330000052

As an improvement of the above system, the ambient sound extraction module is specifically implemented as follows:

the ambient sound $A_n$ of the n-th channel is:

$A_n = X_n - S_n$.

The advantages of the invention are as follows:

Under the condition that the ambient sound energy is the same in every channel, the method derives a weight expression for estimating the coherent sound of signals with any number of channels, and uses the signal energy of each channel and the inter-channel correlation values to solve for the unknown parameters in the weight expression. This achieves extraction of coherent sound and ambient sound from multi-channel signals with high extraction accuracy.

Brief description of the drawings

Fig. 1 is a flowchart of the method of the invention for extracting coherent sound and ambient sound from multi-channel signals;

Fig. 2(a) shows the error of coherent sound extraction from mixed five-channel signal 1 using the method of the invention and the pairwise correlation method;

Fig. 2(b) shows the error of ambient sound extraction from mixed five-channel signal 1 using the method of the invention and the pairwise correlation method;

Fig. 3(a) shows the error of coherent sound extraction from mixed five-channel signal 2 using the method of the invention and the pairwise correlation method;

Fig. 3(b) shows the error of ambient sound extraction from mixed five-channel signal 2 using the method of the invention and the pairwise correlation method.

Detailed description

The technical solution of the invention is described in detail below with reference to the drawings and specific embodiments.

Embodiment 1

As shown in Fig. 1, Embodiment 1 of the invention proposes a method for extracting coherent sound and ambient sound from a multi-channel signal whose channels have equal ambient sound energy, comprising the following steps:

Step 1) Divide the multi-channel signal into frames and apply a Fourier transform to obtain the spectrum; using the multi-channel signal model, express the short-time energy of each channel and the correlation value between any two channels. Specifically:

In the multi-channel signal model, the input signal is the superposition of coherent sound and ambient sound. Because coherent sound and ambient sound have different characteristics, the coherent sound in the different channels is assumed to be fully correlated, that is, linearly related; the coherent sound is assumed to be uncorrelated with the ambient sound of every channel, and the ambient sounds of different channels are assumed to be uncorrelated with each other.

Step 1-1) Apply a Fourier transform to the time-domain multi-channel signal to obtain the spectrum:

$X_n = \beta_n S + A_n, \quad n = 1, 2, \ldots, N$

where $N$ is the number of channels, $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, with $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;
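As a minimal sketch of the signal model in step 1-1), the following Python snippet builds $X_n = \beta_n S + A_n$ for one frequency bin over several frames. The function name `synthesize_mixture`, the use of complex Gaussian noise as a stand-in for ambient sound, and the array layout (channels by frames) are assumptions made for illustration and do not come from the patent.

```python
import numpy as np

def synthesize_mixture(S, beta, ambient_power, rng):
    """Build X_n = beta_n * S + A_n for every channel (hypothetical helper).

    S             : complex array (frames,), coherent spectrum at one bin
    beta          : real array (N,), amplitude factors with beta[0] == 1
    ambient_power : target E{|A_n|^2}, identical for all channels
    rng           : numpy.random.Generator
    """
    beta = np.asarray(beta, dtype=float)
    n_ch, n_frames = beta.size, S.shape[0]
    # Independent circular complex Gaussian noise per channel stands in for
    # ambient sound that is uncorrelated across channels and with S.
    A = (rng.standard_normal((n_ch, n_frames))
         + 1j * rng.standard_normal((n_ch, n_frames))) * np.sqrt(ambient_power / 2)
    X = beta[:, None] * S[None, :] + A
    return X, A
```

Calling `synthesize_mixture(S, [1.0, 2.0, 0.5], 0.1, np.random.default_rng(0))` would return a three-channel mixture together with the ambient part used to build it.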

Step 1-2) The signal energy of each channel can be expressed as:

Figure BDA0002506583330000061

where $E\{\cdot\}$ denotes the short-time average.

Step 1-3) The correlation value between channels can be expressed as:

Figure BDA0002506583330000062

where the symbol shown in Figure BDA0002506583330000063 is the correlation value between channel $n_1$ and channel $n_2$, with $n_1 = 1, 2, \ldots, N$, $n_2 = 1, 2, \ldots, N$, and $n_1 \ne n_2$;
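The short-time energy and inter-channel correlation of steps 1-2) and 1-3) can be sketched as plain averages over frames. This is an illustrative assumption: the patent's exact expressions are images that are not reproduced here, and the function names, the frame-mean as the short-time average $E\{\cdot\}$, and taking the real part of the cross term are choices made for this sketch. Under the model assumptions of step 1), one would expect these quantities to approach $\beta_n^2 P_S + P_A$ and $\beta_{n_1} \beta_{n_2} P_S$ respectively.

```python
import numpy as np

def short_time_energy(X):
    """Per-channel E{|X_n|^2}; X has shape (channels, frames)."""
    return np.mean(np.abs(X) ** 2, axis=1)

def cross_correlation(X, n1, n2):
    """Real part of E{X_n1 * conj(X_n2)}, with 0-based channel indices."""
    return float(np.real(np.mean(X[n1] * np.conj(X[n2]))))
```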

Step 2) Compute the least-squares weights for estimating the coherent sound when the number of channels is small, and examine their regularity. Specifically:

Step 2-1) For a two-channel signal, compute the weights with which the input signals $X_1$ and $X_2$ estimate the coherent sound $S$:

Step 2-1-1) Estimate the coherent sound $S$:

Figure BDA0002506583330000064

where $w_1$ and $w_2$ are the estimation weights to be determined.

Step 2-1-2) The estimation error $\sigma_S$ of $S$ is expressed as:

Figure BDA0002506583330000065

Step 2-1-3) Solve with the least-squares algorithm: the weights are optimal when the estimation error is completely uncorrelated with the input stereo signal:

$E\{\sigma_S X_1\} = 0$

$E\{\sigma_S X_2\} = 0.$

The optimal weights are then:

Figure BDA0002506583330000071

Figure BDA0002506583330000072

where $P_S$ is the short-time energy of the coherent sound and $P_A$ is the short-time energy of the ambient sound.
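The orthogonality conditions of step 2-1-3) amount to a small linear system, which can be solved numerically instead of by hand. The sketch below is a hedged illustration: it assumes the model second-order statistics $E\{X_m X_n^*\} = \beta_m \beta_n P_S + \delta_{mn} P_A$ and $E\{S X_m^*\} = \beta_m P_S$ that follow from the assumptions in step 1), and the function name is hypothetical. It works for any number of channels, so it can also be used to check the three-channel case.

```python
import numpy as np

def ls_weights_from_normal_equations(beta, P_S, P_A):
    """Solve R w = p, the normal equations implied by E{sigma_S X_m} = 0."""
    beta = np.asarray(beta, dtype=float)
    # Model covariance of the inputs: coherent part plus equal-power,
    # mutually uncorrelated ambient part.
    R = P_S * np.outer(beta, beta) + P_A * np.eye(beta.size)
    # Model cross-correlation between the target S and each input channel.
    p = P_S * beta
    return np.linalg.solve(R, p)

w = ls_weights_from_normal_equations([1.0, 0.5], P_S=1.0, P_A=0.2)
```

For these illustrative values the system is 2 by 2 and can be verified by hand with Cramer's rule.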

Step 2-2) For a three-channel signal, compute the weights with which the input signals $X_1$, $X_2$, and $X_3$ estimate the coherent sound $S$:

Step 2-2-1) Estimate the coherent sound $S$:

Figure BDA0002506583330000073

where $w_1$, $w_2$, and $w_3$ are the estimation weights to be determined.

Step 2-2-2) Proceeding as in step 2-1) gives the weights for estimating the coherent sound from a three-channel signal:

Figure BDA0002506583330000074

Figure BDA0002506583330000075

Figure BDA0002506583330000076

Step 2-3) Computing the estimation weights for larger numbers of channels shows that the weights can be expressed in a unified form. For a multi-channel signal with $N$ channels, the estimated coherent sound is expressed as:

Figure BDA0002506583330000077

where the weights can be expressed as:

Figure BDA0002506583330000081
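The unified weight expression itself is an image that is not reproduced in this text. As a hedged reconstruction, extending the orthogonality conditions of step 2-1-3) to $N$ channels and applying the model assumptions of step 1) leads to the closed form below. It is derived here from the stated assumptions and may differ in notation from the patent's own figure; the symbols $\hat{S}$ and $\delta_{km}$ and the complex conjugate are notation introduced for this sketch.

```latex
\begin{aligned}
&E\{(S - \hat{S})\,X_m^{*}\} = 0, \qquad
  \hat{S} = \sum_{k=1}^{N} w_k X_k, \qquad m = 1, \ldots, N \\
&E\{S X_m^{*}\} = \beta_m P_S, \qquad
  E\{X_k X_m^{*}\} = \beta_k \beta_m P_S + \delta_{km} P_A \\
&\Rightarrow\; \beta_m P_S \Bigl(1 - \sum_{k=1}^{N} w_k \beta_k\Bigr) = w_m P_A \\
&\Rightarrow\; w_m = \frac{\beta_m P_S}{P_A + P_S \sum_{k=1}^{N} \beta_k^{2}}
\end{aligned}
```

The third line shows that every weight is proportional to its own $\beta_m$ with a common factor; multiplying by $\beta_m$, summing over $m$, and solving for $\sum_k w_k \beta_k$ gives the last line.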

Step 3) Compute the unknown parameters in the weights for estimating the coherent sound, completing the extraction of coherent sound and ambient sound from the multi-channel signal. Specifically:

Step 3-1) Since $\beta_1 = 1$ is known, the unknown parameters $P_S$, $P_A$, and $\beta_2$ can be obtained from the signal energies of the first two channels and their inter-channel correlation value from step 1):

Figure BDA0002506583330000082

Figure BDA0002506583330000083

Figure BDA0002506583330000084

where

Figure BDA0002506583330000085

Figure BDA0002506583330000086
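The expressions for $P_S$, $P_A$, $\beta_2$ and the intermediate parameters $C$ and $D$ are images that are not reproduced in this text. As a hedged sketch, the same three unknowns can be obtained by solving the three model relations directly: $P_{X_1} = P_S + P_A$, $P_{X_2} = \beta_2^2 P_S + P_A$, and $\Phi_{12} = \beta_2 P_S$. The function name and the assumption $\Phi_{12} > 0$ (so that $\beta_2 > 0$) are introduced here; the algebraic form may differ from the patent's $C$ and $D$.

```python
import math

def estimate_two_channel_params(P_X1, P_X2, phi12):
    """Solve for (P_S, P_A, beta2) from two channel energies and phi12.

    Eliminating P_S and P_A gives the quadratic
        phi12 * beta2**2 - (P_X2 - P_X1) * beta2 - phi12 = 0,
    whose positive root is used (assumes phi12 > 0).
    """
    diff = P_X2 - P_X1
    root = math.sqrt(diff * diff + 4.0 * phi12 * phi12)
    beta2 = (diff + root) / (2.0 * phi12)
    P_S = phi12 / beta2          # from phi12 = beta2 * P_S
    P_A = P_X1 - P_S             # from P_X1 = P_S + P_A (beta1 = 1)
    return P_S, P_A, beta2
```

For example, energies 1.2 and 0.45 with a correlation of 0.5 correspond to $P_S = 1$, $P_A = 0.2$, and $\beta_2 = 0.5$.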

Step 3-2) From the energy values of the channels other than the first and second, $\beta_n$ can be obtained for $3 \le n \le N$:

Figure BDA0002506583330000087
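The expression for $\beta_n$ is an image that is not reproduced in this text. A plausible sketch, assuming the channel energy follows the model relation $P_{X_n} = \beta_n^2 P_S + P_A$ and that $\beta_n \ge 0$, is to invert that relation. The function name and the clamp at zero are additions made for this illustration.

```python
import math

def estimate_beta_n(P_Xn, P_S, P_A):
    """Invert P_Xn = beta_n**2 * P_S + P_A for a non-negative beta_n."""
    # Short-time estimates can make P_Xn - P_A slightly negative, so the
    # difference is clamped at zero before taking the square root.
    return math.sqrt(max(P_Xn - P_A, 0.0) / P_S)
```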

Step 3-3) For a multi-channel signal with $N$ channels, substitute the parameters $P_S$, $P_A$, and $\beta_n$ ($n = 1, 2, \ldots, N$) computed in steps 3-1) and 3-2) into the weights $w_n$ ($n = 1, 2, \ldots, N$) obtained in step 2-3); this completes the extraction of the coherent sound from the multi-channel signal.

Step 4) Perform PAE on a multi-channel signal with any number of channels. Specifically:

Step 4-1) Compute the coherent sound of each channel:

Step 2) gives the weight expression for estimating the coherent sound when performing PAE on a multi-channel signal with any number of channels, and step 3) gives the unknown parameters in that expression. Once the number of channels of the multi-channel signal is fixed, the coherent sound $S$ can therefore be estimated directly from the weight expression. This estimate is the coherent sound of the first channel; the coherent sound of the other channels is obtained from $S$ by linear scaling, that is, $\beta_n S$ ($n = 2, \ldots, N$).

Step 4-2) Compute the ambient sound of each channel:

The remaining component of each channel is taken to be the ambient sound, that is, $A_n = X_n - \beta_n S$.

Step 4-3) Apply an inverse Fourier transform to the resulting N channels of coherent sound and N channels of ambient sound to obtain time-domain coherent sound and ambient sound.
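Steps 1) to 4-2) can be sketched end to end for a single frequency bin as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: the function name is hypothetical, the short-time average is a mean over frames, the parameter formulas solve the model relations directly instead of using the patent's $C$ and $D$, the weight uses the closed form $w_n = \beta_n P_S / (P_A + P_S \sum_k \beta_k^2)$ derived from the model assumptions, and the correlation between the first two channels is assumed to be positive.

```python
import numpy as np

def extract_primary_ambient(X):
    """PAE sketch for one frequency bin.

    X : complex array (channels, frames), channels >= 2.
    Returns (S_n, A_n), the coherent and ambient parts, each shaped like X.
    """
    # Step 1: short-time energies and the correlation of channels 1 and 2.
    P_X = np.mean(np.abs(X) ** 2, axis=1)
    phi12 = float(np.real(np.mean(X[0] * np.conj(X[1]))))  # assumed > 0

    # Step 3-1: P_S, P_A and beta_2 from the first two channels.
    diff = P_X[1] - P_X[0]
    beta2 = (diff + np.sqrt(diff ** 2 + 4.0 * phi12 ** 2)) / (2.0 * phi12)
    P_S = phi12 / beta2
    P_A = P_X[0] - P_S

    # Step 3-2: remaining beta_n from channel energies (clamped at zero).
    beta = np.sqrt(np.maximum(P_X - P_A, 0.0) / P_S)
    beta[0], beta[1] = 1.0, beta2

    # Steps 2-3 and 3-3: weights and the estimate of S.
    w = beta * P_S / (P_A + P_S * np.sum(beta ** 2))
    S_hat = w @ X                       # shape (frames,)

    # Steps 4-1 and 4-2: per-channel coherent sound and the residual.
    S_n = beta[:, None] * S_hat[None, :]
    A_n = X - S_n
    return S_n, A_n
```

In a full system this would run per frequency bin (or band) of a short-time Fourier transform, and the resulting spectra would then be inverse-transformed frame by frame, for example with `numpy.fft.irfft` followed by overlap-add, to obtain the time-domain signals of step 4-3).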

The performance of the proposed method is illustrated below with simulation examples:

Fully correlated coherent sound and fully uncorrelated ambient sound are mixed in set proportions into five-channel signals, and the components are extracted with the multi-channel PAE method proposed by the invention and with the pairwise correlation method. Two mixed multi-channel signals were synthesized: mixed five-channel signal 1, with clean speech as the coherent sound and ocean waves as the ambient sound, and mixed five-channel signal 2, with clean music as the coherent sound and forest background sound as the ambient sound. During mixing, to control how the coherent sound energy is distributed over the channels, the amplitude difference factor $\beta_n$ of each channel is set in fixed proportion to a reference value $\beta_0$; the ambient sound energy of every channel is set equal to $P_{A0}$; and to control the share of coherent sound in the mixture, different coherent sound energy ratios $\gamma$ are used. The reference values $\beta_0$ and $P_{A0}$ are determined by $\gamma$.

In this experiment the coherent sound amplitudes of the channels follow the ratios $\beta_1 = \beta_2 = \beta_0$, $\beta_3 = 2\beta_0$, $\beta_4 = \beta_5 = 0.5\beta_0$. The coherent sound energy ratio $\gamma$ ranges from 0.05 to 0.95 in steps of 0.1. The extraction error $\varepsilon_P$ of the coherent sound is expressed as:

Figure BDA0002506583330000091

The extraction error $\varepsilon_a$ of the ambient sound is expressed as:

Figure BDA0002506583330000092
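The patent's error definitions are images that are not reproduced in this text, so the exact formulas for $\varepsilon_P$ and $\varepsilon_a$ are not known here. As one common choice, and purely as an assumption for illustration, an extraction error can be measured as the energy of the difference between the extracted and true component, normalized by the energy of the true component. The function name is hypothetical.

```python
import numpy as np

def normalized_extraction_error(estimate, reference):
    """Energy of (estimate - reference) divided by energy of reference."""
    estimate = np.asarray(estimate)
    reference = np.asarray(reference)
    return float(np.sum(np.abs(estimate - reference) ** 2)
                 / np.sum(np.abs(reference) ** 2))
```

With this definition a perfect extraction gives 0, and an all-zero estimate gives 1.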

Fig. 2(a) and Fig. 2(b) show the coherent sound and ambient sound extraction errors when the proposed algorithm and the pairwise correlation method each perform PAE on mixed five-channel signal 1; Fig. 3(a) and Fig. 3(b) show the corresponding errors for mixed five-channel signal 2. Over the whole range of the coherent sound energy ratio $\gamma$ from 0.05 to 0.95 (in steps of 0.1), the extraction error of the proposed method is smaller than that of the pairwise correlation method.

Embodiment 2 of the invention proposes a system for extracting coherent sound and ambient sound from multi-channel signals, the system comprising:

a coherent sound extraction module, configured to compute a weight expression for the coherent sound of N channel signals, estimate the coherent sound from the weight expression, and from it compute the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;

an ambient sound extraction module, configured to compute the ambient sound of each channel from the coherent sound of each channel;

a frequency-domain to time-domain module, configured to apply an inverse Fourier transform to the N channels of coherent sound and the N channels of ambient sound to obtain time-domain coherent sound and ambient sound.

Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions of the technical solution that do not depart from its spirit and scope are all covered by the claims of the invention.

Claims (4)

1. A method for extracting coherent sound and ambient sound from multi-channel signals, the method comprising:
computing a weight expression for the coherent sound of N channel signals, estimating the coherent sound from the weight expression, and from it computing the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;
computing the ambient sound of each channel from the coherent sound of each channel;
applying an inverse Fourier transform to the N channels of coherent sound and the N channels of ambient sound to obtain time-domain coherent sound and ambient sound;
wherein computing the weight expression for the coherent sound of the N channel signals, estimating the coherent sound from the weight expression, and from it computing the coherent sound of each channel, with the ambient sound energy of every channel being the same, specifically comprises:
applying a Fourier transform to the time-domain multi-channel signal, the input signal $X_n$ of the n-th channel being expressed as:
$X_n = \beta_n S + A_n$
where $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, $1 \le n \le N$, $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;
computing the short-time energy of the n-th channel input signal $X_n$ (symbol shown in Figure FDA0002918929400000011):
Figure FDA0002918929400000012
computing the correlation value $\Phi_{12}$ between the first channel and the second channel:
Figure FDA0002918929400000013
computing the intermediate parameters $C$ and $D$ from the short-time energy of the first channel (Figure FDA0002918929400000014), the short-time energy of the second channel (Figure FDA0002918929400000015), and the inter-channel correlation value $\Phi_{12}$:
Figure FDA0002918929400000016
Figure FDA0002918929400000017
由此计算相干声的短时能量PS、环境声的短时能量PA以及β2 From this, the short-term energy P S of the coherent sound, the short - term energy PA and β 2 of the ambient sound are calculated.
Figure FDA0002918929400000021
Figure FDA0002918929400000021
Figure FDA0002918929400000022
Figure FDA0002918929400000022
Figure FDA0002918929400000023
Figure FDA0002918929400000023
计算βnCalculate β n :
Figure FDA0002918929400000024
Figure FDA0002918929400000024
第n个通道的权重值为:The weight value of the nth channel is:
Figure FDA0002918929400000025
Figure FDA0002918929400000025
则相干声的估计值
Figure FDA0002918929400000026
为:
Then the estimated value of coherent sound
Figure FDA0002918929400000026
for:
Figure FDA0002918929400000027
Figure FDA0002918929400000027
则第n个通道相干声Sn
Figure FDA0002918929400000028
Then the nth channel coherent sound Sn:
Figure FDA0002918929400000028
2. The method for extracting coherent sound and ambient sound from a multi-channel signal according to claim 1, characterized in that calculating the ambient sound of each channel from the coherent sound of each channel specifically comprises:
the ambient sound $A_n$ of the n-th channel being:
$A_n = X_n - S_n$.
3. A system for extracting coherent sound and ambient sound from a multi-channel signal, characterized in that the system comprises:
a coherent sound extraction module, configured to calculate a weight expression for the coherent sound of the N channel signals, estimate the coherent sound according to the weight expression, and thereby calculate the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;
an ambient sound extraction module, configured to calculate the ambient sound of each channel from the coherent sound of each channel;
a frequency-domain to time-domain conversion module, configured to perform an inverse Fourier transform on the N channels of coherent sound and the N channels of ambient sound to obtain the coherent sound and the ambient sound represented in the time domain;
wherein the specific implementation process of the coherent sound extraction module comprises:
performing a Fourier transform on the time-domain multi-channel signal, the input signal $X_n$ of the n-th channel being expressed as:
$X_n = \beta_n S + A_n$
where $S$ denotes the spectrum of the coherent sound, $\beta_n$ denotes the amplitude difference factor between the coherent sound of the n-th channel and the coherent sound of the first channel, $1 \le n \le N$, $\beta_1 = 1$, and $A_n$ denotes the spectrum of the ambient sound of the n-th channel;
calculating the short-time energy of the n-th channel input signal $X_n$ (symbol given in formula image FDA0002918929400000031):
[Formula image FDA0002918929400000032]
calculating the correlation value $\Phi_{12}$ between the first channel and the second channel:
[Formula image FDA0002918929400000033]
calculating the intermediate parameters $C$ and $D$ from the short-time energy of the first channel (symbol given in formula image FDA0002918929400000034), the short-time energy of the second channel (symbol given in formula image FDA0002918929400000035), and the inter-channel correlation value $\Phi_{12}$:
[Formula image FDA0002918929400000036]
[Formula image FDA0002918929400000037]
calculating from these the short-time energy $P_S$ of the coherent sound, the short-time energy $P_A$ of the ambient sound, and $\beta_2$:
[Formula image FDA0002918929400000038]
[Formula image FDA0002918929400000039]
[Formula image FDA00029189294000000310]
calculating $\beta_n$:
[Formula image FDA00029189294000000311]
the weight value of the n-th channel being:
[Formula image FDA00029189294000000312]
the estimate of the coherent sound (symbol given in formula image FDA00029189294000000313) then being:
[Formula image FDA0002918929400000041]
and the coherent sound $S_n$ of the n-th channel then being:
[Formula image FDA0002918929400000042]
4. The system for extracting coherent sound and ambient sound from a multi-channel signal according to claim 3, characterized in that the specific implementation process of the ambient sound extraction module comprises:
the ambient sound $A_n$ of the n-th channel being:
$A_n = X_n - S_n$.
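The claimed steps can be sketched in code. The Python below is an illustration only and does not reproduce the patented formulas: the expressions for C, D, P_S, P_A, β_n and the channel weights appear in the claims only as formula images. The sketch instead derives closed forms directly from the stated model X_n = β_n S + A_n with equal ambient energy in every channel. It adds assumptions that the claims do not state: the ambient components are uncorrelated with the coherent sound and with each other, β_n is real, the correlation between channels 1 and 2 is positive, and expectations are approximated by averages over the frames of a single frequency bin. The weights used are the standard linear minimum mean-square-error weights for this model, which may differ from the weights in the claims. The names `extract_coherent_ambient` and `synthesize` are hypothetical.

```python
import math
import random


def synthesize(betas, n_frames, p_s=1.0, p_a=0.25, seed=0):
    """Hypothetical test helper: draw X_n = beta_n * S + A_n for one frequency bin."""
    rng = random.Random(seed)

    def cgauss(variance):
        # Circular complex Gaussian sample with the given total variance.
        sigma = math.sqrt(variance / 2.0)
        return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

    S = [cgauss(p_s) for _ in range(n_frames)]
    # Each channel gets its own independent ambient term with the same energy p_a.
    return [[b * s + cgauss(p_a) for s in S] for b in betas]


def extract_coherent_ambient(X):
    """X[n][t]: complex spectrum of channel n at frame t (one frequency bin, N >= 2).

    Assumes (not from the patent) uncorrelated ambient terms, real beta_n,
    and a positive correlation between channels 1 and 2.
    """
    n_ch, n_fr = len(X), len(X[0])

    # Short-time energy of each channel, approximated by a frame average.
    P_X = [sum(abs(x) ** 2 for x in ch) / n_fr for ch in X]
    # Real part of the correlation between channel 1 and channel n.
    phi = [sum((a * b.conjugate()).real for a, b in zip(X[0], ch)) / n_fr for ch in X]

    # Model: P_X1 = P_S + P_A, P_X2 = beta_2^2 P_S + P_A, phi_12 = beta_2 P_S,
    # hence (P_X2 - P_X1) / phi_12 = beta_2 - 1 / beta_2; take the positive root.
    c = (P_X[1] - P_X[0]) / phi[1]
    beta_2 = 0.5 * (c + math.sqrt(c * c + 4.0))
    P_S = phi[1] / beta_2
    P_A = max(P_X[0] - P_S, 0.0)

    # phi_1n = beta_n P_S for n >= 2, and beta_1 = 1 by definition.
    beta = [1.0] + [phi[n] / P_S for n in range(1, n_ch)]

    # Linear MMSE weights for equal-energy, uncorrelated ambient terms.
    denom = P_S * sum(b * b for b in beta) + P_A
    w = [b * P_S / denom for b in beta]

    # Coherent estimate, per-channel coherent sound, and ambient residual.
    S_hat = [sum(w[n] * X[n][t] for n in range(n_ch)) for t in range(n_fr)]
    S_n = [[beta[n] * s for s in S_hat] for n in range(n_ch)]
    A_n = [[X[n][t] - S_n[n][t] for t in range(n_fr)] for n in range(n_ch)]
    return S_n, A_n, beta, P_S, P_A
```

For example, `extract_coherent_ambient(synthesize([1.0, 0.5, 2.0], 50000))` returns the per-channel coherent and ambient spectra together with the estimated β_n, P_S and P_A. In a full system this would run per frequency bin between the Fourier transform and the inverse Fourier transform named in the claims.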

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010447863.9A CN111669697B (en) 2020-05-25 2020-05-25 A method and system for extracting coherent sound and ambient sound from multi-channel signals

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010447863.9A CN111669697B (en) 2020-05-25 2020-05-25 A method and system for extracting coherent sound and ambient sound from multi-channel signals

Publications (2)

Publication Number Publication Date
CN111669697A CN111669697A (en) 2020-09-15
CN111669697B true CN111669697B (en) 2021-05-18

Family

ID=72384501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010447863.9A Active CN111669697B (en) 2020-05-25 2020-05-25 A method and system for extracting coherent sound and ambient sound from multi-channel signals

Country Status (1)

Country Link
CN (1) CN111669697B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101401151A (en) * 2006-03-15 2009-04-01 法国电信公司 Device and method for graduated encoding of a multichannel audio signal based on a principal component analysis
CN101816191A (en) * 2007-09-26 2010-08-25 弗劳恩霍夫应用研究促进协会 Apparatus and method for extracting environmental signal and computer program in apparatus and method for obtaining weighting coefficient for extracting environmental signal
EP2523473A1 (en) * 2011-05-11 2012-11-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating an output signal employing a decomposer
CN109036455A (en) * 2018-09-17 2018-12-18 中科上声(苏州)电子有限公司 Direct sound wave and background sound extracting method, speaker system and its sound playback method
CN110534129A (en) * 2018-05-23 2019-12-03 哈曼贝克自动系统股份有限公司 The separation of dry sound and ambient sound

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902822B (en) * 2014-03-28 2017-09-08 西安交通大学苏州研究院 Sources number detection method in the case of the mixing of incoherent and coherent signal
CN110531310B (en) * 2019-07-25 2021-07-13 西安交通大学 Direction of Arrival Estimation Method for Far-Field Coherent Signals Based on Subspace and Interpolation Transformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
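Objective performance evaluation of coherent sound and ambient sound extraction methods; Wu Yanqin et al.; 《声学技术》 (Technical Acoustics); 20191031; Vol. 38 (No. 5); full text *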

Also Published As

Publication number Publication date
CN111669697A (en) 2020-09-15

Similar Documents

Publication Publication Date Title
JP2024084842A (en) Method or device for compressing or decompressing higher-order ambisonic signal representations
JP6100441B2 (en) Binaural room impulse response filtering using content analysis and weighting
CN1985303B (en) Apparatus and method for generating a multi-channel output signal
US8705750B2 (en) Device and method for converting spatial audio signal
Gu et al. Complex neural spatial filter: Enhancing multi-channel target speech separation in complex domain
CN110728989B (en) A Binaural Speech Separation Method Based on Long Short-Term Memory Network LSTM
TW202022853A (en) Method and apparatus for decoding encoded audio signal in ambisonics format for l loudspeakers at known positions and computer readable storage medium
CN104134444B (en) A kind of song based on MMSE removes method and apparatus of accompanying
Su et al. Inras: Implicit neural representation for audio scenes
CN114203163A (en) Audio signal processing method and device
CN106847301A (en) A kind of ears speech separating method based on compressed sensing and attitude information
CN117501362A (en) Audio rendering system, method and electronic equipment
CN111464932A (en) Sound field reconstruction method, device, device and storage medium based on multiple listening points
CN111669697B (en) A method and system for extracting coherent sound and ambient sound from multi-channel signals
CN104424971B (en) A kind of audio file play method and device
Chun et al. Real-time conversion of stereo audio to 5.1 channel audio for providing realistic sounds
Hu et al. SMMA-Net: An audio clue-based target speaker extraction network with spectrogram matching and mutual attention
Lee et al. DeFTAN-II: Efficient multichannel speech enhancement with subgroup processing
Zhou et al. Binaural Sound Source Localization Based on Convolutional Neural Network.
CN109068262B (en) A loudspeaker-based personalized sound image reproduction method and device
CN111711918B (en) A method and system for extracting coherent sound and ambient sound from multi-channel signals
WO2020057050A1 (en) Method for extracting direct sound and background sound, and loudspeaker system and sound reproduction method therefor
CN117376784A (en) Method for expanding mono stereo field, electronic device, and storage medium
Gramaccioni et al. L3das23: Learning 3d audio sources for audio-visual extended reality
Wu et al. Microphone array speech separation algorithm based on dnn

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant