CN111669697B

CN111669697B - A method and system for extracting coherent sound and ambient sound from multi-channel signals

Info

Publication number: CN111669697B
Application number: CN202010447863.9A
Authority: CN
Inventors: 吴彦琴; 桑晋秋; 郑成诗; 张芳杰; 李晓东
Original assignee: Institute of Acoustics CAS
Current assignee: Institute of Acoustics CAS
Priority date: 2020-05-25
Filing date: 2020-05-25
Publication date: 2021-05-18
Anticipated expiration: 2040-05-25
Also published as: CN111669697A

Abstract

The invention discloses a method and system for extracting coherent sound and ambient sound from multi-channel signals. The method comprises: calculating a weight expression for the coherent sound of N channel signals, estimating the coherent sound according to the weight expression, and calculating the coherent sound of each channel accordingly, wherein the ambient sound energy of every channel is the same; calculating the ambient sound of each channel from the coherent sound of each channel; and performing an inverse Fourier transform on the coherent sound and the ambient sound of the N channels to obtain the coherent sound and ambient sound in the time domain. Under the condition that the ambient sound energy of every channel is the same, the method derives a weight expression for estimating the coherent sound of a signal with any number of channels, and uses the signal energy of each channel and the inter-channel correlation values to determine the unknown parameters in the weight expression, thereby extracting the coherent sound and ambient sound of multi-channel signals with high accuracy.

Description

A method and system for extracting coherent sound and ambient sound from multi-channel signals

Technical field

The invention relates to the field of spatial sound reproduction, and in particular to a method and system for extracting coherent sound and ambient sound from multi-channel signals.

Background

Spatial sound reproduction is widely used in entertainment media. When cinemas, home theaters, and portable electronic devices play films, reproducing spatial sound with a certain sound-image width and good immersion through headphones or loudspeakers gives consumers a better audio-visual experience. In recent years, spatial sound reproduction has also shown important application prospects in cutting-edge scientific research and practical engineering, for example in virtual reality, aviation, and aerospace.

Spatial sound mainly contains two components with different properties: a directional component, called coherent sound, and a diffuse component whose direction cannot be discerned, called ambient sound. To achieve better reproduction, the spatial sound must be separated into coherent sound and ambient sound by primary-ambient extraction (PAE), and the two components are then processed differently. For example, in an audio codec system, using PAE as the front end of audio encoding or decoding enables efficient and immersive spatial sound reproduction.

PAE methods for two-channel signals are relatively mature; the most widely used are principal component analysis and the least squares method. For multi-channel signals, the pairwise correlation method can be used for PAE, but its extraction accuracy is not high. Extending the PAE methods suited to stereo to multi-channel signals is therefore of great significance. Principal component analysis performs PAE on a stereo signal by computing the eigenvalues of the covariance matrix of the input signal, under the premise that the coherent sound carries most of the energy. It can also be applied to multi-channel signals, but the computational complexity grows with the number of channels, and it works well only when the coherent sound dominates the energy. The least squares method performs PAE on a stereo signal by computing the weights used to estimate the coherent sound, under the premise that the ambient sound energy is equal in every channel. When it is applied directly to multi-channel signals, however, the estimation weights are not easy to solve. The ambient component mainly sets the atmosphere in spatial sound, and to achieve a better sense of envelopment its energy differs little from channel to channel. Extracting coherent sound and ambient sound from multi-channel signals under the premise of equal ambient sound energy in every channel is therefore of great significance.

Summary of the invention

The purpose of the present invention is to overcome the above technical shortcomings. Under the premise that the ambient sound energy is equal in every channel, the least squares weights for estimating the coherent sound are computed for small numbers of channels, and from the regular way in which these weights change with the number of channels, a weight expression is obtained for estimating the coherent sound of a multi-channel signal with any number of channels.

To achieve the above purpose, the present invention proposes a method for extracting coherent sound and ambient sound from multi-channel signals, the method comprising:

calculating a weight expression for the coherent sound of N channel signals, estimating the coherent sound according to the weight expression, and thereby calculating the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;

calculating the ambient sound of each channel from the coherent sound of each channel; and

performing an inverse Fourier transform on the N-channel coherent sound and the N-channel ambient sound to obtain the coherent sound and the ambient sound in the time domain.

As an improvement of the above method, calculating the weight expression for the coherent sound of the N channel signals, estimating the coherent sound according to the weight expression, and thereby calculating the coherent sound of each channel, wherein the ambient sound energy of every channel is the same, specifically includes:

Perform a Fourier transform on the time-domain multi-channel signal; the input signal X_n of the n-th channel is expressed as:

X_n = β_n S + A_n

where S is the spectrum of the coherent sound, β_n is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, 1 ≤ n ≤ N, β₁ = 1, and A_n is the spectrum of the ambient sound of the n-th channel;

Calculate the short-term energy of the input signal X_n of the n-th channel

Calculate the correlation value Φ₁₂ between the first channel and the second channel:

According to the short-term energy of the first channel, the short-term energy of the second channel, and the correlation value Φ₁₂ between the two channels, calculate the intermediate parameters C and D:

From these, calculate the short-term energy P_S of the coherent sound, the short-term energy P_A of the ambient sound, and β₂

Calculate β_n:

The weight value of the n-th channel is:

Then the estimate of the coherent sound is:

Then the coherent sound S_n of the n-th channel:

As an improvement of the above method, calculating the ambient sound of each channel from the coherent sound of each channel is specifically:

The ambient sound A_n of the n-th channel is:

A_n = X_n - S_n.

Embodiment 2 of the present invention proposes a system for extracting coherent sound and ambient sound from multi-channel signals, the system comprising:

a coherent sound extraction module, configured to calculate a weight expression for the coherent sound of N channel signals, estimate the coherent sound according to the weight expression, and thereby calculate the coherent sound of each channel, wherein the ambient sound energy of every channel is the same;

an ambient sound extraction module, configured to calculate the ambient sound of each channel from the coherent sound of each channel; and

a frequency-domain to time-domain module, configured to perform an inverse Fourier transform on the N-channel coherent sound and the N-channel ambient sound to obtain the coherent sound and the ambient sound in the time domain.

As an improvement of the above system, the specific implementation process of the coherent sound extraction module includes:

X_n = β_n S + A_n

Calculate the short-term energy of the input signal X_n of the n-th channel

According to the short-term energy of the first channel and the short-term energy of the second channel

Calculate β_n:

The weight value of the n-th channel is:

Then the estimate of the coherent sound is:

Then the coherent sound S_n of the n-th channel:

As an improvement of the above system, the specific implementation process of the ambient sound calculation module includes:

The ambient sound A_n of the n-th channel is:

A_n = X_n - S_n.

The advantages of the present invention are:

Under the condition that the ambient sound energy of every channel is the same, the method derives a weight expression for estimating the coherent sound of a signal with any number of channels, and uses the signal energy of each channel and the inter-channel correlation values to determine the unknown parameters in the weight expression. This enables coherent sound and ambient sound to be extracted from multi-channel signals with high accuracy.

Description of the drawings

Fig. 1 is a flowchart of the method of the present invention for extracting coherent sound and ambient sound from multi-channel signals;

Fig. 2(a) is an error plot for coherent sound component extraction from mixed five-channel signal 1 using the method of the present invention and the pairwise correlation method;

Fig. 2(b) is an error plot for ambient sound component extraction from mixed five-channel signal 1 using the method of the present invention and the pairwise correlation method;

Fig. 3(a) is an error plot for coherent sound component extraction from mixed five-channel signal 2 using the method of the present invention and the pairwise correlation method;

Fig. 3(b) is an error plot for ambient sound component extraction from mixed five-channel signal 2 using the method of the present invention and the pairwise correlation method.

Detailed description

The technical solutions of the present invention are described in detail below with reference to the drawings and specific embodiments.

Embodiment 1

As shown in Fig. 1, Embodiment 1 of the present invention proposes a method for extracting coherent sound and ambient sound from a multi-channel signal when the ambient sound energy of every channel is equal, comprising the following steps:

Step 1) Divide the multi-channel signal into frames and apply a Fourier transform to obtain the spectrum, then express the short-term energy of each channel and the correlation value between any two channels according to the multi-channel signal model. Specifically:

In the multi-channel signal model, the input signal is represented as the superposition of coherent sound and ambient sound. Because coherent sound and ambient sound differ in character, the coherent sounds of the channels are assumed to be fully correlated, that is, linearly related; the coherent sound is assumed to be uncorrelated with the ambient sound of every channel, and the ambient sounds of different channels are assumed to be uncorrelated with one another.

Step 1-1) Apply a Fourier transform to the time-domain multi-channel signal to obtain the spectrum:

X_n = β_n S + A_n, n = 1, 2, …, N

where N is the number of channels, S is the spectrum of the coherent sound, β_n is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, with β₁ = 1, and A_n is the spectrum of the ambient sound of the n-th channel;
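As a minimal sketch of step 1-1), the following Python code splits each channel into frames and applies a discrete Fourier transform to every frame. The frame length, the hop size, the absence of a window function, and all function names are illustrative assumptions and are not taken from the patent; a naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math

def frame_signal(x, frame_len, hop):
    """Split a 1-D sequence into frames; tail samples that do not fill a frame are dropped."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def dft(frame):
    """Naive O(L^2) discrete Fourier transform of one frame."""
    L = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / L) for t in range(L))
            for k in range(L)]

def idft(spectrum):
    """Inverse of dft(); returns complex time-domain samples."""
    L = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / L) for k in range(L)) / L
            for t in range(L)]

def multichannel_spectra(channels, frame_len, hop):
    """Return spectra[n][m][k]: channel n, frame m, frequency bin k."""
    return [[dft(f) for f in frame_signal(ch, frame_len, hop)] for ch in channels]
```

Each frequency bin k can then be processed independently across frames, and `idft` maps a processed frame back to the time domain.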

Step 1-2) The signal energy of each channel can be expressed as:

where E{·} denotes the short-term average.

Step 1-3) The correlation value between channels can be expressed as:

where Φ_{n₁n₂} is the correlation value between the n₁-th channel and the n₂-th channel, n₁ = 1, 2, …, N, n₂ = 1, 2, …, N, n₁ ≠ n₂;
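Under the model X_n = β_n S + A_n, with equal ambient energy P_A in every channel and the uncorrelatedness assumptions of step 1), the short-term energy of channel n works out to β_n² P_S + P_A and the correlation between two different channels to β_{n₁} β_{n₂} P_S. The sketch below estimates both quantities as averages over frames for a single frequency bin. The function names and the toy data are assumptions made for illustration only.

```python
def short_time_energy(bins):
    """E{|X_n|^2}: mean squared magnitude of one frequency bin over several frames."""
    return sum(abs(x) ** 2 for x in bins) / len(bins)

def cross_correlation(bins_a, bins_b):
    """E{X_a X_b^*}: mean conjugate product of two channels over the same frames."""
    return sum(a * b.conjugate() for a, b in zip(bins_a, bins_b)) / len(bins_a)

# Toy data for one frequency bin over four frames, built so that S, A1 and A2
# are exactly uncorrelated: P_S = 1, P_A = 0.25, beta_1 = 1, beta_2 = 2.
s = [1.0, 1.0, -1.0, -1.0]
a1 = [0.5, -0.5, 0.5, -0.5]
a2 = [0.5, -0.5, -0.5, 0.5]
x1 = [sv + av for sv, av in zip(s, a1)]
x2 = [2.0 * sv + av for sv, av in zip(s, a2)]

p_x1 = short_time_energy(x1)         # beta_1^2 * P_S + P_A
p_x2 = short_time_energy(x2)         # beta_2^2 * P_S + P_A
phi_12 = cross_correlation(x1, x2)   # beta_1 * beta_2 * P_S
```

With real recordings the averages only approximate these relations, because the components are never exactly uncorrelated over a finite number of frames.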

Step 2) Calculate the least squares weights for estimating the coherent sound when the number of channels is small, and examine their regularity. Specifically:

Step 2-1) For a two-channel signal, calculate the weights with which the input signals X₁ and X₂ estimate the coherent sound S:

Step 2-1-1) Estimate the coherent sound S:

where w₁ and w₂ are the estimation weights to be determined.

Step 2-1-2) The estimation error σ_S of S is expressed as:

Step 2-1-3) Solve using the least squares algorithm; that is, the weights are optimal when the estimation error is completely uncorrelated with the input stereo signal:

E{σ_S X₁} = 0

E{σ_S X₂} = 0.

The optimal weights are then expressed as:

where P_S is the short-term energy of the coherent sound and P_A is the short-term energy of the ambient sound.
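The orthogonality conditions of step 2-1-3) can be worked through as follows. This is a derivation sketch under the stated model (β₁ = 1, equal ambient energy P_A, real-valued β₂). The sign convention σ_S = S - Ŝ and the explicit complex conjugates are assumptions, and the resulting weights may be written in a different but equivalent form in the original publication.

```latex
\begin{aligned}
\hat{S} &= w_1 X_1 + w_2 X_2, \qquad \sigma_S = S - \hat{S},\\
E\{\sigma_S X_1^{*}\} = 0 \;&\Rightarrow\; P_S = w_1 (P_S + P_A) + w_2\,\beta_2 P_S,\\
E\{\sigma_S X_2^{*}\} = 0 \;&\Rightarrow\; \beta_2 P_S = w_1\,\beta_2 P_S + w_2 (\beta_2^{2} P_S + P_A),\\
w_1 &= \frac{P_S}{(1+\beta_2^{2})\,P_S + P_A}, \qquad
w_2 = \frac{\beta_2 P_S}{(1+\beta_2^{2})\,P_S + P_A}.
\end{aligned}
```

Substituting w₁ and w₂ back into either condition reproduces its left-hand side, which confirms the solution.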

Step 2-2) For a three-channel signal, calculate the weights with which the input signals X₁, X₂ and X₃ estimate the coherent sound S:

Step 2-2-1) Estimate the coherent sound S:

where w₁, w₂ and w₃ are the estimation weights to be determined.

Step 2-2-2) Processing similar to step 2-1) yields the weights for estimating the coherent sound of the three-channel signal:

Step 2-3) Calculating the estimation weights of the coherent sound for larger numbers of channels shows that the weights can be expressed in a unified form. For a multi-channel signal with N channels, the estimated coherent sound is expressed as:

where the weights can be expressed as:
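Solving the orthogonality conditions for N channels under the model X_n = β_n S + A_n gives one closed form, w_n = β_n P_S / (P_A + P_S Σ β_i²). The sketch below evaluates it and checks the orthogonality residuals numerically. The function names are hypothetical, and the formula is derived here from the stated assumptions (real β_n, equal P_A) rather than copied from the patent's weight expression.

```python
def ls_weights(betas, p_s, p_a):
    """w_n = beta_n * P_S / (P_A + P_S * sum_i beta_i^2), one weight per channel."""
    denom = p_a + p_s * sum(b * b for b in betas)
    return [b * p_s / denom for b in betas]

def orthogonality_residuals(betas, p_s, p_a, weights):
    """Residual of E{(S - S_hat) X_m^*} = 0 for each channel m under the model."""
    residuals = []
    for m, b_m in enumerate(betas):
        # E{X_n X_m^*} = beta_n * beta_m * P_S, plus P_A when n == m
        s_hat_term = sum(w * (b_n * b_m * p_s + (p_a if n == m else 0.0))
                         for n, (w, b_n) in enumerate(zip(weights, betas)))
        # E{S X_m^*} = beta_m * P_S
        residuals.append(b_m * p_s - s_hat_term)
    return residuals

weights = ls_weights([1.0, 1.5, 2.0], p_s=2.0, p_a=0.5)
residuals = orthogonality_residuals([1.0, 1.5, 2.0], 2.0, 0.5, weights)
```

Each weight is proportional to its channel's β_n, so channels that carry more of the coherent sound contribute more to the estimate.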

Step 3) Calculate the unknown parameters in the weights for estimating the coherent sound, and complete the extraction of coherent sound and ambient sound from the multi-channel signal. Specifically:

Step 3-1) Since β₁ = 1 is known, the unknown parameters P_S, P_A and β₂ can be obtained from the signal energies of the first two channels and their inter-channel correlation value from step 1):

where

Step 3-2) From the energy values of the channels other than the first and second, β_n for 3 ≤ n ≤ N can be obtained:

Step 3-3) For a multi-channel signal with N channels, substituting the parameters P_S, P_A and β_n (n = 1, 2, …, N) calculated in steps 3-1) and 3-2) into the weights w_n (n = 1, 2, …, N) obtained in step 2-3) completes the extraction of the coherent sound from the multi-channel signal.
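One algebraic route to the unknown parameters is sketched below. It assumes the relations implied by the signal model, namely P_X1 = P_S + P_A, P_X2 = β₂² P_S + P_A and Φ₁₂ = β₂ P_S, where P_Xn is an assumed name for the short-term energy of channel n. It does not use the patent's intermediate parameters C and D. The function name, the real-valued Φ₁₂, and the choice of the non-negative root for β_n (n ≥ 3) are further assumptions.

```python
import math

def estimate_parameters(p_x, phi_12):
    """Estimate P_S, P_A and all beta_n from channel energies p_x[0..N-1] and Phi_12."""
    if phi_12 == 0:
        raise ValueError("Phi_12 = 0: channels 1 and 2 share no coherent energy")
    # From P_X2 - P_X1 = (beta_2^2 - 1) P_S and Phi_12 = beta_2 P_S:
    # beta_2 - 1/beta_2 = c, i.e. beta_2^2 - c * beta_2 - 1 = 0.
    c = (p_x[1] - p_x[0]) / phi_12
    # The two roots have product -1; beta_2 takes the sign of Phi_12 because P_S > 0.
    beta_2 = (c + math.copysign(math.sqrt(c * c + 4.0), phi_12)) / 2.0
    p_s = phi_12 / beta_2
    p_a = p_x[0] - p_s  # P_X1 = P_S + P_A because beta_1 = 1
    betas = [1.0, beta_2]
    for p_xn in p_x[2:]:
        # P_Xn = beta_n^2 P_S + P_A; clamp at 0 to guard against estimation noise
        betas.append(math.sqrt(max(p_xn - p_a, 0.0) / p_s))
    return p_s, p_a, betas

# Energies generated from P_S = 2, P_A = 0.5, betas = [1, 1.5, 2]
p_s, p_a, betas = estimate_parameters([2.5, 5.0, 8.5], 3.0)
```

The magnitude of β_n for n ≥ 3 follows from the channel energy alone; its sign would need a correlation with another channel, which this sketch does not use.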

Step 4) Perform PAE on a multi-channel signal with any number of channels. Specifically:

Step 4-1) Calculate the coherent sound of each channel:

Step 2) gives the weight expression for estimating the coherent sound when PAE is performed on a multi-channel signal with any number of channels, and step 3) gives the unknown parameters in that expression. Once the number of channels is fixed, the coherent sound S can therefore be estimated directly from the weight expression. This estimate is the coherent sound of the first channel; the coherent sound of each other channel is obtained from S by linear scaling, namely β_n S (n = 2, …, N).

Step 4-2) Calculate the ambient sound of each channel:

The remaining component of each channel is taken to be the ambient sound, that is, A_n = X_n - β_n S.

Step 4-3) Apply an inverse Fourier transform to the resulting N-channel coherent sound and N-channel ambient sound to obtain the coherent sound and ambient sound in the time domain.
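Steps 4-1) and 4-2) can be sketched for a single frequency bin as follows. The function name, the data layout `x[n][m]` (channel n, frame m), and the noise-free toy values are assumptions for illustration; the weights and β values are supplied by the caller, and the inverse transform of step 4-3) is left out.

```python
def extract_primary_ambient(x, weights, betas):
    """Split x[n][m] (channel n, frame m, one frequency bin) into coherent and ambient parts."""
    n_frames = len(x[0])
    # Estimate of S per frame: weighted sum of the channel spectra
    s_hat = [sum(w * ch[m] for w, ch in zip(weights, x)) for m in range(n_frames)]
    # Coherent sound of channel n: beta_n times the estimate
    coherent = [[b * s for s in s_hat] for b in betas]
    # Ambient sound of channel n: what remains of X_n
    ambient = [[xv - sv for xv, sv in zip(ch, s_n)] for ch, s_n in zip(x, coherent)]
    return coherent, ambient

# Noise-free toy case: X_1 = S and X_2 = 2 S, so P_A = 0 and w_n = beta_n / sum(beta_i^2).
s_true = [1 + 1j, -2 + 0j, 0.5j]
betas = [1.0, 2.0]
x = [[b * s for s in s_true] for b in betas]
weights = [0.2, 0.4]
coherent, ambient = extract_primary_ambient(x, weights, betas)
```

By construction the coherent and ambient parts of each channel sum back to the input, so no signal energy is lost in the split.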

The performance of the proposed method is illustrated below with a simulation example:

Fully correlated coherent sound and fully uncorrelated ambient sound are mixed in a given proportion to synthesize five-channel signals, and the components are extracted with the multi-channel PAE method proposed by the present invention and with the pairwise correlation method. Two groups of mixed multi-channel signals are synthesized: mixed five-channel signal 1, with clean speech as the coherent sound and ocean waves as the ambient sound, and mixed five-channel signal 2, with clean music as the coherent sound and forest background sound as the ambient sound. During mixing, to control the distribution of coherent sound energy across channels, the coherent sound amplitude difference factor β_n of each channel is set in a fixed ratio to a reference value β₀; the ambient sound energy of every channel is set equal to P_A0; and to control the proportion of the coherent component in the mixed signal, different values of the coherent sound energy ratio γ are used. The reference values β₀ and P_A0 are determined by γ.

In this experiment the coherent sound amplitudes of the channels are set in the ratios β₁ = β₂ = β₀, β₃ = 2β₀, and β₄ = β₅ = 0.5β₀. The coherent sound energy ratio γ takes values from 0.05 to 0.95 in steps of 0.1. The extraction error ε_P of the coherent sound is expressed as:

The extraction error ε_a of the ambient sound is expressed as:

Figs. 2(a) and 2(b) show the extraction errors of the coherent sound and the ambient sound, respectively, when the proposed algorithm and the pairwise correlation method perform PAE on mixed five-channel signal 1; Figs. 3(a) and 3(b) show the same errors for mixed five-channel signal 2. Over the whole range of γ from 0.05 to 0.95 (in steps of 0.1), the extraction error of the proposed method is smaller than that of the pairwise correlation method.

Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solutions of the present invention that do not depart from their spirit and scope shall all fall within the scope of the claims of the present invention.

Claims

1. A method for extracting coherent sound and ambient sound of a multi-channel signal, the method comprising:

Calculate the weight expression of the coherent sound of the N channel signals, estimate the coherent sound according to the weight expression, and thereby calculate the coherent sound of each channel; wherein, the ambient sound energy of each channel is the same;

Calculate the ambient sound of each channel according to the coherent sound of each channel;

Perform inverse Fourier transform on the N-channel coherent sound and the N-channel ambient sound to obtain the coherent sound and ambient sound represented in the time domain;

wherein calculating the weight expression of the coherent sound of the N channel signals, estimating the coherent sound according to the weight expression, and thereby calculating the coherent sound of each channel, the ambient sound energy of each channel being the same, specifically includes:

Fourier transform is performed on the time-domain multi-channel signal, and the input signal X_n of the n-th channel is expressed as:

X_n = β_n S + A_n

where S is the spectrum of the coherent sound, β_n is the amplitude difference factor between the coherent sound of the n-th channel and the coherent sound of the first channel, 1 ≤ n ≤ N, β₁ = 1, and A_n is the spectrum of the ambient sound of the n-th channel;

Calculate the short-term energy of the input signal X_n of the n-th channel

Calculate the correlation value Φ₁₂ between the first channel and the second channel:

According to the short-term energy of the first channel and the short-term energy of the second channel

From this, the short-term energy P_S of the coherent sound, the short-term energy P_A of the ambient sound, and β₂ are calculated.

Calculate β_n:

The weight value of the n-th channel is:

Then the estimate of the coherent sound is:

Then the coherent sound S_n of the n-th channel:

2. The method for extracting coherent sound and ambient sound of multi-channel signals according to claim 1, wherein the ambient sound of each channel is calculated according to the coherent sound of each channel; specifically:

The ambient sound A_n of the n-th channel is:

A_n = X_n - S_n.

3. A coherent sound and ambient sound extraction system for multi-channel signals, wherein the system comprises:

The coherent sound extraction module is used to calculate the weight expression of the coherent sound of the N channel signals, and estimates the coherent sound according to the weight expression, thereby calculating the coherent sound of each channel; wherein, the ambient sound energy of each channel is the same;

The ambient sound extraction module is used to calculate the ambient sound of each channel according to the coherent sound of each channel;

The frequency-domain to time-domain module is used to inverse Fourier transform the N-channel coherent sound and the N-channel ambient sound to obtain the coherent sound and ambient sound represented in the time domain;

The specific implementation process of the coherent sound extraction module includes:

X_n = β_n S + A_n

Calculate the short-term energy of the input signal X_n of the n-th channel

According to the short-term energy of the first channel and the short-term energy of the second channel

Calculate β_n:

The weight value of the n-th channel is:

Then the estimate of the coherent sound is:

Then the coherent sound S_n of the n-th channel:

4. The coherent sound and ambient sound extraction system of multi-channel signals according to claim 3, wherein the specific implementation process of the ambient sound extraction module comprises:

The ambient sound A_n of the n-th channel is:

A_n = X_n - S_n.