CN111669697B - A method and system for extracting coherent sound and ambient sound from multi-channel signals
- Publication number
- CN111669697B
- Authority
- CN
- China
- Prior art keywords
- channel
- sound
- coherent
- ambient
- coherent sound
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/305—Electronic adaptation of stereophonic audio signals to reverberation of the listening space
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/06—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Abstract
The invention discloses a method and system for extracting coherent (primary) sound and ambient sound from multi-channel signals. The method comprises: computing the weight expression used to estimate the coherent sound from the N channel signals, estimating the coherent sound from that expression, and from it computing the coherent sound of each channel, where the ambient sound energy is the same in every channel; computing the ambient sound of each channel from that channel's coherent sound; and applying an inverse Fourier transform to the N-channel coherent sound and the N-channel ambient sound to obtain their time-domain representations. Under the condition that the ambient sound energy is the same in every channel, the method derives a weight expression for estimating the coherent sound from signals with any number of channels, and solves for the unknown parameters in that expression using the signal energy of each channel and the inter-channel correlation values. This extracts coherent and ambient sound from multi-channel signals with high accuracy.
Description
Technical field
The invention relates to the field of spatial sound reproduction, and in particular to a method and system for extracting coherent sound and ambient sound from multi-channel signals.
Background
Spatial sound reproduction is widely used in entertainment media. When cinemas, home theaters, and portable electronic devices play films, reproducing spatial sound with a suitable sound-image width and a good sense of immersion over headphones or loudspeakers gives consumers a better audio-visual experience. In recent years, spatial sound reproduction has also shown important application prospects in cutting-edge scientific research and practical engineering, in fields such as virtual reality, aviation, and aerospace.
Spatial sound mainly contains two components with different properties: a directional component, called coherent sound, and a diffuse component whose direction cannot be identified, called ambient sound. To achieve better reproduction, coherent and ambient sound must be extracted from the spatial sound (Primary-Ambient Extraction, PAE) and processed differently. For example, in an audio codec system, using PAE as the front end of audio encoding or decoding enables efficient and immersive spatial sound reproduction.
PAE methods for two-channel signals are relatively mature; the most widely used are principal component analysis (PCA) and the least-squares method. For multi-channel signals, the pairwise correlation method can be used for PAE, but the components it extracts are not very accurate. Extending stereo PAE methods to multi-channel signals is therefore of great significance. PCA performs PAE on a stereo signal by computing the eigenvalues of the covariance matrix of the input signal, on the premise that coherent sound carries most of the energy. PCA can also be applied to multi-channel signals, but its computational complexity grows with the number of channels, and it works well only when coherent sound dominates the energy. The least-squares method performs PAE on a stereo signal by computing the weights used to estimate the coherent sound, on the premise that the ambient sound energy is equal in every channel. However, when the least-squares method is applied directly to multi-channel signals, the estimation weights are hard to solve for. In spatial sound, the ambient component mainly serves to create atmosphere, and for a better sense of envelopment its energy differs little from channel to channel. Extracting coherent and ambient sound from multi-channel signals under the premise of equal ambient energy in every channel is therefore of great significance.
Summary of the invention
The purpose of the invention is to overcome the above technical shortcomings. On the premise that the ambient sound has equal energy in every channel, the least-squares weights for estimating the coherent sound are computed for small numbers of channels, and from the regular way these weights vary with the number of channels, a weight expression is obtained for estimating the coherent sound from multi-channel signals with any number of channels.
To achieve this purpose, the invention proposes a method for extracting coherent sound and ambient sound from multi-channel signals, the method comprising:
computing the weight expression for the coherent sound of the N channel signals, estimating the coherent sound from the weight expression, and from it computing the coherent sound of each channel, where the ambient sound energy of every channel is the same;
computing the ambient sound of each channel from the coherent sound of each channel;
applying an inverse Fourier transform to the N-channel coherent sound and the N-channel ambient sound to obtain the coherent and ambient sound in the time domain.
As an improvement of the above method, the step of computing the weight expression for the coherent sound of the N channel signals, estimating the coherent sound from it, and computing the coherent sound of each channel (with equal ambient sound energy in every channel) specifically comprises:
applying a Fourier transform to the time-domain multi-channel signal, the input signal $X_n$ of the n-th channel being expressed as:
$X_n = \beta_n S + A_n$
where $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, $1 \le n \le N$, $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;
calculating the short-time energy of the n-th channel input signal $X_n$
calculating the correlation value $\Phi_{12}$ between the first channel and the second channel:
calculating the intermediate parameters $C$ and $D$ from the short-time energy of the first channel, the short-time energy of the second channel, and the inter-channel correlation value $\Phi_{12}$:
from these, calculating the short-time energy $P_S$ of the coherent sound, the short-time energy $P_A$ of the ambient sound, and $\beta_2$
calculating $\beta_n$:
the weight of the n-th channel is:
the estimate of the coherent sound is then:
and the coherent sound $S_n$ of the n-th channel is:
As an improvement of the above method, the step of computing the ambient sound of each channel from the coherent sound of each channel is specifically:
the ambient sound $A_n$ of the n-th channel is:
$A_n = X_n - S_n$.
Embodiment 2 of the invention proposes a system for extracting coherent sound and ambient sound from multi-channel signals, the system comprising:
a coherent sound extraction module, configured to compute the weight expression for the coherent sound of the N channel signals, estimate the coherent sound from the weight expression, and from it compute the coherent sound of each channel, where the ambient sound energy of every channel is the same;
an ambient sound extraction module, configured to compute the ambient sound of each channel from the coherent sound of each channel;
a frequency-domain to time-domain module, configured to apply an inverse Fourier transform to the N-channel coherent sound and the N-channel ambient sound to obtain the coherent and ambient sound in the time domain.
As an improvement of the above system, the coherent sound extraction module is specifically implemented as follows:
applying a Fourier transform to the time-domain multi-channel signal, the input signal $X_n$ of the n-th channel being expressed as:
$X_n = \beta_n S + A_n$
where $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, $1 \le n \le N$, $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;
calculating the short-time energy of the n-th channel input signal $X_n$
calculating the correlation value $\Phi_{12}$ between the first channel and the second channel:
calculating the intermediate parameters $C$ and $D$ from the short-time energy of the first channel, the short-time energy of the second channel, and the inter-channel correlation value $\Phi_{12}$:
from these, calculating the short-time energy $P_S$ of the coherent sound, the short-time energy $P_A$ of the ambient sound, and $\beta_2$
calculating $\beta_n$:
the weight of the n-th channel is:
the estimate of the coherent sound is then:
and the coherent sound $S_n$ of the n-th channel is:
As an improvement of the above system, the ambient sound extraction module is specifically implemented as follows:
the ambient sound $A_n$ of the n-th channel is:
$A_n = X_n - S_n$.
The advantages of the invention are:
Under the condition that the ambient sound energy is the same in every channel, the method derives a weight expression for estimating the coherent sound from signals with any number of channels, and solves for the unknown parameters in that expression using the signal energy of each channel and the inter-channel correlation values. This extracts coherent and ambient sound from multi-channel signals with high accuracy.
Description of drawings
FIG. 1 is a flowchart of the method of the invention for extracting coherent sound and ambient sound from multi-channel signals;
FIG. 2(a) is an error plot for coherent sound extraction from mixed five-channel signal 1 using the method of the invention and the pairwise correlation method;
FIG. 2(b) is an error plot for ambient sound extraction from mixed five-channel signal 1 using the method of the invention and the pairwise correlation method;
FIG. 3(a) is an error plot for coherent sound extraction from mixed five-channel signal 2 using the method of the invention and the pairwise correlation method;
FIG. 3(b) is an error plot for ambient sound extraction from mixed five-channel signal 2 using the method of the invention and the pairwise correlation method.
Detailed description
The technical solution of the invention is described in detail below with reference to the drawings and specific embodiments.
Embodiment 1
As shown in FIG. 1, Embodiment 1 of the invention proposes a method for extracting coherent sound and ambient sound from multi-channel signals when the ambient sound energy is equal in every channel, comprising the following steps:
Step 1) Divide the multi-channel signal into frames and apply a Fourier transform to obtain the spectrum; using the multi-channel signal model, express the short-time energy of each channel and the correlation value between any two channels. Specifically:
In the multi-channel signal model, the input signal is represented as the superposition of coherent sound and ambient sound. Because coherent and ambient sound differ in character, the coherent sounds of the channels are assumed to be fully correlated with each other, that is, linearly related; the coherent sound is assumed to be uncorrelated with the ambient sound of every channel, and the ambient sounds of different channels are assumed to be uncorrelated with each other.
Step 1-1) Apply a Fourier transform to the time-domain multi-channel signal to obtain the spectrum:
$X_n = \beta_n S + A_n, \quad n = 1, 2, \ldots, N$
where $N$ is the number of channels, $S$ is the spectrum of the coherent sound, $\beta_n$ is the amplitude difference factor between the coherent sound of the n-th channel and that of the first channel, with $\beta_1 = 1$, and $A_n$ is the spectrum of the ambient sound of the n-th channel;
Step 1-2) The signal energy of each channel can be expressed as:
where $E\{\cdot\}$ denotes the short-time average.
Step 1-3) The correlation value between channels can be expressed as:
where $\Phi_{n_1 n_2}$ is the correlation value between the $n_1$-th channel and the $n_2$-th channel, $n_1 = 1, 2, \ldots, N$, $n_2 = 1, 2, \ldots, N$, $n_1 \ne n_2$;
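The short-time statistics of Steps 1-2) and 1-3) can be sketched as follows. This is a minimal illustration, assuming the short-time average $E\{\cdot\}$ is taken over a block of STFT frames for one frequency bin and that the correlation applies a complex conjugate to the second factor. The function name `short_time_stats` and the array layout are hypothetical, not taken from the patent.

```python
import numpy as np

def short_time_stats(X):
    """Estimate channel energies and inter-channel correlations.

    X: complex array of shape (N, T), N channels by T frames of one
       frequency bin (assumed layout).
    Returns (P, Phi): P[n] is the short-time energy of channel n,
    Phi[i, j] is the average of X_i * conj(X_j) over the T frames.
    """
    X = np.asarray(X, dtype=complex)
    T = X.shape[1]
    Phi = (X @ X.conj().T) / T      # all pairwise averages at once
    P = np.real(np.diag(Phi))       # diagonal holds the |X_n|^2 averages
    return P, Phi

# Tiny deterministic example: channel 2 is exactly twice channel 1.
X = np.array([[1.0, 1.0j],
              [2.0, 2.0j]])
P, Phi = short_time_stats(X)
```

Under the signal model, with coherent and ambient parts uncorrelated and equal ambient energy $P_A$ in every channel, these statistics should approach $\beta_n^2 P_S + P_A$ on the diagonal and $\beta_{n_1} \beta_{n_2} P_S$ off the diagonal.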
Step 2) Compute the least-squares weights for estimating the coherent sound when the number of channels is small, and examine their regularity. Specifically:
Step 2-1) For a two-channel signal, compute the weights with which the input signals $X_1$ and $X_2$ estimate the coherent sound $S$:
Step 2-1-1) Estimate the coherent sound $S$:
where $w_1$ and $w_2$ are the estimation weights to be determined.
Step 2-1-2) The estimation error $\sigma_S$ of $S$ is expressed as:
Step 2-1-3) Solve with the least-squares algorithm: the weights are the optimal estimate when the estimation error is completely uncorrelated with the input stereo signal:
$E\{\sigma_S X_1\} = 0$
$E\{\sigma_S X_2\} = 0.$
The optimal weights are then expressed as:
where $P_S$ is the short-time energy of the coherent sound and $P_A$ is the short-time energy of the ambient sound.
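The two-channel weight formulas are not reproduced in this text. As a hedged reconstruction from the stated model only (assuming $\sigma_S = S - \hat{S}$, a real-valued $\beta_2$, and a complex conjugate on the second factor of each average), the two orthogonality conditions work out as follows. The patent's own notation may differ.

```latex
\begin{aligned}
\hat{S} &= w_1 X_1 + w_2 X_2, \qquad \sigma_S = S - \hat{S},\\
E\{\sigma_S X_1^{*}\} = 0 &\;\Rightarrow\; P_S - w_1 (P_S + P_A) - w_2 \beta_2 P_S = 0,\\
E\{\sigma_S X_2^{*}\} = 0 &\;\Rightarrow\; \beta_2 P_S - w_1 \beta_2 P_S - w_2 (\beta_2^{2} P_S + P_A) = 0,\\
w_1 &= \frac{P_S}{(1 + \beta_2^{2}) P_S + P_A}, \qquad
w_2 = \frac{\beta_2 P_S}{(1 + \beta_2^{2}) P_S + P_A}.
\end{aligned}
```

Each line uses only the model identities $E\{S X_1^{*}\} = P_S$, $E\{X_1 X_1^{*}\} = P_S + P_A$, $E\{X_2 X_1^{*}\} = \beta_2 P_S$, and $E\{X_2 X_2^{*}\} = \beta_2^{2} P_S + P_A$.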
Step 2-2) For a three-channel signal, compute the weights with which the input signals $X_1$, $X_2$, and $X_3$ estimate the coherent sound $S$:
Step 2-2-1) Estimate the coherent sound $S$:
where $w_1$, $w_2$, and $w_3$ are the estimation weights to be determined.
Step 2-2-2) Proceeding as in Step 2-1) gives the weights for estimating the coherent sound from a three-channel signal:
Step 2-3) Computing the estimation weights for larger numbers of channels shows that the weights follow a single unified expression. For a multi-channel signal with $N$ channels, the estimated coherent sound is expressed as:
where the weights can be expressed as:
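The unified weight expression itself is missing from this text. Solving the same least-squares problem for $N$ channels under the stated model gives $w_n = \beta_n P_S / (P_A + P_S \sum_m \beta_m^2)$; the sketch below takes that closed form as an assumption and checks it numerically against a direct solve of the normal equations. The names `ls_weights` and `normal_equation_weights` are hypothetical.

```python
import numpy as np

def ls_weights(beta, P_S, P_A):
    """Closed-form least-squares weights (derived from the model, assumed)."""
    beta = np.asarray(beta, dtype=float)
    return beta * P_S / (P_A + P_S * np.sum(beta ** 2))

def normal_equation_weights(beta, P_S, P_A):
    """Reference solution: solve R w = p with the model covariances.

    R[i, j] = E{X_i X_j*} = beta_i beta_j P_S + (P_A if i == j else 0)
    p[i]    = E{S X_i*}   = beta_i P_S
    """
    beta = np.asarray(beta, dtype=float)
    R = P_S * np.outer(beta, beta) + P_A * np.eye(beta.size)
    p = P_S * beta
    return np.linalg.solve(R, p)

beta = [1.0, 1.0, 2.0, 0.5, 0.5]   # amplitude ratios from the simulation section
w_closed = ls_weights(beta, P_S=1.0, P_A=0.3)
w_direct = normal_equation_weights(beta, P_S=1.0, P_A=0.3)
```

The closed form avoids a matrix solve, so the cost per frequency bin grows only linearly with the number of channels.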
Step 3) Compute the unknown parameters in the weights for estimating the coherent sound, completing the extraction of coherent and ambient sound from the multi-channel signal. Specifically:
Step 3-1) Since $\beta_1 = 1$ is known, the unknown parameters $P_S$, $P_A$, and $\beta_2$ can be obtained from the signal energies of the first two channels and their inter-channel correlation value from Step 1):
where,
Step 3-2) From the energy values of the channels other than the first and second, $\beta_n$ for $3 \le n \le N$ can be obtained:
Step 3-3) For a multi-channel signal with $N$ channels, substituting the parameters $P_S$, $P_A$, and $\beta_n$ ($n = 1, 2, \ldots, N$) computed in Steps 3-1) and 3-2) into the weights $w_n$ ($n = 1, 2, \ldots, N$) obtained in Step 2-3) completes the extraction of the coherent sound from the multi-channel signal.
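The formulas for $C$, $D$, $P_S$, $P_A$, and $\beta_n$ are not reproduced in this text, so the sketch below does not use the patent's intermediate parameters. Instead it solves the three model equations $P_{X_1} = P_S + P_A$, $P_{X_2} = \beta_2^2 P_S + P_A$, and $\Phi_{12} = \beta_2 P_S$ directly, assuming $\beta_n > 0$ and a real, positive $\Phi_{12}$. The name `estimate_parameters` is hypothetical.

```python
import numpy as np

def estimate_parameters(P, phi12):
    """Recover P_S, P_A and all beta_n from channel energies.

    P:     short-time energies P_X1..P_XN (length N >= 2)
    phi12: real, positive correlation value between channels 1 and 2
    """
    P = np.asarray(P, dtype=float)
    # Eliminating P_S and P_A leaves beta_2^2 - d*beta_2 - 1 = 0.
    d = (P[1] - P[0]) / phi12
    beta2 = 0.5 * (d + np.sqrt(d * d + 4.0))   # positive root
    P_S = phi12 / beta2
    P_A = P[0] - P_S
    # Remaining channels follow from P_Xn = beta_n^2 P_S + P_A.
    beta = np.sqrt(np.maximum(P - P_A, 0.0) / P_S)
    beta[0], beta[1] = 1.0, beta2
    return P_S, P_A, beta

# Energies generated from P_S = 3, P_A = 1, beta = (1, 2, 0.5).
P_S, P_A, beta = estimate_parameters([4.0, 13.0, 1.75], phi12=6.0)
```

With measured statistics, the clip at zero guards against a channel energy falling slightly below the estimated $P_A$.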
Step 4) Perform PAE on a multi-channel signal with any number of channels. Specifically:
Step 4-1) Compute the coherent sound of each channel:
Step 2) gives the weight expression for estimating the coherent sound when performing PAE on a multi-channel signal with any number of channels, and Step 3) gives the unknown parameters in that expression. Once the number of channels is fixed, the coherent sound $S$ can therefore be estimated directly from the weight expression. This estimate is the coherent sound of the first channel; the coherent sound of the other channels is obtained from $S$ by linear scaling, i.e. $\beta_n S$ ($n = 2, \ldots, N$).
Step 4-2) Compute the ambient sound of each channel:
The remaining component of each channel is taken to be the ambient sound, i.e. $A_n = X_n - \beta_n S$.
Step 4-3) Apply an inverse Fourier transform to the resulting N-channel coherent sound and N-channel ambient sound to obtain the coherent and ambient sound in the time domain.
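Step 4 can be sketched for a single frame as below. This is an illustration under stated assumptions: the parameters $\beta_n$, $P_S$, and $P_A$ are taken as already estimated and shared across all bins of the frame, the weights use the closed form that follows from the least-squares model (the patent's own expression is not reproduced here), and windowing and overlap-add are omitted. The function name `extract_frame` is hypothetical.

```python
import numpy as np

def extract_frame(x, beta, P_S, P_A):
    """Split one multi-channel frame into coherent and ambient parts.

    x: real array (N, L), N channels by L time samples.
    Returns (primary, ambient), both real arrays of shape (N, L).
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    X = np.fft.rfft(x, axis=1)                        # spectra X_n
    w = beta * P_S / (P_A + P_S * np.sum(beta ** 2))  # assumed weights
    S_hat = w @ X                                     # estimate of S
    S_n = beta[:, None] * S_hat[None, :]              # coherent part per channel
    A_n = X - S_n                                     # remainder is ambient
    L = x.shape[1]
    return np.fft.irfft(S_n, n=L, axis=1), np.fft.irfft(A_n, n=L, axis=1)

# Sanity check with no ambient sound: every channel is a scaled copy of s.
s = np.sin(2 * np.pi * 5 * np.arange(64) / 64)
beta = np.array([1.0, 2.0, 0.5])
primary, ambient = extract_frame(beta[:, None] * s, beta, P_S=1.0, P_A=0.0)
```

In this ambient-free case the estimate reproduces the source exactly, so `primary[0]` matches `s` and `ambient` is zero up to rounding.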
The performance of the proposed method is illustrated below with simulation examples:
Fully correlated coherent sound and fully uncorrelated ambient sound are mixed in set proportions into five-channel signals, and components are extracted with the multi-channel PAE method proposed by the invention and with the pairwise correlation method. Two mixed multi-channel signals are synthesized: mixed five-channel signal 1, with clean speech as the coherent sound and ocean-wave sound as the ambient sound, and mixed five-channel signal 2, with clean music as the coherent sound and forest background sound as the ambient sound. In mixing, to control how the coherent sound energy is distributed among the channels, the coherent amplitude difference factor $\beta_n$ of each channel is set in a fixed ratio to a reference value $\beta_0$; the ambient sound energy of every channel is set equal to $P_{A0}$; and to control the proportion of coherent sound in the mixture, different coherent energy ratios $\gamma$ are used. The reference values $\beta_0$ and $P_{A0}$ are determined by $\gamma$.
In this experiment the coherent sound amplitudes of the channels are set in the ratios $\beta_1 = \beta_2 = \beta_0$, $\beta_3 = 2\beta_0$, $\beta_4 = \beta_5 = 0.5\beta_0$. The coherent energy ratio $\gamma$ ranges from 0.05 to 0.95 (in steps of 0.1). The extraction error $\varepsilon_P$ of the coherent sound is expressed as:
The extraction error $\varepsilon_a$ of the ambient sound is expressed as:
FIG. 2(a) and FIG. 2(b) show the coherent and ambient sound extraction errors when the proposed algorithm and the pairwise correlation method each perform PAE on mixed five-channel signal 1; FIG. 3(a) and FIG. 3(b) show the corresponding errors for mixed five-channel signal 2. Over the whole range of the coherent energy ratio $\gamma$ from 0.05 to 0.95 (in steps of 0.1), the extraction error of the proposed method is smaller than that of the pairwise correlation method.
Embodiment 2 of the invention proposes a system for extracting coherent sound and ambient sound from multi-channel signals, the system comprising:
a coherent sound extraction module, configured to compute the weight expression for the coherent sound of the N channel signals, estimate the coherent sound from the weight expression, and from it compute the coherent sound of each channel, where the ambient sound energy of every channel is the same;
an ambient sound extraction module, configured to compute the ambient sound of each channel from the coherent sound of each channel;
a frequency-domain to time-domain module, configured to apply an inverse Fourier transform to the N-channel coherent sound and the N-channel ambient sound to obtain the coherent and ambient sound in the time domain.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions of the technical solution that do not depart from its spirit and scope all fall within the scope of the claims of the invention.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010447863.9A CN111669697B (en) | 2020-05-25 | 2020-05-25 | A method and system for extracting coherent sound and ambient sound from multi-channel signals |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010447863.9A CN111669697B (en) | 2020-05-25 | 2020-05-25 | A method and system for extracting coherent sound and ambient sound from multi-channel signals |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111669697A (en) | 2020-09-15 |
CN111669697B (en) | 2021-05-18 |
Family
ID=72384501
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010447863.9A Active CN111669697B (en) | 2020-05-25 | 2020-05-25 | A method and system for extracting coherent sound and ambient sound from multi-channel signals |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111669697B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101401151A (en) * | 2006-03-15 | 2009-04-01 | 法国电信公司 | Device and method for graduated encoding of a multichannel audio signal based on a principal component analysis |
CN101816191A (en) * | 2007-09-26 | 2010-08-25 | 弗劳恩霍夫应用研究促进协会 | Apparatus and method for extracting environmental signal and computer program in apparatus and method for obtaining weighting coefficient for extracting environmental signal |
EP2523473A1 (en) * | 2011-05-11 | 2012-11-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for generating an output signal employing a decomposer |
CN109036455A (en) * | 2018-09-17 | 2018-12-18 | 中科上声(苏州)电子有限公司 | Direct sound wave and background sound extracting method, speaker system and its sound playback method |
CN110534129A (en) * | 2018-05-23 | 2019-12-03 | 哈曼贝克自动系统股份有限公司 | The separation of dry sound and ambient sound |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103902822B (en) * | 2014-03-28 | 2017-09-08 | 西安交通大学苏州研究院 | Sources number detection method in the case of the mixing of incoherent and coherent signal |
CN110531310B (en) * | 2019-07-25 | 2021-07-13 | 西安交通大学 | Direction of Arrival Estimation Method for Far-Field Coherent Signals Based on Subspace and Interpolation Transformation |
Non-Patent Citations (1)
Title |
---|
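Objective performance evaluation of coherent sound and ambient sound extraction methods; Wu Yanqin et al.; Technical Acoustics (《声学技术》); 2019-10-31; Vol. 38, No. 5; full text * |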
Also Published As
Publication number | Publication date |
---|---|
CN111669697A (en) | 2020-09-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||