Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2303.07816 (eess)

[Submitted on 14 Mar 2023]

Title: Multi-Channel Masking with Learnable Filterbank for Sound Source Separation

Authors: Wang Dai, Archontis Politis, Tuomas Virtanen

Abstract: This work proposes a learnable filterbank based on a multi-channel masking framework for multi-channel source separation. The learnable filterbank is a 1D Conv layer, which transforms the raw waveform into a 2D representation. In contrast to the conventional single-channel masking method, we estimate a mask for each individual microphone channel. The estimated masks are then applied to the transformed waveform representation, as in the traditional filter-and-sum beamforming operation. Specifically, each mask multiplies the corresponding channel's 2D representation, and the masked outputs of all channels are then summed. Finally, a 1D transposed Conv layer converts the summed masked signal back into the waveform domain. The experimental results show that our method outperforms single-channel masking with a learnable filterbank, and that it can outperform multi-channel complex masking with the STFT complex spectrum in the STGCSEN model when the learnable filterbank maps to a higher feature dimension. The spatial response analysis also verifies that multi-channel masking in the learnable filterbank domain has spatial selectivity.
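The mask-and-sum pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names (`encode`, `decode`, `multichannel_mask_and_sum`), the filter sizes, the stride, the sharing of one filterbank across all channels, and the random filters are all assumptions, and the random masks merely stand in for the output of a mask-estimation network that the abstract does not specify.

```python
import numpy as np

def encode(wave, filters, stride):
    """1D convolution without padding: waveform (T,) -> 2D representation (N, F)."""
    n_filters, kernel = filters.shape
    n_frames = (len(wave) - kernel) // stride + 1
    # Slice the waveform into overlapping frames of length `kernel`.
    frames = np.stack(
        [wave[i * stride : i * stride + kernel] for i in range(n_frames)], axis=1
    )  # shape (kernel, n_frames)
    return filters @ frames  # shape (n_filters, n_frames)

def decode(rep, filters, stride):
    """1D transposed convolution (overlap-add): representation (N, F) -> waveform."""
    n_filters, kernel = filters.shape
    n_frames = rep.shape[1]
    out = np.zeros((n_frames - 1) * stride + kernel)
    frames = filters.T @ rep  # shape (kernel, n_frames)
    for i in range(n_frames):
        out[i * stride : i * stride + kernel] += frames[:, i]
    return out

def multichannel_mask_and_sum(waves, masks, enc_filters, dec_filters, stride):
    """waves: (C, T) microphone signals; masks: (C, N, F), one mask per channel."""
    # Assumption: the same filterbank is applied to every microphone channel.
    reps = np.stack([encode(w, enc_filters, stride) for w in waves])  # (C, N, F)
    # Each mask multiplies its own channel's representation; channels are then summed.
    summed = (masks * reps).sum(axis=0)  # (N, F)
    return decode(summed, dec_filters, stride)

# Hypothetical sizes: 4 microphones, 160 samples, 32 filters, kernel 16, stride 8.
rng = np.random.default_rng(0)
C, T, N, K, S = 4, 160, 32, 16, 8
waves = rng.standard_normal((C, T))
enc = rng.standard_normal((N, K)) * 0.1  # stands in for learned encoder weights
dec = rng.standard_normal((N, K)) * 0.1  # stands in for learned decoder weights
F = (T - K) // S + 1
masks = rng.uniform(0.0, 1.0, size=(C, N, F))  # placeholder for estimated masks
est = multichannel_mask_and_sum(waves, masks, enc, dec, S)
```

In this sketch, setting every mask to one reduces the operation to encoding and decoding the plain sum of the microphone signals, so any spatial selectivity has to come from the per-channel masks.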

Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2303.07816 [eess.AS]
	(or arXiv:2303.07816v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2303.07816

Submission history

From: Wang Dai
[v1] Tue, 14 Mar 2023 11:46:47 UTC (984 KB)
