Abstract
Many methods exist for pitched-instrument source separation; the core difficulty is separating harmonics that overlap in time-frequency. To improve results, we propose a selection function that picks the best harmonic separation result from among existing methods. Our strategy rests on the observation that a good separation result usually has a low total amplitude fluctuation. For source harmonics that overlap in a frequency band, each method produces a separation result; from the harmonics separated by each method, we estimate the total amplitude fluctuation of each group of overlapping harmonics. The selection function then maps each band index to a method index by choosing the method with the minimum total amplitude fluctuation. Experiments are conducted on mixtures from the University of Iowa Musical Instrument Sample Database, comparing three advanced separation techniques: common amplitude modulation (CAM), harmonic bandwidth companding (HBW-comp), and ideal binary mask (IBM) filtering. Experimental results indicate that the proposed selection function significantly boosts separation performance.
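The per-band selection described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the data layout (one amplitude-envelope matrix per method per band) and the fluctuation measure (summed absolute frame-to-frame envelope differences) are assumptions made for the sketch.

```python
import numpy as np

def total_amplitude_fluctuation(envelopes):
    """Assumed fluctuation measure: summed absolute frame-to-frame
    differences of the separated harmonics' amplitude envelopes.

    envelopes: array of shape (num_harmonics, num_frames) holding the
    envelopes of the overlapping harmonics separated by one method
    in one frequency band.
    """
    return float(np.sum(np.abs(np.diff(envelopes, axis=1))))

def select_methods(band_envelopes):
    """Map each band index to a method index.

    band_envelopes: list over bands; each entry is a list over methods
    (e.g. CAM, HBW-comp, IBM) of (num_harmonics, num_frames) arrays.
    Returns, per band, the index of the method whose separation result
    has the minimum total amplitude fluctuation.
    """
    return [int(np.argmin([total_amplitude_fluctuation(e) for e in methods]))
            for methods in band_envelopes]
```

A smooth, slowly varying envelope scores near zero, while a result contaminated by interfering harmonics tends to beat and fluctuate, so the argmin favors the cleaner separation.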
References
Koteswararao, Y.V., Rao, C.B.R.: Multichannel speech separation using hybrid GOMF and enthalpy-based deep neural networks. Multimedia Syst. 27, 271–286 (2021)
Xie, L., Fu, Z., Feng, W., Luo, Y.: Pitch-density-based features and an SVM binary tree approach for multi-class audio classification in broadcast news. Multimedia Syst. 17, 101–112 (2011)
Rafii, Z., Liutkus, A., Stöter, F.R., Mimilakis, S.I., FitzGerald, D., Pardo, B.: An overview of lead and accompaniment separation in music. IEEE/ACM Trans. Audio Speech Lang. Process. 26(8) (2018)
Li, Y., Woodruff, J.: Monaural musical sound separation based on pitch and common amplitude modulation. IEEE Trans. Audio Speech Lang. Process. 17(7), 1361–1371 (2009)
Zivanovic, M.: Harmonic bandwidth companding for separation of overlapping harmonics in pitched signals. IEEE/ACM Trans. Audio Speech Lang. Process. 23(5), 898–908 (2015)
Hu, G., Wang, D.L.: Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Trans. Neural Networks 15(5), 1135–1150 (2004)
Stöter, F.R., Liutkus, A., Badeau, R., Edler, B., Magron, P.: Common fate model for unison source separation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (2016)
Pishdadian, F., Pardo, B.: Multi-resolution common fate transform. IEEE/ACM Trans. Audio Speech Lang. Process. 27(2), 342–354 (2019)
Tachibana, H., Ono, N., Sagayama, S.: Singing voice enhancement in monaural music signals based on two-stage harmonic/percussive sound separation on multiple resolution spectrograms. IEEE/ACM Trans. Audio Speech Lang. Process. 22(1), 228–237 (2014)
Moore, B.C.J.: An Introduction to the Psychology of Hearing. Academic Press (1997)
The University of Iowa Musical Instrument Sample Database. [Online]. Available: http://theremin.music.uiowa.edu/
Vincent, E., Gribonval, R., Fevotte, C.: Performance measurement in blind audio source separation. IEEE Trans. Audio Speech Lang. Process. 14(4), 1462–1469 (2006)
Wang, D.L., Brown, G.J.: Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. Wiley/IEEE Press, Hoboken (2006)
Li, Y., Wang, D.L.: Separation of singing voice from music accompaniment for monaural recordings. IEEE Trans. Audio Speech Lang. Process. 15(4), 1475–1487 (2007)
Serra, X.: Musical sound modeling with sinusoids plus noise. In: Musical Signal Processing (1997)
McAulay, R., Quatieri, T.: Speech analysis/synthesis based on a sinusoidal representation. IEEE Trans. Acoust. Speech Signal Process. 34(4), 744–754 (1986)
Fevotte, C., Godsill, S.J.: A Bayesian approach for blind separation of sparse sources. IEEE Trans. Audio Speech Lang. Process. 14(6), 2174–2188 (2006)
Ozerov, A., Philippe, P., Bimbot, F., Gribonval, R.: Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs. IEEE Trans. Audio Speech Lang. Process. 15(5), 1564–1578 (2007)
Casey, M.A., Westner, A.: Separation of mixed audio sources by independent subspace analysis. In: Proceedings of International Computer Music Conference (2000)
Virtanen, T.: Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria. IEEE Trans. Audio Speech Lang. Process. 15(3), 1066–1074 (2007)
Abdallah, S.A., Plumbley, M.D.: Unsupervised analysis of polyphonic music by sparse coding. IEEE Trans. Neural Networks 17(1), 179–196 (2006)
Huang, P.S., Kim, M., Hasegawa-Johnson, M., Smaragdis, P.: Joint optimization of masks and deep recurrent neural networks for monaural source separation. IEEE/ACM Trans. Audio Speech Lang. Process. 23 (2015)
Chandna, P., Miron, M., Janer, J., Gómez, E.: Monoaural audio source separation using deep convolutional neural networks. In: 13th International Conference on Latent Variable Analysis and Signal Separation (2017)
Hershey, J.R., Chen, Z., Roux, J.L., Watanabe, S.: Deep clustering: discriminative embeddings for segmentation and separation. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2016)
Luo, Y., Chen, Z., Hershey, J.R., Roux, J.L., Mesgarani, N.: Deep clustering and conventional networks for music separation: stronger together. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2017)
Grais, E.M., Roma, G., Simpson, A.J.R., Plumbley, M.D.: Two-stage single-channel audio source separation using deep neural networks. IEEE/ACM Trans. Audio Speech Lang. Process. 25(9), 1773–1783 (2017)
Every, M.R., Szymanski, J.E.: Separation of synchronous pitched notes by spectral filtering of harmonics. IEEE Trans. Audio Speech Lang. Process. 14(5), 1845–1856 (2006)
Virtanen, T., Klapuri, A.: Separation of harmonic sounds using multipitch analysis and iterative parameter estimation. In: Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 83–86 (2001)
Bay, M., Beauchamp, J.W.: Harmonic source separation using prestored spectra. In: Independent Component Analysis and Blind Signal Separation, pp. 561–568 (2006)
Duan, Z., Zhang, Y., Zhang, C., Shi, Z.: Unsupervised single-channel music source separation by average harmonic structure modeling. IEEE Trans. Audio Speech Lang. Process. 16(4), 766–778 (2008)
Gong, Y., Shu, X., Tang, J.: Recovering overlapping partials for monaural perfect harmonic musical sound separation using modified common amplitude modulation. In: Pacific Rim Conference on Multimedia, pp. 903–912 (2017)
Jensen, K.: Timbre models of musical sounds. Ph.D. dissertation, University of Copenhagen (1999)
Communicated by X. Yang.
Cite this article
Gong, Y., Dai, L. & Tang, J. A selection function for pitched instrument source separation. Multimedia Systems 28, 311–319 (2022). https://doi.org/10.1007/s00530-021-00836-z