Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria

Published: 01 March 2007

Abstract

An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a time-varying gain. Each sound source, in turn, is modeled as a sum of one or more components. The parameters of the components are estimated by minimizing the reconstruction error between the input spectrogram and the model, while restricting the component spectrograms to be nonnegative and favoring components whose gains are slowly varying and sparse. Temporal continuity is favored by using a cost term which is the sum of squared differences between the gains in adjacent frames, and sparseness is favored by penalizing nonzero gains. The proposed iterative estimation algorithm is initialized with random values, and the gains and the spectra are then alternately updated using multiplicative update rules until the values converge. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and drum sounds. The performance of the proposed method was compared with independent subspace analysis and basic nonnegative matrix factorization, which are based on the same linear model. According to these simulations, the proposed method achieves better separation quality than the previous algorithms. In particular, the temporal continuity criterion improved the detection of pitched musical sounds. The sparseness criterion did not produce significant improvements.
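The procedure outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's exact algorithm: the abstract does not state the form of the reconstruction error, the normalization of the penalty terms, or the update rules, so the sketch assumes a squared-error reconstruction term, an L1 penalty on the gains for sparseness, and multiplicative updates obtained by splitting each gradient into its positive and negative parts. The function names, the weights `alpha` and `beta`, and the rescaling of the spectra to unit norm are all hypothetical choices made for illustration, and the input is assumed to have at least two frames.

```python
import numpy as np


def cost(X, B, G, alpha, beta):
    """Assumed cost: squared error + temporal continuity + sparseness."""
    reconstruction = 0.5 * np.sum((X - B @ G) ** 2)
    # Sum of squared differences between gains in adjacent frames.
    continuity = 0.5 * alpha * np.sum(np.diff(G, axis=1) ** 2)
    # Penalty on nonzero gains (L1 norm, since G is nonnegative).
    sparseness = beta * np.sum(G)
    return reconstruction + continuity + sparseness


def nmf_continuity_sparse(X, n_components, alpha=0.1, beta=0.01,
                          n_iter=200, seed=0, eps=1e-9):
    """Factorize a nonnegative spectrogram X (freq x frames) as B @ G.

    B holds one fixed magnitude spectrum per column, G one time-varying
    gain per row. Illustrative sketch only, under the assumptions above.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = X.shape
    # Random nonnegative initialization.
    B = rng.random((n_freq, n_components)) + eps
    G = rng.random((n_components, n_frames)) + eps

    # Each frame has two temporal neighbours, except the first and last.
    n_neighbours = np.full(n_frames, 2.0)
    n_neighbours[[0, -1]] = 1.0

    for _ in range(n_iter):
        # Gain update. The continuity gradient for frame t is
        # n_t * g_t - g_(t-1) - g_(t+1): the first term goes to the
        # denominator, the neighbouring gains go to the numerator.
        neighbours = np.zeros_like(G)
        neighbours[:, 1:] += G[:, :-1]
        neighbours[:, :-1] += G[:, 1:]
        numer = B.T @ X + alpha * neighbours
        denom = B.T @ (B @ G) + alpha * n_neighbours * G + beta + eps
        G *= numer / denom

        # Spectrum update (standard squared-error multiplicative rule).
        B *= (X @ G.T) / (B @ (G @ G.T) + eps)

        # Fix the scale ambiguity: unit-norm spectra, gains rescaled so
        # that the product B @ G is unchanged.
        norms = np.linalg.norm(B, axis=0) + eps
        B /= norms
        G *= norms[:, None]

    return B, G


if __name__ == "__main__":
    # Synthetic mixture of two components with slowly varying gains.
    rng = np.random.default_rng(1)
    t = np.linspace(0.0, 3.0, 50)
    G_true = np.vstack([np.abs(np.sin(t)), np.abs(np.cos(2.0 * t))])
    X = rng.random((20, 2)) @ G_true
    B, G = nmf_continuity_sparse(X, n_components=2)
    print("final cost:", cost(X, B, G, alpha=0.1, beta=0.01))
```

Because the updates only ever multiply nonnegative quantities, `B` and `G` stay nonnegative throughout. The rescaling step keeps the penalties from being reduced simply by shrinking the gains and inflating the spectra, at the price of no longer guaranteeing a monotonic decrease of the assumed cost.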

      Published In

      IEEE Transactions on Audio, Speech, and Language Processing  Volume 15, Issue 3
      March 2007
      374 pages

      Publisher

      IEEE Press

      Author Tags

      1. Acoustic signal analysis
      2. audio source separation
      3. blind source separation
      4. music
      5. nonnegative matrix factorization
      6. sparse coding
      7. unsupervised learning

      Qualifiers

      • Research-article

      Cited By

      • (2024) MAVAR-SE: Multi-scale Audio-Visual Association Representation Network for End-to-End Speaker Extraction. MultiMedia Modeling, pp. 227-238. doi:10.1007/978-3-031-53308-2_17. Online publication date: 29-Jan-2024
      • (2023) A unified audio-visual learning framework for localization, separation, and recognition. Proceedings of the 40th International Conference on Machine Learning, pp. 25006-25017. doi:10.5555/3618408.3619449. Online publication date: 23-Jul-2023
      • (2023) Learning-based robust speaker counting and separation with the aid of spatial coherence. EURASIP Journal on Audio, Speech, and Music Processing 2023:1. doi:10.1186/s13636-023-00298-3. Online publication date: 20-Sep-2023
      • (2023) Self-Supervised Fine-Grained Cycle-Separation Network (FSCN) for Visual-Audio Separation. IEEE Transactions on Multimedia 25, pp. 5864-5876. doi:10.1109/TMM.2022.3200282. Online publication date: 1-Jan-2023
      • (2023) Algorithms for audio inpainting based on probabilistic nonnegative matrix factorization. Signal Processing 206:C. doi:10.1016/j.sigpro.2022.108905. Online publication date: 1-May-2023
      • (2023) A review on speech separation in cocktail party environment: challenges and approaches. Multimedia Tools and Applications 82:20, pp. 31035-31067. doi:10.1007/s11042-023-14649-x. Online publication date: 23-Feb-2023
      • (2022) RPCA-DRNN technique for monaural singing voice separation. EURASIP Journal on Audio, Speech, and Music Processing 2022:1. doi:10.1186/s13636-022-00236-9. Online publication date: 5-Feb-2022
      • (2022) Research on a Hyperplane Decomposition NMF Algorithm Applied to Speech Signal Separation. Proceedings of the 2022 4th International Conference on Video, Signal and Image Processing, pp. 152-157. doi:10.1145/3577164.3577188. Online publication date: 25-Nov-2022
      • (2022) Speaker extraction network with attention mechanism for speech dialogue system. Service Oriented Computing and Applications 16:2, pp. 111-119. doi:10.1007/s11761-022-00340-w. Online publication date: 1-Jun-2022
      • (2022) Single-channel Multi-speakers Speech Separation Based on Isolated Speech Segments. Neural Processing Letters 55:1, pp. 385-400. doi:10.1007/s11063-022-10887-6. Online publication date: 10-Jun-2022
