Environmental Sound Recognition With Time–Frequency Audio Features

Published: 01 August 2009

Abstract

This paper considers the task of recognizing environmental sounds in order to understand the scene or context surrounding an audio sensor. A variety of features have been proposed for audio recognition, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. Environmental sounds, such as the chirping of insects and the sound of rain, are typically noise-like with a broad, flat spectrum, yet they may include strong temporal-domain signatures. However, only a few temporal-domain features have previously been developed to characterize such diverse audio signals. Here, we perform an empirical feature analysis for audio environment characterization and propose using the matching pursuit (MP) algorithm to obtain effective time-frequency features. The MP-based method uses a dictionary of atoms for feature selection, resulting in a flexible, intuitive, and physically interpretable set of features. The MP-based features are adopted to supplement the MFCC features and yield higher recognition accuracy for environmental sounds. Extensive experiments are conducted to demonstrate the effectiveness of these joint features for unstructured environmental sound classification, including listening tests to study human recognition capabilities. Our recognition system is shown to produce performance comparable to that of human listeners.
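
As a rough illustration of the matching pursuit idea mentioned in the abstract, the following Python sketch greedily selects atoms from a small dictionary and summarizes the selected atoms' parameters as a feature vector. The abstract does not specify the dictionary or the feature definition, so everything concrete here is an assumption made for illustration: the Gabor-style atoms, the parameter grids, and the helper names `gabor_atom`, `build_dictionary`, `matching_pursuit`, and `mp_features` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def gabor_atom(n, scale, freq, shift):
    """Unit-norm real atom: Gaussian window times a cosine (assumed atom family)."""
    t = np.arange(n)
    window = np.exp(-np.pi * ((t - shift) / scale) ** 2)
    atom = window * np.cos(2 * np.pi * freq * (t - shift))
    return atom / np.linalg.norm(atom)


def build_dictionary(n, scales, freqs, shifts):
    """Return (atoms, params); row i of params holds (scale, freq, shift) of atom i."""
    params = [(s, f, u) for s in scales for f in freqs for u in shifts]
    atoms = np.stack([gabor_atom(n, s, f, u) for s, f, u in params])
    return atoms, np.array(params, dtype=float)


def matching_pursuit(signal, atoms, n_iter):
    """Greedy MP: repeatedly pick the atom most correlated with the residual."""
    residual = np.asarray(signal, dtype=float).copy()
    picks, coeffs = [], []
    for _ in range(n_iter):
        corr = atoms @ residual              # inner product with every atom
        k = int(np.argmax(np.abs(corr)))     # best-matching atom
        picks.append(k)
        coeffs.append(corr[k])
        residual -= corr[k] * atoms[k]       # remove its contribution
    return np.array(picks), np.array(coeffs), residual


def mp_features(params, picks):
    """Assumed feature: mean and std of scale and frequency of the selected atoms."""
    chosen = params[picks]
    return np.array([chosen[:, 0].mean(), chosen[:, 0].std(),
                     chosen[:, 1].mean(), chosen[:, 1].std()])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    atoms, params = build_dictionary(
        256, scales=[8, 32, 128], freqs=[0.05, 0.1, 0.2], shifts=[64, 128, 192])
    frame = rng.standard_normal(256)         # stand-in for one audio frame
    picks, coeffs, residual = matching_pursuit(frame, atoms, n_iter=5)
    print(mp_features(params, picks))
```

In a joint-feature setup like the one the abstract describes, a vector such as this would be concatenated with the MFCCs computed for the same frame before classification; that combination step is left out of the sketch.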

    Published In

    IEEE Transactions on Audio, Speech, and Language Processing, Volume 17, Issue 6
    August 2009
    179 pages

    Publisher

    IEEE Press

    Author Tags

    1. Audio classification
    2. Mel-frequency cepstral coefficient (MFCC)
    3. auditory scene recognition
    4. data representation
    5. feature extraction
    6. feature selection
    7. matching pursuit

    Qualifiers

    • Research-article

    Cited By

    • (2023) A Survey on Deep Learning Based Forest Environment Sound Classification at the Edge. ACM Computing Surveys, 10.1145/3618104, 56:3 (1-36). Online publication date: 5-Oct-2023
    • (2022) Deep convolutional neural network for environmental sound classification via dilation. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 10.3233/JIFS-219283, 43:2 (1827-1833). Online publication date: 1-Jan-2022
    • (2022) ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 10.1145/3491102.3502020 (1-16). Online publication date: 29-Apr-2022
    • (2021) Proposal of the Aesthetic Experience-Oriented Evaluation Framework for Field-recording Sound Retrieval System. Proceedings of the 2021 3rd International Conference on Image, Video and Signal Processing, 10.1145/3459212.3459223 (65-72). Online publication date: 19-Mar-2021
    • (2021) VoIPLoc. Proceedings of the 14th ACM Conference on Security and Privacy in Wireless and Mobile Networks, 10.1145/3448300.3467816 (323-334). Online publication date: 28-Jun-2021
    • (2021) Acoustic scene classification with matrix factorization for unsupervised feature learning. 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 10.1109/ICASSP.2016.7472918 (6445-6449). Online publication date: 11-Mar-2021
    • (2020) Interaction with the Soundscape: Exploring Emotional Audio Generation for Improved Individual Wellbeing. Artificial Intelligence in HCI, 10.1007/978-3-030-50334-5_15 (229-242). Online publication date: 19-Jul-2020
    • (2019) Image Approach to Speech Recognition on CNN. Proceedings of the 2019 3rd International Symposium on Computer Science and Intelligent Control, 10.1145/3386164.3389100 (1-6). Online publication date: 25-Sep-2019
    • (2019) Detection of Tennis Events from Acoustic Data. Proceedings of the 2nd International Workshop on Multimedia Content Analysis in Sports, 10.1145/3347318.3355520 (91-99). Online publication date: 15-Oct-2019
    • (2019) Environmental Audio Scene and Sound Event Recognition for Autonomous Surveillance. ACM Computing Surveys, 10.1145/3322240, 52:3 (1-34). Online publication date: 18-Jun-2019