Urban Sound Classification for Audio Analysis using Long Short Term Memory

Tyagi et al., 2023

Document ID: 11524687629811144405
Authors: Tyagi S; Aggarwal K; Kumar D; Garg S; et al.
Publication year: 2023
Publication venue: NEU Journal for Artificial Intelligence and Internet of Things

Snippet

The process of audio classification involves categorizing audio signals into predefined classes based on their acoustic characteristics. Deep learning techniques have played a significant role in addressing this issue. Researchers have proposed various approaches to …

Full text (PDF): dergi.neu.edu.tr

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks

Similar Documents

Publication	Publication Date	Title
Latif et al.	2020	Deep representation learning in speech processing: Challenges, recent advances, and future trends
Jothimani et al.	2022	MFF-SAug: Multi feature fusion with spectrogram augmentation of speech emotion recognition using convolution neural network
Bhatti et al.	2004	A neural network approach for human emotion recognition in speech
Bansal et al.	2022	Environmental Sound Classification: A descriptive review of the literature
Langari et al.	2020	Efficient speech emotion recognition using modified feature extraction
Orjesek et al.	2019	DNN based music emotion recognition from raw audio signal
Schmidt et al.	2012	Feature Learning in Dynamic Environments: Modeling the Acoustic Structure of Musical Emotion
Tyagi et al.	2023	Urban Sound Classification for Audio Analysis using Long Short Term Memory
Hosseini et al.	2024	Multimodal modelling of human emotion using sound, image and text fusion
Thornton	2019	Audio recognition using mel spectrograms and convolution neural networks
Iqbal et al.	2020	MFCC and machine learning based speech emotion recognition over TESS and IEMOCAP datasets
Yasmin et al.	2022	A rough set theory and deep learning-based predictive system for gender recognition using audio speech
Jing et al.	2023	A deep interpretable representation learning method for speech emotion recognition
Wang et al.	2023	A hierarchical birdsong feature extraction architecture combining static and dynamic modeling
Kumar et al.	2023	Automatic Bird Species Recognition using Audio and Image Data: A Short Review
Yue et al.	2023	Equilibrium optimizer for emotion classification from English speech signals
Sarkar et al.	2019	Raga identification from Hindustani classical music signal using compositional properties
Vijayan et al.	2022	Development and Analysis of Convolutional Neural Network based Accurate Speech Emotion Recognition Models
Islam et al.	2022	DCNN-LSTM based audio classification combining multiple feature engineering and data augmentation techniques
Mızrak et al.	2023	Gender Detection by Acoustic Characteristics of Sound with Machine Learning Algorithms
CN114067788A (en)	2022-02-18	Guangdong opera vocal cavity classification method based on combination of CNN and LSTM
Krishnendu	2023	Classification Of Carnatic Music Ragas Using RNN Deep Learning Models
Annabel et al.	2023	Environmental Sound Classification Using 1-D and 2-D Convolutional Neural Networks
Fu et al.	2020	Composite feature extraction for speech emotion recognition
Alamri	2023	Emotion recognition in Arabic speech from Saudi dialect corpus using machine learning and deep learning algorithms