Tyagi et al., 2023 - Google Patents
Urban Sound Classification for Audio Analysis using Long Short Term MemoryTyagi et al., 2023
View PDF- Document ID
- 11524687629811144405
- Author
- Tyagi S
- Aggarwal K
- Kumar D
- Garg S
- et al.
- Publication year
- Publication venue
- NEU Journal for Artificial Intelligence and Internet of Things
External Links
Snippet
The process of audio classification involves categorizing audio signals into predefined classes based on their acoustic characteristics. Deep learning techniques have played a significant role in addressing this issue. Researchers have proposed various approaches to …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Latif et al. | Deep representation learning in speech processing: Challenges, recent advances, and future trends | |
Jothimani et al. | MFF-SAug: Multi feature fusion with spectrogram augmentation of speech emotion recognition using convolution neural network | |
Bhatti et al. | A neural network approach for human emotion recognition in speech | |
Bansal et al. | Environmental Sound Classification: A descriptive review of the literature | |
Langari et al. | Efficient speech emotion recognition using modified feature extraction | |
Orjesek et al. | DNN based music emotion recognition from raw audio signal | |
Schmidt et al. | Feature Learning in Dynamic Environments: Modeling the Acoustic Structure of Musical Emotion. | |
Tyagi et al. | Urban Sound Classification for Audio Analysis using Long Short Term Memory | |
Hosseini et al. | Multimodal modelling of human emotion using sound, image and text fusion | |
Thornton | Audio recognition using mel spectrograms and convolution neural networks | |
Iqbal et al. | Mfcc and machine learning based speech emotion recognition over tess and iemocap datasets | |
Yasmin et al. | A rough set theory and deep learning-based predictive system for gender recognition using audio speech | |
Jing et al. | A deep interpretable representation learning method for speech emotion recognition | |
Wang et al. | A hierarchical birdsong feature extraction architecture combining static and dynamic modeling | |
Kumar et al. | Automatic Bird Species Recognition using Audio and Image Data: A Short Review | |
Yue et al. | Equilibrium optimizer for emotion classification from english speech signals | |
Sarkar et al. | Raga identification from Hindustani classical music signal using compositional properties | |
Vijayan et al. | Development and Analysis of Convolutional Neural Network based Accurate Speech Emotion Recognition Models | |
Islam et al. | DCNN-LSTM based audio classification combining multiple feature engineering and data augmentation techniques | |
Mızrak et al. | Gender Detection by Acoustic Characteristics of Sound with Machine Learning Algorithms | |
CN114067788A (en) | Guangdong opera vocal cavity classification method based on combination of CNN and LSTM | |
Krishnendu | Classification Of Carnatic Music Ragas Using RNN Deep Learning Models | |
Annabel et al. | Environmental Sound Classification Using 1-D and 2-D Convolutional Neural Networks | |
Fu et al. | Composite feature extraction for speech emotion recognition | |
Alamri | Emotion recognition in Arabic speech from Saudi dialect corpus using machine learning and deep learning algorithms |