Fundamentals of speech recognition
August 1993
Publisher: Prentice-Hall, Inc., Division of Simon and Schuster, One Lake Street, Upper Saddle River, NJ, United States
ISBN: 978-0-13-015157-5
Published: 01 August 1993
Pages: 507
Abstract

No abstract available.

Cited By

  1. Srivastava T, Khanna P, Pan S, Nguyen P and Jain S Unvoiced: Designing an LLM-assisted Unvoiced User Interface using Earables Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, (784-798)
  2. Parlak C and Altun Y (2024). A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing with VGG16, Circuits, Systems, and Signal Processing, 43:11, (7309-7338), Online publication date: 1-Nov-2024.
  3. Kivaisi A and Zhao Q (2024). Improved mini-batch multiple augmentation for low-resource spoken word recognition, Expert Systems with Applications: An International Journal, 252:PA, Online publication date: 15-Oct-2024.
  4. Shojaee Zade M, Mesbah M, Habibian M and Faroqi H (2024). Converting Urban Trips to Multi-Dimensional Signals to Improve Trip Purpose Inference, IEEE Transactions on Intelligent Transportation Systems, 25:10, (14497-14506), Online publication date: 1-Oct-2024.
  5. Sun X, Xiong J, Feng C, Li H, Wu Y, Fang D and Chen X (2024). EarSSR: Silent Speech Recognition via Earphones, IEEE Transactions on Mobile Computing, 23:8, (8493-8507), Online publication date: 1-Aug-2024.
  6. Chen T, Yang Y, Qiu C, Fan X, Guo X and Shangguan L Enabling Hands-Free Voice Assistant Activation on Earphones Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services, (155-168)
  7. Papyan N, Kulhandjian M, Kulhandjian H and Aslanyan L (2024). AI-Based Drone Assisted Human Rescue in Disaster Environments: Challenges and Opportunities, Pattern Recognition and Image Analysis, 34:1, (169-186), Online publication date: 1-Mar-2024.
  8. Aslanyan L (2024). Sequential Data Classification under Dynamic Emission, Pattern Recognition and Image Analysis, 34:1, (187-198), Online publication date: 1-Mar-2024.
  9. Nagaraja B, Yadava G and Anees M (2024). Advancements in encoded speech data by background noise suppression under uncontrolled environment, International Journal of Speech Technology, 27:1, (77-84), Online publication date: 1-Mar-2024.
  10. O'Shaughnessy D (2024). Trends and developments in automatic speech recognition research, Computer Speech and Language, 83:C, Online publication date: 1-Jan-2024.
  11. Becerra A, Rosa J, Velásquez E, Zepeda G, Escalante N and Pedroza A (2024). Portable student attendance management module for university environment by using biometric mechanisms, Multimedia Tools and Applications, 83:1, (1215-1239), Online publication date: 1-Jan-2024.
  12. Zhang Q, Wu L, Guo Z and Liu B (2024). Gabor Phase Retrieval in the Generalized Paley–Wiener Space, Circuits, Systems, and Signal Processing, 43:1, (470-494), Online publication date: 1-Jan-2024.
  13. Al-Issa S, Al-Ayyoub M, Al-Khaleel O and Elmitwally N (2023). Building a neural speech recognizer for quranic recitations, International Journal of Speech Technology, 26:4, (1131-1151), Online publication date: 1-Dec-2023.
  14. Nedjah N, Bonilla A and de Macedo Mourelle L (2023). Automatic speech recognition of Portuguese phonemes using neural networks ensemble, Expert Systems with Applications: An International Journal, 229:PA, Online publication date: 1-Nov-2023.
  15. Pham T (2023). Prediction of Five-Year Survival Rate for Rectal Cancer Using Markov Models of Convolutional Features of RhoB Expression on Tissue Microarray, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 20:5, (3195-3204), Online publication date: 1-Sep-2023.
  16. Shome N, Saritha B, Kashyap R and Laskar R (2023). A robust DNN model for text-independent speaker identification using non-speaker embeddings in diverse data conditions, Neural Computing and Applications, 35:26, (18933-18947), Online publication date: 1-Sep-2023.
  17. Yu Z, Chang Y, Zhang N and Xiao C SMACK Proceedings of the 32nd USENIX Conference on Security Symposium, (3799-3816)
  18. O'Shaughnessy D (2023). Review of analysis methods for speech applications, Speech Communication, 151:C, (64-75), Online publication date: 1-Jun-2023.
  19. Vatolkin I, Gotham M, López N and Ostermann F Musical Genre Recognition Based on Deep Descriptors of Harmony, Instrumentation, and Segments Artificial Intelligence in Music, Sound, Art and Design, (413-427)
  20. Stappen L, Baird A, Schumann L and Schuller B (2023). The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements, IEEE Transactions on Affective Computing, 14:2, (1334-1350), Online publication date: 1-Apr-2023.
  21. Bak T, Lee J, Bae H, Yang J, Bae J and Joo Y Avocodo Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (12562-12570)
  22. Bagaev D, de Vries B and Peña A (2023). Reactive Message Passing for Scalable Bayesian Inference, Scientific Programming, 2023, Online publication date: 1-Jan-2023.
  23. Chen J and Ng M (2023). Phase Retrieval of Quaternion Signal via Wirtinger Flow, IEEE Transactions on Signal Processing, 71, (2863-2878), Online publication date: 1-Jan-2023.
  24. Li P, Roch M, Klinck H, Fleishman E, Gillespie D, Nosal E, Shiu Y and Liu X (2023). Learning Stage-Wise GANs for Whistle Extraction in Time-Frequency Spectrograms, IEEE Transactions on Multimedia, 25, (9302-9314), Online publication date: 1-Jan-2023.
  25. Thaleiser S and Enzner G (2023). Binaural-Projection Multichannel Wiener Filter for Cue-Preserving Binaural Speech Enhancement, IEEE/ACM Transactions on Audio, Speech and Language Processing, 31, (3730-3745), Online publication date: 1-Jan-2023.
  26. Yen B, Li Y and Hioka Y (2023). Rotor Noise-Aware Noise Covariance Matrix Estimation for Unmanned Aerial Vehicle Audition, IEEE/ACM Transactions on Audio, Speech and Language Processing, 31, (2491-2506), Online publication date: 1-Jan-2023.
  27. Qi J, Yang C, Chen P and Tejedor J (2023). Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing, IEEE/ACM Transactions on Audio, Speech and Language Processing, 31, (633-642), Online publication date: 1-Jan-2023.
  28. Parvathala V, Andhavarapu S, Pamisetty G and Murty K (2023). Neural Comb Filtering Using Sliding Window Attention Network for Speech Enhancement, Circuits, Systems, and Signal Processing, 42:1, (322-343), Online publication date: 1-Jan-2023.
  29. Wu M, Louw T, Lahijanian M, Ruan W, Huang X, Merat N and Kwiatkowska M Gaze-based Intention Anticipation over Driving Manoeuvres in Semi-Autonomous Vehicles 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (6210-6216)
  30. Fu Y, Wang S, Zhong L, Chen L, Ren J and Zhang Y SVoice Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems, (622-636)
  31. Gosztolya G (2022). Estimating the degree of conflict in speech by employing Bag-of-Audio-Words and Fisher Vectors, Expert Systems with Applications: An International Journal, 205:C, Online publication date: 1-Nov-2022.
  32. Deb S and Dandapat S (2022). Analysis of out-of-breath speech for assessment of person’s physical fitness, Computer Speech and Language, 76:C, Online publication date: 1-Nov-2022.
  33. Kaur B, Rathi S and Agrawal R (2022). Enhanced depression detection from speech using Quantum Whale Optimization Algorithm for feature selection, Computers in Biology and Medicine, 150:C, Online publication date: 1-Nov-2022.
  34. Sahu L and Pradhan G (2022). Analysis of Short-Time Magnitude Spectra for Improving Intelligibility Assessment of Dysarthric Speech, Circuits, Systems, and Signal Processing, 41:10, (5676-5698), Online publication date: 1-Oct-2022.
  35. Mehra V and Pandey D (2022). Assistive Technology-Based Solution for Hearing Impairment Using Smartphones, International Journal of Software Innovation, 10:1, (1-17), Online publication date: 30-Sep-2022.
  36. Bhattacharjee M, Mahadeva Prasanna S and Guha P (2022). Speech/music classification using phase-based and magnitude-based features, Speech Communication, 142:C, (34-48), Online publication date: 1-Jul-2022.
  37. Chen Z, Li M, Wang R, Sun W, Liu J, Li H, Wang T, Lian Y, Zhang J and Wang X (2022). Diagnosis of COVID-19 via acoustic analysis and artificial intelligence by monitoring breath sounds on smartphones, Journal of Biomedical Informatics, 130:C, Online publication date: 1-Jun-2022.
  38. Chakraborty G, Sharma M, Saikia N and Sarma K (2022). Soft-computation based speech recognition system for Sylheti language, International Journal of Speech Technology, 25:2, (499-509), Online publication date: 1-Jun-2022.
  39. Bock T, Hunsen C, Joblin M and Apel S (2021). Synchronous development in open-source projects: A higher-level perspective, Automated Software Engineering, 29:1, Online publication date: 1-May-2022.
  40. Revathi A, Sasikaladevi N, Arunprasanth D and Amirtharajan R (2022). Robust respiratory disease classification using breathing sounds (RRDCBS) multiple features and models, Neural Computing and Applications, 34:10, (8155-8172), Online publication date: 1-May-2022.
  41. Chuangulueam C, Kijsirikul B and Thubthong N Voice Impersonation for Thai Speech Using CycleGAN over Prosody Proceedings of the 4th International Conference on Management Science and Industrial Engineering, (443-447)
  42. Hosoda Y, Kawamura A and Iiguni Y (2022). Speech Bandwidth Extension Using Data Hiding Based on Discrete Hartley Transform Domain, Circuits, Systems, and Signal Processing, 41:4, (2290-2307), Online publication date: 1-Apr-2022.
  43. Londhe A, Rao P, Upadhyay S, Jain R and Koundal D (2022). Extracting Behavior Identification Features for Monitoring and Managing Speech-Dependent Smart Mental Illness Healthcare Systems, Computational Intelligence and Neuroscience, 2022, Online publication date: 1-Jan-2022.
  44. Kim M and Shin J (2022). Improved Speech Enhancement Considering Speech PSD Uncertainty, IEEE/ACM Transactions on Audio, Speech and Language Processing, 30, (1939-1951), Online publication date: 1-Jan-2022.
  45. Mrinalini K, Vijayalakshmi P and Nagarajan T (2022). SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems, IEEE/ACM Transactions on Audio, Speech and Language Processing, 30, (1396-1406), Online publication date: 1-Jan-2022.
  46. Nagakrishnan R and Revathi A (2022). Generic speech based person authentication system with genuine and spoofed utterances: different feature sets and models, Multimedia Tools and Applications, 81:1, (1179-1208), Online publication date: 1-Jan-2022.
  47. Nataraj L, Mohammed T, Nanjundaswamy T, Chikkagoudar S, Chandrasekaran S and Manjunath B OMD: Orthogonal Malware Detection using Audio, Image, and Static Features MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM), (703-708)
  48. Mizutani E and Dreyfus S (2021). On using dynamic programming for time warping in pattern recognition, Information Sciences: an International Journal, 580:C, (684-704), Online publication date: 1-Nov-2021.
  49. Kaur G, Srivastava M and Kumar A (2021). Speech Recognition Using Enhanced Features with Deep Belief Network for Real Time Application, Wireless Personal Communications: An International Journal, 120:4, (3225-3242), Online publication date: 1-Oct-2021.
  50. Vrigkas M, Kazakos E, Nikou C and Kakadiaris I (2021). Human activity recognition using robust adaptive privileged probabilistic learning, Pattern Analysis & Applications, 24:3, (915-932), Online publication date: 1-Aug-2021.
  51. E. M, M. S, Rao K, Jayagopi D and Ramasubramanian V (2021). Approaches for Multilingual Phone Recognition in Code-switched and Non-code-switched Scenarios Using Indian Languages, ACM Transactions on Asian and Low-Resource Language Information Processing, 20:4, (1-19), Online publication date: 31-Jul-2021.
  52. Pinilla S, Mishra K and Sadler B WaveMax: FrFT-Based Convex Phase Retrieval for Radar Waveform Design 2021 IEEE International Symposium on Information Theory (ISIT), (2387-2392)
  53. R S, Kamath A, A N S and R K (2021). Fully Responsive Image and Speech Detection Artificial Yankee (FRIDAY): Human Assistant, SN Computer Science, 2:4, Online publication date: 1-Jul-2021.
  54. Huang Y and Yang J (2021). A multi-scale descriptor for real time RGB-D hand gesture recognition, Pattern Recognition Letters, 144:C, (97-104), Online publication date: 1-Apr-2021.
  55. Thimmaraja Y, Nagaraja B and Jayanna H (2021). Speech enhancement and encoding by combining SS-VAD and LPC, International Journal of Speech Technology, 24:1, (165-172), Online publication date: 1-Mar-2021.
  56. Jasim M, Khaloo P, Wadhwa S, Zhang A, Sarvghad A and Mahyar N (2021). CommunityClick, Proceedings of the ACM on Human-Computer Interaction, 4:CSCW3, (1-32), Online publication date: 5-Jan-2021.
  57. Sakata T, Ikeda N, Ueda Y and Watanabe A (2021). Vocal Tract Length Estimation Using Accumulated Means of Formants and Its Effects on Speaker-Normalization, IEEE/ACM Transactions on Audio, Speech and Language Processing, 29, (1049-1064), Online publication date: 1-Jan-2021.
  58. Yang W, Benesty J, Huang G and Chen J (2020). A New Class of Differential Beamformers, IEEE/ACM Transactions on Audio, Speech and Language Processing, 29, (594-606), Online publication date: 1-Jan-2021.
  59. Munawar A, Raza S and Qasim A (2020). Design and Development of AI-Based Tourist Facilitator and Information Agent, Applied Computer Systems, 25:2, (124-133), Online publication date: 1-Dec-2020.
  60. El Ouahabi S, Atounti M and Bellouki M (2020). Optimal parameters selected for automatic recognition of spoken Amazigh digits and letters using Hidden Markov Model Toolkit, International Journal of Speech Technology, 23:4, (861-871), Online publication date: 1-Dec-2020.
  61. Banothu R, Basha S, Molakatala N, Gautam V and Gangashetty S Speech Based Access of Kisan Information System in Telugu Language Intelligent Human Computer Interaction, (287-298)
  62. GM H, Gourisaria M, Pandey M and Rautaray S (2020). A comprehensive survey and analysis of generative models in machine learning, Computer Science Review, 38:C, Online publication date: 1-Nov-2020.
  63. Shchetinin E, Sevastianov L, Kulyabov D, Ayrjan E and Demidova A Deep Neural Networks for Emotion Recognition Distributed Computer and Communication Networks, (365-379)
  64. Chen Y, Ye J and Li J (2020). Aggregated Wasserstein Distance and State Registration for Hidden Markov Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42:9, (2133-2147), Online publication date: 1-Sep-2020.
  65. Aggarwal G, Monga R and Gochhayat S (2020). A Novel Hybrid PSO Assisted Optimization for Classification of Intellectual Disability Using Speech Signal, Wireless Personal Communications: An International Journal, 113:4, (1955-1971), Online publication date: 1-Aug-2020.
  66. Aguiar de Lima T and Da Costa-Abreu M (2020). A survey on automatic speech recognition systems for Portuguese language and its variations, Computer Speech and Language, 62:C, Online publication date: 1-Jul-2020.
  67. Lee M, Chiang S, Yeh S and Wen T (2020). Study on emotion recognition and companion Chatbot using deep neural network, Multimedia Tools and Applications, 79:27-28, (19629-19657), Online publication date: 1-Jul-2020.
  68. Becerra A, Rosa J, González E, Pedroza A, Escalante N and Santos E (2020). A comparative case study of neural network training by using frame-level cost functions for automatic speech recognition purposes in Spanish, Multimedia Tools and Applications, 79:27-28, (19669-19715), Online publication date: 1-Jul-2020.
  69. Watanabe K Discrete Optimal Reconstruction Distributions for Itakura-Saito Distortion Measure 2020 IEEE International Symposium on Information Theory (ISIT), (2399-2404)
  70. Deng H, Chen W, Shen Q, Ma A, Yuen P and Feng G (2020). Invariant subspace learning for time series data based on dynamic time warping distance, Pattern Recognition, 102:C, Online publication date: 1-Jun-2020.
  71. Sayoud A and Djendi M (2020). Efficient subband fast adaptive algorithm based-backward blind source separation for speech intelligibility enhancement, International Journal of Speech Technology, 23:2, (471-479), Online publication date: 1-Jun-2020.
  72. Xu H, Shen M and Duan Y A passive controlled hand rehabilitation instrument Proceedings of the 2020 2nd International Conference on Big Data and Artificial Intelligence, (433-436)
  73. Vafeiadis A, Votis K, Giakoumis D, Tzovaras D, Chen L and Hamzaoui R (2020). Audio content analysis for unobtrusive event detection in smart homes, Engineering Applications of Artificial Intelligence, 89:C, Online publication date: 1-Mar-2020.
  74. Revathi A, Ravichandran C, Saisiddarth P and Prasad G (2020). Isolated Command Recognition Using MFCC and Clustering Algorithm, SN Computer Science, 1:2, Online publication date: 1-Mar-2020.
  75. Revathi A, Nagakrishnan R and Sasikaladevi N (2020). Twin identification from speech: linear and non-linear cepstral features and models, International Journal of Speech Technology, 23:1, (183-189), Online publication date: 1-Mar-2020.
  76. Kadyan V, Mantri A and Aggarwal R (2019). Improved filter bank on multitaper framework for robust Punjabi-ASR system, International Journal of Speech Technology, 23:1, (87-100), Online publication date: 1-Mar-2020.
  77. Takano W, Murakami Y and Nakamura Y (2020). Representation and classification of whole-body motion integrated with finger motion, Robotics and Autonomous Systems, 124:C, Online publication date: 1-Feb-2020.
  78. Takano W, Kanayama H, Takahashi T, Moridaira T and Nakamura Y (2020). A data-driven approach to probabilistic impedance control for humanoid robots, Robotics and Autonomous Systems, 124:C, Online publication date: 1-Feb-2020.
  79. Bhattacharjee M, Prasanna S and Guha P (2020). Speech/Music Classification Using Features From Spectral Peaks, IEEE/ACM Transactions on Audio, Speech and Language Processing, 28, (1549-1559), Online publication date: 1-Jan-2020.
  80. Kodrasi I and Bourlard H (2020). Spectro-Temporal Sparsity Characterization for Dysarthric Speech Detection, IEEE/ACM Transactions on Audio, Speech and Language Processing, 28, (1210-1222), Online publication date: 1-Jan-2020.
  81. Nugraha A, Sekiguchi K and Yoshii K (2020). A Flow-Based Deep Latent Variable Model for Speech Spectrogram Modeling and Enhancement, IEEE/ACM Transactions on Audio, Speech and Language Processing, 28, (1104-1117), Online publication date: 1-Jan-2020.
  82. Zhao D and Du F (2018). A novel approach for scale and rotation adaptive estimation based on time series alignment, The Visual Computer: International Journal of Computer Graphics, 36:1, (175-189), Online publication date: 1-Jan-2020.
  83. Praveen Kumar P, Thimmaraja Yadava G and Jayanna H (2019). Continuous Kannada Speech Recognition System Under Degraded Condition, Circuits, Systems, and Signal Processing, 39:1, (391-419), Online publication date: 1-Jan-2020.
  84. Gosztolya G (2019). Posterior-thresholding feature extraction for paralinguistic speech classification, Knowledge-Based Systems, 186:C, Online publication date: 15-Dec-2019.
  85. Ma Q, Zheng J, Li S and Cottrell G Learning representations for time series clustering Proceedings of the 33rd International Conference on Neural Information Processing Systems, (3781-3791)
  86. Qin C, Zhang W and Qu D (2019). A new joint CTC-attention-based speech recognition model with multi-level multi-head attention, EURASIP Journal on Audio, Speech, and Music Processing, 2019:1, (1-12), Online publication date: 1-Dec-2019.
  87. Giannoulis P, Potamianos G and Maragos P (2019). Room-localized speech activity detection in multi-microphone smart homes, EURASIP Journal on Audio, Speech, and Music Processing, 2019:1, (1-23), Online publication date: 1-Dec-2019.
  88. Calvo-Zaragoza J, Toselli A and Vidal E (2019). Handwritten Music Recognition for Mensural notation with convolutional recurrent neural networks, Pattern Recognition Letters, 128:C, (115-121), Online publication date: 1-Dec-2019.
  89. Calvo-Zaragoza J, Toselli A and Vidal E (2019). Hybrid hidden Markov models and artificial neural networks for handwritten music recognition in mensural notation, Pattern Analysis & Applications, 22:4, (1573-1584), Online publication date: 1-Nov-2019.
  90. Jati A and Georgiou P (2019). Neural Predictive Coding Using Convolutional Neural Networks Toward Unsupervised Learning of Speaker Characteristics, IEEE/ACM Transactions on Audio, Speech and Language Processing, 27:10, (1577-1589), Online publication date: 1-Oct-2019.
  91. Takano W and Nakamura Y (2019). Synthesis of kinematically constrained full-body motion from stochastic motion model, Autonomous Robots, 43:7, (1881-1894), Online publication date: 1-Oct-2019.
  92. Wan C, Wang L and Phoha V (2018). A Survey on Gait Recognition, ACM Computing Surveys, 51:5, (1-35), Online publication date: 30-Sep-2019.
  93. Andrade T, Cancela B and Gama J Discovering Common Pathways Across Users’ Habits in Mobility Data Progress in Artificial Intelligence, (410-421)
  94. Yadava T and Jayanna H (2019). Speech enhancement by combining spectral subtraction and minimum mean square error-spectrum power estimator based on zero crossing, International Journal of Speech Technology, 22:3, (639-648), Online publication date: 1-Sep-2019.
  95. Bansal M and Sircar P (2019). A Novel AFM Signal Model for Parametric Representation of Speech Phonemes, Circuits, Systems, and Signal Processing, 38:9, (4079-4095), Online publication date: 1-Sep-2019.
  96. Arunachalam R (2019). A strategic approach to recognize the speech of the children with hearing impairment: different sets of features and models, Multimedia Tools and Applications, 78:15, (20787-20808), Online publication date: 1-Aug-2019.
  97. Takano W, Takahashi T and Nakamura Y (2019). Sequential Monte Carlo controller that integrates physical consistency and motion knowledge, Autonomous Robots, 43:6, (1523-1536), Online publication date: 1-Aug-2019.
  98. O'Shaughnessy D (2019). Recognition and Processing of Speech Signals Using Neural Networks, Circuits, Systems, and Signal Processing, 38:8, (3454-3481), Online publication date: 1-Aug-2019.
  99. Guo Y, Wang T, Li J, Wang A and Wang W (2019). Multiple Input Single Output Phase Retrieval, Circuits, Systems, and Signal Processing, 38:8, (3818-3840), Online publication date: 1-Aug-2019.
  100. Farrús M (2018). Voice Disguise in Automatic Speaker Recognition, ACM Computing Surveys, 51:4, (1-22), Online publication date: 31-Jul-2019.
  101. Arnupapsanyakorn S and Ratanamahatana C An Enhanced Time Series Classification Using Linear-Regression Based Shape Descriptor Proceedings of the 2019 2nd International Conference on Data Science and Information Technology, (85-91)
  102. Abazid M, Houmani N and Garcia-Salicetti S Impact of Spatial Constraints when Signing in Uncontrolled Mobile Conditions Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, (89-94)
  103. Sasikaladevi N, Geetha K, Revathi A, Mahalakshmi N and Archana N (2019). SCAN-speech biometric template protection based on genus-2 hyper elliptic curve, Multimedia Tools and Applications, 78:13, (18339-18361), Online publication date: 1-Jul-2019.
  104. Alfaro-Contreras M, Calvo-Zaragoza J and Iñesta J Approaching End-to-End Optical Music Recognition for Homophonic Scores Pattern Recognition and Image Analysis, (147-158)
  105. Huang Y, Chen W, Chen H, Wang L and Wu K G-Fall: Device-free and Training-free Fall Detection with Geophones 2019 16th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), (1-9)
  106. Almotairi M, Alsahfi T and Elmasri R Challenges of comparing and matching roads from different spatial datasets Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments, (164-171)
  107. Sharma U, Maheshkar S, Mishra A and Kaushik R (2019). Visual Speech Recognition Using Optical Flow and Hidden Markov Model, Wireless Personal Communications: An International Journal, 106:4, (2129-2147), Online publication date: 1-Jun-2019.
  108. Azam M and Bouguila N (2019). Bounded Generalized Gaussian Mixture Model with ICA, Neural Processing Letters, 49:3, (1299-1320), Online publication date: 1-Jun-2019.
  109. Kishi R, Trojahn T and Goularte R (2019). Correlation based feature fusion for the temporal video scene segmentation task, Multimedia Tools and Applications, 78:11, (15623-15646), Online publication date: 1-Jun-2019.
  110. Djendi M and Sayoud A (2019). A new dual subband fast NLMS adaptive filtering algorithm for blind speech quality enhancement and acoustic noise reduction, International Journal of Speech Technology, 22:2, (391-406), Online publication date: 1-Jun-2019.
  111. Srivastava S, Chandra M and Sahoo G (2019). Speaker identification and its application in automobile industry for automatic seat adjustment, Microsystem Technologies, 25:6, (2339-2347), Online publication date: 1-Jun-2019.
  112. Murthy Y and Koolagudi S (2018). Content-Based Music Information Retrieval (CB-MIR) and Its Applications toward the Music Industry, ACM Computing Surveys, 51:3, (1-46), Online publication date: 31-May-2019.
  113. Blanc K, Lingrand D, Paladini A, Coviello L, Mitrev D, Söhler E, Guzman L and Precioso F Analysis of temporal alignment for Video Classification 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), (1-5)
  114. Vatolkin I and Stoller D Evolutionary Multi-objective Training Set Selection of Data Instances and Augmentations for Vocal Detection Computational Intelligence in Music, Sound, Art and Design, (201-216)
  115. Palaz D, Magimai-Doss M and Collobert R (2019). End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition, Speech Communication, 108:C, (15-32), Online publication date: 1-Apr-2019.
  116. Prasada Rao K, Chandra Sekhara Rao M and Hemanth Chowdary N (2019). An integrated approach to emotion recognition and gender classification, Journal of Visual Communication and Image Representation, 60:C, (339-345), Online publication date: 1-Apr-2019.
  117. Vall A, Dorfer M, Eghbal-Zadeh H, Schedl M, Burjorjee K and Widmer G (2019). Feature-combination hybrid recommender systems for automated music playlist continuation, User Modeling and User-Adapted Interaction, 29:2, (527-572), Online publication date: 1-Apr-2019.
  118. Sinha S, Jain A and Agrawal S (2019). Empirical analysis of linguistic and paralinguistic information for automatic dialect classification, Artificial Intelligence Review, 51:4, (647-672), Online publication date: 1-Apr-2019.
  119. Benmachiche A, Makhlouf A and Bouhadada T Evolutionary learning of HMM with Gaussian mixture densities for Automatic speech recognition Proceedings of the 9th International Conference on Information Systems and Technologies, (1-6)
  120. Granell E, Romero V and Martínez-Hinarejos C (2019). Image–speech combination for interactive computer assisted transcription of handwritten documents, Computer Vision and Image Understanding, 180:C, (74-83), Online publication date: 1-Mar-2019.
  121. Gupta S, Karanath A, Mahrifa K, Dileep A and Thenkanidiyoor V (2019). Segment-level probabilistic sequence kernel and segment-level pyramid match kernel based extreme learning machine for classification of varying length patterns of speech, International Journal of Speech Technology, 22:1, (231-249), Online publication date: 1-Mar-2019.
  122. Kalamani M, Krishnamoorthi M and Valarmathi R (2019). Continuous Tamil Speech Recognition technique under non stationary noisy environments, International Journal of Speech Technology, 22:1, (47-58), Online publication date: 1-Mar-2019.
  123. Kadyan V, Mantri A, Aggarwal R and Singh A (2019). A comparative study of deep neural network based Punjabi-ASR system, International Journal of Speech Technology, 22:1, (111-119), Online publication date: 1-Mar-2019.
  124. Zakeri V and Hodgson A (2019). Automatic Identification of Hard and Soft Bone Tissues by Analyzing Drilling Sounds, IEEE/ACM Transactions on Audio, Speech and Language Processing, 27:2, (404-414), Online publication date: 1-Feb-2019.
  125. Bayle Y, Robine M and Hanna P (2019). SATIN, Multimedia Tools and Applications, 78:3, (2703-2718), Online publication date: 1-Feb-2019.
  126. De Floriani L (2018). Message from the Editor-in-Chief, IEEE Transactions on Visualization and Computer Graphics, 25:1, (xi-xi), Online publication date: 1-Jan-2019.
  127. Andrienko N, Andrienko G, Garcia J and Scarlatti D (2018). Analysis of Flight Variability: a Systematic Approach, IEEE Transactions on Visualization and Computer Graphics, 25:1, (54-64), Online publication date: 1-Jan-2019.
  128. Revathi A, Jeyalakshmi C and Thenmozhi K (2019). Person authentication using speech as a biometric against play back attacks, Multimedia Tools and Applications, 78:2, (1569-1582), Online publication date: 1-Jan-2019.
  129. Gosztolya G and Busa-Fekete R (2019). Calibrating AdaBoost for phoneme classification, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 23:1, (115-128), Online publication date: 1-Jan-2019.
  130. Werner M, Ji W and AbouRizk S Improving tunneling simulation using bayesian updating and hidden Markov chains Proceedings of the 2018 Winter Simulation Conference, (3930-3940)
  131. Kanhe A and Gnanasekaran A (2018). Robust image-in-audio watermarking technique based on DCT-SVD transform, EURASIP Journal on Audio, Speech, and Music Processing, 2018:1, (1-12), Online publication date: 1-Dec-2018.
  132. Hsu S, Lee C, Chang P, Han C and Fan K (2018). Local Wavelet Acoustic Pattern: A Novel Time–Frequency Descriptor for Birdsong Recognition, IEEE Transactions on Multimedia, 20:12, (3187-3199), Online publication date: 1-Dec-2018.
  133. Aggarwal G and Singh L (2018). Evaluation of Supervised Learning Algorithms Based on Speech Features as Predictors to the Diagnosis of Mild to Moderate Intellectual Disability, 3D Research, 9:4, (1-11), Online publication date: 1-Dec-2018.
  134. Zoulikha M and Djendi M (2018). A new robust forward BSS adaptive algorithm based on automatic voice activity detector for speech quality enhancement, International Journal of Speech Technology, 21:4, (1007-1020), Online publication date: 1-Dec-2018.
  135. Bansal M and Sircar P (2018). Low bit-rate speech coding based on multicomponent AFM signal model, International Journal of Speech Technology, 21:4, (783-795), Online publication date: 1-Dec-2018.
  136. Revathi A, Sasikaladevi N and Jeyalakshmi C (2018). Digital speech watermarking to enhance the security using speech as a biometric for person authentication, International Journal of Speech Technology, 21:4, (1021-1031), Online publication date: 1-Dec-2018.
  137. Touazi A and Debyeche M (2018). An investigation of the impact of MVA normalization on the advanced front-end features, International Journal of Speech Technology, 21:4, (887-893), Online publication date: 1-Dec-2018.
  138. Shahnawazuddin S, Singh C, Kathania H, Ahmad W and Pradhan G (2018). An Experimental Study on the Significance of Variable Frame-Length and Overlap in the Context of Children's Speech Recognition, Circuits, Systems, and Signal Processing, 37:12, (5540-5553), Online publication date: 1-Dec-2018.
  139. Bhukya R, Sarma B and Prasanna S (2018). End Point Detection Using Speech-Specific Knowledge for Text-Dependent Speaker Verification, Circuits, Systems, and Signal Processing, 37:12, (5507-5539), Online publication date: 1-Dec-2018.
  140. ACM
    Almotairi M, Alsahfi T and Elmasri R Using Local and Global Divergence Measures to Identify Road Similarity in Different Road Network Datasets Proceedings of the 11th ACM SIGSPATIAL International Workshop on Computational Transportation Science, (21-28)
  141. Kanhe A and Aghila G (2018). A DCT–SVD-Based Speech Steganography in Voiced Frames, Circuits, Systems, and Signal Processing, 37:11, (5049-5068), Online publication date: 1-Nov-2018.
  142. ACM
    Choi H, Lee W, Aafer Y, Fei F, Tu Z, Zhang X, Xu D and Deng X Detecting Attacks Against Robotic Vehicles Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, (801-816)
  143. Cuzzocrea A, Mumolo E and Hassani M An Effective and Efficient Approach for Supporting the Generation of Synthetic Memory Reference Traces via Hierarchical Hidden/Non-Hidden Markov Models 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (2953-2959)
  144. Becerra A, Rosa J, González E, Pedroza A and Escalante N (2018). Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition, Multimedia Tools and Applications, 77:20, (27231-27267), Online publication date: 1-Oct-2018.
  145. Grekow J (2018). Musical performance analysis in terms of emotions it evokes, Journal of Intelligent Information Systems, 51:2, (415-437), Online publication date: 1-Oct-2018.
  146. Furukawa M and Shinomoto S (2018). Inferring objects from a multitude of oscillations, Neural Computing and Applications, 30:8, (2471-2478), Online publication date: 1-Oct-2018.
  147. Revathi A, Sasikaladevi N, Nagakrishnan R and Jeyalakshmi C (2018). Robust emotion recognition from speech, International Journal of Speech Technology, 21:3, (723-739), Online publication date: 1-Sep-2018.
  148. Klaylat S, Osman Z, Hamandi L and Zantout R (2018). Emotion recognition in Arabic speech, Analog Integrated Circuits and Signal Processing, 96:2, (337-351), Online publication date: 1-Aug-2018.
  149. Li Y and Nakos V Sublinear-Time Algorithms for Compressive Phase Retrieval 2018 IEEE International Symposium on Information Theory (ISIT), (2301-2305)
  150. Becerra A, De La Rosa J and González E (2018). Speech recognition in a dialog system, Multimedia Tools and Applications, 77:12, (15875-15911), Online publication date: 1-Jun-2018.
  151. Wang Z and Piccardi M (2018). Minimum-risk temporal alignment of videos, Multimedia Tools and Applications, 77:12, (14891-14906), Online publication date: 1-Jun-2018.
  152. Lv F and Sun W (2018). Real phase retrieval from unordered partial frame coefficients, Advances in Computational Mathematics, 44:3, (879-896), Online publication date: 1-Jun-2018.
  153. Prescher D, Bornschein J, Köhlmann W and Weber G (2018). Touching graphical applications, Universal Access in the Information Society, 17:2, (391-409), Online publication date: 1-Jun-2018.
  154. Kathania H, Ahmad W, Shahnawazuddin S and Samaddar A (2018). Explicit Pitch Mapping for Improved Children's Speech Recognition, Circuits, Systems, and Signal Processing, 37:5, (2021-2044), Online publication date: 1-May-2018.
  155. Jannati M and Sayadiyan A (2018). Part-Syllable Transformation-Based Voice Conversion with Very Limited Training Data, Circuits, Systems, and Signal Processing, 37:5, (1935-1957), Online publication date: 1-May-2018.
  156. ACM
    Vertanen K, Fletcher C, Gaines D, Gould J and Kristensson P The Impact of Word, Multiple Word, and Sentence Input on Virtual Keyboard Decoding Performance Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, (1-12)
  157. Hu G, Hu Y, Yang K, Yu Z, Sung F, Zhang Z, Xie F, Liu J, Robertson N, Hospedales T and Miemie Q Deep Stock Representation Learning: From Candlestick Charts to Investment Decisions 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2706-2710)
  158. Bendory T, Edidin D and Eldar Y Recovering Signals from their FROG Trace 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (1488-1492)
  159. Van Vaerenbergh S, Santamaria I, Elvira V and Salvatori M Pattern Localization in Time Series Through Signal-To-Model Alignment in Latent Space 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2711-2715)
  160. Juvela L, Bollepalli B, Wang X, Kameoka H, Airaksinen M, Yamagishi J and Alku P Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (5679-5683)
  161. Yela D, Ewert S, O'Hanlon K and Sandler M Shift-Invariant Kernel Additive Modelling for Audio Source Separation 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (616-620)
  162. Sinha R and Shahnawazuddin S (2018). Assessment of pitch-adaptive front-end signal processing for children's speech recognition, Computer Speech and Language, 48:C, (103-121), Online publication date: 1-Mar-2018.
  163. Zerrouki N and Houacine A (2018). Combined curvelets and hidden Markov models for human fall detection, Multimedia Tools and Applications, 77:5, (6405-6424), Online publication date: 1-Mar-2018.
  164. Zhao J and Itti L (2018). shapeDTW, Pattern Recognition, 74:C, (171-184), Online publication date: 1-Feb-2018.
  165. Capecci M, Ceravolo M, Ferracuti F, Iarlori S, Kyrki V, Monteriù A, Romeo L and Verdini F (2018). A Hidden Semi-Markov Model based approach for rehabilitation exercise assessment, Journal of Biomedical Informatics, 78:C, (1-11), Online publication date: 1-Feb-2018.
  166. Ghosh R, Roy P and Kumar P (2018). Smart Device Authentication Based on Online Handwritten Script Identification and Word Recognition in Indic Scripts Using Zone-Wise Features, International Journal of Information System Modeling and Design, 9:1, (21-55), Online publication date: 1-Jan-2018.
  167. Kłosowski P (2017). Statistical analysis of orthographic and phonemic language corpus for word-based and phoneme-based Polish language modelling, EURASIP Journal on Audio, Speech, and Music Processing, 2017:1, (1-16), Online publication date: 1-Dec-2017.
  168. Zafeiriou L, Panagakis Y, Pantic M and Zafeiriou S (2017). Nonnegative Decompositions for Dynamic Visual Data Analysis, IEEE Transactions on Image Processing, 26:12, (5603-5617), Online publication date: 1-Dec-2017.
  169. Bonilla Cardona D, Nedjah N and Mourelle L (2017). Online phoneme recognition using multi-layer perceptron networks combined with recurrent non-linear autoregressive neural networks with exogenous inputs, Neurocomputing, 265:C, (78-90), Online publication date: 22-Nov-2017.
  170. ACM
    Venkataramani S, Smaragdis P and Mysore G AutoDub Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, (533-538)
  171. Bharathi B (2017). Speaker-specific-text based speaker verification system using spectral and phase based features, International Journal of Speech Technology, 20:3, (465-474), Online publication date: 1-Sep-2017.
  172. ACM
    Bouraoui H, Jerad C, Chattopadhyay A and Hadj-Alouane N (2017). Hardware Architectures for Embedded Speaker Recognition Applications, ACM Transactions on Embedded Computing Systems, 16:3, (1-28), Online publication date: 31-Aug-2017.
  173. Huang Y, Ao W and Zhang G (2017). Novel Sub-band Spectral Centroid Weighted Wavelet Packet Features with Importance-Weighted Support Vector Machines for Robust Speech Emotion Recognition, Wireless Personal Communications: An International Journal, 95:3, (2223-2238), Online publication date: 1-Aug-2017.
  174. C.K. Y, Hariharan M, Ngadiran R, Adom A, Yaacob S and Polat K (2017). Hybrid BBO_PSO and higher order spectral features for emotion and stress recognition from natural speech, Applied Soft Computing, 56:C, (217-232), Online publication date: 1-Jul-2017.
  175. Nadeau A and Sharma G (2017). An Audio Watermark Designed for Efficient and Robust Resynchronization After Analog Playback, IEEE Transactions on Information Forensics and Security, 12:6, (1393-1405), Online publication date: 1-Jun-2017.
  176. Pham T (2017). Validation of Computer Models for Evaluating the Efficacy of Cognitive Stimulation Therapy, Wireless Personal Communications: An International Journal, 94:3, (301-314), Online publication date: 1-Jun-2017.
  177. ACM
    Chandarana D, Shah V, Kumar A and Saul L SpeakQL Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, (1-6)
  178. ACM
    Huang Y, Huang Y, Xue N and Bigham J Leveraging Complementary Contributions of Different Workers for Efficient Crowdsourcing of Video Captions Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, (4617-4626)
  179. Jaganathan K, Oymak S and Hassibi B (2017). Sparse Phase Retrieval, IEEE Transactions on Signal Processing, 65:9, (2402-2410), Online publication date: 1-May-2017.
  180. Wang W, Liu A, Shahzad M, Ling K and Lu S (2017). Device-Free Human Activity Recognition Using Commercial WiFi Devices, IEEE Journal on Selected Areas in Communications, 35:5, (1118-1131), Online publication date: 1-May-2017.
  181. Takano W, Yamada Y and Nakamura Y (2017). Generation of action description from classification of motion and object, Robotics and Autonomous Systems, 91:C, (247-257), Online publication date: 1-May-2017.
  182. Potamianos G, Marcheret E, Mroueh Y, Goel V, Koumbaroulis A, Vartholomaios A and Thermos S Audio and visual modality combination in speech processing applications The Handbook of Multimodal-Multisensor Interfaces, (489-543)
  183. Katsamanis A, Pitsikalis V, Theodorakis S and Maragos P Multimodal gesture recognition The Handbook of Multimodal-Multisensor Interfaces, (449-487)
  184. Oviatt S, Schuller B, Cohen P, Sonntag D, Potamianos G and Krüger A (2017). The Handbook of Multimodal-Multisensor Interfaces, 10.1145/3015783, Online publication date: 24-Apr-2017.
  185. Schneider J, Börner D, Rosmalen P and Specht M (2017). Presentation Trainer, Journal of Computer Assisted Learning, 33:2, (164-177), Online publication date: 1-Apr-2017.
  186. Sangeetha J and Jothilakshmi S (2017). Speech translation system for english to dravidian languages, Applied Intelligence, 46:3, (534-550), Online publication date: 1-Apr-2017.
  187. Chin Y, Chen B and Wang J Kernel weighted Fisher sparse analysis on multiple maps for audio event recognition 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (6010-6014)
  188. Hwang K and Sung W Character-level language modeling with hierarchical recurrent neural networks 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (5720-5724)
  189. Wang J, Kim M, Hernandez-Mulero A, Heitzman D and Ferrari P Towards decoding speech production from single-trial magnetoencephalography (MEG) signals 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (3036-3040)
  190. Kumar D, Bezdek J, Rajasegarar S, Leckie C and Palaniswami M (2017). A visual-numeric approach to clustering and anomaly detection for trajectory data, The Visual Computer: International Journal of Computer Graphics, 33:3, (265-281), Online publication date: 1-Mar-2017.
  191. Wang H, Yang W, Yuan C, Ling H and Hu W (2017). Human activity prediction using temporally-weighted generalized time warping, Neurocomputing, 225:C, (139-147), Online publication date: 15-Feb-2017.
  192. ACM
    Meutzner H, Gupta S, Nguyen V, Holz T and Kolossa D (2016). Toward Improved Audio CAPTCHAs Based on Auditory Perception and Language Understanding, ACM Transactions on Privacy and Security, 19:4, (1-31), Online publication date: 3-Feb-2017.
  193. Granell E and Martinez-Hinarejos C (2017). Multimodal Crowdsourcing for Transcribing Handwritten Documents, IEEE/ACM Transactions on Audio, Speech and Language Processing, 25:2, (409-419), Online publication date: 1-Feb-2017.
  194. Yang F, Balakrishnan S and Wainwright M (2017). Statistical and computational guarantees for the Baum-Welch algorithm, The Journal of Machine Learning Research, 18:1, (4528-4580), Online publication date: 1-Jan-2017.
  195. Nandi D, Pati D and Rao K (2017). Parametric representation of excitation source information for language identification, Computer Speech and Language, 41:C, (88-115), Online publication date: 1-Jan-2017.
  196. ACM
    Dang S, Chaudhury S, Lall B and Roy P Autoregressive hidden Markov model with missing data for modelling functional MR imaging data Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing, (1-8)
  197. Dov D, Talmon R and Cohen I (2016). Kernel-Based Sensor Fusion With Application to Audio-Visual Voice Activity Detection, IEEE Transactions on Signal Processing, 64:24, (6406-6416), Online publication date: 1-Dec-2016.
  198. You S, Wu Y and Peng S (2016). Comparative study of singing voice detection methods, Multimedia Tools and Applications, 75:23, (15509-15524), Online publication date: 1-Dec-2016.
  199. Hossain M (2016). Patient State Recognition System for Healthcare Using Speech and Facial Expressions, Journal of Medical Systems, 40:12, (1-8), Online publication date: 1-Dec-2016.
  200. Granell E and Martínez-Hinarejos C Collaborator Effort Optimisation in Multimodal Crowdsourcing for Transcribing Historical Manuscripts Advances in Speech and Language Technologies for Iberian Languages, (234-244)
  201. ACM
    Sadana R, Setlur V and Stasko J Redefining a Contribution for Immersive Visualization Research Proceedings of the 2016 ACM Companion on Interactive Surfaces and Spaces, (41-45)
  202. Wang S, Ewert S and Dixon S (2016). Robust and Efficient Joint Alignment of Multiple Musical Performances, IEEE/ACM Transactions on Audio, Speech and Language Processing, 24:11, (2132-2145), Online publication date: 1-Nov-2016.
  203. Sigtia S, Stark A, Krstulovic S and Plumbley M (2016). Automatic Environmental Sound Recognition, IEEE/ACM Transactions on Audio, Speech and Language Processing, 24:11, (2096-2107), Online publication date: 1-Nov-2016.
  204. Huang X, Ye Y, Xiong L, Lau R, Jiang N and Wang S (2016). Time series k-means, Information Sciences: an International Journal, 367:C, (1-13), Online publication date: 1-Nov-2016.
  205. Lim K, Buntine W, Chen C and Du L (2016). Nonparametric Bayesian topic modelling with the hierarchical Pitman-Yor processes, International Journal of Approximate Reasoning, 78:C, (172-191), Online publication date: 1-Nov-2016.
  206. Silversides K and Melkumyan A (2016). A Dynamic Time Warping based covariance function for Gaussian Processes signature identification, Computers & Geosciences, 96:C, (69-76), Online publication date: 1-Nov-2016.
  207. Nirmal J, Zaveri M, Patnaik S and Kachare P (2016). Voice conversion system using salient sub-bands and radial basis function, Neural Computing and Applications, 27:8, (2615-2628), Online publication date: 1-Nov-2016.
  208. ACM
    Bragg D, Huynh N and Ladner R A Personalizable Mobile Sound Detector App Design for Deaf and Hard-of-Hearing Users Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility, (3-13)
  209. Lafay G, Lagrange M, Rossignol M, Benetos E and Roebel A (2016). A Morphological Model for Simulating Acoustic Scenes and Its Application to Sound Event Detection, IEEE/ACM Transactions on Audio, Speech and Language Processing, 24:10, (1854-1864), Online publication date: 1-Oct-2016.
  210. Shahin I (2016). Speaker Identification in a Shouted Talking Environment Based on Novel Third-Order Circular Suprasegmental Hidden Markov Models, Circuits, Systems, and Signal Processing, 35:10, (3770-3792), Online publication date: 1-Oct-2016.
  211. Kocyan T, Slaninová K and Martinovič J Flexible Global Constraint Extension for Dynamic Time Warping Computer Information Systems and Industrial Management, (389-401)
  212. ACM
    Granell E and Martínez-Hinarejos C A Multimodal Crowdsourcing Framework for Transcribing Historical Handwritten Documents Proceedings of the 2016 ACM Symposium on Document Engineering, (157-163)
  213. (2016). Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription, Speech Communication, 82:C, (1-13), Online publication date: 1-Sep-2016.
  214. Chelotti J, Vanrell S, Milone D, Utsumi S, Galli J, Rufiner H and Giovanini L (2016). A real-time algorithm for acoustic monitoring of ingestive behavior of grazing cattle, Computers and Electronics in Agriculture, 127:C, (64-75), Online publication date: 1-Sep-2016.
  215. ACM
    Kanhe A and Aghila G DCT based Audio Steganography in Voiced and Un-voiced Frames Proceedings of the International Conference on Informatics and Analytics, (1-4)
  216. ACM
    Hu H, Velez-Ginorio J and Qi G Temporal Order-based First-Take-All Hashing for Fast Attention-Deficit-Hyperactive-Disorder Detection Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (905-914)
  217. Nisar S, Khan O and Tariq M (2016). An Efficient Adaptive Window Size Selection Method for Improving Spectrogram Visualization, Computational Intelligence and Neuroscience, 2016, (16), Online publication date: 1-Aug-2016.
  218. Alameda-Pineda X, Staiano J, Subramanian R, Batrinca L, Ricci E, Lepri B, Lanz O and Sebe N (2016). SALSA: A Novel Dataset for Multimodal Group Behavior Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 38:8, (1707-1720), Online publication date: 1-Aug-2016.
  219. Sengupta N, Sahidullah M and Saha G (2016). Lung sound classification using cepstral-based statistical features, Computers in Biology and Medicine, 75:C, (118-129), Online publication date: 1-Aug-2016.
  220. ACM
    Kokkinidis K, Stergiaki A and Tsagaris A Error prooving and sensorimotor feedback for singing voice Proceedings of the 3rd International Symposium on Movement and Computing, (1-4)
  221. Baltazar A (2016). ZatLab Gesture Recognition Framework, International Journal of Creative Interfaces and Computer Graphics, 7:2, (11-24), Online publication date: 1-Jul-2016.
  222. Kim C and Stern R (2016). Power-normalized cepstral coefficients (PNCC) for robust speech recognition, IEEE/ACM Transactions on Audio, Speech and Language Processing, 24:7, (1315-1329), Online publication date: 1-Jul-2016.
  223. Phan H, Lu J, Asente P, Chan A and Fu H Patternista Proceedings of the Joint Symposium on Computational Aesthetics and Sketch Based Interfaces and Modeling and Non-Photorealistic Animation and Rendering, (79-88)
  224. Sigtia S, Benetos E and Dixon S (2016). An end-to-end neural network for polyphonic piano music transcription, IEEE/ACM Transactions on Audio, Speech and Language Processing, 24:5, (927-939), Online publication date: 1-May-2016.
  225. Receveur S, Weiß R and Fingscheidt T (2016). Turbo automatic speech recognition, IEEE/ACM Transactions on Audio, Speech and Language Processing, 24:5, (846-862), Online publication date: 1-May-2016.
  226. Shokouhi M, Ozertem U and Craswell N Did You Say U2 or YouTube? Proceedings of the 25th International Conference on World Wide Web, (1215-1224)
  227. Atkins J and Sharma D (2016). Visualization of Babble–Speech Interactions Using Andrews Curves, Circuits, Systems, and Signal Processing, 35:4, (1313-1331), Online publication date: 1-Apr-2016.
  228. ACM
    Therese S and Lingam C Speaker Identification and Authentication System using Energy based Cepstral Data Technique Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, (1-6)
  229. Zhou F and Torre F (2016). Generalized Canonical Time Warping, IEEE Transactions on Pattern Analysis and Machine Intelligence, 38:2, (279-294), Online publication date: 1-Feb-2016.
  230. Margolies R, Sridharan A, Aggarwal V, Jana R, Shankaranarayanan N, Vaishampayan V and Zussman G (2016). Exploiting mobility in proportional fair cellular scheduling, IEEE/ACM Transactions on Networking, 24:1, (355-367), Online publication date: 1-Feb-2016.
  231. Miodonska Z, Bugdol M and Krecichwost M (2016). Dynamic time warping in phoneme modeling for fast pronunciation error detection, Computers in Biology and Medicine, 69:C, (277-285), Online publication date: 1-Feb-2016.
  232. ACM
    Li K, Zhou Z and Lee C (2016). Sign Transition Modeling and a Scalable Solution to Continuous Sign Language Recognition for Real-World Applications, ACM Transactions on Accessible Computing, 8:2, (1-23), Online publication date: 30-Jan-2016.
  233. Khatwani A, Pawar K, Hegde S, Rao S, Seshasayee A and Ramasubramanian V Spoken Document Retrieval Proceedings of the Third International Conference on Mining Intelligence and Knowledge Exploration - Volume 9468, (297-311)
  234. Bankó Z and Abonyi J (2015). Mixed dissimilarity measure for piecewise linear approximation based time series applications, Expert Systems with Applications: An International Journal, 42:21, (7664-7675), Online publication date: 30-Nov-2015.
  235. Wu X, Matsumoto Y, Duh K and Shindo H An Improved Hierarchical Word Sequence Language Model Using Word Association Proceedings of the Third International Conference on Statistical Language and Speech Processing - Volume 9449, (275-287)
  236. ACM
    Sinha T, Zhao R and Cassell J Exploring Socio-Cognitive Effects of Conversational Strategy Congruence in Peer Tutoring Proceedings of the 1st Workshop on Modeling INTERPERsonal SynchrONy And infLuence, (5-12)
  237. ACM
    Ho E, Chan J, Cheung Y and Yuen P Modeling spatial relations of human body parts for indexing and retrieving close character interactions Proceedings of the 21st ACM Symposium on Virtual Reality Software and Technology, (187-190)
  238. ACM
    Liu Y, Xu F, Chai J, Tong X, Wang L and Huo Q (2015). Video-audio driven real-time facial animation, ACM Transactions on Graphics, 34:6, (1-10), Online publication date: 4-Nov-2015.
  239. Roy A, Halevi T and Memon N An HMM-based multi-sensor approach for continuous mobile authentication MILCOM 2015 - 2015 IEEE Military Communications Conference, (1311-1316)
  240. ACM
    Gao S and Li H Octave-dependent Probabilistic Latent Semantic Analysis to Chorus Detection of Popular Song Proceedings of the 23rd ACM international conference on Multimedia, (979-982)
  241. Peruffo Minotto V, Rosito Jung C and Bowon Lee (2015). Multimodal Multi-Channel On-Line Speaker Diarization Using Sensor Fusion Through SVM, IEEE Transactions on Multimedia, 17:10, (1694-1705), Online publication date: 1-Oct-2015.
  242. Stowell D, Giannoulis D, Benetos E, Lagrange M and Plumbley M (2015). Detection and Classification of Acoustic Scenes and Events, IEEE Transactions on Multimedia, 17:10, (1733-1746), Online publication date: 1-Oct-2015.
  243. Jian Cheng, Haijun Liu, Feng Wang, Hongsheng Li and Ce Zhu (2015). Silhouette Analysis for Human Action Recognition Based on Supervised Temporal t-SNE and Incremental Learning, IEEE Transactions on Image Processing, 24:10, (3203-3217), Online publication date: 1-Oct-2015.
  244. ACM
    Wang W, Liu A, Shahzad M, Ling K and Lu S Understanding and Modeling of WiFi Signal Based Human Activity Recognition Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, (65-76)
  245. Yamada M, Sigal L, Raptis M, Toyoda M, Chang Y and Sugiyama M (2015). Cross-Domain Matching with Squared-Loss Mutual Information, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37:9, (1764-1776), Online publication date: 1-Sep-2015.
  246. ACM
    Wu M and Jang J (2015). Combining Acoustic and Multilevel Visual Features for Music Genre Classification, ACM Transactions on Multimedia Computing, Communications, and Applications, 12:1, (1-17), Online publication date: 24-Aug-2015.
  247. ACM
    Ping D, Sun X and Mao B TextLogger Proceedings of the 8th ACM Conference on Security & Privacy in Wireless and Mobile Networks, (1-12)
  248. Guo J, Liu J, Chen X, Han Q and Zhou K Tunable Discounting Mechanisms for Language Modeling Revised Selected Papers, Part II, of the 5th International Conference on Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques - Volume 9243, (585-594)
  249. Fagiani M, Principi E, Squartini S and Piazza F (2015). Signer independent isolated Italian sign recognition based on hidden Markov models, Pattern Analysis & Applications, 18:2, (385-402), Online publication date: 1-May-2015.
  250. ACM
    Meutzner H, Gupta S and Kolossa D Constructing Secure Audio CAPTCHAs by Exploiting Differences between Humans and Machines Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, (2335-2338)
  251. ACM
    Nagendar G and Jawahar C Fast approximate dynamic warping kernels Proceedings of the 2nd ACM IKDD Conference on Data Sciences, (30-38)
  252. Gharehbaghi A, Ask P and Babic A (2015). A pattern recognition framework for detecting dynamic changes on cyclic time series, Pattern Recognition, 48:3, (696-708), Online publication date: 1-Mar-2015.
  253. ACM
    Sharma S and Patil H Combining Evidences from Bark Scale and Mel Scale Warped Features for VTLN Proceedings of the 2nd International Conference on Perception and Machine Intelligence, (133-136)
  254. Chung-Hsien Chang, Bo-Wei Chen, Shi-Huang Chen, Jhing-Fa Wang and Yu-Hao Chiu (2015). Low-Complexity Hardware Design for Fast Solving LSPs With Coordinated Polynomial Solution, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:2, (230-243), Online publication date: 1-Feb-2015.
  255. ACM
    Bhardwaj A, Chaudhuri S and Dabeer O (2014). Design and Analysis of Predictive Sampling of Haptic Signals, ACM Transactions on Applied Perception, 11:4, (1-20), Online publication date: 9-Jan-2015.
  256. Pitsikalis V, Katsamanis A, Theodorakis S and Maragos P (2015). Multimodal gesture recognition via multiple hypotheses rescoring, The Journal of Machine Learning Research, 16:1, (255-284), Online publication date: 1-Jan-2015.
  257. Jensen J and Tan Z (2015). Minimum mean-square error estimation of mel-frequency cepstral features-a theoretically consistent approach, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23:1, (186-197), Online publication date: 1-Jan-2015.
  258. Khoury I, Giménez A, Juan A and Andrés-Ferrer J (2015). Window repositioning for printed Arabic recognition, Pattern Recognition Letters, 51:C, (86-93), Online publication date: 1-Jan-2015.
  259. ACM
    Meutzner H, Nguyen V, Holz T and Kolossa D Using automatic speech recognition for attacking acoustic CAPTCHAs Proceedings of the 30th Annual Computer Security Applications Conference, (276-285)
  260. ACM
    Kächele M, Schels M and Schwenker F Inferring Depression and Affect from Application Dependent Meta Knowledge Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, (41-48)
  261. ACM
    Li X, Kardes H, Wang X and Sun A HMM-based address parsing Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, (433-436)
  262. ACM
    Nawaz S and Mascolo C Mining users' significant driving routes with low-power sensors Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems, (236-250)
  263. ACM
    Li X, Kardes H, Wang X and Sun A HMM-based Address Parsing with Massive Synthetic Training Data Generation Proceedings of the 4th International Workshop on Location and the Web, (33-36)
  264. Martins R and Ynoguti C Normalização do locutor em sistemas de reconhecimento de fala para usuários crianças Proceedings of the 13th Brazilian Symposium on Human Factors in Computing Systems, (381-384)
  265. ACM
    Jonas M Capstone experience Proceedings of the 15th Annual Conference on Information technology education, (55-60)
  266. ACM
    Wang Y, Liu J, Chen Y, Gruteser M, Yang J and Liu H E-eyes Proceedings of the 20th annual international conference on Mobile computing and networking, (617-628)
  267. Yin L, Dong M, Duan Y, Deng W, Zhao K and Guo J (2014). A high-performance training-free approach for hand gesture recognition with accelerometer, Multimedia Tools and Applications, 72:1, (843-864), Online publication date: 1-Sep-2014.
  268. Sundereisan S, Bhadriraju A, Khan M, Ramakrishnan N and Prakash B Sanstext Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, (649-656)
  269. ACM
    Ferguson S, Schubert E and Stevens C Dynamic dance warping Proceedings of the 2014 International Workshop on Movement and Computing, (94-99)
  270. ACM
    Schulze C, Henter D, Borth D and Dengel A Automatic Detection of CSA Media by Multi-modal Feature Fusion for Law Enforcement Support Proceedings of International Conference on Multimedia Retrieval, (353-360)
  271. Giménez A, Andrés-Ferrer J and Juan A (2014). Discriminative Bernoulli HMMs for isolated handwritten word recognition, Pattern Recognition Letters, 35:C, (157-168), Online publication date: 1-Jan-2014.
  272. Giménez A, Khoury I, Andrés-Ferrer J and Juan A (2014). Handwriting word recognition using windowed Bernoulli HMMs, Pattern Recognition Letters, 35:C, (149-156), Online publication date: 1-Jan-2014.
  273. Lember J and Koloydenko A (2014). Bridging Viterbi and posterior decoding, The Journal of Machine Learning Research, 15:1, (1-58), Online publication date: 1-Jan-2014.
  274. Islam M (2014). Feature and score fusion based multiple classifier selection for iris recognition, Computational Intelligence and Neuroscience, 2014, (10-10), Online publication date: 1-Jan-2014.
  275. ACM
    Huang X, Baker J and Reddy R (2014). A historical perspective of speech recognition, Communications of the ACM, 57:1, (94-103), Online publication date: 1-Jan-2014.
  276. ACM
    Ponce-López V, Escalera S and Baró X Multi-modal social signal analysis for predicting agreement in conversation settings Proceedings of the 15th ACM on International conference on multimodal interaction, (495-502)
  277. ACM
    Meudt S, Zharkov D, Kächele M and Schwenker F Multi classifier systems and forward backward feature selection algorithms to classify emotional coloured speech Proceedings of the 15th ACM on International conference on multimodal interaction, (551-556)
  278. Terissi L, Cerda M, Gómez J, Hitschfeld-Kahler N and Girau B (2013). A comprehensive system for facial animation of generic 3D head models driven by speech, EURASIP Journal on Audio, Speech, and Music Processing, 2013:1, (1-18), Online publication date: 1-Dec-2013.
  279. Leopold H, Eid-Sabbagh R, Mendling J, Azevedo L and Baião F (2013). Detection of naming convention violations in process models for different languages, Decision Support Systems, 56:C, (310-325), Online publication date: 1-Dec-2013.
  280. Gosztolya G Using the Logarithmic Generator Function in the Spoken Term Detection Task Proceedings of the 10th International Conference on Modeling Decisions for Artificial Intelligence - Volume 8234, (94-104)
  281. ACM
    Sankararaman S, Agarwal P, Mølhave T, Pan J and Boedihardjo A Model-driven matching and segmentation of trajectories Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, (234-243)
  282. ACM
    Sánchez-Lozano E, Lopez-Otero P, Docio-Fernandez L, Argones-Rúa E and Alba-Castro J Audiovisual three-level fusion for continuous estimation of Russell's emotion circumplex Proceedings of the 3rd ACM international workshop on Audio/visual emotion challenge, (31-40)
  283. ACM
    Acar E, Hopfgartner F and Albayrak S Violence detection in hollywood movies by the fusion of visual and mid-level audio cues Proceedings of the 21st ACM international conference on Multimedia, (717-720)
  284. ACM
    Xu C, Li S, Liu G, Zhang Y, Miluzzo E, Chen Y, Li J and Firner B Crowd++ Proceedings of the 2013 ACM international joint conference on Pervasive and ubiquitous computing, (43-52)
  285. Nguyen K, Stewart R and Zhang H (2013). An intelligent pattern recognition model to automate the categorisation of residential water end-use events, Environmental Modelling & Software, 47:C, (108-127), Online publication date: 1-Sep-2013.
  286. Kipyatkova I and Karpov A Lexicon Size and Language Model Order Optimization for Russian LVCSR Proceedings of the 15th International Conference on Speech and Computer - Volume 8113, (219-226)
  287. Hayashi A, Iwata K and Suematsu N Finding the most likely upper level state sequence for hierarchical HMMs Proceedings of the First international conference on Statistical Language and Speech Processing, (111-122)
  288. Vignolo L, Milone D and Rufiner H (2013). Genetic wavelet packets for speech recognition, Expert Systems with Applications: An International Journal, 40:6, (2350-2359), Online publication date: 1-May-2013.
  289. Grais E and Erdogan H (2013). Regularized nonnegative matrix factorization using Gaussian mixture priors for supervised single channel source separation, Computer Speech and Language, 27:3, (746-762), Online publication date: 1-May-2013.
  290. ACM
    Shirali-Shahreza S, Penn G, Balakrishnan R and Ganjali Y SeeSay and HearSay CAPTCHA for mobile interaction Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, (2147-2156)
  291. Hahm S, Watanabe S, Ogawa A, Fujimoto M, Hori T and Nakamura A (2013). Prior-shared feature and model space speaker adaptation by consistently employing map estimation, Speech Communication, 55:3, (415-431), Online publication date: 1-Mar-2013.
  292. Sunil Kumar R and Lajish V (2013). Phoneme recognition using zerocrossing interval distribution of speech patterns and ANN, International Journal of Speech Technology, 16:1, (125-131), Online publication date: 1-Mar-2013.
  293. Roy P and Das P (2013). A hybrid VQ-GMM approach for identifying Indian languages, International Journal of Speech Technology, 16:1, (33-39), Online publication date: 1-Mar-2013.
  294. ACM
    Schedl M, Orio N, Liem C and Peeters G A professionally annotated and enriched multimodal data set on popular music Proceedings of the 4th ACM Multimedia Systems Conference, (78-83)
  295. ACM
    Milenkovic M and Amft O An opportunistic activity-sensing approach to save energy in office buildings Proceedings of the fourth international conference on Future energy systems, (247-258)
  296. ACM
    Jun S and Hwang E Music segmentation and summarization based on self-similarity matrix Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication, (1-4)
  297. Ding I (2013). Speech recognition using variable-length frame overlaps by intelligent fuzzy control, Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 25:1, (49-56), Online publication date: 1-Jan-2013.
  298. Françoisse K, Fouss F and Saerens M (2013). A link-analysis-based discriminant analysis for exploring partially labeled graphs, Pattern Recognition Letters, 34:2, (146-154), Online publication date: 1-Jan-2013.
  299. Kini B and Sekhar C (2013). Large margin mixture of AR models for time series classification, Applied Soft Computing, 13:1, (361-371), Online publication date: 1-Jan-2013.
  300. ACM
    Ramachandrula S, Jain S and Ravishankar H Offline handwritten word recognition in Hindi Proceeding of the workshop on Document Analysis and Recognition, (49-54)
  301. Manevarte B, Ahmad W and Hegde R Distant speaker verification using a combined family of MVDR estimates Proceedings of the 13th Pacific-Rim conference on Advances in Multimedia Information Processing, (628-638)
  302. Bankó Z and Abonyi J (2012). Correlation based dynamic time warping of multivariate time series, Expert Systems with Applications: An International Journal, 39:17, (12814-12823), Online publication date: 1-Dec-2012.
  303. Silva W and Serra G An Intelligent System Based on Discrete Cosine Transform for Speech Recognition Advances in Artificial Intelligence – IBERAMIA 2012, (320-329)
  304. ACM
    Ulges A, Schulze C, Borth D and Stahl A Pornography detection in video benefits (a lot) from a multi-modal approach Proceedings of the 2012 ACM international workshop on Audio and multimedia methods for large-scale video analysis, (21-26)
  305. Hariharan M, Sindhu R and Yaacob S (2012). Normal and hypoacoustic infant cry signal classification using time-frequency analysis and general regression neural network, Computer Methods and Programs in Biomedicine, 108:2, (559-569), Online publication date: 1-Nov-2012.
  306. ACM
    Glodek M, Schels M, Palm G and Schwenker F Multiple classifier combination using reject options and markov fusion networks Proceedings of the 14th ACM international conference on Multimodal interaction, (465-472)
  307. Nicolaou M, Pavlovic V and Pantic M Dynamic probabilistic CCA for analysis of affective behaviour Proceedings of the 12th European conference on Computer Vision - Volume Part VII, (98-111)
  308. Yang Y and Shah M Complex events detection using data-driven concepts Proceedings of the 12th European conference on Computer Vision - Volume Part III, (722-735)
  309. ACM
    Sunny S, Peter S and Jacob K A comparative study of parametric coding and wavelet coding based feature extraction techniques in recognizing spoken words Proceedings of the CUBE International Information Technology Conference, (326-331)
  310. Silva W and Serra G A hybrid approach based on DCT-Genetic-Fuzzy inference system for speech recognition Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning, (52-59)
  311. Bartoš T and Skopal T Revisiting techniques for lowerbounding the dynamic time warping distance Proceedings of the 5th international conference on Similarity Search and Applications, (192-208)
  312. ACM
    An S, James D and Marschner S (2012). Motion-driven concatenative synthesis of cloth sounds, ACM Transactions on Graphics, 31:4, (1-10), Online publication date: 5-Aug-2012.
  313. Hariharan M, Saraswathy J, Sindhu R, Khairunizam W and Yaacob S (2012). Infant cry classification to identify asphyxia using time-frequency analysis and radial basis neural networks, Expert Systems with Applications: An International Journal, 39:10, (9515-9523), Online publication date: 1-Aug-2012.
  314. Khorsheed M and Al-Omari H A markovian engine for text recognition Proceedings of the 9th international conference on Image Analysis and Recognition - Volume Part I, (375-381)
  315. Tardón L and Barbancho I Music Similarity Evaluation Using the Variogram for MFCC Modelling Revised Selected Papers of the 9th International Symposium on From Sounds to Music and Emotions - Volume 7900, (313-332)
  316. ACM
    Misra D, Hall R, Payne S and Thoma G Digital preservation and knowledge discovery based on documents from an international health science program Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries, (23-26)
  317. Lee S, Oh K and Kim T (2012). How many reference patterns can improve profitability for real-time trading in futures market?, Expert Systems with Applications: An International Journal, 39:8, (7458-7470), Online publication date: 1-Jun-2012.
  318. Hariharan M, Chee L, Ai O and Yaacob S (2012). Classification of Speech Dysfluencies Using LPC Based Parameterization Techniques, Journal of Medical Systems, 36:3, (1821-1830), Online publication date: 1-Jun-2012.
  319. Hariharan M, Chee L and Yaacob S (2012). Analysis of Infant Cry Through Weighted Linear Prediction Cepstral Coefficients and Probabilistic Neural Network, Journal of Medical Systems, 36:3, (1309-1315), Online publication date: 1-Jun-2012.
  320. ACM
    Kumar A, Paek T and Lee B Voice typing Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, (2277-2286)
  321. Zavaglia M, Canolty R, Schofield T, Leff A, Ursino M, Knight R and Penny W (2012). A dynamical pattern recognition model of gamma activity in auditory cortex, Neural Networks, 28:C, (1-14), Online publication date: 1-Apr-2012.
  322. Kamaruddin N, Wahab A and Quek C (2012). Cultural dependency analysis for understanding speech emotion, Expert Systems with Applications: An International Journal, 39:5, (5115-5133), Online publication date: 1-Apr-2012.
  323. Narayanan R, Daghar A, Zaki M and Tahar S Verifying jitter in an analog and mixed signal design using dynamic time warping Proceedings of the Conference on Design, Automation and Test in Europe, (1413-1416)
  324. Costin M Images revealing useful clues on Romanian vowels spectrograms Proceedings of the 6th international conference on Communications and Information Technology, and Proceedings of the 3rd World conference on Education and Educational Technologies, (112-117)
  325. ACM
    Piech C, Sahami M, Koller D, Cooper S and Blikstein P Modeling how students learn to program Proceedings of the 43rd ACM technical symposium on Computer Science Education, (153-160)
  326. ACM
    Cheung J and Li X Sequence clustering and labeling for unsupervised query intent discovery Proceedings of the fifth ACM international conference on Web search and data mining, (383-392)
  327. Narwaria M, Lin W and Cetin A (2012). Scalable image quality assessment with 2D mel-cepstrum and machine learning approach, Pattern Recognition, 45:1, (299-313), Online publication date: 1-Jan-2012.
  328. ACM
    Dale K, Sunkavalli K, Johnson M, Vlasic D, Matusik W and Pfister H Video face replacement Proceedings of the 2011 SIGGRAPH Asia Conference, (1-10)
  329. Sidaoui B and Sadouni K Efficient binary tree multiclass SVM using genetic algorithms for vowels recognition Proceedings of the 10th WSEAS international conference on Computational Intelligence, Man-Machine Systems and Cybernetics, and proceedings of the 10th WSEAS international conference on Information Security and Privacy, (228-234)
  330. ACM
    Dale K, Sunkavalli K, Johnson M, Vlasic D, Matusik W and Pfister H (2011). Video face replacement, ACM Transactions on Graphics, 30:6, (1-10), Online publication date: 1-Dec-2011.
  331. ACM
    Garg R, Varna A and Wu M "Seeing" ENF Proceedings of the 19th ACM international conference on Multimedia, (23-32)
  332. Oh M and Park H Preprocessing of independent vector analysis using feed-forward network for robust speech recognition Proceedings of the 18th international conference on Neural Information Processing - Volume Part II, (366-373)
  333. Travieso C, Alonso J, Ticay-Rivas J and Del Pozo-Baños M Apnea detection based on hidden Markov model Kernel Proceedings of the 5th international conference on Advances in nonlinear speech processing, (71-79)
  334. ACM
    Maddage N and Li H (2011). Beat space segmentation and octave scale cepstral feature for sung language recognition in pop music, ACM Transactions on Multimedia Computing, Communications, and Applications, 7:4, (1-19), Online publication date: 1-Nov-2011.
  335. ACM
    Jonas M Capstone experience Proceedings of the 2011 conference on Information technology education, (275-280)
  336. Glodek M, Tschechne S, Layher G, Schels M, Brosch T, Scherer S, Kächele M, Schmidt M, Neumann H, Palm G and Schwenker F Multiple classifier systems for the classification of audio-visual emotional states Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part II, (359-368)
  337. Jankowski J Analysis of multiplayer platform users activity based on the virtual and real time dimension Proceedings of the Third international conference on Social informatics, (312-315)
  338. Nebel J, Lewandowski M, Thévenon J, Martínez F and Velastin S Are current monocular computer vision systems for human action recognition suitable for visual surveillance applications? Proceedings of the 7th international conference on Advances in visual computing - Volume Part II, (290-299)
  339. Freire N, Borbinha J and Calado P A language independent approach for named entity recognition in subject headings Proceedings of the 15th international conference on Theory and practice of digital libraries: research and advanced technology for digital libraries, (52-61)
  340. Esparza J, Scherer S and Schwenker F Studying self- and active-training methods for multi-feature set emotion recognition Proceedings of the First IAPR TC3 conference on Partially Supervised Learning, (19-31)
  341. Zen G, Ricci E, Messelodi S and Sebe N Sorting atomic activities for discovering spatio-temporal patterns in dynamic scenes Proceedings of the 16th international conference on Image analysis and processing: Part I, (207-216)
  342. Makowski R and Zimroz R Adaptive bearings vibration modelling for diagnosis Proceedings of the Second international conference on Adaptive and intelligent systems, (248-259)
  343. ACM
    Huang H, Liu Y and Boves L Investigation of supervised dimensionality reduction methods for phonetic classification Proceedings of the Third International Conference on Internet Multimedia Computing and Service, (128-133)
  344. Dubiner M and Singer Y Entire relaxation path for maximum entropy problems Proceedings of the Conference on Empirical Methods in Natural Language Processing, (941-948)
  345. Zablotskiy S, Pitakrat T, Zablotskaya K and Minker W GMM parameter estimation by means of EM and genetic algorithms Proceedings of the 14th international conference on Human-computer interaction: design and development approaches - Volume Part I, (527-536)
  346. Yu D, Yu X, Hu Q, Liu J and Wu A (2011). Dynamic time warping constraint learning for large margin nearest neighbor classification, Information Sciences: an International Journal, 181:13, (2787-2796), Online publication date: 1-Jul-2011.
  347. Biernacki P Application of multi-agents in TV commercial recognition system Proceedings of the 5th KES international conference on Agent and multi-agent systems: technologies and applications, (406-413)
  348. ACM
    Srinivasan M and Metoyer R Semi-automatic end-user tools for construction of virtual avatar behaviors Proceedings of the 16th International Conference on 3D Web Technology, (121-128)
  349. ACM
    Han W, Lee J, Moon Y, Hwang S and Yu H A new approach for processing ranked subsequence matching based on ranked union Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, (457-468)
  350. Vignolo L, Rufiner H, Milone D and Goddard J (2011). Evolutionary cepstral coefficients, Applied Soft Computing, 11:4, (3419-3428), Online publication date: 1-Jun-2011.
  351. Shadike M, Li X and Wasili B Large vocabulary continuous speech recognition of uyghur Proceedings of the 8th international conference on Advances in neural networks - Volume Part III, (594-600)
  352. Li H, Guo C and Yang L A method of similarity measure and visualization for long time series using binary patterns Proceedings of the 15th international conference on New Frontiers in Applied Data Mining, (136-147)
  353. Wu S, Falk T and Chan W (2011). Automatic speech emotion recognition using modulation spectral features, Speech Communication, 53:5, (768-785), Online publication date: 1-May-2011.
  354. Tezuka T and Maeda A Audio lifelog search system using a topic model for reducing recognition errors Proceedings of the 16th international conference on Database systems for advanced applications: Part II, (73-82)
  355. ACM
    Bai H, Wang L, Qin G, Zhang J, Tao K, Chang X and Dong Y TV program segmentation using multi-modal information fusion Proceedings of the 1st ACM International Conference on Multimedia Retrieval, (1-8)
  356. ACM
    Venkatesh N, Gulati R, Bhujade R and Chandra M Fixed-point implementation of isolated sub-word level speech recognition using hidden Markov models Proceedings of the 2011 ACM Symposium on Applied Computing, (368-373)
  357. ACM
    Patel I and Rao Y Modified MFCC windowed technique for speaker word recognition Proceedings of the International Conference & Workshop on Emerging Trends in Technology, (1311-1315)
  358. ACM
    Bakshi A Speaker recognition with statistical analysis method Proceedings of the International Conference & Workshop on Emerging Trends in Technology, (984-986)
  359. ACM
    Samant R and Rao S The effect of noise in automatic text classification Proceedings of the International Conference & Workshop on Emerging Trends in Technology, (557-558)
  360. ACM
    Kekre H, Athawale A and Sharma G Speech recognition using vector quantization Proceedings of the International Conference & Workshop on Emerging Trends in Technology, (400-403)
  361. Karam J Theoretical equivalency and practical advantages of wavelet based paradigms in signal processing Proceeding of 10th WSEAS international conference on electronics, hardware, wireless and optical communications, and 10th WSEAS international conference on signal processing, robotics and automation, and 3rd WSEAS international conference on nanotechnology, and 2nd WSEAS international conference on Plasma-fusion-nuclear physics, (198-205)
  362. ACM
    Kumar P, Jakhanwal N, Bhowmick A and Chandra M Gender classification using pitch and formants Proceedings of the 2011 International Conference on Communication, Computing & Security, (319-324)
  363. Karam J (2011). Biorthoganal wavelet packets and Mel scale analysis for automatic recognition of Arabic speech via radial basis functions, WSEAS Transactions on Signal Processing, 7:1, (44-53), Online publication date: 1-Jan-2011.
  364. Vignolo L, Rufiner H, Milone D and Goddard J (2011). Evolutionary splines for cepstral filterbank optimization in phoneme classification, EURASIP Journal on Advances in Signal Processing, 2011, (1-14), Online publication date: 1-Jan-2011.
  365. ACM
    Choudhary A, Chaudhury S and Banerjee S Distributed framework for composite event recognition in a calibrated pan-tilt camera network Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing, (140-147)
  366. Viana H, De Novais D and Trindade R Mathematical treatment of uncertainty in the speech recognition process Proceedings of the 2010 international conference on Mathematical models for engineering science, (272-276)
  367. Kulkarni K, Boyer E, Horaud R and Kale A An unsupervised framework for action recognition using actemes Proceedings of the 10th Asian conference on Computer vision - Volume Part IV, (592-605)
  368. Ravinder K Comparison of HMM and DTW for isolated word recognition system of Punjabi language Proceedings of the 15th Iberoamerican congress conference on Progress in pattern recognition, image analysis, computer vision, and applications, (244-252)
  369. ACM
    Kosugi N Misual Proceedings of the 12th International Conference on Information Integration and Web-based Applications & Services, (609-616)
  370. ACM
    Lin J, Cervone G and Franzese P Assessment of error in air quality models using dynamic time warping Proceedings of the 1st ACM SIGSPATIAL International Workshop on Data Mining for Geoinformatics, (38-44)
  371. ACM
    Moro A, Mumolo E and Nolich M Automatic 3d virtual cloning of a speaking human face Proceedings of the 2010 ACM workshop on Surreal media and virtual cloning, (45-50)
  372. ACM
    Mbogho A and Katz M The impact of accents on automatic recognition of South African English speech Proceedings of the 2010 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists, (187-192)
  373. Tarar S, Singh A and Singh S Speech recognition approach Proceedings of the 10th WSEAS international conference on Applied computer science, (44-48)
  374. Herbig T, Gerl F and Minker W Detection of unknown speakers in an unsupervised speech controlled system Proceedings of the Second international conference on Spoken dialogue systems for ambient environments, (25-35)
  375. Huang A, Abugharbieh R and Tam R (2010). A novel rotationally invariant region-based hidden Markov model for efficient 3-D image segmentation, IEEE Transactions on Image Processing, 19:10, (2737-2748), Online publication date: 1-Oct-2010.
  376. ACM
    Thepvilojanapong N, Konomi S, Tobe Y, Ohta Y, Iwai M and Sezaki K Opportunistic collaboration in participatory sensing environments Proceedings of the fifth ACM international workshop on Mobility in the evolving internet architecture, (39-44)
  377. ACM
    Mohamed A and Nair K Continuous Malayalam speech recognition using Hidden Markov Models Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India, (1-4)
  378. Okada S, Hasegawa O and Nishida T Machine learning approaches for time-series data based on self-organizing incremental neural network Proceedings of the 20th international conference on Artificial neural networks: Part III, (541-550)
  379. Biernacki P Intelligent system for commercial block recognition using audio signal only Proceedings of the 14th international conference on Knowledge-based and intelligent information and engineering systems: Part I, (360-368)
  380. ACM
    Najjar M and Abdollahi Azgomi M A distributed multi-approach intrusion detection system for web services Proceedings of the 3rd international conference on Security of information and networks, (238-244)
  381. Lewandowski M, Makris D and Nebel J View and style-independent action manifolds for human activity recognition Proceedings of the 11th European conference on Computer vision: Part VI, (547-560)
  382. Guo C, Li H and Pan D An improved piecewise aggregate approximation based on statistical features for time series mining Proceedings of the 4th international conference on Knowledge science, engineering and management, (234-244)
  383. ACM
    Maddage N, Sim K and Li H (2010). Word level automatic alignment of music and lyrics using vocal synthesis, ACM Transactions on Multimedia Computing, Communications, and Applications, 6:3, (1-16), Online publication date: 1-Aug-2010.
  384. Lee J and Park C (2010). Hybrid simulated annealing and its application to optimization of hidden Markov models for visual speech recognition, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40:4, (1188-1196), Online publication date: 1-Aug-2010.
  385. Scanzio S, Cumani S, Gemello R, Mana F and Laface P (2010). Parallel implementation of Artificial Neural Network training for speech recognition, Pattern Recognition Letters, 31:11, (1302-1309), Online publication date: 1-Aug-2010.
  386. Mailhot F Instance-based acquisition of vowel harmony Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology, (1-8)
  387. Lv Z, Wu X, Li M and Zhang D (2010). A novel eye movement detection algorithm for EOG driven human computer interface, Pattern Recognition Letters, 31:9, (1041-1047), Online publication date: 1-Jul-2010.
  388. Avdelidis K, Dimoulas C, Kalliris G and Papanikolaou G Adaptive phoneme alignment based on rough set theory Proceedings of the 7th international conference on Rough sets and current trends in computing, (100-109)
  389. ACM
    Bao X and Roy Choudhury R MoVi Proceedings of the 8th international conference on Mobile systems, applications, and services, (357-370)
  390. ACM
    Champin P, Encelle B, Evans N, O.-Beldame M, Prié Y and Troncy R Towards collaborative annotation for video accessibility Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A), (1-4)
  391. Trentin E, Scherer S and Schwenker F Maximum echo-state-likelihood networks for emotion recognition Proceedings of the 4th IAPR TC3 conference on Artificial Neural Networks in Pattern Recognition, (60-71)
  392. Schmidt M, Schels M and Schwenker F A hidden markov model based approach for facial expression recognition in image sequences Proceedings of the 4th IAPR TC3 conference on Artificial Neural Networks in Pattern Recognition, (149-160)
  393. Bocchi L, Lapi S and Ballerini L Evolution of communicating individuals Proceedings of the 2010 international conference on Applications of Evolutionary Computation - Volume Part I, (328-335)
  394. Bouchaffra D (2010). Conformation-based hidden Markov models, IEEE Transactions on Neural Networks, 21:4, (595-608), Online publication date: 1-Apr-2010.
  395. Milone D, Di Persia L and Torres M (2010). Denoising and recognition using hidden Markov models with observation distributions modeled by hidden Markov trees, Pattern Recognition, 43:4, (1577-1589), Online publication date: 1-Apr-2010.
  396. ACM
    Park H, Youn S, Hong E, Lee C, Kwon Y, Ko H, Park M, Sohn Y and Kim J Sharing of baseball event through social media Proceedings of the international conference on Multimedia information retrieval, (389-392)
  397. ACM
    Müller M, Grosche P and Wiering F Automated analysis of performance variations in folk song recordings Proceedings of the international conference on Multimedia information retrieval, (247-256)
  398. ACM
    Tingle D, Kim Y and Turnbull D Exploring automatic music annotation with "acoustically-objective" tags Proceedings of the international conference on Multimedia information retrieval, (55-62)
  399. Papay K Designing a hungarian multimodal database - speech recording and annotation Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues, (403-411)
  400. Karam J Various speech processing techniques for multimedia applications Proceedings of the 9th WSEAS international conference on Signal processing, robotics and automation, (304-309)
  401. Karam J (2010). A comprehensive approach for speech related multimedia applications, WSEAS Transactions on Signal Processing, 6:1, (12-21), Online publication date: 1-Jan-2010.
  402. Chellappa R, Sankaranarayanan A, Veeraraghavan A and Turaga P (2010). Statistical Methods and Models for Video-Based Tracking, Modeling, and Recognition, Foundations and Trends in Signal Processing, 3:1–2, (1-151), Online publication date: 1-Jan-2010.
  403. Lam Y, Coutinho J, Ho C, Leong P and Luk W (2010). Multiloop parallelisation using unrolling and fission, International Journal of Reconfigurable Computing, 2010, (1-1), Online publication date: 1-Jan-2010.
  404. Chandrakala S and Sekhar C Classification of Multi-variate Varying Length Time Series Using Descriptive Statistical Features Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence, (13-18)
  405. Gu H, Shen R and Chen Z A Sound-Directed Cameraman Accommodating Unfettered-Speaking in e-Learning Classrooms Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing, (332-343)
  406. Al-Naymat G, Chawla S and Taheri J SparseDTW Proceedings of the Eighth Australasian Data Mining Conference - Volume 101, (117-127)
  407. Okamoto T and Ishida Y (2009). Towards an immunity-based system for detecting masqueraders, International Journal of Knowledge-based and Intelligent Engineering Systems, 13:3,4, (103-110), Online publication date: 1-Dec-2009.
  408. Park H, Oh S and Lee S (2009). A Bark-scale filter bank approach to independent component analysis for acoustic mixtures, Neurocomputing, 73:1-3, (304-314), Online publication date: 1-Dec-2009.
  409. Xu D, Yan K and Wu H Blind channel equalization using expectation maximization of auxiliary objective function for complex constellations Proceedings of the 28th IEEE conference on Global telecommunications, (5012-5017)
  410. ACM
    Chen S and Chen S Content-based music genre classification using timbral feature vectors and support vector machine Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human, (1095-1101)
  411. Ziółko B and Ziółko M Time durations of phonemes in polish language for speech and speaker recognition Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics, (105-114)
  412. ACM
    Jeong C and Kim D Linear predictive coding representation of correlated mutation for protein sequence alignment Proceedings of the third international workshop on Data and text mining in bioinformatics, (27-34)
  413. Qin C and Carreira-Perpiñán M The geometry of the articulatory region that produces a speech sound Proceedings of the 43rd Asilomar conference on Signals, systems and computers, (1742-1746)
  414. Giorgino T, Tormene P, Maggioni G, Pistarini C and Quaglini S (2009). Wireless support to poststroke rehabilitation, IEEE Transactions on Information Technology in Biomedicine, 13:6, (1012-1018), Online publication date: 1-Nov-2009.
  415. Pham T, Brandl M and Beck D (2009). Fuzzy declustering-based vector quantization, Pattern Recognition, 42:11, (2570-2577), Online publication date: 1-Nov-2009.
  416. Gullo F, Ponti G, Tagarelli A and Greco S (2009). A time series representation model for accurate and fast similarity detection, Pattern Recognition, 42:11, (2998-3014), Online publication date: 1-Nov-2009.
  417. Bayilmis C, Kelebekler E, Erturk I, Ceken C and Ozcelik I (2009). Integration of a speech activated control system and a wireless interworking unit for a CAN-based distributed application, Journal of Network and Computer Applications, 32:6, (1210-1218), Online publication date: 1-Nov-2009.
  418. Gatica-Perez D (2009). Automatic nonverbal analysis of social interaction in small groups, Image and Vision Computing, 27:12, (1775-1787), Online publication date: 1-Nov-2009.
  419. Temko A and Nadeu C (2009). Acoustic event detection in meeting-room environments, Pattern Recognition Letters, 30:14, (1281-1288), Online publication date: 30-Oct-2009.
  420. Huda S, Yearwood J and Togneri R (2009). A stochastic version of Expectation Maximization algorithm for better estimation of Hidden Markov Model, Pattern Recognition Letters, 30:14, (1301-1309), Online publication date: 30-Oct-2009.
  421. ACM
    Shirali-Shahreza S Compact representation of multimedia files for indexing, classification and retrieval Proceedings of the International Conference on Management of Emergent Digital EcoSystems, (489-492)
  422. ACM
    Liu Y and Sato Y Visual localization of non-stationary sound sources Proceedings of the 17th ACM international conference on Multimedia, (513-516)
  423. Othman M, Zhang Z, Imamura T and Miyake T Modeling driver operation behavior by linear prediction analysis and auto associative neural network Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics, (649-653)
  424. Muramatsu D and Matsumoto T Online signature verification algorithm with a user-specific global-parameter fusion model Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics, (486-491)
  425. ACM
    Gullo F, Ponti G, Tagarelli A, Iiritano S, Ruffolo M and Labate D Low-voltage electricity customer profiling based on load data clustering Proceedings of the 2009 International Database Engineering & Applications Symposium, (330-333)
  426. Sanna M and Murroni M A codebook design method for fricative enhancement in artificial bandwidth extension Proceedings of the 5th International ICST Mobile Multimedia Communications Conference, (1-7)
  427. Ikizler N and Duygulu P (2009). Histogram of oriented rectangles, Image and Vision Computing, 27:10, (1515-1526), Online publication date: 1-Sep-2009.
  428. ACM
    Jang H, Song J and Jeong H Advanced narrow speech channeling algorithm for robot speech recognition Proceedings of the 2009 International Conference on Hybrid Information Technology, (130-137)
  429. Li T, Xu W, Pan J and Yan Y Improving automatic speech recognizer of voice search using system combination Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 4, (477-480)
  430. Tran H and Li H (2009). Jump function Kolmogorov for audio classification in noise-mismatch conditions, IEEE Transactions on Signal Processing, 57:8, (2908-2918), Online publication date: 1-Aug-2009.
  431. Jothilakshmi S, Ramalingam V and Palanivel S (2009). Unsupervised speaker segmentation with residual phase and MFCC features, Expert Systems with Applications: An International Journal, 36:6, (9799-9804), Online publication date: 1-Aug-2009.
  432. Pham T, Müller C and Crane D (2009). Fuzzy scaling analysis of a mouse mutant with brain morphological changes, IEEE Transactions on Information Technology in Biomedicine, 13:4, (629-635), Online publication date: 1-Jul-2009.
  433. Chu S, Tang H and Huang T Locality preserving speaker clustering Proceedings of the 2009 IEEE international conference on Multimedia and Expo, (494-497)
  434. Tang H, Chu S, Hasegawa-Johnson M and Huang T Emotion recognition from speech via boosted Gaussian mixture models Proceedings of the 2009 IEEE international conference on Multimedia and Expo, (294-297)
  435. ACM
    Lu H, Pan W, Lane N, Choudhury T and Campbell A SoundSense Proceedings of the 7th international conference on Mobile systems, applications, and services, (165-178)
  436. Georgoulas G, Georgopoulos V, Stylios G and Stylios C Detection of articulation disorders using empirical mode decomposition and neural networks Proceedings of the 2009 international joint conference on Neural Networks, (2484-2489)
  437. ACM
    Stefan A, Wang H and Athitsos V Towards automated large vocabulary gesture search Proceedings of the 2nd International Conference on PErvasive Technologies Related to Assistive Environments, (1-8)
  438. Wang K (2009). Wavelet-based voice activity detection algorithm in variable-level noise environment, WSEAS Transactions on Computers, 8:6, (949-955), Online publication date: 1-Jun-2009.
  439. Kozat S and Singer A (2009). Switching strategies for sequential decision problems with multiplicative loss with application to portfolios, IEEE Transactions on Signal Processing, 57:6, (2192-2208), Online publication date: 1-Jun-2009.
  440. Lee C, Shih J, Yu K and Lin H (2009). Automatic music genre classification based on modulation spectral analysis of spectral and cepstral features, IEEE Transactions on Multimedia, 11:4, (670-682), Online publication date: 1-Jun-2009.
  441. Veeraraghavan A, Srivastava A, Roy-Chowdhury A and Chellappa R (2009). Rate-invariant recognition of humans and their activities, IEEE Transactions on Image Processing, 18:6, (1326-1339), Online publication date: 1-Jun-2009.
  442. Ramkumar B (2009). Automatic modulation classification for cognitive radios using cyclic feature detection, IEEE Circuits and Systems Magazine, 09:2, (27-45), Online publication date: 1-Jun-2009.
  443. Mirzaei A and Safabakhsh R (2009). Optimal matching by the transiently chaotic neural network, Applied Soft Computing, 9:3, (863-873), Online publication date: 1-Jun-2009.
  444. ACM
    Shen J, Shepherd J, Cui B and Tan K (2009). A novel framework for efficient automated singer identification in large music databases, ACM Transactions on Information Systems, 27:3, (1-31), Online publication date: 1-May-2009.
  445. Kamaruddin N and Wahab A (2009). Features extraction for speech emotion, Journal of Computational Methods in Sciences and Engineering, 9:1,2S1, (1-12), Online publication date: 1-Apr-2009.
  446. Hosom J (2009). Speaker-independent phoneme alignment using transition-dependent states, Speech Communication, 51:4, (352-368), Online publication date: 1-Apr-2009.
  447. Keshet J, Grangier D and Bengio S (2009). Discriminative keyword spotting, Speech Communication, 51:4, (317-329), Online publication date: 1-Apr-2009.
  448. Juang C, Lai C and Tu C (2009). Dynamic programming prediction errors of recurrent neural fuzzy networks for speech recognition, Expert Systems with Applications: An International Journal, 36:3, (6368-6374), Online publication date: 1-Apr-2009.
  449. Yang H, Lay Y, Lin C and Hong P (2009). The radar-graphic speech learning system for hearing impaired, Expert Systems with Applications: An International Journal, 36:3, (4804-4809), Online publication date: 1-Apr-2009.
  450. Albornoz E, Milone D and Rufiner H Multiple feature extraction and hierarchical classifiers for emotions recognition Proceedings of the Second international conference on Development of Multimodal Interfaces: active Listening and Synchrony, (242-254)
  451. Kim E and Lee S Extraction of lip movement image signals from successive image frames Proceedings of the 11th international conference on Advanced Communication Technology - Volume 3, (1974-1978)
  452. Huda S, Yearwood J and Togneri R (2009). A constraint-based evolutionary learning approach to the expectation maximization for optimal estimation of the hidden Markov model for speech signal modeling, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 39:1, (182-197), Online publication date: 1-Feb-2009.
  453. ACM
    Nehe N, Jadhav D and Holambe R Multiresolution features and polynomial kernel subspace approach for isolated word recognition Proceedings of the International Conference on Advances in Computing, Communication and Control, (373-378)
  454. ACM
    Nehe N and Holambe R New robust subband Cepstral feature for isolated word recognition Proceedings of the International Conference on Advances in Computing, Communication and Control, (326-330)
  455. Kos M, Grašič M and Kačič Z (2009). Online speech/music segmentation based on the variance mean of filter bank energy, EURASIP Journal on Advances in Signal Processing, 2009, (10-10), Online publication date: 1-Jan-2009.
  456. Valenzise G, Prandi G, Tagliasacchi M and Sarti A (2009). Identification of sparse audio tampering using distributed source coding and compressive sensing techniques, Journal on Image and Video Processing, 2009, (1-12), Online publication date: 1-Jan-2009.
  457. Kim S and Gao J A Dynamic Programming Technique for Optimizing Dissimilarity-Based Classifiers Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, (654-663)
  458. Lee J, Oh S and Lee S (2008). Letters, Neurocomputing, 72:1-3, (636-642), Online publication date: 1-Dec-2008.
  459. İkizler N and Forsyth D (2008). Searching for Complex Human Activities with No Visual Examples, International Journal of Computer Vision, 80:3, (337-357), Online publication date: 1-Dec-2008.
  460. Jing Y, Pavlović V and Rehg J (2008). Boosted Bayesian network classifiers, Machine Learning, 73:2, (155-184), Online publication date: 1-Nov-2008.
  461. ACM
    Wang J, Makihara Y and Yagi Y People tracking and segmentation using spatiotemporal shape constraints Proceedings of the 1st ACM workshop on Vision networks for behavior analysis, (31-38)
  462. ACM
    Yu Y, Downie J, Chen L, Oria V and Joe K Searching musical audio datasets by a batch of multi-variant tracks Proceedings of the 1st ACM international conference on Multimedia information retrieval, (121-127)
  463. ACM
    Chechik G, Ie E, Rehn M, Bengio S and Lyon D Large-scale content-based audio retrieval from text queries Proceedings of the 1st ACM international conference on Multimedia information retrieval, (105-112)
  464. ACM
    Huang E and Fu L Segmented gesture recognition for controlling character animation Proceedings of the 2008 ACM symposium on Virtual reality software and technology, (205-208)
  465. ACM
    Xu M, Jin J, Luo S and Duan L Hierarchical movie affective content analysis based on arousal and valence features Proceedings of the 16th ACM international conference on Multimedia, (677-680)
  466. Callut J, Françoisse K, Saerens M and Dupont P Semi-supervised classification from discriminative random walks Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, (162-177)
  467. ACM
    Xu M, Jin J and Luo S Personalized video adaptation based on video content analysis Proceedings of the 9th International Workshop on Multimedia Data Mining: held in conjunction with the ACM SIGKDD 2008, (26-35)
  468. ACM
    Yen L, Saerens M, Mantrach A and Shimbo M A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, (785-793)
  469. Sherkat R and Rafiei D (2008). On efficiently searching trajectories and archival data for historical similarities, Proceedings of the VLDB Endowment, 1:1, (896-908), Online publication date: 1-Aug-2008.
  470. ACM
    Stefan A, Athitsos V, Alon J and Sclaroff S Translation and scale-invariant gesture recognition in complex scenes Proceedings of the 1st international conference on PErvasive Technologies Related to Assistive Environments, (1-8)
  471. Chu S Unstructured audio classification for environment recognition Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, (1845-1846)
  472. ACM
    Liu Z, Gibbon D, Drucker H and Basso A Content personalization and adaptation for three-screen services Proceedings of the 2008 international conference on Content-based image and video retrieval, (635-644)
  473. ACM
    Luz S, Masoodian M and Rogers B Interactive visualisation techniques for dynamic speech transcription, correction and training Proceedings of the 9th ACM SIGCHI New Zealand Chapter's International Conference on Human-Computer Interaction: Design Centered HCI, (9-16)
  474. Pham T, Beck D, Brandl M and Zhou X Classification of Proteomic Signals by Block Kriging Error Matching Proceedings of the 3rd international conference on Image and Signal Processing, (281-288)
  475. Bahrani M, Sameti H, Hafezi N and Momtazi S A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems Proceedings of the 21st international conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: New Frontiers in Applied Artificial Intelligence, (286-293)
  476. Kannan R and Prabhakar R (2008). Off-line cursive handwritten Tamil character recognition, WSEAS Transactions on Signal Processing, 4:6, (351-360), Online publication date: 1-Jun-2008.
  477. Seeger M (2008). Bayesian Inference and Optimal Design for the Sparse Linear Model, The Journal of Machine Learning Research, 9, (759-813), Online publication date: 1-Jun-2008.
  478. Pereira A, Smith L and Yu C (2008). Social coordination in toddler's word learning, Connection Science, 20:2-3, (73-89), Online publication date: 1-Jun-2008.
  479. Petry A, Soares S, Marchioro G and De Franceschi A A distributed speaker authentication system Proceedings of the WSEAS International Conference on Applied Computing Conference, (262-267)
  480. Chandramohan V and Pham T Cancer classification using kernelized fuzzy C-means Proceedings of the 9th WSEAS International Conference on Fuzzy Systems, (90-99)
  481. Petry A, Soares S, Marchioro G and De Franceschi A (2008). Speaker recognition techniques for remote authentication of users in computer networks, WSEAS TRANSACTIONS on SYSTEMS, 7:5, (590-599), Online publication date: 1-May-2008.
  482. ACM
    Xu M, Xu C, Duan L, Jin J and Luo S (2008). Audio keywords generation for sports video analysis, ACM Transactions on Multimedia Computing, Communications, and Applications, 4:2, (1-23), Online publication date: 1-May-2008.
  483. Azmi M, Tolba H, Mahdy S and Fashal M (2008). Syllable-based automatic Arabic speech recognition in noisy-telephone channel, WSEAS Transactions on Signal Processing, 4:4, (211-220), Online publication date: 1-Apr-2008.
  484. Lipeika A and Lipeikienė J (2008). On the Use of the Formant Features in the Dynamic Time Warping Based Recognition of Isolated Words, Informatica, 19:2, (213-226), Online publication date: 1-Apr-2008.
  485. Abdelkader M, Roy-Chowdhury A, Chellappa R and Akdemir U (2008). Activity representation using 3D shape models, Journal on Image and Video Processing, 2008, (1-16), Online publication date: 1-Apr-2008.
  486. Pao T, Chen Y and Yeh J (2008). Comparison of Classification Methods for Detecting Emotion from Mandarin Speech, IEICE - Transactions on Information and Systems, E91-D:4, (1074-1081), Online publication date: 1-Apr-2008.
  487. Valencia-Jiménez J and Fernández-Caballero A Holonic multi-agent system model for fuzzy automatic speech / speaker recognition Proceedings of the 2nd KES International conference on Agent and multi-agent systems: technologies and applications, (73-82)
  488. ACM
    Cui B, Jagadish H, Ooi B and Tan K Compacting music signatures for efficient music retrieval Proceedings of the 11th international conference on Extending database technology: Advances in database technology, (229-240)
  489. ACM
    Einsele F, Ingold R and Hennebert J A language-independent, open-vocabulary system based on HMMs for recognition of ultra low resolution words Proceedings of the 2008 ACM symposium on Applied computing, (429-433)
  490. Sharma A, Shrotriya M, Farooq O and Abbasi Z (2008). Hybrid wavelet based LPC features for Hindi speech recognition, International Journal of Information and Communication Technology, 1:3/4, (373-381), Online publication date: 1-Mar-2008.
  491. ACM
    Hashimoto K, Aoki-Kinoshita K, Ueda N, Kanehisa M and Mamitsuka H (2008). A new efficient probabilistic model for mining labeled ordered trees applied to glycobiology, ACM Transactions on Knowledge Discovery from Data, 2:1, (1-30), Online publication date: 1-Mar-2008.
  492. Ibrahim M, Khalid M and Yusof R Text dependent speaker verification system using discriminative weighting method and Artificial Neural Networks Proceedings of the 26th IASTED International Conference on Artificial Intelligence and Applications, (200-204)
  493. Legrand B, Chang C, Ong S, Neo S and Palanisamy N (2008). Chromosome classification using dynamic time warping, Pattern Recognition Letters, 29:3, (215-222), Online publication date: 1-Feb-2008.
  494. Tuononen M, Hautamäki R and Fränti P Automatic voice activity detection in different speech applications Proceedings of the 1st international conference on Forensic applications and techniques in telecommunications, information, and multimedia and workshop, (1-6)
  495. Nicholl P, Amira A, Bouchaffra D and Perrott R (2008). A statistical multiresolution approach for face recognition using structural hidden Markov models, EURASIP Journal on Advances in Signal Processing, 2008, (22), Online publication date: 1-Jan-2008.
  496. Zhang C, Zheng C, Yu X and Ouyang Y (2008). Estimating VDT mental fatigue using multichannel linear descriptors and KPCA-HMM, EURASIP Journal on Advances in Signal Processing, 2008, (1-11), Online publication date: 1-Jan-2008.
  497. Chaovalitwongse W and Pardalos P (2008). On the time series support vector machine using dynamic time warping kernel for brain activity classification, Cybernetics and Systems Analysis, 44:1, (125-138), Online publication date: 1-Jan-2008.
  498. Einsele F, Ingold R and Hennebert J A HMM-based approach to recognize ultra low resolution anti-aliased words Proceedings of the 2nd international conference on Pattern recognition and machine intelligence, (511-518)
  499. Einsele F, Ingold R and Hennebert J A HMM-Based Approach to Recognize Ultra Low Resolution Anti-Aliased Words Pattern Recognition and Machine Intelligence, (511-518)
  500. D'Aguanno A and Vercellesi G Automatic synchronisation between audio and score musical description layers Proceedings of the semantic and digital media technologies 2nd international conference on Semantic Multimedia, (200-210)
  501. Vallejo E, Cody M and Taylor C Unsupervised acoustic classification of bird species using hierarchical self-organizing maps Proceedings of the 3rd Australian conference on Progress in artificial life, (212-221)
  502. Galassi U, Botta M and Giordana A (2007). Hierarchical Hidden Markov Models for User/Process Profile Learning, Fundamenta Informaticae, 78:4, (487-505), Online publication date: 1-Dec-2007.
  504. Freitas C, De Carvalho J, Oliveira J, Aires S and Sabourin R Confusion matrix disagreement for multiple classifiers Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications, (387-396)
  505. Rodríguez J, Guerra S and Fernández L Using adaptive filter to increase automatic speech recognition rate in a digit corpus Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications, (78-87)
  506. Freitas C, de Carvalho J, Oliveira J, Aires S and Sabourin R Confusion Matrix Disagreement for Multiple Classifiers Progress in Pattern Recognition, Image Analysis and Applications, (387-396)
  507. Narita H, Sawamura Y and Hayashi A Learning a Kernel Matrix for Time Series Data from DTW Distances Neural Information Processing, (336-345)
  508. ACM
    Everitt K, Harada S, Bilmes J and Landay J Disambiguating speech commands using physical context Proceedings of the 9th international conference on Multimodal interfaces, (247-254)
  509. Rodríguez J and Guerra S Using adaptive filter and wavelets to increase automatic speech recognition rate in noisy environment Proceedings of the artificial intelligence 6th Mexican international conference on Advances in artificial intelligence, (1015-1024)
  510. Milone D and Di Persia L An EM algorithm to learn sequences in the wavelet domain Proceedings of the artificial intelligence 6th Mexican international conference on Advances in artificial intelligence, (518-528)
  511. Lauria S Human robot interactions Proceedings of the 2nd international conference on Advances in brain, vision and artificial intelligence, (555-565)
  512. Esposito A, Stejskal V, Smékal Z and Bourbakis N The significance of empty speech pauses Proceedings of the 2nd international conference on Advances in brain, vision and artificial intelligence, (542-554)
  513. Gordillo J and Conde E (2007). An HMM for detecting spam mail, Expert Systems with Applications: An International Journal, 33:3, (667-682), Online publication date: 1-Oct-2007.
  514. ACM
    Nwe T and Li H Singing voice detection using perceptually-motivated features Proceedings of the 15th ACM international conference on Multimedia, (309-312)
  515. ACM
    Kleban J, Sarkar A, Moxley E, Mangiat S, Joshi S, Kuo T and Manjunath B Feature fusion and redundancy pruning for rush video summarization Proceedings of the international workshop on TRECVID video summarization, (84-88)
  516. Han W, Lee J, Moon Y and Jiang H Ranked subsequence matching in time-series databases Proceedings of the 33rd international conference on Very large data bases, (423-434)
  517. Callut J and Dupont P Learning Partially Observable Markov Models from First Passage Times Proceedings of the 18th European conference on Machine Learning, (91-103)
  518. Miotto R and Orio N Automatic identification of music works through audio matching Proceedings of the 11th European conference on Research and Advanced Technology for Digital Libraries, (124-135)
  519. Harris C and Cahill V An empirical study of the potential for context-aware power management Proceedings of the 9th international conference on Ubiquitous computing, (235-252)
  520. Wysoski S, Benuskova L and Kasabov N Text-independent speaker authentication with spiking neural networks Proceedings of the 17th international conference on Artificial neural networks, (758-767)
  521. Feldhoffer G, Oroszi B, Takács G, Tihanyi A and Bárdi T Inter-speaker synchronization in audiovisual database for lip-readable speech to animation conversion Proceedings of the 10th international conference on Text, speech and dialogue, (447-454)
  522. Awais M, Masud S, Ahktar J and Shamail S Arabic phoneme identification using conventional and concurrent neural networks in non native speakers Proceedings of the intelligent computing 3rd international conference on Advanced intelligent computing theories and applications, (897-905)
  523. Wright C, Ballard L, Monrose F and Masson G Language identification of encrypted VoIP traffic Proceedings of the 16th USENIX Security Symposium, (1-12)
  524. Hsu E, da Silva M and Popović J Guided time warping for motion editing Proceedings of the 2007 ACM SIGGRAPH/Eurographics symposium on Computer animation, (45-52)
  525. Lauria S (2007). Talking to Machines, Circuits, Systems, and Signal Processing, 26:4, (513-526), Online publication date: 1-Aug-2007.
  526. ACM
    Turnbull D, Barrington L, Torres D and Lanckriet G Towards musical query-by-semantic-description using the CAL500 data set Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, (439-446)
  527. Pham T and Zhou X A novel image feature for nuclear-phase classification in high content screening Proceedings of the 2007 international conference on Advances in mass data analysis of signals and images in medicine biotechnology and chemistry, (84-93)
  528. Hoey J and Little J (2007). Value-Directed Human Behavior Analysis from Video Using Partially Observable Markov Decision Processes, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29:7, (1118-1132), Online publication date: 1-Jul-2007.
  529. Aradilla G and Bourlard H Posterior-based features and distances in template matching for speech recognition Proceedings of the 4th international conference on Machine learning for multimodal interaction, (204-214)
  530. Abate S and Menzel W Syllable-based speech recognition for Amharic Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, (33-40)
  531. Nguyen P, Mahajan M and He X Training non-parametric features for statistical machine translation Proceedings of the Second Workshop on Statistical Machine Translation, (72-79)
  532. Han W, Hon K, Chan C, Choy C and Pun K (2007). A Speech Recognition IC Using Hidden Markov Models with Continuous Observation Densities, Journal of VLSI Signal Processing Systems, 47:3, (223-232), Online publication date: 1-Jun-2007.
  533. Ko A, Sabourin R and De Souza Britto A A new HMM-based ensemble generation method for numeral recognition Proceedings of the 7th international conference on Multiple classifier systems, (52-61)
  534. Vayanos P, Chen M, Jelfs B and Mandic D Exploiting nonlinearity in adaptive signal processing Proceedings of the 2007 international conference on Advances in nonlinear speech processing, (57-77)
  535. Chakrabartty S and Cauwenberghs G (2007). Gini Support Vector Machine: Quadratic Entropy Based Robust Multi-Class Probability Regression, The Journal of Machine Learning Research, 8, (813-839), Online publication date: 1-May-2007.
  536. Jeong S, Kim H and Hahn M (2007). Response Time Reduction of Speech Recognizers Using Single Gaussians, IEICE - Transactions on Information and Systems, E90-D:5, (868-871), Online publication date: 1-May-2007.
  537. Leung K, Leung F, Lam H and Ling S (2007). Application of a modified neural fuzzy network and an improved genetic algorithm to speech recognition, Neural Computing and Applications, 16:4-5, (419-431), Online publication date: 1-May-2007.
  538. ACM
    Hsu C, Chen C, Shih T and Chen C (2007). Measuring similarity between transliterations against noise data, ACM Transactions on Asian Language Information Processing, 6:1, (5-es), Online publication date: 1-Apr-2007.
  539. Travieso C, Briceño J, Ferrer M and Alonso J Using Fisher kernel on 2D-shape identification Proceedings of the 11th international conference on Computer aided systems theory, (740-746)
  540. Klempous R Movement identification analysis based on motion capture Proceedings of the 11th international conference on Computer aided systems theory, (629-637)
  541. Pham T (2007). Spectral distortion measures for biological sequence comparisons and database searching, Pattern Recognition, 40:2, (516-529), Online publication date: 1-Feb-2007.
  542. Cho S, Kim J, Kim D, Kang J, Lee J, Kim H, Kim S, Oh D, Jeon S and Chung S Text-dependent speaker verification using genetic algorithm and competitive learning neural network Proceedings of the Fourth IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, (293-297)
  543. Petrovska-Delacrétaz D, El Hannani A and Chollet G Text-independent speaker verification Progress in nonlinear speech processing, (135-169)
  544. Pantazis Y and Stylianou Y On the detection of discontinuities in concatenative speech synthesis Progress in nonlinear speech processing, (89-100)
  545. Rabiner L and Schafer R (2007). Introduction to digital speech processing, Foundations and Trends in Signal Processing, 1:1, (1-194), Online publication date: 1-Jan-2007.
  546. Wang C and Gan W (2007). Efficient algorithm and architecture of critical-band transform for low-power speech applications, EURASIP Journal on Advances in Signal Processing, 2007:1, (76-76), Online publication date: 1-Jan-2007.
  547. Alonso M, Richard G and David B (2007). Accurate tempo estimation based on harmonic + noise decomposition, EURASIP Journal on Advances in Signal Processing, 2007:1, (161-161), Online publication date: 1-Jan-2007.
  548. Miyabe S, Hinamoto Y, Saruwatari H, Shikano K and Tatekura Y (2007). Interface for barge-in free spoken dialogue system based on sound field reproduction and microphone array, EURASIP Journal on Advances in Signal Processing, 2007:1, (184-184), Online publication date: 1-Jan-2007.
  549. Andreão R and Boudy J (2007). Combining wavelet transform and hidden Markov models for ECG segmentation, EURASIP Journal on Advances in Signal Processing, 2007:1, (95-95), Online publication date: 1-Jan-2007.
  550. Dusan S (2007). On the relevance of some spectral and temporal patterns for vowel classification, Speech Communication, 49:1, (71-82), Online publication date: 1-Jan-2007.
  551. Navarro-Mesa J, Ravelo-García A, Lorenzo-García F, Martín-González S, Hernández-Pérez E and Quintana-Morales P An approach to the determination of differences between good and bad sleepers by means of an automatic sleep stage scoring Proceedings of the 6th WSEAS international conference on Applied computer science, (337-340)
  552. Giuliani M, Nwe T and Li H Meeting segmentation using two-layer cascaded subband filters Proceedings of the 5th international conference on Chinese Spoken Language Processing, (672-682)
  553. Zhang S and Laprie Y The implementation of service enabling with spoken language of a multi-modal system ozone Proceedings of the 5th international conference on Chinese Spoken Language Processing, (640-647)
  554. Lu X and Dang J Auditory contrast spectrum for robust speech recognition Proceedings of the 5th international conference on Chinese Spoken Language Processing, (325-334)
  555. Pham T, Wang H, Zhou X, Beck D, Brandl M, Hoehn G, Azok J, Brennan M, Hazen S, Li K and Wong S Linear predictive coding and its decision logic for early prediction of major adverse cardiac events using mass spectrometry data Proceedings of the 2006 workshop on Intelligent systems for bioinformatics - Volume 73, (61-66)
  556. Kim S and Smyth P (2006). Segmental Hidden Markov Models with Random Effects for Waveform Modeling, The Journal of Machine Learning Research, 7, (945-969), Online publication date: 1-Dec-2006.
  557. Çetingül H, Erzin E, Yemez Y and Tekalp A (2006). Multimodal speaker/speech recognition using lip motion, lip texture and audio, Signal Processing, 86:12, (3549-3558), Online publication date: 1-Dec-2006.
  558. Monaci G, Escoda Ò and Vandergheynst P (2006). Analysis of multimodal sequences using geometric video representations, Signal Processing, 86:12, (3534-3548), Online publication date: 1-Dec-2006.
  559. Ortega-González V, Angeles-Yreta A, Medina-Apodaca J, Landassuri-Moreno V and Figueroa-Nazuno J Eigenconjugation Proceedings of the 11th Iberoamerican conference on Progress in Pattern Recognition, Image Analysis and Applications, (237-246)
  560. ACM
    Greco S, Ruffolo M and Tagarelli A Effective and efficient similarity search in time series Proceedings of the 15th ACM international conference on Information and knowledge management, (808-809)
  561. ACM
    Liu P and Soong F Word graph based speech recognition error correction by handwriting input Proceedings of the 8th international conference on Multimodal interfaces, (339-346)
  562. Qiao Y, Nishiara M and Yasuhara M (2006). A Framework Toward Restoration of Writing Order from Single-Stroked Handwriting Image, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:11, (1724-1737), Online publication date: 1-Nov-2006.
  563. ACM
    Paleari M and Lisetti C Toward multimodal fusion of affective cues Proceedings of the 1st ACM international workshop on Human-centered multimedia, (99-108)
  564. ACM
    Shiu Y, Jeong H and Kuo C Similarity matrix processing for music structure analysis Proceedings of the 1st ACM workshop on Audio and music computing multimedia, (69-76)
  565. ACM
    Ando R, Shinoda K, Furui S and Mochizuki T Robust scene recognition using language models for scene contexts Proceedings of the 8th ACM international workshop on Multimedia information retrieval, (99-106)
  566. ACM
    Cui B, Shen J, Cong G, Shen H and Yu C Exploring composite acoustic features for efficient music similarity query Proceedings of the 14th ACM international conference on Multimedia, (412-420)
  567. Marvi H An approach to speaker identification using acoustic-feature planes Proceedings of the 5th WSEAS international conference on Data networks, communications and computers, (240-243)
  568. Kim Y and Jeong H Two-Level dynamic programming hardware implementation for real time processing Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I, (1090-1097)
  569. Ma X, Zhou W, Ju F and Jiang Q Speech feature extraction based on wavelet modulation scale for robust speech recognition Proceedings of the 13th international conference on Neural Information Processing - Volume Part II, (499-505)
  570. Park S and Kim S (2006). Prefix-querying with an L1 distance metric for time-series subsequence matching under time warping, Journal of Information Science, 32:5, (387-399), Online publication date: 1-Oct-2006.
  571. Cheung Y (2006). A maximum likelihood approach to temporal factor analysis in state-space model, Signal Processing, 86:10, (2966-2980), Online publication date: 1-Oct-2006.
  572. Buscicchio C, Górecki P and Caponetti L Speech emotion recognition using spiking neural networks Proceedings of the 16th international conference on Foundations of Intelligent Systems, (38-46)
  573. Zhang R, Wu T, Li X and Xu D A speech stream detection in adverse acoustic environments based on cross correlation technique Proceedings of the Second international conference on Advances in Natural Computation - Volume Part II, (664-667)
  574. Thoma G, Mao S, Misra D and Rees J Design of a digital library for early 20th century medico-legal documents Proceedings of the 10th European conference on Research and Advanced Technology for Digital Libraries, (147-157)
  575. Müller M and Röder T Motion templates for automatic classification and retrieval of motion capture data Proceedings of the 2006 ACM SIGGRAPH/Eurographics symposium on Computer animation, (137-146)
  576. ACM
    Moerchen F, Mierswa I and Ultsch A Understandable models of music collections based on exhaustive feature generation with temporal statistics Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, (882-891)
  577. Khorsheed M Mono-font cursive arabic text recognition using speech recognition system Proceedings of the 2006 joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition, (755-763)
  578. Bevilacqua V, Mastronardi G, Pedone A, Romanazzi G and Daleno D Hidden markov models for recognition using artificial neural networks Proceedings of the 2006 international conference on Intelligent Computing - Volume Part I, (126-134)
  579. ACM
    Shen J, Cui B, Shepherd J and Tan K Towards efficient automated singer identification in large music databases Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, (59-66)
  580. Lin S, Wang J, Wang J and Yang H An ARM-Based embedded system design for speech-to-speech translation Proceedings of the 2006 international conference on Embedded and Ubiquitous Computing, (499-508)
  581. Zhu J and Li J An HMM-based approach to automatic phrasing for Mandarin text-to-speech synthesis Proceedings of the COLING/ACL on Main conference poster sessions, (977-982)
  582. Pham T Similarity searching in DNA sequences by spectral distortion measures Proceedings of the 6th Industrial Conference on Data Mining conference on Advances in Data Mining: applications in Medicine, Web Mining, Marketing, Image and Signal Mining, (24-37)
  583. Chapados N and Bengio Y The K best-paths approach to approximate dynamic programming with application to portfolio optimization Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence, (491-502)
  584. Liu Z and Sarkar S (2006). Improved Gait Recognition by Gait Dynamics Normalization, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:6, (863-876), Online publication date: 1-Jun-2006.
  585. Kornagel U (2006). Techniques for artificial bandwidth extension of telephone speech, Signal Processing, 86:6, (1296-1306), Online publication date: 1-Jun-2006.
  586. Park C, Ki M, Namkung J and Paik J Multimodal priority verification of face and speech using momentum back-propagation neural network Proceedings of the Third international conference on Advances in Neural Networks - Volume Part II, (140-149)
  587. El-Obaid M, Al-Nassiri A and Maaly I Arabic phoneme recognition using neural networks Proceedings of the 5th WSEAS international conference on Signal processing, (99-104)
  588. Xue W, Du S, Fang C and Ye Y Voice activity detection using wavelet-based multiresolution spectrum and support vector machines and audio mixing algorithm Proceedings of the 2006 international conference on Computer Vision in Human-Computer Interaction, (78-88)
  589. Teruszkin R and Gil Vianna Resende F Phonetic sequence to graphemes conversion based on DTW and one-stage algorithms Proceedings of the 7th international conference on Computational Processing of the Portuguese Language, (220-224)
  590. Roh M, Christmas B, Kittler J and Lee S Robust player gesture spotting and recognition in low-resolution sports video Proceedings of the 9th European conference on Computer Vision - Volume Part IV, (347-358)
  591. Özuysal M, Lepetit V, Fleuret F and Fua P Feature harvesting for tracking-by-detection Proceedings of the 9th European conference on Computer Vision - Volume Part III, (592-605)
  592. Zimmermann M, Chappelier J and Bunke H (2006). Offline Grammar-Based Recognition of Handwritten Sentences, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:5, (818-821), Online publication date: 1-May-2006.
  593. Humm A, Hennebert J and Ingold R Gaussian mixture models for CHASM signature verification Proceedings of the Third international conference on Machine Learning for Multimodal Interaction, (102-113)
  594. Vaufreydaz D, Emonet R and Reignier P A lightweight speech detection system for perceptive environments Proceedings of the Third international conference on Machine Learning for Multimodal Interaction, (336-345)
  595. Macho D, Nadeu C and Temko A Robust speech activity detection in interactive smart-room environments Proceedings of the Third international conference on Machine Learning for Multimodal Interaction, (236-247)
  596. Zhang B, Dou W and Chen L Combining short and long term audio features for TV sports highlight detection Proceedings of the 28th European conference on Advances in Information Retrieval, (472-475)
  597. Temko A, Malkin R, Zieger C, Macho D, Nadeu C and Omologo M CLEAR evaluation of acoustic event detection and classification systems Proceedings of the 1st international evaluation conference on Classification of events, activities and relationships, (311-322)
  598. Kasprzak W, Okazaki A and Kowalski A ICA-Based speech features in the frequency domain Proceedings of the 6th international conference on Independent Component Analysis and Blind Signal Separation, (609-616)
  599. Bay M and Beauchamp J Harmonic source separation using prestored spectra Proceedings of the 6th international conference on Independent Component Analysis and Blind Signal Separation, (561-568)
  600. Kounoudes A, Antonakoudi A, Kekatos V and Peleties P Combined speech recognition and speaker verification over the fixed and mobile telephone networks Proceedings of the 24th IASTED international conference on Signal processing, pattern recognition, and applications, (228-233)
  601. Antal M and Toderean G Speaker recognition and broad phonetic groups Proceedings of the 24th IASTED international conference on Signal processing, pattern recognition, and applications, (155-159)
  602. Taghva K, Beckley R and Coombs J The effects of OCR error on the extraction of private information Proceedings of the 7th international conference on Document Analysis Systems, (348-357)
  603. Poh N and Bengio S (2006). Database, protocols and tools for evaluating score-level fusion algorithms in biometric authentication, Pattern Recognition, 39:2, (223-233), Online publication date: 1-Feb-2006.
  604. Kim S, Yoon J, Park S and Won J (2006). Shape-based retrieval in time-series databases, Journal of Systems and Software, 79:2, (191-203), Online publication date: 1-Feb-2006.
  605. Xu R, Mei G, Ren Z, Kwan C, Aube J, Rochet C and Stanford V Speaker identification and speech recognition using phased arrays Ambient Intelligence in Everyday Life, (227-238)
  606. Bouchaffra D and Tan J (2006). Structural hidden Markov models, Intelligent Data Analysis, 10:1, (67-79), Online publication date: 1-Jan-2006.
  607. Choi E On compensating the Mel-frequency cepstral coefficients for noisy speech recognition Proceedings of the 29th Australasian Computer Science Conference - Volume 48, (49-54)
  608. Radhakrishnan R, Divakaran A, Xiong Z and Otsuka I (2006). A content-adaptive analysis and representation framework for audio event discovery from "unscripted" multimedia, EURASIP Journal on Advances in Signal Processing, 2006, (191-191), Online publication date: 1-Jan-2006.
  609. Reilly R (2006). An overcomplete signal basis approach to nonlinear time-tone analysis with application to audio and speech processing, EURASIP Journal on Advances in Signal Processing, 2006, (106-106), Online publication date: 1-Jan-2006.
  610. Liu S, Xu M, Yi H, Chia L and Rajan D (2006). Multimodal semantic analysis and annotation for basketball video, EURASIP Journal on Advances in Signal Processing, 2006, (182-182), Online publication date: 1-Jan-2006.
  611. Li F, Ma J and Huang D MFCC and SVM based recognition of chinese vowels Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part II, (812-819)
  612. Veeraraghavan A, Roy-Chowdhury A and Chellappa R (2005). Matching Shape Sequences in Video with Applications in Human Movement Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:12, (1896-1909), Online publication date: 1-Dec-2005.
  613. ACM
    Okabe Y, Saito S and Nakajima M Paintbrush rendering of lines using HMMs Proceedings of the 3rd international conference on Computer graphics and interactive techniques in Australasia and South East Asia, (91-98)
  614. Vallejo A, Nolazco-Flores J, Morales-Menéndez R, Sucar L and Rodríguez C Tool-Wear monitoring based on continuous hidden markov models Proceedings of the 10th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis and Applications, (880-890)
  615. García-Perera L, Nolazco-Flores J and Mex-Perera C Phoneme spotting for speech-based crypto-key generation Proceedings of the 10th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis and Applications, (770-777)
  616. Guerra S, Rodríguez J, Riveron E and Nazuno J Speech recognition using energy parameters to classify syllables in the spanish language Proceedings of the 10th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis and Applications, (161-170)
  617. Furui S (2005). Toward Robust Speech Recognition and Understanding, Journal of VLSI Signal Processing Systems, 41:3, (245-254), Online publication date: 1-Nov-2005.
  618. Zhang J, Sun J and Dai B Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour Proceedings of the First international conference on Affective Computing and Intelligent Interaction, (326-333)
  619. Pao T, Chen Y, Yeh J and Liao W Combining acoustic features for improved emotion recognition in mandarin speech Proceedings of the First international conference on Affective Computing and Intelligent Interaction, (279-285)
  620. Callut J and Dupont P Inducing hidden Markov models to model long-term dependencies Proceedings of the 16th European conference on Machine Learning, (513-521)
  621. ACM
    Zhong L, Sinclair M and Jha N A personal-area network of low-power wireless interfacing devices for handhelds Proceedings of the 7th international conference on Human computer interaction with mobile devices & services, (251-254)
  622. Sumi K, Tanaka K and Matsuyama T Measurement of human concentration with multiple cameras Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part IV, (129-135)
  623. Chetouani M, Hussain A, Faundez-Zanuy M and Gas B Non-linear predictive models for speech processing Proceedings of the 15th international conference on Artificial neural networks: formal models and their applications - Volume Part II, (779-784)
  624. Dupont P, Denis F and Esposito Y (2005). Links between probabilistic automata and hidden Markov models, Pattern Recognition, 38:9, (1349-1371), Online publication date: 1-Sep-2005.
  625. Chen S and Srihari S Use of Exterior Contours and Shape Features in Off-line Signature Verification Proceedings of the Eighth International Conference on Document Analysis and Recognition, (1280-1284)
  626. Xu Y, Zhang C and Lu N A bayesian method for high-frequency restoration of low sample-rate speech Proceedings of the Third international conference on Advances in Pattern Recognition - Volume Part I, (544-552)
  627. Gotou N, Hayashi A and Suematu N Learning with segment boundaries for hierarchical HMMs Proceedings of the Third international conference on Advances in Pattern Recognition - Volume Part I, (538-543)
  628. Pham T Spectral analysis of protein sequences Proceedings of the 4th international conference on Advances in Machine Learning and Cybernetics, (595-604)
  629. ACM
    Carneiro G and Vasconcelos N A database centric view of semantic image annotation and retrieval Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, (559-566)
  630. Ueda N, Aoki-Kinoshita K, Yamaguchi A, Akutsu T and Mamitsuka H (2005). A Probabilistic Model for Mining Labeled Ordered Trees, IEEE Transactions on Knowledge and Data Engineering, 17:8, (1051-1064), Online publication date: 1-Aug-2005.
  631. ACM
    Ren L, Patrick A, Efros A, Hodgins J and Rehg J A data-driven approach to quantifying natural human motion ACM SIGGRAPH 2005 Papers, (1090-1097)
  632. ACM
    Hsu E, Pulli K and Popović J Style translation for human motion ACM SIGGRAPH 2005 Papers, (1082-1089)
  633. Oliver N and Horvitz E A comparison of HMMs and dynamic bayesian networks for recognizing office activities Proceedings of the 10th international conference on User Modeling, (199-209)
  634. Garcia-Salicetti S, Mellakh M, Allano L and Dorizzi B A generic protocol for multibiometric systems evaluation on virtual and real subjects Proceedings of the 5th international conference on Audio- and Video-Based Biometric Person Authentication, (494-502)
  635. Poh N and Bengio S A score-level fusion benchmark database for biometric authentication Proceedings of the 5th international conference on Audio- and Video-Based Biometric Person Authentication, (1059-1070)
  636. Tyagi V, Wellekens C and Bourlard H A variable-scale piecewise stationary spectral analysis technique applied to ASR Proceedings of the Second international conference on Machine Learning for Multimodal Interaction, (274-284)
  637. Pardo B and Birmingham W Modeling form for on-line following of musical performances Proceedings of the 20th national conference on Artificial intelligence - Volume 2, (1018-1023)
  638. Bahi H and Sellami M Neural expert model applied to phonemes recognition Proceedings of the 4th international conference on Machine Learning and Data Mining in Pattern Recognition, (507-515)
  639. Xiang S, Zhang C, Chen X and Lu N A new approach to human motion sequence recognition with application to diving actions Proceedings of the 4th international conference on Machine Learning and Data Mining in Pattern Recognition, (487-496)
  640. Hayashi A, Mizuhara Y and Suematsu N Embedding time series data for classification Proceedings of the 4th international conference on Machine Learning and Data Mining in Pattern Recognition, (356-365)
  641. Nagy N, Zhang X, Nagy G and Schneider E A quantitative categorization of phonemic dialect features in context Proceedings of the 5th international conference on Modeling and Using Context, (326-338)
  642. ACM
    Antoniol G, Rollo V and Venturi G (2005). Linear predictive coding and cepstrum coefficients for mining time variant information from software repositories, ACM SIGSOFT Software Engineering Notes, 30:4, (1-5), Online publication date: 1-Jul-2005.
  643. ACM
    Ren L, Patrick A, Efros A, Hodgins J and Rehg J (2005). A data-driven approach to quantifying natural human motion, ACM Transactions on Graphics, 24:3, (1090-1097), Online publication date: 1-Jul-2005.
  644. ACM
    Hsu E, Pulli K and Popović J (2005). Style translation for human motion, ACM Transactions on Graphics, 24:3, (1082-1089), Online publication date: 1-Jul-2005.
  645. Grenager T, Klein D and Manning C Unsupervised learning of field segmentation models for information extraction Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, (371-378)
  646. ACM
    Oh J and Blowers M Text-independent open-set speaker identification for military missions using genetic rule-based system Proceedings of the 7th annual workshop on Genetic and evolutionary computation, (172-174)
  647. ACM
    Sakurai Y, Yoshikawa M and Faloutsos C FTW Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, (326-337)
  648. Pylvänäinen T Accelerometer based gesture recognition using continuous HMMs Proceedings of the Second Iberian conference on Pattern Recognition and Image Analysis - Volume Part I, (639-646)
  649. Rice S (2005). A survey course on computer audio, Journal of Computing Sciences in Colleges, 20:6, (118-124), Online publication date: 1-Jun-2005.
  650. Ferrer M, Alonso J and Travieso C (2005). Offline Geometric Parameters for Automatic Signature Verification Using Fixed-Point Arithmetic, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:6, (993-997), Online publication date: 1-Jun-2005.
  651. Kocsor A On kernel discriminant analyses applied to phoneme classification Proceedings of the Second international conference on Advances in neural networks - Volume Part II, (357-362)
  652. Galassi U, Giordana A, Saitta L and Botta M Learning profiles based on hierarchical hidden markov model Proceedings of the 15th international conference on Foundations of Intelligent Systems, (47-55)
  653. ACM
    Antoniol G, Rollo V and Venturi G Linear predictive coding and cepstrum coefficients for mining time variant information from software repositories Proceedings of the 2005 international workshop on Mining software repositories, (1-5)
  654. García-Perera L, Nolazco-Flores J and Mex-Perera C Parameter optimization in a text-dependent cryptographic-speech-key generation task Proceedings of the 3rd international conference on Non-Linear Analyses and Algorithms for Speech Processing, (92-99)
  655. Gangashetty S, Sekhar C and Yegnanarayana B Spotting multilingual consonant-vowel units of speech using neural network models Proceedings of the 3rd international conference on Non-Linear Analyses and Algorithms for Speech Processing, (303-317)
  656. Godino-Llorente J, Gómez-Vilda P, Sáenz-Lechón N, Blanco-Velasco M, Cruz-Roldán F and Ferrer-Ballester M Support vector machines applied to the detection of voice disorders Proceedings of the 3rd international conference on Non-Linear Analyses and Algorithms for Speech Processing, (219-230)
  657. Shen J, Shepherd J and Ngu A On efficient music genre classification Proceedings of the 10th international conference on Database Systems for Advanced Applications, (253-264)
  658. Srinivasan S DocWeb Proceedings of the Third IEEE International Conference on Pervasive Computing and Communications Workshops, (153-157)
  659. McCowan I, Gatica-Perez D, Bengio S, Lathoud G, Barnard M and Zhang D (2005). Automatic Analysis of Multimodal Group Actions in Meetings, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:3, (305-317), Online publication date: 1-Mar-2005.
  660. Shou Y, Mamoulis N and Cheung D (2005). Fast and Exact Warping of Time Series Using Adaptive Segmental Approximations, Machine Learning, 58:2-3, (231-267), Online publication date: 1-Feb-2005.
  661. Paczolay D, Felföldi L and Kocsor A (2005). Classifier combination schemes in speech impediment therapy systems, Acta Cybernetica, 17:2, (385-399), Online publication date: 10-Jan-2005.
  662. Gosztolya G and Kocsor A (2005). A hierarchical evaluation methodology in speech recognition, Acta Cybernetica, 17:2, (213-224), Online publication date: 10-Jan-2005.
  663. Alon J, Athitsos V, Yuan Q and Sclaroff S Simultaneous Localization and Recognition of Dynamic Hand Gestures Proceedings of the IEEE Workshop on Motion and Video Computing (WACV/MOTION'05) - Volume 2, (254-260)
  664. Ganapathiraju M, Balakrishnan N, Reddy R and Klein-Seetharaman J Computational biology and language Ambient Intelligence for Scientific Discovery, (25-47)
  665. García-Perera P, Mex-Perera C and Nolazco-Flores J Cryptographic-speech-key generation using the SVM technique over the lp-cepstral speech space Nonlinear Speech Modeling and Applications, (370-374)
  666. Esposito A and Aversano G Text independent methods for speech segmentation Nonlinear Speech Modeling and Applications, (261-290)
  667. Rodas J and Rojo J (2005). Knowledge discovery in repeated very short serial measurements with a blocking factor. Application to a psychiatric domain, International Journal of Hybrid Intelligent Systems, 2:1, (57-87), Online publication date: 1-Jan-2005.
  668. Saastamoinen J, Karpov E, Hautamäki V and Fränti P (2005). Accuracy of MFCC-based speaker recognition in series 60 device, EURASIP Journal on Advances in Signal Processing, 2005, (2816-2827), Online publication date: 1-Jan-2005.
  669. Gu L, Harris J, Shrivastav R and Sapienza C (2005). Disordered speech assessment using automatic methods based on quantitative measures, EURASIP Journal on Advances in Signal Processing, 2005, (1400-1409), Online publication date: 1-Jan-2005.
  670. Dong L, Foo S and Lian Y (2005). A two-channel training algorithm for hidden Markov model and its application to lip reading, EURASIP Journal on Advances in Signal Processing, 2005, (1382-1399), Online publication date: 1-Jan-2005.
  671. Felzenszwalb P and Huttenlocher D (2005). Pictorial Structures for Object Recognition, International Journal of Computer Vision, 61:1, (55-79), Online publication date: 1-Jan-2005.
  672. Tangwongsan S, Po-Aramsri P and Phoophuangpairoj R Highly efficient and effective techniques for thai syllable speech recognition Proceedings of the 9th Asian Computing Science conference on Advances in Computer Science: dedicated to Jean-Louis Lassez on the Occasion of His 5th Cycle Birthday, (259-270)
  673. Xu M, Duan L, Cai J, Chia L, Xu C and Tian Q HMM-Based audio keyword generation Proceedings of the 5th Pacific Rim conference on Advances in Multimedia Information Processing - Volume Part III, (566-574)
  674. ACM
    Wright C, Monrose F and Masson G HMM profiles for network traffic classification Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, (9-15)
  675. ACM
    Zhang D, Gatica-Perez D, Bengio S, McCowan I and Lathoud G Multimodal group action clustering in meetings Proceedings of the ACM 2nd international workshop on Video surveillance & sensor networks, (54-62)
  676. ACM
    Lu L, Wang M and Zhang H Repeating pattern discovery and structure analysis from acoustic music data Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval, (275-282)
  677. ACM
    Radhakrishnan R, Divakaran A and Xiong Z A time series clustering based framework for multimedia mining and summarization using audio features Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval, (157-164)
  678. ACM
    Xu M, Duan L, Chia L and Xu C Audio keyword generation for sports video analysis Proceedings of the 12th annual ACM international conference on Multimedia, (758-759)
  679. ACM
    Tao D, Liu H and Tang X K-BOX Proceedings of the 12th annual ACM international conference on Multimedia, (464-467)
  680. ACM
    Nwe T, Shenoy A and Wang Y Singing voice detection in popular music Proceedings of the 12th annual ACM international conference on Multimedia, (324-327)
  681. ACM
    Suga Y, Kosugi N and Morimoto M Real-time background music monitoring based on content-based retrieval Proceedings of the 12th annual ACM international conference on Multimedia, (120-127)
  682. Alewine N, Ruback H and Deligne S (2004). Pervasive Speech Recognition, IEEE Pervasive Computing, 3:4, (78-81), Online publication date: 1-Oct-2004.
  683. Vemuri S and Bender W (2004). Next-Generation Personal Memory Aids, BT Technology Journal, 22:4, (125-138), Online publication date: 1-Oct-2004.
  684. Di Nunzio G, Ferro N and Orio N Experiments on statistical approaches to compensate for limited linguistic resources Proceedings of the 5th conference on Cross-Language Evaluation Forum: multilingual Information Access for Text, Speech and Images, (60-72)
  685. Bengio S and Bourlard H Multi channel sequence processing Proceedings of the First international conference on Deterministic and Statistical Methods in Machine Learning, (22-36)
  686. Kocsor A and Tóth L (2004). Application of Kernel-Based Feature Space Transformations and Learning Methods to Phoneme Classification, Applied Intelligence, 21:2, (129-142), Online publication date: 1-Sep-2004.
  687. Hsu E, Gentry S and Popović J Example-based control of human motion Proceedings of the 2004 ACM SIGGRAPH/Eurographics symposium on Computer animation, (69-77)
  688. Sato T and Kameya Y Negation elimination for finite PCFGs Proceedings of the 14th international conference on Logic Based Program Synthesis and Transformation, (117-132)
  689. Nakagawa T Chinese and Japanese word segmentation using word-level and character-level information Proceedings of the 20th international conference on Computational Linguistics, (466-es)
  690. GuoDong Z Modeling of long distance context dependency Proceedings of the 20th international conference on Computational Linguistics, (92-es)
  691. ACM
    Agichtein E and Ganti V Mining reference tables for automatic text segmentation Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, (20-29)
  692. Vargas F, Fagundes R, Barros D, Brum D and Rhod E (2004). Merging a DSP-Oriented Signal Integrity Technique and SW-Based Fault Handling Mechanisms to Ensure Reliable DSP Systems, Journal of Electronic Testing: Theory and Applications, 20:4, (397-411), Online publication date: 1-Aug-2004.
  693. ACM
    Bowring J, Rehg J and Harrold M Active learning for automatic classification of software behavior Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis, (195-205)
  694. Kim S, Smyth P and Luther S Modeling waveform shapes with random effects segmental hidden Markov models Proceedings of the 20th conference on Uncertainty in artificial intelligence, (309-316)
  695. Jojic N, Jojic V and Heckerman D Joint discovery of haplotype blocks and complex trait associations from SNP sequences Proceedings of the 20th conference on Uncertainty in artificial intelligence, (286-292)
  696. ACM
    Bowring J, Rehg J and Harrold M (2004). Active learning for automatic classification of software behavior, ACM SIGSOFT Software Engineering Notes, 29:4, (195-205), Online publication date: 1-Jul-2004.
  697. Langer M, Zhang L, Klein A, Bhatia A, Pereira J and Rekhi D A spectral-particle hybrid method for rendering falling snow Proceedings of the Fifteenth Eurographics conference on Rendering Techniques, (217-226)
  698. McCowan I, Gatica-Perez D, Bengio S, Moore D and Bourlard H Towards computer understanding of human interactions Proceedings of the First international conference on Machine Learning for Multimodal Interaction, (56-75)
  699. Oliver N and Horvitz E S-SEER Proceedings of the First international conference on Machine Learning for Multimodal Interaction, (122-135)
  700. ACM
    Kosugi N, Sakurai Y and Morimoto M SoundCompass Proceedings of the 2004 ACM SIGMOD international conference on Management of data, (881-886)
  701. ACM
    Aoki K, Ueda N, Yamaguchi A, Akutsu T, Kanehisa M and Mamitsuka H (2004). Managing and analyzing carbohydrate data, ACM SIGMOD Record, 33:2, (33-38), Online publication date: 1-Jun-2004.
  702. Arifi V, Clausen M, Kurth F and Müller M Score-PCM music synchronization based on extracted score parameters Proceedings of the Second international conference on Computer Music Modeling and Retrieval, (193-210)
  703. Moissinac J, Yvon F and Hazez S Automating indexing of classes and conferences Coupling approaches, coupling media and coupling languages for information retrieval, (885-894)
  704. ACM
    Intille S, Bao L, Tapia E and Rondoni J Acquiring in situ training data for context-aware ubiquitous computing applications Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, (1-8)
  705. Wu C, Chiu Y and Cheng K (2004). Error-Tolerant Sign Retrieval Using Visual Features and Maximum A Posteriori Estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26:4, (495-508), Online publication date: 1-Apr-2004.
  706. Wei J (2004). Markov Edit Distance, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26:3, (311-321), Online publication date: 1-Mar-2004.
  707. Bahlmann C and Burkhardt H (2004). The Writer Independent Online Handwriting Recognition System frog on hand and Cluster Generative Statistical Dynamic Time Warping, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26:3, (299-310), Online publication date: 1-Mar-2004.
  708. Nickel R Robust speaker verification with optimal pitch bases expansions Proceedings of the winter international symposium on Information and communication technologies, (1-6)
  709. Lewis T and Powers D Sensor fusion weighting measures in Audio-Visual Speech Recognition Proceedings of the 27th Australasian conference on Computer science - Volume 26, (305-314)
  710. Jhanwar N and Raina A (2004). Pitch correlogram clustering for fast speaker identification, EURASIP Journal on Advances in Signal Processing, 2004, (2640-2649), Online publication date: 1-Jan-2004.
  711. Yao K and Lee T (2004). Time-varying noise estimation for speech enhancement and recognition using sequential Monte Carlo method, EURASIP Journal on Advances in Signal Processing, 2004, (2366-2384), Online publication date: 1-Jan-2004.
  712. Tran D and Sharma D Automatic gender recognition Proceedings of the 2nd WSEAS International Conference on Electronics, Control and Signal Processing, (1-5)
  713. Wilson D and Martinez T (2003). The general inefficiency of batch training for gradient descent learning, Neural Networks, 16:10, (1429-1451), Online publication date: 1-Dec-2003.
  714. ACM
    Moon Y, Leung C and Pun K Fixed-point GMM-based speaker verification over mobile embedded system Proceedings of the 2003 ACM SIGMM workshop on Biometrics methods and applications, (53-57)
  715. ACM
    Oliver N and Horvitz E Selective perception policies for guiding sensing and computation in multimodal systems Proceedings of the 5th international conference on Multimodal interfaces, (36-43)
  716. ACM
    Melucci M and Orio N A novel method for stemmer generation based on hidden markov models Proceedings of the twelfth international conference on Information and knowledge management, (131-138)
  717. ACM
    Kang H Affective content detection using HMMs Proceedings of the eleventh ACM international conference on Multimedia, (259-262)
  718. ACM
    Lu L and Zhang H Automated extraction of music snippets Proceedings of the eleventh ACM international conference on Multimedia, (140-147)
  719. Miller D and Browning J (2003). A Mixture Model and EM-Based Algorithm for Class Discovery, Robust Classification, and Outlier Rejection in Mixed Labeled/Unlabeled Data Sets, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25:11, (1468-1483), Online publication date: 1-Nov-2003.
  720. ACM
    Krishna R, Mahlke S and Austin T Architectural optimizations for low-power, real-time speech recognition Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems, (220-231)
  721. Yin P, Essa I and Rehg J Boosted Audio-Visual HMM for Speech Reading Proceedings of the IEEE International Workshop on Analysis and Modeling of Faces and Gestures
  722. Pham T Alignment-Free Sequence Comparison with Vector Quantization and Hidden Markov Models Proceedings of the IEEE Computer Society Conference on Bioinformatics
  723. Shimodaira H, Sudo T, Nakai M and Sagayama S On-line Overlaid-Handwriting Recognition Based on Substroke HMMs Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2
  724. Shafiei M and Rabiee H A New On-Line Signature Verification Algorithm Using Variable Length Segmentation and Hidden Markov Models Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
  725. Zou M, Tong J, Liu C and Lou Z On-line Signature Verification Using Local Shape Analysis Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
  726. Jax P and Vary P (2003). On artificial bandwidth extension of telephone speech, Signal Processing, 83:8, (1707-1719), Online publication date: 1-Aug-2003.
  727. ACM
    Li T, Ogihara M and Li Q A comparative study on content-based music genre classification Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, (282-289)
  728. Kovar L and Gleicher M Flexible automatic motion blending with registration curves Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation, (214-224)
  729. Baillie M and Jose J Audio-based event detection for sports video Proceedings of the 2nd international conference on Image and video retrieval, (300-309)
  730. Velivelli A, Ngo C and Huang T Detection of documentary scene changes by audio-visual fusion Proceedings of the 2nd international conference on Image and video retrieval, (227-238)
  731. Mellody M, Bartsch M and Wakefield G (2003). Analysis of Vowels in Sung Queries for a Music Information Retrieval System, Journal of Intelligent Information Systems, 21:1, (35-52), Online publication date: 1-Jul-2003.
  732. Cohen I, Sebe N, Garg A, Chen L and Huang T (2003). Facial expression recognition from video sequences, Computer Vision and Image Understanding, 91:1-2, (160-187), Online publication date: 1-Jul-2003.
  733. Jin A, Samad S and Hussain A Theoretic evidence k-nearest neighbourhood classifiers in a bimodal biometric verification system Proceedings of the 4th international conference on Audio- and video-based biometric person authentication, (778-786)
  734. Kale A, Cuntoor N, Yegnanarayana B, Rajagopalan A and Chellappa R Gait analysis for human identification Proceedings of the 4th international conference on Audio- and video-based biometric person authentication, (706-714)
  735. Fan N and Rosca J Enhanced VQ-based algorithms for speech independent speaker identification Proceedings of the 4th international conference on Audio- and video-based biometric person authentication, (470-477)
  736. Leung C and Moon Y Effect of window size and shift period in mel-warped cepstral feature extraction on GMM-based speaker verification Proceedings of the 4th international conference on Audio- and video-based biometric person authentication, (438-445)
  737. Kim J, Paek H, Chung C, Hwang J and Lee W On the extraction of the valid speech-sound by the merging algorithm with the discrete wavelet transform Proceedings of the 1st international conference on Computational science: Part I, (314-322)
  738. Huang C and Wang H (2003). Bandwidth-adjusted LPC analysis for robust speech recognition, Pattern Recognition Letters, 24:9-10, (1583-1587), Online publication date: 1-Jun-2003.
  739. Koerich A, Sabourin R and Suen C (2003). Large vocabulary off-line handwriting recognition, Pattern Analysis & Applications, 6:2, (97-121), Online publication date: 1-Jun-2003.
  740. Kim J, Paek H, Chung C, Yim W and Lee S The merging algorithm for an extraction of valid speech-sounds Proceedings of the 2003 international conference on Computational science and its applications: Part II, (599-606)
  741. Lane T and Brodley C (2003). An Empirical Study of Two Approaches to Sequence Learning for Anomaly Detection, Machine Learning, 51:1, (73-107), Online publication date: 1-Apr-2003.
  742. Kin-Pong Chan F, Wai-chee Fu A and Yu C (2003). Haar Wavelets for Efficient Similarity Search of Time-Series, IEEE Transactions on Knowledge and Data Engineering, 15:3, (686-705), Online publication date: 1-Mar-2003.
  743. Munich M and Perona P (2003). Visual Identification by Signature Tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25:2, (200-217), Online publication date: 1-Feb-2003.
  744. Vargas F, Fagundes R and Barros D (2003). A New On-Line Robust Approach to Design Noise-Immune Speech Recognition Systems, Journal of Electronic Testing: Theory and Applications, 19:1, (61-72), Online publication date: 1-Feb-2003.
  745. Blumberg B D-Learning Exploring artificial intelligence in the new millennium, (37-67)
  746. Allen J Speech recognition and synthesis Encyclopedia of Computer Science, (1664-1667)
  747. Srihari S and Govindaraju V Pattern recognition Encyclopedia of Computer Science, (1375-1382)
  748. Mouchtaris A, Narayanan S and Kyriakakis C (2003). Virtual microphones for multichannel audio resynthesis, EURASIP Journal on Advances in Signal Processing, 2003, (968-979), Online publication date: 1-Jan-2003.
  749. Fish R, Ostendorf M, Bernard G and Castanon D (2003). Multilevel Classification of Milling Tool Wear with Confidence Estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25:1, (75-85), Online publication date: 1-Jan-2003.
  750. ACM
    Sundaram H, Xie L and Chang S A utility framework for the automatic generation of audio-visual skims Proceedings of the tenth ACM international conference on Multimedia, (189-198)
  751. ACM
    Chen H, Zheng N, Liang L, Li Y, Xu Y and Shum H PicToon Proceedings of the tenth ACM international conference on Multimedia, (171-178)
  752. ACM
    Vaton S and Gravey A Iterative Bayesian estimation of network traffic matrices in the case of bursty flows Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurement, (89-90)
  753. Xiong Z, Chen Y, Wang R and Huang T Improved Information Maximization based Face and Facial Feature Detection from Real-time Video and Application in a Multi-Modal Person Identification System Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  754. Oliver N, Horvitz E and Garg A Layered Representations for Human Activity Recognition Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  755. Jacobs R, Jiang W and Tanner M (2002). Factorial hidden Markov models and the generalized backfitting algorithm, Neural Computation, 14:10, (2415-2437), Online publication date: 1-Oct-2002.
  756. Principe J, Euliano N and Garani S (2002). Principles and networks for self-organization in space-time, Neural Networks, 15:8-9, (1069-1083), Online publication date: 1-Oct-2002.
  757. Aversano G and Esposito A Automatic Parameter Estimation for a Context-Independent Speech Segmentation Algorithm Proceedings of the 5th International Conference on Text, Speech and Dialogue, (293-300)
  758. Kueng T and Su K A robust cross-style bilingual sentences alignment model Proceedings of the 19th international conference on Computational linguistics - Volume 1, (1-7)
  759. Keogh E Exact indexing of dynamic time warping Proceedings of the 28th international conference on Very Large Data Bases, (406-417)
  760. ACM
    Michael C and Ghosh A (2002). Simple, state-based approaches to program-based anomaly detection, ACM Transactions on Information and System Security, 5:3, (203-237), Online publication date: 1-Aug-2002.
  761. Jiang X and Ser W (2002). Online Fingerprint Template Improvement, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:8, (1121-1126), Online publication date: 1-Aug-2002.
  762. ACM
    Blumberg B, Downie M, Ivanov Y, Berlin M, Johnson M and Tomlinson B Integrated learning for interactive synthetic characters Proceedings of the 29th annual conference on Computer graphics and interactive techniques, (417-426)
  763. ACM
    Hu N and Dannenberg R A comparison of melodic database retrieval techniques using sung queries Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, (301-307)
  764. Kikui G and Yamamoto H Finding translation pairs from English-Japanese untokenized aligned corpora Proceedings of the ACL-02 workshop on Speech-to-speech translation: algorithms and systems - Volume 7, (23-30)
  765. Nakagawa T, Kudo T and Matsumoto Y Revision learning and its application to part-of-speech tagging Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, (497-504)
  766. Tang M, Luo X and Roukos S Active learning for statistical natural language parsing Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, (120-127)
  767. Chen Y, Gao W, Zhu T and Ling C (2002). Learning Prosodic Patterns for Mandarin Speech Synthesis, Journal of Intelligent Information Systems, 19:1, (95-109), Online publication date: 1-Jul-2002.
  768. Watanabe T, Sugawara K and Sugihara H (2002). A New Pattern Representation Scheme Using Data Compression, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:5, (579-590), Online publication date: 1-May-2002.
  769. Khorsheed M (2002). Off-Line Arabic Character Recognition --- A Review, Pattern Analysis & Applications, 5:1, (31-45), Online publication date: 1-May-2002.
  770. Bugatti A, Flammini A and Migliorati P (2002). Audio classification in speech and music, EURASIP Journal on Advances in Signal Processing, 2002:4, (372-378), Online publication date: 1-Apr-2002.
  771. ACM
    Kim S, Yoon J, Park S and Kim T Shape-based retrieval of similar subsequences in time-series databases Proceedings of the 2002 ACM symposium on Applied computing, (438-445)
  772. Munich M and Perona P (2002). Visual Input for Pen-Based Computers, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:3, (313-328), Online publication date: 1-Mar-2002.
  773. Sugawara T, Miyanaga Y and Yoshida N A Design of Analog C-Matrix Circuits used for Signal/Data Processing Proceedings of the 2002 Asia and South Pacific Design Automation Conference
  774. Lewis T and Powers D (2002). Audio-visual speech recognition using red exclusion and neural networks, Australian Computer Science Communications, 24:1, (149-156), Online publication date: 1-Jan-2002.
  775. Lewis T and Powers D Audio-visual speech recognition using red exclusion and neural networks Proceedings of the twenty-fifth Australasian conference on Computer science - Volume 4, (149-156)
  776. Bugatti A, Flammini A and Migliorati P (2002). Audio classification in speech and music, EURASIP Journal on Advances in Signal Processing, 2002:1, (372-378), Online publication date: 1-Jan-2002.
  777. Gordan M, Kotropoulos C and Pitas I (2002). A support vector machine-based dynamic network for visual speech recognition applications, EURASIP Journal on Advances in Signal Processing, 2002:1, (1248-1259), Online publication date: 1-Jan-2002.
  778. Aleksic P, Williams J, Wu Z and Katsaggelos A (2002). Audio-visual speech recognition using MPEG-4 compliant visual features, EURASIP Journal on Advances in Signal Processing, 2002:1, (1213-1227), Online publication date: 1-Jan-2002.
  779. Zhang X, Broun C, Mersereau R and Clements M (2002). Automatic speechreading with applications to human-computer interfaces, EURASIP Journal on Advances in Signal Processing, 2002:1, (1228-1247), Online publication date: 1-Jan-2002.
  780. Nefian A, Liang L, Pi X, Liu X and Murphy K (2002). Dynamic Bayesian networks for audio-visual speech recognition, EURASIP Journal on Advances in Signal Processing, 2002:1, (1274-1288), Online publication date: 1-Jan-2002.
  781. Yang M, Kriegman D and Ahuja N (2002). Detecting Faces in Images, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:1, (34-58), Online publication date: 1-Jan-2002.
  782. Nam J and Tewfik A (2002). Event-Driven Video Abstraction and Visualization, Multimedia Tools and Applications, 16:1-2, (55-77), Online publication date: 1-Jan-2002.
  783. Zhu Y, De Silva L and Ko C (2002). Using moment invariants and HMM in facial expression recognition, Pattern Recognition Letters, 23:1-3, (83-91), Online publication date: 1-Jan-2002.
  784. Lu G (2001). Indexing and Retrieval of Audio, Multimedia Tools and Applications, 15:3, (269-290), Online publication date: 1-Dec-2001.
  785. ACM
    Kakumanu P, Gutierrez-Osuna R, Esposito A, Bryll R, Goshtasby A and Garcia O Speech driven facial animation Proceedings of the 2001 workshop on Perceptive user interfaces, (1-5)
  786. ACM
    Park S, Kim S, Cho J and Padmanabhan S Prefix-querying Proceedings of the tenth international conference on Information and knowledge management, (255-262)
  787. ACM
    Naphade M, Wang R and Huang T Supporting audiovisual query using dynamic programming Proceedings of the ninth ACM international conference on Multimedia, (411-420)
  788. ACM
    Li Y, Yu F, Xu Y, Chang E and Shum H Speech-driven cartoon animation with emotions Proceedings of the ninth ACM international conference on Multimedia, (365-371)
  789. ACM
    Pfeiffer S Pause concepts for audio segmentation at different semantic levels Proceedings of the ninth ACM international conference on Multimedia, (187-193)
  790. Zheng Y, Lin Z and Tay D (2001). State-dependent vector hybrid linear and nonlinear ARMA modeling, Circuits, Systems, and Signal Processing, 20:5, (551-574), Online publication date: 1-Sep-2001.
  791. Choi K, Luo Y and Hwang J (2001). Hidden Markov Model Inversion for Audio-to-Visual Conversion in an MPEG-4 Facial Animation System, Journal of VLSI Signal Processing Systems, 29:1-2, (51-61), Online publication date: 1-Aug-2001.
  792. Schwenker F, Kestler H and Palm G Unsupervised and supervised learning in radial-basis-function networks Self-Organizing neural networks, (217-243)
  793. Wilson A and Bobick A Hidden Markov models for modeling and recognizing gesture under variation Hidden Markov models, (123-160)
  794. Lee J, Kim J and Kim J Data-driven design of HMM topology for online handwriting recognition Hidden Markov models, (107-121)
  795. ACM
    Borkar V, Deshmukh K and Sarawagi S (2001). Automatic segmentation of text into structured records, ACM SIGMOD Record, 30:2, (175-186), Online publication date: 1-Jun-2001.
  796. Alatan A, Akansu A and Wolf W (2001). Multi-Modal Dialog Scene Detection Using Hidden Markov Models for Content-Based Multimedia Indexing, Multimedia Tools and Applications, 14:2, (137-151), Online publication date: 1-Jun-2001.
  797. ACM
    Borkar V, Deshmukh K and Sarawagi S Automatic segmentation of text into structured records Proceedings of the 2001 ACM SIGMOD international conference on Management of data, (175-186)
  798. ACM
    Park S, Kim S and Chu W Segment-based approach for subsequence searches in sequence databases Proceedings of the 2001 ACM symposium on Applied computing, (248-252)
  799. ACM
    Nelson L, Bly S and Sokoler T Quiet calls Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, (174-181)
  800. ACM
    Nakamura K, Zhu Q, Maruoke S, Horiyama T, Kimura S and Watanabe K Speech recognition chip for monosyllables Proceedings of the 2001 Asia and South Pacific Design Automation Conference, (396-399)
  801. ACM
    Nakamura K, Zhu Q, Maruoka S, Horiyama T, Kimura S and Watanabe K A real-time 64-monosyllable recognition LSI with learning mechanism Proceedings of the 2001 Asia and South Pacific Design Automation Conference, (31-32)
  802. Park S and Chu W (2001). Discovering and Matching Elastic Rules from Sequence Databases, Fundamenta Informaticae, 47:1-2, (75-90), Online publication date: 1-Jan-2001.
  803. Saul L and Rahim M (2000). Markov Processes on Curves, Machine Learning, 41:3, (345-363), Online publication date: 1-Dec-2000.
  804. ACM
    Wang H and Chen B Content-based language models for spoken document retrieval Proceedings of the fifth international workshop on Information retrieval with Asian languages, (149-155)
  805. Carreira-Perpiñán M (2000). Mode-Finding for Mixtures of Gaussian Distributions, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:11, (1318-1323), Online publication date: 1-Nov-2000.
  806. Chen Y and Wang J (2000). Segmentation of Single- or Multiple-Touching Handwritten Numeral String Using Background and Foreground Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:11, (1304-1317), Online publication date: 1-Nov-2000.
  807. Kovács-Vajna Z (2000). A Fingerprint Verification System Based on Triangular Matching and Dynamic Time Warping, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:11, (1266-1276), Online publication date: 1-Nov-2000.
  808. ACM
    Rui Y, Gupta A and Acero A Automatically extracting highlights for TV Baseball programs Proceedings of the eighth ACM international conference on Multimedia, (105-115)
  809. ACM
    Sundaram H and Chang S Determining computable scenes in films and their structures using audio-visual memory models Proceedings of the eighth ACM international conference on Multimedia, (95-104)
  810. Wechsler M, Munteanu E and Schäuble P (2000). New Approaches to Spoken Document Retrieval, Information Retrieval, 3:3, (173-188), Online publication date: 1-Oct-2000.
  811. Yi B and Faloutsos C Fast Time Sequence Indexing for Arbitrary Lp Norms Proceedings of the 26th International Conference on Very Large Data Bases, (385-394)
  812. North B, Blake A, Isard M and Rittscher J (2000). Learning and Classification of Complex Dynamics, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:9, (1016-1034), Online publication date: 1-Sep-2000.
  813. ACM
    Keogh E and Pazzani M Scaling up dynamic time warping for datamining applications Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, (285-289)
  814. Ivanov Y and Bobick A (2000). Recognition of Visual Activities and Interactions by Stochastic Parsing, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:8, (852-872), Online publication date: 1-Aug-2000.
  815. Giese M and Poggio T (2000). Morphable Models for the Analysis and Synthesis of Complex Motion Patterns, International Journal of Computer Vision, 38:1, (59-73), Online publication date: 30-Jun-2000.
  816. Hu J, Kashi R and Wilfong G (2000). Comparison and Classification of Documents Based on Layout Similarity, Information Retrieval, 2:2-3, (227-243), Online publication date: 1-May-2000.
  817. Boreczky J, Foote J, Girgensohn A and Wilcox L Interactive similarity search for video browsing and retrieval Content-Based Multimedia Information Access - Volume 1, (637-648)
  818. Amini M, Zaragoza H and Gallinari P Learning for sequence extraction tasks Content-Based Multimedia Information Access - Volume 1, (476-490)
  819. Bett M, Gross R, Yu H, Zhu X, Pan Y, Yang J and Waibel A Multimodal meeting tracker Content-Based Multimedia Information Access - Volume 1, (32-45)
  820. Li X, Parizeau M and Plamondon R (2000). Training Hidden Markov Models with Multiple Observations-A Combinatorial Method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:4, (371-377), Online publication date: 1-Apr-2000.
  821. Cong L, Asghar S and Cong B Robust Speech Recognition Using Neural Networks and Hidden Markov Models Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'00)
  822. ACM
    Yang J, Zhu X, Gross R, Kominek J, Pan Y and Waibel A Multimodal people ID for a multimedia meeting browser Proceedings of the seventh ACM international conference on Multimedia (Part 1), (159-168)
  823. ACM
    Foote J Visualizing music and audio using self-similarity Proceedings of the seventh ACM international conference on Multimedia (Part 1), (77-80)
  824. ACM
    Nam J and Tewfik A Dynamic video summarization and visualization Proceedings of the seventh ACM international conference on Multimedia (Part 2), (53-56)
  825. ACM
    Singhal A and Pereira F Document expansion for speech retrieval Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, (34-41)
  826. Phillips S and Rogers A (1999). Parallel Speech Recognition, International Journal of Parallel Programming, 27:4, (257-288), Online publication date: 1-Aug-1999.
  827. Pavlovic V, Frey B and Huang T Variational learning in mixed-state dynamic graphical models Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, (522-530)
  828. Neufeld E (1999). Review of "Statistical methods for speech recognition" by Frederick Jelinek. The MIT Press 1997., Computational Linguistics, 25:2, (297-298), Online publication date: 1-Jun-1999.
  829. ACM
    Johnson M, Wilson A, Blumberg B, Kline C and Bobick A Sympathetic interfaces Proceedings of the SIGCHI conference on Human Factors in Computing Systems, (152-158)
  830. Lee D and Seung H Learning in intelligent embedded systems Proceedings of the Workshop on Embedded Systems, (9-9)
  831. Machine Translation staff (1998). A Controlled Skip Parser, Machine Translation, 13:1, (1-15), Online publication date: 1-Oct-1998.
  832. Akbar M and Caelen J Parole et traduction automatique Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1, (36-40)
  833. Mohri M and Pereira F Dynamic compilation of weighted context-free grammars Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 2, (891-897)
  834. ACM
    Wechsler M, Munteanu E and Schäuble P New techniques for open-vocabulary spoken document retrieval Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, (20-27)
  835. Isard M and Blake A (1998). CONDENSATION - Conditional Density Propagation for Visual Tracking, International Journal of Computer Vision, 29:1, (5-28), Online publication date: 1-Aug-1998.
  836. Thomas I, Zukerman I and Raskutti B Extracting phoneme pronunciation information from corpora Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning, (175-183)
  837. Fung P and McKeown K (1998). A Technical Word- and Term-Translation Aid Using Noisy Parallel Corpora across Language Groups, Machine Translation, 12:1/2, (53-87), Online publication date: 1-Jan-1998.
  838. Stolcke A (1997). Linguistic Knowledge and Empirical Methods in Speech Recognition, AI Magazine, 18:4, (25-31), Online publication date: 1-Dec-1997.
  839. Thomas I, Zukerman I, Oliver J, Albrecht D and Raskutti B Lexical access for speech understanding using minimum message length encoding Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, (464-471)
  840. Kearns M, Mansour Y and Ng A An information-theoretic analysis of hard and soft assignment methods for clustering Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, (282-293)
  841. ACM
    Korn F, Jagadish H and Faloutsos C (1997). Efficiently supporting ad hoc queries in large datasets of time sequences, ACM SIGMOD Record, 26:2, (289-300), Online publication date: 1-Jun-1997.
  842. ACM
    Korn F, Jagadish H and Faloutsos C Efficiently supporting ad hoc queries in large datasets of time sequences Proceedings of the 1997 ACM SIGMOD international conference on Management of data, (289-300)
  843. Ha T (1997). The Optimum Class-Selective Rejection Rule, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19:6, (608-615), Online publication date: 1-Jun-1997.
  844. Van Hulle M (1997). The formation of topographic maps that maximize the average mutual information of the output responses to noiseless input signals, Neural Computation, 9:3, (595-606), Online publication date: 1-Apr-1997.
  845. ACM
    Buchsbaum A and Giancarlo R (1997). Algorithmic aspects in speech recognition, ACM Journal of Experimental Algorithmics, 2, (1-es), Online publication date: 1-Jan-1997.
  846. Hu J, Brown M and Turin W (1996). HMM Based On-Line Handwriting Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18:10, (1039-1045), Online publication date: 1-Oct-1996.
  847. Kim Y, Park Y and Chun J A Dynamic Indexing Structure for Searching Time-Series Patterns Proceedings of the 20th Conference on Computer Software and Applications
  848. ACM
    Liang R and Ouhyoung M A sign language recognition system using hidden markov model and context sensitive search Proceedings of the ACM Symposium on Virtual Reality Software and Technology, (59-66)
  849. ACM
    Aref W, Barbará D and Vallabhaneni P The handwritten trie Proceedings of the 1995 ACM SIGMOD international conference on Management of data, (151-162)
  850. ACM
    Aref W, Barbará D and Vallabhaneni P (1995). The handwritten trie, ACM SIGMOD Record, 24:2, (151-162), Online publication date: 22-May-1995.
  851. ACM
    Kimber D, Wilcox L, Chen F and Moran T Speaker segmentation for browsing recorded audio Conference Companion on Human Factors in Computing Systems, (212-213)
  852. ACM
    Bradford J (1995). The human factors of speech-based interfaces, ACM SIGCHI Bulletin, 27:2, (61-67), Online publication date: 1-Apr-1995.
  853. ACM
    Hemphill C and Thrift P Surfing the Web by voice Proceedings of the third ACM international conference on Multimedia, (215-222)
  854. Kupiec J, Kimber D and Balasubramanian V Speech-based retrieval using semantic co-occurrence filtering Proceedings of the workshop on Human Language Technology, (373-377)
  855. Pham T Measures of spatial distortion using kriging 2016 IEEE 8th International Conference on Intelligent Systems (IS), (438-442)
  856. Pham T The multiple-point variogram of images for robust texture classification 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (1303-1307)
  857. Prätzlich T, Driedger J and Müller M Memory-restricted multiscale dynamic time warping 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (569-573)
Contributors
  • University of California, Santa Barbara
  • Georgia Institute of Technology

Reviews

James H. Bradford

The authors' goal in writing this book is set out in the preface: “…the fundamental goal of the book would be to provide a theoretically sound, technically accurate, and reasonably complete description of the basic knowledge and ideas that constitute a modern system for speech recognition by machine” (p. xxxi). The authors did not achieve their goal, because of careless writing and poor editing.

The book is divided into nine chapters. Each chapter addresses a different aspect of what might be termed the engineering issues of speech recognition. Chapter 1 provides a short description of the structure and content of the remainder of the book. The substance begins in chapter 2. Chapter 2 deals with the production, perception, and acoustics of speech. Problems arise within the first few pages. In Figure 2.5, the “Glottal Volume Velocity” of a typical speaker is illustrated. Neither the figure nor the associated text offers any definition of glottal volume velocity, however, or mentions its significance. Here, as in many other parts of the book, the authors seem to have included material simply because they knew it and not because it would be of any use to their readers. I was left with the impression that many figures throughout the book had been cut from other work and pasted into the text. The result is an unhappy collage of poorly related material.

The authors have failed to adequately understand their readership, and this constitutes a serious problem. At many points throughout the book, I was left wondering for whom this book was written. For example, in Section 2.5, Hopfield artificial neural networks are described. Hopfield nets are generally considered to be a moderately advanced topic, yet the material is presented in a single paragraph. It is not clear what a reader unfamiliar with the field would learn from this coverage.

Another precept of good writing is to avoid forward references whenever possible.
This book goes beyond normal violation of this guideline: forward references exist but are never made explicit to the reader. For example, the term “melscale” is used in Figure 2.50 on page 64, but the definition of “melscale” does not appear until page 78.

Chapter 3 provides a thorough description of the basic techniques of preprocessing speech to provide suitable input to the recognition algorithms. The description includes material on filter banks, linear predictive coding, and vector quantization. Chapter 4 provides extensive coverage of the various kinds of similarity (or distortion) measures that can be used to classify patterns in speech signals. Section 4.7 gives a useful description of the dynamic time warping algorithm that is fundamental to classical speech recognition. Chapter 5 provides a wealth of practical guidelines (supported by empirical studies) on how to assemble the various distortion measures and clustering techniques to produce a practical speech recognition system. Section 5.7, on speech recognition under adverse conditions, is interesting, useful, and readable.

Arguably the most important technique of modern speech recognition, hidden Markov models (HMMs), is covered in chapter 6. Much of this chapter consists of a highly informative tutorial on HMMs that is based on an earlier paper by Rabiner [1]. This chapter contains the single most irritating mistake in the book. On page 339, the authors claim the following: “Using $\gamma_t(i)$, we can solve for the individually most likely state at time $t$, as $q_t^* = \arg\min_{1 \le i \le N} \gamma_t(i)$, $1 \le t \le T$.” I puzzled over this equation for some time before going back to Rabiner's original paper [1]. In fact, the equation should have been “$q_t^* = \arg\max_{1 \le i \le N} \gamma_t(i)$, $1 \le t \le T$.” This kind of mistake creates endless difficulty for a reader who is being exposed to the subject for the first time.
An error in a key formula not only misleads the reader, it is apt to undermine the reader's confidence in all of the hundreds of formulas found throughout the text. For someone who does not have the knowledge and confidence that derive from familiarity with the material, the question arises: “Which of these many formulas are right, and which are wrong?” This leads to my most important point: scientific publication is worse than useless if the authors do not take the care to get it right.

Chapter 7 is a straightforward extension of previous material to address connected word recognition (classical speech recognition deals with disconnected words, that is, words surrounded by short periods of silence). Chapter 8 gives an overview of some of the problems encountered when large-vocabulary speech recognition is attempted. Chapter 9 concludes the book by describing areas in which speech recognition has been successfully applied. This brisk and readable chapter brings this unfortunate book to a close.

There is no doubt that the authors know their material. Indeed, it is hardly an exaggeration to say that they discovered much of it. But it is not enough to be a good researcher. The authors of a book must also be good communicators with a clear conception of their readership. This book fails in two senses. The structure and organization have many problems. Undefined terms, unmentioned forward references, and inappropriate graphics can be found throughout. The second and perhaps greater failure is that the authors have lost track of their prospective readers. The result is a disappointing book of very limited value.
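
To illustrate why the review's correction matters, here is a minimal sketch of the corrected rule. It is not taken from the book, the review, or Rabiner's paper. In the standard HMM formulation, $\gamma_t(i)$ is the posterior probability of being in state $i$ at time $t$ given the whole observation sequence, so the individually most likely state is the one that maximizes it; minimizing it would select the least likely state. The two-state toy model, its parameter values, and the function names `forward_backward` and `most_likely_states` are all assumptions made for illustration.

```python
def forward_backward(pi, A, B, obs):
    """Return gamma[t][i] = P(state i at time t | obs) for a discrete HMM.

    pi[i]   : initial state probabilities
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    obs     : list of observed symbol indices
    """
    N, T = len(pi), len(obs)

    # Forward pass: alpha[t][i] = P(o_1..o_t, q_t = i)
    alpha = [[0.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            incoming = sum(alpha[t - 1][i] * A[i][j] for i in range(N))
            alpha[t][j] = incoming * B[j][obs[t]]

    # Backward pass: beta[t][i] = P(o_{t+1}..o_T | q_t = i)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N)
            )

    # gamma is the normalized product alpha * beta at each time step
    gamma = []
    for t in range(T):
        joint = [alpha[t][i] * beta[t][i] for i in range(N)]
        total = sum(joint)
        gamma.append([x / total for x in joint])
    return gamma


def most_likely_states(gamma):
    """q*_t = argmax over i of gamma_t(i), computed independently for each t."""
    return [max(range(len(row)), key=row.__getitem__) for row in gamma]


# Hypothetical toy model: 2 states, 2 output symbols (values are illustrative).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1]

gamma = forward_backward(pi, A, B, obs)
states = most_likely_states(gamma)
print(states)  # [0, 0, 1]
```

With only two states, replacing `max` with `min` in `most_likely_states` would flip every choice in this example. This sketch omits the scaling that a practical implementation would need to avoid numerical underflow on long sequences.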


Recommendations