Maghrebian dialect recognition based on support vector machines and neural network classifiers

Mohamed Hassine¹,
Lotfi Boussaid² &
Hassani Messaoud¹

Abstract

This paper investigates the feed-forward back-propagation neural network (FFBPNN) and the support vector machine (SVM) for classifying two Maghrebian dialects: Tunisian and Moroccan. The dialect used by Moroccan speakers is called “La Darijja” and that of Tunisian speakers is called “Darija”. An automatic speech recognition system is implemented to identify the ten Arabic digits (zero to nine). The system consists of two phases: feature extraction, using a variety of popular hybrid techniques, and classification, using the FFBPNN and the SVM separately. In the experiments, the recognition rate reached 98.3 % with the FFBPNN and 97.5 % with the SVM.
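The classification phase can be illustrated with a minimal sketch of a feed-forward network trained by back-propagation. This is not the authors' implementation: the synthetic two-class "feature vectors" stand in for the extracted speech features, and the class name `TinyFFBPNN`, the helper names, the network size, the learning rate and the number of epochs are all assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def make_toy_features(n_per_class, dim, rng):
    """Synthetic stand-in for extracted speech features (NOT real data).

    Class 0 is centred at -1 and class 1 at +1 in every dimension.
    """
    data = []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(centre, 0.5) for _ in range(dim)], label))
    rng.shuffle(data)
    return data


class TinyFFBPNN:
    """One hidden layer, sigmoid units, trained with per-sample back-propagation."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        # Cross-entropy loss with a sigmoid output gives this output error term.
        delta_out = y - target
        # Propagate the error back to the hidden layer (using w2 before its update).
        delta_h = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(self.w2, h)]
        for j, hj in enumerate(h):
            self.w2[j] -= lr * delta_out * hj
        self.b2 -= lr * delta_out
        for j, dj in enumerate(delta_h):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


def train_and_score(seed=0, epochs=100, lr=0.2):
    """Train on toy data and return accuracy on a held-out toy set."""
    rng = random.Random(seed)
    train = make_toy_features(40, 4, rng)
    test = make_toy_features(20, 4, rng)
    net = TinyFFBPNN(n_in=4, n_hidden=3, rng=rng)
    for _ in range(epochs):
        for x, target in train:
            net.train_step(x, target, lr)
    correct = sum(net.predict(x) == label for x, label in test)
    return correct / len(test)


if __name__ == "__main__":
    print("held-out accuracy on toy data:", train_and_score())
```

The accuracy printed here reflects only the easy synthetic data and says nothing about the recognition rates reported in the paper. In a real system the toy generator would be replaced by feature vectors extracted from recorded digits.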

Author information

Authors and Affiliations

Laboratory of Automatic Signal and Image Processing, National School of Engineers of Monastir, University of Monastir, 5019, Monastir, Tunisia
Mohamed Hassine & Hassani Messaoud
Laboratory of Electronics and Microelectronics, Faculty of Science of Monastir, University of Monastir, 5019, Monastir, Tunisia
Lotfi Boussaid

Corresponding author

Correspondence to Lotfi Boussaid.

About this article

Cite this article

Hassine, M., Boussaid, L. & Messaoud, H. Maghrebian dialect recognition based on support vector machines and neural network classifiers. Int J Speech Technol 19, 687–695 (2016). https://doi.org/10.1007/s10772-016-9360-6

Received: 18 March 2016
Accepted: 08 August 2016
Published: 16 August 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s10772-016-9360-6
