DOI:10.1145/3368756.3369088
research-article

Arabic text diacritization: overview and solution

Published: 02 October 2019

Abstract

Recurrent neural networks and statistical models continue to give better results across scientific fields, and Arabic text diacritization is of paramount importance for many Arabic language processing tasks. In this paper, we therefore present an overview of systems that address the problem. We also propose an automatic diacritic restoration system for Arabic texts, based on an approach that combines a Long Short-Term Memory (LSTM) network with AlKhalil Morpho Sys 2.
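
The abstract does not describe the data representation, so the following is only a minimal sketch of one common way to frame diacritic restoration: as per-character sequence labeling, where the input is the undiacritized text and the target for each character is the string of diacritics attached to it. All function names here are hypothetical, the sketch is not the authors' system, and neither the LSTM nor AlKhalil Morpho Sys 2 is invoked; it only illustrates the kind of input/target pairs such a tagger could be trained on.

```python
# Arabic diacritic marks (tashkeel): U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = frozenset(chr(cp) for cp in range(0x064B, 0x0653))


def strip_diacritics(text: str) -> str:
    """Return the text with all diacritic marks removed (the model input)."""
    return "".join(ch for ch in text if ch not in DIACRITICS)


def to_training_pairs(diacritized: str) -> list[tuple[str, str]]:
    """Split diacritized text into (base character, attached diacritics) pairs.

    A diacritic that appears before any base character is dropped.
    """
    pairs: list[tuple[str, str]] = []
    for ch in diacritized:
        if ch in DIACRITICS:
            if pairs:
                base, marks = pairs[-1]
                pairs[-1] = (base, marks + ch)
        else:
            pairs.append((ch, ""))
    return pairs


def restore(bare: str, labels: list[str]) -> str:
    """Re-attach one label (possibly empty) to each character of the bare text."""
    return "".join(ch + marks for ch, marks in zip(bare, labels))


if __name__ == "__main__":
    word = "\u0643\u064e\u062a\u064e\u0628\u064e"  # kataba, fully diacritized
    pairs = to_training_pairs(word)
    bare = strip_diacritics(word)
    labels = [marks for _, marks in pairs]
    # Round trip: gold labels applied to the bare text give back the original.
    print(restore(bare, labels) == word)
```

In this framing a trained model would predict `labels` from `bare`; here the gold labels are reused only to show that the representation is lossless for well-formed input.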

Cited By

  • (2022) How Much Does Lookahead Matter for Disambiguation? Partial Arabic Diacritization Case Study. Computational Linguistics 48:4 (1103-1123). DOI:10.1162/coli_a_00456. Online publication date: 1-Dec-2022
  • (2021) Automatic Methods and Neural Networks in Arabic Texts Diacritization: A Comprehensive Survey. IEEE Access 9 (145012-145032). DOI:10.1109/ACCESS.2021.3122977. Online publication date: 2021

Published In

SCA '19: Proceedings of the 4th International Conference on Smart City Applications
October 2019
788 pages
ISBN:9781450362894
DOI:10.1145/3368756

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. LSTM
  2. Arabic diacritization
  3. hybrid
  4. machine learning
  5. recurrent neural network

Qualifiers

  • Research-article

Conference

SCA2019

Acceptance Rates

Overall Acceptance Rate 183 of 487 submissions, 38%
