Abstract
Parsing Arabic corpora is an important task that aims to improve understanding of the Arabic language, to enrich and enhance electronic resources, and to increase the efficiency of natural language applications such as translation and recognition. In this paper, we propose a parsing approach for Arabic sentences, in particular nominal ones. To do this, we first study the typology of the Arabic nominal sentence. We then develop a set of rules that generate the different forms of nominal sentence. After that, we present our parsing approach, which is based on transducers and on our tag set. In addition, we transform the recursive graph of transducers into a transducer cascade to reduce complexity. Finally, we present the implementation and experimentation of our approach on the NooJ platform. The obtained results are satisfactory.
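To make the idea of a transducer cascade concrete, the following is a minimal sketch in Python and not the NooJ grammars used in the paper. It assumes a toy tag set (DET, N, ADJ, NP, NOMINAL_SENT), two illustrative rewrite rules, and a transliterated example sentence, all of which are hypothetical and stand in for the paper's actual tag set and rules. Each stage scans the sequence once and groups a fixed pattern of labels under a new label; running the stages in a fixed order replaces recursive calls between graphs.

```python
from typing import Callable, List, Tuple

# A node is (label, children). Leaves hold the word form as their only child.
Node = Tuple[str, list]
Stage = Callable[[List[Node]], List[Node]]


def make_stage(pattern: List[str], label: str) -> Stage:
    """Build one cascade stage that groups every non-overlapping
    occurrence of `pattern` (a sequence of labels) under `label`."""
    def stage(seq: List[Node]) -> List[Node]:
        out: List[Node] = []
        i = 0
        n = len(pattern)
        while i < len(seq):
            window = seq[i:i + n]
            if [node[0] for node in window] == pattern:
                out.append((label, window))  # wrap the matched nodes
                i += n
            else:
                out.append(seq[i])           # copy the node unchanged
                i += 1
        return out
    return stage


def run_cascade(stages: List[Stage], seq: List[Node]) -> List[Node]:
    """Apply the stages in order; each stage consumes the previous output."""
    for stage in stages:
        seq = stage(seq)
    return seq


# Hypothetical rules: DET N -> NP, then NP ADJ -> NOMINAL_SENT.
stages = [
    make_stage(["DET", "N"], "NP"),
    make_stage(["NP", "ADJ"], "NOMINAL_SENT"),
]

# Transliterated toy input, already segmented and tagged (assumed).
tokens = [("DET", ["al"]), ("N", ["walad"]), ("ADJ", ["mujtahid"])]

parsed = run_cascade(stages, tokens)
```

Here `parsed` holds a single NOMINAL_SENT node whose children are the NP built by the first stage and the ADJ leaf. Because every stage is a single left-to-right pass, the cost of the cascade grows with the number of stages rather than with the depth of recursion.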
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Hammouda, N.G., Haddar, K. (2018). Arabic NooJ Parser: Nominal Sentence Case. In: Mbarki, S., Mourchid, M., Silberztein, M. (eds) Formalizing Natural Languages with NooJ and Its Natural Language Processing Applications. NooJ 2017. Communications in Computer and Information Science, vol 811. Springer, Cham. https://doi.org/10.1007/978-3-319-73420-0_6
DOI: https://doi.org/10.1007/978-3-319-73420-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73419-4
Online ISBN: 978-3-319-73420-0
eBook Packages: Computer Science, Computer Science (R0)