Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents
Nowadays, many people use social media to express their opinions on a variety of topics. Opinion Mining, or Sentiment Analysis, techniques extract opinions from user-generated content. Over the years, a multitude of Sentiment Analysis studies ...
Online Handwritten Gurmukhi Words Recognition: An Inclusive Study
Identification of offline and online handwritten words is a challenging and complex task. In comparison to Latin and Oriental scripts, research on word-level handwriting recognition in Indic scripts is in its initial phase. The two ...
Co-occurrence Weight Selection in Generation of Word Embeddings for Low Resource Languages
This study aims to improve the performance of word embeddings by proposing a new weighting scheme for co-occurrence counting. The idea behind this new family of weights is to overcome the disadvantage of distantly appearing word pairs, which are indeed ...
On the Usage of a Classical Arabic Corpus as a Language Resource: Related Research and Key Challenges
This article presents a literature review of computer-science-related research applied to hadith, a kind of Arabic narration that appeared in the 7th century. We study and compare existing work in several fields of Natural Language Processing (NLP), ...
Multitask Pointer Network for Korean Dependency Parsing
Dependency parsing is a fundamental problem in natural language processing. We introduce a novel dependency-parsing framework called head-pointing-based dependency parsing. In this framework, we cast the Korean dependency parsing problem as a ...
Unsupervised Joint PoS Tagging and Stemming for Agglutinative Languages
The number of possible word forms is theoretically infinite in agglutinative languages. This brings up the out-of-vocabulary (OOV) issue for part-of-speech (PoS) tagging in agglutinative languages. Since inflectional morphology does not change the PoS ...
A Survey of Discourse Representations for Chinese Discourse Annotation
A key element in computational discourse analysis is the design of a formal representation for the discourse structure of a text. With machine learning being the dominant method, it is important to identify a discourse representation that can be used to ...
A Survey of Opinion Mining in Arabic: A Comprehensive System Perspective Covering Challenges and Advances in Tools, Resources, Models, Applications, and Visualizations
- Gilbert Badaro,
- Ramy Baly,
- Hazem Hajj,
- Wassim El-Hajj,
- Khaled Bashir Shaban,
- Nizar Habash,
- Ahmad Al-Sallab,
- Ali Hamdi
Opinion mining, or sentiment analysis, continues to gain interest in industry and academia. While there has been significant progress in developing models for sentiment analysis, the field remains an active area of research for many languages across the ...
Automatic Diacritics Restoration for Tunisian Dialect
Modern Standard Arabic, as well as the Arabic dialects, is usually written without diacritics. The absence of these marks constitutes a real problem in the automatic processing of these data by NLP tools. Indeed, writing Arabic without diacritics ...
Identifying and Analyzing Different Aspects of English-Hindi Code-Switching in Twitter
Code-switching, or the juxtaposition of linguistic units from two or more languages in a single utterance, has in recent times become very common in text, thanks to social media and other computer-mediated forms of communication. In this exploratory ...
A Comparative Analysis on Hindi and English Extractive Text Summarization
Text summarization is the process of condensing a large body of document information into a clear and concise form. In this article, we present a detailed comparative study of various extractive methods for automatic text summarization on Hindi and English ...
Regularizing Output Distribution of Abstractive Chinese Social Media Text Summarization for Improved Semantic Consistency
Abstractive text summarization is a highly difficult problem, and the sequence-to-sequence model has shown success in improving the performance on the task. However, the generated summaries are often inconsistent with the source content in semantics. In ...
Leveraging Additional Resources for Improving Statistical Machine Translation on Asian Low-Resource Languages
Phrase-based machine translation (MT) systems require large bilingual corpora for training. Nevertheless, such large bilingual corpora are unavailable for most language pairs in the world, causing a bottleneck for the development of MT. For the Asian ...
Converting Dependency Structure Into Persian Phrase Structure
A treebank is an important and useful resource in natural language processing, represented in one of two annotation schemas: phrase structure or dependency structure. There are many works that convert a phrase structure into a dependency structure and ...