Q-Learning for Shift-Reduce Parsing in Indonesian Tree-LSTM-Based Text Generation
Tree-LSTM algorithm accommodates tree structure processing to extract information outside the linear sequence pattern. The use of Tree-LSTM in text generation problems requires the help of an external parser at each generation iteration. Developing a good ...
Chinese EmoBank: Building Valence-Arousal Resources for Dimensional Sentiment Analysis
An increasing amount of research has recently focused on dimensional sentiment analysis that represents affective states as continuous numerical values on multiple dimensions, such as valence-arousal (VA) space. Compared to the categorical approach that ...
Dual Discriminator GAN: Restoring Ancient Yi Characters
In China, ancient Yi books are seriously damaged. Due to the lack of ancient Yi experts, the restoration of ancient Yi books is progressing very slowly. Artificial intelligence has been successful in the fields of image and text processing, so it is feasible for ...
Hypernymy Detection for Low-resource Languages: A Study for Hindi, Bengali, and Amharic
Numerous attempts at hypernymy relation (e.g., dog “is-a” animal) detection have been made for resource-rich languages like English, whereas efforts for low-resource languages are scarce, primarily due to the lack of gold-standard datasets and suitable ...
Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation
In the present study, we propose novel sequence-to-sequence pre-training objectives for low-resource neural machine translation (NMT): Japanese-specific sequence to sequence (JASS) for language pairs involving Japanese as the source or target language, and ...
Arabic Word Sense Disambiguation for Information Retrieval
In the context of using semantic resources for information retrieval, the relationship and distance between concepts are considered important for word sense disambiguation. In this article, we experiment with Conceptual Density and Random Walk with graph ...
Emotion Recognition with Conversational Generation Transfer
Emotion recognition in conversation is one of the essential tasks of natural language processing. However, this task’s annotation data is insufficient since such data is hard to collect and annotate. Meanwhile, there is large-scale data for conversational ...
Chinese Event Extraction via Graph Attention Network
Event extraction plays an important role in natural language processing (NLP) applications, including question answering and information retrieval. Most previous state-of-the-art methods lacked the ability to capture long-range features. ...
Interactive Gated Decoder for Machine Reading Comprehension
Owing to the availability of various large-scale Machine Reading Comprehension (MRC) datasets, building an effective model to extract passage spans for question answering has been well studied in previous works. However, in reality, there are some ...
Investigating the Effect of Preprocessing Arabic Text on Offensive Language and Hate Speech Detection
Preprocessing of input text can play a key role in text classification by reducing dimensionality and removing unnecessary content. This study aims to investigate the impact of preprocessing on Arabic offensive language classification. We explore six ...
A Lemmatizer for Low-resource Languages: WSD and Its Role in the Assamese Language
The morphological variations of highly inflected languages that appear in a text impede the progress of computer processing and root word determination tasks while extracting an abstract. As a remedy to this difficulty, a lemmatization algorithm is ...
Arabic Fake News Detection: A Fact Checking Based Deep Learning Approach
Fake news stories can polarize society, particularly during political events. They undermine confidence in the media in general. Current NLP systems still lack the ability to properly interpret and classify Arabic fake news. Given the high stakes ...
Text-to-Speech Synthesis: Literature Review with an Emphasis on Malayalam Language
Text-to-Speech Synthesis (TTS) is an active area of research to generate synthetic speech from underlying text. The identified syllables are uttered with proper duration and prosody characteristics to emulate natural speech. It falls under the category of ...
Multi-domain Spoken Language Understanding Using Domain- and Task-aware Parameterization
Spoken language understanding (SLU) has been addressed as a supervised learning problem, where a set of training data is available for each domain. However, annotating data for a new domain can be both financially costly and non-scalable. One existing ...
Word Sense Disambiguation using Cooperative Game Theory and Fuzzy Hindi WordNet based on ConceptNet
Natural language is fuzzy in nature. The fuzziness of the Hindi language was captured in the Fuzzy Hindi WordNet (FHWN). FHWN assigned membership values to fuzzy relationships by consulting experts from various domains. However, these membership values need ...
Konkani WordNet: Corpus-Based Enhancement using Crowdsourcing
Konkani is one of the languages included in the eighth schedule of the Indian constitution. It is the official language of Goa and is spoken mainly in Goa and some places in Karnataka and Kerala. Konkani WordNet or Konkani Shabdamalem (kōṁkanī śabdamālēṁ) ...
Mulan: A Multiple Residual Article-Wise Attention Network for Legal Judgment Prediction
Legal judgment prediction (LJP) is used to predict judgment results based on the description of individual legal cases. In order to be more suitable for actual application scenarios in which the case has cited multiple articles and has multiple charges, ...
Handwritten New Tai Lue Character Recognition Using Convolutional Prior Features and Deep Variationally Sparse Gaussian Process Modeling
New Tai Lue is widely used in Southwest China and Southeast Asia. Hence, it is important to study related handwritten character recognition. Considering the many similar characters in handwritten New Tai Lue, this paper proposes an offline handwritten New ...
Word Level Script Identification Using Convolutional Neural Network Enhancement for Scenic Images
Script identification from complex and colorful images is an integral part of the text recognition and classification system. Such images may contain twofold challenges: (1) Challenges related to the camera like blurring effect, non-uniform illumination ...
Combining a Novel Scoring Approach with Arabic Stemming Techniques for Arabic Chatbots Conversation Engine
Arabic is recognized as one of the main languages around the world. Many efforts have been made to provide computing solutions to support the language. Developing Arabic chatbots is still an evolving research field and requires extra efforts ...