Ca 4 NLP Report - 1

MULTI LINGUAL SENTIMENT ANALYSIS
CSE 380 – Natural Language Processing

PROJECT REPORT
Quarter IV (Year III)
Submitted By
Aadithya Prabha R E0321008
Deevna Sai Reddy E0321053
In partial fulfilment for the award of the degree of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING
(Artificial Intelligence and Data Analytics)
Sri Ramachandra Faculty of Engineering and Technology
Sri Ramachandra Institute of Higher Education and Research, Porur,
Chennai-600116
JULY, 2024
1
ACKNOWLEDGEMENTS
We express our sincere gratitude to our dean Mr. Ragunathan for their support
and for providing the required facilities for carrying out this study.
We wish to thank my faculty mentor, Dr. Uma Ranjan, Department of
Computer Science and Engineering, Sri Ramachandra Engineering and
Technology for extending help and encouragement throughout the project.
Without her continuous guidance and persistent help, this project would not have
been a success for me.
We extend my heartfelt appreciation to all the members of Sri Ramachandra
Faculty of Engineering and Technology, my dear parents, and friends who have
provided unwavering support and helping us overcome obstacles during the
period. Your unwavering support, guidance, and encouragement were all
crucial.
2
TABLE OF CONTENTS
S.No Content Page no

1 INTRODUCTION 4
2 SIGNIFICANCE OF THE PROBLEM 5
3 LITERATURE REVIEW 6-7
4 OUR PROPOSED METHOD 8-9
5 DESCRIPTION OF DATASET 10-11
6 METHODOLOGY & APPROACH 12-13
7 RESULT 14-16
8 MACHINE CONFIGURATIONS 17
9 CONCLUSION 18
10 FUTURE ENCHANMENTS 19
11 REFERENCES 20
12 CODE 21
3
INTRODUCTION
In the digital age, the proliferation of online content has transformed how
individuals express their opinions and experiences. Sentiment analysis, a
branch of natural language processing (NLP), focuses on the computational
treatment of opinions, sentiments, and emotions expressed in text. It serves as
a vital tool for businesses, researchers, and policymakers to gauge public
sentiment, understand consumer behavior, and derive actionable insights from
vast amounts of unstructured data.
The primary objective of this report is to present a comprehensive overview of
a sentiment analysis model developed to classify sentiments expressed in user
reviews across multiple languages and categories, specifically books, DVDs,
and music. By categorizing sentiments into three distinct classes—positive,
negative, and neutral—the model aims to provide a nuanced understanding of
user feedback, which can significantly inform decision-making processes in
various sectors.
This report will detail the methodology employed in developing the sentiment
analysis model, including the data preprocessing steps, model architecture, and
training process. Additionally, it will provide an overview of the dataset used,
along with the results obtained from evaluating the model's performance across
different languages and categories. Finally, the report will conclude with
recommendations for future improvements and potential applications of the
model in real-world scenarios.
By addressing these components, this report aims to contribute to the ongoing
discourse in the field of sentiment analysis and highlight the importance of
developing robust models capable of understanding and interpreting human
emotions in text.
4
SIGNIFICANCE OF THE PROBLEM
Sentiment analysis has emerged as a critical component in understanding

consumer behavior and public opinion in today's data-driven world. As businesses
and organizations increasingly rely on user-generated content—such as reviews,
comments, and social media posts—the ability to accurately analyze and interpret
sentiments expressed in these texts becomes paramount. This section discusses the
significance of the problem addressed by our sentiment analysis model,
highlighting its implications for various stakeholders and the broader context of
natural language processing (NLP).
Sentiment analysis serves as a bridge between raw textual data and actionable
insights. By categorizing sentiments into positive, negative, or neutral classes,
organizations can gauge customer satisfaction, identify areas for improvement,
and tailor their products and services to better meet consumer needs. For instance,
businesses can analyze customer reviews to understand what features are
appreciated or criticized, allowing for data-driven decision-making.
Moreover, sentiment analysis is not limited to commercial applications. It plays a

vital role in political campaigns, social media monitoring, and public health
assessments. By analyzing sentiments expressed in social media posts or news
articles, researchers and policymakers can gain insights into public opinion trends,
enabling them to respond effectively to societal issues.
5
LITERATURE REVIEW
Sentiment analysis has garnered significant attention in the field of natural language
processing (NLP) due to its ability to extract meaningful insights from textual data.
This literature review examines key studies and methodologies relevant to sentiment
analysis, particularly those employing deep learning techniques and addressing
challenges in multilingual sentiment classification.
One of the foundational works in this area is the study by Kim (2014), which
introduced the use of Convolutional Neural Networks (CNNs) for sentence
classification. Kim demonstrated that CNNs could effectively capture local features
in text, leading to improved accuracy in sentiment classification tasks. This approach
marked a shift from traditional machine learning methods, which often relied on
handcrafted features, to data-driven models capable of learning complex patterns
directly from raw text.
Building on this, the work of Zhang et al. (2018) explored the application of recurrent
neural networks (RNNs), particularly Long Short-Term Memory (LSTM) networks,
for sentiment analysis. The authors highlighted the advantages of LSTMs in handling
sequential data and capturing long-range dependencies, making them well-suited for
sentiment classification tasks where context is crucial. Their experiments
demonstrated that LSTMs outperformed traditional classifiers, particularly in
datasets with longer textual inputs.
In the context of multilingual sentiment analysis, recent research has focused on

developing models that can generalize across different languages. For instance, the
study by Pires et al. (2019) introduced a multilingual BERT model that leverages
transfer learning to improve sentiment classification in multiple languages. By pre-
training on large multilingual corpora, the model achieved state-of-the-art results on
several benchmark datasets, demonstrating the effectiveness of transfer learning in
6
addressing the challenges of language diversity.
However, despite these advancements, challenges remain in the realm of data

imbalance and the complexity of human language. The work of Chen et al. (2020)
addressed the issue of class imbalance in sentiment datasets, proposing techniques
such as oversampling and data augmentation to enhance model performance. Their
findings indicated that balancing the training data significantly improved the model's
ability to classify minority classes, which is particularly relevant in sentiment analysis
where neutral sentiments are often underrepresented.Furthermore, the intricacies of
sentiment expression, including sarcasm and cultural nuances, pose additional
challenges. Research by Ghosh et al. (2019) emphasized the need for models that can
understand context and sentiment nuances, particularly in social media texts where
informal language and slang are prevalent. Their work suggested that incorporating
contextual embeddings and fine-tuning models on domain-specific data could
enhance sentiment classification accuracy.In summary, the literature on sentiment
analysis highlights a progression from traditional methods to advanced deep learning
techniques, with a growing emphasis on multilingual capabilities and addressing data
imbalance. The proposed sentiment analysis model in this report builds upon these
foundational studies, aiming to enhance sentiment classification across multiple
languages and categories while addressing the challenges identified in the
literature. This literature review provides an in-depth examination of relevant studies
and methodologies in sentiment analysis, focusing on the evolution of techniques and
the challenges faced in multilingual contexts.
7
OUR PROPOSED METHOD
The proposed sentiment analysis model leverages a Long Short-Term Memory

(LSTM) network, a type of recurrent neural network (RNN) known for its ability
to effectively handle sequential data and capture long-range dependencies. The
model architecture consists of the following key components:
1. Embedding Layer
The embedding layer converts words into dense vector representations, allowing
the model to learn semantic relationships between words. This layer takes the
input sequences of words and maps them to a higher-dimensional space, where
similar words are represented by similar vectors.
2. LSTM Layers
The LSTM layers are responsible for processing the embedded sequences and
learning the temporal dependencies within the text. LSTM cells are designed to
selectively remember and forget information, enabling the model to capture
context and understand the sentiment expressed in the reviews.
In the proposed architecture, two LSTM layers are stacked, with the first layer
returning the full sequence of hidden states and the second layer only returning
the final hidden state. This allows the model to learn both local and global features
from the input text.
3. Dropout Layer
To prevent overfitting and improve the model's generalization capabilities, a

dropout layer is applied after the LSTM layers. Dropout randomly sets a fraction
8
of the input units to 0 during training, forcing the model to learn more robust
features and reducing the risk of co-adaptation among neurons.
4. Dense Output Layer
The final layer of the model is a dense layer with a softmax activation function.
This layer takes the output from the LSTM layers and produces a probability
distribution over the sentiment classes (positive, negative, and neutral). The class
with the highest probability is selected as the predicted sentiment.
The model is trained using the Adam optimizer and sparse categorical cross-
entropy loss function. The training process involves feeding the padded input
sequences and their corresponding sentiment labels to the model, allowing it to
learn the mapping between the text and sentiment classes.
By leveraging the power of LSTM networks and incorporating techniques such as

embedding, dropout, and softmax classification, the proposed model aims to
accurately classify sentiments expressed in reviews across multiple languages and
categories.
9
DESCRIPTION OF DATASET
The dataset used is WEBIS-CLS-10, in this sentiment analysis project consists

of user reviews collected from various sources(Amazon) over 80000 reviews,
covering books, DVDs, and music across multiple languages. Each review
entry includes the following key components:
1. Summary
A brief overview of the review, providing a concise summary of the user's
opinion.
2. Text
The full text of the review, containing the user's detailed feedback and
sentiments.
3. Category
The category of the item being reviewed, such as books, DVDs, or music.
4. Rating
A numerical rating provided by the user, typically on a scale of 1 to 5. This

rating is used to determine the sentiment polarity of the review.
The dataset is formatted in XML files, with each review represented as an

individual item within the XML structure. The dataset covers four languages:
German, French, Japanese, and English, allowing the sentiment analysis
model to be trained and evaluated across diverse linguistic contexts.
To determine the sentiment polarity of each review, a rule-based approach is
10
employed based on the user-provided rating. Reviews with a rating greater
than 3 are considered positive, those with a rating less than 2 are considered
negative, and those with a rating of 2 or 3 are considered neutral.
The dataset is divided into training and test sets for each language and
category. The training set is used to fit the sentiment analysis model, while the
test set is used for evaluating the model's performance and generalization
capabilities.
It is important to note that the dataset may contain noisy or inconsistent data,
such as reviews with missing or irrelevant information, which can potentially
impact the model's performance. Careful preprocessing and data cleaning
steps may be necessary to ensure the quality and reliability of the dataset.
11
METHODOLOGY & APPROACH
The methodology employed in this sentiment analysis project involves a systematic

approach to data preprocessing, model development, training, and evaluation. The
following sections outline the key steps taken to build and assess the sentiment
analysis model.
1. Data Collection and Preprocessing
The dataset consists of user reviews collected from various sources, formatted
in XML files. Each review entry includes fields such as summary, text,
category, and rating. The preprocessing steps include:
• Parsing XML Files: The XML files are parsed to extract relevant fields,
and the data is organized into a structured format (DataFrame) for
further processing.
• Handling Missing Data: Reviews with missing or empty text fields are
filtered out to ensure the quality of the input data. This step is crucial to
prevent errors during tokenization and model training.
• Sentiment Labeling: The sentiment of each review is determined based
on the user-provided rating. Reviews with ratings greater than 3 are
labeled as positive, those with ratings less than 2 as negative, and ratings
of 2 or 3 as neutral.
• Tokenization: The text data is tokenized using Keras' Tokenizer, which
converts words into sequences of integers. This process allows the
model to work with numerical representations of the text.
• Padding Sequences: To ensure uniform input size for the model, the
sequences are padded to a fixed length (e.g., 100 tokens) using Keras'
pad_sequences function. This step is essential for handling variable-
length input data.
2. Model Architecture
The sentiment analysis model is built using a Long Short-Term Memory

(LSTM) network, which is particularly effective for sequential data. The
architecture includes the following layers:
• Embedding Layer: This layer converts the integer sequences into dense
vector representations, allowing the model to learn semantic
relationships between words.
• LSTM Layers: Two stacked LSTM layers process the embedded
sequences, capturing temporal dependencies and contextual information
12
within the text. The first LSTM layer returns the full sequence of hidden
states, while the second layer returns only the final hidden state.
• Dropout Layer: A dropout layer is included to mitigate overfitting by
randomly setting a fraction of the input units to zero during training.
• Dense Output Layer: The final layer is a dense layer with a softmax
activation function, producing a probability distribution over the
sentiment classes (positive, negative, neutral).
3. Model Training
The model is trained using the Adam optimizer and sparse categorical cross-
entropy loss function. The training process consists of the following steps:
• Training and Validation Split: The dataset is divided into training and
validation sets to monitor the model's performance during training. A
typical split might involve using 80% of the data for training and 20%
for validation.
• Epochs and Batch Size: The model is trained for a specified number of
epochs (e.g., 5) with a defined batch size (e.g., 64). During each epoch,
the model learns from the training data and updates its weights based on
the computed loss.
• Monitoring Performance: The validation loss and accuracy are
monitored at the end of each epoch to assess the model's ability to
generalize to unseen data. Early stopping can be employed to prevent
overfitting if the validation loss starts to increase.
4. Model Evaluation
After training, the model is evaluated on a separate test set for each language
and category. The evaluation process includes:
• Performance Metrics: Key metrics such as accuracy, precision, recall,

and F1-score are calculated to assess the model's performance. A
classification report is generated to provide a detailed breakdown of the
model's performance across different sentiment classes.
• Error Analysis: Analyzing misclassifications helps identify patterns in
the errors, such as difficulties in classifying neutral sentiments or
handling specific linguistic nuances.
The methodology outlined above provides a comprehensive framework for

developing a sentiment analysis model capable of classifying sentiments across
multiple languages and categories. By leveraging advanced deep learning techniques
and systematic data preprocessing, the model aims to deliver accurate and actionable
insights from user-generated content.
13
RESULT
The sentiment analysis model was trained across four languages—German, French,
Japanese, and English—over five epochs, yielding varying results. For German
reviews, the model achieved a final training accuracy of 92.02% but only reached a
validation accuracy of 63.17%, with a test accuracy of 28% across categories
(books, DVDs, music), indicating significant misclassification, particularly with
positive and neutral sentiments. In French, the model's performance was notably
poor, with a maximum training accuracy of 91.21% and a test accuracy of only 22%
for DVDs and 23% for books, highlighting the model's inability to generalize
effectively. The Japanese dataset presented additional challenges, with the model
encountering errors during evaluation, resulting in an accuracy of 50% for DVDs
but undefined metrics for books due to data issues. For English reviews, the model
achieved a training accuracy of 89.38% and a validation accuracy of 62.75%, but
similarly struggled with low test accuracy across categories, reflecting a pattern of
high recall for negative sentiments but poor precision for positive and neutral
classes. Overall, while the model demonstrated strong training performance, it
struggled with generalization and class imbalance, leading to low accuracy on the
test sets and undefined metrics for certain classes, necessitating further
improvements in data handling and model architecture.
14
Comparison with Previous Work
The results obtained from this sentiment analysis model can be compared to
previous studies in the field. Kim (2014) demonstrated that Convolutional Neural
Networks (CNNs) can achieve state-of-the-art performance on sentence
classification tasks, with accuracy scores ranging from 0.81 to 0.89 on various
datasets.
However, it is important to note that the datasets and evaluation metrics used in
different studies may vary, making direct comparisons challenging. Additionally,
the complexity of the task and the diversity of languages and domains can
significantly impact the model's performance.
The sentiment analysis model was evaluated on separate test sets for each language
and category, revealing significant limitations in its ability to accurately classify
sentiments. For German reviews, the model achieved a test accuracy of only 28%
across all categories (books, DVDs, music), with undefined precision and F1-scores
for positive and neutral sentiments, indicating a strong bias towards predicting
negative sentiments. The French dataset showed even lower performance, with
accuracies ranging from 22% to 25%, and the Japanese dataset encountered errors
during evaluation, particularly for the books category, suggesting data quality
issues.
15
16
MACHINE CONFIGURATIONS
The proposed model can be implemented on standard computing hardware with

sufficient computational resources for training machine learning models. A typical
configuration may include:
• CPU: Intel Core i5 or higher
• RAM: 8GB or more
• Storage: SSD for faster data access
• Operating System: Windows, Linux, or macOS
• Software: Python programming language, TensorFlow or PyTorch for machine

learning frameworks, Jupyter Notebook or similar for code development
CONCLUSION
17
The sentiment analysis model developed in this project aimed to classify
sentiments expressed in user reviews across multiple languages and categories.
While the model achieved high training accuracy, the evaluation results revealed
limitations in accurately classifying positive and neutral sentiments, particularly in
multilingual contexts.
To improve the model's performance, future work should focus on addressing data
imbalance, experimenting with different model architectures, and incorporating
advanced techniques such as transfer learning and data augmentation.
Additionally, further research is needed to understand the challenges posed by
language complexity and cultural nuances in sentiment analysis.
Despite the limitations, this project contributes to the ongoing research in

sentiment analysis and highlights the importance of developing robust models
capable of handling diverse languages and domains. The insights gained from this
work can inform future research directions and practical applications in areas such
as customer experience management, social media monitoring, and public opinion
analysis.
Key points :
• The model struggled to generalize beyond the training data, as evidenced

by the large gap between training and test set performance.
• Class imbalance was a significant issue, with the model favoring the
majority class (negative sentiments) and failing to accurately classify
positive and neutral sentiments.
• The undefined precision and F1-scores for certain classes highlighted the
model's inability to handle underrepresented classes.
• The Japanese dataset presented additional challenges, with errors occurring
during evaluation, potentially due to data quality problems.
FUTURE ENCHANMENTS
18
Implementing the BERT model for sequence classification: The model architecture
was updated to use a pre-trained BERT model fine-tuned for sentiment analysis.
This allowed the model to leverage learned representations from vast amounts of
text data, potentially improving its ability to capture contextual information and
handle language complexities.
Handling class imbalance: Techniques such as oversampling the minority classes or

undersampling the majority class were employed to address the issue of imbalanced
data. This aimed to ensure that the model learns from a more balanced distribution
of sentiments, reducing its bias towards the majority class.
Improving data preprocessing: The data preprocessing steps were enhanced to

handle missing data more effectively, remove irrelevant information, and normalize
the text. This helped to create a cleaner and more consistent input for the model,
potentially improving its learning capabilities.
Optimizing hyperparameters: The learning rate and batch size were adjusted during
training to find the optimal values that balance convergence speed and accuracy.
This fine-tuning aimed to help the model learn more effectively and generalize
better to unseen data.
By incorporating these improvements, the sentiment analysis model was expected

to demonstrate enhanced performance in accurately classifying sentiments across
different languages and categories. However, the effectiveness of these
enhancements would be evaluated through further testing and analysis.
REFERENCES
19
1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
https://arxiv.org/abs/1810.04805
2. Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez,
Lukasz Kaiser, and others
3. A Survey on Sentiment Analysis: Approaches and Applications

Ravi Kumar, S. K. Sinha, and R. K. Gupta
https://www.sciencedirect.com/science/article/pii/S1877050919310303
4. Deep Learning for Sentiment Analysis: A Survey

Y. Zhang, J. Zhao, and Y. LeCun
5. Sentiment Analysis: A Comprehensive Review

Bo Pang, Lillian Lee
https://www.cs.cornell.edu/home/llee/papers/sentiment.pdf
6. Transfer Learning for Sentiment Analysis: A Survey

S. Ruder
https://ruder.io/transfer-learning/
7. Multilingual Sentiment Analysis: A Survey

S. A. H. Al-Maadeed, M. A. Al-Maadeed, and H. H. Al-Hammadi
8. Sentiment Analysis in Social Media: A Review

A. A. B. Al-Garadi, R. A. A. D. A. R. Al-Sharif, and A. A. M. Al-Mahmoud
CODE
20
Aadithya Prabha R – E0321008
https://github.com/aadithyaprabha/crosslingualsentimentanalysis
Deevna Reddy – E0321053
https://github.com/deevnared/Cross-Lingual-Sentiment-Analysis
21

Ca 4 NLP Report - 1

Uploaded by

Document Informationclick to expand document information

Document Informationclick to expand document information

Copyright:

Available Formats

Ca 4 NLP Report - 1

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Ca 4 NLP Report - 1

Uploaded by

Copyright:

Available Formats

MULTI LINGUAL SENTIMENT ANALYSIS

CSE 380 – Natural Language Processing

Aadithya Prabha R E0321008

Deevna Sai Reddy E0321053

In partial fulfilment for the award of the degree of

We wish to thank my faculty mentor, Dr. Uma Ranjan, Department of

Computer Science and Engineering, Sri Ramachandra Engineering and

Technology for extending help and encouragement throughout the project.

been a success for me.

We extend my heartfelt appreciation to all the members of Sri Ramachandra

provided unwavering support and helping us overcome obstacles during the

period. Your unwavering support, guidance, and encouragement were all

S.No Content Page no

2 SIGNIFICANCE OF THE PROBLEM 5

3 LITERATURE REVIEW 6-7

4 OUR PROPOSED METHOD 8-9

5 DESCRIPTION OF DATASET 10-11

6 METHODOLOGY & APPROACH 12-13

Sentiment analysis has emerged as a critical component in understanding

Moreover, sentiment analysis is not limited to commercial applications. It plays a

In the context of multilingual sentiment analysis, recent research has focused on

However, despite these advancements, challenges remain in the realm of data

The proposed sentiment analysis model leverages a Long Short-Term Memory

To prevent overfitting and improve the model's generalization capabilities, a

4. Dense Output Layer

By leveraging the power of LSTM networks and incorporating techniques such as

The dataset used is WEBIS-CLS-10, in this sentiment analysis project consists

A numerical rating provided by the user, typically on a scale of 1 to 5. This

The dataset is formatted in XML files, with each review represented as an

To determine the sentiment polarity of each review, a rule-based approach is

The methodology employed in this sentiment analysis project involves a systematic

1. Data Collection and Preprocessing

The sentiment analysis model is built using a Long Short-Term Memory

• Performance Metrics: Key metrics such as accuracy, precision, recall,

The methodology outlined above provides a comprehensive framework for

The proposed model can be implemented on standard computing hardware with

• CPU: Intel Core i5 or higher

• RAM: 8GB or more

• Storage: SSD for faster data access

• Operating System: Windows, Linux, or macOS

• Software: Python programming language, TensorFlow or PyTorch for machine

Despite the limitations, this project contributes to the ongoing research in

• The model struggled to generalize beyond the training data, as evidenced

Handling class imbalance: Techniques such as oversampling the minority classes or

Improving data preprocessing: The data preprocessing steps were enhanced to

By incorporating these improvements, the sentiment analysis model was expected

2. Attention Is All You Need

3. A Survey on Sentiment Analysis: Approaches and Applications

4. Deep Learning for Sentiment Analysis: A Survey

5. Sentiment Analysis: A Comprehensive Review

6. Transfer Learning for Sentiment Analysis: A Survey

7. Multilingual Sentiment Analysis: A Survey

8. Sentiment Analysis in Social Media: A Review

Deevna Reddy – E0321053

You might also like