
Sentiment Analysis of Tweets using SVM

Article in International Journal of Computer Applications · November 2017


DOI: 10.5120/ijca2017915758

2017 Citation:
Ahmad, Munir & Aftab, Shabib & Ali, Iftikhar. (2017). Sentiment Analysis of Tweets using SVM. International Journal of Computer Applications. 177.
975-8887. 10.5120/ijca2017915758.
International Journal of Computer Applications (0975 – 8887)
Volume 177 – No.5, November 2017

Sentiment Analysis of Tweets using SVM

Munir Ahmad, Department of Computer Science, Virtual University of Pakistan
Shabib Aftab, Department of Computer Science, Virtual University of Pakistan
Iftikhar Ali, Department of Computer Science, Virtual University of Pakistan

ABSTRACT
The views and feedback of a community have always been among the most valuable resources for companies and organizations. With social media now in widespread use, it has become possible to analyze and evaluate aspects of public opinion for which organizations previously had to rely on unconventional, time-consuming and error-prone methods. This kind of analysis falls under the domain of "sentiment analysis", which covers the classification of user-generated text into defined polarities. Several tools and algorithms are available for sentiment detection and analysis, including supervised machine learning algorithms that classify the target corpus after being trained on training data, lexical techniques that classify on the basis of a dictionary-based annotated corpus, and hybrid tools that combine machine learning and lexicon-based algorithms. In this paper we have used Support Vector Machine (SVM) for sentiment analysis in Weka. SVM is one of the widely used supervised machine learning algorithms for textual polarity detection. To analyze the performance of SVM, two pre-classified datasets of tweets are used, and for comparative analysis three measures are used: Precision, Recall and F-Measure. Results are shown in the form of tables and graphs.

Keywords
Polarity Detection, Sentiment Analysis, Opinion Mining, Data Mining, Data Classification, Machine Learning, Support Vector Machine, SVM

1. INTRODUCTION
The need for effective and efficient text mining tools and techniques is increasing nowadays due to the staggering amount of textual data. This data grows day by day because of social networking websites (Facebook, Twitter etc.). Organizations can benefit greatly from mining the sentiments and polarity of this massive amount of information and reviews. With sentiment analysis, organizations can take effective measures to maintain and improve their place in the market by assessing which products or services require improvement, which prices the majority is unsatisfied with, and what new features the community wants. Three techniques are mostly discussed in the literature for sentiment analysis: lexicon based, machine learning based, and hybrid [1], [9], [10], [18]. The lexicon-based approach relies on a predefined dictionary that holds the weight of words and their sentiment orientation in order to determine the sentiment inclination of textual data. It classifies text using this fixed dictionary, as explained by [11]. Some well-known lexicon-based tools are SentiStrength 3.0, SentiWordNet, WordNet, Linguistic Inquiry Word Count (LIWC), Affective Norms for English Words (ANEW) and SenticNet, as discussed in [12]. Supervised machine learning techniques, in contrast, need a training dataset before they can handle real input. In this technique, a part of the dataset with pre-identified output classes is given to the algorithm so that it can build its rules, and then the real input data (test data) is given. Some well-known machine learning techniques include Maximum Entropy, Stochastic Gradient Descent (SGD), Random Forest (RF), SailAil Sentiment Analyzer (SASA), Multi-Layer Perceptron (MLP), Naïve Bayes (NB), Multinomial Naïve Bayes (MNB) and Support Vector Machine (SVM), as discussed in detail by [13]. A hybrid platform combines both of the techniques described above. It uses lexicon classification through a predefined dictionary and then classifies that data using machine learning methods. The most commonly used hybrid techniques include pSenti [14], SAIL [15], NILC_USP [16] and Alchemy API [17], as discussed in detail by [18]. In this research, Support Vector Machine (SVM) is selected for sentiment analysis of two pre-classified sets of tweets. SVM was formally introduced by [19] and has proved to be one of the widely used supervised machine learning algorithms for classification. It is a prevalent method that has proved very effective in various areas of text categorization and has outperformed Naïve Bayes classifiers on many occasions, as pointed out by [20]. For the performance evaluation of SVM, Precision, Recall and F-Measure are used for both datasets.

The rest of this paper is organized as follows. Section 2 describes the related work. Section 3 elaborates materials and methods. Section 4 is about classification. Section 5 discusses the results and finally Section 6 concludes the paper.

2. RELATED WORK
Sentiment analysis of textual data is one of the hot topics today. Many researchers are working on automated techniques for the extraction and analysis of the huge amount of user-generated data available on social networking websites. In [21], the authors proposed a way to obtain pre-labeled data from Twitter which can be used to train an SVM classifier. They used Twitter hashtags to judge the polarity of a tweet. To analyze the accuracy of the proposed technique, a test study on the classifier was conducted, which showed an accuracy of 85%. In [22], the authors analyzed the performance of J48 and MLP for classification of five different datasets. The parameters used to measure accuracy in the study were TP rate, FP rate, Precision, Recall, F-measure and ROC Area. MLP performed better on each dataset. Moreover, the results showed that the neural network has the better learning capability and can be a good option for classification problems. The authors in [23] introduced a new technique to classify the sentiment of tweets as positive or negative. They presented and discussed the results of machine learning algorithms for Twitter sentiment analysis using distant supervision. The training data used by the authors consisted of tweets with emoticons, which were used as noisy labels.


According to the authors, machine learning algorithms such as Naïve Bayes, Maximum Entropy and SVM can reach an accuracy of more than 80% when trained with emoticon-labeled tweets. The study also highlighted the steps used in the preprocessing stage of classification for high accuracy. The authors of [24] presented an application of Arabic sentiment analysis on Twitter data. They analyzed 1000 tweets for polarity detection using the machine learning techniques NB and SVM. In the proposed approach, feature vectors were applied to machine learning classifiers for higher accuracy. The authors also pointed out some problem areas in the training data, such as multiple occurrences of tweets, opinion spamming and dual-opinion tweets. These issues could put a question mark over the level of achieved accuracy. In [25], the authors used three different machine learning algorithms, Naïve Bayes, Decision Trees and Support Vector Machine, for sentiment classification of an Arabic dataset obtained from Twitter. This research followed a framework for Arabic tweet classification in which two special sub-tasks were performed in pre-processing: Term Frequency-Inverse Document Frequency (TF-IDF) and Arabic stemming. They used one dataset with three algorithms, and performance was evaluated on the basis of three information retrieval metrics: precision, recall and f-measure. In [26], the authors proposed an efficient feature vector technique by dividing the feature extraction process into two steps after preprocessing. In the first step, Twitter-specific features are extracted and added to the feature vector. These features are then removed from the tweets and the feature extraction process is repeated as for normal text. The features extracted in this second step are also added to the feature vector. The accuracy of the proposed feature vector technique is the same for Naïve Bayes, SVM, Maximum Entropy and Ensemble classifiers. However, this technique performed well for the domain of electronic products. The authors of [27] proposed an adaptive multiclass SVM model which works with a topic-adaptive sentiment classifier. The authors focused on non-text features to handle the sparsity of tweets. An iterative algorithm is proposed, consisting of three steps: optimization, unlabeled data selection and adaptive feature expansion. With tweets on 6 topics, the proposed algorithm achieved promisingly high accuracy compared to other well-known supervised and semi-supervised classifiers. The authors in [28] focused on the polarity of hashtags as a classification feature of tweets in the political domain. They proposed rules for automatic dataset labeling based on positive and negative hashtags, and finally proposed a method to enrich the terms in a tweet by hashtag term extraction. The authors highlighted that the use of positive and negative hashtags for dataset labeling and sentiment classification has an accuracy of more than 95%. Moreover, this hashtag feature outperforms the unigram feature when combined with Naïve Bayes, SVM or Logistic Regression algorithms, but the accuracy decreases when combined with the Random Forest algorithm based on the computing time to build the model. In [29], three data mining techniques are used to predict and analyze students' academic performance. The authors used Decision Tree (C4.5), Multilayer Perceptron and Naïve Bayes. All these techniques were applied to student data collected from 2 undergraduate courses in two semesters. According to the results, Naïve Bayes showed a prediction accuracy of 86%, which was higher than that of MLP and Decision Tree. With this type of prediction, it would be easy for teachers to detect early those students who are expected to get an F grade in the course. Ultimately, with the teacher's special care for those students, academic performance can be improved.

3. MATERIALS AND METHODS
This paper aims to analyze the performance of Support Vector Machine (SVM) for polarity detection (positive, negative and neutral) of textual data. Two pre-labeled Twitter datasets are considered for this analysis. Pre-labeled tweets are chosen as test data so that the performance and accuracy of SVM can be analyzed. The output polarity for each tweet from this algorithm is compared to the pre-labeled class, and the difference is then calculated by Weka. The performance is measured in terms of precision, recall and f-measure [1], [2], [3], [8].

3.1 Weka
In this study, we have used Weka [4], [7] for classification and performance analysis of SVM. It is one of the widely used tools for analyzing the working of data mining and machine learning algorithms. Weka is developed in Java at the University of Waikato, New Zealand. It is widely accepted due to its easy-to-use GUI, and it is popular due to its portability and General Public License.

3.2 Datasets
Two pre-labeled datasets of tweets are used in this research. The first dataset contains tweets about self-driving cars [5]. It contains 110 very negative, 685 slightly negative, 4245 neutral, 1444 slightly positive, 459 very positive and 213 irrelevant tweets.

Table 1. Twitter dataset for self-driving cars

Class              Tweets
Very Negative      110
Slightly Negative  685
Neutral            4245
Slightly Positive  1444
Very Positive      459
Irrelevant         213
Total              7156

The second dataset [6] contains tweets about Apple products (iPhone, iPod etc.). This dataset consists of 1218 negative, 2162 neutral, 423 positive and 81 irrelevant tweets.

Table 2. Twitter dataset for Apple products

Class       Tweets
Negative    1218
Neutral     2162
Positive    423
Irrelevant  81
Total       3884

The dataset or input phase of our classification approach includes downloading the relevant datasets and transforming this data into CSV/ARFF format for use in the WEKA Workbench [4], [7]. We have used the Simple CLI to convert text files into ARFF format by using the


“weka.core.converters.TextDirectoryLoader” class as shown in Figure 1.

Fig 1: Simple CLI in Weka

3.3 Pre-processing
Pre-processing of the input data is a very important stage of the classification procedure. In this stage the dataset is normalized and prepared for the classification algorithm so that the algorithm can run smoothly and produce effective results in minimum time [8]. According to many studies, the parameters for pre-processing include TF-IDF, stemmer, stopwords handler and tokenizer [1], [25], [30]. In this study we have used the default parameters for preprocessing as shown in Figure 2.

Fig 2: Default parameter selection in Weka

4. CLASSIFICATION
This is the phase in which SVM runs on the normalized data for classification and gives the results. The performance of any supervised machine learning algorithm can be analyzed by providing pre-classified data as test data and comparing the output polarities with the pre-classified polarities. We have used two datasets of pre-labeled tweets as input data. The results are measured in terms of precision, recall and f-measure.

5. RESULTS
This section focuses on the results and comparative analysis of SVM in different measures for both datasets. For comparison, three evaluation parameters are used in this study: Precision, Recall and F-Measure.

Precision can be calculated from TP and FP as shown below:

$$\text{Precision} = \frac{TP}{TP + FP}$$

TP is the number of sentences that are correctly classified, and FP is the number of sentences that are wrongly classified.

Recall can be calculated as shown below:

$$\text{Recall} = \frac{TP}{TP + FN}$$

FN is the number of non-classified sentences and TP is the number of correctly classified sentences (as explained above).

F-Measure can be computed as below:

$$\text{F-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The first dataset is taken from [5] and contains tweets regarding self-driving cars. According to the results, the average Precision, Recall and F-Measure are 55.8%, 59.9% and 57.2% respectively. These results are arranged in Table 3 and the class-wise result in each measure is shown as a graph (Figure 3).

Table 3. Class wise Precision, Recall and F-Measure for First Dataset

Class              Precision  Recall  F-Measure
Very Negative      0.224      0.1     0.138
Slightly Negative  0.247      0.184   0.211
Neutral            0.708      0.841   0.769
Slightly Positive  0.428      0.305   0.356
Very Positive      0.278      0.237   0.256
Irrelevant         0.225      0.136   0.17
Average            0.558      0.599   0.572
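The relationship between the three measures can be checked against the reported numbers. The following Python sketch recomputes the class-wise F-Measure for the first dataset from the reported precision and recall, and then the "Average" row. It assumes that the average is weighted by the number of tweets in each class (the counts listed for the self-driving cars dataset); the paper does not state this, but the assumption reproduces the reported averages. The helper names `f_measure` and `weighted_average` are illustrative and are not part of Weka.

```python
# Tweet counts per class for the self-driving cars dataset (from the paper).
CLASS_COUNTS = {
    "Very Negative": 110,
    "Slightly Negative": 685,
    "Neutral": 4245,
    "Slightly Positive": 1444,
    "Very Positive": 459,
    "Irrelevant": 213,
}

# Reported (precision, recall) per class for the first dataset.
CLASS_SCORES = {
    "Very Negative": (0.224, 0.100),
    "Slightly Negative": (0.247, 0.184),
    "Neutral": (0.708, 0.841),
    "Slightly Positive": (0.428, 0.305),
    "Very Positive": (0.278, 0.237),
    "Irrelevant": (0.225, 0.136),
}


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(values: dict, weights: dict) -> float:
    """Average of per-class values, weighted by class size (an assumption)."""
    total = sum(weights[c] for c in values)
    return sum(values[c] * weights[c] for c in values) / total


per_class_f = {c: f_measure(p, r) for c, (p, r) in CLASS_SCORES.items()}
avg_precision = weighted_average({c: p for c, (p, _) in CLASS_SCORES.items()}, CLASS_COUNTS)
avg_recall = weighted_average({c: r for c, (_, r) in CLASS_SCORES.items()}, CLASS_COUNTS)
avg_f = weighted_average(per_class_f, CLASS_COUNTS)

for name, value in per_class_f.items():
    print(f"{name:18s} F-Measure = {value:.3f}")
print(f"Average precision = {avg_precision:.3f}")
print(f"Average recall    = {avg_recall:.3f}")
print(f"Average F-Measure = {avg_f:.3f}")
```

Under this weighting, the average recall is the fraction of all tweets assigned to their correct class, which is why it matches the overall accuracy reported for the same dataset.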


Fig 3: Twitter dataset for self-driving cars (bar chart of class-wise Precision, Recall and F-Measure)

The scores for the neutral class in Precision, Recall and F-Measure are 70.8%, 84.1% and 76.9% respectively, which are higher than those of the other classes.

The second dataset is taken from [6] and contains tweets regarding Apple products. According to the results, the average Precision, Recall and F-Measure are 70.2%, 71.2% and 69.9% respectively. Complete results are arranged in Table 4 and the class-wise result in each measure is shown as a graph (Figure 4).

Table 4. Class wise Precision, Recall and F-Measure for Second Dataset

Class       Precision  Recall  F-Measure
Negative    0.732      0.602   0.661
Neutral     0.729      0.859   0.789
Positive    0.548      0.376   0.446
Irrelevant  0.318      0.173   0.224
Average     0.702      0.712   0.699

Fig 4: Twitter dataset for apple (bar chart of class-wise Precision, Recall and F-Measure)

According to the results, Precision is highest in the Negative class (73.2%), whereas Recall and F-Measure are both highest in the Neutral class (85.9% and 78.9% respectively).

According to Weka, the accuracy of SVM during sentiment classification differs between the two datasets. For the first dataset (self-driving cars) it is 59.91%, and for the second dataset (Apple) it is 71.2%.

Table 5. SVM Accuracy

Datasets           Accuracy %
Self-Driving Cars  59.91%
Apple              71.2%

6. CONCLUSION
In this paper, we have analyzed the performance of Support Vector Machine (SVM) for sentiment analysis. For the performance analysis of SVM, we have used two pre-classified datasets of tweets: the first dataset consisted of tweets regarding self-driving cars and the second was about Apple products. The Weka tool is used for performance analysis and comparison. Results are measured in terms of precision, recall and f-measure. According to the results, for the first dataset the average precision, recall and f-measure are 55.8%, 59.9% and 57.2% respectively. For the second dataset the average precision, recall and f-measure are 70.2%, 71.2% and 69.9% respectively. Complete results are shown in tabular and graphical form. The results clearly show that SVM performance depends on the input dataset. This dependency, for SVM and for other machine learning techniques, should be explored further using larger and more varied datasets. The results of this paper can be used as a baseline for comparative analysis. It should also be investigated which machine learning algorithm performs better on which type of dataset for classification, and what the reasons might be. This can lead researchers to improved versions of machine learning algorithms for classification.

7. REFERENCES
[1] Ahmad, M., & Aftab, S. (2017). Analyzing the Performance of SVM for Polarity Detection with Different Datasets. International Journal of Modern Education and Computer Science (IJMECS), 9(10), 29-36.
[2] Sharma, A., & Dey, S. (2013, October). A boosted SVM based sentiment analysis approach for online opinionated text. In Proceedings of the 2013 Research in Adaptive and Convergent Systems (pp. 28-34). ACM.
[3] Singh, V. K., Piryani, R., Uddin, A., & Waila, P. (2013, January). Sentiment analysis of textual reviews; Evaluating machine learning, unsupervised and SentiWordNet approaches. In Knowledge and Smart Technology (KST), 2013 5th International Conference on (pp. 122-127). IEEE.
[4] Holmes, G., Donkin, A., & Witten, I. H. (1994, December). Weka: A machine learning workbench. In Intelligent Information Systems, 1994. Proceedings of the 1994 Second Australian and New Zealand Conference on (pp. 357-361). IEEE.
[5] Crowdflower.com. (2017). [online] Available at: https://www.crowdflower.com/wp-content/uploads/2016/03/Twitter-sentiment-self-drive-DFE.csv [Accessed 15 Aug. 2017].


[6] Crowdflower.com. (2017). [online] Available at: https://www.crowdflower.com/wp-content/uploads/2016/03/Apple-Twitter-Sentiment-DFE.csv [Accessed 15 Aug. 2017].
[7] Weka: http://www.cs.waikato.ac.nz/~ml/weka/
[8] Zainudin, S., Jasim, D. S., & Bakar, A. A. (2016). Comparative Analysis of Data Mining Techniques for Malaysian Rainfall Prediction. International Journal on Advanced Science, Engineering and Information Technology, 6(6), 1148-1153.
[9] Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval, 2(1–2), 1-135.
[10] Saif, H., He, Y., Fernandez, M., & Alani, H. (2016). Contextual semantics for sentiment analysis of Twitter. Information Processing & Management, 52(1), 5-19.
[11] Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis lectures on human language technologies, 5(1), 1-167.
[12] Ahmad, M., Aftab, S., Muhammad, S. S., & Waheed, U. (2017). Tools and Techniques for Lexicon Driven Sentiment Analysis: A Review. Int. J. Multidiscip. Sci. Eng, 8(1), 17-23.
[13] Ahmad, M., Aftab, S., Muhammad, S. S., & Ahmad, S. (2017). Machine Learning Techniques for Sentiment Analysis: A Review. Int. J. Multidiscip. Sci. Eng, 8(3), 27-32.
[14] Mudinas, A., Zhang, D., & Levene, M. (2012, August). Combining lexicon and learning based approaches for concept-level sentiment analysis. In Proceedings of the First International Workshop on Issues of Sentiment Discovery and Opinion Mining (p. 5). ACM.
[15] Malandrakis, N., Kazemzadeh, A., Potamianos, A., & Narayanan, S. (2013, June). SAIL: A hybrid approach to sentiment analysis. In SemEval@ NAACL-HLT (pp. 438-442).
[16] Balage Filho, P., & Pardo, T. (2013, June). NILC_USP: A Hybrid System for Sentiment Analysis in Twitter Messages. In SemEval@ NAACL-HLT (pp. 568-572).
[17] “AlchemyAPI.” [Online]. Available: https://www.ibm.com/watson/alchemy-api.html.
[18] Ahmad, M., Aftab, S., Ali, I., & Hameed, N. (2017). Hybrid Tools and Techniques for Sentiment Analysis: A Review. Int. J. Multidiscip. Sci. Eng, 8(3).
[19] Cortes, C., & Vapnik, V. (1995). Support vector machine. Machine learning, 20(3), 273-297.
[20] Pang, B., Lee, L., & Vaithyanathan, S. (2002, July). Thumbs up?: sentiment classification using machine learning techniques. In Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume 10 (pp. 79-86). Association for Computational Linguistics.
[21] Zgheib, W. A., & Barbar, A. M. A Study using Support Vector Machines to Classify the Sentiments of Tweets.
[22] Arora, R. (2012). Comparative analysis of classification algorithms on different datasets using WEKA. International Journal of Computer Applications, 54(13).
[23] Go, A., Bhayani, R., & Huang, L. (2009). Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, 1(2009), 12.
[24] Shoukry, A., & Rafea, A. (2012, May). Sentence-level Arabic sentiment analysis. In Collaboration Technologies and Systems (CTS), 2012 International Conference on (pp. 546-550). IEEE.
[25] Altawaier, M. M., & Tiun, S. (2016). Comparison of Machine Learning Approaches on Arabic Twitter Sentiment Analysis. International Journal on Advanced Science, Engineering and Information Technology, 6(6), 1067-1073.
[26] Neethu, M. S., & Rajasree, R. (2013, July). Sentiment analysis in twitter using machine learning techniques. In Computing, Communications and Networking Technologies (ICCCNT), 2013 Fourth International Conference on (pp. 1-5). IEEE.
[27] Liu, S., Li, F., Li, F., Cheng, X., & Shen, H. (2013, October). Adaptive co-training SVM for sentiment classification on tweets. In Proceedings of the 22nd ACM international conference on Information & Knowledge Management (pp. 2079-2088). ACM.
[28] Alfina, I., Sigmawaty, D., Nurhidayati, F., & Hidayanto, A. N. (2017, February). Utilizing Hashtags for Sentiment Analysis of Tweets in The Political Domain. In Proceedings of the 9th International Conference on Machine Learning and Computing (pp. 43-47). ACM.
[29] Mueen, A., Zafar, B., & Manzoor, U. (2016). Modeling and Predicting Students' Academic Performance Using Data Mining Techniques. International Journal of Modern Education and Computer Science, 8(11), 36.
[30] Isa, D., Lee, L. H., Kallimani, V. P., & Rajkumar, R. (2008). Text document preprocessing with the Bayes formula for classification using the support vector machine. IEEE Transactions on Knowledge and Data engineering, 20(9), 1264-1272.

IJCATM : www.ijcaonline.org