Sentiment Analysis Methodology of Twitter Data with an application on Hajj season
Abstract
With the rapid growth of the internet, millions of people share their views and opinions on a variety of topics on microblogging sites, because such posts consist of short, simple expressions. Microblogging websites are social media sites on which users make short, frequent, real-time posts about almost anything. At a large gathering such as Hajj, where time and space are limited, obtaining rapid and accurate views and impressions from the Hajji about quality of service and other matters is of great importance. In this paper, we use tweets posted during Hajj to perform sentiment analysis. The tweets are preprocessed in three phases: tokenization, normalization, and part-of-speech (POS) tagging. In the final step, a Naïve Bayes classifier is used to classify each tweet as positive or negative by comparing each word in the query tweet with the labeled words in the lexicon.
Introduction
In the last few years, Twitter has grown enormously as a social network that enables users to send and read 140-character messages in real time. Moreover, users can share their opinions on many topics (e.g., sports and social issues), voice complaints, and express positive attitudes.
Inspired by the huge growth of Twitter, companies and organizations are increasingly seeking ways to mine Twitter for information about people's opinions of their services and products. In KSA, there are recurring Hajj and Umra seasons in which people come to perform their religious rituals, staying in KSA for weeks at a time.
The impressions that the Hajji express about many aspects, such as hotels and transport, are an important source of information for decision makers if they are mined and analyzed to obtain Hajji feedback on these topics.
Among the uses of sentiment analysis is that an organization's business improvement can be tracked through users' feedback [1, 3, 6].
This paper discusses Twitter sentiment analysis in detail for the English language, for which plenty of resources are available. In future work, we plan to extend the approach to Arabic.
The paper is organized as follows: Section 2 describes the system architecture, Section 3 covers feature extraction from tweet data, Section 4 presents classification, Section 5 reports the experiment, and Section 6 concludes.
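To make the pipeline concrete before the detailed sections, the steps named in the abstract (normalization, tokenization, and Naïve Bayes classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the training sentences, the function names, and the normalization rules are all hypothetical, add-one smoothing is an assumed choice, and POS tagging is omitted because it requires a trained tagger.

```python
import math
import re
from collections import Counter

# Hypothetical toy training data; NOT taken from the paper's corpus.
TRAIN = [
    ("the hotel was clean and the staff were helpful", "positive"),
    ("great transport and friendly service", "positive"),
    ("the bus was late and crowded", "negative"),
    ("terrible hotel and dirty rooms", "negative"),
]

def normalize(tweet):
    """Lowercase, drop URLs and @mentions, squeeze characters repeated 3+ times."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet)
    return re.sub(r"(.)\1{2,}", r"\1\1", tweet)

def tokenize(tweet):
    """Split the normalized text into word tokens."""
    return re.findall(r"[a-z']+", normalize(tweet))

def train(examples):
    """Count word occurrences per class and documents per class."""
    word_counts = {"positive": Counter(), "negative": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["positive"]) | set(word_counts["negative"])
    return word_counts, doc_counts, vocab

def classify(tweet, model):
    """Return the class with the highest log posterior (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)  # class prior
        denom = sum(counts.values()) + len(vocab)
        for tok in tokenize(tweet):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(classify("The hotel staff were sooooo helpful @friend http://t.co/x", model))
```

With this toy data, the example tweet is scored higher for the positive class because words such as "staff" and "helpful" occur only in the positive training sentences. A real system would replace the toy sentences with a labeled corpus or lexicon.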
Conclusion
In this paper, we studied the methodology of sentiment analysis, and the results were consistent for the English corpus that was available for the study. We plan to do the same work for other languages used during Hajj, especially Arabic, which is spoken by the majority of Hajji. Moreover, as a next step, we plan to run our analysis online.