TY - GEN
T1 - Utilizing hashtags for sentiment analysis of tweets in the political domain
AU - Alfina, Ika
AU - Sigmawaty, Dinda
AU - Nurhidayati, Fitriasari
AU - Hidayanto, Achmad Nizar
N1 - Publisher Copyright: © 2017 ACM.
PY - 2017/2/24
Y1 - 2017/2/24
AB - This research investigates the benefit of utilizing hashtags to determine the sentiment polarity of tweets in the political domain. We used the sentiment polarity of hashtags as classification features, proposed rules for automatically annotating a dataset based on the number of positive and negative hashtags in each tweet, and proposed a method to enrich the terms in a tweet by extracting hashtag terms. We call the number of positive and negative hashtags the SentiHT feature. The experiments and evaluation show that sentiment classification using the SentiHT feature and the dataset automatically labeled using SentiHT achieves an accuracy of more than 95%. Moreover, SentiHT outperforms the unigram feature when combined with the Naïve Bayes, SVM, or Logistic Regression algorithms, but the opposite occurs with the Random Forest algorithm. Based on the computing time needed to build the model, we recommend using the SentiHT feature combined with the Naïve Bayes algorithm.
KW - Machine learning
KW - NLP
KW - Politics
KW - Sentiment analysis
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85024389339&partnerID=8YFLogxK
U2 - 10.1145/3055635.3056631
DO - 10.1145/3055635.3056631
M3 - Conference contribution
AN - SCOPUS:85024389339
T3 - ACM International Conference Proceeding Series
SP - 43
EP - 47
BT - Proceedings of 2017 9th International Conference on Machine Learning and Computing, ICMLC 2017
PB - Association for Computing Machinery
T2 - 9th International Conference on Machine Learning and Computing, ICMLC 2017
Y2 - 24 February 2017 through 26 February 2017
ER -