TY - GEN
T1 - Automatically building a corpus for sentiment analysis on Indonesian tweets
AU - Wicaksono, Alfan Farizki
AU - Vania, Clara
AU - Trisedya, Bayu Distiawan
AU - Adriani, Mirna
N1 - Publisher Copyright: Copyright 2014 by Alfan Farizki Wicaksono, Clara Vania, Bayu Distiawan T., and Mirna Adriani.
PY - 2014
Y1 - 2014
N2 - The popularity of user-generated content, such as Twitter posts, has made it a rich source for sentiment analysis and opinion mining tasks. This paper presents our study on automatically building a training corpus for sentiment analysis on Indonesian tweets. We start with a seed sentiment corpus and subsequently expand it using a classifier model whose parameters are estimated using the Expectation-Maximization (EM) framework. We apply our automatically built corpus to two tasks, namely opinion tweet extraction and tweet polarity classification, using various machine learning approaches. Experimental results show that a classifier model trained on our data, which is automatically constructed using our proposed method, outperforms the baseline system on both opinion tweet extraction and tweet polarity classification.
AB - The popularity of user-generated content, such as Twitter posts, has made it a rich source for sentiment analysis and opinion mining tasks. This paper presents our study on automatically building a training corpus for sentiment analysis on Indonesian tweets. We start with a seed sentiment corpus and subsequently expand it using a classifier model whose parameters are estimated using the Expectation-Maximization (EM) framework. We apply our automatically built corpus to two tasks, namely opinion tweet extraction and tweet polarity classification, using various machine learning approaches. Experimental results show that a classifier model trained on our data, which is automatically constructed using our proposed method, outperforms the baseline system on both opinion tweet extraction and tweet polarity classification.
UR - http://www.scopus.com/inward/record.url?scp=84993660538&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84993660538
T3 - Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
SP - 185
EP - 194
BT - Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
A2 - Boonkwan, Prachya
A2 - Aroonmanakun, Wirote
A2 - Supnithi, Thepchai
PB - Faculty of Pharmaceutical Sciences, Chulalongkorn University
T2 - 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Y2 - 12 December 2014 through 14 December 2014
ER -