TY - GEN
T1 - Emotion Classification on Indonesian Twitter Dataset
AU - Saputri, Mei Silviana
AU - Mahendra, Rahmad
AU - Adriani, Mirna
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - The rapid growth of Twitter usage attracts many researchers to utilize Twitter data for several purposes, including emotion analysis. However, standard datasets for the emotion analysis task are limited for under-resourced languages, especially Indonesian. In this study, we build a publicly available Indonesian Twitter dataset for the emotion classification task. In addition, we conduct feature engineering to decide the best features for emotion classification. The features used in this research are lexicon-based, Bag-of-Words, word embedding, orthography, and Part-Of-Speech (POS) tag features. We test those features on two datasets with different characteristics. F1-score is employed as the evaluation metric. The results of our experiments show that combining all proposed features on our built dataset achieves an F1-score of 69.73%, which outperforms the baseline model by 26.64%.
AB - The rapid growth of Twitter usage attracts many researchers to utilize Twitter data for several purposes, including emotion analysis. However, standard datasets for the emotion analysis task are limited for under-resourced languages, especially Indonesian. In this study, we build a publicly available Indonesian Twitter dataset for the emotion classification task. In addition, we conduct feature engineering to decide the best features for emotion classification. The features used in this research are lexicon-based, Bag-of-Words, word embedding, orthography, and Part-Of-Speech (POS) tag features. We test those features on two datasets with different characteristics. F1-score is employed as the evaluation metric. The results of our experiments show that combining all proposed features on our built dataset achieves an F1-score of 69.73%, which outperforms the baseline model by 26.64%.
KW - emotion classification
KW - feature engineering
KW - Indonesian tweet
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85062771477&partnerID=8YFLogxK
U2 - 10.1109/IALP.2018.8629262
DO - 10.1109/IALP.2018.8629262
M3 - Conference contribution
AN - SCOPUS:85062771477
T3 - Proceedings of the 2018 International Conference on Asian Language Processing, IALP 2018
SP - 90
EP - 95
BT - Proceedings of the 2018 International Conference on Asian Language Processing, IALP 2018
A2 - Dong, Minghui
A2 - Bijaksana, Moch.
A2 - Sujaini, Herry
A2 - Negara, Arif Bijaksana Putra
A2 - Romadhony, Ade
A2 - Ruskanda, Fariska Z.
A2 - Nurfadhilah, Elvira
A2 - Aini, Lyla Ruslana
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd International Conference on Asian Language Processing, IALP 2018
Y2 - 15 November 2018 through 17 November 2018
ER -