TY - GEN
T1 - Hate speech detection on Indonesian long text documents using machine learning approach
AU - Aulia, Nofa
AU - Budi, Indra
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/4/19
Y1 - 2019/4/19
AB - Hate speech on social media has grown in recent years, so it is important to understand the problem, and an automatic hate speech detection system is needed to help counter it. Many studies have addressed hate speech detection in short documents such as Twitter data. To our knowledge, however, research on long documents is rare; we suppose the task is harder because the message of the text may be hidden. In this research, we explore hate speech detection on Indonesian long documents using a machine learning approach. We build a new Indonesian hate speech dataset from Facebook. The experiments show that the best performance is obtained with a Support Vector Machine (SVM) classifier using TF-IDF, character quad-gram, word unigram, and lexicon features, which yields an F1-score of 85%.
KW - Hate speech detection
KW - Long documents
KW - Machine learning
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=85071108622&partnerID=8YFLogxK
U2 - 10.1145/3330482.3330491
DO - 10.1145/3330482.3330491
M3 - Conference contribution
AN - SCOPUS:85071108622
T3 - ACM International Conference Proceeding Series
SP - 164
EP - 169
BT - ICCAI 2019 - 2019 5th International Conference on Computing and Artificial Intelligence
PB - Association for Computing Machinery
T2 - 5th International Conference on Computing and Artificial Intelligence, ICCAI 2019
Y2 - 19 April 2019 through 22 April 2019
ER -