TY - JOUR
T1 - Identify abusive and offensive language in Indonesian Twitter using deep learning approach
AU - Okky Ibrohim, Muhammad
AU - Sazany, Erryan
AU - Budi, Indra
N1 - Publisher Copyright:
© Published under licence by IOP Publishing Ltd.
PY - 2019/4/16
Y1 - 2019/4/16
N2 -
Indonesia has a huge number of Twitter users, and many of them often communicate using abusive language. Beyond joking contexts, many Indonesian netizens also use abusive language to curse at (offend) someone. Earlier research on abusive language detection in Indonesian Twitter used a classical machine learning approach. However, its performance was still limited, especially in distinguishing tweets that are abusive but not offensive from tweets that are offensive. This paper implements a deep learning approach to improve performance on that distinction. We use Long Short-Term Memory (LSTM) with word embedding because our literature study found that LSTM with word embedding works well for text classification (for both English and Indonesian text). The experimental results show that LSTM with word embedding increases the F1-Score over the previous work by up to 19.44% in relative terms, from 70.06% to 83.68%.
AB -
Indonesia has a huge number of Twitter users, and many of them often communicate using abusive language. Beyond joking contexts, many Indonesian netizens also use abusive language to curse at (offend) someone. Earlier research on abusive language detection in Indonesian Twitter used a classical machine learning approach. However, its performance was still limited, especially in distinguishing tweets that are abusive but not offensive from tweets that are offensive. This paper implements a deep learning approach to improve performance on that distinction. We use Long Short-Term Memory (LSTM) with word embedding because our literature study found that LSTM with word embedding works well for text classification (for both English and Indonesian text). The experimental results show that LSTM with word embedding increases the F1-Score over the previous work by up to 19.44% in relative terms, from 70.06% to 83.68%.
UR - http://www.scopus.com/inward/record.url?scp=85065711376&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/1196/1/012041
DO - 10.1088/1742-6596/1196/1/012041
M3 - Conference article
AN - SCOPUS:85065711376
SN - 1742-6588
VL - 1196
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
IS - 1
M1 - 012041
T2 - International Conference on Information System, Computer Science and Engineering 2018, ICONISCSE 2018
Y2 - 26 November 2018 through 27 November 2018
ER -