Indonesia has a very large number of Twitter users, and many of them often communicate using abusive language. Beyond joking, many Indonesian netizens use abusive language to curse at (offend) someone. Previous research on abusive language detection in Indonesian Twitter used classical machine learning approaches. However, performance was limited, especially in distinguishing tweets that are abusive but not offensive from tweets that are offensive. This paper implements a deep learning approach to improve performance on that distinction. We use Long Short-Term Memory (LSTM) with word embedding because our literature study found that LSTM with word embedding performs well for text classification in both English and Indonesian. The experimental results show that LSTM with word embedding increases the F1-score over the previous work by up to 19.44% in relative terms, from 70.06% to 83.68%.
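The 19.44% figure is consistent with a relative gain over the earlier score rather than a difference in percentage points. A minimal sketch of that arithmetic follows, using only the two F1-scores stated in the abstract; the function and variable names are illustrative assumptions, not taken from the paper.

```python
def relative_gain(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# F1-scores (%) as stated in the abstract; names are hypothetical.
previous_f1 = 70.06  # earlier classical machine learning approach
lstm_f1 = 83.68      # LSTM with word embedding

absolute_gain = lstm_f1 - previous_f1           # in percentage points
relative = relative_gain(previous_f1, lstm_f1)  # in percent of the earlier score

print(f"absolute gain: {absolute_gain:.2f} points")  # absolute gain: 13.62 points
print(f"relative gain: {relative:.2f}%")             # relative gain: 19.44%
```

Read this way, the improvement is 13.62 percentage points in absolute terms, which corresponds to the 19.44% relative increase quoted above.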
|Journal||Journal of Physics: Conference Series|
|Publication status||Published - 16 Apr 2019|
|Event||International Conference on Information System, Computer Science and Engineering 2018, ICONISCSE 2018 - Palembang, Indonesia|
Duration: 26 Nov 2018 → 27 Nov 2018