Identify abusive and offensive language in Indonesian Twitter using deep learning approach

Muhammad Okky Ibrohim, Erryan Sazany, Indra Budi

Research output: Contribution to journal › Conference article › peer-review

12 Citations (Scopus)

Abstract

Indonesia has a very large number of Twitter users, and many of them often communicate using abusive language. Beyond using it jokingly, many Indonesian netizens use abusive language to curse at (offend) someone. Previous research on abusive language detection in Indonesian Twitter used a classical machine learning approach. However, its performance was still limited, especially in distinguishing whether a tweet is abusive but not offensive or is offensive. This paper implements a deep learning approach to improve performance in identifying abusive but not offensive language and offensive language. We use Long Short-Term Memory (LSTM) with word embedding because our literature study found that LSTM with word embedding performs well for text classification, for both English and Indonesian text. The experimental results show that LSTM with word embedding increases the F1-score over the previous work by up to 19.44%, from 70.06% to 83.68%.
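The two F1 figures in the abstract differ by 13.62 percentage points, so the quoted 19.44% appears to be a relative gain over the previous work rather than an absolute one. This reading is an inference from the numbers, not something the abstract states. A minimal Python sketch of the arithmetic follows; the function and variable names are illustrative only.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Return the improvement of `improved` over `baseline`, in percent of baseline."""
    return (improved - baseline) / baseline * 100.0


# F1-scores as reported in the abstract (in percent).
baseline_f1 = 70.06  # previous work (classical machine learning)
lstm_f1 = 83.68      # LSTM with word embedding

absolute_gain = lstm_f1 - baseline_f1            # difference in percentage points
relative = relative_gain(baseline_f1, lstm_f1)   # difference relative to the baseline

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative gain: {relative:.2f}%")
```

Running this gives an absolute gain of 13.62 percentage points and a relative gain of 19.44%, which matches the figure quoted in the abstract.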

Original language: English
Article number: 12041
Journal: Journal of Physics: Conference Series
Volume: 1196
Issue number: 1
Publication status: Published - 16 Apr 2019
Event: International Conference on Information System, Computer Science and Engineering 2018, ICONISCSE 2018 - Palembang, Indonesia
Duration: 26 Nov 2018 - 27 Nov 2018
