Indonesian tweets hate speech target classification using machine learning

Sandy Kurniawan, Indra Budi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In recent years, the amount of hate speech found on social media has been increasing. This increase is caused by the growing number of active social media users around the world. Much of this hate speech is aimed at governments or particular individuals. Hate speech is harmful because it can negatively affect its target, whether the target is an individual or a group. Identifying the target of hate speech is crucial because it can help prevent the consequences of hate speech, such as exclusion, discrimination, and violence directed at the target. In this paper, we present our study of hate speech target classification on Indonesian Twitter, comparing classification performance across the algorithms and feature representations used. Word n-grams were used as features, combined with Bag-of-Words and Term Frequency-Inverse Document Frequency (TF-IDF) representations. Classification was performed using Naive Bayes, Support Vector Machine (SVM), and Random Forest Decision Tree (RFDT). The best result, an F1-score of 0.84772, was achieved using TF-IDF with word unigram features combined with an SVM classifier.
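The feature representations named in the abstract (word n-grams weighted by Bag-of-Words counts or by TF-IDF) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the tokenizer, the TF-IDF variant, or the tooling, so the whitespace tokenization, the `tf * log(N / df)` weighting, the function names, and the toy documents below are all assumptions made for illustration. The classifiers (Naive Bayes, SVM, RFDT) are not shown.

```python
import math
from collections import Counter


def word_ngrams(text, n=1):
    """Return the word n-grams of a text (assumed: lowercase, whitespace split)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(docs, n=1):
    """Bag-of-Words: raw n-gram counts for each document."""
    return [Counter(word_ngrams(doc, n)) for doc in docs]


def tf_idf(docs, n=1):
    """TF-IDF per document, assuming tf = count / doc length and idf = log(N / df)."""
    counts = bag_of_words(docs, n)
    n_docs = len(docs)
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(term for doc_counts in counts for term in doc_counts)
    weights = []
    for doc_counts in counts:
        total = sum(doc_counts.values())
        weights.append({
            term: (freq / total) * math.log(n_docs / df[term])
            for term, freq in doc_counts.items()
        })
    return weights


# Hypothetical placeholder documents, not data from the paper.
toy_docs = ["the policy is new", "the team is new", "the policy changed"]
unigram_weights = tf_idf(toy_docs, n=1)
bigram_counts = bag_of_words(toy_docs, n=2)
```

In this sketch a term that occurs in every document (such as "the" in the toy data) receives a TF-IDF weight of zero, which is the behaviour that distinguishes TF-IDF from raw Bag-of-Words counts. The resulting per-document weight dictionaries would then be turned into vectors and passed to a classifier.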

Original language: English
Title of host publication: 2020 5th International Conference on Informatics and Computing, ICIC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728180472
Publication status: Published - 3 Nov 2020
Event: 5th International Conference on Informatics and Computing, ICIC 2020 - Virtual, Jakarta, Indonesia
Duration: 3 Nov 2020 to 4 Nov 2020

Publication series

Name: 2020 5th International Conference on Informatics and Computing, ICIC 2020

Conference

Conference: 5th International Conference on Informatics and Computing, ICIC 2020
Country: Indonesia
City: Virtual, Jakarta
Period: 3/11/20 to 4/11/20

Keywords

  • Hate speech
  • Machine learning
  • Target
  • Text classification
