TY - JOUR
T1 - Sentiment analysis of hate speech as an information tool to prevent riots and environmental damage
AU - Marchellim, G. A.
AU - Ruldeviyani, Y.
N1 - Funding Information:
This research was supported by PUTI Q2 grant “Data Everywhere: Escalating the Data Quality for Enterprises' Decision Making Processes” (NKB-850/UN2.RST/HKP.05.00/2020). We would like to express our gratitude to the Faculty of Computer Science and the Directorate of Research and Community Engagement, Universitas Indonesia.
Publisher Copyright:
© Published under licence by IOP Publishing Ltd.
PY - 2021/3/23
Y1 - 2021/3/23
N2 - Twitter is currently one of the most popular social media platforms in the world. Indonesia ranked 8th worldwide in number of Twitter users, with 10.65 million users. The President of the Republic of Indonesia, Joko Widodo, ranked 8th among leaders with the most followers on Twitter, with 11 million followers. Anyone can express an opinion through social media, which includes creating or spreading hate speech. Such hate speech can lead to mass demonstrations, which can in turn result in environmental destruction. Our research focused on sentiment analysis of hate speech towards the Indonesian president, comparing the performance of the Random Forest and Support Vector Machine algorithms, with the aim of preventing the negative consequences of hate speech in the future. We collected 550 tweets and manually labelled each as hate speech or not, giving 275 tweets each for the positive and negative sentiment classes. The Random Forest classifier performed best with the gini criterion and 128 trees, reaching an accuracy of 76.7%, precision of 82.7%, recall of 64%, and F1 score of 73.3%. The Support Vector Machine classifier performed best with a linear kernel and the C parameter set to 100, reaching an accuracy of 74.7%, precision of 100%, recall of 49.3%, and F1 score of 66.1%. The Random Forest classifier scored higher on F1 score, accuracy, and recall, while the Support Vector Machine classifier scored higher on precision. Based on these results, the Random Forest classifier can be applied to hate speech detection to help prevent environmental damage, as it achieves better accuracy and F1 score than the Support Vector Machine classifier.
AB - Twitter is currently one of the most popular social media platforms in the world. Indonesia ranked 8th worldwide in number of Twitter users, with 10.65 million users. The President of the Republic of Indonesia, Joko Widodo, ranked 8th among leaders with the most followers on Twitter, with 11 million followers. Anyone can express an opinion through social media, which includes creating or spreading hate speech. Such hate speech can lead to mass demonstrations, which can in turn result in environmental destruction. Our research focused on sentiment analysis of hate speech towards the Indonesian president, comparing the performance of the Random Forest and Support Vector Machine algorithms, with the aim of preventing the negative consequences of hate speech in the future. We collected 550 tweets and manually labelled each as hate speech or not, giving 275 tweets each for the positive and negative sentiment classes. The Random Forest classifier performed best with the gini criterion and 128 trees, reaching an accuracy of 76.7%, precision of 82.7%, recall of 64%, and F1 score of 73.3%. The Support Vector Machine classifier performed best with a linear kernel and the C parameter set to 100, reaching an accuracy of 74.7%, precision of 100%, recall of 49.3%, and F1 score of 66.1%. The Random Forest classifier scored higher on F1 score, accuracy, and recall, while the Support Vector Machine classifier scored higher on precision. Based on these results, the Random Forest classifier can be applied to hate speech detection to help prevent environmental damage, as it achieves better accuracy and F1 score than the Support Vector Machine classifier.
UR - http://www.scopus.com/inward/record.url?scp=85103853944&partnerID=8YFLogxK
U2 - 10.1088/1755-1315/700/1/012024
DO - 10.1088/1755-1315/700/1/012024
M3 - Conference article
AN - SCOPUS:85103853944
SN - 1755-1307
VL - 700
JO - IOP Conference Series: Earth and Environmental Science
JF - IOP Conference Series: Earth and Environmental Science
IS - 1
M1 - 012024
T2 - 9th Engineering International Conference, EIC 2020
Y2 - 24 September 2020
ER -