TY - GEN
T1 - Malicious Account Detection on Twitter Based on Tweet Account Features using Machine Learning
AU - Pakaya, Farhan Nurdiatama
AU - Ibrohim, Muhammad Okky
AU - Budi, Indra
PY - 2019/10
Y1 - 2019/10
AB - As one of the most popular social media platforms, Twitter faces problems that arise from its massive number of users. This has led many to exploit the platform to commit cybercrime against other users. One form of cybercrime is the activity of malicious accounts. Malicious accounts such as spambots and fake followers can be problematic because they may harm other users. Spambots can send other users unwanted messages, and fake followers can inflate other accounts' follower counts, falsely signaling trustworthiness or influence. Much research has been conducted on building malicious account detectors, but most of it uses profile-based and graph-based features. However, malicious and genuine accounts may also tweet in distinct ways. In this research, we build a classification model using only account tweets. We also build a further classifier that distinguishes fake followers and spambots from genuine accounts. The maximum accuracy reached is 95.55% for malicious vs. genuine account detection using tf-idf features and the XGBoost algorithm, and 95.2% for classifying all three account types using Word2Vec features and the XGBoost algorithm.
KW - machine learning
KW - malicious accounts
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85081061105&partnerID=8YFLogxK
U2 - 10.1109/ICIC47613.2019.8985840
DO - 10.1109/ICIC47613.2019.8985840
M3 - Conference contribution
T3 - Proceedings of 2019 4th International Conference on Informatics and Computing, ICIC 2019
BT - Proceedings of 2019 4th International Conference on Informatics and Computing, ICIC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Conference on Informatics and Computing, ICIC 2019
Y2 - 23 October 2019 through 24 October 2019
ER -