Cyber risk prediction through social media big data analytics and statistical machine learning

Athor Subroto, Andri Apriyana

Research output: Contribution to journal > Article > peer-review

15 Citations (Scopus)

Abstract

As a natural outcome of achieving equilibrium, digital economic progress will most likely be subject to increased cyber risks. The purpose of this study is therefore to present an algorithmic model that uses social media big data analytics and statistical machine learning to predict cyber risks. The data consisted of 83,015 instances from the Common Vulnerabilities and Exposures (CVE) database (early 1999 to March 2017) and 25,599 cases of cyber risks from Twitter (early 2016 to March 2017), from which 1,000 instances from both platforms were selected. Predictions were made by analyzing the vulnerability of software to threats, based on social media conversations, and prediction accuracy was measured by comparing the cyber risk data from Twitter with that from the CVE database. The machine learning (ML) experiments were run with the RWeka package and evaluated with a confusion matrix; the best prediction was obtained with an artificial neural network (ANN), which reached an accuracy rate of 96.73%. The paper thus offers new insights into cyber risks and into how such vulnerabilities can be adequately understood and predicted. Managers of public and private companies can use the findings to formulate effective strategies for reducing cyber risks to critical infrastructures.
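The abstract reports accuracy derived from a confusion matrix. As a minimal sketch of that calculation only, the Python below counts the four confusion-matrix cells for a binary task and computes accuracy as (TP + TN) / total. It is not the authors' code (the study used the RWeka package in R), and the function names and the eight sample labels are made up purely for illustration.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=True):
    """Count the four confusion-matrix cells for binary labels."""
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Accuracy = correctly classified instances / all instances."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no instances to evaluate")
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical labels: True means "a vulnerability was reported".
# 'actual' stands in for a reference source, 'predicted' for model output.
actual = [True, True, False, False, True, False, False, True]
predicted = [True, False, False, False, True, True, False, True]

counts = confusion_counts(actual, predicted)
print(dict(counts))
print(f"accuracy = {accuracy(counts):.2%}")
```

Accuracy alone can mislead when one class dominates, so the individual cells are usually worth reporting alongside it.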

Original language: English
Article number: 50
Journal: Journal of Big Data
Volume: 6
Issue number: 1
Publication status: Published - 1 Dec 2019

Keywords

  • Big data
  • Cyber risks
  • Machine learning
  • Non-traditional actuary
  • Predictive analytics
  • Social media
