Generating Fake News Detection Model Using A Two-Stage Evolutionary Approach

Jeffery T.H. Kong, W. K. Wong, Filbert H. Juwono, Catur Apriono

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

While fake news is morally reprehensible, irresponsible parties intentionally disseminate it to vulnerable and targeted groups to achieve their goals. Machine learning techniques for detecting fake news have been researched extensively, and evolutionary algorithms are now gaining popularity in the research community. In this study, a two-stage evolutionary approach is proposed to generate and optimize a mathematical equation for fake news detection. In the first stage, a tree-based Genetic Programming (GP) algorithm generates mathematical expressions that capture correlations between language-independent (Lang-IND) features extracted from Fake.my-COVID19, a newly curated fake news dataset in mixed Malay-English. The uniqueness of the proposed approach is that the expressions are formed from basic arithmetic operators, optionally extended with complex ones (the operator set comprises addition, multiplication, subtraction, division, square, abs, log1p, sign, square root, and exponential), with the Lang-IND features as the variables. Before the second stage, a sensitivity analysis is applied to shorten the best equation while maintaining its F1-score. In the second stage, Adaptive Differential Evolution (ADE) is used to fine-tune the mathematical model. The experimental results show that the proposed two-stage evolutionary approach can be applied to fake news detection and that the model can learn to predict from the Lang-IND features. In the first stage, the equation produced by GP achieves an F1-score of 83.23% on the Fake.my-COVID19 dataset using complex arithmetic operators at a tree depth of 8. After the fine-tuning stage, the F1-score increases to 84.44%. The proposed two-stage evolutionary approach outperforms the baselines of six commonly used machine learning algorithms, of which Random Forest achieves the highest F1-score, 84.07%. The mathematical model is also tested separately on two other unseen datasets that differ in domain topic or language and achieves acceptable F1-scores.
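
The second-stage idea, tuning the constants of an already-evolved equation to raise its F1-score, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the expression in `model` is a hypothetical stand-in for a GP output, the two features and the toy data are synthetic, the protected operators follow common GP practice, and the optimizer is a classic DE/rand/1/bin loop with fixed `F` and `CR` rather than the adaptive variant (ADE) the study uses.

```python
import math
import random

# Protected operators (common GP practice, assumed here) keep any expression defined.
def pdiv(a, b):
    return a / b if abs(b) > 1e-9 else 1.0

def plog1p(a):
    return math.log1p(abs(a))

def psqrt(a):
    return math.sqrt(abs(a))

def model(x, c):
    """Hypothetical stand-in for a GP-evolved expression over two features."""
    return c[0] * psqrt(x[0]) - c[1] * plog1p(x[1]) + c[2]

def predict(x, c):
    # Threshold the expression at zero to obtain a binary label.
    return 1 if model(x, c) > 0 else 0

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def fitness(c, X, y):
    return f1_score(y, [predict(x, c) for x in X])

def differential_evolution(X, y, dim=3, pop_size=20, gens=40, F=0.6, CR=0.9, seed=0):
    """Classic DE/rand/1/bin with fixed F and CR (not adaptive)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(ind, X, y) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = [
                pop[r1][k] + F * (pop[r2][k] - pop[r3][k])
                if (rng.random() < CR or k == j_rand) else pop[i][k]
                for k in range(dim)
            ]
            s = fitness(trial, X, y)
            if s >= scores[i]:  # greedy selection: keep the trial if it is no worse
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]

def make_toy_data(n=200, seed=1):
    """Synthetic data: label is 1 when the first feature exceeds the second."""
    rng = random.Random(seed)
    X = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
    y = [1 if x0 > x1 else 0 for x0, x1 in X]
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    constants, score = differential_evolution(X, y)
    print("tuned constants:", constants)
    print("F1 on toy data:", round(score, 4))
```

Because selection is greedy, each individual's F1-score never decreases across generations, which is the property that lets a fine-tuning stage improve on, but never degrade, the equation handed over from the first stage.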

Original language: English
Pages (from-to): 1
Number of pages: 1
Journal: IEEE Access
Volume: 11
Publication status: Accepted/In press - 2023

Keywords

  • COVID-19
  • differential evolution
  • evolutionary approach
  • Fake news
  • fake news detection
  • Feature extraction
  • genetic programming
  • Machine learning
  • Mathematical models
  • Random forests
  • Social networking (online)
