Enhancing Electricity Theft Detection through K-Nearest Neighbors and Logistic Regression Algorithms with Synthetic Minority Oversampling Technique: A Case Study on State Electricity Company (PLN) Customer Data

Yan Maraden, Gunawan Wibisono, I. Gde Dharma Nugraha, Budi Sudiarto, Fauzan Hanif Jufri, Kazutaka, Anton Satria Prabuwono

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Electricity theft causes massive losses and damage for electricity utilities. The damage degrades the quality of the electricity supply and increases the generation load. The losses fall not only on the utilities but also on legitimate users, who end up paying excessive electricity bills. A method for detecting electricity theft is therefore indispensable. Recently, machine learning algorithms have been used to develop models for detecting electricity theft. However, most algorithms suffer from imbalanced data, overfitting, and a lack of data. This paper therefore proposes a solution that applies the synthetic minority oversampling technique to the imbalanced dataset to address these problems and increase the accuracy of the developed model. The proposed method consists of a pre-processing step that removes empty values and extracts several parameters, followed by oversampling of the pre-processed data. Among the models developed for electricity theft detection on the state electricity company’s customer data, logistic regression combined with oversampling performs best. In the experiment, logistic regression combined with the synthetic minority oversampling technique achieves training accuracy, testing accuracy, precision, recall, and F1-score of 98.97%, 98.7%, 95%, 99%, and 97%, respectively. The experiment also shows that the proposed solution outperforms existing methods.
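The oversampling step named in the abstract can be illustrated with a minimal sketch of the synthetic minority oversampling technique: each synthetic sample is an interpolation between a minority-class point and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy feature vectors are all assumptions made for illustration; the abstract does not state which features or library were used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority samples (hypothetical helper).

    Each sample lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Made-up two-feature vectors, for illustration only (not PLN data).
normal = [(10.0 + 0.1 * i, 2.0 + 0.05 * i) for i in range(20)]
theft = [(3.0, 0.5), (3.5, 0.7), (2.8, 0.4), (3.2, 0.6)]

# Oversample the minority (theft) class until both classes are the same size.
new_theft = smote(theft, n_new=len(normal) - len(theft), k=3)
balanced_theft = theft + new_theft
```

In common practice, oversampling of this kind is applied only to the training split, so that the test metrics are computed on real samples; whether the paper does so is not stated in the abstract.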

Original language: English
Article number: 5405
Journal: Energies
Volume: 16
Issue number: 14
Publication status: Published - Jul 2023

Keywords

  • anomalies detection
  • electricity theft
  • k-nearest neighbors
  • logistic regression
  • machine learning
