TY - JOUR
T1 - Enhancing Electricity Theft Detection through K-Nearest Neighbors and Logistic Regression Algorithms with Synthetic Minority Oversampling Technique
T2 - A Case Study on State Electricity Company (PLN) Customer Data
AU - Maraden, Yan
AU - Wibisono, Gunawan
AU - Nugraha, I. Gde Dharma
AU - Sudiarto, Budi
AU - Jufri, Fauzan Hanif
AU - Kazutaka,
AU - Prabuwono, Anton Satria
N1 - Funding Information:
This work was supported by Universitas Indonesia Research Grant for International Publication Financial Year 2022/2023, contract number: NKB-1474/UN2.RST/HKP.05.00/2022.
Publisher Copyright:
© 2023 by the authors.
PY - 2023/7
Y1 - 2023/7
N2 - Electricity theft has caused massive losses and damage to electricity utilities. The damage degrades the quality of the electricity supply and increases the generation load. The losses affect not only the utilities but also legitimate users, who have to pay excessive electricity bills. A method for detecting electricity theft is therefore indispensable. Recently, machine learning algorithms have been used to develop models for detecting electricity theft. However, most of these algorithms suffer from imbalanced data, overfitting, and a lack of data. This paper therefore proposes a solution that applies an oversampling technique to the imbalanced dataset to address these problems and increase the accuracy of the developed model. The proposed method consists of a pre-processing step that removes empty values and extracts several parameters, followed by oversampling of the pre-processed data. Among the models developed for electricity theft detection on state electricity company customer data, logistic regression combined with the oversampling technique performs best. The experiment shows that the proposed method, logistic regression combined with the synthetic minority oversampling technique, achieves training accuracy, testing accuracy, precision, recall, and F1-score of 98.97%, 98.7%, 95%, 99%, and 97%, respectively. Moreover, the experiment also shows that the proposed solution outperforms existing methods.
KW - anomaly detection
KW - electricity theft
KW - k-nearest neighbors
KW - logistic regression
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85166235422&partnerID=8YFLogxK
U2 - 10.3390/en16145405
DO - 10.3390/en16145405
M3 - Article
AN - SCOPUS:85166235422
SN - 1996-1073
VL - 16
JO - Energies
JF - Energies
IS - 14
M1 - 5405
ER -