Ensemble Learning on Large Scale Financial Imbalanced Data

H. R. Sanabila, Wisnu Jatmiko

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

16 Citations (Scopus)

Abstract

This study evaluates the performance of ensemble learning in handling imbalanced data. Imbalanced data is a particular problem in classification tasks in which the class distribution is not uniform. Resampling (SMOTE and ENN) is employed to improve classifier performance. Four metrics are applied for performance evaluation: precision, recall, specificity, and F1 score. Based on the experiments, Bagging outperforms the baseline classifiers (Naïve Bayes and Logistic Regression) and the other ensemble methods (Boosting and Random Forest). In addition, the combination of SMOTE and ENN successfully increases classification performance and avoids bias toward the majority class.
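The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions computed from confusion-matrix counts, treating the minority class as the positive class. The names `Confusion`, `confusion_counts`, and `evaluate`, and the toy labels, are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for a binary problem (hypothetical helper type)."""
    tp: int  # minority (positive) samples predicted positive
    fp: int  # majority (negative) samples predicted positive
    tn: int  # majority samples predicted negative
    fn: int  # minority samples predicted negative


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP/FP/TN/FN, treating `positive` as the minority class label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def _ratio(num, den):
    # Return 0.0 when the denominator is zero (e.g. no positive predictions).
    return num / den if den else 0.0


def evaluate(c):
    """Precision, recall, specificity, and F1 from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Toy imbalanced labels (made up for illustration): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier biased to the majority class predicts 0 for everything.
majority_only = evaluate(confusion_counts(y_true, [0] * 10))

# A classifier that finds one of the two positives, with one false alarm.
mixed = evaluate(confusion_counts(y_true, [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]))

print(majority_only)
print(mixed)
```

In this toy case the majority-only classifier reaches a specificity of 1.0 while its recall and F1 are 0.0, which is one reason accuracy-like scores alone can hide bias toward the majority class.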

Original language: English
Title of host publication: 2018 International Workshop on Big Data and Information Security, IWBIS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 93-98
Number of pages: 6
ISBN (Electronic): 9781538655252
Publication status: Published - 24 Sept 2018
Event: 2018 International Workshop on Big Data and Information Security, IWBIS 2018 - Balai Kartini, Jakarta, Indonesia
Duration: 12 May 2018 to 13 May 2018

Publication series

Name: 2018 International Workshop on Big Data and Information Security, IWBIS 2018

Conference

Conference: 2018 International Workshop on Big Data and Information Security, IWBIS 2018
Country/Territory: Indonesia
City: Balai Kartini, Jakarta
Period: 12/05/18 to 13/05/18

Keywords

  • Big data
  • Ensemble learning
  • Imbalanced data
  • Spark (key words)
