Maintaining imbalance highly dependent medical data using dirichlet process data generation

Tieta Antaresti, Mohamad Ivan Fanany, Aniati Murni Arymurthy

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Imbalance in the amount of data between one class and another is an important issue in classification problems. One well-known data balancing technique is artificial oversampling, which increases the size of the dataset. In this research, multinomial classification was applied to recorded features obtained from a single ECG (electrocardiograph) sensor. A Dirichlet process, a Dirichlet distribution over the cumulative distribution function of each data partition, was used to model the distribution of the newly generated data while also taking into account the statistical properties of the existing data. After data balancing, the classifier achieved 77.21% classification accuracy (CA) and 90.9% area under the ROC curve (AUC).
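
The abstract does not spell out the generation procedure, so the following is only a minimal sketch of one common way to draw synthetic minority-class samples from a Dirichlet process, not the authors' implementation. It uses the Polya-urn (Blackwell-MacQueen) scheme. The function name `dp_oversample`, the concentration parameter `alpha`, and the choice of an independent per-feature Gaussian as the base measure are all assumptions made for illustration.

```python
import numpy as np

def dp_oversample(X_min, n_new, alpha=1.0, rng=None):
    """Draw n_new synthetic rows with a Polya-urn Dirichlet process scheme.

    Hypothetical helper: the base measure is assumed to be an independent
    Gaussian fitted to each feature of the minority class X_min.
    """
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    mean = X_min.mean(axis=0)
    std = X_min.std(axis=0) + 1e-9  # avoid a zero scale for constant features

    pool = [row for row in X_min]   # observed rows plus rows generated so far
    new_rows = []
    for _ in range(n_new):
        n = len(pool)
        if rng.random() < alpha / (alpha + n):
            # With probability alpha / (alpha + n): fresh draw from the base measure
            draw = rng.normal(mean, std)
        else:
            # Otherwise: repeat a uniformly chosen row already in the pool
            draw = pool[rng.integers(n)].copy()
        pool.append(draw)
        new_rows.append(draw)
    return np.array(new_rows).reshape(-1, X_min.shape[1])

# Usage: top up a toy minority class of 3 rows to match a majority class of 10
X_minority = np.array([[0.0, 1.0], [0.2, 0.9], [0.1, 1.1]])
synthetic = dp_oversample(X_minority, n_new=7, alpha=1.0, rng=0)
X_balanced = np.vstack([X_minority, synthetic])
```

Under these assumptions, a larger `alpha` produces more fresh draws from the base measure, while a small `alpha` mostly repeats existing rows, which keeps the generated data close to the statistics of the original samples. Copying whole rows preserves the dependence between features, whereas the independent Gaussian base measure does not.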

Original language: English
Title of host publication: 2011 6th International Conference on Digital Information Management, ICDIM 2011
Pages: 18-22
Number of pages: 5
Publication status: Published - 2011
Event: 2011 6th International Conference on Digital Information Management, ICDIM 2011 - Australia
Duration: 26 Sept 2011 to 28 Sept 2011
