TY - GEN
T1 - Restricted Boltzmann machines for unsupervised feature selection with partial least square feature extractor for microarray datasets
AU - Sutawika, Lintang Adyuta
AU - Wasito, Ito
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
AB - Feature selection is a key component of microarray data analysis because microarray datasets contain far more features than instances. High-dimensional data are also known to contain a significant amount of noise and irrelevant variables that do not contribute to classification and may even hinder classification performance. In this paper, a two-stage feature selection method is proposed. In the first stage, features are selected by a stack of Restricted Boltzmann Machines, by comparing the error between the reconstructed data and the original data. In the second stage, Partial Least Squares is used to extract synthesized features from the previously selected features, which are then used for classification. The proposed method is evaluated on the classification of ten widely used microarray datasets. The proposed model outperforms the state of the art on 2 datasets, achieving 82.11% on GLIOMA and 72.39% on Breast.
KW - deep learning
KW - gene expression
KW - microarray data analysis
KW - restricted boltzmann machine
UR - http://www.scopus.com/inward/record.url?scp=85050954001&partnerID=8YFLogxK
U2 - 10.1109/ICACSIS.2017.8355043
DO - 10.1109/ICACSIS.2017.8355043
M3 - Conference contribution
AN - SCOPUS:85050954001
T3 - 2017 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2017
SP - 257
EP - 260
BT - 2017 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2017
Y2 - 28 October 2017 through 29 October 2017
ER -