TY - GEN
T1 - Classification analysis using support vector machine, decision tree, and neural network with principal component analysis to determine molecular structure relationship from its biological activity on dipeptidyl peptidase IV inhibitors
AU - Hamzah, Haris
AU - Bustamam, Alhadi
AU - Yanuar, Any
AU - Sarwinda, Dewi
N1 - Publisher Copyright:
© 2020 American Institute of Physics Inc. All rights reserved.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/11/16
Y1 - 2020/11/16
N2 - Type 2 diabetes is a chronic metabolic disease that often affects adults. Dipeptidyl peptidase-IV (DPP-IV) inhibitors are a class of drugs for type 2 diabetes mellitus (T2DM) that block the enzyme dipeptidyl peptidase-IV. Currently available inhibitors have adverse effects, so novel DPP-IV inhibitors with minimal adverse effects are still needed. In this paper, a machine learning approach is used to classify DPP-IV inhibitors by biological activity based on their molecular structure. The data set contains 3363 inhibitors, 1849 labeled active and 1514 labeled inactive, with topological fingerprints used as descriptors. However, topological fingerprints produce high-dimensional data, so principal component analysis is proposed to reduce the dimension of the data set. Support vector machine, decision tree, and neural network methods are then used to classify the DPP-IV inhibitors. Overall, classification with the support vector machine method yields a specificity, sensitivity, accuracy, and Matthews correlation coefficient (MCC) of 0.774, 0.826, 0.803, and 0.604, respectively. These results indicate that the support vector machine method performs well in classifying active and inactive DPP-IV inhibitors using topological fingerprints as descriptors.
AB - Type 2 diabetes is a chronic metabolic disease that often affects adults. Dipeptidyl peptidase-IV (DPP-IV) inhibitors are a class of drugs for type 2 diabetes mellitus (T2DM) that block the enzyme dipeptidyl peptidase-IV. Currently available inhibitors have adverse effects, so novel DPP-IV inhibitors with minimal adverse effects are still needed. In this paper, a machine learning approach is used to classify DPP-IV inhibitors by biological activity based on their molecular structure. The data set contains 3363 inhibitors, 1849 labeled active and 1514 labeled inactive, with topological fingerprints used as descriptors. However, topological fingerprints produce high-dimensional data, so principal component analysis is proposed to reduce the dimension of the data set. Support vector machine, decision tree, and neural network methods are then used to classify the DPP-IV inhibitors. Overall, classification with the support vector machine method yields a specificity, sensitivity, accuracy, and Matthews correlation coefficient (MCC) of 0.774, 0.826, 0.803, and 0.604, respectively. These results indicate that the support vector machine method performs well in classifying active and inactive DPP-IV inhibitors using topological fingerprints as descriptors.
UR - http://www.scopus.com/inward/record.url?scp=85096681850&partnerID=8YFLogxK
U2 - 10.1063/5.0030748
DO - 10.1063/5.0030748
M3 - Conference contribution
AN - SCOPUS:85096681850
T3 - AIP Conference Proceedings
BT - International Conference on Science and Applied Science, ICSAS 2020
A2 - Purnama, Budi
A2 - Nugraha, Dewanta Arya
A2 - Anwar, Fuad
PB - American Institute of Physics Inc.
T2 - 2020 International Conference on Science and Applied Science, ICSAS 2020
Y2 - 7 July 2020
ER -