COMPUTATIONAL QSAR-BASED MACHINE LEARNING APPROACH FOR PREDICTING ACTIVITY OF SGLT2 INHIBITORS USING THE KNIME PLATFORM

Research output: Contribution to journal, Article, peer-review

Abstract

Objective: This study aims to identify optimal predictive models and key molecular fragments by preparing a curated dataset and applying machine learning techniques within the Konstanz Information Miner (KNIME) platform. Methods: The human sodium-glucose cotransporter 2 (SGLT2) target dataset was obtained from the ChEMBL database and refined by removing salts, incomplete or incorrect records, and duplicates. The compounds were classified as active or inactive, and fingerprints and descriptors were calculated. Christian Borgelt's Molecular Substructure Miner (MoSS) was employed to identify frequent molecular fragments. Following data partitioning, various classification and regression machine learning (ML)-based Quantitative Structure-Activity Relationship (QSAR) models were developed and evaluated using several metrics, including sensitivity and mean squared error (MSE). Results: In QSAR classification, the Support Vector Machine (SVM) model demonstrated the best performance, with an accuracy of 81.66%, while in QSAR regression, the Extreme Gradient Boosting (XGB) model exhibited the best coefficient of determination (R²) and mean absolute error (MAE), with values of 0.69 and 0.47, respectively. The frequent molecular fragments identified highlighted common characteristics of active SGLT2 inhibitors. Conclusion: The QSAR models developed here indicate that machine learning methods can be used effectively to predict SGLT2 inhibitor activity in silico, thereby expediting the drug discovery process.
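
For readers unfamiliar with the evaluation metrics named in the abstract (accuracy, sensitivity, MSE, MAE, and R²), the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the authors' KNIME workflow; the function names and the toy labels and values in the usage example are hypothetical and chosen only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy and sensitivity (recall of the positive, i.e. 'active', class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity is undefined when there are no true positives in the data.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"accuracy": accuracy, "sensitivity": sensitivity}


def regression_metrics(y_true, y_pred):
    """Mean squared error, mean absolute error, and coefficient of determination."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    r2 = 1 - ss_res / ss_tot  # assumes y_true is not constant (ss_tot > 0)
    return {"mse": mse, "mae": mae, "r2": r2}


if __name__ == "__main__":
    # Hypothetical toy data: 1 = active, 0 = inactive.
    print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
    # Hypothetical toy activity values (e.g. on a pIC50-like scale).
    print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

In a classification setting such as the active/inactive split described above, sensitivity reports the fraction of truly active compounds that the model recovers, while R² and MAE summarize how closely a regression model tracks the measured activity values.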

Original language: English
Pages (from-to): 328-333
Number of pages: 6
Journal: International Journal of Applied Pharmaceutics
Volume: 17
Issue number: 1
Publication status: Published - 1 Jan 2025

Keywords

  • Artificial intelligence
  • In silico
  • KNIME
  • Machine learning
  • QSAR
  • SGLT2 inhibitor
