TY - JOUR
T1 - Prediction of human-Streptococcus pneumoniae protein-protein interactions using logistic regression
AU - Prasasty, Vivitri Dewi
AU - Hutagalung, Rory Anthony
AU - Gunadi, Reinhart
AU - Sofia, Dewi Yustika
AU - Rosmalena, Rosmalena
AU - Yazid, Fatmawaty
AU - Sinaga, Ernawati
N1 - Funding Information:
We would like to thank Dr. Kholis Abdurachim Audah, Swiss German University, for the valuable input regarding our study.
Publisher Copyright:
© 2021 Elsevier Ltd
PY - 2021/6
Y1 - 2021/6
N2 - Streptococcus pneumoniae is a major cause of mortality in children under five years old. In recent years, the emergence of antibiotic-resistant strains of S. pneumoniae has increased the threat posed by this pathogen. For that reason, S. pneumoniae protein virulence factors should be explored when developing new drugs or vaccines, for instance through the analysis of host-pathogen protein-protein interactions (HP-PPIs). In this research, protein-protein interactions were predicted with a logistic regression model that used the number of protein domain occurrences as features. Using HP-PPIs of three different pathogens as training data, the model achieved 57–77% precision, 64–75% recall, and 96–98% specificity. Prediction of human-S. pneumoniae protein-protein interactions using the model yielded 5823 interactions involving 30 S. pneumoniae proteins and 324 human proteins. Pathway enrichment analysis showed that most of the pathways involved in the predicted interactions are immune system pathways. Network topology analysis revealed β-galactosidase (BgaA) as the most central among the S. pneumoniae proteins in the predicted HP-PPI networks, with a degree centrality of 1.0 and a betweenness centrality of 0.451853. Further experimental studies are required to validate the predicted interactions and examine their roles in S. pneumoniae infection.
AB - Streptococcus pneumoniae is a major cause of mortality in children under five years old. In recent years, the emergence of antibiotic-resistant strains of S. pneumoniae has increased the threat posed by this pathogen. For that reason, S. pneumoniae protein virulence factors should be explored when developing new drugs or vaccines, for instance through the analysis of host-pathogen protein-protein interactions (HP-PPIs). In this research, protein-protein interactions were predicted with a logistic regression model that used the number of protein domain occurrences as features. Using HP-PPIs of three different pathogens as training data, the model achieved 57–77% precision, 64–75% recall, and 96–98% specificity. Prediction of human-S. pneumoniae protein-protein interactions using the model yielded 5823 interactions involving 30 S. pneumoniae proteins and 324 human proteins. Pathway enrichment analysis showed that most of the pathways involved in the predicted interactions are immune system pathways. Network topology analysis revealed β-galactosidase (BgaA) as the most central among the S. pneumoniae proteins in the predicted HP-PPI networks, with a degree centrality of 1.0 and a betweenness centrality of 0.451853. Further experimental studies are required to validate the predicted interactions and examine their roles in S. pneumoniae infection.
KW - Host-pathogen protein-protein interactions
KW - Logistic regression
KW - Network centrality
KW - Pathway enrichment
KW - Streptococcus pneumoniae
UR - http://www.scopus.com/inward/record.url?scp=85105320211&partnerID=8YFLogxK
U2 - 10.1016/j.compbiolchem.2021.107492
DO - 10.1016/j.compbiolchem.2021.107492
M3 - Article
C2 - 33964803
AN - SCOPUS:85105320211
SN - 1476-9271
VL - 92
JO - Computational Biology and Chemistry
JF - Computational Biology and Chemistry
M1 - 107492
ER -