Feature-Rich Classifiers for Recognizing Textual Entailment in Indonesian

Rani Aulia Hidayat, Isnaini Nurul Khasanah, Wava Carissa Putri, Rahmad Mahendra

Research output: Contribution to journal, Conference article, peer-review

1 Citation (Scopus)

Abstract

Recognizing Textual Entailment (RTE) is a Natural Language Processing task that determines whether one sentence (the text) semantically entails another sentence (the hypothesis). In this paper, we extracted and learned 35 features from text-hypothesis pairs in Indonesian. An ablation study was conducted to analyze the features' contribution to the RTE model. The experiments showed that, with Support Vector Machine (SVM) and Logistic Regression classifiers, the token-based features contribute positively to model performance. The best model in our experiments is the SVM, which achieved an F1-score of 79.65%. Although it gives up 5 points of accuracy relative to the state-of-the-art BERT model, the SVM classifier takes 31 hours less to train.
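The abstract does not list the 35 features, so the following is only a minimal sketch of what token-based features and a leave-one-out ablation could look like. The feature names (`hyp_coverage`, `jaccard`, `type_length_ratio`), the whitespace tokenizer, the helper functions, and the Indonesian example pair are all assumptions made for illustration; they are not taken from the paper. The F1 helper computes the standard binary F1 with label 1 as the positive (entailment) class.

```python
from typing import Dict, List, Sequence


def tokenize(sentence: str) -> List[str]:
    # Assumption: lowercase whitespace splitting stands in for a real tokenizer.
    return sentence.lower().split()


def token_features(text: str, hypothesis: str) -> Dict[str, float]:
    """Hypothetical token-based features for one text-hypothesis pair."""
    t, h = set(tokenize(text)), set(tokenize(hypothesis))
    shared, union = t & h, t | h
    return {
        # Share of hypothesis token types that also occur in the text.
        "hyp_coverage": len(shared) / len(h) if h else 0.0,
        # Jaccard similarity between the two sets of token types.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Ratio of unique hypothesis tokens to unique text tokens.
        "type_length_ratio": len(h) / len(t) if t else 0.0,
    }


def ablation_subsets(names: Sequence[str]) -> Dict[str, List[str]]:
    """Map each feature name to the feature list with that one feature removed."""
    return {removed: [n for n in names if n != removed] for removed in names}


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Binary F1 with label 1 as the positive class; 0.0 if no true positives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented example pair, used only to exercise the functions.
    feats = token_features("Budi membeli buku di toko", "Budi membeli buku")
    print(feats)
    print(ablation_subsets(list(feats)))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In an ablation of this kind, a classifier would be retrained once per subset returned by `ablation_subsets`, and the change in F1 against the full feature set would indicate how much the removed feature contributes.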

Original language: English
Pages (from-to): 148-155
Number of pages: 8
Journal: Procedia CIRP
Volume: 189
Publication status: Published - 2021
Event: 5th International Conference on Artificial Intelligence in Computational Linguistics, ACLing 2021 - Virtual, Online, United Arab Emirates
Duration: 4 Jun 2021 to 5 Jun 2021

Keywords

  • Ablation study
  • Feature
  • Indonesia
  • Text classification
  • Natural Language Inference
  • Recognizing Textual Entailment
