Indonesian Automated Short Answer Scoring Using Sentence Transformers and Siamese LSTM

Nurul Chamidah, Indra Budi, Rizal Fathoni Aji

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The objective of automated short-answer scoring is to produce scores that are as close as possible to human-assigned scores. However, lexicon-based approaches using a bag of words have not yielded satisfactory score predictions, because sentences with identical meanings can be expressed using different words. To address this limitation, this study proposes semantic features derived from pretrained Sentence-BERT (SBERT) embedding vectors, which are then fed into a Siamese LSTM. The proposed model was evaluated on an Indonesian automated short answer scoring benchmark dataset, with cosine similarity and Manhattan distance compared as output functions of the Siamese LSTM. On the test data, the best results were a Pearson's correlation of 0.920 and a Root Mean Squared Error (RMSE) of 10.019, obtained when SBERT and the Siamese LSTM were fine-tuned with Manhattan distance as the output function. These values improve on those reported in the previous study by 0.077 in correlation and 7.154 in RMSE. With cosine similarity as the output function, the best test results were an RMSE of 11.494 and a correlation of 0.892, improvements of 5.679 and 0.049, respectively, over the previous study. These results show that combining sentence embeddings with an LSTM outperforms lexicon-based techniques for Indonesian automated short answer scoring on the benchmark dataset.
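The two output functions and the two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the vectors stand in for the final hidden states of the two Siamese LSTM branches, and the `exp(-L1)` form of the Manhattan output is an assumption borrowed from the common Siamese LSTM formulation, since the abstract does not state the exact form used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def manhattan_similarity(u, v):
    """Similarity derived from the Manhattan (L1) distance.

    Assumption: exp(-L1) is used so that identical vectors score 1
    and the score decays toward 0 as the vectors move apart.
    """
    l1 = sum(abs(a - b) for a, b in zip(u, v))
    return math.exp(-l1)


def pearson(xs, ys):
    """Pearson's correlation between predicted and human scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, gold):
    """Root Mean Squared Error between predicted and human scores."""
    squared = [(p - g) ** 2 for p, g in zip(predicted, gold)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical encodings of a reference answer and a student answer.
    reference = [0.2, 0.1, 0.4]
    student = [0.1, 0.1, 0.5]
    print(cosine_similarity(reference, student))
    print(manhattan_similarity(reference, student))

    # Hypothetical predicted and human scores, for illustration only.
    predicted_scores = [80.0, 55.0, 30.0, 95.0]
    human_scores = [75.0, 60.0, 40.0, 90.0]
    print(pearson(predicted_scores, human_scores))
    print(rmse(predicted_scores, human_scores))
```

Both output functions map a pair of encodings to a single similarity value that can be rescaled to the scoring range. A higher Pearson's correlation and a lower RMSE both indicate closer agreement with human scores.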

Original language: English
Title of host publication: 2024 9th International Conference on Informatics and Computing, ICIC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331517601
DOIs
Publication status: Published - 2024
Event: 9th International Conference on Informatics and Computing, ICIC 2024 - Hybrid, Medan, Indonesia
Duration: 24 Oct 2024 to 25 Oct 2024

Publication series

Name: 2024 9th International Conference on Informatics and Computing, ICIC 2024

Conference

Conference: 9th International Conference on Informatics and Computing, ICIC 2024
Country/Territory: Indonesia
City: Hybrid, Medan
Period: 24/10/24 to 25/10/24

Keywords

  • automated short answer scoring
  • SBERT
  • sentence transformers
  • Siamese LSTM
