Paragraph vs Sentence in Automatic Question Generation Fine-Tuning using Text-to-Text Transfer Transformer for Bahasa Indonesia

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Automatic Question Generation (AQG) has been developed to help create usable questions for assessment purposes. AQG has been adopted in many domains and languages, including Indonesian. The state-of-the-art method in AQG is the transformer model. The most recent Indonesian AQG model, however, still suffers from irrelevant question-answer (QA) pairs. We propose preprocessing mechanisms that reduce irrelevant QA pairs by using only a single sentence or the top 3 sentences as the input, which has not been explored before. We used basic string matching and BM25Okapi for this purpose. The base variant of the multilingual Text-to-Text Transfer Transformer (mT5) is fine-tuned on the Indonesian SQuAD and TyDi QA datasets with three different input schemes: paragraph, sentence, and top 3 most relevant sentences. We evaluated the model using the BLEU and ROUGE metrics. Our findings suggest that the input scheme influences model performance, and that the characteristics of the dataset play an important role in deciding which input scheme to use. These findings could serve as a basis for further development of AQG in Indonesian, especially for improving the preprocessing of current and future models.
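
The sentence and top 3 sentence input schemes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the naive sentence splitter, the use of the answer text as the BM25 query, and the parameters `k1=1.5` and `b=0.75` are all assumptions made for illustration. BM25 is written out in plain Python (with a smoothed, always-positive IDF) so that the example has no third-party dependencies.

```python
import math
import re
from collections import Counter


def split_sentences(paragraph):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def tokenize(text):
    """Lowercase word tokens; a real pipeline would likely use a proper tokenizer."""
    return re.findall(r"\w+", text.lower())


def sentence_with_answer(paragraph, answer):
    """Basic string matching: return the first sentence containing the answer span."""
    for sentence in split_sentences(paragraph):
        if answer.lower() in sentence.lower():
            return sentence
    return None


def bm25_scores(query, sentences, k1=1.5, b=0.75):
    """Score each sentence against the query with a BM25 variant (smoothed IDF)."""
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    avgdl = (sum(len(d) for d in docs) / n) or 1.0
    # Document frequency: number of sentences in which each term appears.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_k_sentences(paragraph, query, k=3):
    """Keep the k highest-scoring sentences, restored to their original order."""
    sentences = split_sentences(paragraph)
    if not sentences:
        return []
    scores = bm25_scores(query, sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Under these assumptions, the paragraph scheme would pass the full context to the model, the sentence scheme would pass the output of `sentence_with_answer`, and the top 3 scheme would pass the joined output of `top_k_sentences`.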

Original language: English
Title of host publication: Proceedings - 2024 10th International Conference on Education and Technology
Subtitle of host publication: Digital Resilience and Empowering Ethical Artificial Intelligence (AI) for Education, ICET 2024
Publisher: Institute of Electrical and Electronics Engineers
Pages: 155-161
Number of pages: 7
ISBN (Electronic): 9798331522483
Publication status: Published - 2024
Event: 10th International Conference on Education and Technology, ICET 2024 - Malang, Indonesia
Duration: 10 Oct 2024 → …

Publication series

Name: Proceedings - International Conference on Education and Technology, ICET
ISSN (Print): 2770-4807

Conference

Conference: 10th International Conference on Education and Technology, ICET 2024
Country/Territory: Indonesia
City: Malang
Period: 10/10/24 → …

Keywords

  • automatic question generation
  • indonesia
  • mt5
