Indonesian-English Code-Switching Speech Recognition using the Machine Speech Chain based Semi-Supervised Learning

Rais Vaza Man Tazakka, Dessi Lestari, Ayu Purwarianti, Dipta Tanaya, Kurniawati Azizah, Sakriani Sakti

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Indonesia is home to a diverse linguistic landscape, where individuals seamlessly transition between Indonesian, English, and local dialects in their everyday conversations—a phenomenon known as code-switching. Understanding and accommodating this linguistic fluidity is essential, particularly in the development of accurate speech recognition systems. However, tackling Indonesian-English code-switching poses a challenge due to the scarcity of paired code-switching data. Thus, this study endeavors to address Indonesian-English code-switching in speech recognition, leveraging unlabeled data and employing a semi-supervised technique known as the machine speech chain. Our findings demonstrate that the machine speech chain method effectively enhances automatic speech recognition (ASR) performance in recognizing code-switching between Indonesian and English, utilizing previously untapped resources of unlabeled data.

Original language: English
Title of host publication: 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on Under-Resourced Languages, SIGUL 2024 at LREC-COLING 2024 - Workshop Proceedings
Editors: Maite Melero, Sakriani Sakti, Claudia Soria
Publisher: European Language Resources Association (ELRA)
Pages: 143-148
Number of pages: 6
ISBN (Electronic): 9782493814296
Publication status: Published - 2024
Event: 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on Under-Resourced Languages, SIGUL 2024 - Turin, Italy
Duration: 21 May 2024 to 22 May 2024

Publication series

Name: 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on Under-Resourced Languages, SIGUL 2024 at LREC-COLING 2024 - Workshop Proceedings

Conference

Conference: 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on Under-Resourced Languages, SIGUL 2024
Country/Territory: Italy
City: Turin
Period: 21/05/24 to 22/05/24

Keywords

  • code-switching
  • machine speech chain
  • speech recognition systems
