Building MEDISCO: Indonesian Speech Corpus for Medical Domain

Muhammad Reza Qorib, Mirna Adriani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

In this paper, we report our work on building MEDISCO, the Medical Indonesian Speech Corpus. The medical text corpus was collected from five Indonesian online medical consultation websites. From the text corpus, we created a speech corpus that consists of 360 sentences read by 13 speakers. In total, our speech corpus contains 731 medical terms and consists of 4,680 utterances with a total duration of 10 hours.

Original language: English
Title of host publication: Proceedings of the 2018 International Conference on Asian Language Processing, IALP 2018
Editors: Minghui Dong, Moch. Bijaksana, Herry Sujaini, Arif Bijaksana Putra Negara, Ade Romadhony, Fariska Z. Ruskanda, Elvira Nurfadhilah, Lyla Ruslana Aini
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 133-138
Number of pages: 6
ISBN (Electronic): 9781728111766
DOIs
Publication status: Published - 2 Jul 2018
Event: 22nd International Conference on Asian Language Processing, IALP 2018 - Bandung, Indonesia
Duration: 15 Nov 2018 to 17 Nov 2018

Publication series

Name: Proceedings of the 2018 International Conference on Asian Language Processing, IALP 2018

Conference

Conference: 22nd International Conference on Asian Language Processing, IALP 2018
Country/Territory: Indonesia
City: Bandung
Period: 15/11/18 to 17/11/18

Keywords

  • Indonesian Automatic Speech Recognition
  • Medical Speech Corpus
  • Text Corpus
