An Evaluation of Neural Vocoder-Based Voice Cloning System for Dysphonia Speech Disorder

Dhiya Dewangga, Dessi Lestari, Ayu Purwarianti, Dipta Tanaya, Kurniawati Azizah, Sakriani Sakti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Dysphonia is a voice disorder that affects voice quality, quantity, and intensity, and it occurs across ages and backgrounds. It makes communication difficult, thereby reducing overall quality of life. Medical solutions have been proposed to improve the speech quality of individuals with dysphonia, but they are often expensive and time-consuming. Alternative ways to enhance speech quality are therefore needed. One such alternative is speech processing technology, specifically text-to-speech (TTS) with voice cloning techniques. Our work examines how the vocoder in a voice cloning system affects the quality of synthesized speech for speakers with dysphonia. We compare vocoder models selected on the basis of architecture and performance. Furthermore, we explore the effect of using Speaker Conditionals on the vocoder. We perform an objective evaluation of each vocoder to measure the quality of the models.

Original language: English
Title of host publication: 2024 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024 - Proceedings
Editors: Ming-Hsiang Su, Jui-Feng Yeh, Yuan-Fu Liao, Chi-Chun Lee, Yu Taso
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331506032
Publication status: Published - 2024
Event: 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024 - Hsinchu, Taiwan, Province of China
Duration: 17 Oct 2024 to 19 Oct 2024

Publication series

Name: 2024 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024 - Proceedings

Conference

Conference: 27th Conference on the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2024
Country/Territory: Taiwan, Province of China
City: Hsinchu
Period: 17/10/24 to 19/10/24

Keywords

  • dysphonia
  • speech synthesis
  • text-to-speech
  • vocoder
  • voice cloning
