Carryover Tonal Variations for Speech Recognition in Standard Chinese

Hana Nurul Hasanah, Qing Yang, Yiya Chen

Research output: Contribution to conference, Paper, peer-review

Abstract

Substantial pitch variation due to tonal coarticulation occurs when lexical tones are produced in succession. When a coarticulated tone is excised out of context and played in isolation, coarticulatory pitch variations may inhibit tone recognition. It remains unclear how listeners utilize coarticulatory pitch cues for online speech recognition in context. Using the printed-word eye-tracking paradigm, we tested the recognition of the high tone in a low-high tonal sequence by native Standard Chinese (SC) listeners. In this sequence, the high tone exhibits coarticulatory rising f0. We manipulated the presentation of the preceding low tone: auditorily and visually present; auditorily absent (i.e., substituted by pink noise) but visually present; or visually replaced by a high tone (to prompt an inappropriate tonal coarticulatory cue). Analyses of the point of divergence and proportions of eye fixations revealed that listeners’ correct fixations at the high-tone target started early and increased quickly even though only the rising f0 part of the high tone was played auditorily, with a gradual delay following the compatibility between the visual and auditory stimulus presentations. The immediate utilization of tonal coarticulation for speech recognition by SC listeners suggests the need for fine-grained coarticulatory information in speech representation and processing.

Original language: English
Pages: 398-402
Publication status: Published - 2 Jul 2024
Event: Speech Prosody 2024, Leiden, Netherlands
Duration: 2 Jul 2024 to 5 Jul 2024

Conference

Conference: Speech Prosody 2024
Country/Territory: Netherlands
City: Leiden
Period: 2/07/24 to 5/07/24

Keywords

  • online speech processing
  • carryover tonal information
  • eye-tracking
