Improving automatic forced alignment for dysarthric speech transcription
Refereed conference paper presented and published in conference proceedings

Other information
Abstract: Dysarthria is a motor speech disorder caused by neurologic deficits. Impaired movement of the muscles used for speech production leads to disordered speech, in which utterances show prolonged pause intervals, slow speaking rates, poor articulation of phonemes, syllable deletions, etc. These characteristics pose challenges for the use of speech technologies in the automatic processing of dysarthric speech data. As a first step towards addressing these challenges, this work tackles the performance degradation observed in forced alignment. We perform initial alignments to locate long pauses in dysarthric speech and use these pause intervals as anchor points. We then apply speech recognition to generate word lattices, from which we recover the time-stamps of words with disordered or incomplete pronunciations. By verifying the initial alignments against the word lattices, we obtain reliably aligned segments. These segments provide constraints for new alignment grammars that can improve alignment and transcription quality. We applied the proposed strategy to the TORGO corpus and obtained improved alignments for most dysarthric speech data, while maintaining good alignments for non-dysarthric speech data.
All Author(s) List: Yeung Y.T., Wong K.H., Meng H.
Name of Conference: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Start Date of Conference: 06/09/2015
End Date of Conference: 10/09/2015
Place of Conference: Dresden
Country/Region of Conference: Germany
Volume Number: 2015-January
Pages: 2991 - 2995
Languages: English-United Kingdom
Keywords: Automatic forced alignment, Dysarthric speech, Speech recognition, Word lattices
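
Illustrative sketch (not from the paper): the abstract describes using long pauses as anchor points and keeping only those segments whose initial alignment is confirmed by the word lattice. The Python below is one possible way to picture that verification step, under stated assumptions. The names Word, find_anchor_pauses, is_confirmed and reliable_segments, the "<sil>" pause label, the 0.5 s pause threshold and the 0.1 s tolerance are all hypothetical choices made for this example; the lattice is simplified to a flat list of timed word hypotheses, and the authors' actual criteria may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    label: str    # word identity, or "<sil>" for a pause (assumed convention)
    start: float  # start time in seconds
    end: float    # end time in seconds


def find_anchor_pauses(alignment, min_pause=0.5):
    """Return pauses in the initial alignment lasting at least min_pause seconds."""
    return [w for w in alignment
            if w.label == "<sil>" and (w.end - w.start) >= min_pause]


def is_confirmed(word, lattice_words, tolerance=0.1):
    """True if some lattice hypothesis has the same label and similar boundaries."""
    return any(h.label == word.label
               and abs(h.start - word.start) <= tolerance
               and abs(h.end - word.end) <= tolerance
               for h in lattice_words)


def reliable_segments(alignment, lattice_words, min_pause=0.5, tolerance=0.1):
    """Split the alignment at long pauses; keep segments whose words are all confirmed."""
    segments, current = [], []
    for w in alignment:
        if w.label == "<sil>":
            if (w.end - w.start) >= min_pause:
                # A long pause acts as an anchor point and closes the segment.
                if current:
                    segments.append(current)
                current = []
            continue  # short pauses are skipped and do not split the segment
        current.append(w)
    if current:
        segments.append(current)
    return [seg for seg in segments
            if all(is_confirmed(w, lattice_words, tolerance) for w in seg)]
```

In this sketch, a segment containing even one unconfirmed word is discarded as a whole; segments that survive could then serve as fixed constraints when a new alignment grammar is built for the remaining audio.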

Last updated on 2020-01-09 at 00:05