Acoustic Assessment of Disordered Voice with Continuous Speech Based on Utterance-level ASR Posterior Features
Refereed conference paper presented and published in conference proceedings

Times Cited
Web of Science: 7 (as at 22/09/2021)

Other information
Abstract: Most previous studies on acoustic assessment of disordered voice focused on extracting perturbation features from isolated vowels produced with steady-state phonation. Natural speech, however, is considered preferable in terms of flexibility, effectiveness and reliability for clinical practice. This paper investigates the application of automatic speech recognition (ASR) technology to disordered voice assessment of Cantonese speakers. A DNN-based ASR system is trained using phonetically rich continuous utterances from normal speakers. It was found that frame-level phone posteriors obtained from the ASR system are strongly correlated with the severity level of voice disorder. Phone posteriors in utterances with severe disorder exhibit significantly larger variation than those with mild disorder. A set of utterance-level posterior features is computed to quantify such variation for pattern recognition purposes. An SVM-based classifier is used to classify an input utterance into the categories of mild, moderate and severe disorder. The two-class classification accuracy for mild and severe disorders is 90.3%, and significant confusion between mild and moderate disorders is observed. For some of the subjects with severe voice disorder, the classification results are highly inconsistent among individual utterances. Furthermore, short utterances tend to have more classification errors.
Index Terms: disordered voice, continuous speech, speech recognition, acoustic posteriors
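As a rough illustration of the idea in the abstract (summarising frame-level phone posteriors into utterance-level statistics that capture their variation), the sketch below computes a few simple statistics in plain Python. The function names, the choice of entropy and peak-posterior statistics, and the toy data are all assumptions made for illustration; the abstract does not specify the paper's actual feature set, and this is not a reconstruction of it.

```python
import math
from statistics import mean, pstdev


def frame_entropy(posterior):
    """Shannon entropy (in nats) of one frame's phone posterior distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def utterance_features(posteriors):
    """Summarise frame-level posteriors as utterance-level statistics.

    posteriors: list of frames, each a list of phone probabilities.
    The statistics below are hypothetical stand-ins chosen for illustration,
    not the features used in the paper.
    """
    if not posteriors:
        raise ValueError("utterance has no frames")
    entropies = [frame_entropy(frame) for frame in posteriors]
    peaks = [max(frame) for frame in posteriors]
    return {
        "mean_entropy": mean(entropies),  # average uncertainty per frame
        "std_entropy": pstdev(entropies),  # how much that uncertainty fluctuates
        "mean_peak": mean(peaks),          # average confidence in the top phone
        "std_peak": pstdev(peaks),         # fluctuation of that confidence
    }


# Toy posteriors over three phone classes (invented data, not from the paper).
stable = [[0.9, 0.05, 0.05], [0.9, 0.05, 0.05], [0.9, 0.05, 0.05]]
unstable = [[0.9, 0.05, 0.05], [0.34, 0.33, 0.33], [0.1, 0.8, 0.1]]

stable_feats = utterance_features(stable)
unstable_feats = utterance_features(unstable)
```

In a setup like the one the abstract describes, a feature vector of this kind would be computed per utterance and passed to a classifier such as an SVM; that step is omitted here to keep the sketch dependency-free.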
All Author(s) List: Liu Y. Y., Lee T., Ching P. C., Law T., Lee K. Y. S.
Name of Conference: INTERSPEECH 2017
Start Date of Conference: 20/08/2017
End Date of Conference: 24/08/2017
Place of Conference: Stockholm
Country/Region of Conference: Sweden
Proceedings Title: Proceedings of Interspeech 2017
Place of Publication: Sweden
Pages: 2680-2684
Languages: English-United States
Keywords: disordered voice, continuous speech, speech recognition, acoustic posteriors
