Disordered Speech Assessment Using Kullback-Leibler Divergence Features with Multi-Task Acoustic Modeling
Refereed conference paper presented and published in conference proceedings


Abstract: For acoustical assessment of pathological speech, naturally spoken sentences are believed to be most suitable from the perspectives of both patients and clinicians. This is a challenging problem, as the extraction of pathology-dependent features is not straightforward. Previous research showed that features derived from lattice posteriors and decoding results of automatic speech recognition (ASR) could be used to quantify various types of speech impairments. This paper describes a novel feature that can be derived from phone posterior probabilities generated by an ASR system. The Kullback-Leibler (KL) divergence is used to measure the phone-level distortion between unimpaired and impaired speakers. A Cantonese ASR system is trained with a combination of normal and impaired speech corpora. The multi-task learning approach is applied in order to incorporate different speech characteristics. Experimental results show that the proposed KL divergence feature is effective in the continuous speech based assessment of different pathologies, including voice disorder and post-stroke aphasia. The KL divergence feature is found to outperform conventional acoustic features and supra-segmental duration features, and is complementary to text features in quantifying language impairment. Index Terms: disordered speech assessment, voice disorders, aphasia, continuous speech, KL divergence, ASR, multi-task learning.
Authors: Yuanyuan Liu, Ying Qin, Siyuan Feng, Tan Lee, P.C. Ching
Conference: 11th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Proceedings title: Proceedings of ISCSLP 2018
Pages: 61-65
Keywords: disordered speech assessment, voice disorders, aphasia, continuous speech, KL divergence, ASR, multi-task learning
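
The abstract describes using the KL divergence to measure phone-level distortion between unimpaired and impaired speakers, based on phone posterior probabilities from an ASR system. The record does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: posteriors are assumed to be available as per-phone arrays of frame-level probability vectors, each phone is summarized by its mean posterior vector, and the one-directional divergence KL(reference || test) is used. The names `kl_divergence` and `phone_level_kl` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between two discrete distributions.

    A small eps is added before renormalizing so that zero
    probabilities do not produce log(0) or division by zero.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


def phone_level_kl(ref_posteriors, test_posteriors):
    """Per-phone KL divergence between two speaker groups.

    Both arguments are assumed to be dicts mapping a phone label to an
    array of shape (n_frames, n_phones) holding frame-level posteriors
    for frames aligned to that phone. This layout is an assumption made
    for illustration only.
    """
    features = {}
    for phone, ref_frames in ref_posteriors.items():
        if phone not in test_posteriors:
            continue  # skip phones the test speaker never produced
        ref_mean = np.mean(ref_frames, axis=0)
        test_mean = np.mean(test_posteriors[phone], axis=0)
        features[phone] = kl_divergence(ref_mean, test_mean)
    return features
```

Under these assumptions, a phone whose averaged posterior for the impaired speaker is close to that of the unimpaired reference yields a divergence near zero, while a phone the ASR system tends to confuse yields a larger value.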

Last updated: 2020-21-10 at 02:53