Applying Multitask Learning to Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech
Refereed conference paper presented and published in conference proceedings


Other information
Abstract: For mispronunciation detection and diagnosis (MDD), current approaches generally treat phonemes in correct pronunciations and mispronunciations as the same, even though they may carry different characteristics. Furthermore, a serious data imbalance between correct pronunciations and mispronunciations in the dataset further degrades performance. To address these problems, this paper investigates the use of multi-task (MT) learning to enhance the acoustic-phonemic model (APM) for MDD. Phonemes in correct pronunciations and mispronunciations are processed separately but in a multi-task manner that considers both the correct-pronunciation and mispronunciation recognition tasks. A feature representation module is further proposed to improve performance. Compared with the baseline APM, the proposed MT-APM and R-MT-APM achieve better performance not only in Precision, Recall and F-Measure, but also in mispronunciation detection and diagnosis accuracies. With the feature representation module, R-MT-APM achieves the highest mispronunciation detection accuracy.
All Author(s) List: Shaoguang Mao, Zhiyong Wu, Runnan Li, Xu Li, Helen Meng, Lianhong Cai
Name of Conference: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018)
Start Date of Conference: 15/04/2018
End Date of Conference: 20/04/2018
Place of Conference: Calgary
Country/Region of Conference: Canada
Proceedings Title: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018)
Year: 2018
Languages: English-United Kingdom
Keywords: Computer-aided pronunciation training, mispronunciation detection and diagnosis, multi-task learning, acoustic-phonemic model, feature representation

Last updated on 2018-03-07 at 11:59