Multi-Task Deep Learning for User Intention Understanding in Speech Interaction Systems
Refereed conference paper presented and published in conference proceedings

Abstract: Speech interaction systems have been gaining popularity in recent years. The main purpose of these systems is to generate more satisfactory responses to users’ speech utterances, for which the most critical problem is analyzing user intention. Research shows that user intention conveyed through speech is expressed not only by content but is also closely related to users’ speaking manner (e.g., with or without acoustic emphasis). How to incorporate these heterogeneous attributes to infer user intention remains an open problem. In this paper, we define Intention Prominence (IP) as the semantic combination of focus by text and emphasis by speech, and propose a multi-task deep learning framework to predict IP. Specifically, we first use long short-term memory (LSTM), which is capable of modeling long- and short-term contextual dependencies, to detect focus and emphasis, and combine the focus and emphasis detection tasks with multi-task learning (MTL) so that each reinforces the performance of the other. We then employ a Bayesian network (BN) to incorporate multimodal features (focus, emphasis, and location reflecting users’ dialect conventions) to predict IP based on feature correlations. Experiments on a data set of 135,566 utterances collected from the real-world Sogou Voice Assistant show that our method outperforms the comparison methods by 6.9-24.5% in terms of F1-measure. Moreover, practical use in the Sogou Voice Assistant indicates that our method can improve performance on user intention understanding by 7%.
All Author(s) List: Yishuang Ning, Jia Jia, Zhiyong Wu, Runnan Li, Yongsheng An, Yanfeng Wang, Helen Meng
Name of Conference: The Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)
Start Date of Conference: 04/02/2017
End Date of Conference: 09/02/2017
Place of Conference: San Francisco
Country/Region of Conference: United States of America
Proceedings Title: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)
Pages: 161 - 167
Languages: English-United States

Last updated on 2020-04-07 at 03:15