Exploring Topic Discriminating Power of Words in Latent Dirichlet Allocation
Refereed conference paper presented and published in conference proceedings


Other information
Abstract
Latent Dirichlet Allocation (LDA) and its variants have been widely used to discover latent topics in textual documents. However, some of the topics generated by LDA may be noisy, with irrelevant words scattered across them. We call such words topic-indiscriminate words; they tend to make topics more ambiguous and less interpretable to humans. In this work, we propose a new topic model, TWLDA, which assigns low weights to words with low topic-discriminating power (ability). Our experimental results show that the proposed approach effectively reduces the number of topic-indiscriminate words in the discovered topics and thereby improves the effectiveness of LDA.
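The abstract describes down-weighting words with low topic-discriminating power. The paper's actual TWLDA weighting scheme is not given here, so the following is only an illustrative sketch under an assumed definition: a word that spreads its probability mass evenly across all topics (high entropy) is treated as topic-indiscriminate and receives a low weight, while a word concentrated in a few topics receives a high weight. The function name and the entropy-based formula are assumptions, not the authors' method.

```python
import numpy as np

def topic_discrimination_weights(phi, eps=1e-12):
    """Illustrative (assumed) weighting: phi is a (K topics x V words)
    word-topic probability matrix, e.g. taken from a trained LDA model.
    Returns one weight in [0, 1] per word: low for words spread evenly
    across topics, high for words concentrated in few topics."""
    # Normalize each word's column into a distribution over topics.
    per_word = phi / (phi.sum(axis=0, keepdims=True) + eps)
    # Shannon entropy of each word's topic distribution.
    entropy = -(per_word * np.log(per_word + eps)).sum(axis=0)
    # Maximum possible entropy is log(K) (uniform over K topics).
    max_ent = np.log(phi.shape[0])
    # High entropy -> topic-indiscriminate -> low weight.
    return 1.0 - entropy / max_ent

# Toy example: word 0 is concentrated in topic 0, word 1 is uniform.
phi = np.array([[0.99, 0.5],
                [0.01, 0.5]])
w = topic_discrimination_weights(phi)
```

Here `w[0]` is close to 1 (topic-discriminating) and `w[1]` is close to 0 (topic-indiscriminate); in a scheme like TWLDA such weights could then scale each word's contribution during inference.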
All Author(s) List: Kai Yang, Yi Cai, Zhenhong Chen, Ho-fung Leung, Raymond LAU
Name of Conference: COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Start Date of Conference: 11/12/2016
End Date of Conference: 17/12/2016
Place of Conference: Osaka
Country/Region of Conference: Japan
Proceedings Title: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Year: 2016
Month: 12
Publisher: The COLING 2016 Organizing Committee
Pages: 2238 - 2247
Languages: English-United States

Last updated on 2018-01-20 at 19:40