Supervised topic models with word order structure for document classification and retrieval learning
Publication in refereed journal


Abstract: One limitation of most existing probabilistic latent topic models for document classification is that the topic model itself does not consider useful side-information, namely, class labels of documents. Topic models, which in turn consider the side-information, popularly known as supervised topic models, do not consider the word order structure in documents. One of the motivations behind considering the word order structure is to capture the semantic fabric of the document. We investigate a low-dimensional latent topic model for document classification. Class label information and word order structure are integrated into a supervised topic model enabling a more effective interaction among such information for solving document classification. We derive a collapsed Gibbs sampler for our model. Likewise, supervised topic models with word order structure have not been explored in document retrieval learning. We propose a novel supervised topic model for document retrieval learning which can be regarded as a pointwise model for tackling the learning-to-rank task. Available relevance assessments and word order structure are integrated into the topic model itself. We conduct extensive experiments on several publicly available benchmark datasets, and show that our model improves upon the state-of-the-art models.
Authors: Jameel S., Lam W., Bing L.
Journal: Information Retrieval
Details: Issue 4.
Publisher: Kluwer Academic Publishers
Pages: 283-330
Keywords: Document classification, Learning-to-rank, Maximum-margin, Structured topic model, Topic modeling

Last updated: 2021-15-01 at 00:49