Learning a Unified Embedding Space of Web Search from Large-scale Query Log
Publication in refereed journal


Abstract: In Web search, a user first forms an information need and issues a query guided by that need. The user then clicks some URLs and may issue further queries if those URLs do not meet the need well. We propose that Web search is governed by a unified hidden space, and that each element involved, such as a query or a URL, has an inherent position in this space, represented as a vector. Each of the above actions in the search procedure, i.e., issuing queries or clicking URLs, results from interactions among those elements in the space. In this paper, we aim to uncover such a unified hidden space, one that uniformly captures the hidden semantics of search queries, URLs, and other elements involved in Web search. We learn the semantic space from search session data, because a search session can be regarded as an instantiation of a user's information need on a particular semantic topic, and it preserves the interaction information between queries and URLs. We represent search sessions as a set of session graphs, and cast space learning as a vector learning problem for the graph vertices, solved by maximizing the log-likelihood of a training set of sessions. Specifically, we build on the well-known Word2vec to perform the learning procedure. Experiments on the query log data of a commercial search engine examine the efficacy of the learnt vectors, and the results show that our framework is helpful for different finer-grained tasks in Web search.
Authors: Bing LD, Niu ZY, Li PJ, Lam W, Wang HF
Journal: Knowledge-Based Systems
Pages: 38 - 48
Keywords: Web search, Query representation, Embedding space, Session analysis
Web of Science subject categories: Computer Science, Artificial Intelligence; Computer Science

Last updated: 2021-18-01 at 01:16