Learning a Unified Embedding Space of Web Search from Large-scale Query Log
Publication in refereed journal


Other information
Abstract: In the procedure of Web search, a user first comes up with an information need and issues a query guided by that need. After that, some URLs are clicked, and other queries may be issued if those URLs do not meet the need well. We advocate that Web search is governed by a unified hidden space, and that each involved element, such as a query or a URL, has its inborn position in this space, i.e., it is projected as a vector. Each of the above actions in the search procedure, i.e., issuing queries or clicking URLs, is the result of an interaction among those elements in the space. In this paper, we aim at uncovering such a unified hidden space of Web search that uniformly captures the hidden semantics of search queries, URLs and other involved elements in Web search. We learn the semantic space from search session data, because a search session can be regarded as an instantiation of a user's information need on a particular semantic topic, and it keeps the interaction information of queries and URLs. We use a set of session graphs to represent search sessions, and the space learning task is cast as a vector learning problem for the graph vertices, solved by maximizing the log-likelihood of a training session data set. Specifically, we adapt the well-known Word2vec to perform the learning procedure. Experiments on the query log data of a commercial search engine are conducted to examine the efficacy of the learnt vectors, and the results show that our framework is helpful for a range of finer-grained tasks in Web search.
Acceptance Date: 24/02/2018
All Author(s) List: Bing LD, Niu ZY, Li PJ, Lam W, Wang HF
Journal name: Knowledge-Based Systems
Volume Number: 150
Pages: 38 - 48
Languages: English-United Kingdom
Keywords: Web search, Query representation, Embedding space, Session analysis
Web of Science Subject Categories: Computer Science, Artificial Intelligence; Computer Science

Last updated on 2021-05-03 at 01:56