Data Augmentation Based on Adversarial Autoencoder Handling Imbalance for Learning to Rank
Refereed conference paper presented and published in conference proceedings



Abstract: Data imbalance is a key limiting factor for Learning to Rank (LTR) models in information retrieval. Resampling methods and ensemble methods cannot handle the imbalance problem well since none of them incorporate more informative data into the training procedure of LTR models. We propose a data generation model based on Adversarial Autoencoder (AAE) for tackling the data imbalance in LTR via informative data augmentation. This model can be utilized for handling two types of data imbalance, namely, imbalance regarding relevance levels for a particular query and imbalance regarding the amount of relevance judgements in different queries. In the proposed model, relevance information is disentangled from the latent representations in this AAE-based model in order to reconstruct data with specific relevance levels. The semantic information of queries, derived from word embeddings, is incorporated in the adversarial training stage for regularizing the distribution of the latent representation. Two informative data augmentation strategies suitable for LTR are designed utilizing the proposed data generation model. Experiments on benchmark LTR datasets demonstrate that our proposed framework can significantly improve the performance of LTR models.
Authors: Qian Yu, Wai Lam
Conference name: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
Conference location: Honolulu, Hawaii, USA
Proceedings title: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
Pages: 411-418

Last updated: 2021-22-01 at 23:49