Bridging the Gap between Pre-Training and Fine-Tuning for Commonsense Generation
Refereed conference paper presented and published in conference proceedings


Other information
Abstract: Commonsense generation aims to generate a plausible sentence containing all given unordered concept words. Previous methods focusing on this task usually directly concatenate these words as the input of a pre-trained language model (PLM). However, in the pre-training process of PLMs, the inputs are often corrupted sentences with correct word order. This discrepancy in input distribution between pre-training and fine-tuning makes it difficult for the model to fully utilize the knowledge of PLMs. In this paper, we propose a two-stage framework to alleviate this issue. First, in the pre-training stage, we design a new input format to endow PLMs with the ability to deal with masked sentences whose word order is incorrect. Second, during fine-tuning, we insert the special token [MASK] between two consecutive concept words to make the input distribution more similar to that of pre-training. We conduct extensive experiments and provide a thorough analysis to demonstrate the effectiveness of our proposed method. The code is available at https://github.com/LHRYANG/CommonGen.
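For illustration, a minimal Python sketch of the fine-tuning input format described in the abstract is given below. It is not taken from the authors' released code; the function name build_masked_input, the example concept set, and the literal "[MASK]" string (which in practice depends on the PLM's tokenizer) are assumptions made for this sketch.

# Sketch of the fine-tuning input construction: insert a mask token between
# every two consecutive concept words so the input resembles the corrupted,
# masked sentences the PLM was exposed to during pre-training.
def build_masked_input(concepts, mask_token="[MASK]"):
    """Join concept words with a mask token between each consecutive pair."""
    return f" {mask_token} ".join(concepts)

if __name__ == "__main__":
    # Hypothetical concept set for demonstration only.
    print(build_masked_input(["dog", "frisbee", "catch", "throw"]))
    # prints: dog [MASK] frisbee [MASK] catch [MASK] throw

The inserted mask tokens act as slots the PLM has already learned to fill during pre-training, which is how the proposed format narrows the input-distribution gap between the two stages.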
All Author(s) List: Haoran Yang, Yan Wang, Piji Li, Wei Bi, Wai Lam, Chen Xu
Name of Conference: The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Start Date of Conference: 02/05/2023
End Date of Conference: 06/05/2023
Place of Conference: Dubrovnik
Country/Region of Conference: Croatia
Proceedings Title: Findings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Year: 2023
Month: 5
Pages: 376 - 383
Languages: English (United States)

Last updated on 2024-04-16 at 00:35