A Neural Compositional Paradigm for Image Captioning
Refereed conference paper presented and published in conference proceedings



Other information
Abstract: Mainstream captioning models often follow a sequential structure to generate captions, leading to issues such as the introduction of irrelevant semantics, lack of diversity in the generated captions, and inadequate generalization performance. In this paper, we present an alternative paradigm for image captioning, which factorizes the captioning procedure into two stages: (1) extracting an explicit semantic representation from the given image; and (2) constructing the caption based on a recursive compositional procedure in a bottom-up manner. Compared to conventional approaches, our paradigm better preserves the semantic content through an explicit factorization of semantics and syntax. With the compositional generation procedure, caption construction follows a recursive structure, which naturally fits the properties of human language. Moreover, the proposed compositional procedure requires less data to train, generalizes better, and yields more diverse captions.
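
The two-stage factorization described in the abstract can be sketched in a few lines of Python. This is only an illustrative toy under assumed names: the phrase extractor, the connecting operator, and the recursion below are hypothetical stand-ins for the learned modules in the paper, not the authors' implementation.

```python
# Hypothetical sketch of the two-stage compositional captioning paradigm.
# All module names and outputs here are illustrative assumptions.

def extract_semantics(image):
    """Stage 1: produce an explicit semantic representation (noun phrases).

    A real system would run a visual encoder/detector here; this stub
    returns fixed phrases purely for illustration.
    """
    return ["a brown dog", "a red frisbee", "the park"]

def connect(left, right):
    """Toy connecting operator joining two partial phrases.

    The paper's procedure learns how to merge phrases; a generic
    connective is used here as a placeholder.
    """
    return f"{left} with {right}"

def compose_caption(phrases):
    """Stage 2: build the caption bottom-up by recursively merging phrases."""
    if len(phrases) == 1:
        return phrases[0]
    merged = connect(phrases[0], phrases[1])
    return compose_caption([merged] + phrases[2:])

if __name__ == "__main__":
    image = None                              # stand-in for an actual image
    noun_phrases = extract_semantics(image)   # stage 1: explicit semantics
    caption = compose_caption(noun_phrases)   # stage 2: recursive composition
    print(caption)
```

The point of the sketch is the separation of concerns: semantics come from stage 1 as an explicit set of phrases, while syntax emerges from the bottom-up, recursive merging in stage 2.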
Date accepted by publisher: 05.09.2018
Authors: Bo Dai, Sanja Fidler, Dahua Lin
Conference name: 32nd Conference on Neural Information Processing Systems (NIPS)
Conference start date: 02.12.2018
Conference end date: 08.12.2018
Conference location: Montreal
Conference country/region: Canada
Proceedings title: Advances in Neural Information Processing Systems
Year of publication: 2018
Month: 12
ISSN: 1049-5258
Language: American English
