Bridging Music and Image via Cross-Modal Ranking Analysis
Publication in refereed journal



Other information
Abstract: Human perceptions of music and image are closely related, since both can evoke similar sensations such as emotion, motion, and power. This paper explores whether and how music and images can be automatically matched by machines. The main contributions are threefold. First, we construct a benchmark dataset of more than 45 000 music-image pairs. Human labelers are recruited to annotate whether these pairs are well matched, and the results show that they generally agree with each other on the matching degree of music-image pairs. Second, we investigate suitable semantic representations of music and image for this cross-modal matching task. In particular, we adopt lyrics as an intermediate medium to connect music and image, and design a set of lyric-based attributes for image representation. Third, we propose cross-modal ranking analysis (CMRA) to learn the semantic similarity between music and image from ranking labels. CMRA aims to find the optimal embedding spaces for both music and image in the sense of maximizing the ordinal margin between music-image pairs. The proposed method is able to learn the non-linear relationship between music and image, and to integrate heterogeneous ranking data from different modalities into a unified space. Experimental results demonstrate that the proposed method outperforms state-of-the-art cross-modal methods in the music-image matching task and achieves a consistency rate of 91.5% with human labelers.
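Illustrative note: the idea of separating better-matched from worse-matched music-image pairs by a margin can be sketched as below. This is not the CMRA formulation from the paper, which is non-linear and is not specified in this record. It is a generic pairwise hinge ranking loss over linear projections, and every name in it (`project`, `pair_score`, `ordinal_margin_loss`, `W_m`, `W_i`, `margin`) is a hypothetical choice made for illustration.

```python
import math

def project(x, W):
    """Linearly project feature vector x (length d) with matrix W (d rows, k columns)."""
    return [sum(x[r] * W[r][c] for r in range(len(x))) for c in range(len(W[0]))]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def pair_score(music, image, W_m, W_i):
    """Similarity of one music-image pair after mapping both into a shared space.
    W_m and W_i are assumed (hypothetical) projection matrices, one per modality."""
    return cosine(project(music, W_m), project(image, W_i))

def ordinal_margin_loss(scores, ranks, margin=0.1):
    """Mean hinge loss over ordered pairs (a, b) with ranks[a] > ranks[b].
    A pair contributes loss unless scores[a] exceeds scores[b] by at least `margin`."""
    losses = [max(0.0, margin - (scores[a] - scores[b]))
              for a in range(len(scores))
              for b in range(len(scores))
              if ranks[a] > ranks[b]]
    return sum(losses) / len(losses) if losses else 0.0
```

In such a sketch, `ranks` would hold the human matching labels (higher meaning better matched) and `scores` the values of `pair_score` for the same pairs. Training would then adjust `W_m` and `W_i` to reduce the loss, a step omitted here.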
Authors: Wu XX, Qiao Y, Wang XG, Tang XO
Journal: IEEE Transactions on Multimedia
Publication year: 2016
Month: 7
Volume: 18
Issue: 7
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Pages: 1305-1318
ISSN: 1520-9210
eISSN: 1941-0077
Language: British English
Keywords: Cross-modal; feature embedding; lyric-based image attribute; music-image matching; ordinal regression
Web of Science subject categories: Computer Science; Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications

Last updated: 2020-02-08 at 02:12