Learning from Multiple Sources for Video Summarisation
Publication in refereed journal

Researchers at The Chinese University of Hong Kong


Other information
Abstract: Many visual surveillance tasks, e.g. video summarisation, are conventionally accomplished by analysing imagery-based features. Relying solely on visual cues for public surveillance video understanding is unreliable, since visual observations obtained from public-space CCTV video data are often not sufficiently trustworthy and events of interest can be subtle. We believe that non-visual data sources such as weather reports and traffic sensory signals can be exploited to complement visual data for video content analysis and summarisation. In this paper, we present a novel unsupervised framework that learns jointly from both visual and independently-drawn non-visual data sources to discover meaningful latent structure in surveillance video data. In particular, we investigate ways to cope with discrepant dimensions and representations when associating these heterogeneous data sources, and derive an effective mechanism to tolerate missing and incomplete data from different sources. We show that the proposed multi-source learning framework not only achieves better video content clustering than state-of-the-art methods, but is also capable of accurately inferring missing non-visual semantics from previously-unseen videos. In addition, a comprehensive user study is conducted to validate the quality of video summarisation generated using the proposed multi-source model.
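To make the abstract's idea concrete, here is a minimal illustrative sketch, not the authors' model: it mimics clustering videos from heterogeneous sources (visual features and lower-dimensional non-visual signals, with a missing entry) by normalising each source separately before a joint clustering step. All data, dimensions, and the simple k-means stand-in are assumptions for illustration only.

```python
# Hedged sketch: per-source normalisation + joint clustering of
# heterogeneous video descriptors. NOT the method of Zhu, Loy & Gong.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 8 video clips described by two sources of
# different dimensionality, as the abstract discusses.
visual = rng.normal(size=(8, 5))       # e.g. imagery-based features
nonvisual = rng.normal(size=(8, 2))    # e.g. weather / traffic signals
nonvisual[3, 1] = np.nan               # one missing non-visual reading

def normalise(x):
    """Z-score each column, imputing missing values with the column mean."""
    col_mean = np.nanmean(x, axis=0)
    x = np.where(np.isnan(x), col_mean, x)
    std = x.std(axis=0)
    std[std == 0] = 1.0                # guard against constant columns
    return (x - x.mean(axis=0)) / std

# Joint representation: concatenate per-source normalised features so that
# no single source dominates purely through scale or dimensionality.
joint = np.hstack([normalise(visual), normalise(nonvisual)])

# Plain k-means (2 clusters) as a stand-in for the paper's latent-structure
# discovery; centres initialised from the first two samples.
centres = joint[:2].copy()
for _ in range(20):
    dists = ((joint[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    labels = dists.argmin(axis=1)
    for k in range(2):
        if (labels == k).any():
            centres[k] = joint[labels == k].mean(axis=0)

print(labels)  # one cluster assignment per clip
```

The per-source normalisation step is one simple way to handle the "discrepant dimension and representation" issue the abstract raises; the column-mean imputation stands in for tolerance of missing non-visual data.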
Authors: Zhu XT, Loy CC, Gong SG
Journal: International Journal of Computer Vision
Detailed description: DOI: 10.1007/s11263-015-0864-3
Year of publication: 2016
Month: 5
Day: 1
Volume: 117
Issue: 3
Publisher: SPRINGER
Pages: 247 - 268
ISSN: 0920-5691
eISSN: 1573-1405
Language: British English
Keywords: Event recognition; Heterogeneous data; Multi-source data; Video summarisation; Visual surveillance
Web of Science subject categories: Computer Science; Computer Science, Artificial Intelligence

Last updated on 2020-10-14 at 00:35