Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Refereed conference paper presented and published in conference proceedings



Other information
Abstract: Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles for designing effective ConvNet architectures for action recognition in videos and to learn these models given limited training samples. Our first contribution is the temporal segment network (TSN), a novel framework for video-based action recognition, which is based on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video. The other contribution is our study of a series of good practices for learning ConvNets on video data with the help of the temporal segment network. Our approach obtains state-of-the-art performance on the HMDB51 (69.4%) and UCF101 (94.2%) datasets. We also visualize the learned ConvNet models, which qualitatively demonstrates the effectiveness of the temporal segment network and the proposed good practices.
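The core mechanism named in the abstract (sparse temporal sampling of snippets plus a segmental consensus trained under video-level supervision) can be sketched in a few lines. The following PyTorch code is a minimal illustrative sketch, not the authors' reference implementation: the class name SimpleTSN, the stand-in ConvNet, and the choice of averaging as the consensus function are assumptions made here for clarity.

```python
# Minimal sketch of the TSN idea: sample one snippet per segment, score each
# snippet with a shared ConvNet, and average the snippet scores into a
# video-level prediction that receives the supervision signal.
# Illustrative only; names and the tiny ConvNet are hypothetical.
import torch
import torch.nn as nn

class SimpleTSN(nn.Module):
    def __init__(self, snippet_convnet: nn.Module, num_segments: int = 3):
        super().__init__()
        self.snippet_convnet = snippet_convnet  # weights shared across all segments
        self.num_segments = num_segments

    def forward(self, snippets: torch.Tensor) -> torch.Tensor:
        # snippets: (batch, num_segments, C, H, W); one snippet is sampled
        # from each of the K equal-duration segments of the video.
        b, k, c, h, w = snippets.shape
        scores = self.snippet_convnet(snippets.view(b * k, c, h, w))
        scores = scores.view(b, k, -1)
        # Segmental consensus: here, average the snippet-level class scores.
        return scores.mean(dim=1)

# Usage sketch with a tiny stand-in ConvNet and a video-level loss.
convnet = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(8, 101),  # 101 = number of UCF101 classes
)
model = SimpleTSN(convnet, num_segments=3)
video_batch = torch.randn(2, 3, 3, 224, 224)  # (batch, segments, C, H, W)
labels = torch.tensor([0, 5])
loss = nn.functional.cross_entropy(model(video_batch), labels)
loss.backward()  # gradients flow through the consensus to the shared ConvNet
```

Because the consensus is differentiable, the video-level loss updates the shared ConvNet from all segments at once, which is what lets TSN train on whole videos while processing only a sparse set of snippets.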
Authors: Wang LM, Xiong YJ, Wang Z, Qiao Y, Lin DH, Tang XO, Van Gool L
Conference name: 14th European Conference on Computer Vision (ECCV)
Conference start date: 08.10.2016
Conference end date: 16.10.2016
Conference location: Amsterdam
Conference country/region: Netherlands
Year of publication: 2016
Volume: 9912
Publisher: SPRINGER INT PUBLISHING AG
Pages: 20-36
ISBN: 978-3-319-46483-1
eISBN: 978-3-319-46484-8
ISSN: 0302-9743
Language: British English
Keywords: Action recognition; ConvNets; Good practices; Temporal segment networks
Web of Science subject categories: Computer Science; Computer Science, Artificial Intelligence; Computer Science, Theory & Methods; Imaging Science & Photographic Technology

Last updated: 2020-07-08 at 02:15