Temporal Action Detection with Structured Segment Networks
Refereed conference paper presented and published in conference proceedings
Formally accepted for publication


Full text

Other information
Abstract: Detecting activities in untrimmed videos is an important yet challenging task. In this paper, we tackle the difficulty of effectively locating the start and end of a long, complex action, a problem that existing methods often struggle with. Our key contribution is the structured segment network, a novel framework for temporal action detection that models the temporal structure of each activity instance via a structured temporal pyramid. On top of the pyramid, we further introduce a decomposed discriminative model comprising two classifiers, one for classifying activities and one for determining completeness. This allows the framework to effectively distinguish positive proposals from background or incomplete ones, leading to both accurate recognition and localization. These components are integrated into a unified network that can be efficiently trained in an end-to-end fashion. We also propose a simple yet effective temporal action proposal scheme that generates proposals of considerably higher quality. On two challenging benchmarks, THUMOS14 and ActivityNet, our method outperforms existing state-of-the-art methods by over 10% absolute average mAP, demonstrating superior accuracy and strong adaptivity in handling activities with various temporal structures.
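To make the abstract's description concrete, the following is a minimal, illustrative sketch (in PyTorch) of a structured-temporal-pyramid head with decomposed activity and completeness classifiers. The module name, pyramid configuration, and feature dimensions are assumptions for illustration only, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class StructuredSegmentHead(nn.Module):
    """Sketch: pool snippet features of a proposal's starting, course, and
    ending stages into a structured temporal pyramid, then score the pooled
    representation with two heads: an activity classifier (with a background
    class) and a class-wise completeness classifier. Hyper-parameters here
    are illustrative assumptions."""

    def __init__(self, feat_dim=1024, num_classes=20, course_levels=(1, 2)):
        super().__init__()
        self.course_levels = course_levels
        # One pooled part each for starting/ending stages, plus
        # sum(course_levels) parts for the course stage (e.g. 1 + 2 = 3).
        num_parts = 2 + sum(course_levels)
        self.activity_fc = nn.Linear(feat_dim * num_parts, num_classes + 1)
        self.completeness_fc = nn.Linear(feat_dim * num_parts, num_classes)

    @staticmethod
    def _pool(feats, parts):
        # feats: (T, D) snippet features for one stage; adaptively
        # average-pool along time into `parts` bins and flatten.
        pooled = F.adaptive_avg_pool1d(feats.t().unsqueeze(0), parts)  # (1, D, parts)
        return pooled.squeeze(0).t().reshape(-1)  # (parts * D,)

    def forward(self, start_feats, course_feats, end_feats):
        # Build the structured temporal pyramid over the augmented proposal.
        pooled = [self._pool(start_feats, 1)]
        for level in self.course_levels:
            pooled.append(self._pool(course_feats, level))
        pooled.append(self._pool(end_feats, 1))
        rep = torch.cat(pooled, dim=0)
        # Decomposed scores: what the activity is, and whether the proposal
        # covers it completely.
        return self.activity_fc(rep), self.completeness_fc(rep)


# Toy usage with random (T x D) snippet features for each stage.
head = StructuredSegmentHead(feat_dim=1024, num_classes=20)
act_scores, comp_scores = head(torch.randn(4, 1024), torch.randn(8, 1024), torch.randn(4, 1024))
print(act_scores.shape, comp_scores.shape)  # torch.Size([21]) torch.Size([20])

In this decomposition, a proposal is kept as a detection only if it both scores highly for some activity class and is judged complete for that class, which is how the framework separates true positives from background and incomplete proposals.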
Date accepted by publisher: 19.07.2017
Authors: Yue Zhao, Yuanjun Xiong, Limin Wang, Zhirong Wu, Xiaoou Tang, Dahua Lin
Conference name: International Conference on Computer Vision
Conference start date: 22.10.2017
Conference end date: 29.10.2017
Conference location: Venice
Conference country/region: Italy
Year of publication: 2017
Month: 10
Language: American English

Last updated on 2018-01-20 at 19:01