Factors in finetuning deep model for object detection with long-tail distribution
Refereed conference paper presented and published in conference proceedings

Abstract: Finetuning from a pretrained deep model is found to yield state-of-the-art performance for many vision tasks. This paper investigates factors that influence finetuning performance for object detection, where the number of samples per class follows a long-tailed distribution. Our analysis and empirical results show that classes with more samples have a higher impact on feature learning, and that it is better to make the sample number more uniform across classes. Generic object detection can be considered as multiple equally important tasks, with the detection of each class being one task. These classes/tasks have their individuality in discriminative visual appearance representation. Taking this individuality into account, we cluster objects into visually similar class groups and learn deep representations for these groups separately. A hierarchical feature learning scheme is proposed, in which the knowledge from a group with a large number of classes is transferred for learning features in its subgroups. Finetuned on the GoogLeNet model, our approach achieves a 4.7% absolute mAP improvement on the ImageNet object detection dataset without increasing much computational cost at the testing stage. Code is available at www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html.
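The abstract describes clustering object classes into visually similar groups before learning group-specific representations. As a rough illustration only (not the authors' method, which groups classes using learned deep features), the sketch below assumes each class is summarized by a mean feature vector and greedily merges classes into groups by cosine similarity to a running group representative; the function and threshold are hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def group_classes(class_features, threshold=0.7):
    """Greedy grouping sketch: each class joins the existing group whose
    representative (the mean of its members' features) is most similar,
    provided the similarity exceeds `threshold`; otherwise it starts a
    new group. `class_features` maps class name -> feature vector."""
    groups = []  # each entry is a list of class names
    reps = []    # running mean feature vector per group
    for name, feat in class_features.items():
        best, best_sim = None, threshold
        for i, rep in enumerate(reps):
            sim = cosine(feat, rep)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            groups.append([name])
            reps.append(list(feat))
        else:
            groups[best].append(name)
            n = len(groups[best])
            # incrementally update the group's mean representative
            reps[best] = [(r * (n - 1) + f) / n
                          for r, f in zip(reps[best], feat)]
    return groups

# Toy example: "cat" and "dog" features point the same way, "car" does not,
# so the first two share a group while "car" forms its own.
toy = {"cat": (1.0, 0.9), "dog": (0.9, 1.0), "car": (-1.0, 0.1)}
print(group_classes(toy))  # [['cat', 'dog'], ['car']]
```

In the paper the grouping is hierarchical: a model trained on a large parent group is then finetuned on each subgroup, transferring the parent's features; the greedy single-level pass above is only meant to convey the grouping idea.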
All Author(s) List: Ouyang W., Wang X., Zhang C., Yang X.
Name of Conference: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Start Date of Conference: 26/06/2016
End Date of Conference: 01/07/2016
Place of Conference: Las Vegas
Country/Region of Conference: United States of America
Detailed description: organized by IEEE
Volume Number: 2016-January
Pages: 864 - 873
Languages: English-United Kingdom

Last updated on 2021-12-09 at 00:00