Topic: DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
Speaker: Prof. Xiaogang Wang (The Chinese University of Hong Kong)
Time: 2015-08-25 10:30 - 12:00
Venue: Room 1365, Science Building No. 1
In this talk, I will introduce a deep-learning-based framework for general object detection on ImageNet. It outperforms well-known object detection works such as GoogLeNet, VGG and RCNN by large margins on the ILSVRC2014 detection test set. The proposed pipeline integrates region proposal, bounding box rejection, a new pre-training strategy based on object-level annotations, feature learning, part-deformation learning, contextual modeling, bounding box regression, and model averaging. A detailed component-wise analysis will be provided through extensive experimental evaluation, giving a global view of the deep learning object detection pipeline. In the proposed deep architecture, a new deformation constrained pooling (def-pooling) layer models the deformation of object parts with a geometric constraint and penalty.
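As a rough illustration of the def-pooling idea, the sketch below takes the maximum over displaced positions of a part-detection score minus a deformation penalty. The abstract does not give the exact formulation, so the quadratic penalty, the fixed weight `c_quad`, the search `radius`, and the function name `def_pool` are all assumptions made for this example; in a trained network the penalty weights would be learned rather than fixed.

```python
def def_pool(score_map, anchor, radius, c_quad):
    """Hypothetical def-pooling sketch (not the speaker's implementation).

    score_map: 2-D list of part-detection scores.
    anchor:    (row, col) where the part is expected to appear.
    radius:    maximum displacement searched in each direction.
    c_quad:    assumed weight of a quadratic deformation penalty.

    Returns (best_value, (dy, dx)): the highest penalized score and the
    displacement from the anchor at which it was found.
    """
    ay, ax = anchor
    height, width = len(score_map), len(score_map[0])
    best_value, best_disp = float("-inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = ay + dy, ax + dx
            if 0 <= y < height and 0 <= x < width:
                # Score at the displaced position minus the cost of moving there.
                value = score_map[y][x] - c_quad * (dx * dx + dy * dy)
                if value > best_value:
                    best_value, best_disp = value, (dy, dx)
    return best_value, best_disp


scores = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 5.0],
]
# A small penalty lets the part shift to the stronger response;
# a large penalty keeps it at the anchor.
print(def_pool(scores, anchor=(1, 1), radius=1, c_quad=1.0))
print(def_pool(scores, anchor=(1, 1), radius=1, c_quad=3.0))
```

With `c_quad` set to 0 the function reduces to ordinary max pooling over the search window, which is one way to see def-pooling as max pooling with a geometric constraint added.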
Through the application of object detection, I would also like to highlight two key points on deep learning. (1) In order to learn feature representations with high discriminative power and good generalization capability, it is better to train deep models on challenging supervision tasks with high-dimensional predictions. Once these features are learned on challenging tasks, they can be applied well to easier tasks. (2) Instead of treating deep learning as a black box, one can build connections between the layers of deep models and the key components of existing vision systems. The research experience from existing vision systems can help us propose new layers and new training strategies.
Xiaogang Wang received his Bachelor's degree in Electrical Engineering and Information Science from the Special Class of Gifted Young at the University of Science and Technology of China in 2001, his M.Phil. degree in Information Engineering from the Chinese University of Hong Kong in 2004, and his PhD degree in Computer Science from the Massachusetts Institute of Technology in 2009. He has been an assistant professor in the Department of Electronic Engineering at the Chinese University of Hong Kong since August 2009. He received the Outstanding Young Researcher in Automatic Human Behaviour Analysis Award in 2011, the Hong Kong RGC Early Career Award in 2012, and the Young Researcher Award of the Chinese University of Hong Kong. He is an associate editor of the Image and Visual Computing Journal. He was an area chair of ICCV 2011, ECCV 2014, ACCV 2014, and ICCV 2015. His research interests include computer vision, deep learning, crowd video surveillance, object detection, and face recognition.