Context matters: Refining object detection in video with recurrent neural networks
Research output: Contribution to conference › Paper › Research › peer-review
Given the vast amounts of video available online and recent breakthroughs in object detection with static images, object detection in video offers a promising new frontier. However, motion blur and compression artifacts cause substantial frame-level variability, even in videos that appear smooth to the eye. Additionally, frames in video datasets are typically sparsely annotated. We present a new framework for improving object detection in videos that captures temporal context and encourages consistency of predictions. First, we train a pseudo-labeler, i.e., a domain-adapted convolutional neural network for object detection, on the subset of labeled frames. We then apply it to provisionally label all frames, including those without labels. Finally, we train a recurrent neural network that takes as input sequences of pseudo-labeled frames and optimizes an objective that encourages both accuracy on the target frame and consistency across consecutive frames. The approach incorporates strong supervision of target frames, weak supervision on context frames, and regularization via a smoothness penalty. Our approach achieves mean Average Precision (mAP) of 68.73, an improvement of 7.1 over the strongest image-based baselines on the Youtube-Video Objects dataset. Our experiments demonstrate that neighboring frames can provide valuable information, even when they are unlabeled.
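The three-part objective described in the abstract (strong supervision on the target frame, weak supervision on context frames, and a smoothness penalty) might be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual loss: the abstract does not give the formula, so the squared-error terms, the choice of the last frame as the target, and the weights `lam_weak` and `lam_smooth` are all hypothetical.

```python
def squared_distance(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sequence_loss(preds, pseudo_labels, target, lam_weak=0.5, lam_smooth=0.1):
    """Illustrative three-term objective over a sequence of T frames.

    preds:         list of T prediction vectors, one per frame
    pseudo_labels: list of T provisional label vectors from the pseudo-labeler
    target:        ground-truth vector for the target frame
                   (assumed here to be the last frame of the sequence)
    lam_weak, lam_smooth: hypothetical weights, not taken from the paper
    """
    # Strong supervision: error against the true label on the target frame.
    strong = squared_distance(preds[-1], target)

    # Weak supervision: error against pseudo-labels on the context frames.
    weak = sum(
        squared_distance(p, q) for p, q in zip(preds[:-1], pseudo_labels[:-1])
    )

    # Smoothness penalty: discourage changes between consecutive predictions.
    smooth = sum(squared_distance(a, b) for a, b in zip(preds[1:], preds[:-1]))

    return strong + lam_weak * weak + lam_smooth * smooth
```

In this sketch the pseudo-label of the target frame is ignored because the true label is available there; raising `lam_smooth` trades per-frame accuracy for temporal consistency.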
Original language | English |
---|---|
Publication date | 2016 |
Number of pages | 12 |
Publication status | Published - 2016 |
Externally published | Yes |
Event | 27th British Machine Vision Conference, BMVC 2016, York, United Kingdom (19 Sep 2016 → 22 Sep 2016) |
Conference
Conference | 27th British Machine Vision Conference, BMVC 2016 |
---|---|
Country | United Kingdom |
City | York |
Period | 19/09/2016 → 22/09/2016 |
Sponsor | ARM, Disney Research, et al., HP, Ocado Technology, OSRAM |
Bibliographical note
Publisher Copyright:
© 2016. The copyright of this document resides with its authors.
ID: 301827993