short-paper

UnitBox: An Advanced Object Detection Network

Authors:

Zhangyang Wang,

Thomas Huang

MM '16: Proceedings of the 24th ACM international conference on Multimedia

Pages 516 - 520

https://doi.org/10.1145/2964284.2967274

Published: 01 October 2016

Abstract

In current object detection systems, deep convolutional neural networks (CNNs) are used to predict the bounding boxes of object candidates, and have gained performance advantages over traditional region proposal methods. However, existing deep CNN methods assume the object bounds to be four independent variables, which can be regressed separately by the l₂ loss. Such an oversimplified assumption is contrary to the well-received observation that those variables are correlated, resulting in less accurate localization. To address the issue, we first introduce a novel Intersection over Union (IoU) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. By taking advantage of the IoU loss and deep fully convolutional networks, we introduce UnitBox, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges fast. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.
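
The contrast between the l₂ loss and an IoU-based loss can be illustrated with a minimal sketch. The abstract does not state the formula, so the following are assumptions made for illustration: each box is parameterized as four non-negative distances (top, bottom, left, right) from a shared pixel to the box bounds, and the loss is taken as the negative logarithm of the IoU. The function names `iou_loss` and `l2_loss` and the `eps` parameter are hypothetical and not taken from the paper.

```python
import math

def iou_loss(pred, target, eps=1e-9):
    """Assumed form of an IoU loss: -ln(IoU).

    pred and target are (top, bottom, left, right) distances from the same
    pixel to the four box bounds; all values are assumed non-negative.
    """
    pt, pb, pl, pr = pred
    tt, tb, tl, tr = target
    pred_area = (pt + pb) * (pl + pr)
    target_area = (tt + tb) * (tl + tr)
    # Both boxes contain the shared pixel, so the overlap along each axis
    # is the sum of the smaller distance on each side.
    inter_h = min(pt, tt) + min(pb, tb)
    inter_w = min(pl, tl) + min(pr, tr)
    inter = inter_h * inter_w
    union = pred_area + target_area - inter
    iou = inter / (union + eps)
    return -math.log(iou + eps)  # eps guards against log(0)

def l2_loss(pred, target):
    """Sum of squared errors, treating the four bounds independently."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

# A one-unit error on every bound, applied to a small box and a large box.
small_gt, small_pred = (1, 1, 1, 1), (2, 2, 2, 2)
large_gt, large_pred = (10, 10, 10, 10), (11, 11, 11, 11)

small_l2, large_l2 = l2_loss(small_pred, small_gt), l2_loss(large_pred, large_gt)
small_iou, large_iou = iou_loss(small_pred, small_gt), iou_loss(large_pred, large_gt)
```

In this sketch the l₂ loss assigns the same penalty to both cases, while the IoU-based loss penalizes the small box more, because the same absolute error removes a larger fraction of its overlap. This is one way to read the abstract's claim of robustness to objects of varied scales.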

Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Computer vision problems
        Object detection

Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia

October 2016

1542 pages

ISBN: 9781450336031

DOI: 10.1145/2964284

General Chairs: Alan Hanjalic (Delft University of Technology), Cees Snoek (Qualcomm Research Netherlands / University of Amsterdam), Marcel Worring (University of Amsterdam)

Moderator: Dick Bulterman (CWI / VU University Amsterdam)

Program Chairs: Benoit Huet (EURECOM), Aisling Kelliher (Virginia Tech), Yiannis Kompatsiaris (CERTH-ITI), Jin Li (Microsoft)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '16

MM '16: ACM Multimedia Conference

October 15 - 19, 2016

Amsterdam, The Netherlands

Acceptance Rates

MM '16 Paper Acceptance Rate: 52 of 237 submissions, 22%

Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%
