Abstract
We describe an implementation of the Deformable Parts Model [1] that operates within a user-defined time frame. Our implementation uses a variety of mechanisms to trade off speed against accuracy. It can detect all 20 PASCAL 2007 objects simultaneously at 30Hz with an mAP of 0.26. At 15Hz, its mAP is 0.30; at 100Hz, its mAP is 0.16. By comparison, the reference implementation of [1] runs at 0.07Hz with an mAP of 0.33, and a fast GPU implementation runs at 1Hz. Our technique is over an order of magnitude faster than the previous fastest DPM implementation. It exploits a series of important speedup mechanisms. We use the cascade framework of [3] and the vector quantization technique of [2]. To speed up feature computation, we compute HOG features at a few scales and apply many interpolated templates. A hierarchical vector quantization method compresses the HOG features for fast template evaluation. An object proposal step uses hash-table methods to identify locations where evaluating templates would be most useful; these locations are inserted into a priority queue and processed in a detection phase. Both the proposal and detection phases have an any-time property. Our method applies to legacy templates, and no retraining is required.
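The any-time detection phase described above can be illustrated with a minimal sketch. It assumes proposals arrive as (priority, location) pairs and that a caller-supplied scoring function stands in for template evaluation. The names `anytime_detect`, `score_fn`, `budget_s`, and `threshold` are hypothetical and chosen for illustration. This is not the authors' implementation, and the proposal phase, which the abstract says is also any-time, is not shown.

```python
import heapq
import time
from typing import Callable, Iterable, List, Tuple

Location = Tuple[int, int]  # hypothetical (x, y) cell on the feature grid


def anytime_detect(
    proposals: Iterable[Tuple[float, Location]],
    score_fn: Callable[[Location], float],
    budget_s: float,
    threshold: float = 0.0,
) -> List[Tuple[float, Location]]:
    """Evaluate proposed locations best-first until the time budget expires."""
    # heapq is a min-heap, so priorities are negated to pop the
    # most promising location first.
    queue = [(-priority, loc) for priority, loc in proposals]
    heapq.heapify(queue)

    deadline = time.monotonic() + budget_s
    detections = []
    while queue and time.monotonic() < deadline:
        _, loc = heapq.heappop(queue)
        score = score_fn(loc)  # stands in for the expensive template evaluation
        if score >= threshold:
            detections.append((score, loc))

    # Whatever was found before the deadline is returned, strongest first.
    detections.sort(reverse=True)
    return detections


def toy_score(loc: Location) -> float:
    """Toy scoring function used only to exercise the loop."""
    return loc[0] - 3.0


if __name__ == "__main__":
    toy_proposals = [(0.9, (4, 2)), (0.1, (0, 0)), (0.5, (7, 3))]
    print(anytime_detect(toy_proposals, toy_score, budget_s=1.0))
```

Because locations are processed in priority order, stopping early drops only the least promising candidates. That is one plausible reading of how a larger time budget buys a higher mAP, though the sketch makes no claim about the figures reported above.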
References
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object Detection with Discriminatively Trained Part Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (2010)
Sadeghi, M.A., Forsyth, D.: Fast Template Evaluation with Vector Quantization. In: Advances in Neural Information Processing Systems, NIPS (2013)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D.: Cascade Object Detection with Deformable Part Models. In: IEEE Conference on Computer Vision and Pattern Recognition (2010)
Dollár, P., Appel, R., Belongie, S., Perona, P.: Fast Feature Pyramids for Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (2014)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2005)
Endres, I., Hoiem, D.: Category Independent Object Proposals. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 575–588. Springer, Heidelberg (2010)
Cheng, M., Zhang, Z., Lin, W., Torr, P.: Bing: Binarized Normed Gradients for Objectness Estimation at 300fps. In: IEEE Conference on Computer Vision and Pattern Recognition (2014)
Nister, D., Stewenius, H.: Scalable Recognition with a Vocabulary Tree. In: IEEE Conference on Computer Vision and Pattern Recognition (2006)
Pirsiavash, H., Ramanan, D.: Steerable part models. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Dollár, P., Appel, R., Kienzle, W.: Crosstalk Cascades for Frame-Rate Pedestrian Detection. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 645–659. Springer, Heidelberg (2012)
Dubout, C., Fleuret, F.: Exact Acceleration of Linear Object Detectors. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 301–311. Springer, Heidelberg (2012)
Pedersoli, M., Gonzàlez, J., Bagdanov, A.D., Villanueva, J.J.: Recursive Coarse-to-Fine Localization for Fast Object Detection. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part VI. LNCS, vol. 6316, pp. 280–293. Springer, Heidelberg (2010)
Kokkinos, I.: Bounding Part Scores for Rapid Detection with Deformable Part Models. In: Fusiello, A., Murino, V., Cucchiara, R. (eds.) ECCV 2012 Ws/Demos, Part III. LNCS, vol. 7585, pp. 41–50. Springer, Heidelberg (2012)
Dean, T., Ruzon, M., Segal, M., Shlens, J., Vijayanarasimhan, S., Yagnik, J.: Fast, Accurate Detection of 100,000 Object Classes on a Single Machine. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)
Indyk, P., Motwani, R.: Approximate nearest neighbours: Towards removing the curse of dimensionality. In: ACM Symposium on Theory of Computing (1998)
Vedaldi, A., Zisserman, A.: Sparse Kernel Approximations for Efficient Classification and Detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Jégou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbour search. IEEE Transactions on Pattern Analysis and Machine Intelligence (2010)
Gray, R.M., Neuhoff, D.L.: Quantization. IEEE Transactions on Information Theory (1998)
Ren, X., Ramanan, D.: Histograms of Sparse Codes for Object Detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)
Felzenszwalb, P., Girshick, R., McAllester, D.: Discriminatively Trained Deformable Part Models, Release 4, http://people.cs.uchicago.edu/~pff/latent-release4/
Girshick, R., Felzenszwalb, P., McAllester, D.: Discriminatively Trained Deformable Part Models, Release 5, http://people.cs.uchicago.edu/~rbg/latent-release5/
Song, H.O., Zickler, S., Althoff, T., Girshick, R., Fritz, M., Geyer, C., Felzenszwalb, P., Darrell, T.: Sparselet Models for Efficient Multiclass Object Detection. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 802–815. Springer, Heidelberg (2012)
Yan, J., Lei, Z., Wen, L., Li, S.Z.: The Fastest Deformable Part Model for Object Detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2014)
Bengio, S., Weston, J., Grangier, D.: Label embedding trees for large multi-class tasks. In: Advances in Neural Information Processing Systems (2010)
Benenson, R., Mathias, M., Timofte, R., Van Gool, L.: Pedestrian detection at 100 frames per second. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Sadeghi, M.A., Forsyth, D. (2014). 30Hz Object Detection with DPM V5. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8689. Springer, Cham. https://doi.org/10.1007/978-3-319-10590-1_5
DOI: https://doi.org/10.1007/978-3-319-10590-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10589-5
Online ISBN: 978-3-319-10590-1
eBook Packages: Computer Science (R0)