RFLA: Gaussian Receptive Field Based Label Assignment for Tiny Object Detection

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13669)

Abstract

Detecting tiny objects is one of the main obstacles hindering the development of object detection. The performance of generic object detectors tends to deteriorate drastically on tiny object detection tasks. In this paper, we point out that both the box prior in anchor-based detectors and the point prior in anchor-free detectors are sub-optimal for tiny objects. Our key observation is that the current anchor-based and anchor-free label assignment paradigms produce many outlier tiny-sized ground-truth samples, which leads detectors to place less focus on tiny objects. To this end, we propose a Gaussian Receptive Field based Label Assignment (RFLA) strategy for tiny object detection. Specifically, RFLA first utilizes the prior information that the feature receptive field follows a Gaussian distribution. Then, instead of assigning samples with an IoU-based or center sampling strategy, a new Receptive Field Distance (RFD) is proposed to directly measure the similarity between the Gaussian receptive field and the ground truth. Considering that the IoU-threshold-based and center sampling strategies are skewed toward large objects, we further design a Hierarchical Label Assignment (HLA) module based on RFD to achieve balanced learning for tiny objects. Extensive experiments on four datasets demonstrate the effectiveness of the proposed methods. In particular, our approach outperforms state-of-the-art competitors by 4.0 AP points on the AI-TOD dataset. Code is available at https://github.com/Chasel-Tsui/mmdet-rfla.
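The idea of scoring a feature point against a ground-truth box through Gaussians, rather than through IoU, can be illustrated with a minimal sketch. It is not the released implementation. The abstract does not give the exact form of RFD, so the following choices are assumptions made only for illustration: a box is modeled as an axis-aligned Gaussian with standard deviations equal to its half-width and half-height, a receptive field as an isotropic Gaussian with a hypothetical `radius`, the distance is the squared 2-Wasserstein distance between the two Gaussians, and the score is `1 / (1 + distance)`. All function names are hypothetical, and the top-k ranking stands in for only the ranking idea behind HLA.

```python
def box_to_gaussian(cx, cy, w, h):
    """Assumed modeling: mean at the box center, std = half-width / half-height."""
    return (cx, cy), (w / 2.0, h / 2.0)


def receptive_field_to_gaussian(px, py, radius):
    """Assumed modeling: isotropic Gaussian centered on the feature point."""
    return (px, py), (radius, radius)


def wasserstein2_squared(g1, g2):
    """Squared 2-Wasserstein distance between two axis-aligned 2D Gaussians.

    For diagonal covariances this is the squared distance between the means
    plus the squared distance between the standard deviations.
    """
    (m1, s1), (m2, s2) = g1, g2
    mean_term = (m1[0] - m2[0]) ** 2 + (m1[1] - m2[1]) ** 2
    std_term = (s1[0] - s2[0]) ** 2 + (s1[1] - s2[1]) ** 2
    return mean_term + std_term


def similarity(g1, g2):
    """Map the distance into (0, 1]; this normalization is an assumption."""
    return 1.0 / (1.0 + wasserstein2_squared(g1, g2))


def assign_top_k(gt_box, points, radius, k=3):
    """Rank feature points by similarity to one ground-truth box, keep the best k.

    Ranking instead of thresholding means every ground-truth box receives
    k positives, however small the box is.
    """
    gt = box_to_gaussian(*gt_box)
    scored = [
        (similarity(receptive_field_to_gaussian(px, py, radius), gt), (px, py))
        for px, py in points
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [point for _, point in scored[:k]]


if __name__ == "__main__":
    tiny_gt = (10.0, 10.0, 4.0, 4.0)  # (cx, cy, w, h) of a tiny object
    feature_points = [(0, 0), (8, 8), (10, 10), (12, 10), (30, 30)]
    print(assign_top_k(tiny_gt, feature_points, radius=2.0, k=2))
```

Unlike IoU, this score stays positive and varies smoothly even when a point's region and the box do not overlap at all, which is what makes ranking meaningful for very small boxes in this sketch.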

Acknowledgement

This work was partly supported by the Fundamental Research Funds for the Central Universities under Grant 2042022kf1010, and the National Natural Science Foundation of China under Grants 61771351 and 61871297. The numerical calculations were conducted on the supercomputing system in the Supercomputing Center, Wuhan University.

Author information

Corresponding author

Correspondence to Wen Yang.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 17291 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, C., Wang, J., Yang, W., Yu, H., Yu, L., Xia, GS. (2022). RFLA: Gaussian Receptive Field Based Label Assignment for Tiny Object Detection. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13669. Springer, Cham. https://doi.org/10.1007/978-3-031-20077-9_31

  • DOI: https://doi.org/10.1007/978-3-031-20077-9_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20076-2

  • Online ISBN: 978-3-031-20077-9

  • eBook Packages: Computer Science, Computer Science (R0)
