Abstract
This paper introduces the point-axis representation for oriented object detection, illustrated on aerial images in Fig. 1. The representation is flexible and geometrically intuitive, with two key components: points and axes. 1) Points delineate the spatial extent and contours of objects, providing detailed shape descriptions. 2) Axes define the primary directionalities of objects, providing essential orientation cues for precise detection. The point-axis representation decouples location from rotation, addressing the loss-discontinuity issues commonly encountered in traditional bounding-box-based approaches. For effective optimization without introducing additional annotations, we propose a max-projection loss to supervise point-set learning and a cross-axis loss for robust axis representation learning. Leveraging this representation, we further present the Oriented DETR model, which seamlessly integrates the DETR framework for precise point-axis prediction and end-to-end detection. Experimental results demonstrate significant performance improvements on oriented object detection tasks.
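To make the point-axis idea concrete, the sketch below shows one plausible way to decode an oriented box from a predicted point set and an axis direction: project the points onto the axis and its normal and take the extremal projections as the box extent. This is a minimal illustration under our own assumptions; the function name points_axis_to_obb and the decoding details are hypothetical and are not claimed to be the paper's exact procedure.

```python
import numpy as np

def points_axis_to_obb(points: np.ndarray, axis_angle: float) -> np.ndarray:
    """Decode an oriented box from a point set and an axis direction.

    Illustrative sketch only: the points describe the object's extent,
    the axis gives its primary direction, and the box is recovered by
    projecting the points onto the axis and its perpendicular.

    points: (N, 2) array of (x, y) coordinates.
    axis_angle: primary axis direction in radians.
    Returns the four corners of the oriented box, shape (4, 2).
    """
    axis = np.array([np.cos(axis_angle), np.sin(axis_angle)])    # primary axis
    normal = np.array([-np.sin(axis_angle), np.cos(axis_angle)])  # perpendicular axis

    center = points.mean(axis=0)
    rel = points - center

    # Project every point onto the two directions and keep the extremes.
    proj_a = rel @ axis
    proj_n = rel @ normal
    lo_a, hi_a = proj_a.min(), proj_a.max()
    lo_n, hi_n = proj_n.min(), proj_n.max()

    corners = np.stack([
        center + lo_a * axis + lo_n * normal,
        center + hi_a * axis + lo_n * normal,
        center + hi_a * axis + hi_n * normal,
        center + lo_a * axis + hi_n * normal,
    ])
    return corners


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [4.0, 1.0], [5.0, 3.0], [1.0, 2.0]])
    print(points_axis_to_obb(pts, axis_angle=np.deg2rad(20)))
```

Because the extent is read off from projections rather than from a regressed angle-width-height tuple, small changes in the axis direction change the decoded box smoothly, which is consistent with the abstract's claim that decoupling location and rotation avoids loss discontinuities.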
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. U21B2048 and No. 62302382, Shenzhen Key Technical Projects under Grant CJGJZD2022051714160501, the Fundamental Research Funds for the Central Universities No. xxj032023020, and sponsored by the CAAI-MindSpore Open Fund, developed on OpenI Community.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhao, Z., Xue, Q., He, Y., Bai, Y., Wei, X., Gong, Y. (2025). Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15086. Springer, Cham. https://doi.org/10.1007/978-3-031-73390-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73389-5
Online ISBN: 978-3-031-73390-1