DOI: 10.1145/3607834.3616571
short-paper

A Cross-View Matching Method Based on Dense Partition Strategy for UAV Geolocalization

Published: 29 October 2023

Abstract

This paper reports our solution to the ACM Multimedia 2023 cross-view geo-localization challenge [29], which targets real-world geo-localization against an extremely large gallery of satellite-view distractors. Our solution builds on SwinV2 [13] and LPN [22]. Concretely, we adopt SwinV2-B [13], a current mainstream transformer-based feature extractor, as the backbone of our model. Inspired by the feature partition strategy of LPN [22], we design a more efficient partition strategy, named the dense partition strategy, which segments and combines features to alleviate feature discontinuity at the boundaries between partitions. Our solution ranked eighth in the geo-localization track on the official leaderboard.
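
The abstract does not spell out how features are segmented and combined, so the following is only a rough sketch of one possible reading, not the authors' implementation. It assumes an LPN-style split of a backbone feature map into concentric square rings, and then, as a hypothetical "dense" variant, additionally pools the union of every pair of adjacent rings so that features near a ring boundary fall inside at least one shared partition. The function names (`ring_index`, `ring_masks`, `dense_masks`, `pooled_parts`), the ring count, and the random feature map standing in for SwinV2-B output are all illustrative assumptions.

```python
import numpy as np


def ring_index(height, width, num_rings):
    """Assign each spatial location to a concentric square ring (0 = centre)."""
    ys = np.abs(np.arange(height) - (height - 1) / 2.0) / (height / 2.0)
    xs = np.abs(np.arange(width) - (width - 1) / 2.0) / (width / 2.0)
    # Chebyshev distance from the centre, normalised to [0, 1).
    dist = np.maximum(ys[:, None], xs[None, :])
    return np.minimum((dist * num_rings).astype(int), num_rings - 1)


def ring_masks(height, width, num_rings):
    """LPN-style partition: one boolean (H, W) mask per square ring."""
    idx = ring_index(height, width, num_rings)
    return [idx == k for k in range(num_rings)]


def dense_masks(height, width, num_rings):
    """Hypothetical dense variant: base rings plus unions of adjacent rings."""
    base = ring_masks(height, width, num_rings)
    merged = [base[k] | base[k + 1] for k in range(num_rings - 1)]
    return base + merged


def pooled_parts(feature_map, masks):
    """Average-pool a (C, H, W) feature map inside each mask -> (parts, C)."""
    return np.stack([feature_map[:, m].mean(axis=1) for m in masks])


# Random stand-in for a backbone feature map (assumed shape: C=16, H=W=8).
feature_map = np.random.default_rng(0).normal(size=(16, 8, 8))
parts = pooled_parts(feature_map, dense_masks(8, 8, 4))
print(parts.shape)  # (7, 16): 4 base rings + 3 merged neighbours
```

In this sketch each part descriptor would feed its own classifier head, as in LPN; the overlapping unions simply add descriptors that straddle the ring boundaries.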

References

[1]
Quan Chen, Bolun Zheng, Xiaofei Zhou, Aiai Huang, Yaoqi Sun, Chuqiao Chen, Chenggang Yan, and Shanxin Yuan. 2023. Depth-guided deep filtering network for efficient single image bokeh rendering. Neural Computing and Applications (2023), 1--19. https://doi.org/10.1007/s00521-023-08852-y
[2]
Cheng Chi, Fangyun Wei, and Han Hu. 2020. Relationnet: Bridging visual representations for object detection via transformer decoder. Advances in Neural Information Processing Systems 33 (2020), 13564--13574.
[3]
Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, and Baining Guo. 2022. Cswin transformer: A general vision transformer backbone with cross-shaped windows. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12124--12134.
[4]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.
[5]
Sixing Hu, Mengdan Feng, Rang MH Nguyen, and Gim Hee Lee. 2018. Cvm-net: Cross-view matching network for image-based ground-to-aerial geo-localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7258--7267.
[6]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 (2012).
[7]
Liulei Li, Tianfei Zhou, Wenguan Wang, Jianwu Li, and Yi Yang. 2022. Deep hierarchical semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1246--1257.
[8]
Yehao Li, Ting Yao, Yingwei Pan, and Tao Mei. 2022. Contextual transformer networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 2 (2022), 1489--1500.
[9]
Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, and Radu Timofte. 2021. Swinir: Image restoration using swin transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 1833--1844.
[10]
Jinliang Lin, Zhedong Zheng, Zhun Zhong, Zhiming Luo, Shaozi Li, Yi Yang, and Nicu Sebe. 2022. Joint representation learning and keypoint detection for cross-view geo-localization. IEEE Transactions on Image Processing 31 (2022), 3780--3792.
[11]
Liu Liu and Hongdong Li. 2019. Lending orientation to neural networks for cross-view geo-localization. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 5624--5633.
[12]
Nian Liu, Ni Zhang, Kaiyuan Wan, Ling Shao, and Junwei Han. 2021. Visual saliency transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 4722--4732.
[13]
Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, et al. 2022. Swin transformer v2: Scaling up capacity and resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 12009--12019.
[14]
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision. 10012--10022.
[15]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019).
[16]
Xiaoliang Qian, Yinfeng Zeng, Wei Wang, and Qiuwen Zhang. 2022. Co-saliency detection guided by group weakly supervised learning. IEEE Transactions on Multimedia (2022).
[17]
Yujiao Shi, Liu Liu, Xin Yu, and Hongdong Li. 2019. Spatial-aware feature aggregation for image based cross-view geo-localization. Advances in Neural Information Processing Systems 32 (2019).
[18]
Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
[19]
Robin Strudel, Ricardo Garcia, Ivan Laptev, and Cordelia Schmid. 2021. Segmenter: Transformer for semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision. 7262--7272.
[20]
Gijs van Tulder, Yao Tong, and Elena Marchiori. 2021. Multi-view analysis of unregistered medical images using cross-view transformers. In Medical Image Computing and Computer Assisted Intervention--MICCAI 2021: 24th International Conference, Strasbourg, France, September 27--October 1, 2021, Proceedings, Part III 24. Springer, 104--113.
[21]
Tingyu Wang, Zhedong Zheng, Yaoqi Sun, Tat-Seng Chua, Yi Yang, and Chenggang Yan. 2022. Multiple-environment Self-adaptive Network for Aerial-view Geo-localization. arXiv preprint arXiv:2204.08381 (2022).
[22]
Tingyu Wang, Zhedong Zheng, Chenggang Yan, Jiyong Zhang, Yaoqi Sun, Bolun Zheng, and Yi Yang. 2022. Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization. IEEE Transactions on Circuits and Systems for Video Technology 32, 2 (2022), 867--879. https://doi.org/10.1109/TCSVT.2021.3061265
[23]
Tingyu Wang, Zhedong Zheng, Zunjie Zhu, Yuhan Gao, Yi Yang, and Chenggang Yan. 2022. Learning cross-view geo-localization embeddings via dynamic weighted decorrelation regularization. arXiv preprint arXiv:2211.05296 (2022).
[24]
Hongfa Wen, Chenggang Yan, Xiaofei Zhou, Runmin Cong, Yaoqi Sun, Bolun Zheng, Jiyong Zhang, Yongjun Bao, and Guiguang Ding. 2021. Dynamic selective network for RGB-D salient object detection. IEEE Transactions on Image Processing 30 (2021), 9179--9192.
[25]
Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, and Xiaolong Wang. 2022. Groupvit: Semantic segmentation emerges from text supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18134--18144.
[26]
Hongji Yang, Xiufan Lu, and Yingying Zhu. 2021. Cross-view geo-localization with evolving transformer. arXiv preprint arXiv:2107.00842 (2021).
[27]
Jia-Xing Zhao, Jiang-Jiang Liu, Deng-Ping Fan, Yang Cao, Jufeng Yang, and Ming-Ming Cheng. 2019. EGNet: Edge guidance network for salient object detection. In Proceedings of the IEEE/CVF international conference on computer vision. 8779--8788.
[28]
Bolun Zheng, Quan Chen, Shanxin Yuan, Xiaofei Zhou, Hua Zhang, Jiyong Zhang, Chenggang Yan, and Gregory Slabaugh. 2022. Constrained Predictive Filters for Single Image Bokeh Rendering. IEEE Transactions on Computational Imaging 8 (2022), 346--357. https://doi.org/10.1109/TCI.2022.3171417
[29]
Zhedong Zheng, Yujiao Shi, Tingyu Wang, Jun Liu, Jianwu Fang, Yunchao Wei, and Tat-Seng Chua. 2023. UAVM '23: 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective. In Proceedings of the 31st ACM International Conference on Multimedia Workshop.
[30]
Zhedong Zheng, Yunchao Wei, and Yi Yang. 2020. University-1652: A multi-view multi-source benchmark for drone-based geo-localization. In Proceedings of the 28th ACM international conference on Multimedia. 1395--1403.
[31]
Jiedong Zhuang, Ming Dai, Xuruoyan Chen, and Enhui Zheng. 2021. A faster and more effective cross-view matching method of uav and satellite images for uav geolocalization. Remote Sensing 13, 19 (2021), 3979.


    Published In

    UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
    November 2023
    86 pages
    ISBN:9798400702860
    DOI:10.1145/3607834
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep learning
    2. feature partition
    3. geo-localization
    4. transformer

    Qualifiers

    • Short-paper

    Conference

    MM '23