short-paper

A Cross-View Matching Method Based on Dense Partition Strategy for UAV Geolocalization

Authors:

Quan Chen

UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective

Pages 19 - 23

https://doi.org/10.1145/3607834.3616571

Published: 29 October 2023

Abstract

This paper reports our solution for the ACM Multimedia 2023 cross-view geo-localization challenge [29], which addresses real-world geo-localization against an extremely large gallery of satellite-view distractors. Our solution builds on SwinV2 [13] and LPN [22]. Concretely, we adopt SwinV2-B [13], a current mainstream transformer-based feature extractor, as the backbone of our model. Inspired by the feature partition strategy of LPN [22], we design a more efficient partition strategy, named the dense partition strategy, which segments and combines features to alleviate feature discontinuity at the boundaries between partitions. We took eighth place in the geographic localization track on the official leaderboard.
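
The abstract does not spell out how partitions are segmented and combined, so the following is only an illustrative sketch, not the authors' implementation. It assumes an LPN-style square-ring partition of a backbone feature map and, as one plausible reading of "dense", also pools over unions of adjacent rings so that each ring boundary falls inside at least one part. The function names, the `max_span` parameter, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np


def ring_index(height, width, num_rings):
    """Label every spatial position with a concentric square ring (0 = centre)."""
    ys = np.arange(height)[:, None]
    xs = np.arange(width)[None, :]
    # Normalised Chebyshev distance from the map centre, in [0, 1).
    dy = np.abs(ys - (height - 1) / 2) / (height / 2)
    dx = np.abs(xs - (width - 1) / 2) / (width / 2)
    dist = np.maximum(dy, dx)
    return np.minimum((dist * num_rings).astype(int), num_rings - 1)


def square_ring_parts(feature_map, num_rings):
    """LPN-style baseline: average-pool a (C, H, W) map over each ring."""
    _, height, width = feature_map.shape
    rings = ring_index(height, width, num_rings)
    parts = [feature_map[:, rings == r].mean(axis=1) for r in range(num_rings)]
    return np.stack(parts)  # shape (num_rings, C)


def dense_parts(feature_map, num_rings, max_span=2):
    """Hypothetical dense variant: also pool over unions of adjacent rings.

    ASSUMPTION: the combination rule (contiguous runs of up to `max_span`
    rings) is chosen for illustration and is not taken from the paper.
    """
    _, height, width = feature_map.shape
    rings = ring_index(height, width, num_rings)
    parts = []
    for span in range(1, max_span + 1):
        for start in range(num_rings - span + 1):
            mask = (rings >= start) & (rings < start + span)
            parts.append(feature_map[:, mask].mean(axis=1))
    return np.stack(parts)  # shape (number of parts, C)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((16, 8, 8))  # stand-in for backbone output
    print(square_ring_parts(features, 4).shape)
    print(dense_parts(features, 4).shape)
```

With four rings and `max_span=2`, the baseline yields four part descriptors and the dense variant yields seven (four single rings plus three adjacent pairs). In a retrieval pipeline each descriptor would typically feed its own classifier head during training and be concatenated at test time, as in LPN [22].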

References

[1]

Quan Chen, Bolun Zheng, Xiaofei Zhou, Aiai Huang, Yaoqi Sun, Chuqiao Chen, Chenggang Yan, and Shanxin Yuan. 2023. Depth-guided deep filtering network for efficient single image bokeh rendering. Neural Computing and Applications (2023), 1--19. https://doi.org/10.1007/s00521-023-08852-y

[2]

Cheng Chi, Fangyun Wei, and Han Hu. 2020. Relationnet: Bridging visual representations for object detection via transformer decoder. Advances in Neural Information Processing Systems 33 (2020), 13564--13574.

[3]

Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, and Baining Guo. 2022. Cswin transformer: A general vision transformer backbone with cross-shaped windows. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12124--12134.

[4]

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.

[5]

Sixing Hu, Mengdan Feng, Rang MH Nguyen, and Gim Hee Lee. 2018. Cvm-net: Cross-view matching network for image-based ground-to-aerial geo-localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7258--7267.

[6]

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 (2012).

[7]

Liulei Li, Tianfei Zhou, Wenguan Wang, Jianwu Li, and Yi Yang. 2022. Deep hierarchical semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1246--1257.

[8]

Yehao Li, Ting Yao, Yingwei Pan, and Tao Mei. 2022. Contextual transformer networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 2 (2022), 1489--1500.

[9]

Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, and Radu Timofte. 2021. Swinir: Image restoration using swin transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 1833--1844.

[10]

Jinliang Lin, Zhedong Zheng, Zhun Zhong, Zhiming Luo, Shaozi Li, Yi Yang, and Nicu Sebe. 2022. Joint representation learning and keypoint detection for cross-view geo-localization. IEEE Transactions on Image Processing 31 (2022), 3780--3792.

[11]

Liu Liu and Hongdong Li. 2019. Lending orientation to neural networks for cross-view geo-localization. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 5624--5633.

[12]

Nian Liu, Ni Zhang, Kaiyuan Wan, Ling Shao, and Junwei Han. 2021. Visual saliency transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 4722--4732.

[13]

Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, et al. 2022. Swin transformer v2: Scaling up capacity and resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 12009--12019.

[14]

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision. 10012--10022.

[15]

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019).

[16]

Xiaoliang Qian, Yinfeng Zeng, Wei Wang, and Qiuwen Zhang. 2022. Co-saliency detection guided by group weakly supervised learning. IEEE Transactions on Multimedia (2022).

[17]

Yujiao Shi, Liu Liu, Xin Yu, and Hongdong Li. 2019. Spatial-aware feature aggregation for image based cross-view geo-localization. Advances in Neural Information Processing Systems 32 (2019).

[18]

Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).

[19]

Robin Strudel, Ricardo Garcia, Ivan Laptev, and Cordelia Schmid. 2021. Segmenter: Transformer for semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision. 7262--7272.

[20]

Gijs van Tulder, Yao Tong, and Elena Marchiori. 2021. Multi-view analysis of unregistered medical images using cross-view transformers. In Medical Image Computing and Computer Assisted Intervention--MICCAI 2021: 24th International Conference, Strasbourg, France, September 27--October 1, 2021, Proceedings, Part III 24. Springer, 104--113.

[21]

Tingyu Wang, Zhedong Zheng, Yaoqi Sun, Tat-Seng Chua, Yi Yang, and Chenggang Yan. 2022. Multiple-environment Self-adaptive Network for Aerial-view Geo-localization. arXiv preprint arXiv:2204.08381 (2022).

[22]

Tingyu Wang, Zhedong Zheng, Chenggang Yan, Jiyong Zhang, Yaoqi Sun, Bolun Zheng, and Yi Yang. 2022. Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization. IEEE Transactions on Circuits and Systems for Video Technology 32, 2 (2022), 867--879. https://doi.org/10.1109/TCSVT.2021.3061265

[23]

Tingyu Wang, Zhedong Zheng, Zunjie Zhu, Yuhan Gao, Yi Yang, and Chenggang Yan. 2022. Learning cross-view geo-localization embeddings via dynamic weighted decorrelation regularization. arXiv preprint arXiv:2211.05296 (2022).

[24]

Hongfa Wen, Chenggang Yan, Xiaofei Zhou, Runmin Cong, Yaoqi Sun, Bolun Zheng, Jiyong Zhang, Yongjun Bao, and Guiguang Ding. 2021. Dynamic selective network for RGB-D salient object detection. IEEE Transactions on Image Processing 30 (2021), 9179--9192.

[25]

Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, and Xiaolong Wang. 2022. Groupvit: Semantic segmentation emerges from text supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18134--18144.

[26]

Hongji Yang, Xiufan Lu, and Yingying Zhu. 2021. Cross-view geo-localization with evolving transformer. arXiv preprint arXiv:2107.00842 (2021).

[27]

Jia-Xing Zhao, Jiang-Jiang Liu, Deng-Ping Fan, Yang Cao, Jufeng Yang, and Ming-Ming Cheng. 2019. EGNet: Edge guidance network for salient object detection. In Proceedings of the IEEE/CVF international conference on computer vision. 8779--8788.

[28]

Bolun Zheng, Quan Chen, Shanxin Yuan, Xiaofei Zhou, Hua Zhang, Jiyong Zhang, Chenggang Yan, and Gregory Slabaugh. 2022. Constrained Predictive Filters for Single Image Bokeh Rendering. IEEE Transactions on Computational Imaging 8 (2022), 346--357. https://doi.org/10.1109/TCI.2022.3171417

[29]

Zhedong Zheng, Yujiao Shi, Tingyu Wang, Jun Liu, Jianwu Fang, Yunchao Wei, and Tat-seng Chua. 2023. UAVM '23: 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective. In Proceedings of the 31st ACM International Conference on Multimedia Workshop.

[30]

Zhedong Zheng, Yunchao Wei, and Yi Yang. 2020. University-1652: A multi-view multi-source benchmark for drone-based geo-localization. In Proceedings of the 28th ACM international conference on Multimedia. 1395--1403.

[31]

Jiedong Zhuang, Ming Dai, Xuruoyan Chen, and Enhui Zheng. 2021. A faster and more effective cross-view matching method of uav and satellite images for uav geolocalization. Remote Sensing 13, 19 (2021), 3979.

Index Terms

A Cross-View Matching Method Based on Dense Partition Strategy for UAV Geolocalization
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations

Published In

UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective

November 2023

86 pages

ISBN:9798400702860

DOI:10.1145/3607834

General Chairs:
Zhedong Zheng, National University of Singapore, Singapore
Yujiao Shi, The Australian National University, Australia
Tingyu Wang, Hangzhou Dianzi University, China
Jun Liu, Singapore University of Technology and Design, Singapore
Jianwu Fang, Chang'an University, China
Yunchao Wei, Beijing Jiaotong University, China
Tat-seng Chua, National University of Singapore, Singapore

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 October 2023

Qualifiers

Short-paper

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

November 2, 2023

Ottawa ON, Canada
