short-paper

Dual-branch Pattern and Multi-scale Context Facilitate Cross-view Geo-localization

Authors:

Fasheng Wang

UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective

Pages 25 - 29

https://doi.org/10.1145/3607834.3616572

Published: 29 October 2023

Abstract

Cross-view geo-localization aims to retrieve images of the same geographic location captured from different viewpoints, and it is a challenging task in computer vision. In complex scenes, visually similar images and the environment surrounding the target building interfere with matching, which significantly reduces accuracy. To address this problem, we propose a cross-view geo-localization method based on a dual-branch pattern and multi-scale context, aimed at challenging datasets with numerous distractors. The method uses a Transformer feature extraction network to reduce the loss of fine-grained features. In addition, a dual-branch structure is designed to capture image semantic information and local context information bidirectionally, which effectively handles the larger number of distractors in satellite images and improves geo-localization accuracy in complex scenes. Quantitative experiments show that both recall (Recall) and image retrieval average precision (AP) improve significantly on the benchmark dataset University-1652 and the challenging dataset University-160K, and that our method achieves advanced cross-view geo-localization performance.
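The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the usual retrieval setup, in which a query feature is ranked against gallery features by cosine similarity, and it uses the textbook definitions of Recall@K and average precision. The function names and the toy vectors are hypothetical and are not taken from the paper, and the benchmark's official evaluation script may compute AP slightly differently.

```python
from typing import List, Sequence, Set


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def rank_gallery(query: Sequence[float],
                 gallery: Sequence[Sequence[float]]) -> List[int]:
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


def recall_at_k(ranking: Sequence[int], relevant: Set[int], k: int) -> float:
    """1.0 if any true match appears in the top-k results, else 0.0."""
    return 1.0 if any(i in relevant for i in ranking[:k]) else 0.0


def average_precision(ranking: Sequence[int], relevant: Set[int]) -> float:
    """Mean of the precision values at each rank where a true match occurs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(ranking, start=1):
        if idx in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Toy example (hypothetical features): one query, four gallery images,
# of which gallery images 0 and 2 show the same location as the query.
query = [1.0, 0.0]
gallery = [[0.0, 1.0], [1.0, 0.1], [0.9, 0.5], [-1.0, 0.0]]
relevant = {0, 2}

ranking = rank_gallery(query, gallery)
print("ranking:", ranking)
print("Recall@1:", recall_at_k(ranking, relevant, 1))
print("Recall@2:", recall_at_k(ranking, relevant, 2))
print("AP:", average_precision(ranking, relevant))
```

In practice these per-query values are averaged over all queries. A distractor-heavy gallery such as University-160K mainly affects the ranking step, since more non-matching images compete for the top positions.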


Cited By

Deuser F, Werner M, Habel K, Oswald N, Zheng Z, Shi Y, Wang T, Chen C, Zhu P, Hartley R. (2024). Optimizing Geo-Localization with k-Means Re-Ranking in Challenging Weather Conditions. Proceedings of the 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective, pages 9-13. DOI: 10.1145/3689095.3689099. Online publication date: 28-Oct-2024
https://dl.acm.org/doi/10.1145/3689095.3689099

Index Terms

Dual-branch Pattern and Multi-scale Context Facilitate Cross-view Geo-localization
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision


Information & Contributors

Information

Published In

UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective

November 2023

86 pages

ISBN:9798400702860

DOI:10.1145/3607834

General Chairs:
Zhedong Zheng, National University of Singapore, Singapore
Yujiao Shi, The Australian National University, Australia
Tingyu Wang, Hangzhou Dianzi University, China
Jun Liu, Singapore University of Technology and Design, Singapore
Jianwu Fang, Chang'an University, China
Yunchao Wei, Beijing Jiaotong University, China
Tat-seng Chua, National University of Singapore, Singapore

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

Dalian Youth Science and Technology Star Program
National Natural Science Foundation of China

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

November 2, 2023

Ottawa ON, Canada

