DOI: 10.1145/3607834.3616572
Short paper

Dual-branch Pattern and Multi-scale Context Facilitate Cross-view Geo-localization

Published: 29 October 2023

Abstract

Cross-view geo-localization aims to retrieve images of the same geographic location captured from different viewpoints, and it is a challenging task in computer vision. In complex scenes, visually similar images and the environment surrounding the target building interfere with matching and significantly reduce accuracy. To address this problem, we propose a cross-view geo-localization method based on a dual-branch pattern and multi-scale context, targeting challenging datasets with numerous distractors. The method uses a Transformer feature extraction network to reduce the loss of fine-grained features. In addition, a dual-branch structure captures image semantic information and local context information bidirectionally, which mitigates the large number of distractors in satellite images and improves geo-localization accuracy in complex scenes. Quantitative experiments show significant improvements in both recall (Recall) and image retrieval average precision (AP) on the benchmark dataset University-1652 and the challenging dataset University-160K, indicating that our method achieves advanced cross-view geo-localization performance.
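
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how Recall@K and AP are commonly computed for a single retrieval query. It is not the authors' evaluation code: the function names, the gallery identifiers, and the toy ranking are hypothetical, and AP is written in its standard non-interpolated form, which may differ in detail from the evaluation scripts used for University-1652 and University-160K. The sketch also assumes that every relevant gallery item appears somewhere in the ranked list.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Return 1.0 if at least one relevant gallery item is in the top-k results."""
    return 1.0 if any(g in relevant_ids for g in ranked_ids[:k]) else 0.0


def average_precision(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Non-interpolated AP: mean precision at the ranks of the relevant items."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_ids)


# Hypothetical query: gallery images sorted by descending similarity to the query.
ranked = ["s3", "s1", "s7", "s2"]
relevant = {"s1", "s2"}  # gallery images showing the same location as the query

print(recall_at_k(ranked, relevant, 1))     # 0.0
print(recall_at_k(ranked, relevant, 2))     # 1.0
print(average_precision(ranked, relevant))  # 0.5
```

Dataset-level Recall@K and AP are then obtained by averaging these per-query values over all queries.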


Cited By

  • (2024) Optimizing Geo-Localization with k-Means Re-Ranking in Challenging Weather Conditions. Proceedings of the 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective, 9-13. https://doi.org/10.1145/3689095.3689099. Online publication date: 28-Oct-2024

Published In

UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
November 2023
86 pages
ISBN:9798400702860
DOI:10.1145/3607834
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. drone
  2. dual-branch pattern
  3. geo-localization
  4. transformer network

Qualifiers

  • Short-paper

Conference

MM '23