Ground-to-Aerial Image Geo-Localization with Cross-View Image Synthesis

Jiaqing Huang & Dengpan Ye

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12890)

Included in the following conference series:

International Conference on Image and Graphics

Abstract

Ground-to-aerial image geo-localization can be achieved by matching a ground-view query image against aerial images with geographic labels in a reference database. The task remains challenging because of the drastic change in viewpoint. In this paper, we propose a new conditional generative adversarial network (cGAN) for cross-view image synthesis, called Crossview Sequential Fork (CSF), which generates ground images from aerial images. CSF achieves more detailed synthesis by also generating segmentation maps and edge detection images. The synthesized ground images are fed to the image matching framework Cross View Synthesis Net (CVS-Net) to assist geo-localization: the distance between the descriptors of the source ground image and the synthesized ground image is calculated to assist the training of the network. CVS-Net builds on the Siamese architecture to perform metric learning for the matching task. Moreover, we introduce the SARE loss into the training procedure and improve it with our data entry form, which greatly improves the convergence rate and image retrieval accuracy compared with the traditional triplet loss. Experimental results on two benchmark datasets demonstrate the effectiveness and superiority of our proposed method over the state-of-the-art method.
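
To make the contrast between the two losses named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes descriptors are plain vectors compared by squared Euclidean distance, and it uses one commonly described form of a SARE-style loss: the negative log-probability, under a softmax over negative squared distances, that the query is matched to its positive. The function names, the margin value, and the toy vectors are hypothetical. The abstract does not specify the "data entry form" or the CSF/CVS-Net architectures, so neither is modeled here.

```python
import numpy as np


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.dot(diff, diff))


def triplet_loss(query, positive, negatives, margin=0.5):
    """Margin-based triplet loss, summed over all negatives.

    Each negative contributes only while it is closer to the query than
    the positive distance plus the margin (hypothetical margin value).
    """
    d_pos = squared_distance(query, positive)
    return sum(
        max(0.0, margin + d_pos - squared_distance(query, neg))
        for neg in negatives
    )


def sare_style_loss(query, positive, negatives):
    """SARE-style loss (assumed form): -log p(positive | query).

    The matching probability is a softmax over negative squared
    distances from the query to the positive and to every negative.
    """
    logits = np.array(
        [-squared_distance(query, positive)]
        + [-squared_distance(query, neg) for neg in negatives]
    )
    logits -= logits.max()  # subtract the max for numerical stability
    log_prob_positive = logits[0] - np.log(np.exp(logits).sum())
    return float(-log_prob_positive)


# Toy ground-view query with one matching and one non-matching aerial descriptor.
query = [1.0, 0.0]
matching = [1.0, 0.0]
non_matching = [0.0, 1.0]

print(triplet_loss(query, matching, [non_matching]))
print(sare_style_loss(query, matching, [non_matching]))
```

In this toy case the triplet loss is already zero because the negative lies outside the margin, while the softmax-based loss stays positive and still produces a training signal. That difference is one plausible reason a probabilistic loss of this kind might converge faster than a hard-margin triplet loss.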



Acknowledgement

This work was partially supported by the National Natural Science Foundation of China (NSFC) [grant numbers 62072343, U1736211] and the National Key Research and Development Program of China [grant number 2019QY(Y)0206].

Author information

Authors and Affiliations

Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China
Jiaqing Huang & Dengpan Ye

Authors

Jiaqing Huang
Dengpan Ye

Corresponding author

Correspondence to Dengpan Ye.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Yuxin Peng
Tsinghua University, Beijing, China
Shi-Min Hu
Tampere University, Tampere, Finland
Moncef Gabbouj
Zhejiang University, Hangzhou, China
Kun Zhou
Technion – Israel Institute of Technology, Haifa, Israel
Michael Elad
Tsinghua University, Beijing, China
Kun Xu

About this paper

Cite this paper

Huang, J., Ye, D. (2021). Ground-to-Aerial Image Geo-Localization with Cross-View Image Synthesis. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12890. Springer, Cham. https://doi.org/10.1007/978-3-030-87361-5_34

DOI: https://doi.org/10.1007/978-3-030-87361-5_34
Published: 30 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87360-8
Online ISBN: 978-3-030-87361-5
eBook Packages: Computer Science, Computer Science (R0)
