Abstract
Ground-to-aerial image geo-localization matches a ground-view query image against aerial images with geographic labels in a reference database. The task remains challenging because of the drastic change in viewpoint. In this paper, we propose a new conditional generative adversarial network (cGAN) for cross-view image synthesis, called Crossview Sequential Fork (CSF), which generates ground images from aerial images. CSF achieves more detailed synthesis by also generating segmentation maps and edge detection images. The synthesized ground images are fed to the image matching framework Cross View Synthesis Net (CVS-Net) to assist geo-localization: the distance between the descriptors of the source ground image and the synthesized ground image is computed and used to assist the training of the network. CVS-Net builds on a Siamese architecture and performs metric learning for the matching task. Moreover, we introduce the SARE loss into the training procedure and improve it with our data entry form, which greatly improves the convergence rate and image retrieval accuracy compared with the traditional triplet loss. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed method and its superiority over the state-of-the-art method.
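The loss terms mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything here is an assumption: the triplet loss is the generic margin-based form, the SARE-style term is a generic softmax over negative squared distances (which reduces to a softplus of the distance gap), and the function names, the margin value, and the toy descriptors are hypothetical. It is not the CVS-Net implementation.

```python
import numpy as np

def sq_dist(a, b):
    """Squared Euclidean distance between descriptors along the last axis."""
    return np.sum((a - b) ** 2, axis=-1)

def triplet_loss(query, positive, negative, margin=0.5):
    """Generic margin-based triplet loss (margin value is an assumption)."""
    return np.maximum(0.0, margin + sq_dist(query, positive) - sq_dist(query, negative))

def sare_style_loss(query, positive, negative):
    """Assumed SARE-style term: negative log of the softmax probability that
    the query matches the positive rather than the negative.
    -log(e^-dp / (e^-dp + e^-dn)) = log(1 + e^(dp - dn))."""
    gap = sq_dist(query, positive) - sq_dist(query, negative)
    return np.logaddexp(0.0, gap)

def synthesis_distance(ground_desc, synth_desc):
    """Hypothetical auxiliary term: distance between the descriptor of the
    source ground image and that of the synthesized ground image."""
    return sq_dist(ground_desc, synth_desc)

# Toy 2-D descriptors, made up for illustration only.
ground = np.array([1.0, 0.0])         # ground-view query descriptor
aerial_match = np.array([1.0, 0.0])   # matching aerial descriptor
aerial_other = np.array([0.0, 1.0])   # non-matching aerial descriptor
synth = np.array([0.9, 0.1])          # descriptor of a synthesized ground image

print(triplet_loss(ground, aerial_match, aerial_other))
print(sare_style_loss(ground, aerial_match, aerial_other))
print(synthesis_distance(ground, synth))
```

One difference visible in this sketch: the margin-based term becomes exactly zero once a triplet satisfies the margin, whereas the softmax-style term stays positive and keeps providing gradient.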
Acknowledgement
This work was partially supported by the National Natural Science Foundation of China (NSFC) [grant numbers 62072343, U1736211] and the National Key Research and Development Program of China [grant number 2019QY(Y)0206].
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, J., Ye, D. (2021). Ground-to-Aerial Image Geo-Localization with Cross-View Image Synthesis. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12890. Springer, Cham. https://doi.org/10.1007/978-3-030-87361-5_34
DOI: https://doi.org/10.1007/978-3-030-87361-5_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87360-8
Online ISBN: 978-3-030-87361-5
eBook Packages: Computer Science, Computer Science (R0)