Abstract
This paper presents a novel and efficient multitask learning framework for image translation and salient object detection from remote sensing images. The framework mainly comprises an image translation network, the weight-sharing attention GAN (WSA-GAN), and a salient object detection network, the boundary guidance network (BGNet). WSA-GAN generates a large number of synthetic infrared remote sensing images (IRIs) or optical remote sensing images (ORIs) from the corresponding complementary-modality images. A new multimodal context-aware learning strategy is then proposed for feature extraction and for coordinating the entanglement of latent features in the multimodal context of ORIs and IRIs. Since convolutional neural networks do not perform well when objects exhibit directional variance, the framework introduces an attention-aware CapsNet (AACNet) to alleviate this problem and enhance feature expressiveness. In addition, a knowledge distillation strategy is incorporated into AACNet to reduce model complexity. Finally, a multiscale feature learning network and a boundary-aware block are designed to produce more accurate saliency maps with clear boundaries. Experimental results demonstrate that the proposed image translation and salient object detection networks outperform competing approaches.
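To make the multitask pipeline more concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: a hypothetical generator translates an optical image into a synthetic infrared image, the two modalities are fused and passed to a saliency detector, and a standard Hinton-style knowledge-distillation term (soft-label KL divergence blended with the task loss) regularizes a compact student network, in the spirit of the distillation used in AACNet. All module and parameter names (`generator`, `saliency_net`, `T`, `alpha`) are illustrative assumptions.

```python
# Illustrative sketch only; the module interfaces are placeholders,
# not the paper's released code.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    """Hinton-style knowledge distillation: blend soft-label KL with the task loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitude is comparable to the hard loss
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard


def multitask_forward(ori, generator, saliency_net):
    """Translate an optical image (ORI) to a synthetic infrared image (IRI),
    then predict saliency from the channel-wise fused multimodal input."""
    with torch.no_grad():
        iri_fake = generator(ori)          # ORI -> synthetic IRI
    x = torch.cat([ori, iri_fake], dim=1)  # fuse the two modalities
    return saliency_net(x)                 # predicted saliency map
```

In such a setup the translation network can be trained first and then frozen while the saliency branch and the distilled student are optimized, which is one common way to stage a multitask pipeline of this kind.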
Data availability
The raw/processed data required to reproduce these findings cannot be shared at this time, as the data also form part of an ongoing study. Any questions should be directed to the corresponding author.
Funding
This research was funded by NSFC grant 61972353, NSF grants IIS-1816511 and OAC-1910469, and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-05).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lian, Y., Shi, X., Shen, S. et al. Multitask learning for image translation and salient object detection from multimodal remote sensing images. Vis Comput 40, 1395–1414 (2024). https://doi.org/10.1007/s00371-023-02857-3