Local weight coupled network: multi-modal unequal semi-supervised domain adaptation

Ziyun Cai ORCID: orcid.org/0000-0001-6822-915X¹,
Jie Song¹,
Tengfei Zhang¹,
Changhui Hu¹ &
…
Xiao-Yuan Jing^2,3

313 Accesses
Explore all metrics

Abstract

Existing semi-supervised domain adaptation (SSDA) approaches on visual classification usually assume that the labelled source data are only collected from single modality. However, since single source data cannot fully show the characteristics of the target data, source domain may be collected from multiple modalities (i.e. RGB and depth modalities). Traditional domain adaptation (DA) task makes an unrealistic scenario, where the label space in the source equals to the label space in the target. However, in real-world scenario, source and target domains may have different label spaces. Thus, the irrelevant categories in the source domain will cause two challenges: negative transfer and imbalanced distribution. In this paper, we design a novel deep SSDA framework in an end-to-end fashion, termed Local Weight Coupled Network (LWCN) for effective knowledge transfer, which aims to take advantage of the multi-modal information in the source domain and tackle the mentioned challenges, simultaneously. Specially, we construct the output layer including classification and regression, where the multi-class classifier and the multi-layer feature extractor can be learned jointly for mutual benefits. Empirical evaluations on five cross-domain benchmarks illustrate the competitive performance of our model with respect to the state-of-the-art, especially under the unequal categories scenario.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 7

Simultaneous Deep Transfer Across Domains and Tasks

Visual domain adaptation via transfer feature learning

Article 07 May 2016

Multi-source domain adaptation for image classification

Article 27 June 2020

Data Availability Statement

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

References

Aggarwal K, Mijwil MM, Al-Mistarehi AH et al (2022) Has the future started? The current growth of artificial intelligence, machine learning, and deep learning. Iraqi J Comput Sci Math 3(1):115–123
Google Scholar
Cai Z, Long Y, Jing XY et al (2018) Adaptive visual-depth fusion transfer. In: Asian conference on computer vision, pp 56–73
Cai Z, Long Y, Shao L (2018) Adaptive rgb image recognition by visual-depth embedding. IEEE Trans Image Process 27(5):2471–2483
Article MathSciNet Google Scholar
Cai Z, Jing XY, Shao L (2020) Visual-depth matching network: deep rgb-d domain adaptation with unequal categories. IEEE Trans Cybern 52 (6):4623–4635
Article Google Scholar
Cao Z, Long M, Wang J et al (2018) Partial transfer learning with selective adversarial networks. In: IEEE Conference on computer vision and pattern recognition, pp 2724–2732
Cao Z, You K, Long M et al (2019) Learning to transfer examples for partial domain adaptation. In: IEEE Conference on computer vision and pattern recognition, pp 2985–2994
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
MathSciNet Google Scholar
Deng J, Dong W, Socher R et al (2009) Imagenet: a large-scale hierarchical image database. In: IEEE Conference on computer vision and pattern recognition, pp 248–255
Ding Z, Nasrabadi NM, Fu Y (2018) Semi-supervised deep domain adaptation via coupled neural networks. IEEE Trans Image Process 27(11):5214–5224
Article MathSciNet Google Scholar
Donahue J, Jia Y, Vinyals O et al (2014) Decaf: a deep convolutional activation feature for generic visual recognition. In: International Conference on machine learning, pp 647–655
Fei-Fei L, Perona P (2005) A bayesian hierarchical model for learning natural scene categories. In: IEEE Conference on computer vision and pattern recognition, pp 524–531
Ganin Y, Ustinova E, Ajakan H et al (2016) Domain-adversarial training of neural networks. J Mach Learn Res 17(1):2096–2030
MathSciNet Google Scholar
Griffin G, Holub A, Perona P (2007) Caltech-256 object category dataset
Gupta S, Girshick R, Arbeláez P et al (2014) Learning rich features from RGB-D images for object detection and segmentation. In: European conference on computer vision, pp 345–360
He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition, pp 770–778
Izonin I, Tkachenko R, Shakhovska N et al (2022) A two-step data normalization approach for improving classification accuracy in the medical diagnosis domain. Mathematics 10(11):1942
Article Google Scholar
Jangra M, Dhull SK, Singh KK et al (2021) O-wcnn: an optimized integration of spatial and spectral feature map for arrhythmia classification. Complex Intell Syst, 1–14
Janoch A, Karayev S, Jia Y et al (2013) A category-level 3D object dataset: putting the kinect to work
Koppanati RK, Kumar K (2020) P-mec: polynomial congruence-based multimedia encryption technique over cloud. IEEE Consum Electron Mag 10(5):41–46
Article Google Scholar
Kumar K (2021) Text query based summarized event searching interface system using deep learning over cloud. Multimed Tools Applic 80(7):11,079–11,094
Article Google Scholar
Kumar K, Shrimankar DD (2017) F-des: fast and deep event summarization. IEEE Trans Multimed 20(2):323–334
Article Google Scholar
Kumar K, Shrimankar DD (2018) Deep event learning boost-up approach: delta. Multimed Tools Applic 77:26,635–26,655
Article Google Scholar
Kumar K, Shrimankar DD, Singh N (2016) Equal partition based clustering approach for event summarization in videos. In: International conference on signal-image technology & internet-based systems, pp 119–126
Kumar K, Shrimankar DD, Singh N (2017) Event bagging: a novel event summarization approach in multiview surveillance videos. In: International conference on innovations in electronics, signal processing and communication, pp 106–111
Kumar K, Shrimankar DD, Singh N (2018) Eratosthenes sieve based key-frame extraction technique for event summarization in videos. Multimed Tools Appl 77:7383–7404
Article Google Scholar
Kumar K, Shrimankar DD, Singh N (2018) Somes: an efficient som technique for event summarization in multi-view surveillance videos. In: Recent findings in intelligent computing techniques, pp 383–389
Lai K, Bo L, Ren X et al (2011) A large-scale hierarchical multi-view RGB-D object dataset. In: IEEE International conference on robotics and automation, pp 1817–1824
Li L, Zhang Z (2018) Semi-supervised domain adaptation by covariance matching. IEEE Trans Pattern Anal Mach Intell 41(11):2724–2739
Article Google Scholar
Li W, Gu J, Dong Y et al (2020) Indoor scene understanding via RGB-D image segmentation employing depth-based CNN and crfs. Multimed Tools Applic 79(47):35,475–35,489
Article Google Scholar
Li Y, Li H, Gao G (2022) Towards end-to-end container code recognition. Multimed Tools Applic 81(11):15,901–15,918
Article Google Scholar
Long M, Zhu H, Wang J et al (2016) Unsupervised domain adaptation with residual transfer networks. In: Advances in neural information processing systems, pp 136–144
Ma N, Bu J, Lu L et al (2022) Context-guided entropy minimization for semi-supervised domain adaptation. Neural Netw 154:270–282
Article Google Scholar
Mancini M, Porzi L, Rota Bulò S et al (2018) Boosting domain adaptation by discovering latent domains. In: IEEE Conference on computer vision and pattern recognition, pp 3771–3780
Park GY, Lee SW (2021) Information-theoretic regularization for multi-source domain adaptation. In: IEEE/CVF International conference on computer vision, pp 9214–9223
Saito K, Kim D, Sclaroff S et al (2019) Semi-supervised domain adaptation via minimax entropy. In: IEEE International conference on computer vision, pp 8050–8058
Shao L, Cai Z, Liu L et al (2017) Performance evaluation of deep feature learning for RGB-D image/video classification. Inform Sci 385:266–283
Article Google Scholar
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22:8
Google Scholar
Silberman N, Fergus R (2011) Indoor scene segmentation using a structured light sensor. In: IEEE International Conference on Computer Vision Workshops, pp 601–608
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:14091556
Tzeng E, Hoffman J, Saenko K et al (2017) Adversarial discriminative domain adaptation. In: IEEE Conference on computer vision and pattern recognition, pp 7167–7176
Wang Q, Fink O, Van Gool L et al (2022) Continual test-time domain adaptation. In: IEEE/CVF Conference on computer vision and pattern recognition, pp 7201–7211
Wu F, Wei P, Gao G et al (2022) Dual-aligned unsupervised domain adaptation with graph convolutional networks. Multimed Tools Applic 81 (11):14,979–14,997
Article Google Scholar
Xiao J, Jing L, Zhang L et al (2022) Learning from temporal gradient for semi-supervised action recognition. In: IEEE/CVF Conference on computer vision and pattern recognition, pp 3252–3262
Yang C, Cheung YM, Ding J et al (2022) Contrastive learning assisted-alignment for partial domain adaptation. IEEE Transactions on Neural Networks and Learning Systems
Yang N, Zhang C, Zhang Y et al (2022) A benchmark dataset and baseline model for co-salient object detection within rgb-d images. Multimed Tools Applic 81(25):35,831–35,842
Article Google Scholar
Yao S, Kang Q, Zhou M et al (2022) Discriminative manifold distribution alignment for domain adaptation. IEEE Transactions on Systems, Man, and Cybernetics: Systems
Zhou B, Lapedriza A, Xiao J et al (2014) Learning deep features for scene recognition using places database. In: Advances in neural information processing systems, pp 487–495
Zou W, Peng Y, Zhang Z et al (2022) Rgb-d gate-guided edge distillation for indoor semantic segmentation. Multimed Tools Applic 81(25):35,815–35,830
Article Google Scholar

Download references

Acknowledgments

This work is partly supported by the Natural Science Foundation of China (Grant No. 62006127, 62073173, 62176069, 61833011, 62272240 and 62001247), partly supported by Postdoctoral Science Foundation of Jiangsu Grant 2021K290B, and Natural Science Foundation of Guangdong Province under Grant No. 2019A1515011076. It is also supported by Postdoctoral Science Foundation of China (Grant No. 2021M691656).

Author information

Authors and Affiliations

College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, China
Ziyun Cai, Jie Song, Tengfei Zhang & Changhui Hu
School of Computer, Guangdong University of Petrochemical Technology, Maoming, China
Xiao-Yuan Jing
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Xiao-Yuan Jing

Authors

Ziyun Cai
View author publications
You can also search for this author in PubMed Google Scholar
Jie Song
View author publications
You can also search for this author in PubMed Google Scholar
Tengfei Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Changhui Hu
View author publications
You can also search for this author in PubMed Google Scholar
Xiao-Yuan Jing
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ziyun Cai.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Cai, Z., Song, J., Zhang, T. et al. Local weight coupled network: multi-modal unequal semi-supervised domain adaptation. Multimed Tools Appl 83, 4331–4357 (2024). https://doi.org/10.1007/s11042-023-15439-1

Download citation

Received: 23 October 2022
Revised: 24 March 2023
Accepted: 18 April 2023
Published: 24 May 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11042-023-15439-1

Local weight coupled network: multi-modal unequal semi-supervised domain adaptation

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Simultaneous Deep Transfer Across Domains and Tasks

Visual domain adaptation via transfer feature learning

Multi-source domain adaptation for image classification

Data Availability Statement

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of Interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Local weight coupled network: multi-modal unequal semi-supervised domain adaptation

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Simultaneous Deep Transfer Across Domains and Tasks

Visual domain adaptation via transfer feature learning

Multi-source domain adaptation for image classification

Data Availability Statement

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of Interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation