Invariant Representation Learning in Multimedia Recommendation with Modality Alignment and Model Fusion
Figure 1. Schematic diagram of spurious correlation in MRS.
Figure 2. Overall framework of M3-InvRL, which includes multimedia representation, invariant representation, and model merging.
Figure 3. Comparison among Naive-UltraGCN (UltraGCN), UltraGCN + InvRL (InvRL), and M3-InvRL on the Tiktok dataset with respect to Precision@K, Recall@K, and NDCG@K.
Figure 4. Experimental comparison of different numbers of environments $|\mathcal{E}|$.
Abstract
1. Introduction
- We propose to learn both shared and modality-specific representations to mitigate the generalization issues of relying on a single shared space. By aligning individual modality representations with the complete set of modalities, the framework effectively integrates and complements information across modalities.
- We introduce a new multimedia recommendation framework, M3-InvRL, which maps modality features into shared and specific spaces to learn invariant representations for each component. We utilize model merging to fully leverage all available invariant information, adaptively adjusting the weights of different modality predictors to enhance the model’s generalization ability.
- We conduct extensive experiments on two real-world datasets to demonstrate the effectiveness of our proposed framework.
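As a minimal sketch of how the weights of the modality predictors might be adjusted adaptively, the snippet below gives more weight to predictors whose outputs are more certain. The binary-entropy uncertainty, the softmax over negative entropy, and every name used here (`binary_entropy`, `merge_predictions`, the example scores) are assumptions made for illustration; they are not taken from the M3-InvRL formulation.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli prediction p, clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def merge_predictions(preds_by_modality):
    """Weight each modality predictor by its certainty and average the scores.

    preds_by_modality maps a modality name to predicted interaction
    probabilities for the same list of user-item pairs.
    """
    # Mean entropy per modality: a higher value means a less certain predictor.
    uncertainty = {
        m: sum(binary_entropy(p) for p in ps) / len(ps)
        for m, ps in preds_by_modality.items()
    }
    # Assumed weighting rule: softmax over the negative uncertainty.
    exp_neg = {m: math.exp(-h) for m, h in uncertainty.items()}
    total = sum(exp_neg.values())
    weights = {m: v / total for m, v in exp_neg.items()}
    # Final prediction: weighted sum of the per-modality scores.
    n_pairs = len(next(iter(preds_by_modality.values())))
    merged = [
        sum(weights[m] * preds_by_modality[m][k] for m in preds_by_modality)
        for k in range(n_pairs)
    ]
    return weights, merged


# Hypothetical scores from three modality predictors for four user-item pairs.
preds = {
    "visual": [0.95, 0.05, 0.90, 0.10],
    "acoustic": [0.55, 0.45, 0.60, 0.40],
    "textual": [0.80, 0.20, 0.70, 0.30],
}
weights, merged = merge_predictions(preds)
print(weights)
print(merged)
```

In this toy input the visual predictor is the most confident, so it receives the largest weight, while the near-uniform acoustic predictor contributes the least.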
2. Related Work
2.1. Collaborative Filtering for Recommendation
2.2. Multimedia Recommendation
2.3. Invariant Representation Learning
3. Preliminaries
4. Methods
4.1. Multimodal Representation for Recommendation
4.2. Invariant Learning for Recommendation
4.3. Model Merging for Recommendation
5. Results
5.1. Datasets
5.2. Experiment Details
5.3. Baselines
5.4. Evaluation Metrics
5.5. Overall Performance
5.6. Performance Comparison with Different Values of K
5.7. Effect of and
5.8. Different Model Merging Strategy
5.9. Study on the Number of Environments
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Variable | Description |
|---|---|
| u | User u in the recommendation system, drawn from the set of all users. |
| i | Item i in the recommendation system, drawn from the set of all items. |
| | Binary interaction: 1 if user u positively interacts with item i, 0 otherwise. |
| | Set of all user-item interactions, consisting of positive samples and negative samples. |
| | Feature of item i for modality m. |
| | Dimension of the feature vector for modality m. |
| | Recommendation model predicting user preferences. |
| | Parameters of the recommendation model. |
| | Base encoder for the r-th modality. |
| | Representation generated by the base encoder for the r-th modality. |
| | Shared head mapping representations to a common space. |
| | Shared representations for modality m and for all modalities, respectively. |
| | Specific head generating modality-specific representations. |
| | Modality-specific representation for modality m. |
| | Similarity score between modality m (sample i) and modality n (sample j). |
| | Common loss across modalities and mutual information loss. |
| | Invariant and variant representations for modality r of item i. |
| | Invariant mask used to generate invariant representations. |
| | Combined invariant representations for modality m. |
| | Final recommendation model for modality m. |
| | Entropy-based uncertainty for the m-th modality. |
| | Importance weight for the m-th modality. |
| | Final prediction by aggregating all predictors. |
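The role of the invariant mask listed in the notation can be illustrated with a short sketch. It assumes that the mask takes values in [0, 1] and is applied element-wise, so that a feature vector splits into an invariant part and a complementary variant part; this range, the element-wise rule, and the names `split_by_mask`, `feature`, and `mask` are illustrative assumptions rather than details confirmed by the text.

```python
def split_by_mask(feature, mask):
    """Split a feature vector into invariant and variant parts.

    Assumed rule: invariant = mask * feature and variant = (1 - mask) * feature,
    both element-wise, with every mask entry in [0, 1].
    """
    if len(feature) != len(mask):
        raise ValueError("feature and mask must have the same length")
    if any(not 0.0 <= w <= 1.0 for w in mask):
        raise ValueError("mask entries must lie in [0, 1]")
    invariant = [w * x for w, x in zip(mask, feature)]
    variant = [(1.0 - w) * x for w, x in zip(mask, feature)]
    return invariant, variant


# Hypothetical 4-dimensional item feature and mask.
feature = [0.5, -1.0, 2.0, 4.0]
mask = [1.0, 0.0, 0.25, 0.5]
invariant, variant = split_by_mask(feature, mask)
print(invariant)
print(variant)
```

Under this rule the two parts always sum back to the original feature, so the mask only decides how each dimension is divided between the invariant and variant representations.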
| Dataset | #Interactions | #Items | #Users | Sparsity | Visual dim. | Acoustic dim. | Textual dim. |
|---|---|---|---|---|---|---|---|
| Movielens | 1,239,508 | 5,986 | 55,485 | 99.63% | 2048 | 128 | 100 |
| Tiktok | 726,065 | 76,085 | 36,656 | 99.99% | 128 | 128 | 128 |
| Category | Method | Movielens P@10 | Movielens R@10 | Movielens N@10 | Tiktok P@10 | Tiktok R@10 | Tiktok N@10 |
|---|---|---|---|---|---|---|---|
| M-CF | VBPR | 0.0512 | 0.1990 | 0.2261 | 0.0118 | 0.0628 | 0.0574 |
| M-CF | DUIF | 0.0538 | 0.2167 | 0.2341 | 0.0087 | 0.0483 | 0.0434 |
| M-CF | CB2CF | 0.0548 | 0.2265 | 0.2505 | 0.0109 | 0.0642 | 0.0613 |
| G-NCF | NGCF | 0.0547 | 0.2196 | 0.2342 | 0.0135 | 0.0780 | 0.0661 |
| G-NCF | DisenGCN | 0.0555 | 0.2222 | 0.2401 | 0.0145 | 0.0760 | 0.0639 |
| G-NCF | MacridVAE | 0.0576 | 0.2286 | 0.2437 | 0.0152 | 0.0813 | 0.0686 |
| M-NCF | MMGCN | 0.0581 | 0.2345 | 0.2517 | 0.0144 | 0.0808 | 0.0674 |
| M-NCF | HUIGN | 0.0619 | 0.2522 | 0.2677 | 0.0164 | 0.0884 | 0.0769 |
| M-NCF | GRCN | 0.0639 | 0.2569 | 0.2754 | 0.0195 | 0.1048 | 0.0938 |
| UltraGCN | Naive-UltraGCN | 0.0624 | 0.2547 | 0.2691 | 0.0183 | 0.0981 | 0.0878 |
| UltraGCN | UltraGCN + InvRL | 0.0645 | 0.2615 | 0.2815 | 0.0192 | 0.1062 | 0.0922 |
| UltraGCN | M3-InvRL (Ours) | 0.0675 | 0.2775 | 0.2840 | 0.0198 | 0.1099 | 0.0957 |
| % Improvement over Naive-UltraGCN | | 8.17% | 8.95% | 5.55% | 8.20% | 12.03% | 9.00% |
| % Improvement over UltraGCN + InvRL | | 4.65% | 6.11% | 0.89% | 3.13% | 3.49% | 3.79% |
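The improvement rows appear to be relative gains over the corresponding baseline. The sketch below recomputes two of them from the P@10 and R@10 values in the table, assuming the usual definition (new − baseline) / baseline; the function name `relative_improvement` is illustrative.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Movielens P@10: M3-InvRL (0.0675) versus Naive-UltraGCN (0.0624).
print(f"{relative_improvement(0.0675, 0.0624):.2f}%")  # 8.17%

# Tiktok R@10: M3-InvRL (0.1099) versus Naive-UltraGCN (0.0981).
print(f"{relative_improvement(0.1099, 0.0981):.2f}%")  # 12.03%
```

Because the table reports metrics rounded to four decimals, values recomputed this way may differ from the printed percentages in the last digit for some cells.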
| Model | Movielens P@10 | Movielens R@10 | Movielens N@10 | Tiktok P@10 | Tiktok R@10 | Tiktok N@10 |
|---|---|---|---|---|---|---|
| M3-InvRL w/o | 0.0642 | 0.2648 | 0.2792 | 0.0190 | 0.1030 | 0.0925 |
| M3-InvRL w/o | 0.0667 | 0.2753 | 0.2836 | 0.0194 | 0.1093 | 0.0931 |
| M3-InvRL | 0.0675 | 0.2775 | 0.2840 | 0.0198 | 0.1099 | 0.0957 |
| Merging strategy | Movielens P@10 | Movielens R@10 | Movielens N@10 | Tiktok P@10 | Tiktok R@10 | Tiktok N@10 |
|---|---|---|---|---|---|---|
| E-weight | 0.0652 | 0.2731 | 0.2829 | 0.0192 | 0.1080 | 0.0911 |
| L-weight | 0.0648 | 0.2719 | 0.2817 | 0.0193 | 0.1073 | 0.0937 |
| A-weight | 0.0670 | 0.2761 | 0.2834 | 0.0195 | 0.1105 | 0.0955 |
| M3-InvRL | 0.0675 | 0.2775 | 0.2840 | 0.0198 | 0.1099 | 0.0957 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hu, X.; Zhang, H. Invariant Representation Learning in Multimedia Recommendation with Modality Alignment and Model Fusion. Entropy 2025, 27, 56. https://doi.org/10.3390/e27010056