Abstract
Multi-task learning (MTL) is a promising research direction in recommender systems, whose prediction accuracy depends heavily on how well the relationships among tasks are modeled. Much of the prior research focuses on three tasks: predicting click-through rate (CTR), post-view click-through & conversion rate (CTCVR), and post-click conversion rate (CVR), which rely on the inherent user action pattern of impression \(\rightarrow\) click \(\rightarrow\) conversion. The information cascade pattern, represented by the Adaptive Information Transfer Multi-task (AITM) framework, was the first attempt to model such sequential dependencies in the feature space close to the output. However, we observe that the first task in an information cascade model usually tends to be the victim, suffering degraded performance contrary to expectations. To this end, we propose a novel architecture: the Multi-task Balanced Information Cascade Network (MT-BICN). We set up both shared experts and task-specific experts for each task to provide a bottom-line guarantee for each task's performance, which largely reduces the risk of any task falling victim to the seesaw phenomenon. An information transfer unit (ITU) is designed and placed at the output layer of each top tower to explicitly model the sequential dependencies among tasks. In addition, to further improve the feature extraction capability of the bottom shared experts, task-specific experts, and task towers, we design individual optimization objectives for the BASE model without ITUs, and a balanced marginal constraint that encourages the ITUs to benefit the later tasks without harming the earlier ones. We conducted extensive experiments on open-source large-scale recommendation datasets from AliExpress. The experimental results show that our approach significantly outperforms mainstream MTL approaches for recommender systems. In addition, the ablation study demonstrates the necessity of the core modules of MT-BICN.
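The sequential dependence the abstract describes (impression \(\rightarrow\) click \(\rightarrow\) conversion) implies the probability chain pCTCVR = pCTR \(\times\) pCVR over the entire impression space, and the ITU idea amounts to passing information from an earlier task to a later one. The sketch below illustrates both ideas on scalars; all names are illustrative, and the paper's actual ITU operates on hidden representations, not on scalar outputs:

```python
# Minimal sketch of the impression -> click -> conversion funnel that
# cascade models such as ESMM/AITM/MT-BICN exploit. Names are
# hypothetical; the paper's ITU works on hidden features, not scalars.

def p_ctcvr(p_ctr: float, p_cvr: float) -> float:
    """Post-view click & conversion rate over the entire impression space:
    P(click & convert | impression)
      = P(click | impression) * P(convert | click)."""
    return p_ctr * p_cvr


def transfer(prev_out: float, cur_out: float, gate: float) -> float:
    """Toy information-transfer step: blend the previous task's output
    into the current task's output with a gate in [0, 1], so the later
    task can reuse information from the earlier one."""
    return gate * prev_out + (1.0 - gate) * cur_out


if __name__ == "__main__":
    # CTCVR can never exceed CTR, reflecting the sequential dependence.
    print(p_ctcvr(0.05, 0.2))
    print(transfer(0.8, 0.2, 0.5))
```

A balanced constraint in the spirit of the abstract would then require that adding `transfer` does not reduce the earlier task's accuracy while improving the later one's.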
H. Wu and Y. Gao contributed equally to this work.
References
Aoki, R., Tung, F., Oliveira, G.L.: Heterogeneous multi-task learning with expert diversity. IEEE/ACM Trans. Comput. Biol. Bioinf. 19(6), 3093–3102 (2022)
Caruana, R.: Multitask learning. Mach. Learn. 28, 41–75 (1997)
Ding, K., et al.: MSSM: a multiple-level sparse sharing model for efficient multi-task learning. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2237–2241 (2021)
He, Y., Feng, X., Cheng, C., Ji, G., Guo, Y., Caverlee, J.: MetaBalance: improving multi-task recommendations via adapting gradient magnitudes of auxiliary tasks. In: Proceedings of the ACM Web Conference 2022, pp. 2205–2215 (2022)
Jacobs, R.A., Jordan, M.I., Nowlan, S.J., Hinton, G.E.: Adaptive mixtures of local experts. Neural Comput. 3(1), 79–87 (1991)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, J., Li, X., An, B., Xia, Z., Wang, X.: Multi-faceted hierarchical multi-task learning for recommender systems. In: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pp. 3332–3341 (2022)
Ma, J., Zhao, Z., Yi, X., Chen, J., Hong, L., Chi, E.H.: Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1930–1939 (2018)
Ma, X., et al.: Entire space multi-task model: an effective approach for estimating post-click conversion rate. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 1137–1140 (2018)
Ni, Y., et al.: Perceive your users in depth: learning universal user representations from multiple e-commerce tasks. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 596–605 (2018)
Qin, Z., Cheng, Y., Zhao, Z., Chen, Z., Metzler, D., Qin, J.: Multitask mixture of sequential experts for user activity streams. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 3083–3091 (2020)
Ruder, S.: An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 (2017)
Tang, H., Liu, J., Zhao, M., Gong, X.: Progressive layered extraction (PLE): a novel multi-task learning (MTL) model for personalized recommendations. In: Proceedings of the 14th ACM Conference on Recommender Systems, pp. 269–278 (2020)
Vaswani, A., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)
Wang, H., et al.: ESCM2: entire space counterfactual multi-task model for post-click conversion rate estimation. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 363–372 (2022)
Wang, Y., et al.: Multi-task deep recommender systems: a survey. arXiv preprint arXiv:2302.03525 (2023)
Wen, H., et al.: Entire space multi-task modeling via post-click behavior decomposition for conversion rate prediction. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2377–2386 (2020)
Wu, H.: MNCM: multi-level network cascades model for multi-task learning. In: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pp. 4565–4569 (2022)
Wu, L., He, X., Wang, X., Zhang, K., Wang, M.: A survey on accuracy-oriented neural recommendation: from collaborative filtering to information-rich recommendation. IEEE Trans. Knowl. Data Eng. 35(5), 4425–4445 (2022)
Xi, D., et al.: Modeling the sequential dependence among audience multi-step conversions with multi-task learning in targeted display advertising. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 3745–3755 (2021)
Xi, D., et al.: Modeling the field value variations and field interactions simultaneously for fraud detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 14957–14965 (2021)
Xi, D., et al.: Neural hierarchical factorization machines for user's event sequence analysis. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1893–1896 (2020)
Yang, E., et al.: AdaTask: a task-aware adaptive learning rate approach to multi-task learning. arXiv preprint arXiv:2211.15055 (2022)
Zhang, D., et al.: CTnoCVR: a novelty auxiliary task making the lower-CTR-higher-CVR upper. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2272–2276 (2022)
Zhu, Y., et al.: Learning to expand audience via meta hybrid experts and critics for recommendation and advertising. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 4005–4013 (2021)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wu, H., Gao, Y. (2023). MT-BICN: Multi-task Balanced Information Cascade Network for Recommendation. In: Jin, Z., Jiang, Y., Buchmann, R.A., Bi, Y., Ghiran, AM., Ma, W. (eds) Knowledge Science, Engineering and Management. KSEM 2023. Lecture Notes in Computer Science(), vol 14119. Springer, Cham. https://doi.org/10.1007/978-3-031-40289-0_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40288-3
Online ISBN: 978-3-031-40289-0