Profile Decomposition Based Hybrid Transfer Learning for Cold-Start Data Anomaly Detection

Authors:

Ke Zhang

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 16, Issue 6

Article No.: 121, Pages 1 - 28

https://doi.org/10.1145/3530990

Published: 30 July 2022

Abstract

Anomaly detection is an essential task for quality management in smart manufacturing. An accurate data-driven detection method usually needs sufficient data and labels. In practice, however, manufacturing commonly involves newly set-up processes that have only limited data available for analysis. Borrowing the term from recommender systems, we call such a process a cold-start process. The sparsity of anomalies, the deviation of the profile, and noise make detection even harder.

Transfer learning could help detect anomalies in cold-start processes by transferring knowledge from more experienced processes to new ones. However, existing transfer learning and multi-task learning frameworks are built on task- or domain-level relatedness. We observe instead that, within a domain, some components (background and anomaly) share more commonality, while others (profile deviation and noise) do not. To this end, we propose a finer-grained, component-level transfer learning scheme, decomposition-based hybrid transfer learning (DHTL). It first decomposes a domain (e.g., a data source containing profiles) into different components (smooth background, profile deviation, anomaly, and noise); then, each component’s transferability is analyzed using expert knowledge; lastly, a different transfer learning technique is tailored to each component accordingly. We adopt a Bayesian probabilistic hierarchical model to formulate parameter transfer for the background, and an “L2,1 + L1” norm to formulate low-dimensional feature-representation transfer for the anomaly. An efficient algorithm based on block coordinate descent is proposed to learn the parameters. A case study based on glass coating pressure profiles demonstrates improved accuracy and completeness of the detected anomalies, and a simulation demonstrates the fidelity of the decomposition results.
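
The abstract does not state the model's equations, so the following is only an illustrative sketch of the general idea, not the DHTL algorithm itself. It assumes a simplified objective in which several profiles (the columns of Y) are split into a smooth background B and an anomaly part A, with an L1 penalty for element-wise sparsity and an L2,1 penalty that groups each time point across profiles, and it alternates two exact block updates in the spirit of block coordinate descent. The function names, the second-difference smoothness penalty, and all parameter values are assumptions made for this example; the Bayesian hierarchical parameter transfer and the profile-deviation component described above are omitted.

```python
import numpy as np


def second_difference_matrix(n):
    """(n-2) x n matrix D with (D @ x)[i] = x[i] - 2*x[i+1] + x[i+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


def prox_l21_l1(R, lam_l1, lam_l21):
    """Proximal operator of lam_l1*||A||_1 + lam_l21*||A||_{2,1} (rows are groups)."""
    # Step 1: element-wise soft-thresholding handles the L1 term.
    S = np.sign(R) * np.maximum(np.abs(R) - lam_l1, 0.0)
    # Step 2: shrink every row by its L2 norm to handle the L2,1 term.
    row_norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam_l21 / np.maximum(row_norms, 1e-12), 0.0)
    return scale * S


def objective(Y, B, A, D, gamma, lam_l1, lam_l21):
    """Assumed objective: data fit + smoothness of B + sparsity of A."""
    fit = 0.5 * np.sum((Y - B - A) ** 2)
    smooth = 0.5 * gamma * np.sum((D @ B) ** 2)
    sparse = lam_l1 * np.abs(A).sum() + lam_l21 * np.linalg.norm(A, axis=1).sum()
    return fit + smooth + sparse


def decompose(Y, gamma=1e3, lam_l1=0.1, lam_l21=0.2, n_iter=50):
    """Alternate exact minimization over the background B and the anomaly A."""
    n = Y.shape[0]
    D = second_difference_matrix(n)
    M = np.eye(n) + gamma * D.T @ D
    B = np.zeros_like(Y)
    A = np.zeros_like(Y)
    history = []
    for _ in range(n_iter):
        # Block 1: with A fixed, B solves the linear system (I + gamma*D'D) B = Y - A.
        B = np.linalg.solve(M, Y - A)
        # Block 2: with B fixed, A is the prox of the sparsity penalties at Y - B.
        A = prox_l21_l1(Y - B, lam_l1, lam_l21)
        history.append(objective(Y, B, A, D, gamma, lam_l1, lam_l21))
    return B, A, history


# Synthetic demo (hypothetical data): three profiles share a smooth background
# and an anomaly at time points 50 to 52.
rng = np.random.default_rng(0)
n_points, n_profiles = 100, 3
t = np.linspace(0.0, 1.0, n_points)
background = np.sin(np.pi * t)[:, None] * np.ones((1, n_profiles))
A_true = np.zeros((n_points, n_profiles))
A_true[50:53, :] = 2.0
Y = background + A_true + 0.01 * rng.standard_normal((n_points, n_profiles))

B_hat, A_hat, history = decompose(Y)
anomalous_rows = np.flatnonzero(np.linalg.norm(A_hat, axis=1) > 0)
print("time points flagged as anomalous:", anomalous_rows)
```

Because each block update minimizes the assumed objective exactly with the other block held fixed, the recorded objective values cannot increase from one sweep to the next. The row-wise grouping is one possible reading of how an L2,1 term encourages profiles to share anomaly locations; the paper's actual feature representation may differ.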

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Multi-task learning
        Transfer learning
      2. Unsupervised learning
        Anomaly detection

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 16, Issue 6

December 2022

631 pages

ISSN: 1556-4681

EISSN: 1556-472X

DOI: 10.1145/3543989

Editor: Charu Aggarwal, IBM T. J. Watson Research, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 July 2022

Online AM: 24 April 2022

Accepted: 01 April 2022

Revised: 01 February 2022

Received: 01 April 2021

Qualifiers

Research-article
Refereed

Funding Sources

Hong Kong Research Grants Council (RGC) - General Research Fund (GRF)
National Science Foundation (NSF) - Civil, Mechanical and Manufacturing Innovation (CMMI)
U.S. Department of Energy (DOE)
