A Bayesian LSTM Based Active Anomaly Detection Service for Large Online Systems

Authors:

Zhiwen Chen

Internetware '24: Proceedings of the 15th Asia-Pacific Symposium on Internetware

Pages 407 - 416

https://doi.org/10.1145/3671016.3674818

Published: 24 July 2024

Abstract

Currently, many large online systems are built on a microservice architecture. Because of the complex dependencies among services, the failure of a single service in such a system can trigger an avalanche of failures, which directly affects user experience and the company’s revenue. Service operators therefore need anomaly detection services that monitor online systems closely and comprehensively. Although a large number of anomaly detection approaches have been proposed, few of them can simultaneously adapt to the practical detection requirements of hundreds of operators. To tackle this problem, we propose LSTM-AAD, a Bayesian LSTM based active anomaly detection service. LSTM-AAD extracts anomaly features based on the common patterns among metrics, introduces a Bayesian LSTM model to detect anomalies in time series metrics, and employs active learning to update the online model with a small number of uncertain feedback samples. In addition, the proposed user-oriented service can respond quickly to operators’ further requirements. We conduct extensive experiments on real time series metrics of large online services at Tencent. The results indicate that LSTM-AAD significantly outperforms other state-of-the-art methods. Moreover, our approach detects anomalies efficiently out of the box and can work in a large-scale system.
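The active learning step mentioned in the abstract can be illustrated with a minimal sketch. It is not the LSTM-AAD implementation: it assumes a Bayesian model that returns several stochastic anomaly probabilities per time window, and it uses predictive entropy as the uncertainty score, which is one common choice rather than one the abstract confirms. The names `binary_entropy`, `select_uncertain`, and `prob_samples` are hypothetical.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli distribution with anomaly probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def select_uncertain(prob_samples, k):
    """Return indices of the k windows with the highest predictive entropy.

    prob_samples[i] holds the anomaly probabilities for window i, one per
    stochastic forward pass (assumed output of a Bayesian model).
    """
    scores = []
    for i, passes in enumerate(prob_samples):
        mean_p = sum(passes) / len(passes)          # average over passes
        scores.append((binary_entropy(mean_p), i))  # uncertainty of the mean
    # Highest entropy first; ties broken by the lower window index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:k]]

# Illustrative values only, not data from the paper.
prob_samples = [
    [0.02, 0.01, 0.03],  # confidently normal
    [0.45, 0.60, 0.52],  # highly uncertain
    [0.97, 0.99, 0.98],  # confidently anomalous
    [0.30, 0.20, 0.25],  # moderately uncertain
]
queried = select_uncertain(prob_samples, k=2)
print(queried)  # [1, 3]
```

In this sketch, the windows at indices 1 and 3 would be sent to an operator for labeling, and the returned labels would then be used to update the online model.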

Information

Published In

Internetware '24: Proceedings of the 15th Asia-Pacific Symposium on Internetware

July 2024

518 pages

ISBN: 9798400707056

DOI: 10.1145/3671016

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

Internetware 2024

Sponsor:

SIGSOFT

Internetware 2024: 15th Asia-Pacific Symposium on Internetware

July 24 - 26, 2024

Macau, China

Acceptance Rates

Overall Acceptance Rate 55 of 111 submissions, 50%
