DOI: 10.1145/3442381.3450023
research-article
Open access

MStream: Fast Anomaly Detection in Multi-Aspect Streams

Published: 03 June 2021

Abstract

Given a stream of entries in a multi-aspect data setting, i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account additional attributes of each entry. Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MStream, which can detect unusual group anomalies dynamically, as they occur. MStream has the following properties: (a) it detects anomalies in multi-aspect data including both categorical and numeric attributes; (b) it is online, processing each record in constant time and constant memory; (c) it can capture the correlation between multiple aspects of the data. MStream is evaluated on the KDDCUP99, CICIDS-DoS, UNSW-NB15 and CICIDS-DDoS datasets, and outperforms state-of-the-art baselines.
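To make properties (a) to (c) concrete, the following is a minimal sketch of a streaming scorer in the same spirit, not the authors' implementation. It assumes a fixed-size table of hashed counters, crude fixed-width binning of numeric attributes, and a chi-squared-style comparison of a bucket's count in the current time tick against its historical average. The class name `HashedStreamScorer`, its parameters, and the scoring formula are illustrative assumptions and are not taken from the abstract.

```python
import zlib
from typing import List, Optional, Sequence, Union

Value = Union[str, int, float]


class HashedStreamScorer:
    """Illustrative sketch only (hypothetical names, not the MStream code).

    Memory is fixed at two arrays of n_buckets counters, so it does not
    grow with the length of the stream.
    """

    def __init__(self, n_buckets: int = 1024, numeric_bin_width: float = 1.0):
        self.n_buckets = n_buckets
        self.bin_width = numeric_bin_width
        self.current: List[float] = [0.0] * n_buckets  # counts in the current tick
        self.total: List[float] = [0.0] * n_buckets    # counts over all ticks so far
        self.tick: Optional[int] = None

    def _bucket(self, record: Sequence[Value]) -> int:
        # Hash all aspects jointly so that a combination of values, not just
        # each value on its own, determines the bucket.
        parts = []
        for value in record:
            if isinstance(value, str):
                parts.append("c:" + value)  # categorical attribute
            else:
                # Assumption: fixed-width binning stands in for a proper
                # locality-sensitive hash of numeric attributes.
                parts.append("n:%d" % int(value // self.bin_width))
        key = "|".join(parts).encode("utf-8")
        return zlib.crc32(key) % self.n_buckets

    def score(self, record: Sequence[Value], t: int) -> float:
        """Update the counters with one record and return its anomaly score.

        t is a 1-based integer time tick and must be non-decreasing.
        """
        if self.tick is None:
            self.tick = t
        if t != self.tick:
            # New tick: forget per-tick counts. The cost depends on
            # n_buckets, not on how many records have been seen.
            self.current = [0.0] * self.n_buckets
            self.tick = t
        b = self._bucket(record)
        self.current[b] += 1.0
        self.total[b] += 1.0
        a, s = self.current[b], self.total[b]
        if t <= 1:
            return 0.0  # no history to compare against yet
        # Chi-squared-style deviation of the current-tick count a from the
        # historical mean s / t; large values indicate a sudden burst.
        return (a - s / t) ** 2 * t * t / (s * (t - 1))
```

In this sketch, a record whose bucket has been seen at a steady rate scores near zero, while a sudden group of similar records in one tick drives the score of that bucket up. Hash collisions can inflate or mask scores; using several independent hash functions and combining their estimates is one common way to reduce that effect.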

Information & Contributors

Information

Published In

WWW '21: Proceedings of the Web Conference 2021
April 2021
4054 pages
ISBN:9781450383127
DOI:10.1145/3442381

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Anomaly Detection
  2. Intrusion Detection
  3. Multi-Aspect Data
  4. Stream

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '21
WWW '21: The Web Conference 2021
April 19 - 23, 2021
Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 734
  • Downloads (Last 6 weeks): 70
Reflects downloads up to 13 Feb 2025

Citations

Cited By

  • (2024) Online adaptive anomaly thresholding with confidence sequences. Proceedings of the 41st International Conference on Machine Learning, pp. 47105-47132. DOI: 10.5555/3692070.3693987. Online publication date: 21-Jul-2024
  • (2024) QARF: A Novel Malicious Traffic Detection Approach via Online Active Learning for Evolving Traffic Streams. Chinese Journal of Electronics 33:3, pp. 645-656. DOI: 10.23919/cje.2022.00.360. Online publication date: May-2024
  • (2024) METER: A Dynamic Concept Adaptation Framework for Online Anomaly Detection. Proceedings of the VLDB Endowment 17:4, pp. 794-807. DOI: 10.14778/3636218.3636233. Online publication date: 5-Mar-2024
  • (2024) Anomaly Detection in Dynamic Graphs: A Comprehensive Survey. ACM Transactions on Knowledge Discovery from Data 18:8, pp. 1-44. DOI: 10.1145/3669906. Online publication date: 29-May-2024
  • (2024) A Light-Weight and Robust Tensor Convolutional Autoencoder for Anomaly Detection. IEEE Transactions on Knowledge and Data Engineering 36:9, pp. 4346-4360. DOI: 10.1109/TKDE.2023.3332784. Online publication date: Sep-2024
  • (2024) Operations and Maintenance KPI Anomaly Detection Based on VAE-SA-BILSTM Hybrid Model. 2024 6th International Conference on Natural Language Processing (ICNLP), pp. 259-264. DOI: 10.1109/ICNLP60986.2024.10692985. Online publication date: 22-Mar-2024
  • (2024) Bad Design Smells in Benchmark NIDS Datasets. 2024 IEEE 9th European Symposium on Security and Privacy (EuroS&P), pp. 658-675. DOI: 10.1109/EuroSP60621.2024.00042. Online publication date: 8-Jul-2024
  • (2024) Channel-Centric Spatio-Temporal Graph Networks for Network-based Intrusion Detection. 2024 IEEE Conference on Communications and Network Security (CNS), pp. 1-9. DOI: 10.1109/CNS62487.2024.10735668. Online publication date: 30-Sep-2024
  • (2024) Research on Adaptive Model Pooling Method for Data Stream Anomaly Detection Based on Variational Auto-Encoder. 2024 China Automation Congress (CAC), pp. 4127-4131. DOI: 10.1109/CAC63892.2024.10865256. Online publication date: 1-Nov-2024
  • (2024) Hi-MLIC: Hierarchical Multilayer Lightweight Intrusion Classification for Various Intrusion Scenarios. IEEE Access 12, pp. 120098-120115. DOI: 10.1109/ACCESS.2024.3450671. Online publication date: 2024
