research-article

Beyond Outlier Detection: Outlier Interpretation by Attention-Guided Triplet Deviation Network

Authors:

Fei Li

WWW '21: Proceedings of the Web Conference 2021

Pages 1328 - 1339

https://doi.org/10.1145/3442381.3449868

Published: 03 June 2021

Abstract

Outlier detection is an important task in many domains and has been intensively studied over the past decade. Explaining outliers, i.e., outlier interpretation, is arguably even more significant, because it gives analysts the insight they need to understand, resolve, and prevent the detected outliers. However, only a limited number of studies address this problem. Most existing methods follow a score-and-search approach: for each queried outlier, they select a feature subspace as the interpretation by estimating the outlying scores of the outlier in the searched subspaces. Because the search space is enormous, they must rely on pruning strategies and a maximum subspace length, which often yields suboptimal interpretations. Accordingly, this paper proposes a novel Attention-guided Triplet deviation network for Outlier interpretatioN (ATON). Instead of searching for a subspace, ATON directly learns an embedding space and learns how to attach attention to each embedding dimension (i.e., it captures the contribution of each dimension to the outlierness of the queried outlier). Specifically, ATON consists of a feature embedding module and a customized self-attention learning module, which are optimized by a triplet deviation-based loss function. The resulting optimized attention-guided embedding space carries expanded high-level information and rich semantics, so the outlying behaviors of the queried outlier can be better unfolded. Finally, ATON distills a subspace of the original features from the embedding module and the attention coefficients. Owing to its generality, ATON can be employed as an additional step after any black-box outlier detector. A comprehensive suite of experiments evaluates the effectiveness and efficiency of ATON. ATON significantly outperforms state-of-the-art competitors on 12 real-world datasets and scales well w.r.t. both data dimensionality and data size.
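To make the idea of an attention-weighted triplet objective more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a softmax attention vector over embedding dimensions, a hinge-style triplet loss in which the anchor and positive are normal points and the negative is the queried outlier, and a purely linear embedding matrix `W` for mapping attention back to original features. All function names, the margin value, and the toy numbers are hypothetical; the abstract does not specify these details.

```python
import numpy as np


def attention_weights(scores):
    """Softmax over embedding dimensions: positive weights that sum to 1."""
    shifted = np.asarray(scores, dtype=float) - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def weighted_sq_distance(a, b, attn):
    """Squared Euclidean distance with one attention weight per dimension."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sum(attn * diff ** 2))


def triplet_loss(anchor, positive, negative, attn, margin=1.0):
    """Hinge-style triplet loss in the attention-weighted embedding space.

    Assumed roles: anchor and positive are embeddings of normal points,
    negative is the embedding of the queried outlier.
    """
    d_pos = weighted_sq_distance(anchor, positive, attn)
    d_neg = weighted_sq_distance(anchor, negative, attn)
    return max(0.0, d_pos - d_neg + margin)


def feature_weights(W, attn):
    """Hypothetical distillation step: push attention back through a linear
    embedding W of shape (n_features, n_embed) to score original features."""
    raw = np.abs(np.asarray(W, dtype=float)) @ attn
    return raw / raw.sum()


# Toy usage with made-up numbers (not data from the paper).
attn = attention_weights([2.0, 0.0])
loss = triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0], attn)
weights = feature_weights(np.eye(2), attn)
```

In this sketch the loss reaches zero once the outlier sits farther from the anchor than the positive does by at least the margin, measured with the attention weights. Ranking `weights` and keeping the top entries would be one simple way to read off a feature subspace under these assumptions.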

Index Terms: Information systems > Information systems applications

Information & Contributors

Information

Published In

WWW '21: Proceedings of the Web Conference 2021

April 2021

4054 pages

ISBN:9781450383127

DOI:10.1145/3442381

Editors: Jure Leskovec (Stanford), Marko Grobelnik (Jožef Stefan Institute), Marc Najork (Google), Jie Tang (Tsinghua University), Leila Zia (Wikimedia Foundation)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '21

Sponsor:

SIGWEB

WWW '21: The Web Conference 2021

April 19 - 23, 2021

Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Article Metrics

Total Citations: 14
Total Downloads: 537
Downloads (Last 12 months): 119
Downloads (Last 6 weeks): 7

Reflects downloads up to 22 Sep 2024.

Cited By

Li Z, Wang P, Wang Z (2024). FlowGANAnomaly: Flow-Based Anomaly Network Intrusion Detection with Adversarial Learning. Chinese Journal of Electronics 33:1 (58-71). Online publication date: Jan-2024.
https://doi.org/10.23919/cje.2022.00.173

Ali Tousi S, DeSouza G (2024). Outlier Interpretation Using Regularized Auto Encoders and Genetic Algorithm. 2024 IEEE Congress on Evolutionary Computation (CEC) (1-8). Online publication date: 30-Jun-2024.
https://doi.org/10.1109/CEC60901.2024.10612022

Angiulli F, Fassetti F, Nisticò S, Palopoli L (2024). Explaining outliers and anomalous groups via subspace density contrastive loss. Machine Learning. Online publication date: 23-Sep-2024.
https://doi.org/10.1007/s10994-024-06618-8

Guo Z, Wu N, Zhao Y, Wang W (2024). TransGAD: A Transformer-Based Autoencoder for Graph Anomaly Detection. Database Systems for Advanced Applications (269-284). Online publication date: 31-Aug-2024.
https://doi.org/10.1007/978-981-97-5572-1_17

Li Z, Zhu Y, Van Leeuwen M (2023). A Survey on Explainable Anomaly Detection. ACM Transactions on Knowledge Discovery from Data 18:1 (1-54). Online publication date: 6-Sep-2023.
https://dl.acm.org/doi/10.1145/3609333

Xu H, Pang G, Wang Y, Wang Y (2023). Deep Isolation Forest for Anomaly Detection. IEEE Transactions on Knowledge and Data Engineering 35:12 (12591-12604). Online publication date: 25-Apr-2023.
https://dl.acm.org/doi/10.1109/TKDE.2023.3270293

Zhou X, Wang Y, Xu H, Liu M, Zhang R (2023). Local-Adaptive Transformer for Multivariate Time Series Anomaly Detection and Diagnosis. 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (89-95). Online publication date: 1-Oct-2023.
https://doi.org/10.1109/SMC53992.2023.10394229

Zhang R, Wang Y, Zhou H, Li B, Xu H (2023). DPSS: Dynamic Parameter Selection for Outlier Detection on Data Streams. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS) (908-915). Online publication date: Jan-2023.
https://doi.org/10.1109/ICPADS56603.2022.00122

Xu H, Pang G, Wang Y, Jian S, Liu N, Wang Y (2023). RoSAS. Information Processing and Management: an International Journal 60:5. Online publication date: 1-Sep-2023.
https://dl.acm.org/doi/10.1016/j.ipm.2023.103459

Huang Z, Wang Y, Xu H, Jian S, Wang Z (2022). Script event prediction based on pre-trained model with tail event enhancement. 2021 5th International Conference on Computer Science and Artificial Intelligence (242-248). Online publication date: 9-Mar-2022.
https://doi.org/10.1145/3507548.3507585
