Beyond Outlier Detection: Outlier Interpretation by Attention-Guided Triplet Deviation Network

Published: 03 June 2021

Abstract

Outlier detection is an important task in many domains and has been studied intensively over the past decade. Explaining outliers, i.e., outlier interpretation, is even more significant, because it gives analysts valuable insights for understanding, resolving, and preventing the detected outliers. However, only a limited number of studies address this problem. Most existing methods follow a score-and-search approach: for each queried outlier, they select a feature subspace as its interpretation by estimating the outlier's outlying scores in the searched subspaces. Because the search space is enormous, they must rely on pruning strategies and cap the maximum subspace length, which often yields suboptimal interpretations. This paper therefore proposes a novel Attention-guided Triplet deviation network for Outlier interpretatioN (ATON). Instead of searching for a subspace, ATON directly learns an embedding space and learns how much attention to attach to each embedding dimension (i.e., it captures the contribution of each dimension to the outlierness of the queried outlier). Specifically, ATON consists of a feature embedding module and a customized self-attention learning module, which are optimized by a triplet deviation-based loss function. We obtain an optimal attention-guided embedding space with expanded high-level information and rich semantics, in which the outlying behaviors of the queried outlier can be better unfolded. Finally, ATON distills a subspace of the original features from the embedding module and the attention coefficients. Owing to its generality, ATON can be employed as an additional step after any black-box outlier detector. A comprehensive suite of experiments evaluates the effectiveness and efficiency of ATON. ATON significantly outperforms state-of-the-art competitors on 12 real-world datasets and scales well w.r.t. both data dimensionality and data size.
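
The idea of weighting embedding dimensions by attention inside a triplet-style loss can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact form of the triplet deviation loss, so the code below substitutes a generic hinge triplet loss with softmax attention weights. The function names, the margin value, and the choice of triplet roles (two normal points as anchor and positive, the queried outlier as negative) are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Turn raw attention scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sq_dist(u, v, weights):
    """Squared Euclidean distance with one weight per embedding dimension."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v))

def attention_triplet_loss(anchor, positive, negative, attn_scores, margin=1.0):
    """Generic hinge triplet loss in an attention-weighted space (assumed form).

    anchor, positive: embeddings of two normal points (assumption).
    negative: embedding of the queried outlier (assumption).
    """
    w = softmax(attn_scores)
    d_pos = weighted_sq_dist(anchor, positive, w)
    d_neg = weighted_sq_dist(anchor, negative, w)
    return max(0.0, d_pos - d_neg + margin)

def top_dimensions(attn_scores, k):
    """Indices of the k dimensions that receive the most attention."""
    w = softmax(attn_scores)
    return sorted(range(len(w)), key=lambda i: w[i], reverse=True)[:k]

# Toy example: the outlier deviates from the normal points only in dimension 2.
anchor = [0.0, 0.0, 0.0]
positive = [0.1, 0.0, 0.0]
outlier = [0.0, 0.0, 3.0]

loss_good = attention_triplet_loss(anchor, positive, outlier, [0.0, 0.0, 5.0])
loss_bad = attention_triplet_loss(anchor, positive, outlier, [5.0, 0.0, 0.0])
print(loss_good, loss_bad)
print(top_dimensions([0.0, 0.0, 5.0], 1))
```

In this toy setting the loss is smaller when attention concentrates on the dimension where the outlier actually deviates, which is the intuition behind reading the learned attention as a per-dimension contribution to outlierness.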

    Published In

    WWW '21: Proceedings of the Web Conference 2021
    April 2021
    4054 pages
    ISBN:9781450383127
    DOI:10.1145/3442381
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Feature embedding
    2. Outlier interpretation
    3. Self-attention
    4. Triplet deviation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '21
    WWW '21: The Web Conference 2021
    April 19 - 23, 2021
    Ljubljana, Slovenia

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
