DOI: 10.1145/3387902.3392612
research-article

Outlier detection based on sparse coding and neighbor entropy in high-dimensional space

Published: 23 May 2020

Abstract

Outlier detection is an important branch of data mining and plays a vital role in a broad range of applications, including network-traffic anomaly detection and credit fraud prevention. Based on the assumption that a dataset can be approximately reconstructed by linear combinations of dictionary atoms, some detection algorithms first project the data onto a higher-dimensional manifold so that the data representation becomes sparse. Unlike previous sparse-coding-based approaches, our method SNOD (Sparse coding and Neighbor entropy based Outlier Detection) can detect both local and global outliers and constructs each neighborhood on its own. Finally, the outlier score of each sample is computed from its local reconstruction coefficients. Experiments on several benchmark datasets and comparisons with state-of-the-art methods validate the advantages of our algorithm.
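
The general idea the abstract builds on, that samples which are poorly reconstructed by a sparse linear combination of dictionary atoms look like outliers, can be illustrated with a minimal sketch. This is not SNOD itself: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. The dictionary is simply the set of all other samples (normalized), the sparse solver is plain matching pursuit, and the score is the reconstruction residual norm. There is no neighbor entropy or local neighborhood construction here, and the function names `matching_pursuit` and `outlier_scores` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    # Scale v to unit Euclidean norm (zero vectors are returned unchanged).
    n = math.sqrt(dot(v, v))
    return [a / n for a in v] if n > 0 else list(v)


def matching_pursuit(x, atoms, n_nonzero=2):
    """Greedy sparse code of x over unit-norm atoms.

    Returns (coeffs, residual). Assumption: plain matching pursuit stands in
    for whatever sparse solver a real method would use.
    """
    residual = list(x)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        corrs = [dot(residual, a) for a in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        if abs(corrs[j]) < 1e-12:
            break  # no atom explains what is left
        coeffs[j] += corrs[j]
        residual = [r - corrs[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual


def outlier_scores(data, n_nonzero=2):
    """Score each sample by its residual norm when coded over the others.

    Assumption: the dictionary is the remaining samples, not a learned one.
    """
    scores = []
    for i, x in enumerate(data):
        atoms = [normalize(y) for k, y in enumerate(data) if k != i]
        _, residual = matching_pursuit(x, atoms, n_nonzero)
        scores.append(math.sqrt(dot(residual, residual)))
    return scores


if __name__ == "__main__":
    # Four samples on one line through the origin, one sample off it.
    data = [
        [1.0, 1.0, 0.0],
        [2.0, 2.0, 0.0],
        [3.0, 3.0, 0.0],
        [1.5, 1.5, 0.0],
        [0.0, 0.0, 5.0],
    ]
    for sample, score in zip(data, outlier_scores(data)):
        print(sample, round(score, 6))
```

In this toy dataset the first four samples are reconstructed almost perfectly by one another, while the last sample is orthogonal to every available atom and keeps its full norm as residual, so it receives the highest score. A global score of this kind would not be expected to separate local outliers, which is the gap the abstract says SNOD addresses with neighborhood information.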

Information

Published In

CF '20: Proceedings of the 17th ACM International Conference on Computing Frontiers
May 2020
298 pages
ISBN: 9781450379564
DOI: 10.1145/3387902
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. high-dimensional space
  2. outlier detection
  3. sparse coding

Qualifiers

  • Research-article

Conference

CF '20
CF '20: Computing Frontiers Conference
May 11 - 13, 2020
Sicily, Catania, Italy

Acceptance Rates

Overall Acceptance Rate 273 of 785 submissions, 35%
