Abstract
Outlier removal is vital in machine learning. As massive unlabeled data are generated rapidly, removing outliers from noisy data in a fast and unsupervised manner is attracting increasing attention in practical applications. This paper tackles this challenging problem by proposing a novel Recurrent Adaptive Reconstruction Extreme Learning Machine (RAR-ELM). Specifically, given a noisy data collection, RAR-ELM recurrently learns to reconstruct the data and automatically excludes data with high reconstruction errors as outliers through a novel adaptive labeling mechanism. Compared with existing methods, the proposed RAR-ELM has three major merits. First, RAR-ELM inherits the fast and sound learning property of the original extreme learning machine (ELM): it runs tens or hundreds of times faster than existing methods while achieving superior or comparable outlier removal performance, which makes it particularly suitable for scenarios such as real-time outlier removal. Second, instead of requiring a decision threshold to be specified in advance, RAR-ELM adaptively finds a reasonable decision threshold when processing data with different proportions of outliers, which is vital for unsupervised outlier removal, where no prior knowledge of the outliers in the data is available. Third, we also propose Online Sequential RAR-ELM (OS-RAR-ELM), which can be run in an online or sequential mode and makes RAR-ELM easily applicable to massive noisy data or online sequential data. Extensive experiments on various datasets show that the proposed RAR-ELM achieves faster and better unsupervised outlier removal than existing methods.
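The reconstruct-then-relabel loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the adaptive labeling mechanism, so an Otsu-style threshold on the reconstruction errors is swapped in here as an assumption, and the tanh activation, the ridge term `reg`, and the names `elm_errors`, `otsu_threshold`, `remove_outliers`, `n_hidden` and `n_rounds` are all hypothetical choices made for illustration.

```python
import numpy as np


def elm_errors(X_fit, X_all, n_hidden=8, reg=1e-2, seed=0):
    """Fit an ELM autoencoder on X_fit; return squared reconstruction error per row of X_all."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X_fit.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X_fit @ W + b)  # hidden-layer features (tanh is an assumed activation)
    # Output weights from regularized least squares: closed form, no gradient steps.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X_fit)
    recon = np.tanh(X_all @ W + b) @ beta
    return np.sum((recon - X_all) ** 2, axis=1)


def otsu_threshold(values, n_bins=64):
    """Otsu's method: the histogram cut that maximizes between-class variance.

    Assumed stand-in for the paper's adaptive labeling mechanism.
    """
    hist, edges = np.histogram(values, bins=n_bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)           # weight of the low-error class for each candidate cut
    w1 = 1.0 - w0               # weight of the high-error class
    m = np.cumsum(p * centers)  # cumulative first moment
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    between = np.zeros(n_bins)
    between[valid] = (m[-1] * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return edges[np.argmax(between) + 1]


def remove_outliers(X, n_rounds=5, **elm_kwargs):
    """Return a boolean mask that is True for rows currently labeled as inliers."""
    inlier = np.ones(len(X), dtype=bool)
    for _ in range(n_rounds):
        # Learn to reconstruct from the current inliers only, then score every sample.
        err = elm_errors(X[inlier], X, **elm_kwargs)
        new_inlier = err <= otsu_threshold(err)
        # Stop if the labels are stable or the threshold would reject everything.
        if not new_inlier.any() or np.array_equal(new_inlier, inlier):
            break
        inlier = new_inlier
    return inlier
```

Calling `remove_outliers(X)` on an `(n_samples, n_features)` array returns the inlier mask, so `X[mask]` is the cleaned data. A small `n_hidden` is used deliberately in this sketch: with many hidden nodes the random-feature autoencoder can reconstruct inliers and outliers alike, and the errors stop being informative.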
Cite this article
Siqi, W., Qiang, L., Xifeng, G. et al. Fast and unsupervised outlier removal by recurrent adaptive reconstruction extreme learning machine. Int. J. Mach. Learn. & Cyber. 10, 3539–3556 (2019). https://doi.org/10.1007/s13042-019-00943-4
DOI: https://doi.org/10.1007/s13042-019-00943-4