Abstract
With the rapid development of IoT and sensor technologies, data streams have become increasingly common in practice and can provide data support for production. However, collected data streams usually contain abnormal data, which degrades data-based prediction and analysis and therefore needs to be detected effectively. Because sensors are widely distributed, the volume of collected data is very large, and traditional frequent pattern-based abnormal detecting methods become unsuitable for large-scale data because the detecting process is too time-consuming. To address this problem, this paper first records the information of the collected data stream in a vector structure and then proposes a maximal frequent pattern-based abnormal detecting method, called MFPM-AD, to improve detection efficiency. Specifically, maximal frequent patterns are mined instead of all frequent patterns to reduce the time cost of the abnormal detecting phase, and three abnormality indexes are designed to measure the abnormal degree of each detected transaction. The MFPM-AD algorithm then detects implicit abnormal data based on the mined maximal frequent patterns and the designed indexes. Experimental results show that the proposed MFPM-AD method can effectively detect implicit abnormal data in data streams.
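The general idea of scoring transactions against maximal frequent patterns can be illustrated with a minimal Python sketch. It is not the MFPM-AD algorithm: the abstract does not define the vector structure or the three abnormality indexes, so the brute-force miner, the function names, the sample window, and the single `deviation_score` below are all hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force level-wise enumeration (only suitable for tiny examples)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        found = False
        for cand in combinations(items, k):
            s = frozenset(cand)
            support = sum(s <= t for t in transactions) / n
            if support >= min_support:
                frequent[s] = support
                found = True
        if not found:
            break  # no frequent k-itemset, so no larger itemset can be frequent
    return frequent


def maximal_frequent_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other for other in frequent)
    }


def deviation_score(transaction, maximal):
    """Hypothetical score in [0, 1], NOT one of the paper's indexes.

    Each maximal pattern contributes the fraction of its items found in the
    transaction, weighted by the pattern's support. A score near 1 means the
    transaction shares little with the common patterns.
    """
    total = sum(maximal.values())
    if total == 0:
        return 0.0  # no reference patterns, nothing to deviate from
    overlap = sum(
        sup * len(s & transaction) / len(s) for s, sup in maximal.items()
    )
    return 1.0 - overlap / total


# Hypothetical window of five transactions; the last one is the odd one out.
window = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("abc"),
    frozenset("de"),
]

maximal = maximal_frequent_itemsets(frequent_itemsets(window, min_support=0.5))
for t in window:
    print(sorted(t), round(deviation_score(t, maximal), 3))
```

Scoring against the maximal patterns only, rather than every frequent pattern, shrinks the set each transaction has to be compared with, which is the efficiency argument the abstract makes. A real stream setting would also need an incremental miner and a window-update policy, both omitted here.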
Acknowledgements
This work was supported by Scientific and technological key projects of Xinjiang Production & Construction Corps (Grant No. 2015AC023).
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cai, S., Sun, R., Li, J., Deng, C., Li, S. (2019). Abnormal Detecting over Data Stream Based on Maximal Pattern Mining Technology. In: Sun, Y., Lu, T., Xie, X., Gao, L., Fan, H. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2018. Communications in Computer and Information Science, vol 917. Springer, Singapore. https://doi.org/10.1007/978-981-13-3044-5_27
DOI: https://doi.org/10.1007/978-981-13-3044-5_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3043-8
Online ISBN: 978-981-13-3044-5
eBook Packages: Computer Science, Computer Science (R0)