A novel single scan distributed pattern mining algorithm for frequent pattern identification

Published: 01 January 2019

Abstract

In data mining, extracting frequent patterns from large databases remains a challenging task because of drawbacks such as high response time and communication cost. To alleviate these issues, this paper proposes a new algorithm for frequent pattern mining, the single scan distributed pattern mining algorithm (SSDPMA). The frequent patterns are extracted in a single scan of the database. The database is then split into multiple files, which are distributed to multiple virtual machines (VMs) that store the records and compute the weight of each distinct record. Next, the support, confidence and threshold values are estimated. If the limit is greater than the given data, the frequent data are mined using the proposed SSDPMA. The experimental results evaluate the performance of the proposed system in terms of response time, message size, execution time, run time and memory usage.
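The abstract does not give the details of SSDPMA, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes that the "weight" of a distinct record is its occurrence count, that in-memory lists stand in for the files sent to the VMs, and that support and confidence follow their usual association-rule definitions. All function names and the toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def split_records(records, n_parts):
    """Round-robin split of the records into n_parts chunks (stand-ins for per-VM files)."""
    parts = [[] for _ in range(n_parts)]
    for i, rec in enumerate(records):
        parts[i % n_parts].append(frozenset(rec))
    return parts


def local_weights(part):
    """Assumed weight of a distinct record: how often it occurs in this chunk."""
    return Counter(part)


def merge_weights(all_weights):
    """Combine the per-chunk weights into one global table."""
    total = Counter()
    for w in all_weights:
        total.update(w)
    return total


def frequent_itemsets(weights, min_support, max_size=2):
    """Support of every itemset up to max_size, computed from the weighted distinct records."""
    n = sum(weights.values())
    counts = Counter()
    for record, weight in weights.items():
        items = sorted(record)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += weight
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


def confidence(supports, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(both) / support(antecedent)."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return supports[both] / supports[tuple(sorted(antecedent))]


# Toy data, invented for illustration only.
records = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
]
parts = split_records(records, n_parts=3)
weights = merge_weights(local_weights(p) for p in parts)
supports = frequent_itemsets(weights, min_support=0.5)
```

Because identical records are collapsed into one weighted entry, each chunk is read once and only the compact weight tables need to be exchanged, which is one plausible way to keep message size down.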

Published In

International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 1
January 2019
100 pages
ISSN: 1755-8050
EISSN: 1755-8069

Publisher

Inderscience Publishers
Geneva 15, Switzerland