A new mining approach for uncertain databases using CUFP trees

Published: 01 March 2012

Abstract

Many algorithms have been proposed to mine frequent itemsets from transactional databases in which the presence or absence of each item in a transaction is known with certainty. In some applications, however, items are uncertain: each item occurs in a transaction with an existential probability ranging from 0 to 1. Processing such uncertain datasets differs considerably from processing certain (precise) datasets. The UF-tree algorithm was proposed to construct a UF-tree structure from an uncertain dataset and to mine frequent itemsets from that tree. During UF-tree construction, however, nodes are merged only when they hold the same item with the same existential probability, which leaves many redundant nodes in the tree. In this paper, a new tree structure called the compressed uncertain frequent-pattern tree (CUFP tree) is designed to efficiently keep the information needed in the mining process. In the CUFP tree, identical items are merged in a branch of the tree even when their existential probabilities in the transactions differ. A mining algorithm called CUFP-mine is then proposed, based on this tree structure, to find uncertain frequent patterns. Experimental results show that the proposed approach outperforms the UF-tree algorithm in both execution time and number of tree nodes.
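The difference between the two merging rules can be illustrated with a small sketch. This is not the paper's CUFP-tree construction or the CUFP-mine algorithm: the toy database `UNCERTAIN_DB`, the function names, and the fixed item order are all assumptions made for illustration, and the abstract does not say what each CUFP-tree node stores, so the sketch only counts nodes. It also shows the usual definition of expected support in uncertain itemset mining, assuming items are independent: the sum over transactions of the product of the items' existential probabilities.

```python
from math import prod

# Hypothetical toy uncertain database (not from the paper):
# each transaction maps an item to its existential probability.
UNCERTAIN_DB = [
    {"a": 0.9, "b": 0.6, "c": 0.3},
    {"a": 0.8, "b": 0.6},
    {"a": 0.9, "c": 0.7},
    {"b": 0.5, "c": 0.3},
]


def expected_support(itemset, db):
    """Sum, over transactions containing every item of `itemset`,
    of the product of the items' existential probabilities."""
    return sum(
        prod(t[item] for item in itemset)
        for t in db
        if all(item in t for item in itemset)
    )


def count_prefix_tree_nodes(db, item_order, merge_on_probability):
    """Insert each transaction into a prefix tree and count the nodes.

    merge_on_probability=True : a node is shared only when both the item and
                                its probability match (UF-tree-like rule).
    merge_on_probability=False: a node is shared whenever the item matches
                                (CUFP-like rule described in the abstract).
    """
    root = {}
    nodes = 0
    for t in db:
        children = root
        for item in item_order:  # assumed fixed global item order
            if item not in t:
                continue
            key = (item, t[item]) if merge_on_probability else item
            if key not in children:
                children[key] = {}
                nodes += 1
            children = children[key]
    return nodes


if __name__ == "__main__":
    order = ["a", "b", "c"]
    print("expected support of {a, b}:", expected_support({"a", "b"}, UNCERTAIN_DB))
    print("UF-tree-like nodes:", count_prefix_tree_nodes(UNCERTAIN_DB, order, True))
    print("CUFP-like nodes:", count_prefix_tree_nodes(UNCERTAIN_DB, order, False))
```

On this toy input the item-only merge key yields fewer nodes than the (item, probability) key, which is the compression effect the abstract describes. A real implementation would also have to keep enough probability information in the merged nodes to recover or bound the expected supports during mining.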



Published In

Expert Systems with Applications: An International Journal, Volume 39, Issue 4
March 2012
736 pages

Publisher

Pergamon Press, Inc., United States

Author Tags

1. CUFP tree
2. Data mining
3. Existential probability
4. FP tree
5. Uncertain database
