Mining high utility itemsets by dynamically pruning the tree structure

Wei Song¹,
Yu Liu¹ &
Jinhong Li¹

870 Accesses
50 Citations
Explore all metrics

Abstract

Mining high utility itemsets is one of the most important research issues in data mining owing to its ability to consider nonbinary frequency values of items in transactions and different profit values for each item. Mining such itemsets from a transaction database involves finding those itemsets with utility above a user-specified threshold. In this paper, we propose an efficient concurrent algorithm, called CHUI-Mine (Concurrent High Utility Itemsets Mine), for mining high utility itemsets by dynamically pruning the tree structure. A tree structure, called the CHUI-Tree, is introduced to capture the important utility information of the candidate itemsets. By recording changes in support counts of candidate high utility items during the tree construction process, we implement dynamic CHUI-Tree pruning, and discuss the rationality thereof. The CHUI-Mine algorithm makes use of a concurrent strategy, enabling the simultaneous construction of a CHUI-Tree and the discovery of high utility itemsets. Our algorithm reduces the problem of huge memory usage for tree construction and traversal in tree-based algorithms for mining high utility itemsets. Extensive experimental results show that the CHUI-Mine algorithm is both efficient and scalable.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Efficient algorithm for mining high average-utility itemsets in incremental transaction databases

Article 14 February 2017

A high utility itemset mining algorithm based on subsume index

Article 09 December 2015

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

References

Adnan M, Alhajj R (2009) DRFP-tree: disk-resident frequent pattern tree. Appl Intell 30(2):84–97
Article Google Scholar
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings 20th international conference very large data bases (VLDB’94), pp 487–499
Google Scholar
Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2009) Efficient tree structures for high utility pattern. IEEE Trans Knowl Data Eng 21(12):1708–1721
Article Google Scholar
Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2011) HUC-Prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34(2):181–198
Article Google Scholar
Barber B, Hamilton HJ (2003) Extracting share frequent itemsets with infrequent subsets. Data Min Knowl Discov 7(2):153–185
Article MathSciNet Google Scholar
Chan R, Yang Q, Shen Y-D (2003) Mining high utility itemsets. In: Proceedings of the 3rd IEEE international conference on data mining (ICDM’03), pp 19–26
Chapter Google Scholar
Erwin A, Gopalan RP, Achuthan NR (2007) CTU-Mine: an efficient high utility itemset mining algorithm using the pattern growth approach. In: Proceedings of the 7th IEEE international conference on computer and information technology (CIT’07), pp 71–76
Chapter Google Scholar
Erwin A, Gopalan RP, Achuthan NR (2007) A bottom-up projection based algorithm for mining high utility itemsets. In: Proceedings of the 2nd international workshop on integrating artificial intelligence and data mining (AIDM’07), pp 3–10
Google Scholar
Erwin A, Gopalan RP, Achuthan NR (2008) Efficient mining of high utility itemsets from large datasets. In: Proceedings of the 12th Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD’08), pp 554–561
Chapter Google Scholar
Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17(10):1347–1362
Article Google Scholar
Han J, Cheng H, Xin D, Yan X (2007) Frequent pattern mining: current status and future directions. Data Min Knowl Discov 15(1):55–86
Article MathSciNet Google Scholar
Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Google Scholar
Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1):53–87
Article MathSciNet Google Scholar
Hu J, Mojsilovic A (2007) High-utility pattern mining: a method for discovery of high-utility item sets. Pattern Recognit 40(11):3317–3324
Article MATH Google Scholar
Kaya M, Alhajj R (2008) Online mining of fuzzy multidimensional weighted association rules. Appl Intell 29(1):13–34
Article Google Scholar
Lee C-H (2007) IMSP: an information theoretic approach for multi-dimensional sequential pattern mining. Appl Intell 26(3):231–242
Article MATH Google Scholar
Lan G-C, Hong T-P, Tseng VS (2012) A projection-based approach for discovering high average-utility itemsets. J Inf Sci Eng 28(1):193–209
Google Scholar
Li H-F, Huang H-Y, Lee S-Y (2011) Fast and memory efficient mining of high-utility itemsets from data streams: with and without negative item profits. Knowl Inf Syst 28(3):495–522
Article Google Scholar
Li H-F (2011) MHUI-max: an efficient algorithm for discovering high-utility itemsets from data streams. J Inf Sci 37(5):532–545
Article Google Scholar
Li Y-C, Yeh J-S, Chang C-C (2008) Isolated items discarding strategy for discovering high utility itemsets. Data Knowl Eng 64(1):198–217
Article Google Scholar
Lin C-W, Hong T-P, Lu W-H (2011) An effective tree structure for mining high utility itemsets. Expert Syst Appl 38(6):7419–7424
Article Google Scholar
Liu Y, Liao W-K, Choudhary AN (2005) A two phase algorithm for fast discovery of high utility of itemsets. In: Proceedings of the 9th Pacific-Asia conference on knowledge discovery and data mining (PAKDD’05), pp 689–695
Google Scholar
Maunz A, Helma C, Kramer S (2011) Efficient mining for structurally diverse subgraph patterns in large molecular databases. Mach Learn 83(2):193–218
Article MATH MathSciNet Google Scholar
Piatetsky-Shapiro G (2007) Data mining and knowledge discovery 1996 to 2005: overcoming the hype and moving from “university” to “business” and “analytics”. Data Min Knowl Discov 15(1):99–105
Article MathSciNet Google Scholar
Rauch J (2005) Logic of association rules. Appl Intell 22(1):9–28
Article MATH Google Scholar
Roscoe AW (2010) Understanding concurrent systems. Springer, London
Book MATH Google Scholar
Shie B-E, Yu PS, Tseng VS (2013) Mining interesting user behavior patterns in mobile commerce environments. Appl Intell 38(3):418–435
Article Google Scholar
Song W, Yang BR, Xu ZY (2008) Index-BitTableFI: an improved algorithm for mining frequent itemsets. Knowl-Based Syst 21(6):507–513
Article Google Scholar
Tatti N, Cule B (2012) Mining closed strict episodes. Data Min Knowl Discov 25(1):34–66
Article MATH MathSciNet Google Scholar
Tseng M-C, Lin Y-Y, Jeng R (2008) Updating generalized association rules with evolving taxonomies. Appl Intell 29(3):306–320
Article Google Scholar
Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) UP-Growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’10), pp 253–262
Chapter Google Scholar
IBM data generator. http://www.cs.loyola.edu/~cgiannel/assoc_gen.html
Frequent Itemset mining implementations repository. http://fimi.ua.ac.be/
Wang S-L, Patel D, Jafari A, Hong T-P (2007) Hiding collaborative recommendation association rules. Appl Intell 27(1):67–77
Article MATH Google Scholar
Wang Y-T, Cheng J-T (2011) Mining periodic movement patterns of mobile phone users based on an efficient sampling approach. Appl Intell 35(1):32–40
Article Google Scholar
Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the 4th SIAM international conference on data mining (SDM’04), pp 482–486
Chapter Google Scholar
Yao H, Hamilton HJ (2006) Mining itemset utilities from transaction databases. Data Knowl Eng 59(3):603–626
Article Google Scholar
Yu G, Shao S, Luo B, Zeng X (2009) A hybrid method for high-utility itemsets mining in large high-dimensional data. Int J Data Warehous Min 5(1):57–73
Article Google Scholar
Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the 9th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03), pp 326–335
Google Scholar
Zhang S, Chen F, Wu X, Zhang C, Wang R (2012) Mining bridging rules between conceptual clusters. Appl Intell 36(1):108–118
Article Google Scholar

Download references

Acknowledgements

We would like to express our deep gratitude to the anonymous reviewers of this paper. The work is partly supported by the National Natural Science Foundation of China (61105045), Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality (PHR201108057), and North China University of Technology (CCXZ201303).

Author information

Authors and Affiliations

College of Information Engineering, North China University of Technology, Beijing, 100144, China
Wei Song, Yu Liu & Jinhong Li

Authors

Wei Song
View author publications
You can also search for this author in PubMed Google Scholar
Yu Liu
View author publications
You can also search for this author in PubMed Google Scholar
Jinhong Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Wei Song.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Song, W., Liu, Y. & Li, J. Mining high utility itemsets by dynamically pruning the tree structure. Appl Intell 40, 29–43 (2014). https://doi.org/10.1007/s10489-013-0443-7

Download citation

Published: 29 May 2013
Issue Date: January 2014
DOI: https://doi.org/10.1007/s10489-013-0443-7

Mining high utility itemsets by dynamically pruning the tree structure

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Efficient algorithm for mining high average-utility itemsets in incremental transaction databases

A high utility itemset mining algorithm based on subsume index

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Mining high utility itemsets by dynamically pruning the tree structure

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Efficient algorithm for mining high average-utility itemsets in incremental transaction databases

A high utility itemset mining algorithm based on subsume index

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

Explore related subjects

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation