Discovering high utility itemsets with multiple minimum supports

Published: 01 November 2014

Abstract

Association rule mining generally applies a single minimum support threshold to the whole database. This model implicitly assumes that all items in the database have the same nature. In real applications, however, items can differ in nature; medical datasets, for example, contain information on both diseases and the symptoms or statuses related to those diseases. Association rule mining therefore needs to consider multiple minimum supports. Association rule mining with multiple minimum supports discovers item rules in a way that reflects the characteristics of each item. Although this model can identify meaningful association rules, including rare item rules, it considers neither the importance of items (such as the fatality rate of a disease) nor their attributes (such as the duration of a symptom), because it treats every item as equally important and represents the occurrence of an item in a transaction as a binary value. In this paper, we propose a novel tree structure, called MHU-Tree (Multiple item supports with High Utility Tree), which is constructed with a single scan. Moreover, we propose an algorithm, named MHU-Growth (Multiple item supports with High Utility Growth), for mining high utility itemsets with multiple minimum supports. Experimental results show that MHU-Growth outperforms the previous algorithm on both real and synthetic datasets and can discover useful rules from a medical dataset.
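
To make the two ideas in the abstract concrete, the following is a minimal brute-force sketch. It is not MHU-Tree or MHU-Growth, whose details the abstract does not give. It assumes two conventions common in the related literature: an itemset's support threshold is the smallest minimum support among its items, and an itemset's utility is the sum of quantity times per-item utility over the transactions that contain it. The toy transactions, the `unit_utility` and `mis` tables, and the `min_util` parameter are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> quantity
# (a non-binary attribute, e.g. duration of a symptom).
transactions = [
    {"fever": 2, "cough": 1},
    {"fever": 1, "rare_disease": 1},
    {"cough": 3},
    {"fever": 1, "cough": 2, "rare_disease": 1},
]

# Hypothetical per-item importance (e.g. reflecting fatality rate).
unit_utility = {"fever": 1, "cough": 1, "rare_disease": 10}

# Hypothetical per-item minimum supports, as fractions of the database.
mis = {"fever": 0.5, "cough": 0.5, "rare_disease": 0.25}


def support(itemset, db):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset)) / len(db)


def utility(itemset, db, unit):
    """Sum of quantity * unit utility over transactions containing the itemset."""
    return sum(
        sum(t[i] * unit[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def mine(db, unit, mis, min_util):
    """Enumerate all itemsets (exponential; for illustration only) and keep
    those meeting their own support threshold and the utility threshold."""
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            # Assumed convention: threshold = lowest MIS among the items.
            threshold = min(mis[i] for i in itemset)
            s = support(itemset, db)
            u = utility(itemset, db, unit)
            if s >= threshold and u >= min_util:
                result[itemset] = (s, u)
    return result


found = mine(transactions, unit_utility, mis, min_util=10)
for itemset, (s, u) in found.items():
    print(itemset, s, u)
```

In this toy data, the itemsets containing `rare_disease` survive because the rare item lowers their support threshold and raises their utility, whereas a single threshold of 0.5 would discard `("cough", "rare_disease")`, whose support is 0.25.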

    Published In

    Intelligent Data Analysis  Volume 18, Issue 6
    November 2014
    238 pages

    Publisher

    IOS Press

    Netherlands

    Author Tags

    1. Frequent Itemsets
    2. Multiple Minimum Supports
    3. Rare Frequent Itemsets
    4. Utility Mining
