DOI: 10.1145/2790798.2790807
research-article

Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

Published: 13 July 2015

Abstract

High-utility itemset mining (HUIM) is an emerging topic in data mining. It consists of discovering high-utility itemsets (HUIs), i.e., groups of items (itemsets) that generate a high profit in transactional databases. Several algorithms have been proposed for this task. However, they share an important limitation: they rely on a single minimum utility threshold as the sole criterion for identifying HUIs. In this paper, we address this issue by introducing a novel framework, HUIM with multiple minimum utility thresholds (HUIM-MMU). In this framework, the user may specify a different threshold for each item when discovering HUIs. To perform HUIM-MMU, we first present an algorithm named HUI-MMU, which relies on a new sorted downward closure (SDC) property and the least minimum utility threshold (LMU). Furthermore, an improved algorithm, HUI-MMUTID, is proposed based on a TID-index strategy to increase mining performance. Extensive experiments on both real-life and synthetic datasets show that the two proposed algorithms can efficiently and effectively discover the complete set of HUIs in transactional databases while considering multiple minimum utility thresholds.
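
To make the problem setting concrete, the following is a minimal brute-force sketch in Python. It is not the HUI-MMU or HUI-MMUTID algorithm and uses neither the SDC property nor a TID-index; it simply enumerates every itemset. Two things are assumptions that the abstract does not state: the threshold of an itemset is taken to be the smallest threshold among its items, and the LMU is taken to be the smallest threshold over all items. The toy data and all names (`transactions`, `unit_profit`, `min_util`, `brute_force_hui_mmu`) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}
# Hypothetical user-specified minimum utility threshold of each item.
min_util = {"a": 12, "b": 8, "c": 20}


def utility(itemset, transactions, unit_profit):
    """Total profit of `itemset` over the transactions that contain all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def itemset_threshold(itemset, min_util):
    """ASSUMPTION: an itemset's threshold is the smallest threshold of its items."""
    return min(min_util[item] for item in itemset)


def brute_force_hui_mmu(transactions, unit_profit, min_util):
    """Enumerate all itemsets and keep those whose utility meets their threshold.

    Exponential in the number of items; for illustration only.
    """
    items = sorted(unit_profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, unit_profit)
            if u >= itemset_threshold(itemset, min_util):
                result[itemset] = u
    return result


# ASSUMPTION: the LMU is the smallest threshold over all items.
lmu = min(min_util.values())
print(lmu)  # 8
print(brute_force_hui_mmu(transactions, unit_profit, min_util))
# {('a',): 15, ('a', 'b'): 9}
```

In this toy example {b} is not a HUI (utility 6 against a threshold of 8), yet its superset {a, b} is (utility 9 against the same threshold of 8). Itemset utility is therefore not downward-closed, which is why a dedicated pruning property is needed instead of a plain Apriori-style subset check.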

    Published In

    ACM Other conferences
    C3S2E '15: Proceedings of the Eighth International C* Conference on Computer Science & Software Engineering
    July 2015
    166 pages
    ISBN:9781450334198
    DOI:10.1145/2790798

    In-Cooperation

    • Keio University
    • BytePress

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. High-utility itemsets
    2. LMU
    3. TID-index
    4. multiple minimum utility thresholds
    5. sorted downward closure property

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    C3S2E 2015

    Acceptance Rates

    Overall Acceptance Rate 12 of 42 submissions, 29%

    Article Metrics

    • Downloads (last 12 months): 6
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 17 Dec 2024

    Cited By

    • (2023) A new algorithm using integer programming relaxation for privacy-preserving in utility mining. Applied Intelligence 53:21, pp. 25106-25118. DOI: 10.1007/s10489-023-04913-w. Online publication date: 3-Aug-2023
    • (2022) Mining High Utility Itemset with Multiple Minimum Utility Thresholds Based on Utility Deviation. 2022 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 490-496. DOI: 10.1109/ICDMW58026.2022.00071. Online publication date: Nov-2022
    • (2022) Constraint Pushing Multi-threshold Framework for High Utility Time Interval Sequential Pattern Mining. Soft Computing and its Engineering Applications, pp. 264-273. DOI: 10.1007/978-3-031-05767-0_21. Online publication date: 7-May-2022
    • (2021) Distributed Mining of High Utility Time Interval Sequential Patterns with Multiple Minimum Utility Thresholds. Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices, pp. 86-97. DOI: 10.1007/978-3-030-79457-6_8. Online publication date: 19-Jul-2021
    • (2020) A Survey of Key Technologies for High Utility Patterns Mining. IEEE Access 8, pp. 55798-55814. DOI: 10.1109/ACCESS.2020.2981962. Online publication date: 2020
    • (2019) FLUI-Growth. Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science, pp. 535-541. DOI: 10.1145/3349341.3349464. Online publication date: 12-Jul-2019
    • (2017) Mining of frequent patterns with multiple minimum supports. Engineering Applications of Artificial Intelligence 60:C, pp. 83-96. DOI: 10.1016/j.engappai.2017.01.009. Online publication date: 1-Apr-2017
    • (2017) A binary PSO approach to mine high-utility itemsets. Soft Computing - A Fusion of Foundations, Methodologies and Applications 21:17, pp. 5103-5121. DOI: 10.1007/s00500-016-2106-1. Online publication date: 1-Sep-2017
    • (2016) More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds. Proceedings, Part I, 27th International Conference on Database and Expert Systems Applications - Volume 9827, pp. 71-87. DOI: 10.1007/978-3-319-44403-1_5. Online publication date: 5-Sep-2016
    • (2015) Mining Potential High-Utility Itemsets over Uncertain Databases. Proceedings of the ASE BigData & SocialInformatics 2015, pp. 1-6. DOI: 10.1145/2818869.2818895. Online publication date: 7-Oct-2015
