Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3579654.3579766acmotherconferencesArticle/Chapter ViewAbstractPublication PagesacaiConference Proceedingsconference-collections
research-article

IHUMN: an improved high-utility itemsets mining algorithm with negative utility items

Published: 14 March 2023 Publication History

Abstract

High-utility itemset mining is to mine high profit itemsets from transaction databases. But if there are some itemsets with negative utility values in the transaction database, the high-utility itemsets with the negative values may be pruned incorrectly and the subset of the low-utility itemsets may be the high-utility itemsets. In this paper, an improved high-utility itemsets mining algorithm with negative utility items (IHUMN) is proposed. A novel utility-list buffer structure with negative unit profits is proposed to efficiently store and retrieve utility-list, and reduce the memory consumption during the mining process. Moreover, Transitive Extension with Negative utility formula is constructed to compute the upper bound of utility avoiding the overestimation of low-utility itemsets as high-utility itemsets. The performance of IHUMN is evaluated, and compared against the FHN and GHUM method. The results of the experiments confirm that IHUMN has a favorable improvement in terms of time costs, the memory utilization and the number of visited nodes. The IHUMN algorithm consumes 40% less memory than GHUM. Moreover, the algorithm has good performance on dense datasets.

References

[1]
Fournier-Viger P, Lin JC-W, Nkambou R, Vo B, Tseng VS. High-utility pattern mining: theory, algorithms and applications. Vol.51, Studies in big data. Springer, Cham, Switzerland. 2019. https://doi.org/10.1007/978-3-030-04921-8.
[2]
Ghafari SM, Tjortjis C. A survey on association rules mining using heuristics. WIREs Data Mining Knowl Discov. 2019;9(4): e1307. https://doi.org/10.1002/widm.1307.
[3]
Jia Z, Wang Z. A regression-based algorithm for frequent itemsets mining. Data Technol Appl. 2019;54(3):259-273. https://doi.org/10.1108/DTA-03-2019-0037.
[4]
Valiullin T, Huang ZX, Wei CH, Yin JF, Wu DM, Egorova I. A new approximate method for mining frequent itemsets from big data. Comput Sci Inf Syst. 2021;18(3):641–656.
[5]
Agrawal R, Srikant R. Fast Algorithms for Mining Association Rules in Large Databases. Proceedings of the 20th International Conference on Very Large Data Bases; 1994 Sep; Santiago de Chile.
[6]
Han J, Pei J, Yin Y, Mao R. Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Min Knowl Discov. 2004;8(1):53–87. https://doi.org/10.1023/B:DAMI.0000005258.31418.83
[7]
Liu Y, Liao W, Choudhary A. A Two-Phase Algorithm for Fast Discovery of High Utility Itemsets. Proceedings of Advances in Knowledge Discovery and Data Mining; 2005 May; Hanoi, Vietnam. https://doi.org/10.1007/11430919_79.
[8]
Liu M, Qu J. Mining high utility itemsets without candidate generation. Proceedings of the 21st ACM International Conference on Information and Knowledge Management; 2012 Oct; Maui Hawaii, USA. https://doi.org/10.1145/2396761.2396773.
[9]
Krishnamoorthy S. Pruning strategies for mining high utility itemsets. Expert Syst Appl. 2015;42(5):2371–2381. https://doi.org/10.1016/j.eswa.2014.11.001.
[10]
Fournier-Viger P, Wu C-W, Zida S, Tseng VS. FHM: Faster High-Utility Itemset Mining Using Estimated Utility Co-occurrence Pruning. Proceedings of Foundations of Intelligent Systems; 2014 Jun; Roskilde, Denmark. https://doi.org/10.1007/978-3-319-08326-1_9.
[11]
Zida S, Fournier-Viger P, Lin JC-W, Wu C-W, Tseng VS. EFIM: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst. 2017;51(2):595–625. https://doi.org/10.1007/s10115-016-0986-0.
[12]
Yun U, Nam H, Lee G, Yoon E. Efficient approach for incremental high utility pattern mining with indexed list structure. Future Gener Comput Syst. 2019;95:221–239. https://doi.org/10.1016/j.future.2018.12.029.
[13]
Truong T, Duong H, Le B, Fournier-Viger P, Yun U, Fujita H. Efficient algorithms for mining frequent high utility sequences with constraints. Inform Sciences. 2021;568:239–264. https://doi.org/10.1016/j.ins.2021.01.060.
[14]
Sethi KK, Ramesh D. High average-utility itemset mining with multiple minimum utility threshold: A generalized approach. Eng Appl Artif Intell. 2020;96:103933. https://doi.org/10.1016/j.engappai.2020.103933.
[15]
Singh K, Kumar R, Biswas B. High average-utility itemsets mining: a survey. Appl Intell. 2022;52(4):3901–3938. https://doi.org/10.1007/s10489-021-02611-z.
[16]
Nguyen LTT, Nguyen TDD, Nguyen A, Efficient Method for Mining High-Utility Itemsets Using High-Average Utility Measure. Proceedings of Computational Collective Intelligence; 2020 Nov; Da Nang, Vietnam. https://doi.org/10.1007/978-3-030-63007-2_24
[17]
Han X, Liu X, Li J, Gao H. Efficient top-k high utility itemset mining on massive data. Inform Sciences. 2021;557:382–406. https://doi.org/10.1016/j.ins.2020.08.028.
[18]
Chu C-J, Tseng VS, Liang T. An efficient algorithm for mining high utility itemsets with negative item values in large databases. Appl Math Comput. 2009;215(2):767–778. https://doi.org/10.1016/j.amc.2009.05.066.
[19]
Lan G-C, Hong T-P, Huang J-P, Tseng VS. On-shelf utility mining with negative item values. Expert Syst Appl. 2014;41(7):3450–3459. https://doi.org/10.1016/j.eswa.2013.10.049.
[20]
Lin JC-W, Fournier-Viger P, Gan W. FHN: An efficient algorithm for mining high-utility itemsets with negative unit profits. Knowl Based Syst. 2016;111:283–298. https://doi.org/10.1016/j.knosys.2016.08.022.
[21]
Krishnamoorthy S. Efficiently mining high utility itemsets with negative unit profits. Knowl Based Syst. 2018;145:1–14. https://doi.org/10.1016/j.knosys.2017.12.035.
[22]
Sun R, Han M, Zhang C, Shen M, Du S. Mining of top-k high utility itemsets with negative utility. J Intell Fuzzy Syst. 2021;40(3):5637–5652. https://doi.org/10.3233/JIFS-201357.
[23]
Fournier-Viger P, Gomariz A, Gueniche T, Soltani A, Wu C-W, Tseng VS. SPMF: A Java Open-Source Data Mining Library. 2014. Retrieved May 6, 2020 from https://www.philippe-fournier-viger.com/spmf.

Index Terms

  1. IHUMN: an improved high-utility itemsets mining algorithm with negative utility items
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    ACAI '22: Proceedings of the 2022 5th International Conference on Algorithms, Computing and Artificial Intelligence
    December 2022
    770 pages
    ISBN:9781450398336
    DOI:10.1145/3579654
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 March 2023

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Frequent itemset mining
    2. High utility itemset mining
    3. Negative unit profits
    4. Pruning strategies
    5. Utility-list buffer structure

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ACAI 2022

    Acceptance Rates

    Overall Acceptance Rate 173 of 395 submissions, 44%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 16
      Total Downloads
    • Downloads (Last 12 months)5
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 12 Nov 2024

    Other Metrics

    Citations

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media