RHUPS: Mining Recent High Utility Patterns with Sliding Window–based Arrival Time Control over Data Streams

Published: 13 January 2021

Abstract

Real-world databases have several distinguishing characteristics. New data is inserted continuously over time with no limit on the length of the database, and each item carries a variety of information. Recently generated data also has a greater influence than older data. Databases with these characteristics are called time-sensitive non-binary stream databases; examples include web-server click data, market sales data, data from sensor networks, and network traffic measurements. Many high utility pattern mining and stream pattern mining methods have been proposed so far. However, they are not well suited to analyzing these databases, because they find valid patterns while taking into account only some of the characteristics described above. Therefore, knowledge-based software that efficiently finds meaningful information in databases with these characteristics is required. In this article, we propose an intelligent information system that calculates the influence of the insertion time of each batch in a large-scale stream database by applying the sliding window model and mines recent high utility patterns without generating candidate patterns. In addition, we suggest a novel list-based data structure for fast and efficient management of time-sensitive stream databases. Our technique is compared with state-of-the-art algorithms through various experiments on real and synthetic datasets. The experimental results show that our approach outperforms previously proposed methods in terms of runtime, memory usage, and scalability.
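To make the idea of batch-level arrival time control concrete, the following is a minimal sketch, not the RHUPS algorithm itself. It assumes that each transaction is a mapping from an item to its utility (for example, quantity times unit profit), that the window holds a fixed number of batches, and that a batch's influence decays geometrically with its age. The class name `SlidingWindowUtility`, the `decay` parameter, and the geometric weighting are all illustrative assumptions; the abstract does not specify the weighting formula, the list-based structure, or the candidate-free search, and none of those are reproduced here.

```python
from collections import deque


class SlidingWindowUtility:
    """Hypothetical helper: time-weighted utility over a sliding window of batches."""

    def __init__(self, window_size, decay):
        # deque(maxlen=...) discards the oldest batch automatically
        # once the window is full.
        self.window = deque(maxlen=window_size)
        # Assumed geometric fading factor in (0, 1]; 1.0 means no fading.
        self.decay = decay

    def add_batch(self, batch):
        """Append a batch, given as a list of {item: utility} transactions."""
        self.window.append(batch)

    def recent_utility(self, itemset):
        """Sum the utility of `itemset` over the window, weighting older batches less."""
        items = frozenset(itemset)
        newest = len(self.window) - 1
        total = 0.0
        for index, batch in enumerate(self.window):
            # The newest batch has weight 1; each step back multiplies by decay.
            weight = self.decay ** (newest - index)
            for transaction in batch:
                if all(item in transaction for item in items):
                    total += weight * sum(transaction[item] for item in items)
        return total

    def is_recent_high_utility(self, itemset, min_utility):
        """Check the weighted utility against a user-chosen threshold."""
        return self.recent_utility(itemset) >= min_utility


if __name__ == "__main__":
    stream = SlidingWindowUtility(window_size=2, decay=0.5)
    stream.add_batch([{"a": 4, "b": 2}])
    stream.add_batch([{"a": 2, "b": 6}, {"a": 1}])
    print(stream.recent_utility({"a", "b"}))  # 0.5 * 6 + 1.0 * 8 = 11.0
    stream.add_batch([{"a": 5, "b": 3}])      # the first batch slides out
    print(stream.recent_utility({"a", "b"}))  # 0.5 * 8 + 1.0 * 8 = 12.0
```

The sketch only scores an itemset that is handed to it; enumerating all recent high utility patterns efficiently is the part the article addresses with its list-based structure.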


    Published In

    ACM Transactions on Intelligent Systems and Technology  Volume 12, Issue 2
    Survey Paper and Regular Paper
    April 2021
    319 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/3447400
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 January 2021
    Accepted: 01 October 2020
    Revised: 01 September 2020
    Received: 01 January 2020
    Published in TIST Volume 12, Issue 2


    Author Tags

    1. Recent high utility pattern
    2. evolutionary time-fading factor
    3. sliding window
    4. stream database

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Research Foundation of Korea (NRF)
    • Ministry of Education
    • Science and Technology (NRF)

