Abstract
In the high-utility itemset mining (HUIM) model, the low-utility patterns sometimes with a very high-utility pattern will be considered as a valuable pattern even if this behavior may be not highly correlated. A more intelligent system that provides non-redundant and correlated behavior based on utility measure is desired. In this paper, we first present a novel method, called extracting non-redundant correlated purchase behaviors by utility measure, to determine the high qualified patterns, which can lead to higher recall and better precision. In the proposed projection-based approach, efficient projection mechanism and a sorted downward closure property are developed to reduce the database size. Two pruning strategies are further developed to efficiently and effectively discover the desired patterns. An extensive experimental study showed that the proposed algorithm considerably outperforms the existing HUIM algorithms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Frequent itemset mining dataset repository. http://fimi.ua.ac.be/data/
Agrawal, R., Imielinski, T., Swami, A.: Database mining: a performance perspective. IEEE Trans. Knowl. Data Eng. 5, 914–925 (1993)
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: The International Conference on Very Large Data Bases, pp. 487–499 (1994)
Agrawal, R., Srikant, R.: Quest synthetic data generator. http://www.Almaden.ibm.com/cs/quest/syndata.html
Ahmed, C.F., Tanbeer, S.K., Jeong, B.S., Le, Y.K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21(12), 1708–1721 (2009)
Ahmed, C.F., Tanbeer, S.K., Jeong, B.S., Choi, Y.K.: A framework for mining interesting high utility patterns with a strong frequency affinity. Inf. Sci. 181(21), 4878–4894 (2011)
Chan, R., Yang, Q., Shen, Y.D.: Mining high utility itemsets. In: The International Conference on Data Mining, pp. 19–26 (2003)
Geng, L., Hamilton, H.J.: Interestingness measures for data mining: a survey. ACM Comput. Surv. 38(3), Article 9 (2006)
Han, J., Pei, J., Yin, Y., Mao, R.: Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min. Knowl. Discov. 8(1), 53–87 (2004)
Fournier-Viger, P., Wu, C.W., Zida, S., Tseng, V.S.: FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. Found. Intell. Syst. 8502, 83–92 (2014)
Krishnamoorthy, S.: Pruning strategies for mining high utility itemsets. Expert Syst. Appl. 42(5), 2371–2381 (2015)
Lin, J.C.W., Gan, W., Hong, T.P., Tseng, V.S.: Efficient algorithms for mining up-to-date high-utility patterns. Adv. Eng. Inform. 29(3), 648–661 (2015)
Lin, J.C.-W., Gan, W., Fournier-Viger, P., Hong, T.-P.: Mining discriminative high utility patterns. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.-P. (eds.) ACIIDS 2016. LNCS, vol. 9622, pp. 219–229. Springer, Heidelberg (2016). doi:10.1007/978-3-662-49390-8_21
Lin, J.C.W., Gan, W., Fournier-Viger, P., Hong, T.P.: Mining high-utility itemsets with multiple minimum utility thresholds. In: ACM International Conference on Computer Science & Software Engineering, pp. 9–17 (2015)
Liu, M., Qu, J.: Mining high utility itemsets without candidate generation. In: ACM International Conference on Information and Knowledge Management, pp. 55–64 (2012)
Liu, Y., Liao, W., Choudhary, A.: A two-phase algorithm for fast discovery of high utility itemsets. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS, vol. 3518, pp. 689–695. Springer, Heidelberg (2005). doi:10.1007/11430919_79
Omiecinski, E.R.: Alternative interest measures for mining associations in databases. IEEE Trans. Knowl. Data Eng. 15(1), 57–69 (2003)
Tseng, V.S., Wu, C.W., Shie, B.E., Yu, P.S.: UP-growth: an efficient algorithm for high utility itemset mining. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 253–262 (2010)
Tseng, V.S., Shie, B.E., Wu, C.W., Yu, P.S.: Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans. Knowl. Data Eng. 25(8), 1772–1786 (2013)
Wu, T., Chen, Y., Han, J.: Re-examination of interestingness measures in pattern mining: a unified framework. Data Min. Knowl. Discov. 21(3), 371–397 (2010)
Yao, H., Hamilton, J., Butz, C.J.: A foundational approach to mining itemset utilities from databases. In: SIAM International Conference on Data Mining, pp. 211–225 (2004)
Acknowledgments
This research was partially supported by the National Natural Science Foundation of China (NSFC) under grant No. 61503092 and by the Tencent Project under grant CCF-Tencent IAGR20160115.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Gan, W., Lin, J.CW., Fournier-Viger, P., Chao, HC. (2017). Extracting Non-redundant Correlated Purchase Behaviors by Utility Measure. In: Bellatreche, L., Chakravarthy, S. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2017. Lecture Notes in Computer Science(), vol 10440. Springer, Cham. https://doi.org/10.1007/978-3-319-64283-3_32
Download citation
DOI: https://doi.org/10.1007/978-3-319-64283-3_32
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64282-6
Online ISBN: 978-3-319-64283-3
eBook Packages: Computer ScienceComputer Science (R0)