Abstract
Extraction of interesting patterns from large databases have placed great impact on many applications for instance market basket analysis, web service mining, bio-informatics and mobile commerce for effective decision making. One of the methods to unearth the interesting patterns from the databases is High Utility Itemset Mining (HUIM). Research in mining High Utility Itemsets (HUIs) is emerged because frequent itemset mining only focuses on statistical correlations between the items in an itemset. HUIM aims to extract relations between the items in an itemset based on the semantic significance. This article presents a Hybrid Genetic Algorithm (HGA) which combines quantum operators with classical genetic operators to mine interesting patterns from large databases for web service composition. The extensive experimental result show that proposed algorithm is sensible and admissible in terms of running time and memory consumption for extracting high utility itemsets for web service composition.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Chen MS, Han J, Yu PS (1996) Data mining: an overview from a database perspective. IEEE Trans Knowl Data Eng 8(6):866–883. https://doi.org/10.1109/69.553155
Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Elsevier, New York
Geng L, Hamilton HJ (2006) Interestingness measures for data mining: a survey. ACM Comput Surv 38(3):1–32. https://doi.org/10.1145/1132960.1132963
George B, Plexousakis D (2010) Automated web service composition: state of the art and research challenges. ICS-FORTH, Technical Report-409
Rathore M, Suman U (2013) Web service selection algorithm for dynamic service composition using LSLO approach, In: Proceedings of 2013 international conference on informatics, electronics and vision (ICIEV), pp 1–6. doi: https://doi.org/10.1109/ICIEV.2013.6572688.
Vivek R, Prasad M, Sushmitha N (2016) Recommendation for web service Composition by mining usage logs. Int J Data Mining Knowle Manage Process 6(2):83–89. https://doi.org/10.5121/ijdkp.2016.6207
Walid G, Baïna K, Godart C (2008) Log-based mining techniques applied to web service composition reengineering. Serv Oriented Comput Appl 2(2):93–110. https://doi.org/10.1007/s11761-008-0023-6
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Bosten
Perales-Gravan C, Lahoz-Beltra R (2008) An AM Radio Receiver designed with a genetic algorithm based on a bacterial conjugation genetic operator. IEEE Trans Evol Comput 12(2):129–142. https://doi.org/10.1109/TEVC.2007.895271
Calvin WH (1987) The brain as a Darwin machine. Nature 330:33–34. https://doi.org/10.1038/330033a0
Liu Y, Liao WK, Choudhary A (2005) A two-phase algorithm for fast discovery of high utility Itemsets. In: 9th Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD 2005), Lecturer Notes Computer Science, vol 3518, pp 689–695. doi: https://doi.org/10.1007/11430919_79.
Chan R, Yang Q, Shen Y (2003) Mining high-utility itemsets. In: Proceedings of the 2003 IEEE international conference on data mining (ICDM’ 03) Melbourne, FL, pp 19–26. doi: https://doi.org/10.1109/ICDM.2003.1250893.
Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the 3rd SIAM international conference on data mining, Orlando, Florida, pp 482–486. doi: https://doi.org/10.1137/1.9781611972740.51.
Liu Y, Liao WK, Choudhary A (2005) A fast high utility itemsets mining algorithm. UBDM’2005, pp 90–99. doi: https://doi.org/10.1145/1089827.1089839.
Zaki MJ (1999) Parallel and distributed association mining: a survey. IEEE Concurr 7(4):4–25. https://doi.org/10.1109/4434.806975
Zaki MJ (2001) SPADE: an efficient algorithm for mining frequent sequences. Mach Learn 42(1):31–60. https://doi.org/10.1023/A:1007652502315
Yao H, Hamilton HJ, Geng L (2006) A unified framework for utility based measures for mining itemsets. In: Proceedings of the 2nd international workshop on utility-based data mining, pp 28–37
Yao H, Hamilton HJ (2006) Mining itemset utilities from transaction databases. Data Knowl Eng 59(3):603–626. https://doi.org/10.1016/j.datak.2005.10.004
Tseng VS, Chu CJ, Liang T (2006) Efficient mining of temporal high utility itemsets from data streams. In: Proceedings of the 2nd international workshop on utility-based data mining, pp 18–27. doi: https://doi.org/10.1007/978-3-642-13265-0_8.
Fayyad U, Piatetsky-Shapiro G, Smyth P, Uthurusamy R (1996) Advances in knowledge discovery and data mining. AAAI/MIT Press, New York
Wang J, Liu Y, Zhou L, Shi Y, Zhu X (2007) Pushing frequency constraint to utility mining model. Lect Notes Comput Sci 4489:685–692. https://doi.org/10.1007/978-3-540-72588-6_115
Podpecan V, Lavrac N, Kononenko I (2007) A fast algorithm for mining utility-frequent itemsets. In: Proceedings of the 11th European conference on principles and practice of knowledge discovery in databases.
Hu J, Mojsilovic A (2007) High-utility pattern mining: a method for discovery of high-utility item sets. Pattern Recogn 40(11):3317–3324. https://doi.org/10.1016/j.patcog.2007.02.003
Erwin A, Gopalan RP, Achuthan NR (2007) CTU-Mine: an efficient high utility itemset mining algorithm using the pattern growth approach. In: Proceedings of 7th international conference on computer and information technology, pp 71–76. Doi: https://doi.org/10.1109/CIT.2007.120.
Erwin A, Gopalan RP, Achuthan NR (2007) A bottom-up projection based algorithm for mining high utility itemsets. In: 2nd workshop on integrating AI and data mining (AIDM 2007), pp 3–11
Yen SJ, Lee YS (2007) Mining high utility quantitative association rules. Lect Notes Comput Sci 4654:283–292. https://doi.org/10.1007/978-3-540-74553-2_26
Li YC, Yeh JS, Chang CC (2008) Isolated items discarding strategy for discovering high utility itemsets. Data Knowl Eng 64(1):198–217. https://doi.org/10.1016/j.datak.2007.06.009
Ahmed CF, Tanbeer SK, Jeong BS, Lee YK (2009) Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans Knowl Data Eng 21(12):1708–1721. https://doi.org/10.1109/TKDE.2009.46
Lan GC, Hong TP, Tseng VS (2009) Mining On-shelf high utility itemsets. In: International conference on information technology and applications in outlying islands, pp 482–489
Chu CJ, Tseng VS, Liang T (2009) An efficient algorithm for mining high utility itemsets with negative item values in large databases. J Appl Math Comput 215(2):767–778. https://doi.org/10.1016/j.amc.2009.05.066
Ahmed CF, Tanbeer SK, Jeong BS (2009) Efficient mining of weighted frequent patterns over data streams. In: Eleventh IEEE international conference on high performance computing and communications, pp 400–406. Doi: https://doi.org/10.1109/HPCC.2009.36.
Tseng VS, Wu CW, Shie BE, Yu PS (2010) UP-growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 253–262. doi: https://doi.org/10.1145/1835804.1835839.
Li HF (2011) MHUI-max: An efficient algorithm for discovering high-utility itemsets from data streams. Inf Sci 37(5):532–545. https://doi.org/10.1177/0165551511416436
Hong TP, Lee CH, Wang SL (2011) Effective utility mining with the measure of average utility. Expert Syst Appl 38(7):8259–8265. https://doi.org/10.1016/j.eswa.2011.01.006
Li FG, Sun YJ, Ni ZW, Yu L, Mao XM (2012) The Utility Frequent Pattern Mining Based on Slide Window in Data Stream. In: 5th international conference on intelligent computation technology and automation (ICICTA), pp 414–419. doi: https://doi.org/10.1109/ICICTA.2012.110.
Kannimuthu S, Premalatha S, Shankar S (2012) A novel approach to extract high utility itemsets from distributed databases. Comput Inform 31(6):1597–1615
Lin CW, Lan GC, Hong TP (2012) An incremental mining algorithm for high utility itemsets. Expert Syst Appl 39(8):7173–7180. https://doi.org/10.1016/j.eswa.2012.01.072
Shie BE, Yu PS, Tseng VS (2012) Efficient algorithms for mining maximal high utility itemsets from data streams with different models. Expert Syst Appl 39(17):12947–12960. https://doi.org/10.1016/j.eswa.2012.05.035
Zihayat M, An A (2014) Mining top-k high utility patterns over data streams. J Inf Sci 285(20):138–161. https://doi.org/10.1016/j.ins.2014.01.045
Lan GC, Hong TP, Tseng VS, Wang SL (2014) Applying the maximum utility measure in high utility sequential pattern mining. Expert Syst Appl 41(11):5071–5081. https://doi.org/10.1016/j.eswa.2014.02.022
Zhang X, Deng ZH (2015) Mining summarization of high utility itemsets. Knowl Based Syst 84:67–77. https://doi.org/10.1016/j.knosys.2015.04.004
Lin JC-W, Gan W, Hong TP, Tseng VS (2015) Efficient algorithms for mining up-to-date high-utility patterns. Adv Eng Inform 29(3):648–661. https://doi.org/10.1016/j.aei.2015.06.002
Yun U, Kim J (2015) A fast perturbation algorithm using tree structure for privacy preserving utility mining. Expert Syst Appl 42(3):1149–1165. https://doi.org/10.1016/j.eswa.2014.08.037
Lin JC-W, Gan W, Fournier-Viger P, Yang L, Liu Q, Frnda J, Sevcik L, Voznak M (2016) High utility-itemset mining and privacy-preserving utility mining. J Sci Perspect 7:74–80. https://doi.org/10.1016/j.pisc.2015.11.013
Lin JC-W, Gan W, Fournier-Viger P, Hong TP, Tseng VS (2016) Efficient algorithms for mining high-utility itemsets in uncertain databases. Knowl Based Syst 96:171–187. https://doi.org/10.1016/j.knosys.2015.12.019
Chen Y, An A (2016) Approximate parallel high utility itemset mining. Big Data Res 6:26–42. https://doi.org/10.1016/j.bdr.2016.07.001
Yun U, Ryang H, Lee G, Fujita H (2017) An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowl Based Syst 124:188–206. https://doi.org/10.1016/j.knosys.2017.03.016
Krishnamoorthy S (2018) Efficiently mining high utility itemsets with negative unit profits. Knowl Based Syst 145:1–14. https://doi.org/10.1016/j.knosys.2017.12.035
Nguyen LTT, Vu VV, Lam MTH, Duong TTM, Manh LT, Nguyen TTT, Vo B, Fujita H (2019) An efficient method for mining high utility closed itemsets. J Inf Sci 495:78–99. https://doi.org/10.1016/j.ins.2019.05.006
Nam H, Yun U, Yoon E, Lin JC (2020) Efficient approach of recent high utility stream pattern mining with indexed list structure and pruning strategy considering arrival times of transactions. J Inf Sci 529:1–27. https://doi.org/10.1016/j.ins.2020.03.030
Lin JC-W, Pirouz M, Djenouri Y, Cheng CF, Ahmed U (2020) Incrementally updating the high average-utility patterns with pre-large concept. Appl Intell 50:3788–3807. https://doi.org/10.1109/ACCESS.2020.2982415
Truong PFVT, Tran A, Duong H, Le B (2020) EHUSM: mining high utility sequences with a pessimistic utility model. Data Sci Pattern Recogn 4(2):65–83
Srivastava G, Lin JC-W, Zhang X, Li Y (2021) Large-scale high-utility sequential pattern analytics in internet of things. IEEE Internet Things J 8(16):12669–12678. https://doi.org/10.1109/JIOT.2020.3026826
Lin JC-W, Djenouri Y, Srivastava G, Yun U, Fournier-Viger P (2021) A predictive GA-based model for closed high-utility itemset mining. Appl Soft Comput 108:107422. https://doi.org/10.1016/j.asoc.2021.107422
Lin JC-W, Djenouri Y, Srivastava G (2021) Efficient closed high-utility pattern fusion model in large-scale databases. Inf Fusion 76:122–132. https://doi.org/10.1016/j.inffus.2021.05.011
Kannimuthu S, Premalatha K (2014) Discovery of high utility itemsets using genetic algorithm with ranked mutation. Appl Artif Intell 28:337–359. https://doi.org/10.1080/08839514.2014.891839
Song W, Huang C (2018) Discovering high utility itemsets based on the artificial bee colony algorithm. PAKDD 2018. Lect Notes Comput Sci 10939:3–14. https://doi.org/10.1007/978-3-319-93040-4_1
Lin JC-W, Yang L, Fournier-Viger P, Wu JMT, Hong TP, Wang LSL, Zhan J (2016) Mining high-utility itemsets based on particle swarm optimization. Eng Appl Artif Intell 55:320–330. https://doi.org/10.1016/j.engappai.2016.07.006
Lin JC-W, Yang L, Fournier-Viger P, Hong TP, Voznak M (2017) A binary PSO approach to mine high-utility itemsets. Soft Comput 21:5103–5121. https://doi.org/10.1007/s00500-016-2106-1
Song W, Huang C (2018) Mining high utility itemsets using bio-inspired algorithms: a diverse optimal value framework. IEEE Access 6:19568–19582. https://doi.org/10.1109/ACCESS.2018.2819162
Gunawan R, Winarkoa E, Pulungana R (2020) A BPSO-based method for high-utility itemset mining without minimum utility threshold. Knowl Based Syst. https://doi.org/10.1016/j.knosys.2019.105164
Song W, Huang C (2020) Mining high average-utility itemsets based on particle swarm optimization. Data Sci Pattern Recogn 4(2):19–32
Nayak R, Tong C (2004) Applications of data mining in web services. WISE 2004. Lect Notes Comput Sci 3306:199–205. https://doi.org/10.1007/978-3-540-30480-7_22
Ran T, Zou Y (2010) An approach for mining web service composition patterns from execution logs. In: Proceedings of 12th IEEE international symposium on web systems evolution (WSE), Timisoara, pp 53–62. doi: https://doi.org/10.1109/WSE.2010.5623568.
Yasmina RZ, Fethallah H, Fadoua L (2021) Web service selection and composition based on uncertain quality of service. Concurrency Comput Pract Exp. https://doi.org/10.1002/cpe.6531
Zhang G (2011) Quantum-inspired evolutionary algorithms: a survey and empirical study. J Heuristics 17(3):303–351. https://doi.org/10.1007/s10732-010-9136-0
Mohammed AM, Elhefnawy NA, El-Sherbiny MM, Hadhoud MM (2012) Quantum crossover based quantum genetic algorithm for solving non-linear programming. In: 8th international conference on informatics and systems (INFOS2012), Cairo, Egypt, pp 145–153
SPMF: An open-source data mining library (2020) http://www.philippe-fournier-viger.com/spmf/index.php?link=algorithms.php. Accessed 6 Aug 2020
IBM Synthetic Data Generation (2020) http://www.almaden.ibm.com/software/
projects/hdb/resources.shtml. Accessed 6 Aug 2020
Friedman M (1940) A comparison of alternative tests of significance for the problem of m ranking. Ann Math Stat 11:86–92. https://doi.org/10.1214/AOMS/1177731944
Nemenyi B (1963) Distribution-free multiple comparison. Dissertation, Princeton University.
Demsar J (2006) Statistical comparison of classifiers over multiple datasets. J Mach Learn Res 7:1–30
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Kannimuthu, S., Chakravarthy, D.G. Discovery of Interesting Itemsets for Web Service Composition Using Hybrid Genetic Algorithm. Neural Process Lett 54, 3913–3939 (2022). https://doi.org/10.1007/s11063-022-10793-x
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11063-022-10793-x