Abstract
Quality of service (QoS)-based web service selection has been studied in the service computing community for some time. However, the selection process usually ignores the characteristics of the input dataset that the web service will process, even though they may affect the service's QoS values. For example, latency is typically higher on a larger dataset than on a smaller one, and one service may take longer than another to process the same dataset. To address this issue, we take dataset features into consideration in the QoS-based service recommendation process, focusing on data mining services because their QoS values can depend heavily on dataset features. We propose two approaches for data mining service recommendation and compare their performance. In the first approach, we use a meta-learning algorithm to incorporate dataset features into the recommendation process and study different machine learning algorithms (both classification and regression models) as meta-learners for recommending data mining services for a given dataset. We also investigate how the number of dataset features affects the performance of the meta-learners. In the second approach, we propose a novel technique that uses factor analysis for web service recommendation: a decomposition technique identifies latent features of the input dataset, and services are then recommended by exploiting these latent variables. The proposed latent-feature-based approach was shown to be more robust than meta-feature-based recommendation, with an accuracy of 85%.
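To make the first approach more concrete, the following is a minimal sketch of meta-feature-based recommendation, not the authors' implementation. It assumes a hypothetical meta-base that pairs the meta-features of previously processed datasets with the service that performed best on each, and it uses a plain k-nearest-neighbour vote as the meta-learner. The function names, the three meta-features, the service labels, and the meta-base values are all illustrative assumptions.

```python
import math


def meta_features(rows):
    """Return three simple meta-features of a numeric dataset (list of rows).

    Illustrative choice only: log10 of the row count, the column count,
    and the mean per-column (population) standard deviation.
    """
    n_rows = len(rows)
    n_cols = len(rows[0])
    stds = []
    for j in range(n_cols):
        col = [r[j] for r in rows]
        mean = sum(col) / n_rows
        stds.append(math.sqrt(sum((x - mean) ** 2 for x in col) / n_rows))
    return [math.log10(n_rows), float(n_cols), sum(stds) / n_cols]


def recommend(meta_base, query_features, k=3):
    """Majority vote over the k nearest past datasets in meta-feature space.

    meta_base is a list of (features, best_service) pairs. Distances are
    unscaled Euclidean distances, which a real system would likely normalize.
    """
    ranked = sorted(meta_base, key=lambda e: math.dist(e[0], query_features))
    votes = {}
    for _, best_service in ranked[:k]:
        votes[best_service] = votes.get(best_service, 0) + 1
    return max(votes, key=votes.get)


# Hypothetical meta-base: (meta-features, service with the best observed QoS).
META_BASE = [
    ([2.0, 4.0, 1.0], "service_A"),
    ([2.3, 5.0, 1.2], "service_A"),
    ([5.0, 50.0, 3.0], "service_B"),
    ([5.5, 60.0, 2.5], "service_B"),
    ([4.8, 40.0, 3.5], "service_B"),
]

if __name__ == "__main__":
    # A small synthetic dataset: 100 rows, 4 numeric columns.
    dataset = [[float(i), float(2 * i), float(i % 3), 1.0] for i in range(100)]
    print(recommend(META_BASE, meta_features(dataset)))
```

In this sketch the recommendation is a classification over service labels. A regression-style meta-learner would instead predict a QoS value per service from the same meta-features and rank the services by the predicted value.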
Cite this article
Alghofaily, B.I., Ding, C. Data mining service recommendation based on dataset features. SOCA 13, 261–277 (2019). https://doi.org/10.1007/s11761-019-00272-y
DOI: https://doi.org/10.1007/s11761-019-00272-y