Abstract
This paper introduces a new method for the session construction problem, which is the first main step of the Web usage mining process. The proposed method defines a user session as a set of navigation paths in the Web graph and produces the complete set of all possible maximal paths. It can generate navigation paths that cannot be extracted by previous greedy approaches. Experiments on real data show that the new technique outperforms previous approaches in Web usage mining applications such as next-page prediction. Our analysis of Web user sessions also yields an important observation: user sessions contain navigation graphs in which only a small number of nodes are points where users branch their navigation into multiple paths.
Code Availability
The source code and the datasets are available at the following links:
Notes
In this context, the subsequence relation is equivalent to the substring relation.
The ⊔ operation denotes sequence concatenation, which is the same as the string concatenation operator.
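The two notes above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the names `concat`, `is_contiguous_subsequence`, and `maximal_only` are hypothetical, and "maximal" is read here, as an assumption, to mean a path that is not a contiguous subsequence (substring) of any other path in the same set.

```python
from typing import Iterable, List, Sequence, Tuple

Path = Tuple[str, ...]  # a navigation path as an ordered tuple of page ids


def concat(a: Sequence[str], b: Sequence[str]) -> Path:
    """Sequence concatenation (the ⊔ operation), analogous to string concatenation."""
    return tuple(a) + tuple(b)


def is_contiguous_subsequence(short: Sequence[str], long: Sequence[str]) -> bool:
    """True if `short` occurs in `long` as a contiguous run (the substring relation)."""
    short, long = tuple(short), tuple(long)
    n, m = len(short), len(long)
    return any(long[i:i + n] == short for i in range(m - n + 1))


def maximal_only(paths: Iterable[Sequence[str]]) -> List[Path]:
    """Keep paths that are not a substring of any other path in the set.

    Assumption: this is one illustrative reading of "maximal",
    not the session construction method proposed in the paper.
    """
    unique = list(dict.fromkeys(tuple(p) for p in paths))  # dedupe, keep order
    return [
        p for p in unique
        if not any(p != q and is_contiguous_subsequence(p, q) for q in unique)
    ]
```

For example, under this reading the path `("A", "B")` would be dropped from a set that also contains `("A", "B", "C")`, because the former is a substring of the latter.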
Acknowledgements
The authors would like to thank Arzu Bayir for proofreading the paper and reviewing the format of the figures.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The work was done while the author was employed at Microsoft. The author is currently with Meta Platforms Inc.
This article belongs to the Topical Collection: Special Issue on Computational Aspects of Network Science
Guest Editors: Apostolos N. Papadopoulos and Richard Chbeir
Cite this article
Bayir, M.A., Toroslu, I.H. Maximal paths recipe for constructing Web user sessions. World Wide Web 25, 2455–2485 (2022). https://doi.org/10.1007/s11280-022-01024-3
DOI: https://doi.org/10.1007/s11280-022-01024-3