Mining interesting knowledge from weblogs: a survey

Published: 01 June 2005

Abstract

Web Usage Mining is that area of Web Mining which deals with the extraction of interesting knowledge from logging information produced by Web servers. In this paper we present a survey of the recent developments in this area that is receiving increasing attention from the Data Mining community.
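To make "logging information produced by Web servers" concrete, here is a minimal sketch of reading one server log entry. It assumes the widely used Common Log Format; the sample line, the field names, and the helper `parse_clf_line` are illustrative and are not taken from the paper.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of typed fields, or None if the line is not valid CLF."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamps look like 10/Oct/2004:13:55:36 +0200
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A dash means the server reported no response size
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

# Hypothetical log entry, made up for illustration
sample = ('192.0.2.10 - - [10/Oct/2004:13:55:36 +0200] '
          '"GET /products/index.html HTTP/1.0" 200 2326')
record = parse_clf_line(sample)
print(record["host"], record["url"], record["status"])
```

Each parsed record carries the requesting host, the time of the request, and the requested URL, which are the raw ingredients a usage-mining pipeline would work from.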

Reviews

Michael G. Murphy

This paper surveys the field of Web usage mining, which is a sub-area of Web mining, which, in turn, is a sub-area of data mining. Web usage mining is the part of Web mining that deals with the extraction of knowledge from server log files. Such Web logs, or weblogs, are mostly textual logs, collected when users access Web servers, and are stored in one of several commonly used formats. (Note that weblogs in this context are not blogs, as that term has come to be known recently.)

The sections of the paper include an introduction, and cover data sources, data preprocessing, knowledge discovery techniques, applications, software support, moving from techniques to applications, privacy issues, future trends, and a brief summary. There are also 112 references. There are no figures, but there is one helpful table that provides references to carefully selected papers, with representative applications, techniques, and data sources.

Data sources for Web usage mining are Web servers, proxy servers, and Web clients. Preprocessing includes data cleaning, identifying and reconstructing users' sessions, retrieving information about page content and structure, and data formatting. Knowledge discovery techniques for research in Web usage mining, as opposed to the statistical analysis typical of commercial applications, focus on association rules, sequential patterns, and clustering.

Since the general goal of Web usage mining is to gather useful information about Web users' navigation patterns, the results produced by mining Web logs can be used to personalize the delivery of Web content, improve user navigation by means of prefetching and caching, improve Web design, and improve customer satisfaction in e-commerce.

Software support has evolved over the last several years, with e-commerce Web usage mining becoming part of integrated customer relationship management (CRM) solutions, simple Web log analyzers for general usage, and an open source tool, the Web utilization miner (WUM), for the research community.

The privacy issue, in general, is still being considered by the Web usage mining community. Future trends appear to be tied to the emergence and proliferation of the semantic Web concept.

The paper serves as a good survey of Web usage mining. It is recommended for anyone wanting to understand the essentials of this rapidly emerging field.

Online Computing Reviews Service
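The preprocessing steps named in the review (data cleaning, then identifying and reconstructing users' sessions) can be sketched as follows. This is a minimal illustration, not the survey's method: it assumes that a user is identified by host address alone, that requests for embedded resources are noise, and that a gap of more than 30 minutes starts a new session. The names `clean`, `sessionize`, `SESSION_TIMEOUT`, and `IGNORED_SUFFIXES`, as well as the sample requests, are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: 30 minutes of inactivity ends a session (a common heuristic)
SESSION_TIMEOUT = timedelta(minutes=30)
# Assumption: requests for embedded resources are not page views
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")

def clean(requests):
    """Data cleaning: drop requests for images, stylesheets, and scripts."""
    return [r for r in requests if not r[2].lower().endswith(IGNORED_SUFFIXES)]

def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (host, timestamp, url) tuples into per-host sessions."""
    by_host = defaultdict(list)
    for host, ts, url in sorted(requests, key=lambda r: (r[0], r[1])):
        by_host[host].append((ts, url))
    sessions = []
    for host, visits in by_host.items():
        current = [visits[0][1]]
        for (prev_ts, _), (ts, url) in zip(visits, visits[1:]):
            if ts - prev_ts > timeout:
                # Gap too long: close the current session and start a new one
                sessions.append((host, current))
                current = []
            current.append(url)
        sessions.append((host, current))
    return sessions

# Hypothetical requests, made up for illustration
t0 = datetime(2004, 10, 10, 13, 0)
requests = [
    ("192.0.2.10", t0, "/index.html"),
    ("192.0.2.10", t0 + timedelta(minutes=1), "/logo.gif"),
    ("192.0.2.10", t0 + timedelta(minutes=5), "/products.html"),
    ("192.0.2.10", t0 + timedelta(hours=2), "/index.html"),
    ("192.0.2.20", t0 + timedelta(minutes=3), "/contact.html"),
]
sessions = sessionize(clean(requests))
for host, pages in sessions:
    print(host, pages)
```

Identifying users by host address is the weakest of these assumptions, since proxies and shared machines merge several users into one address; cookies or client-side tracking would be needed to do better.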

Information

Published In

Data & Knowledge Engineering, Volume 53, Issue 3
June 2005
110 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. machine learning
  2. web mining

Qualifiers

  • Article

Cited By

  • (2023) Popular, but hardly used: Has Google Analytics been to the detriment of Web Analytics? Proceedings of the 15th ACM Web Science Conference 2023, 10.1145/3578503.3583601, (304-311). Online publication date: 30-Apr-2023
  • (2023) IRPDP_HT2: a scalable data pre-processing method in web usage mining using Hadoop MapReduce. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 10.1007/s00500-023-08019-w, 27:12, (7907-7923). Online publication date: 24-Mar-2023
  • (2022) Rough-Set-Based Real-Time Interest Label Extraction over Large-Scale Social Networks. Complexity, 10.1155/2022/2072950, 2022. Online publication date: 1-Jan-2022
  • (2021) An Effective Clustering-Based Web Page Recommendation Framework for E-Commerce Websites. SN Computer Science, 10.1007/s42979-021-00736-z, 2:4. Online publication date: 15-Jun-2021
  • (2021) PKM3: an optimal Markov model for predicting future navigation sequences of the web surfers. Pattern Analysis & Applications, 10.1007/s10044-020-00892-7, 24:1, (263-281). Online publication date: 1-Feb-2021
  • (2019) Discovering communities for web usage mining systems. International Journal of Advanced Intelligence Paradigms, 10.5555/3324436.3324446, 12:3-4, (331-354). Online publication date: 1-Jan-2019
  • (2019) Consumer Behaviour through the Eyes of Neurophysiological Measures. Computational Intelligence and Neuroscience, 10.1155/2019/1976847, 2019. Online publication date: 1-Jan-2019
  • (2018) A MapReduce-Based User Identification Algorithm in Web Usage Mining. International Journal of Information Technology and Web Engineering, 10.4018/IJITWE.2018040102, 13:2, (11-23). Online publication date: 1-Apr-2018
  • (2018) State of the Art Recommendation Approaches. International Journal of Advanced Pervasive and Ubiquitous Computing, 10.4018/IJAPUC.2018010104, 10:1, (51-76). Online publication date: 1-Jan-2018
  • (2018) Weblog Data Structuration. Proceedings of the 20th International Conference on Information Integration and Web-based Applications & Services, 10.1145/3282373.3282379, (263-271). Online publication date: 19-Nov-2018