Abstract
Case-based reasoning (CBR) aims to use past experience to solve new problems. A strong requirement for its application is that an extensive experience base exists, one that provides statistically significant justification for new applications. Such extensive experience bases have been rare, confining most CBR applications to small-scale problems involving a single user or a few users, or even to toy problems. In this work, we present an application of CBR in the domain of web document prediction and retrieval, whereby a server-side application can predict, with high accuracy and coverage, a user’s next request for hypertext documents based on past requests. An application program can then use this prediction knowledge to prefetch or presend web objects to reduce latency and network load. Through this application, we demonstrate the feasibility of applying CBR in the web-document retrieval context, exposing the vast potential of web-log files that contain the document retrieval experiences of millions of users. In this framework, a CBR system is embedded within an overall web-server application. A novelty of the work is that data mining and case-based reasoning are combined seamlessly, allowing cases to be mined efficiently. In addition, we developed techniques that allow different case bases to be combined to yield an overall case base of higher quality than any of the individual ones. We validate our work through experiments using realistic, large-scale web logs.
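The abstract does not spell out how cases are mined or applied, so the following is only a minimal sketch of the general idea, under stated assumptions: each case maps a window of recent requests to the most frequent next request seen in the logs, and prediction prefers the longest matching window. The function names (`mine_cases`, `predict_next`), the thresholds, and the sample URLs are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def mine_cases(sessions, max_len=2, min_support=2):
    """Mine cases from user sessions (hypothetical helper, not the paper's method).

    Each case maps a window of up to `max_len` consecutive requests to the
    most frequent request that followed it, together with a confidence value.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(1, len(session)):
            for k in range(1, max_len + 1):
                if i - k < 0:
                    break
                window = tuple(session[i - k:i])
                counts[window][session[i]] += 1

    cases = {}
    for window, successors in counts.items():
        total = sum(successors.values())
        if total < min_support:  # drop windows seen too rarely (assumed threshold)
            continue
        next_doc, hits = successors.most_common(1)[0]
        cases[window] = (next_doc, hits / total)
    return cases


def predict_next(cases, recent, max_len=2, min_confidence=0.5):
    """Return the predicted next request, or None if no case is confident enough."""
    for k in range(min(max_len, len(recent)), 0, -1):  # longest window first
        case = cases.get(tuple(recent[-k:]))
        if case is not None and case[1] >= min_confidence:
            return case[0]
    return None


# Toy sessions standing in for parsed web-log entries (made-up data).
sessions = [
    ["/a", "/b", "/c"],
    ["/a", "/b", "/c"],
    ["/x", "/b", "/d"],
]
cases = mine_cases(sessions)
prediction = predict_next(cases, ["/a", "/b"])
```

A server could use such a prediction to prefetch or presend the document when the confidence clears the chosen threshold; the thresholds trade accuracy against coverage.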
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Q., Li, I.TY., Zhang, H.H. (2001). Mining High-Quality Cases for Hypertext Prediction and Prefetching. In: Aha, D.W., Watson, I. (eds) Case-Based Reasoning Research and Development. ICCBR 2001. Lecture Notes in Computer Science, vol 2080. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44593-5_53
DOI: https://doi.org/10.1007/3-540-44593-5_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42358-4
Online ISBN: 978-3-540-44593-7
eBook Packages: Springer Book Archive