Abstract
Evaluation criteria for conversational CBR (CCBR) systems are important to guide development and tuning of new methods, and to enable practitioners to make informed decisions about which methods to use. Traditional criteria for evaluating CCBR performance by precision and efficiency provide useful information, but are limited by their focus on the single point at which a case is selected at the end of the system dialogue, and by their dependence on a model of the user’s case selection criteria. This paper begins by revisiting issues in the evaluation of CCBR systems, arguing for the value of assessing the quality of the intermediate dialogue before case selection. It then proposes an evaluation approach based on rank quality to provide a fuller picture of system performance, and uses an empirical study to illustrate how rank quality illuminates characteristics of similarity assessment strategies for partially specified cases.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bogaerts, S., Leake, D. (2006). What Evaluation Criteria Are Right for CCBR? Considering Rank Quality. In: Roth-Berghofer, T.R., Göker, M.H., Güvenir, H.A. (eds) Advances in Case-Based Reasoning. ECCBR 2006. Lecture Notes in Computer Science, vol 4106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11805816_29
DOI: https://doi.org/10.1007/11805816_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36843-4
Online ISBN: 978-3-540-36846-5
eBook Packages: Computer Science (R0)