Abstract
In data mining, the choice of classifier used to estimate the value of an unknown attribute for a new instance has an essential impact on the quality of the classification result. Promising approaches that use parallel and distributed computing have recently been presented. In this paper, we consider an approach that uses classifiers trained in parallel on a number of data subsets, as in the arbiter meta-learning technique. We propose collecting information about the performance of the base classifiers and arbiters during the learning phase, and using this information during the application phase to select the best classifier dynamically. We evaluate our technique and compare it with simple arbiter meta-learning on selected data sets from the UCI machine learning repository. The results show that our dynamic meta-learning technique significantly outperforms arbiter meta-learning in some cases, but further in-depth analysis is needed to draw general conclusions.
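The general idea can be illustrated with a minimal sketch. This is not the implementation evaluated in the paper; it rests on several assumptions made only for illustration. The learner `NearestCentroid`, the functions `train_with_arbiter` and `predict_dynamic`, the single arbiter trained on instances where the base classifiers disagree, the k-nearest-neighbour estimate of local accuracy, and the toy data are all hypothetical. For brevity, performance is also recorded on the training instances themselves, whereas a careful experiment would use held-out or cross-validated predictions.

```python
import math
from collections import Counter


class NearestCentroid:
    """Tiny stand-in learner: predicts the class whose mean vector is closest."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: math.dist(x, self.centroids[c]))


def train_with_arbiter(X, y, n_subsets=2):
    """Learning phase: train the classifiers and record their performance."""
    # Base classifiers are trained on disjoint subsets (this step could run in parallel).
    parts = [(X[i::n_subsets], y[i::n_subsets]) for i in range(n_subsets)]
    models = [NearestCentroid().fit(px, py) for px, py in parts]
    # Assumed arbiter rule: train on instances where the base classifiers disagree.
    hard = [(x, t) for x, t in zip(X, y) if len({m.predict(x) for m in models}) > 1]
    if hard:
        ax, ay = zip(*hard)
        models.append(NearestCentroid().fit(list(ax), list(ay)))
    # Performance record per instance and classifier: 1 = correct, 0 = wrong.
    record = [[int(m.predict(x) == t) for m in models] for x, t in zip(X, y)]
    return models, record


def predict_dynamic(x, X, models, record, k=3):
    """Application phase: use the classifier that was most accurate near x."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))[:k]
    local = [sum(record[i][j] for i in nearest) for j in range(len(models))]
    best = max(range(len(models)), key=local.__getitem__)
    return models[best].predict(x), best


# Toy data (hypothetical): two well-separated clusters.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
models, record = train_with_arbiter(X, y)
label, chosen = predict_dynamic([0.2, 0.3], X, models, record)
print(label, chosen)
```

The selection step is kept separate from training so that the performance record gathered during learning is the only information the application phase needs besides the trained classifiers.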
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsymbal, A., Puuronen, S., Terziyan, V. (1999). Arbiter Meta-Learning with Dynamic Selection of Classifiers and its Experimental Investigation. In: Eder, J., Rozman, I., Welzer, T. (eds) Advances in Databases and Information Systems. ADBIS 1999. Lecture Notes in Computer Science, vol 1691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48252-0_16
DOI: https://doi.org/10.1007/3-540-48252-0_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66485-7
Online ISBN: 978-3-540-48252-9
eBook Packages: Springer Book Archive