Abstract
A system of nested dichotomies is a hierarchical decomposition of a multi-class problem with c classes into c − 1 two-class problems and can be represented as a tree structure. Ensembles of randomly generated nested dichotomies have proven to be an effective approach to multi-class learning problems [1]. However, sampling trees with equal probability for each tree means that the depth of a tree is limited only by the number of classes, and very unbalanced trees can negatively affect runtime. In this paper we investigate two approaches to building balanced nested dichotomies (class-balanced nested dichotomies and data-balanced nested dichotomies) and evaluate them in the same ensemble setting. Using C4.5 decision trees as the base models, we show that both approaches can reduce runtime with little or no effect on accuracy, especially on problems with many classes. We also investigate the effect of caching models when building ensembles of nested dichotomies.
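To make the class-balanced scheme concrete, the following is a minimal sketch (in Python, not the authors' C4.5/Weka implementation) of how a single class-balanced nested dichotomy can be sampled: at every internal node the class set is shuffled and split into two halves whose sizes differ by at most one, so the depth of the tree grows only logarithmically in the number of classes c. The function name and the tuple representation of the tree are illustrative assumptions, not part of the paper.

import random

def sample_class_balanced_dichotomy(classes, rng=random):
    """Sample one class-balanced nested dichotomy over `classes`.

    Each internal node splits its class set into two halves whose sizes
    differ by at most one, giving a tree of depth O(log c) with exactly
    c - 1 internal nodes (two-class problems). Leaves are class labels;
    internal nodes are (left_subtree, right_subtree) tuples.
    """
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)                 # randomize which classes end up on each side
    mid = len(classes) // 2              # near-equal split keeps the tree balanced
    return (sample_class_balanced_dichotomy(classes[:mid], rng),
            sample_class_balanced_dichotomy(classes[mid:], rng))

# Example: one random class-balanced dichotomy over five classes.
# A full ensemble would repeat this sampling several times and train a
# two-class base model (e.g. a C4.5 tree) at every internal node.
print(sample_class_balanced_dichotomy(["a", "b", "c", "d", "e"]))

A data-balanced dichotomy is built analogously, except that the split at each node is chosen so that the two sides receive roughly equal numbers of training instances rather than equal numbers of classes.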
References
Frank, E., Kramer, S.: Ensembles of nested dichotomies for multi-class problems. In: Proc. Int. Conf. on Machine Learning, pp. 305–312. ACM Press, New York (2004)
Dietterich, T., Bakiri, G.: Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research 2, 263–286 (1995)
Fürnkranz, J.: Round robin classification. Journal of Machine Learning Research 2, 721–747 (2002)
Fox, J.: Applied Regression Analysis, Linear Models, and Related Methods. Sage, Thousand Oaks (1997)
Blake, C., Merz, C.: UCI repository of machine learning databases. University of California, Irvine, Dept. of Inf. and Computer Science (1998), www.ics.uci.edu/~mlearn/MLRepository.html
Quinlan, J.: C4.5: Programs for Machine Learning. Morgan Kaufmann, Los Altos (1992)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (2000)
Nadeau, C., Bengio, Y.: Inference for the generalization error. Machine Learning 52, 239–281 (2003)
Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning 40, 139–157 (1998)