A Simple Lexicographic Ranker and Probability Estimator

Peter Flach¹ &
Edson Takashi Matsubara²

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4701)

Included in the following conference series:

European Conference on Machine Learning


Abstract

Given a binary classification task, a ranker sorts a set of instances from highest to lowest expectation that the instance is positive. We propose a lexicographic ranker, LexRank, whose rankings are derived not from scores, but from a simple ranking of attribute values obtained from the training data. When using the odds ratio to rank the attribute values we obtain a restricted version of the naive Bayes ranker. We systematically develop the relationships and differences between classification, ranking, and probability estimation, which leads to a novel connection between the Brier score and ROC curves. Combining LexRank with isotonic regression, which derives probability estimates from the ROC convex hull, results in the lexicographic probability estimator LexProb. Both LexRank and LexProb are empirically evaluated on a range of data sets, and shown to be highly effective.
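The idea of ranking instances from a ranking of attribute values, rather than from numeric scores, can be illustrated with a minimal sketch. The abstract does not specify the details, so several choices here are assumptions made only for illustration: attribute values are ordered by a Laplace-smoothed ratio of positive to negative training counts (a stand-in for the odds ratio), attributes are prioritised by the spread between their best and worst value, and unseen values receive neutral odds of 1.0. The function names `fit_lex_ranker`, `lex_key`, and `rank` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def fit_lex_ranker(rows, labels):
    """Learn an attribute order and per-value odds from categorical training data.

    rows   -- list of equal-length tuples of categorical attribute values
    labels -- list of 0/1 class labels (1 = positive)
    """
    n_attrs = len(rows[0])
    # counts[a][v] = [number of negatives, number of positives] for value v of attribute a
    counts = [defaultdict(lambda: [0, 0]) for _ in range(n_attrs)]
    for row, y in zip(rows, labels):
        for a, v in enumerate(row):
            counts[a][v][y] += 1

    # Assumption: Laplace-smoothed positive/negative ratio ranks the values.
    odds = [
        {v: (c[1] + 1) / (c[0] + 1) for v, c in counts[a].items()}
        for a in range(n_attrs)
    ]

    # Assumption: attributes whose values differ most in odds are compared first.
    def spread(a):
        return max(odds[a].values()) / min(odds[a].values())

    order = sorted(range(n_attrs), key=spread, reverse=True)
    return order, odds


def lex_key(row, order, odds):
    """Sort key: odds of each attribute value, in attribute-priority order."""
    # Values never seen in training get neutral odds of 1.0 (assumption).
    return tuple(odds[a].get(row[a], 1.0) for a in order)


def rank(rows, order, odds):
    """Sort instances from most to least likely positive, lexicographically."""
    return sorted(rows, key=lambda r: lex_key(r, order, odds), reverse=True)


# Toy data, invented for this example.
train_rows = [("a", "x"), ("a", "y"), ("a", "x"), ("b", "y"), ("b", "x"), ("b", "x")]
train_labels = [1, 1, 1, 0, 0, 1]

order, odds = fit_lex_ranker(train_rows, train_labels)
ranked = rank([("b", "y"), ("a", "y"), ("b", "x"), ("a", "x")], order, odds)
print(order)
print(ranked)
```

Because the sort key is a tuple, Python compares instances on the highest-priority attribute first and consults later attributes only to break ties, which is what makes the ranking lexicographic. Turning such a ranking into probability estimates, as the abstract describes for LexProb, would need a further calibration step such as isotonic regression, which this sketch does not attempt.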



Author information

Authors and Affiliations

Department of Computer Science, University of Bristol, United Kingdom
Peter Flach
Instituto de Ciências e Matemáticas e de Computação, Universidade de São Paulo,
Edson Takashi Matsubara


Editor information

Joost N. Kok, Jacek Koronacki, Ramon Lopez de Mantaras, Stan Matwin, Dunja Mladenič, Andrzej Skowron


About this paper

Cite this paper

Flach, P., Matsubara, E.T. (2007). A Simple Lexicographic Ranker and Probability Estimator. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds) Machine Learning: ECML 2007. ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74958-5_55


DOI: https://doi.org/10.1007/978-3-540-74958-5_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74957-8
Online ISBN: 978-3-540-74958-5
eBook Packages: Computer Science, Computer Science (R0)
