DOI: 10.5555/3625834.3625854

Efficient learning of minimax risk classifiers in high dimensions

Published: 31 July 2023

Abstract

High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be in the tens of thousands. In such scenarios, the large number of features often leads to inefficient learning. Constraint generation methods have recently enabled efficient learning of L1-regularized support vector machines (SVMs). In this paper, we leverage such methods to obtain an efficient learning algorithm for the recently proposed minimax risk classifiers (MRCs). The proposed iterative algorithm also provides a sequence of worst-case error probabilities and performs feature selection. Experiments on multiple high-dimensional datasets show that the proposed algorithm is efficient in such scenarios. In addition, the worst-case error probability provides useful information about the classifier's performance, and the features selected by the algorithm are competitive with the state of the art.
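The core computational idea the abstract refers to, constraint generation, can be summarized as one loop: solve a reduced problem over a small active set of constraints, scan the full problem for constraints the current solution violates, add the most violated one, and repeat until none remain. The sketch below illustrates that generic loop for a linear program with many constraints; it is not the paper's MRC-specific algorithm, and the function name `constraint_generation`, the initial active-set size, the box bound, and the tolerance are all illustrative assumptions.

```python
# Minimal sketch of constraint generation for a linear program
#     minimize  c^T x   subject to  A x <= b
# with many constraints. This is a generic illustration of the scheme,
# NOT the paper's MRC-specific algorithm; parameter choices below are
# illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

def constraint_generation(c, A, b, n_init=10, tol=1e-8, max_iter=100, box=1e3):
    m, n = A.shape
    active = list(range(min(n_init, m)))      # start from a few constraints
    bounds = [(-box, box)] * n                # box bound keeps reduced LPs bounded
    x = np.zeros(n)
    for _ in range(max_iter):
        # Solve the reduced problem restricted to the active constraints.
        res = linprog(c, A_ub=A[active], b_ub=b[active],
                      bounds=bounds, method="highs")
        x = res.x
        # Check the full constraint set and pick the most violated row.
        violation = A @ x - b
        worst = int(np.argmax(violation))
        if violation[worst] <= tol:           # feasible for the full problem,
            return x, active                  # hence optimal for it as well
        active.append(worst)                  # generate (add) that constraint
    return x, active

# Toy usage: 200 random constraints in 5 variables; the solution is
# typically certified using only a small fraction of them.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5))
b = rng.uniform(1.0, 2.0, size=200)
c = rng.normal(size=5)
x_opt, used = constraint_generation(c, A, b)
print(f"solved with {len(used)} of {A.shape[0]} constraints")
```

For L1-regularized SVMs and, per the abstract, for MRCs, the same scheme is applied to the linear-programming formulation of the learning problem, so each iteration only handles a small active set. A plausible reading of the abstract, hedged since the exact construction is not shown here, is that the features tied to the final active set are the ones the algorithm selects.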


Published In

UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
July 2023
2617 pages

Publisher

JMLR.org

