DOI: 10.1145/3637528.3671827

Efficient Decision Rule List Learning via Unified Sequence Submodular Optimization

Published: 24 August 2024

Abstract

Interpretable models are crucial in many high-stakes decision-making applications. In this paper, we focus on learning a decision rule list for binary and multi-class classification. Unlike rule set learning, learning an optimal rule list involves not only selecting a set of rules but also determining their order. In addition, many existing algorithms rely on rule pre-mining to handle large-scale, high-dimensional data, which leads to a suboptimal rule list model and degrades its generalization accuracy and interpretability. We approach rule list learning from a sequence submodularity perspective: we view the rule list as a sequence, define a cover set for each rule, and formulate a sequence function that combines model complexity and classification accuracy. Exploiting its appealing sequence submodular property, we propose a general distorted greedy insert algorithm under the Minorization-Maximization (MM) framework, which gradually inserts the rule with the highest insertion gain into the rule list. Rule generation is treated as a subproblem, allowing our method to learn the rule list in a unified framework that avoids rule pre-mining. We further provide a theoretical lower bound for our greedy insert algorithm in rule list learning. Experimental results show that our algorithm achieves better accuracy and interpretability than state-of-the-art rule learning methods; in particular, it scales well to large-scale datasets, especially high-dimensional data.
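The abstract only sketches the greedy insert idea, so the following is a minimal illustrative sketch, not the paper's algorithm: a greedy insertion loop over an ordered rule list, using a toy objective f(list) = accuracy − λ·|list|. The dataset, the rules, the penalty λ, and the objective are all hypothetical stand-ins; the paper's actual sequence function, distortion schedule, and rule-generation subproblem are more involved.

```python
# Hedged sketch of a greedy-insert loop for rule-list learning.
# Everything below (DATA, RULES, DEFAULT, LAM, the objective) is an
# illustrative assumption, not taken from the paper.
from itertools import product

# Toy binary dataset: (feature dict, label) pairs.
DATA = [
    ({"x1": 1, "x2": 0}, 1),
    ({"x1": 1, "x2": 1}, 1),
    ({"x1": 0, "x2": 1}, 0),
    ({"x1": 0, "x2": 0}, 0),
]

# A rule is ((feature, value), predicted_label): an equality test.
RULES = [
    (("x1", 1), 1),
    (("x2", 1), 0),
    (("x1", 0), 0),
]

DEFAULT = 0   # default prediction when no rule fires
LAM = 0.05    # per-rule complexity penalty


def predict(rule_list, sample):
    """The first matching rule in the ordered list decides."""
    for (feat, val), label in rule_list:
        if sample[feat] == val:
            return label
    return DEFAULT


def objective(rule_list):
    """Toy sequence function: accuracy minus a complexity penalty."""
    acc = sum(predict(rule_list, x) == y for x, y in DATA) / len(DATA)
    return acc - LAM * len(rule_list)


def greedy_insert(rules):
    """Repeatedly insert the (rule, position) pair with the highest gain."""
    rule_list, remaining = [], list(rules)
    while remaining:
        base = objective(rule_list)
        best = None  # (gain, position, rule)
        for rule, pos in product(remaining, range(len(rule_list) + 1)):
            cand = rule_list[:pos] + [rule] + rule_list[pos:]
            gain = objective(cand) - base
            if best is None or gain > best[0]:
                best = (gain, pos, rule)
        if best is None or best[0] <= 0:
            break  # no insertion improves the objective
        _, pos, rule = best
        rule_list.insert(pos, rule)
        remaining.remove(rule)
    return rule_list


if __name__ == "__main__":
    learned = greedy_insert(RULES)
    print(learned, objective(learned))
```

On this toy data the loop inserts the single rule "x1 = 1 → predict 1" and then stops, since any further insertion only adds complexity without improving accuracy. The key difference from rule *set* learning is visible in the inner loop: each candidate rule is evaluated at every insertion position, because order matters.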

Supplemental Material

MOV file: promotional video for this paper.



Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024
6901 pages
ISBN:9798400704901
DOI:10.1145/3637528
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. interpretability
  2. rule list
  3. sequence submodular optimization

Qualifiers

  • Research-article

Conference

KDD '24
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Total citations: 0
  • Total downloads: 143
  • Downloads (last 12 months): 143
  • Downloads (last 6 weeks): 13

Reflects downloads up to 29 Jan 2025
