DOI: 10.1145/3637528.3671827

Efficient Decision Rule List Learning via Unified Sequence Submodular Optimization

Published: 24 August 2024

Abstract

Interpretable models are crucial in many high-stakes decision-making applications. In this paper, we focus on learning a decision rule list for binary and multi-class classification. Unlike rule set learning, learning an optimal rule list involves not only selecting a set of rules but also determining their order. In addition, many existing algorithms rely on rule pre-mining to handle large-scale, high-dimensional data, which leads to a suboptimal rule list model and degrades its generalization accuracy and interpretability. We approach rule list learning from a sequence submodularity perspective: we view the rule list as a sequence, define a cover set for each rule, and formulate a sequence function that combines model complexity and classification accuracy. Exploiting its appealing sequence submodular property, we propose a general distorted greedy insert algorithm under the Minorization-Maximization (MM) framework, which gradually inserts the rule with the highest insertion gain into the rule list. Rule generation is treated as a subproblem, allowing our method to learn the rule list in a unified framework that avoids rule pre-mining. We further provide a theoretical lower bound for our greedy insert algorithm in rule list learning. Experimental results show that our algorithm achieves better accuracy and interpretability than state-of-the-art rule learning methods; in particular, it scales well to large-scale datasets, especially high-dimensional data.
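The abstract only sketches the greedy insert idea, so the following is a minimal illustrative sketch, not the paper's algorithm: a greedy insertion loop over an ordered rule list, using a toy objective f(list) = accuracy − λ·|list|. The dataset, the rules, the penalty λ, and the objective are all hypothetical stand-ins; the paper's actual sequence function, distortion schedule, and rule-generation subproblem are more involved.

```python
# Hedged sketch of a greedy-insert loop for rule-list learning.
# Everything below (DATA, RULES, DEFAULT, LAM, the objective) is an
# illustrative assumption, not taken from the paper.
from itertools import product

# Toy binary dataset: (feature dict, label) pairs.
DATA = [
    ({"x1": 1, "x2": 0}, 1),
    ({"x1": 1, "x2": 1}, 1),
    ({"x1": 0, "x2": 1}, 0),
    ({"x1": 0, "x2": 0}, 0),
]

# A rule is ((feature, value), predicted_label): an equality test.
RULES = [
    (("x1", 1), 1),
    (("x2", 1), 0),
    (("x1", 0), 0),
]

DEFAULT = 0   # default prediction when no rule fires
LAM = 0.05    # per-rule complexity penalty


def predict(rule_list, sample):
    """The first matching rule in the ordered list decides."""
    for (feat, val), label in rule_list:
        if sample[feat] == val:
            return label
    return DEFAULT


def objective(rule_list):
    """Toy sequence function: accuracy minus a complexity penalty."""
    acc = sum(predict(rule_list, x) == y for x, y in DATA) / len(DATA)
    return acc - LAM * len(rule_list)


def greedy_insert(rules):
    """Repeatedly insert the (rule, position) pair with the highest gain."""
    rule_list, remaining = [], list(rules)
    while remaining:
        base = objective(rule_list)
        best = None  # (gain, position, rule)
        for rule, pos in product(remaining, range(len(rule_list) + 1)):
            cand = rule_list[:pos] + [rule] + rule_list[pos:]
            gain = objective(cand) - base
            if best is None or gain > best[0]:
                best = (gain, pos, rule)
        if best is None or best[0] <= 0:
            break  # no insertion improves the objective
        _, pos, rule = best
        rule_list.insert(pos, rule)
        remaining.remove(rule)
    return rule_list


if __name__ == "__main__":
    learned = greedy_insert(RULES)
    print(learned, objective(learned))
```

On this toy data the loop inserts the single rule "x1 = 1 → predict 1" and then stops, since any further insertion only adds complexity without improving accuracy. The key difference from rule *set* learning is visible in the inner loop: each candidate rule is evaluated at every insertion position, because order matters.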

Supplemental Material

MOV file: promotional video for this paper.



Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024
6901 pages
ISBN:9798400704901
DOI:10.1145/3637528
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. interpretability
  2. rule list
  3. sequence submodular optimization

Qualifiers

  • Research-article

Conference

KDD '24
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Total citations: 0
  • Total downloads: 143
  • Downloads (last 12 months): 143
  • Downloads (last 6 weeks): 13

Reflects downloads up to 29 Jan 2025
