Research Article | Open Access

Relative Keys: Putting Feature Explanation into Context

Published: 26 March 2024

Abstract

Formal feature explanations strictly maintain perfect conformity but are intractable to compute, while heuristic methods are much faster but can produce problematic explanations due to the lack of conformity guarantees. We propose relative keys, which offer the best of both worlds. Relative keys associate feature explanations with a set of instances as context and guarantee perfect conformity over that context, as formal explanations do, while being orders of magnitude faster and working for complex black-box models. Based on relative keys, we develop CCE, a prototype that computes explanations with provably bounded conformity and succinctness without accessing the models. We show that computing the most succinct relative keys is NP-complete and develop various algorithms for it under the batch and online models. Using 9 real-life datasets and 7 state-of-the-art explanation methods, we demonstrate that CCE explains cases where existing methods cannot, and provides more succinct explanations with perfect conformity for the cases they can handle; moreover, it is two orders of magnitude faster.
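To make the idea concrete, the sketch below reads a relative key as a small set of features that, over a given context of instances, separates the explained instance from every context instance that received a different prediction; under that reading, finding the most succinct key is a set-cover-style problem, which is consistent with the NP-completeness claim and suggests a greedy heuristic. This is an illustrative, hypothetical sketch (names such as greedy_relative_key are ours), not the authors' CCE algorithm, and it assumes the context predictions are already cached so the model is never queried.

```python
# Hypothetical sketch (not the authors' CCE implementation): compute a small
# relative key for one instance with the greedy set-cover heuristic.
# A relative key here is a feature subset X such that every context instance
# agreeing with the target on X receives the same prediction as the target.

from typing import Dict, Hashable, List


def greedy_relative_key(
    target: Dict[str, Hashable],          # the instance being explained
    target_pred: Hashable,                # model prediction for the target
    context: List[Dict[str, Hashable]],   # context instances (same schema)
    context_preds: List[Hashable],        # cached predictions for the context
) -> List[str]:
    # Instances that must be separated from the target: those whose cached
    # prediction differs. If any of them agreed with the target on the chosen
    # features, conformity over the context would be violated.
    conflicts = [x for x, y in zip(context, context_preds) if y != target_pred]

    key: List[str] = []
    uncovered = list(range(len(conflicts)))
    while uncovered:
        # Greedy step: pick the feature whose value differs from the target's
        # on the largest number of still-uncovered conflicting instances.
        best_f, best_cov = None, []
        for f in target:
            cov = [i for i in uncovered if conflicts[i][f] != target[f]]
            if len(cov) > len(best_cov):
                best_f, best_cov = f, cov
        if best_f is None:
            # Some conflicting instance is identical to the target on every
            # feature, so no relative key exists over this context.
            raise ValueError("no relative key exists for this context")
        key.append(best_f)
        uncovered = [i for i in uncovered if i not in best_cov]
    return key
```

Under these assumptions the greedy choice inherits the usual logarithmic approximation guarantee of greedy set cover; the paper's batch and online algorithms, and its bounded conformity and succinctness guarantees, go well beyond this sketch.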

Supplemental Material

MP4 File
Presentation video


Cited By

  • (2024) Counterfactual Explanation at Will, with Zero Privacy Leakage. Proceedings of the ACM on Management of Data 2(3), 1-29. DOI: 10.1145/3654933. Online publication date: 30 May 2024.




Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 1 (SIGMOD)
February 2024
1874 pages
EISSN: 2836-6573
DOI: 10.1145/3654807
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 March 2024
Published in PACMMOD Volume 2, Issue 1


Author Tags

  1. database for explainable machine learning
  2. in-database explanation
  3. model explainability

Qualifiers

  • Research-article


Article Metrics

  • Downloads (last 12 months): 229
  • Downloads (last 6 weeks): 36

Reflects downloads up to 19 Sep 2024
