
DOI: 10.1145/3643659.3643927
Research article | Open access

Automated Boundary Identification for Machine Learning Classifiers

Published: 10 September 2024

Abstract

AI and Machine Learning (ML) models are increasingly used as (critical) components in software systems, even safety-critical ones. This puts new demands on how thoroughly we need to test them and requires new and expanded testing methods. Recent boundary-value identification methods have been shown to automatically find boundary candidates for traditional, non-ML software: pairs of nearby inputs that produce (highly) differing outputs. These candidates can be shown to developers and testers, who can judge whether the boundary is where it is supposed to be.
Here, we explore how this approach can identify the decision boundaries of ML classification models. The resulting ML Boundary Spanning Algorithm (ML-BSA) is a search-based method that extends previous work in two main ways. We empirically evaluate ML-BSA on seven ML datasets and show that it better spans, and thus better identifies, the entire classification boundary or boundaries. One of the extensions, a diversity objective, spreads the boundary pairs more broadly and evenly. This, we argue, can help testers and developers judge where a classification boundary actually is, compare it to expectations, and then focus further testing, validation, and even further training and model refinement on the parts of the boundary where behaviour is not ideal.
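To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' ML-BSA implementation and not taken from the paper): it perturbs inputs of a toy scikit-learn classifier to find nearby pairs whose predicted classes differ, then greedily keeps a spread-out subset of those pairs as a crude stand-in for the diversity objective mentioned above. The function names, the greedy max-min selection, and all parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy setup: any trained classifier stands in for the model under test.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def boundary_pairs(clf, X, n_candidates=2000, step=0.05):
    """Sample points, perturb each slightly, and keep the (a, b) pairs whose
    predicted classes differ: nearby inputs with differing outputs are
    boundary candidates."""
    a = X[rng.integers(0, len(X), size=n_candidates)]
    b = a + rng.normal(scale=step, size=a.shape)
    differ = clf.predict(a) != clf.predict(b)
    return list(zip(a[differ], b[differ]))

def diverse_subset(pairs, k=20):
    """Greedy max-min selection on pair midpoints: a simple way to spread the
    kept pairs out so they span more of the boundary."""
    if not pairs:
        return []
    mids = np.array([(a + b) / 2 for a, b in pairs])
    chosen = [0]
    while len(chosen) < min(k, len(pairs)):
        # Distance from every midpoint to its nearest already-chosen midpoint.
        dist_to_chosen = np.linalg.norm(
            mids[:, None, :] - mids[chosen][None, :, :], axis=-1
        ).min(axis=1)
        chosen.append(int(dist_to_chosen.argmax()))
    return [pairs[i] for i in chosen]

candidates = boundary_pairs(clf, X)
spanning = diverse_subset(candidates)
print(f"{len(candidates)} boundary candidates found, {len(spanning)} kept for human review")
```

The actual ML-BSA is a search-based algorithm rather than random perturbation; the sketch only illustrates what a boundary candidate pair and a boundary-spanning selection of such pairs look like.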

      Published In

      SBFT '24: Proceedings of the 17th ACM/IEEE International Workshop on Search-Based and Fuzz Testing
      April 2024
      84 pages
ISBN: 9798400705625
DOI: 10.1145/3643659
This work is licensed under a Creative Commons Attribution 4.0 International License.

      In-Cooperation

      • Faculty of Engineering of University of Porto

      Publisher

      Association for Computing Machinery

      New York, NY, United States
