MLGuard: Defend Your Machine Learning Model!

Published: 04 December 2023

Abstract

Machine Learning (ML) is used in critical, highly regulated, and high-stakes fields such as finance, medicine, and transportation. The correctness of these ML applications is important for human safety and economic benefit. Progress has been made on improving the testing and monitoring of ML. However, these approaches do not provide i) pre/post conditions to handle uncertainty, ii) a way to define corrective actions based on probabilistic outcomes, or iii) continual verification during system operation. In this paper, we propose MLGuard, a new approach to specify contracts for ML applications. Our approach consists of a) an ML contract specification defining pre/post conditions, invariants, and altering behaviours, b) generated validation models that determine the probability of contract violation, and c) an ML wrapper generator to enforce the contract and respond to violations. Our work is intended to provide the overarching framework required for building ML applications and monitoring their safety.
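
The three components in the abstract map naturally onto a design-by-contract wrapper around a model's predict call. The sketch below is a minimal illustration of that idea, not the paper's implementation: MLContract, guarded_predict, the two probability callables, and the 0.05 threshold are hypothetical names and values chosen only for the example.

from dataclasses import dataclass
from typing import Any, Callable

import numpy as np


@dataclass
class MLContract:
    """Probabilistic pre/post conditions plus a corrective action."""

    # Probability that the input violates the precondition, e.g. an
    # out-of-distribution score from a learned validation model.
    pre_violation_prob: Callable[[np.ndarray], float]
    # Probability that the prediction violates the postcondition, e.g.
    # derived from a calibrated confidence estimate.
    post_violation_prob: Callable[[np.ndarray, Any], float]
    # Corrective action taken when a check fails (alert, fallback, reject).
    on_violation: Callable[[str, np.ndarray], Any]
    # Hypothetical default: maximum tolerated violation probability.
    threshold: float = 0.05


def guarded_predict(model: Any, contract: MLContract, x: np.ndarray) -> Any:
    """What a generated contract-enforcing wrapper might do."""
    if contract.pre_violation_prob(x) > contract.threshold:
        return contract.on_violation("precondition", x)
    y = model.predict(x)
    if contract.post_violation_prob(x, y) > contract.threshold:
        return contract.on_violation("postcondition", x)
    return y

On this reading, the generated validation models of component b) would supply the two probability callables (for instance, an out-of-distribution detector for the precondition and a calibrated confidence model for the postcondition), and the wrapper generator of component c) would emit code like guarded_predict around the deployed model.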


Cited By

  • Robustness Attributes to Safeguard Machine Learning Models in Production. 2023 IEEE Engineering Informatics, 1-9. DOI: 10.1109/IEEECONF58110.2023.10520555. Online publication date: 22-Nov-2023.


Published In

SE4SafeML 2023: Proceedings of the 1st International Workshop on Dependability and Trustworthiness of Safety-Critical Systems with Machine Learned Components
December 2023
25 pages
ISBN: 9798400703799
DOI: 10.1145/3617574
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. ML validation
  2. design by contract
  3. error handling
  4. system validation
