
DOI: 10.1145/3664646.3664766
Research Article
Open Access

An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping

Published: 10 July 2024

Abstract

The advent of advanced AI underscores the urgent need for comprehensive safety evaluations, necessitating collaboration across communities (i.e., AI, software engineering, and governance). However, divergent practices and terminologies across these communities, combined with the complexity of AI systems, of which models are only one part, and their environmental affordances (e.g., access to tools), obstruct effective communication and comprehensive evaluation. This paper proposes a framework for AI system evaluation comprising three components: 1) harmonised terminology to facilitate communication across communities involved in AI safety evaluation; 2) a taxonomy identifying the essential elements of AI system evaluation; 3) a mapping between the AI lifecycle, stakeholders, and requisite evaluations for an accountable AI supply chain. This framework catalyses a deeper discourse on AI system evaluation beyond model-centric approaches.
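
To make the third component more concrete, the following sketch (not part of the paper) shows one hypothetical way a lifecycle-to-stakeholder-to-evaluation mapping for an accountable AI supply chain could be represented as a simple data structure. The stage, stakeholder, and evaluation names, as well as the LifecycleStage type and evaluations_for helper, are illustrative assumptions rather than elements defined by the framework.

```python
from dataclasses import dataclass

# Hypothetical sketch only: a minimal representation of a mapping between
# AI lifecycle stages, the stakeholders accountable at each stage, and the
# evaluations expected there. All names below are illustrative assumptions,
# not taken from the paper.

@dataclass
class LifecycleStage:
    name: str                # lifecycle stage, e.g. "model development"
    stakeholders: list[str]  # parties accountable at this stage
    evaluations: list[str]   # evaluations expected at this stage

SUPPLY_CHAIN = [
    LifecycleStage(
        name="model development",
        stakeholders=["model provider"],
        evaluations=["capability benchmarks", "bias and fairness testing"],
    ),
    LifecycleStage(
        name="system integration",
        stakeholders=["system developer"],
        evaluations=["tool-access safety checks", "end-to-end red teaming"],
    ),
    LifecycleStage(
        name="deployment and operation",
        stakeholders=["deployer", "auditor"],
        evaluations=["runtime monitoring", "incident reporting review"],
    ),
]

def evaluations_for(stakeholder: str) -> list[str]:
    """Collect every evaluation a given stakeholder is accountable for."""
    return [
        evaluation
        for stage in SUPPLY_CHAIN
        if stakeholder in stage.stakeholders
        for evaluation in stage.evaluations
    ]

if __name__ == "__main__":
    # e.g. the evaluations owed by the system developer
    print(evaluations_for("system developer"))
```

A real mapping would carry richer attributes (evaluation methods, evidence, hand-off artefacts), but even this shape makes the accountability question explicit: who must evaluate what, and at which stage.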



    Published In

    AIware 2024: Proceedings of the 1st ACM International Conference on AI-Powered Software
    July 2024
    182 pages
ISBN: 9798400706851
DOI: 10.1145/3664646
This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. AI Safety
    2. AI Testing
    3. Benchmarking
    4. Evaluation
    5. Responsible AI

    Qualifiers

    • Research-article

    Conference

    AIware '24
