- Article, October 2024
Emergence in Multi-agent Systems: A Safety Perspective
- Philipp Altmann,
- Julian Schönberger,
- Steffen Illium,
- Maximilian Zorn,
- Fabian Ritz,
- Tom Haider,
- Simon Burton,
- Thomas Gabor
Leveraging Applications of Formal Methods, Verification and Validation. Rigorous Engineering of Collective Adaptive Systems, pages 104–120. https://doi.org/10.1007/978-3-031-75107-3_7
Abstract: Emergent effects can arise in multi-agent systems (MAS) where execution is decentralized and reliant on local information. These effects may range from minor deviations in behavior to catastrophic system failures. To formally define these effects, ...
- Article, October 2024
Establishing the Foundation for Out-of-Distribution Detection in Monument Classification Through Nested Dichotomies
Abstract: This paper introduces a hierarchical approach utilizing nested dichotomies to enhance the MonuMAI framework, designed for architectural image classification. The study focuses on developing a foundational layer dedicated to distinguishing between ...
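The truncated abstract above mentions nested dichotomies. As a hedged illustration (not code from the paper, and with made-up class labels), a nested dichotomy decomposes a multiclass problem into a binary tree: each internal node makes a two-way decision over a partition of the remaining classes, and prediction descends the tree until a single class is left.

```python
# Minimal sketch of a nested-dichotomy decision tree over three
# hypothetical architectural classes, using toy threshold tests on a
# single scalar feature in place of trained binary classifiers.

class Dichotomy:
    """An internal node: a binary test over a partition of the class set."""

    def __init__(self, test, left, right):
        self.test = test    # callable(feature) -> True (go left) / False (go right)
        self.left = left    # Dichotomy subtree or a final class label
        self.right = right

    def predict(self, x):
        node = self
        while isinstance(node, Dichotomy):
            node = node.left if node.test(x) else node.right
        return node

# Root separates {"arch"} from {"column", "window"}; the right child then
# separates "column" from "window".
tree = Dichotomy(lambda x: x < 0.3, "arch",
                 Dichotomy(lambda x: x < 0.6, "column", "window"))

print(tree.predict(0.1))  # arch
print(tree.predict(0.5))  # column
print(tree.predict(0.9))  # window
```

In practice each node would hold a trained binary classifier rather than a fixed threshold; the tree structure itself is the "nested dichotomy".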
- Article, October 2024
An In-depth Analysis of Jailbreaking Through Domain Characterization of LLM Training Sets
Abstract: Research on large language models (LLMs) is a prominent field in open-world machine learning. Despite their significant capabilities in natural language processing, LLMs face several challenges that must be overcome, namely, consistency, ...
- Article, September 2024
Bridging the Reality Gap: Assurable Simulations for an ML-Based Inspection Drone Flight Controller
Computer Safety, Reliability, and Security. SAFECOMP 2024 Workshops, pages 412–424. https://doi.org/10.1007/978-3-031-68738-9_33
Abstract: Autonomous drones have been proposed for many industrial inspection roles, including wind farms, railway lines, and solar farms. They have many potential benefits, including accessing difficult-to-reach locations and reduced physical risk to operators ...
- Research article, September 2024
CoSec: On-the-Fly Security Hardening of Code LLMs via Supervised Co-decoding
ISSTA 2024: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 1428–1439. https://doi.org/10.1145/3650212.3680371
Abstract: Large Language Models (LLMs) specialized in code have shown exceptional proficiency across various programming-related tasks, particularly code generation. Nonetheless, due to their pretraining on massive, uncritically filtered data, prior studies ...
- Research article, July 2024
An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping
AIware 2024: Proceedings of the 1st ACM International Conference on AI-Powered Software, pages 74–78. https://doi.org/10.1145/3664646.3664766
Abstract: The advent of advanced AI underscores the urgent need for comprehensive safety evaluations, necessitating collaboration across communities (i.e., AI, software engineering, and governance). However, divergent practices and terminologies across these ...
- Article, October 2023
Towards a Certified Proof Checker for Deep Neural Network Verification
Logic-Based Program Synthesis and Transformation, pages 198–209. https://doi.org/10.1007/978-3-031-45784-5_13
Abstract: Recent developments in deep neural networks (DNNs) have led to their adoption in safety-critical systems, which in turn has heightened the need for guaranteeing their safety. These safety properties of DNNs can be proven using tools developed by ...
- Research article, August 2023
User Tampering in Reinforcement Learning Recommender Systems
AIES '23: Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, pages 58–69. https://doi.org/10.1145/3600211.3604669
Abstract: In this paper, we introduce new formal methods and provide empirical evidence to highlight a unique safety concern prevalent in reinforcement learning (RL)-based recommendation algorithms: "user tampering." User tampering is a situation where an RL-...
- Article, November 2021
Mutation Testing of Reinforcement Learning Systems
Dependable Software Engineering. Theories, Tools, and Applications, pages 143–160. https://doi.org/10.1007/978-3-030-91265-9_8
Abstract: Reinforcement Learning (RL), one of the most active research areas in artificial intelligence, focuses on goal-directed learning from interaction with an uncertain environment. RL systems play an increasingly important role in many aspects of ...
- Poster, July 2020
Safer reinforcement learning through evolved instincts
GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, pages 77–78. https://doi.org/10.1145/3377929.3389946
Abstract: An important goal in reinforcement learning is to create agents that can quickly adapt to new goals but at the same time avoid situations that might cause damage to themselves or their environments. One way agents learn is through exploration mechanisms, ...
- Article, September 2019
Risk-averse Distributional Reinforcement Learning: A CVaR Optimization Approach
IJCCI 2019: Proceedings of the 11th International Joint Conference on Computational Intelligence, pages 412–423. https://doi.org/10.5220/0008175604120423
Abstract: Conditional Value-at-Risk (CVaR) is a well-known measure of risk that has been directly equated to robustness, an important component of Artificial Intelligence (AI) safety. In this paper we focus on optimizing CVaR in the context of Reinforcement ...
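For readers unfamiliar with the risk measure named in this abstract: CVaR at level α is the expected loss within the worst (1 − α) fraction of outcomes. The sketch below is an illustrative empirical estimator, not code from the paper; the function name and sample data are hypothetical.

```python
def cvar(losses, alpha):
    """Empirical CVaR: the mean of the worst (1 - alpha) fraction of losses.

    Assumes 0 <= alpha < 1 and enough samples that the tail is non-empty.
    """
    s = sorted(losses)
    k = int(len(s) * alpha)  # index of the empirical alpha-quantile (VaR)
    tail = s[k:]             # losses at or beyond the VaR threshold
    return sum(tail) / len(tail)

# With alpha = 0.8, the tail is the worst 20% of losses: [9, 10].
print(cvar([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.8))  # 9.5
```

Optimizing this tail expectation (rather than the plain mean return) is what makes a CVaR-based RL objective risk-averse.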