
Evolving Generative AI: Entangling the Accountability Relationship

Published: 13 February 2025

Abstract

Since ChatGPT's debut, generative AI technologies have surged in popularity within the AI community. Recognized for their cutting-edge language processing capabilities, these systems excel at generating human-like conversations, enabling open-ended dialogues with end-users. We consider that the future adoption of generative AI for critical public domain applications transforms the accountability relationship. Previously characterized by the relationship between an actor and a forum, the introduction of generative systems complicates accountability dynamics as the initial interaction shifts from the actor to an advanced generative system. We conceptualise a dual-phase accountability relationship involving the actor, the forum, and the generative AI as a foundational approach to understanding public sector accountability in the context of these technologies. Focusing on the integration of generative AI to assist healthcare triaging, we identify potential challenges for maintaining effective accountability relationships, highlighting concerns that these technologies relegate actors to a secondary phase of accountability and create a disconnect between government actors and citizens. We suggest recommendations aimed at disentangling the complexities generative systems bring to the accountability relationship. As we speculate on the technologies’ disruptive impact on accountability, we urge public servants, policymakers, and system designers to deliberate on the potential accountability impact of generative systems prior to their deployment.

1 Introduction

Generative AI (GenAI) has quickly become the hottest topic in AI technologies, boasting significant advancements in natural language processing and next-generation application possibilities in private and public sector domains, all encapsulated in user-friendly packages that facilitate seamless interaction for users [32, 36]. Among the various attributes of GenAI technologies, arguably one of their most remarkable strengths lies in their human-like conversational linguistics [2, 12]. A notable example is ChatGPT,1 which has demonstrated the ease with which end-users may engage with AI systems using natural language and receive responses in a humanly comprehensible manner [24, 36]. This characteristic creates a platform where individuals without technical expertise can interact with and significantly benefit from these systems [20].
These strengths of GenAI technologies represent a potentially transformative opportunity for public administrations. They enable innovative methods for integrating GenAI systems into the public domain, allowing citizens to directly engage and participate with government services without requiring intermediary public servants to facilitate the interaction [4, 37]. Consequently, GenAI contributes to revolutionary advancements by enhancing organizational efficiency, minimizing resource requirements, and offering citizens personalized and instantaneous access to convenient services [2, 21, 32]. This holds the potential to redefine the dynamics of citizen-government interactions through GenAI systems and to create a more inclusive and efficient framework for public services.
The existing literature on GenAI and Large Language Models (LLMs) has extensively examined the current challenges and ethical risks associated with deploying these systems in government domains—ranging from concerns about 'hallucinating' incorrect information, to reflecting gender and racial biases, to broader ethical concerns about their usage [3, 14, 20, 25]. We instead shift the focus towards a future-oriented perspective on GenAI systems, focusing on public administration. Specifically, we contemplate how the accountability relationship dynamics between public administration actors and citizen forums might evolve or undergo transformation with the integration of GenAI systems in the public sector. It is essential to anticipate the potential impacts that introducing state-of-the-art technologies may bring, as Young et al. [46] highlight the opportunities and dangers linked to the transition of public administration tasks from human to system actors.
Diverging from past trends in AI adoption for public sector deployment, such as artificial neural networks or machine learning techniques for decision making [42], GenAI introduces a paradigm shift by providing a more personal experience to citizens through the construction of human-like conversations (which tend to be clearer than human-written text) [25]. This transformative characteristic implies that citizens may seek justification directly from the GenAI system itself before expecting responses from the organizational actors behind the GenAI deployment.
In this forward-looking commentary, we explore how the actor-forum accountability relationship may become more intricate for the next generation of governments that incorporate generative systems in critical domain applications. By focusing on the healthcare triaging process in the public sector (such as that within the United Kingdom's publicly funded healthcare systems), we contemplate how accountability for this critical task may change shape when GenAI systems are introduced and provide a degree of technology-mediated interaction. Specifically, we consider the scenario of a GenAI system handling initial triaging: instead of the citizen's first interaction being with a medical professional, it is now directed to the GenAI. From a technical perspective, the training process of a bespoke GenAI triaging system is outside the scope of our discussion; instead, we deploy this chosen scenario as a case study towards understanding the accountability evolution and envisioning the potential real-world impact on public sector accountability.
We propose that the traditional actor-forum accountability process may transform into a dual-phase process when GenAI is integrated with public administration activities. This commentary represents a preliminary effort to draw the attention of public actors to the potential accountability complexities and challenges associated with the integration of GenAI into public sector operations, highlighting the transformation accountability relationships undergo in this context. Furthermore, we identify potential pre-emptive measures to mitigate the risk of GenAI systems damaging relationships between government actors and citizens. This proactive approach aims to address and manage potential challenges, ensuring a harmonious integration of GenAI technologies into the fabric of future government interactions with citizens.

2 Accountability Relationships

Accountability is a mechanism to incentivise better behaviour, and effective accountability is indispensable for fostering good governance [8, 38]. Accountability provides specific audiences, referred to as forums, with a mechanism for seeking validation behind actions conducted by groups or individuals, known as actors. Bovens [7] describes this actor-forum accountability relationship as one in which “the actor has an obligation to explain and justify his or her conduct, the forum can pose questions and pass judgement, and the actor may face consequences”. Cech [11] considered the agency of algorithmic accountability through the lens of Bovens' [7] definition of accountability, which we have simplified to the visualisation in Figure 1 (below).
Fig. 1. The actor-forum accountability process.
In the context of actors for public sector AI systems, the actor may manifest as an individual or a larger group within the organisation that has deployed the algorithmic system. Attempting to discern which actors should be held accountable for algorithmic systems and their decisions is often a challenging issue to address [33, 45], particularly with respect to the system designer role. As public sector workforces often lack the required technological skills [12], system development is frequently outsourced to a private entity. When the system designer, or developer, originates from a private organisation, the lines of accountability are further blurred [23]. The forum, on the other hand, can take various forms, including a legal forum, an internal forum based on organisational hierarchy, or the citizens who constitute the end-users of the algorithmic system [32].
To streamline the discourse surrounding accountability with GenAI systems, we adopt a narrowed scope wherein the actor represents an organisation member who has a responsibility towards the citizens. Similarly, the forum is represented by the citizens who engage with the organisation's practices (and the GenAI systems embedded within them). In a real-world setting, this could be reflected as a medical professional being the actor and the patients being the forum. This streamlined perspective allows for a clearer delineation of accountability roles and responsibilities in the deployment of and interaction with GenAI systems. For the purpose of our commentary, it provides a clearer view of the accountability transformation.
The forum may extend trust to the actor, especially to those in particular roles, such as medical professionals, where they are expected to act with responsibility and accountability [15, 19]. Professionals in the medical field adhere to ethical codes of practice, and given the gravity of decisions affecting citizen lives, they are obligated to provide an account of their actions and justify the correctness of their decisions [41]. If the actor, in this context a medical professional, cannot adequately justify their decisions, consequences from the forum may follow, potentially leading to legal action and the loss of professional qualifications.
Expanding on this scenario, when AI systems have been employed to support clinical practice, such as predicting illnesses or offering recommendations for patient care [6, 23], accountability concerns often emerge regarding the opacity of systems that generate unexplainable outputs [41]. Clinicians may face challenges in accounting for AI-predicted outputs and may struggle to adequately justify their actions if they choose to follow the AI-recommended decision. Even with AI aimed at assisting the actors’ decision, these tools begin to increase the complexity of being held accountable [45]. Yet, in this circumstance, the onus falls on the clinician, as they ultimately choose to use the AI's recommendation and the forum communicates directly with the actor without accessing the AI system themselves. This dynamic emphasises the nuanced and evolving nature of accountability relationships in the context of AI-assisted decision-making within public administration.
Looking towards the future, generative AI systems have exhibited remarkable capabilities to serve as the foundational elements behind social chatbots across domains [13, 16]. Adapting these towards public domain applications, such as healthcare, holds significant promise for leveraging advanced chatbots. Citizens could potentially access these chatbots without needing to see a clinician, thereby addressing issues such as long waiting times for initial responses, an aspect that GenAI systems are well-equipped to manage [39]. Notably, research has previously been conducted on GenAI-assisted patient triaging, making the practical deployment of GenAI chatbots for such processes a conceivable reality [5, 22]. These GenAI triaging studies demonstrated that supplying a brief trauma scenario to the system could produce an Emergency Severity Index (ESI) classification, along with a justification and recommended actions for the patient [27].
In contrast to previous AI deployments, this transformative shift for public domains alters the accountability relationship, diminishing the actor's direct contact with the forum. While this is not a novel challenge, existing literature has extensively examined the accountability risks associated with automated AI decision-making processes [9]. What sets GenAI apart and introduces new challenges for accountability is their high versatility and unique capacity to replicate sophisticated human-like conversations that maintain rapport with users—an ability not previously demonstrated to this quality by conventional AI systems [18, 32]. This distinctive feature requires a re-evaluation of traditional accountability, acknowledging the evolving dynamics introduced by GenAI systems and their potential impact on the accountability relationships within the realm of public sector applications.
The traditional accountability relationship (Figure 1), where an actor once supplied information and justified their actions, undergoes a fundamental shift in the context of generative systems. In this new paradigm, a bespoke generative system may take on the responsibility of supplying information that previously would have been provided by the human actor. Importantly, these systems can be designed with the capability to respond to queries from the forum regarding the information they provide. The transformative aspect lies in the fact that the generative system, with sophisticated human-like conversation capabilities, can now address questions that were previously solely in the domain of human actors to justify. Where accountability previously represented a closed cycle between actor and forum (Figure 1), we propose that intelligent GenAI systems reconstruct the environment into a dual-phase accountability relationship.

3 Accountabilities with Generative AI

Consider a citizen feeling unwell: they interact with their healthcare system and arrange an appointment with their clinician. They explain to the clinician the symptoms they have been experiencing; the clinician assesses the priority of treatment required and prescribes appropriate actions. The patient may then seek justification behind the clinician's assessment and, if dissatisfied with the response, might opt for a second opinion or escalate their claim to higher authorities [38]. This establishes an accountability relationship that may be present in healthcare triaging without utilising GenAI technologies. In this traditional setup, the relationship between the actor (clinician) and the forum (patient) is relatively direct, with communication occurring directly between both parties.
Contrast this scenario with a bespoke GenAI system integrated into the public healthcare system to handle initial triaging. The citizen interacts directly with the system, answering questions akin to those posed by a clinician. Trained on extensive medical and triaging data, the GenAI system provides an experience comparable to a professional, suggesting an ESI classification and recommending actions based on the user's trauma description [5, 22]. Unlike more conventional AI approaches, GenAI systems are highly versatile [40]. Following the AI triaging process, citizens may request further justification behind their recommended outcome, which the system may be capable of producing. The accuracy of these responses may be subject to debate. Yet the natural, human-like language of generative responses, along with the possibility of manifesting ‘humanness and emotion’ in their understanding [1], provides a form of citizen interaction unmatched by the simpler, strictly question-answer chatbots of previous years.
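To make the scenario concrete, the sketch below illustrates one way such a triage front-end might prompt a generative model to return a structured ESI suggestion with a justification. This is purely an illustrative assumption, not a description of any deployed system: the prompt wording and the call_generative_model stub stand in for whatever hosted or bespoke model a healthcare provider would actually use.
```python
import json

# Hypothetical system prompt for a bespoke triage assistant (illustrative only).
TRIAGE_PROMPT = (
    "You assist with initial patient triage. Given the patient's description of "
    "their symptoms, reply ONLY with JSON containing: 'esi' (Emergency Severity "
    "Index level, 1-5), 'justification' (a short plain-language explanation), and "
    "'recommended_actions' (a list of next steps)."
)

def call_generative_model(system_prompt: str, user_message: str) -> str:
    """Stand-in for the real model endpoint; returns a canned reply so the sketch runs."""
    return json.dumps({
        "esi": 3,
        "justification": "Symptoms suggest urgent, but not immediately life-threatening, care.",
        "recommended_actions": ["Attend urgent care today", "Seek help immediately if symptoms worsen"],
    })

def triage(symptom_description: str) -> dict:
    """Return the system's suggested ESI level, justification, and recommended actions."""
    raw = call_generative_model(TRIAGE_PROMPT, symptom_description)
    result = json.loads(raw)
    if result["esi"] not in range(1, 6):  # basic sanity check on the structured output
        raise ValueError("Model returned an invalid ESI level")
    return result

print(triage("I have had chest tightness and shortness of breath since this morning."))
```
Requesting structured output of this kind also keeps the first-phase exchanges auditable, which becomes relevant for the maintenance and verification practices discussed later.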
At a conceptual level, the implementation of current AI systems already deployed in society inherently establishes an indirect relationship between actors and forums. These systems, deployed to enhance efficiency and conserve resources [23], introduce technology-mediated interactions that facilitate indirect actor-forum relationships. As a comparator, consider a simple automated ticket verification system at an airport, where travellers (forums) interact indirectly with airport security (actors) through the system. If the system flags an issue with the traveller's ticket, the actor intervenes directly to resolve the issue. Naturally, these systematic tasks are simpler: the process has been designed with the singular objective of answering ‘Is this a valid ticket?’ and providing a binary response. The future sees AI systems integrated into tasks with greater uncertainties, or making decisions for tasks not deemed ‘well-defined and sufficiently precise’ [21, 34].
Unlike simpler technology systems with clear predefined tasks, intelligent GenAI systems are tasked with handling uncertainties and making judgements, as in the triaging process. It is impossible to consider every trauma condition, and the possibilities that could be considered are further limited by the natural language modality of the conversation. Even if it were possible to consider all conditions, different clinicians may provide different outcomes depending on their own knowledge and experiences. In comparison to the ticketing system, the GenAI is being deployed to handle highly complex and often ambiguous tasks. Therefore, the transformation in the accountability relationship is not explicitly due to the AI system reducing direct interaction between actor and forum, but instead stems from imbuing intelligent GenAI systems with tasks that require an essence of judgement and reasoning to explain decisions under uncertainty.
Bovens' [7] definition of accountability has the actor supply information and provide justification for their actions, with the possibility of facing consequences from the forum. The GenAI system's ability to provide information and a level of justification to the forum mimics these aspects of the accountability process. Yet the system cannot face consequences; therefore, the GenAI system cannot ultimately be held accountable. Naturally, we expect that forums dissatisfied with the GenAI responses will escalate their claim to a higher authority, the human actor. However, this introduces another relationship into the accountability process. Unlike the ticket verification system, where human actors tend to intervene as soon as an issue occurs, intelligent GenAI systems have the capability to provide some form of judgement that attempts to reason about their predicted outcome [26]. With the ticket system, all potential inputs can be known, uncertainties are not present, and resolution can be easily reached. On the other hand, the triage GenAI system faces an ill-defined situation where all possibilities are ‘imperfectly known’ and uncertainties exist [18]. The GenAI thus draws from its available information to deliver its best prediction, even under uncertainty, and provides these predictions in a conversation-like manner.
GenAI responses and inadequate justifications that dissatisfy the forum may also lead forums to develop negative judgements. The forum's judgement of the system's unfavourable responses may be transferred to the human actor through an accountability chain in which the public actor has delegated their task to the GenAI agent [8]. It should also be considered that the accounts the GenAI produces may not even align with the actors’ own accounts [11]; just as one clinician may provide a different recommendation than another in cases of uncertainty, this places the actor in the challenging position of facing a preconceived negative reception that originated during the forum's interaction with the generative system. Here is where the entangling of the accountability relationship lies.

4 The Actor, the Forum, and the Generative AI Accountability Cycle

Accountability between actor and forum has been represented as a single-loop cycle (Figure 1). Expanding on this conventional representation of traditional accountability [7], we propose a dual-phase cycle that delves into the intricate flow of interactions involving the actor, forum, and GenAI (Figure 2). This theoretical accountability relationship is particularly relevant in contexts where the GenAI system acts as the initial point of contact, typically in the form of an interactive open-ended chatbot system aimed at streamlining existing processes and reducing dependency on human workers [40], as exemplified in our earlier scenario of employing GenAI for initial patient triaging.
Fig. 2. The dual-phase accountability process between actor, forum and GenAI.
The two phases are separated according to whom the forum is interacting with at that stage of the accountability relationship. The first phase is directly between the GenAI system and the forum, whereas the second phase introduces the actor, who becomes accountable for rectifying unsatisfactory responses provided by the GenAI.

4.1 Phase 1: GenAI-Forum Interaction

In contrast to the conventional representation of a classical actor-forum accountability relationship (Figure 1), this phase illustrates a shift in the initial interaction, with the forum engaging directly with the generative chatbot system (Figure 2). In the case of a triaging scenario, for instance, the forum could be an unwell citizen seeking preliminary assessment, thus interacting with the triaging GenAI chat system to be assigned an ESI classification.
During this interaction, the citizen provides information about their symptoms. The AI system, drawing on its training data, produces a likely prognosis, determining treatment urgency and suggesting further actions [27]. The forum (citizen) then has the option to agree with (and act on) the recommendation or to seek additional information about the ESI categorisation or other recommended elements. This marks the forum's initiation of accountability, as they make demands that seek further information, a process facilitated initially through the generative system.
The information supplied by the GenAI in response to forum queries is influenced by the system design [45], with the information quality being contingent upon intentional decisions made by the system designer. Design choices may restrict certain dialogues, preventing the system from engaging in irrelevant (or inappropriate) topics [24, 26]. If a forum question is deemed more suitable for a direct response from an official authority, the GenAI system may be programmed to redirect the forum toward human operators. In any accountability relationship, the quality of information provided to the forum directly influences the effectiveness of the relationship and contributes to determining forum satisfaction [9, 29].
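As a minimal sketch of such a design choice (the topic list, messages, and helper names here are assumptions for illustration, not part of any real triage deployment), a thin routing layer can keep the generative system within its intended scope and hand out-of-scope demands straight to a human operator:
```python
# Illustrative scope filter: queries touching assumed out-of-scope topics are not
# answered by the generative system but redirected to a human operator.
OUT_OF_SCOPE_TOPICS = {"complaint", "legal", "prescription change", "dispute my result"}

REDIRECT_MESSAGE = (
    "This question is better answered by a member of staff. "
    "Your query has been flagged so a human operator can follow up with you."
)

def route_forum_query(query: str, generate_reply) -> tuple[str, bool]:
    """Return (reply, escalated): escalated=True means the claim moves to the human actor."""
    if any(topic in query.lower() for topic in OUT_OF_SCOPE_TOPICS):
        return REDIRECT_MESSAGE, True      # leave Phase 1 and involve the actor
    return generate_reply(query), False    # remain in the GenAI-forum exchange

# Example usage with a placeholder generator standing in for the deployed GenAI.
reply, escalated = route_forum_query(
    "I want to dispute my result from last week", lambda q: "generated answer"
)
```
A production system might replace the keyword filter with a trained classifier or the model's own moderation layer; the point is only that the redirect is an explicit, designed boundary of the first phase.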
Effective information enables the forum to better judge actions and make more informed decisions [35], and this marks a fundamental shift between previous technology-mediated interactions and foreseeable intelligent GenAI interactions. In this interaction, a cyclical exchange occurs between the forum's inquiries and the AI-provided information (Figure 2). Unlike simpler technology systems with limited responses, intelligent GenAI systems have the potential to maintain a conversational rapport with the forum, with the nature and quality of information provided to the forum potentially changing as GenAI interactions progress, even before human-actor intervention.
For this first phase, the information supplied by the GenAI and its justification of forum demands is pivotal in shaping the subsequent flow of the accountability relationship. As highlighted in Figure 2, the AI-generated responses will be utilised by the forum to make an informed judgement. The forum may dictate whether they are satisfied with the AI-generated responses, assessing both the GenAI's chosen actions and its ability to justify them. A patient must understand why the AI-produced prognosis is most likely to be correct and must be able to judge the recommendation supplied. If the system succeeds in satisfying the demands of the forum, no further action is required and the accountability claim is resolved without ever leaving the first phase. However, failure to meet forum expectations requires elevating their demands to human actors, progressing the accountability relationship to the second phase of our entangled relationship.

4.2 Phase 2: Actor-Forum Interaction

The second phase occurs strictly when the GenAI system falls short of satisfying the forum's demands in the initial phase. Reasons for a shortfall may include unfavourable outcomes, imprecise or lacklustre justifications, or intentional restrictions on predetermined questions (perhaps for safety or privacy reasons). In the latter case, the system may inform the forum that the query falls outside its parameters, advising them to contact a human authority for further clarification. The forum's accountability claim then transitions from phase one to phase two; in other words, the forum's accountability claim transfers from the GenAI to the relevant human actor. Here, the forum has an expectation that the human actor will address the demands that the AI-driven system could not fulfil.
The secondary phase closely mirrors aspects of the traditional accountability cycle depicted in Figure 1. During this stage, the forum escalates their demands to the actor. The actor, in turn, evaluates the GenAI-forum responses and may provide new information to the forum or offer additional justifications for the GenAI-generated accounts if they are deemed accurate by the actor. Effective actor responses can satisfy the forum's demands, but failure may result in consequences, leading to the breakdown of the accountability relationship [9]. However, in contrast to the classical accountability relationship, this phase introduces two key additional components that set it apart:

4.2.1 Forum Transferred Judgement.

The first distinction between the classical accountability relationship (Figure 1) and the actor-forum relationship present in phase two (Figure 2) is the potential for negative judgements directed at the actor due to first-phase interactions. As the GenAI lacks the capacity to face consequences, shortcomings in GenAI-forum interactions lead to forum dissatisfaction carrying over to the human actor. The actor assumes a senior hierarchical role, as they delegated their role to the GenAI [8]. Therefore, the actor potentially faces preconceived negative judgments and reputational damage transferred from the prior GenAI-forum interaction [10].
The transfer of judgment through the accountability chain poses an emerging challenge in using GenAI-assisted chatbots in public domain applications. While generative systems offer advanced, human-like interactions, the potential lack of restrictions on generated dialogues may hinder actors from fully anticipating the system's responses. The challenge arises when generated responses do not align with the actor's own thoughts or actions [11], potentially leading to valid negative forum attitudes shaped by GenAI responses before direct interaction with the actor. Consequently, this dynamic may impede the effectiveness of the relationship between the actor and the forum.

4.2.2 GenAI Maintenance.

The second key difference is the potential for the information acquired from the interaction to update the GenAI system. Updating the system with new information gathered from prior interactions may contribute towards mitigating recurring unmeaningful or harmful GenAI-forum interactions. When biased or undesirable outcomes have been produced, the ability to update the system reflects a consequential reaction aimed at rectifying any wrongs the forums experienced. Two main approaches for updating based on the interaction are fine-tuning and post-processing outputs [24, 26]. Fine-tuning enhances the system's knowledge and performance by incorporating additional data [29], while post-processing can filter results to address forum dissatisfaction with irrelevant dialogues [19]. For instance, in healthcare scenarios, fine-tuning could enhance the system's ability to recommend specialised care or provide more detailed justifications to the forums.
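A minimal sketch of these two maintenance routes follows, assuming a simple interaction-log format and clinician-supplied corrections; the field names and the blocked-phrase list are illustrative assumptions, not features of any particular LLM toolkit.
```python
# Assumed log entries: {"query": ..., "reply": ..., "forum_satisfied": bool}
BLOCKED_PHRASES = ["definitely", "guaranteed"]  # wording previously flagged as over-confident

def build_finetuning_examples(logs: list[dict], corrections: dict[str, str]) -> list[dict]:
    """Pair unsatisfactory queries with clinician-approved replies for a later fine-tuning run."""
    return [
        {"prompt": log["query"], "completion": corrections[log["query"]]}
        for log in logs
        if not log["forum_satisfied"] and log["query"] in corrections
    ]

def postprocess_reply(reply: str) -> str:
    """Filter an individual response before it reaches the forum."""
    for phrase in BLOCKED_PHRASES:
        reply = reply.replace(phrase, "likely")
    return reply + "\n(This assessment is automated; contact a clinician if symptoms worsen.)"
```
Running the fine-tuning itself would be delegated to whichever training pipeline backs the deployed model, which is where the technical limits noted in the next paragraph come into play.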
While the AI maintenance process requires both technical and domain expertise, the ability to meaningfully update the system is crucial for continuing to use generative LLM systems in public practices. Maintenance, in the form of updates based on forum interactions, is essential for facilitating long-term meaningful citizen interactions with these technologies. However, given the widespread use of pre-trained language models, there are conceivably technical limits to the extent to which behaviours from pre-training can be mitigated using fine-tuning or updates; this introduces an additional dimension of technical complexity.

5 Recommendations

The envisioned change in the dynamics of the accountability relationship, arising from the integration of generative-assisted chatbot systems, introduces concerns about a potential disconnect between government actors and citizen forums. The scenario where unsatisfied forums redirect their demands from an AI system to the actor raises the possibility of negative judgements being transferred, leading to a loss of initial direct answerability from the relevant actor.
In the deployment of LLM and other generative AI systems in public sector domains, addressing concerns about accountability dissonance is crucial. We define accountability dissonance as a lack of alignment or clarity in the expectations of forums when interacting with generative technologies. The absence of specified roles and boundaries for generative systems applications may lead to blurred expectations for generative accounts, resulting in unsatisfied forums transferring their negative judgments toward the actors.
Methods that aim to reduce this dissonance are imperative to ensure that these systems contribute to societal benefits while still providing citizens accessible means of holding the relevant actor accountable. Striking a balance is essential: maintaining a high degree of system utility without compromising accountability by isolating or entirely removing the relevant human actor from their position. This balance is key to fostering a robust and effective accountability relationship in the context of generative AI systems across public domains. The following sections suggest recommendations for disentangling the complexities generative systems may bring to the accountability relationship.

5.1 Monitoring and Regulating GenAI Responses

Fully automating AI processes has sparked concerns about determining accountability when the AI produces harmful outputs [37], and similar concerns persist for deploying generative AI systems in comparable contexts [3]. Rather than strictly limiting utility by restricting the system's capabilities, adapting the actor's role to encompass the maintenance and assessment of their deployed GenAI's interactions may enhance accountability for AI-driven actions. Wieringa [45] suggests that accountability of algorithmic systems concerns a network of actors, where the actors each have roles such as decision-makers, developers, and users. If we are to follow this perspective, future GenAI systems may require collaboration across each of these actor types in maintaining and monitoring the system.
Consider GenAI-assisted triaging: rather than the actor reflecting changes only after the forum interacts with them following a failed GenAI interaction, we advocate for the tracing and recording of GenAI-forum interactions and the verification of GenAI outcomes. In triaging, this would require the clinician to carefully verify the ESI suggested by the system. Although this approach involves a sacrifice in system utility, incorporating a human-in-the-loop allows actors to audit the system process and intervene when necessary [37]. Conducting routine verification procedures over a random subset of GenAI-forum interactions empowers the actor to ensure that AI predictions align more closely with their own expert assessment, serving as an effective method to maintain system quality. In this scenario, collaboration between the clinician, who has domain expertise, and the developer, who has technical expertise, may be required to ensure GenAI systems align with the experts’ assessment and performance improves over time.
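The routine verification procedure could be as simple as the following sketch, which samples a random subset of logged interactions for clinician review and reports the agreement rate between the GenAI-suggested and clinician-assigned ESI levels; the log fields and the clinician_review callable are assumptions for illustration.
```python
import random

def audit_sample(interaction_log: list[dict], clinician_review, sample_size: int = 20) -> float:
    """Return the agreement rate between GenAI-suggested and clinician-assigned ESI levels."""
    sample = random.sample(interaction_log, min(sample_size, len(interaction_log)))
    agreements = 0
    for record in sample:
        clinician_esi = clinician_review(record["symptom_description"])
        record["clinician_esi"] = clinician_esi  # retained so disagreements can feed maintenance
        agreements += int(clinician_esi == record["genai_esi"])
    return agreements / len(sample) if sample else 1.0
```
A falling agreement rate would signal that the system's outputs are drifting from expert assessment, prompting the maintenance steps outlined in Section 4.2.2.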
Similarly, incorporating mechanisms for forums to participate in reporting their experience with the GenAI acts as a beneficial method for regulating the system quality [19]. This provides actors with valuable insights to identify aspects that may weaken the correspondence between forums and GenAI, and incorporate these results to suggest future system improvements.

5.2 Providing Information and Setting Expectations

Successful accountability ultimately requires an individual or group of actors to be held accountable for their actions and have consequences imposed. This principle applies equally to the use of generative systems, where the system itself is incapable of facing consequences, necessitating accountability to be directed upwards towards the responsible actor when systems fail [8]. While blame should not be shifted away from the actor, providing forums with preliminary information before they interact with the system can set expectations during the first phase, potentially reducing dissatisfaction stemming from anticipating responses beyond the system's boundaries.
Public organisations can inform users by communicating the purposes, limitations, and relevant technical details of the system prior to interaction [9]. For instance, in the healthcare triage scenario, this could involve providing a brief description of the intended purpose of the system: to provide an ESI classification, reduce triage wait times, and ultimately deliver quicker care. Providing the user with example input dialogues that demonstrate how to use the system appropriately, and what sort of inputs produce meaningful information, may further improve its usefulness. Additionally, informing the citizen that AI systems can make mistakes, and that a professional should be contacted in cases of urgency, is beneficial towards managing forum expectations.
While these mechanisms may not actively influence forum satisfaction with the generative AI's response, they can impact the likelihood of negative perceptions towards the AI's response transferring to the actor. This awareness of system boundaries and the understanding that certain requests are better suited for the human actor to handle separately can mitigate forum dissatisfaction and contribute to a more informed and accountable interaction process.

5.3 Task Selection for GenAI Integration and Equipping Public Servants

AI systems promise improved efficiency and performance across various tasks; whilst this is often true, the potential impact of removing human agency from specific tasks is frequently overlooked. Our previous recommendations identify practices to support accountability for GenAI systems already deployed. However, it is equally important to consider the system design phase and question whether the proposed application is appropriate, feasible, and ethically responsible.
Public organisations should first ask whether the system should exist at all and, critically, how it influences power dynamics in society [4, 47]. Particularly concerning GenAI systems with human-like dialogue capabilities, organisations should clarify the system's purposes to ensure alignment with explicit objectives. Restricting dialogues may enhance system reliability, albeit at the cost of utility. However, well-defined tasks with fewer uncertainties are easier to manage in terms of accountability and forum satisfaction, like the previously discussed ticket verification task, which in comparison to the GenAI task is better defined and more manageable due to its reduced complexity.
Although GenAI systems hold potential for uncertain situations like healthcare triaging, their deployment should be an iterative process. Identifying core application features and recurring citizen responses may aid integration. Initially focusing on fewer objectives, assessing system performance, and gradually expanding functionality can effectively manage accountabilities while leveraging generative systems meaningfully.
For GenAI technologies to succeed in public sector deployment, government administrations must enhance technological skills across their workforces, as they severely trail behind the private sector [4, 12]. Investing in hiring, reskilling, or upskilling efforts is essential to cultivate a workforce capable of understanding and managing algorithmic systems. While not all public actors need full technical knowledge, they must grasp the system's purpose, limitations, and be capable of interpreting responses to meaningfully address forum accountability demands during the second phase of the accountability relationship.

6 Conclusion

With the current rate at which GenAI systems are advancing in quality, developing ever closer to human-like conversation, they present the potential to alter the government sector landscape. This extends from adapting established processes that enhance citizen engagement to discovering novel applications that propel the digital society forward [32, 33, 36]. GenAI systems could play a prominent role in transforming society, ushering in innovative and fresh directions for citizen-government interactions. Looking ahead, we move beyond the current challenges associated with GenAI systems and delve into the evolution of the government-citizen accountability relationship, exploring the potential challenges that generative AI systems introduce into the traditional actor-forum accountability relationship.
We anticipate that deployment of generative systems in intelligent chatbot scenarios brings society closer than ever to mimicking an accountability relationship through the GenAI-forum interaction. However, as AI systems lack the ability to face consequential actions and are confined to their designated purposes, the human actor must be held accountable, providing information and justification when the system falls short. Our conceptualisation of a dual-phase accountability relationship is designed as a starting point towards examining what shape accountability takes when GenAI systems are used in the public sector. This highlights the potential pitfalls that generative systems may introduce, complicating how accountability is assessed and relegating the human actor, ultimately to be held accountable, to a secondary phase of interaction with the forum.
While our reflections speculate on the future of generative AI systems, we do so with the intention of garnering the attention of public servants, policymakers, and system designers. We urge them to contemplate how the relationship between government and citizens becomes more intricate when entangled with highly intelligent GenAI systems. Envisioning potential public domain GenAI applications, government and policy actors should also weigh the potential impact these applications may have on our broader perception of the accountability relationship. Deliberation on these impacts should precede the deployment of the system in the public domain, with a focus on implementing mechanisms that ensure government actors and citizen forums remain connected, and accountability remains a focal point as we pursue technologies that advance the digital society.

References

[1]
Achini Adikari, Daswin de Silva, Harsha Moraliyage, Damminda Alahakoon, Jiahui Wong, Mathew Gancarz, Suja Chackochan, Bomi Park, Rachel Heo, and Yvonne Leung. 2022. Empathic conversational agents for real-time monitoring and co-facilitation of patient-centered healthcare. Future Generation Computer Systems 126, (2022), 318–329.
[2]
David Baidoo-Anu and Leticia Owusu Ansah. 2023. Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. SSRN Journal (2023).
[3]
Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big?. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’21), 2021. Association for Computing Machinery, New York, NY, USA, 610–623.
[4]
Jamie Berryhill, Kévin Kok Heang, Rob Clogher, and Keegan McBride. 2019. Hello, World: Artificial Intelligence and Its use in the Public Sector. OECD, Paris.
[5]
Suhrith Bhattaram, Varsha S. Shinde, and Princy Panthoi Khumujam. 2023. ChatGPT: The next-gen tool for triaging? The American Journal of Emergency Medicine 69, (2023), 215–217.
[6]
Adam Bohr and Kaveh Memarzadeh. 2020. The rise of artificial intelligence in healthcare applications. Artificial Intelligence in Healthcare (2020), 25–60.
[7]
Mark Bovens. 2007. New forms of accountability and EU-governance. Comp. Eur. Polit. 5, 1 (2007), 104–120.
[8]
Mark Bovens. 2010. Two concepts of accountability: Accountability as a virtue and as a mechanism. West European Politics 33, 5 (2010), 946–967.
[9]
Madalina Busuioc. 2021. Accountable artificial intelligence: Holding algorithms to account. Public Administration Review 81, 5 (2021), 825–836.
[10]
Madalina Busuioc and Martin Lodge. 2017. Reputation and accountability relationships: Managing accountability expectations through reputation. Public Administration Review 77, 1 (2017), 91–100.
[11]
Florian Cech. 2021. The agency of the forum: Mechanisms for algorithmic accountability through the lens of agency. Journal of Responsible Technology 7–8, (2021), 100015.
[12]
David Chinn, Solveigh Hieronimus, Julian Kirchherr, and Julia Klier. 2020. The future is now: Closing the skills gap in Europe's public sector. (2020). Retrieved December 18, 2023 from https://www.mckinsey.com/industries/public-sector/our-insights/the-future-is-now-closing-the-skills-gap-in-europes-public-sector#/
[13]
Debby R. E. Cotton, Peter A. Cotton, and J. Reuben Shipway. 2023. Chatting and cheating: Ensuring academic integrity in the era of ChatGPT. Innovations in Education and Teaching International 0, 0 (2023), 1–12.
[14]
Faiza Farhat. 2023. ChatGPT as a complementary mental health resource: A boon or a bane. Ann. Biomed. Eng. (2023).
[15]
Nikhil Garg, Londa Schiebinger, Dan Jurafsky, and James Zou. 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences 115, 16 (2018), E3635–E3644.
[16]
General Medical Council. 2023. Good medical practice 2024. Retrieved November 27, 2023 from https://www.gmc-uk.org/ethical-guidance/good-medical-practice-2024/get-to-know-good-medical-practice-2024
[17]
A. Shaji George and A. S. Hovan George. 2023. A review of ChatGPT AI's impact on several business sectors. Partners Universal International Innovation Journal 1, 1 (2023), 9–23.
[18]
Gerd Gigerenzer. 2023. Psychological AI: Designing algorithms informed by human psychology. Perspect. Psychol. Sci. (2023), 17456916231180597.
[19]
Philipp Hacker, Andreas Engel, and Marco Mauer. 2023. Regulating ChatGPT and other large generative AI models. In 2023 ACM Conference on Fairness, Accountability, and Transparency, June 12, 2023. ACM, Chicago IL USA, 1112–1123.
[20]
Abid Haleem, Mohd Javaid, and Ravi Pratap Singh. 2022. An era of ChatGPT as a significant futuristic support tool: A study on features, abilities, and challenges. BenchCouncil Transactions on Benchmarks, Standards and Evaluations 2, 4 (2022), 100089.
[21]
Sven Ove Hansson and Gertrude Hirsch Hadorn. 2016. Introducing the argumentative turn in policy analysis. In The Argumentative Turn in Policy Analysis: Reasoning about Uncertainty, Sven Ove Hansson and Gertrude Hirsch Hadorn (eds.). Springer International Publishing, Cham, 11–35.
[22]
Health Care Professions Council. 2016. Standards of conduct, performance and ethics. Retrieved November 27, 2023 from https://www.hcpc-uk.org/standards/standards-of-conduct-performance-and-ethics/
[23]
Merve Hickok. 2022. Public procurement of artificial intelligence systems: New risks and future proofing. AI & Soc. (October 2022).
[24]
Ashley M. Hopkins, Jessica M. Logan, Ganessan Kichenadasse, and Michael J. Sorich. 2023. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectrum 7, 2 (2023), pkad010.
[25]
Fan Huang, Haewoon Kwak, and Jisun An. 2023. Is ChatGPT better than human annotators? Potential and limitations of ChatGPT in explaining implicit hate speech. In Companion Proceedings of the ACM Web Conference 2023, April 30, 2023. 294–297.
[26]
Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, and Leilani H. Gilpin. 2023. Can large language models explain themselves? A study of LLM-generated self-explanations.
[27]
Jerry Jacob. 2023. ChatGPT: Friend or foe?—utility in trauma triage. Indian J. Crit. Care Med. 27, 8 (2023), 563–566.
[28]
Kevin B. Johnson, Wei-Qi Wei, Dilhan Weeraratne, Mark E. Frisse, Karl Misulis, Kyu Rhee, Juan Zhao, and Jane L. Snowdon. 2021. Precision medicine, AI, and the future of personalized health care. Clin. Transl. Sci. 14, 1 (2021), 86–93.
[29]
Enkelejda Kasneci, Kathrin Sessler, Stefan Küchemann, Maria Bannert, Daryna Dementieva, Frank Fischer, Urs Gasser, Georg Groh, Stephan Günnemann, Eyke Hüllermeier, Stephan Krusche, Gitta Kutyniok, Tilman Michaeli, Claudia Nerdel, Jürgen Pfeffer, Oleksandra Poquet, Michael Sailer, Albrecht Schmidt, Tina Seidel, Matthias Stadler, Jochen Weller, Jochen Kuhn, and Gjergji Kasneci. 2023. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences 103, (2023), 102274.
[30]
Zuheir N. Khlaif. 2023. Ethical concerns about using AI-generated text in scientific research.
[31]
Jan Kocoń, Igor Cichecki, Oliwier Kaszyca, Mateusz Kochanek, Dominika Szydło, Joanna Baran, Julita Bielaniewicz, Marcin Gruza, Arkadiusz Janz, Kamil Kanclerz, Anna Kocoń, Bartłomiej Koptyra, Wiktoria Mieleszczenko-Kowszewicz, Piotr Miłkowski, Marcin Oleksy, Maciej Piasecki, Łukasz Radliński, Konrad Wojtasik, Stanisław Woźniak, and Przemysław Kazienko. 2023. ChatGPT: Jack of all trades, master of none. Information Fusion 99, (2023), 101861.
[32]
Staffan I. Lindberg. 2013. Mapping accountability: Core concept and subtypes. International Review of Administrative Sciences 79, 2 (2013), 202–226.
[33]
Brent Daniel Mittelstadt, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, and Luciano Floridi. 2016. The ethics of algorithms: Mapping the debate. Big Data & Society 3, 2 (2016), 2053951716679679.
[34]
Maria Nordström. 2022. AI under great uncertainty: Implications and decision strategies for public policy. AI & Soc. 37, 4 (2022), 1703–1714.
[35]
Claudio Novelli, Mariarosaria Taddeo, and Luciano Floridi. 2023. Accountability in artificial intelligence: What it is and how it works. AI & Soc. (2023).
[36]
Johan P. Olsen. 2014. Accountability and ambiguity. In The Oxford Handbook of Public Accountability, Mark Bovens, Robert Goodin and Thomas Schillemans (eds.). Oxford University Press, 0.
[37]
Inioluwa Deborah Raji, Andrew Smart, Rebecca N. White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, and Parker Barnes. 2020. Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing.
[38]
João Gabriel Rosa Ramos and Daniel Neves Forte. 2021. Accountability for reasonableness and criteria for admission, triage and discharge in intensive care units: An analysis of current ethical recommendations. Rev. Bras. Ter. Intensiva 33, 1 (2021), 38–47.
[39]
Partha Pratim Ray. 2023. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems 3, (2023), 121–154.
[40]
Julia Romberg and Tobias Escher. 2023. Making sense of citizens’ input through artificial intelligence: A review of methods for computational text analysis to support the evaluation of contributions in public participation. Digit. Gov.: Res. Pract. (2023).
[41]
Helen Smith. 2021. Clinical AI: Opacity, accountability, responsibility and liability. AI & Soc. 36, 2 (2021), 535–545.
[42]
Weslei Gomes de Sousa, Elis Regina Pereira de Melo, Paulo Henrique De Souza Bermejo, Rafael Araújo Sousa Farias, and Adalmir Oliveira Gomes. 2019. How and where is artificial intelligence in the public sector going? A literature review and research agenda. Government Information Quarterly 36, 4 (2019), 101392.
[43]
Timm Teubner, Christoph M. Flath, Christof Weinhardt, Wil van der Aalst, and Oliver Hinz. 2023. Welcome to the Era of ChatGPT et al. Bus. Inf. Syst. Eng. 65, 2 (2023), 95–101. https://link.springer.com/article/10.1007/s12599-023-00795-x
[44]
P. Viechnicki and W. D. Eggers. 2017. How Much Time and Money Can AI Save Government? Cognitive Technologies Could Free up Hundreds of Millions of Public Sector Worker Hours. Deloitte University Press. Retrieved from www2.deloitte.com/content/dam/insights/us/articles/3834_How-much-time-and-money-can-AI-save-government/DUP_How-much-time-and-money-can-AI-save-government.pdf
[45]
Maranke Wieringa. 2020. What to account for when accounting for algorithms: A systematic literature review on algorithmic accountability. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, January 27, 2020. ACM, Barcelona Spain, 1–18.
[46]
Matthew M. Young, Justin B. Bullock, and Jesse D. Lecy. 2019. Artificial discretion as a tool of governance: A framework for understanding the impact of artificial intelligence on public administration. Perspectives on Public Management and Governance 2, 4 (2019), 301–313.
[47]
Theresa Züger and Hadi Asghari. 2023. AI for the public. How public interest theory shifts the discourse on AI. AI & Soc. 38, 2 (2023), 815–828.

Published In

Digital Government: Research and Practice, Volume 6, Issue 1
March 2025
EISSN: 2639-0175
DOI: 10.1145/3696796
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 February 2025
Online AM: 14 May 2024
Accepted: 04 May 2024
Revised: 29 March 2024
Received: 26 January 2024
Published in DGOV Volume 6, Issue 1


Author Tags

  1. Accountability theory
  2. generative AI
  3. public administrations
  4. conversational AI

Qualifiers

  • Research-article
