Abstract
This paper explores the evolving landscape of User-Centric Artificial Intelligence, particularly in light of the challenges posed by systems that are powerful but not fully transparent or comprehensible to their users. Despite advances in AI, significant gaps remain in aligning system actions with user understanding, prompting a reevaluation of what “user-centric” really means. We argue that current XAI efforts are often focused too narrowly on system developers rather than end users and fail to address the comprehensibility of the explanations provided. Instead, we propose a broader, more dynamic conceptualization of human-AI interaction that emphasizes the need for AI not only to explain, but also to co-create and cognitively resonate with users. We examine the evolution of a communication-centric paradigm of human-AI interaction, underscoring the need for AI systems to enhance rather than mimic human interactions. We argue for a shift toward more meaningful and adaptive exchanges in which AI’s role is understood as facilitative rather than autonomous. Finally, we outline how future UCAI may leverage AI’s growing capabilities to foster a genuine co-evolution of human and machine intelligence, while ensuring that such interactions remain grounded in ethical and user-centered principles.
1 Where are we in user-centric AI?
In his visionary 1950 article “Computing machinery and intelligence”, Alan Turing wrote that one should allow the possibility to “construct a machine which works but whose manner of operation cannot be satisfactorily described by its constructors because they have applied a method which is largely experimental”. 1 This possibility has become reality, at the latest with the introduction of deep learning and the various powerful neural architectures based on it. Due to the inherent and probably insurmountable problem of “satisfactorily describing the manner of operation” of such systems, their lack of transparency poses new challenges to the endeavor of rendering AI systems more user-centric. At the same time, this raises the question of what “user-centric” really means. Although the term User-centric AI (UCAI) has a relatively brief history, a stronger focus on AI users was called for early on. In 1983, Fischer postulated that “for “intelligent” systems to be acceptable, they must be able to explain what and how they have performed an action; this must be done in a way that the user can understand” (translated from German). 2 This long-forgotten request has seen a strong revival in the more recent Explainable AI (XAI) movement which was significantly stimulated by DARPA’s XAI program. 3
Despite the abundance of research work done since then, 4 we believe that current XAI research is still narrowly confined and even misguided in several respects. First, there is the question of which users are addressed by an explanation. Most techniques are mainly targeted at developers who wish to test the accuracy and reliability of their systems. End users and their needs are often not considered. Second, the comprehensibility of an explanation is mostly neither explicitly aimed at nor empirically tested. As a consequence, the cognitive demands involved in understanding an explanation are often underestimated. Third, and, as we believe, most importantly, explanations are typically intrinsic to the AI system that needs explaining, that is, the explanations are created relying on the same data and algorithms as the system itself. This leads to an overemphasis on the “why”. According to many XAI works, one of their main goals is to explain why a certain output, be that a binary decision, a ranked list of suggested options, or a generated piece of information, is produced by the algorithm. Yet, there is increasing evidence that users are less interested in how the AI produced an output and instead want to apply their own reasoning and judgment to assess the quality and trustworthiness of the system’s recommendations. In Hernandez-Bocanegra and Ziegler, 5 for instance, we found that out of 1800 freely generated user questions in a conversational hotel recommender, less than 25 % asked why a recommendation was given, and almost none asked how the algorithm generated the output. Most users sought additional information about the suggested items in order to assess their suitability in a self-determined, flexible manner. Even though some of the requested information might be found in user reviews the system could access, many questions could only be satisfied by drawing on additional, system-external data sources.
The advent of large foundation models covering massive amounts of world knowledge promises a much broader grounding for creating explanations, but some of the issues, such as the trustworthiness of the information used in an explanation, will remain or may even be amplified. At the core, however, there is the question of what constitutes a good explanation. Many current explanation techniques fall into the category of mere descriptions rather than argumentative justifications for a system decision. In most cases, users can neither question the data and assumptions underlying an explanation nor interactively ask for clarifications or explore the impact of alternative scenarios. An important trajectory towards more effective and user-centric systems can thus be seen in AI that can provide reasoned arguments for its suggestions or decisions, drawing on large bodies of knowledge, especially on trustworthy, validated knowledge. User trust is and will remain a core issue, not only for accepting a decision made by AI but also for appraising the explanatory capabilities of the system.
The aims of UCAI cover, however, a much broader field than just explaining the system’s decisions. In the framework proposed by Xu, 6 three main aspects are included: ethical design considerations, human factors design, and enhancing AI that fully reflects human intelligence. Explainability is seen as an aspect of human factors design. Notably, the enhancement of technology to reflect human intelligence is also seen as a core dimension of UCAI which essentially posits that to make AI human-centered, it should also become more human-like. A question touching upon all three dimensions refers to the user’s options to interact with an AI system. Human-AI interactivity refers to a large and expanding range of different functions, including the UI techniques used for controlling the system, the computational tasks that can be influenced interactively, or the level of proactivity and autonomy a system exhibits. In recommendation tasks, interactivity can mean that users can steer the user model created and maintained by the system in a desired direction. In AI-based decision-making, users may explore alternative outcomes by applying counterfactual techniques. In generative tasks, such as text or image generation, the method for interactively specifying the desired outcome at an appropriate level of detail and abstraction should be adapted to the user’s goals. We already see the importance of conversational or multimodal techniques on the rise, following decades dominated by the GUI paradigm. The introduction of large-scale language models constitutes a point of disruption that will likely lead to reframing human-AI interaction as a communication process that is essentially multimodal, discursive, and contextually grounded.
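To make the idea of user-driven counterfactual exploration more concrete, the following minimal sketch shows one way such a “what if” interaction could be realized; the decision rule, feature weights, and threshold are purely illustrative assumptions and do not describe any particular system.

```python
# Minimal, illustrative sketch of counterfactual exploration of an
# AI-based decision. The scoring function, feature weights, and threshold
# are toy assumptions, not part of any deployed system.

def loan_score(applicant: dict) -> float:
    """Toy linear score combining income, debt, and credit history."""
    return (0.5 * applicant["income"] / 1000
            - 0.8 * applicant["debt"] / 1000
            + 2.0 * applicant["years_good_credit"])

def decide(applicant: dict, threshold: float = 20.0) -> str:
    return "approve" if loan_score(applicant) >= threshold else "reject"

def counterfactual(applicant: dict, feature: str, new_value) -> str:
    """Answer a user's 'what if' question by re-evaluating the decision."""
    alternative = {**applicant, feature: new_value}
    return (f"If {feature} were {new_value}, the decision would be "
            f"'{decide(alternative)}' instead of '{decide(applicant)}'.")

applicant = {"income": 42_000, "debt": 15_000, "years_good_credit": 4}
print(decide(applicant))                            # -> 'reject' (score 17.0)
print(counterfactual(applicant, "income", 60_000))  # the what-if flips to 'approve'
```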
Human-AI interaction is also strongly determined by the degree of delegation and autonomy ascribed to the system. The design space of AI proactivity is vast, ranging from synchronous micro-interactions, e.g. in human-robot collaboration, to the delegation of complex, long-running tasks to the system, or to fully automated intelligent processes. This opens up a myriad of functional, cognitive, and ethical questions, many of which have not yet been addressed in current research, and which will proliferate with the ongoing rapid advances in AI technology. Therefore, and despite the recent rapid advances, for instance in conversational systems, we see ample room – and also the greatest need – for further user-centric developments in the design of human-AI interactions.
A common understanding of the term “user-centric” implies that systems should be designed and function according to the user’s needs and capabilities. This understanding, however, tends to adhere to a rather static view of the human, where cognitive capabilities are mainly seen as constraints in the processing of information exchanged with the system. On the other hand, when the aim is to support learning processes on the user’s side, the direction of support is mainly one-way, with the system taking on the role of a tutor, applying largely pre-defined content and functions. Two-sided processes are only taken into account to a limited extent, for example, when users adapt to changes in an interface which in turn adapts to the user’s changing behavior. In future human-AI interaction, the mutual adaptations may happen with a much larger scope, and at a much deeper semantic and pragmatic level, encompassing potentially large parts of the information and action spaces of both parties. It seems evident that such developments would create considerable demands and also potential risks for users. Users will need to develop knowledge and skills for communicating with the system at an adequate level, for example, for expressing their queries and commands at a level of detail and precision that produces the intended output. We already see such demands arising under the label “prompt engineering” when interacting with large language models. Yet, the prospect of a human-AI co-evolution also engenders higher-level functions such as co-learning, joint idea generation, or the mutual establishment of norms and values.
In this brief and coarse-grained review of UCAI facets and current developments we have deliberately omitted important themes such as fairness or the accountability of AI systems. Instead, we want to take the review as a point of departure for discussing three main trajectories along which we believe the future evolution of UCAI may take place. We are deliberately not forecasting a timeline for these developments. Many of the recent advances in AI have been quite unpredictable, and this may also be the case for future developments.
2 Transmodal interaction based on common ground
For a significant period in the history of digital technology, individual user experiences were limited primarily by the constraints of the software interfaces that mediated the exchange between humans and machines. These interfaces required people to conform to predefined modes of interaction that were the result of design paradigms of the time as well as technological limitations. Existing techniques are still characterized by a low degree of adaptability, a lack of personalization, and insufficient consideration of the world context in which human and system are embedded.
With the advancing capabilities of AI systems, the possibility arises that these limitations will be overcome. Amplifying the perceptive capabilities of AI is one important component. AI systems will not only be capable of sensing and interpreting a wide range of human expressions and behaviors, be they explicit signals, such as verbal or gestural expressions, or implicit signals such as eye gaze, but will also be able to recognize objects in the world along with their static and dynamic relations. At the core of such functions, large, world-scale foundation models are evolving, including not only language or visual models but also action models that can represent and predict situation-dependent courses of action.
Combining the power of enhanced AI perception techniques and large foundation models will permit new forms of interaction with unprecedented levels of fluidity and context-awareness. Users will be able to refer, e.g. by pointing, to arbitrary objects in the environment and start to converse about them with the system. Conversely, the system may highlight, e.g. through XR technology, objects of interest or recommend actions to perform. The common ground for such interactions, with fluid transitions between different modalities and virtual or physical components, will be provided through large foundation models that are able to continuously absorb and integrate new knowledge, including user-specific activities. The ensuing style of interaction is transmodal: it is based on underlying representations that are independent of their surface form, and both user and system can switch between interaction modalities in a seamless, fluid manner while the interaction context is consistently maintained.
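As a very rough sketch of what such a surface-form-independent representation could look like, the snippet below maintains a single interaction context that can be updated from pointing, speech, or system-side highlighting alike; all class and field names are hypothetical and only illustrate the idea of a shared, modality-neutral referent store.

```python
# Illustrative sketch of a modality-independent interaction context.
# Class and field names are hypothetical; real systems would ground
# referents in perception and foundation-model representations.

from dataclasses import dataclass, field

@dataclass
class Referent:
    entity_id: str          # stable identifier, independent of how it was referred to
    label: str              # e.g. a recognized object such as "red valve"
    source_modality: str    # "pointing", "speech", "xr_highlight", ...

@dataclass
class InteractionContext:
    referents: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def add_referent(self, entity_id: str, label: str, modality: str):
        """Record a reference regardless of the surface modality used."""
        self.referents.append(Referent(entity_id, label, modality))
        self.history.append((modality, f"referred to {label}"))

    def current_focus(self):
        """The most recently referenced entity, shared by user and system."""
        return self.referents[-1] if self.referents else None

ctx = InteractionContext()
ctx.add_referent("obj-17", "red valve", "pointing")   # user points at an object
ctx.add_referent("obj-17", "red valve", "speech")     # later refers to it verbally
print(ctx.current_focus().label)                      # context persists across modalities
```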
A critical overall question concerning the use of large foundation models is the reliability and trustworthiness of the information included in the model and of the outputs generated. Ideally, both algorithmic and social solutions for ensuring trustworthiness should operate in unison. Human discussion and feedback seem essential; the case of Wikipedia has shown that a high level of trustworthiness can be achieved through such social mechanisms. In addition, new algorithmic advances seem necessary to make the immense amount of information scrutable and discussable in the first place.
3 Explainable AI – what to explain and to whom?
Explainable AI (XAI) has been emerging as a central field that addresses the critical challenge of making the complex decision-making processes of AI transparent and understandable to users. Yet, many questions concerning the comprehensibility and usefulness of explanations remain.
3.1 Where are we in XAI?
AI systems often remain enigmatic to the very people they are designed to serve. The black-box nature of such technologies, particularly those rooted in deep learning, complicates the trust that users place in them and their perceived reliability. While XAI aims to shed light on the intricate workings of AI models, the explanations it provides tend to be steeped in technical jargon and algorithmic processes and are often not aligned with users’ real needs in supporting their decision processes. For example, Nunes and Jannach, 7 in a survey of explanations in decision support systems – especially recommender systems, where the target audience is largely lay users – found that more than half of the proposed explanatory models attempt to describe the inference process. Yet, it becomes increasingly questionable whether users are interested in this kind of information. Often, the very details intended to illuminate the AI’s decisions end up obfuscating them.
As pointed out in the introduction, one of the issues compromising the comprehensibility and usefulness of current XAI explanations is their model-centric focus, producing explanations only within the confines of algorithms and data patterns. This approach neglects the inclusion of external information that may be critical to contextualize decisions. Without broader context, explanations remain narrowly focused, limiting their relevance and meaningfulness. Returning to the example of recommendation systems, we can see that the algorithms used operate primarily on user interaction data. Explicit or implicit feedback behaviors such as ratings, time spent, or click rates are used as proxies for complex human needs, but by no means fully reflect what users actually require. The main limitation of such model-centric narratives is that they do not capture the multifaceted nature of human decision-making, which is influenced by a complex interplay of factors beyond what is immediately available in the data or model architecture. This oversight particularly limits the potential for AI to deliver actionable insights that are aligned with users’ specific contexts and needs.
Another major concern with current XAI efforts is their reliance on post hoc rationalizations. Such explanations are produced after the AI has already made a decision and thus abstract from the actual operational logic. This renders the approach vulnerable to misguided explanations that follow a simplistic or linear narrative and do not accurately represent the often complex, non-linear processes behind AI decisions. While such rationalizations may seem more tangible and closer to human explanatory patterns, there is a risk of oversimplification that could create a false sense of understanding in users.
3.2 Future explainable AI – what will it mean to explain?
Acknowledging the critical dilemma between simplifying explanations for user comprehensibility and maintaining the technical accuracy necessary for genuine insight, the future of XAI demands a paradigm shift. In our view, this shift involves evolving from static, model-centric explanations to dynamic, interactive dialogues that grow with the user’s understanding and needs, allowing explanations to become more tangible and meaningful through progressive mutual understanding. By prioritizing adaptability and user engagement, future XAI models may provide explanations that balance simplicity of understanding with technical accuracy, adjusted to the current state of the interaction. This approach marks a shift from static, one-time explanation schemes to a dynamic, collaborative exploration of AI’s decision-making processes.
Such a transformation suggests integrating external knowledge sources and embracing principles like the Web of Trust, where trusted external sources can mutually enhance the credibility of AI-generated explanations. Such a framework not only broadens the basis for explanations, but also introduces a layer of verifiability and context that current model-centric approaches lack. Furthermore, to address the need for trustworthiness and reliability in AI systems, the incorporation of structured logic capabilities – principles rooted in first-order logic – emerges as a promising avenue. This approach could enable AI not only to learn from data, but also to apply logical reasoning in generating explanations, making them more understandable and inherently verifiable. However, integrating these capabilities into the learning mechanisms of foundation models presents challenges that require innovations in how these models learn and, especially, how they articulate their reasoning.
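What layering explicit, rule-based reasoning on top of a learned output might look like is sketched below; the predicates, rules, and example domain are invented for illustration and do not correspond to any existing XAI framework or foundation-model capability.

```python
# Toy sketch of generating a verifiable explanation by applying explicit
# rules (in the spirit of first-order logic) to facts about a case.
# Rules, predicates, and the domain are illustrative assumptions only.

facts = {"smoker": True, "age_over_60": True, "blood_pressure_high": False}

# Each rule: (conclusion, set of premises that must all hold)
rules = [
    ("elevated_risk", {"smoker", "age_over_60"}),
    ("recommend_screening", {"elevated_risk"}),
]

def derive(facts: dict, rules: list):
    """Forward-chain over the rules and keep the derivation as an explanation."""
    known = {f for f, v in facts.items() if v}
    explanation = []
    changed = True
    while changed:
        changed = False
        for conclusion, premises in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                explanation.append(f"{' and '.join(sorted(premises))} -> {conclusion}")
                changed = True
    return known, explanation

known, explanation = derive(facts, rules)
print(explanation)
# ['age_over_60 and smoker -> elevated_risk', 'elevated_risk -> recommend_screening']
```

Because every derived conclusion carries its premises, a user could in principle inspect or contest each step, which is the kind of verifiability the paragraph above argues for.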
This proposed evolution of XAI signifies a move towards explanations that are not end points but starting points for deeper exploration and mutual learning between humans and AI systems. We argue that incorporating argumentation-theoretical principles into the XAI framework offers a promising path to more meaningful and understandable explanations for communicating these extended insights. Argumentation provides a coherent and rational basis for AI decisions, making them more accessible and persuasive to users. Dialogical models focusing on discursive forms of argumentation rather than static argument structures offer a promising theoretical ground for developing interactive argumentative explanations. 8 By presenting information through reasoned arguments, AI can outline the logic behind its conclusions, giving users insight into the “why” and “how” of its processing, moving beyond mere data points or representations of its inference processes.
While in sensitive domains such as healthcare, where the stakes of AI decisions are high, the accuracy and reliability of explanations should remain a priority, in domains such as entertainment or content recommendation, where the consequences of AI decisions are less critical, the perceived plausibility of explanations may even take precedence over their factual accuracy. Here, argumentation can play a significant role in enhancing user engagement and satisfaction. By presenting explanations that are plausible and logically consistent, even if simplified, AI can help users make sense of its decisions in a way that feels intuitive and satisfying, without necessitating a deep dive into complex data or algorithms.
Compared to post hoc rationalizations, which often present a one-time simplified or linear narrative, argumentation invites users into a dialogue, making the explanation process more dynamic and empowering users to critically evaluate the AI’s decisions, trustworthiness, and reliability. Thus, when AI systems are designed to engage in argumentative exchanges, they naturally become more interrogatable through two-way dialogues. This quality allows users not only to receive explanations, but also to explore the reasoning behind certain decisions, to ask for clarification on specific points, or even to challenge the AI’s conclusions. Such a dialogic approach is consistent with communication theories that emphasize the importance of interaction and mutual understanding in the exchange of information.
In sum, through the lens of argumentation, XAI can evolve to support a more dynamic, user-centered approach to explanation, where clarity, coherence, and critical engagement are prioritized. This evolution toward argumentative and interrogative systems marks a significant step forward in realizing the full potential of AI as a partner in decision making, bridging the gap between complex AI algorithms and user needs for transparency and trust.
4 The paradigmatic shift towards human-AI communication
Human-computer interaction in the past decades has been dominated by an action-oriented paradigm in which users interact by manipulating objects in a graphically simulated world. This style of interaction, the graphical user interface, has been immensely successful and has contributed to the omnipresence of information technology in all areas of life. The increasingly powerful conversational interaction style, however, is poised to challenge the dominant position of the GUI. In the following, we discuss the prospect of conversational interaction becoming the leading paradigm for interacting with intelligent systems.
4.1 Communication as the leading interaction paradigm
As we look towards a future where XAI overcomes its current limitations through dialogic, context-aware, and logical explanations, we are also striving for a new paradigm of human-AI interaction. In this sense, human-AI interaction is evolving from the provision of static explanations to a more dynamic and reciprocal form of communication. Human-AI dialogue introduces a new dimension to social interaction that is distinct from traditional human-to-human conversation and solitary reflection, such as writing. This distinction is rooted in the unique characteristics and capabilities of AI which, despite their growing power, are, as we believe, essentially different from human intelligence grounded in biological life. When this distinction is integrated into a socio-technical communication framework, it provides insights into the transformative potential of human-AI co-evolution.
Human-AI communication surpasses the limitations of human-human interaction by processing vast amounts of data, applying complex algorithms to identify patterns, and generating real-time responses. Unlike human interlocutors, AI can instantly access and analyze extensive knowledge bases, providing insights that may not be readily available or may be overlooked by humans. This feature facilitates an informative and augmentative form of interaction, enhancing the cognitive processes of human participants. AI’s unique position as a constant interlocutor allows for an uninterrupted flow of ideas, fostering a deeper discursive exploration of topics. This enables an in-depth discussion that can reveal insights and perspectives that may have been previously inaccessible or overlooked in human-to-human interactions or solitary contemplation.
Building on the foundation of reciprocal chains of communication, we are approaching a future where AI not only converses with humans but also actively co-creates with them. This introduces ‘adaptive proactivity’ as one cornerstone of this new interaction paradigm. The concept extends the dynamic of the exchange, allowing AI to initiate actions that anticipate user needs, refine collaborative processes, and enrich the dialogue with contextually relevant information and suggestions. For instance, in a project management context, an AI could automate routine tasks based on its comprehension of the project timeline or propose checkpoints for human review in more nuanced areas. This balanced approach ensures that AI contributions are not only reactive but also strategically anticipatory, enhancing the fluidity and depth of human-AI discourse.
In summary, our proposed socio-technical hypothesis of communication recognizes the intertwined nature of social processes and technological systems in shaping communication. It posits that the unique characteristics of AI-based communication, such as its adaptability, proactivity, and capacity for data-driven insights, create new forms of social interaction that are not possible in purely human contexts. This hypothesis emphasizes that AI systems, due to their design and functionality, actively participate in the communication process. They are influenced by and also influence the social dynamics in which they are embedded.
4.2 The emergence of new demands on user skills and competences
Adaptive proactivity is not an end in itself. Rather, it can be used strategically to meet individual user needs. In contrast to conventional interfaces, some of which make rigid demands on the user’s skills, intelligent agents are equipped with the ability to adapt to personal skill profiles. This does not mean, however, that this increased flexibility on the part of the system implies a complete liberation from requirements on the part of the user. Rather, it calls for a rethinking of the scope of competence, which no longer refers to the learning of static user interfaces and a predetermined repertoire of functions, but instead assumes increased cognitive flexibility and readiness.
This means that people must transfer social skills to their interaction with technology while, at the same time, constantly reminding themselves that they are not communicating with another person. Although AI mimics human communication behavior, it ultimately remains a productive means that subordinates itself to the human way of conveying meaning. AI may be able to understand, classify, or even produce emotional subtexts, for example, but it does nothing more than facilitate communication for its human interlocutor. Despite this, or perhaps because of it, humans need to remember that while the communication may appear interpersonal, their counterpart is ultimately a technical system whose performance and potential for error are fundamentally different from those of a human.
Nevertheless, from an analytical perspective, we believe it makes sense to understand the exchange between humans and AI as a communication process. From the perspective of communication theory, communicative exchange is characterized by the fact that the internal processes of each agent involved remain opaque – they encounter each other as black boxes. In order for communication to be successful and to increase the likelihood of subsequent connections, it is therefore necessary to be able to deal with the resulting contingency. We now argue that this sense of contingency, which inevitably arises in interpersonal communication, can also be applied to human-AI interaction. Unlike the unpredictable nature of human interactions, however, which are to a significant extent driven by personal experience, stances, and emotion, contingency in human-AI communication arises from the operational parameters and complexity of AI systems. AI responses are dependent on their programming and the data they process, which introduces a different kind of indeterminacy – one that is algorithmically bounded and devoid of human experiential context.
This divergence fundamentally affects the expectation of follow-up communication. In human dialogues, responses are influenced by rich social cues and emotional connections, fostering an open exchange of ideas. In contrast, follow-up communication with AI is constrained by the system’s current capabilities and underlying algorithms, requiring a different approach to anticipating responses. Despite its advanced capabilities, AI does not possess self-awareness or true understanding. In this sense, it should not be overlooked that while human-AI interactions are indeed communicative acts, are therefore comparable to interpersonal exchange processes, and often assume similar dynamics, the communication expectation with which humans encounter an AI is ultimately fundamentally different. This different quality of contingency, which is in essence built on vague mental models of the algorithmic processing and data analysis capabilities ascribed to the AI, has significant implications for these expectations.
Therefore, it is vital that users critically evaluate AI responses, recognizing that despite their linguistic sophistication, these responses are generated from a finite set of data inputs and processing rules. This critical scrutiny is essential because the verification of AI-generated information does not benefit from the collective validation found in human discussion or publicly accessible sources such as Wikipedia.
Finally, while the primary medium for human-AI communication is human language, the effectiveness and efficiency of this exchange can vary widely. Effective communication with AI requires clarity, conciseness, and the ability to provide unambiguous feedback, which, while not requiring deep technical expertise, benefits from an understanding of the system’s data-driven foundations and limitations. This knowledge is critical to setting realistic expectations about the AI’s capabilities and ensuring that interactions remain productive and meaningful.
5 Towards human-AI co-evolution
Human-AI co-evolution refers to the process by which human and artificial intelligence systems influence and enhance each other’s capabilities over time. This concept envisions AI not just as a tool or assistant, but as a partner that dynamically contributes to shared goals. Co-evolution in this context implies a synergy in which both humans and AI systems adapt, learn, and evolve through ongoing interactions, thereby expanding their cognitive and operational capabilities.
Each interaction, whether textual, vocal, gestural, or even non-verbal, embodies a complex dialogic exchange that is burdened with a high degree of mutual contingency due to the inherent opacity of the internal processes of both agents. For this reason, we propose to overcome the one-sided and narrow focus of current user-centered AI projects and instead promote a deeper exploration of communication theories that allow profound insights into the navigability and improvement of the predictability of this dynamic exchange process. From this perspective, we define two factors as essential for successful co-evolution: (1) the establishment of a productive common ground, and (2) the emergence of a novel mutual capacity for insight, creativity, and problem solving.
5.1 Common ground
Establishing common ground, the shared understanding necessary for effective communication and collaboration, is a fundamental aspect of achieving co-evolution. In human interactions, common ground involves the alignment of mental models and expectations through communicative acts, even though the thoughts and feelings of each participant remain opaque to others. Similarly, in human-AI interactions, creating common ground means developing a system in which both human users and AI systems can predict and understand each other’s responses and behaviors to some degree. This involves overcoming the inherent contingency of AI’s operating parameters – where AI’s responses are shaped by the algorithms and data it has processed – and aligning them with human expectations and needs.
Incorporating insights from communication theory, where successful communication often depends on navigating the contingency inherent in not fully knowing another’s internal state, we apply these principles to human-AI interaction. The challenge here is different because AI systems, by their nature, process and respond based on predefined algorithms and learned data, and lack the intuitive sense of context that humans typically use. To overcome this, AI systems must be designed to interpret human input in a contextually aware manner, leveraging advances in natural language processing and machine learning to better mimic this human ability, thus fostering a more intuitive and seamless exchange of ideas.
The crucial point here, however, is that clarity and predictability of interaction need not be exclusively – or even primarily – the product of full transparency, but emerge gradually through sustained engagement. Over time, repeated interactions allow both humans and AIs to adapt to each other’s patterns and idiosyncrasies, fostering mutual understanding and mitigating initial unpredictability.
This is where the concept of Theory of Mind (ToM) 9 emerges as a critical component in the evolution of user-centered AI, marking an important advance in AI’s ability to engage in more meaningful and productive interactions. ToM refers to the ability to attribute mental states – beliefs, desires, intentions, emotions – to oneself and others, and to understand that others have beliefs and desires that differ from one’s own. By equipping AI with the ability to infer the user’s underlying mental states from their interactions, we can greatly enhance AI’s responsiveness and adaptability, ensuring that its contributions resonate deeply with the user’s current cognitive and emotional context.
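One deliberately simplified way to operationalize such an inference over user mental states is a Bayesian update over hypothesized user goals given observed interaction behavior, as sketched below; the goals, observations, and probabilities are illustrative assumptions only and do not reflect empirical values.

```python
# Deliberately simplified sketch of inferring a user's likely goal from
# observed interactions via a Bayesian update. Goals, observations, and
# probabilities are illustrative assumptions, not empirical values.

# Prior belief over what the user is trying to do
prior = {"compare_options": 0.4, "find_cheapest": 0.3, "just_browsing": 0.3}

# P(observation | goal): how likely each behavior is under each goal
likelihood = {
    "opens_many_detail_pages": {"compare_options": 0.7, "find_cheapest": 0.2, "just_browsing": 0.4},
    "sorts_by_price":          {"compare_options": 0.3, "find_cheapest": 0.8, "just_browsing": 0.2},
}

def update_belief(belief: dict, observation: str) -> dict:
    """Posterior over goals after seeing one interaction event."""
    unnormalized = {g: belief[g] * likelihood[observation][g] for g in belief}
    total = sum(unnormalized.values())
    return {g: p / total for g, p in unnormalized.items()}

belief = update_belief(prior, "opens_many_detail_pages")
belief = update_belief(belief, "sorts_by_price")
print(max(belief, key=belief.get), belief)   # most plausible goal given the behavior
```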
5.2 Cognitive resonance
Once a productive common ground is established, a state can be achieved in which human and AI cognitive processes not only align, but also reinforce each other’s capacities for insight, creativity, and problem solving. Cognitive resonance occurs when AI systems are so finely tuned to the user’s cognitive and emotional states, as well as their contextual needs, that their interactions result in a significant enhancement of cognitive capabilities. This goes beyond mere adaptation or personalization; it involves AI systems engaging in a form of dialogue that resonates deeply with the user’s thought processes, sparking new ideas and facilitating a deeper understanding of complex concepts or problems. It is a manifestation of AI’s ability to not only respond to queries or engage in dialogue, but to do so in a way that feels like an extension of the user’s own mind, enhancing their cognitive capabilities in real time.
Thus, acknowledging the ongoing debate about AI’s ability to exhibit human-like creativity, we argue that AI’s most effective role in the creative process is not as a creator that replaces human creativity altogether, but as an enhancer and facilitator. Drawing on cognitive theories such as conceptual blending, 10 which posits that creativity involves the blending of different cognitive domains to form new ideas, AI’s vast computational capabilities uniquely position it to suggest novel combinations that might elude the human brain. By accessing and analyzing large knowledge bases, AI can introduce a wide range of concepts, connections, and data-driven insights that stimulate and extend human creative capabilities without attempting to replace the intuitive leaps and subjective insights that characterize human creativity. In addition, the Geneplore model 11 illustrates creativity as a cyclical process of idea generation and exploration in which AI can play a central role.
In this context, AI can act as a catalyst by dynamically engaging with users and offering real-time improvisations that stimulate creative thinking. This is exemplified by systems such as the Drawing Apprentice, a co-creative drawing partner designed to collaborate with users on abstract sketches. 12 Such systems, which have been identified as part of a new genre of creative technologies called “casual creators,” aim to provide enjoyable creative experiences rather than focusing solely on improving the quality of creative output. They introduce a framework of participatory sense-making to model and understand open-ended collaboration. During an exploration phase, AI’s ability to provide immediate, data-driven feedback supports the iterative refinement of ideas. This feedback mechanism is critical because it enables the dynamic evolution of creative output, adjusting and responding in real time to the creator’s evolving work.
In sum, AI can enhance human creativity by introducing new insights or perspectives drawn from broader data sets than a human could readily comprehend, or through specific techniques such as prompting, nudging, and the use of tailored analogies.
At the same time, the reverse is also true: not only can AI help enhance the cognitive potential of humans, but humans can also provide AI with insights and suggestions that cannot be derived from its internal pattern matching. Unlike traditional machine learning scenarios where AI learns from large data sets, in a co-evolutionary setup, AI would adapt based on direct interaction cues from users. This could involve AI recognizing subtle cues from user feedback to refine its algorithms. For example, a recommendation system could adjust its suggestions based not only on explicit user ratings, but also on inferred satisfaction derived from user interaction times or follow-up actions.
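To illustrate, a minimal sketch of such an adjustment is given below: an item’s preference estimate is updated not only from an explicit rating but also from a satisfaction signal inferred from interaction time and follow-up actions; the weighting scheme and normalization constants are assumptions chosen for illustration, not parameters of any real recommender.

```python
# Illustrative sketch of adjusting a recommendation score from both explicit
# ratings and implicit satisfaction cues (dwell time, follow-up actions).
# The weights and normalization constants are assumptions for illustration.

from typing import Optional

def implicit_satisfaction(dwell_seconds: float, followed_up: bool) -> float:
    """Map interaction cues to a rough satisfaction estimate in [0, 1]."""
    dwell_component = min(dwell_seconds / 120.0, 1.0)   # saturates after two minutes
    return 0.7 * dwell_component + (0.3 if followed_up else 0.0)

def updated_preference(explicit_rating: Optional[float],
                       dwell_seconds: float,
                       followed_up: bool,
                       alpha: float = 0.6) -> float:
    """Blend an explicit rating (scaled to [0, 1]) with inferred satisfaction."""
    implicit = implicit_satisfaction(dwell_seconds, followed_up)
    if explicit_rating is None:          # no rating given: rely on implicit cues only
        return implicit
    return alpha * (explicit_rating / 5.0) + (1 - alpha) * implicit

# User rated an item 3/5 but spent long on it and saved it for later:
print(updated_preference(explicit_rating=3, dwell_seconds=150, followed_up=True))  # 0.76
```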
In summary, the concept of human-AI co-evolution represents a transformative shift in our approach to artificial intelligence, from viewing AI as merely reactive tools to treating them as dynamic partners in a shared journey of cognitive and creative improvement. By establishing a productive common ground through improved communication strategies and mutual understanding, both humans and AI can transcend traditional roles and collaboratively forge new paths in problem solving and innovation. This co-evolutionary process leverages AI’s ability to process massive amounts of data and generate novel insights, while, at the same time, also harnesses human creativity to guide AI toward more nuanced and contextually relevant applications. This mutual learning and adaptation between humans and AI promises to unlock unprecedented potential in both artificial intelligence and human cognitive expansion, creating a state of cognitive resonance in which both parties can achieve higher levels of understanding and capability than would be possible individually.
5.3 Example dialogue: climate resilience planning with AI assistance
Task Overview: The city of Duisburg is initiating a comprehensive climate resilience project aimed at making the urban area more adaptable to extreme weather events such as floods, heatwaves, and storms. Emma, a climate resilience planner, collaborates with an AI system, ResAI, designed to assist in analyzing climate risks, simulating potential impacts of climate adaptation strategies, and generating sustainable urban planning solutions:
Emma: “ResAI, we need to draft a resilience strategy that addresses our primary risks: flooding, extreme heat, and severe storms. Can you analyze the historical climate data and predict the areas most at risk?”
ResAI: “Certainly, Emma. Analyzing the last 30 years of climate data, the riverbank regions are most prone to flooding, especially during spring. Urban centers experience significantly higher temperatures during heatwaves due to the urban heat island effect. I suggest focusing on these areas first. Shall I proceed with a risk assessment model?”
Emma: “Yes, please. Include socioeconomic factors in your model. We need to ensure our interventions do not disproportionately affect low-income communities.”
ResAI: “Understood. I am integrating socioeconomic data into the risk model. The preliminary findings suggest that green infrastructure, such as creating wetlands along the river and increasing urban green spaces, can mitigate these risks effectively while benefiting all community sectors. Would you like detailed projections on these solutions?”
Emma: “That would be perfect. Also, consider how these green infrastructures can be multi-functional, like providing recreational spaces or urban farming areas.”
ResAI: “I have modeled several scenarios. One promising option includes transforming 30 % of underutilized urban areas into green spaces that serve as community gardens and stormwater management systems. This approach reduces flood risk by 45 % and lowers local temperatures by up to 3 °C on peak heat days.”
Emma: “I like this integrated approach. Can you simulate the long-term maintenance costs and the potential social impact of these green spaces?”
ResAI: “Simulating now. The initial installation of green infrastructure is cost-intensive but leads to a 20 % reduction in annual maintenance costs compared to traditional infrastructure due to reduced strain on drainage systems and improved ecological sustainability. Social impact studies indicate high community approval with increased local biodiversity and improved public health outcomes.”
Dialogue concepts: Emma and ResAI establish a common ground by aligning their objectives around creating sustainable, community-focused solutions for climate resilience. ResAI provides crucial data-driven insights and simulations that enhance Emma’s ability to propose actionable, evidence-based strategies. This example highlights the co-evolutionary relationship in which AI contributes to expanding human capacities in strategic planning and decision-making, particularly in complex areas like climate change adaptation.
6 Conclusions
The area of AI is currently moving at an astounding pace, making it difficult to predict future capabilities and to anticipate timelines of technological developments. In line with these rapid developments, the concept of human-centeredness of AI systems needs to evolve as well. In this article, we have attempted to sketch out conceptual trajectories along which UCAI may evolve. With the increasing perceptual capabilities of AI systems in conjunction with increasingly powerful foundation models, we see a development towards transmodal interaction, where user and AI can fluidly transition between modalities while always maintaining context. We see a second trajectory in the development of a communicative paradigm of human-AI interaction that draws on ever-increasing AI capabilities but is distinct from human-human communication. Finally, a co-evolution of human and AI can be envisioned in which both partners learn from and augment each other. The central concern of UCAI must be to ensure that human values are recognized and used as guiding principles in these future developments.
About the authors
Jürgen Ziegler is a senior full professor in the Faculty of Computer Science at the University of Duisburg-Essen where he directs the research group on interactive intelligent systems. His main research interests lie in the areas of human-computer interaction, human-AI cooperation, recommender systems, and explainable AI. Among numerous other scientific functions, he was the founding and long-term editor-in-chief of the Journal of Interactive Media. He is also vice-chair of the German special interest group on user-centered artificial intelligence.
Dr. Tim Donkers is a Postdoctoral Researcher at the University of Duisburg-Essen, specializing in Computational Social Science. His research focuses on the development of innovative simulation approaches for studying opinion dynamics and social polarization using machine learning techniques. Previously, Tim Donkers contributed to the field of recommender systems, particularly in enhancing their transparency and explainability. His work continues to explore the intersection of technology and social behavior, seeking to understand and influence how digital platforms impact societal trends.
Research ethics: Not applicable – not a study.
Author contributions: The authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: The authors state no competing interests.
Research funding: None declared.
Data availability: Not applicable.
References
1. Turing, A. M. Computing Machinery and Intelligence. Mind 1950, LIX (236), 433–460. https://doi.org/10.1093/mind/LIX.236.433.
2. Fischer, G. Entwurfsrichtlinien für die Software-Ergonomie aus der Sicht der Mensch-Maschine Kommunikation (MMK). In Software-Ergonomie; Teubner, B. G., Ed.; Stuttgart, 1983; pp. 30–48.
3. Gunning, D. Explainable Artificial Intelligence (XAI) at DARPA, 2017. https://asd.gsfc.nasa.gov/conferences/ai/program/003-XAIforNASA.pdf (accessed Jun 24, 2024).
4. Rudin, C.; Chen, C.; Chen, Z.; Huang, H.; Semenova, L.; Zhong, C. Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges, 2021. arXiv:2103.11251. https://doi.org/10.48550/arXiv.2103.11251.
5. Hernandez-Bocanegra, D. C.; Ziegler, J. Explaining Recommendations through Conversations: Dialog Model and the Effects of Interface Type and Degree of Interactivity. ACM Trans. Interact. Intell. Syst. 2023, 13 (2), 6:1–6:47. https://doi.org/10.1145/3579541.
6. Xu, W. Toward Human-Centered AI: A Perspective from Human-Computer Interaction. Interactions 2019, 26 (4), 42–46. https://doi.org/10.1145/3328485.
7. Nunes, I.; Jannach, D. A Systematic Review and Taxonomy of Explanations in Decision Support and Recommender Systems. User Model. User-Adapt. Interact. 2017, 27, 393–444. https://doi.org/10.1007/s11257-017-9195-0.
8. Walton, D. A New Dialectical Theory of Explanation. Philos. Explor. 2004, 7 (1), 71–89. https://doi.org/10.1080/1386979032000186863.
9. Langley, C.; Cirstea, B. I.; Cuzzolin, F.; Sahakian, B. J. Theory of Mind and Preference Learning at the Interface of Cognitive Science, Neuroscience, and AI: A Review. Front. Artif. Intell. 2022, 5, 778852. https://doi.org/10.3389/frai.2022.778852.
10. Fauconnier, G.; Turner, M. Conceptual Blending, Form and Meaning. Rev. Commun. 2003, 19, 57–86. https://doi.org/10.14428/rec.v19i19.48413.
11. Finke, R. A.; Ward, T. B.; Smith, S. M. Creative Cognition: Theory, Research, and Applications; MIT Press: Cambridge, MA, 1992. https://doi.org/10.7551/mitpress/7722.001.0001.
12. Davis, N.; Hsiao, C. P.; Yashraj Singh, K.; Li, L.; Magerko, B. Empirically Studying Participatory Sense-Making in Abstract Drawing with a Co-creative Cognitive Agent. In Proceedings of the 21st International Conference on Intelligent User Interfaces, 2016; pp. 196–207. https://doi.org/10.1145/2856767.2856795.
© 2024 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.