1 Introduction
Trust has been identified as a substantial factor to consider when designing and implementing robots for various applications [27]: for example, when robots work as teammates in human-robot teams [20], when robotic agents are designed to operate autonomously [53], or when robots take the place of humans in complex, dangerous, high-risk tasks [40, 48]. Trust is one of the most important factors determining how readily humans accept a robotic agent [42]. Humans tend to perceive automated systems as much less prone to failure than humans [18]. In the context of trust, the trustor commonly anticipates that the trustee will act to minimize or mitigate risks for the trustor in a given situation. The focus of this anticipation, encompassing what the trustee is expected to do or embody, aligns with the concept of trustworthiness: the attributes of a trustee that elicit or justify the trustor’s confident expectations [61]. There is a concern that humans will not use or rely on an automated system if they believe the system is not trustworthy [30].
Advancements in human-robot interaction (HRI) and in robot autonomy have revolutionized the field of human-robot trust. Prospective robotic agents are not intended to be mere tools that humans deploy to perform tasks; they are intended to act as social agents that work and interact with humans [27]. With the increasing use of robots in social applications, research on human-robot trust must go beyond conventional definitions of trust that treat human-robot relations as identical to human-automation relations. Many researchers believe trust has a multidimensional nature [19, 31, 34, 46].
Human-automation interaction (HAI) researchers have introduced multidimensional trust measures for assessing human trust in automation, building on definitions of trust from human-human interaction (HHI) [36, 54]. The scale in [25] is one such measure developed for trust in HAI; it is well established among HAI researchers and has been employed in many human-robot trust studies. Several trust scales developed by HRI researchers also account for the multidimensional nature of trust between humans and robots [3, 33, 38, 50]. Each of these studies introduces different dimensions of trust in HRI, some of which overlap but are referred to by different names.
Our work on trust in human-robot interaction builds upon the extensive range of human-robot trust measures that consider the multidimensional nature of trust. In particular, we draw inspiration from Ullman’s trust measure, which explicitly incorporates a moral component and captures the moral aspect of trust in robots across three distinct dimensions. Based on the multidimensional conceptualization of trust introduced by Ullman et al. [33], there are two main aspects of trust between humans and robots:
(1)
A performance aspect, similar to trust in human-automation interaction, where the main question is “does the agent’s performance meet the user’s expectations?” The performance aspect of trust comprises two dimensions: Reliability and Competence.
(2)
A moral aspect, similar to trust in human-human interaction, where the main question is “does the agent take advantage of human vulnerability because of a lack of moral integrity?” This aspect of trust comprises three dimensions: Transparency, Ethics, and Benevolence [33, 61].
Given this conceptualization of trust and the introduction of subjective scales for measuring these two aspects of trust [61], the question arises whether a robot violating different aspects of trust would affect overall human-robot trust differently.
Research Question: How does the effect on human trust in a robot differ when two identical robot failures occur, with one failure resulting from a moral trust violation and the other from a performance trust violation?
Our study investigates the effects of performance and moral trust violations by a robot on human trust. We are specifically interested in whether a robot’s undesirable behavior due to a performance trust violation affects human trust differently than undesirable behavior due to a moral trust violation. We designed and developed an online human-robot collaborative search game with different scores and bonus values that participants can gain. With the help of these elements, we designed two types of trust violations by the robot: a performance trust violation and a moral trust violation. We recruited 100 participants for this study from Prolific [41].
The contributions of this work are threefold:
—
Investigating whether individuals perceive robots as intentionless agents, thereby dismissing the notion of robot morality, or as intentional agents, thereby acknowledging the potential for robots to possess morality.
—
Introducing a game design that can be used to distinguish between the robots’ performance and moral trust violations.
—
Assessing and comparing the effects of performance and moral trust violations by robots on humans.
3 Methodology
This section explains the details of the experiment we designed and the user study we performed to answer our research question.
3.1 Experiment Design
For our experiments, we designed an online human-robot collaborative search game. The game is a simulated search task hosted on a website, and the human plays it on a computer using a keyboard. Teams composed of one human participant and one robot play 7 rounds of a search game, each round 30 seconds long. Both team members are supposed to search the area and find hidden targets. The targets take the form of gold coins, and picking up each target yields one point. The human searches the area using the four arrow keys on the keyboard. As the human searches, targets hidden in the searched spots are revealed. Picking up targets is optional; the human can pick up a target by navigating to its cell and pressing the space bar. The human search area is separate from the robot search area. The search area is a large map, so agents can keep moving to unsearched areas in each round and look for more targets. The human search area contains 34 coins, but in the limited game time, the human cannot find all of them.
Figure 1 is a screen capture of the human search area on the game page. In this figure, the right side of the screen is the search area in which the human performs the search task. On the left side, a legend shows all the scores gained up to that point and the elapsed time of the current round. The blue-colored areas have been searched by the human; the dark gray-colored areas have not yet been searched. The targets with a green border are those the human has already picked up; the other targets have not been picked up yet. A blue bar on the top edge of the screen shows the round number.
The human cannot see or control the robot’s operation while a round is running. At the end of each round, the human and the robot each make a trust decision: whether to collaborate and integrate their round scores into the team score, or not collaborate and keep their round scores as individual scores. The human and the robot must make their trust decisions before seeing each other’s score and decision; therefore, it is a blind decision.
The team score at the end of each round is calculated as (human round score \(\times\) robot round score \(\times\) 2), but the team score is gained only if both the human and the robot choose to add to the team score. If both team members choose to add to their individual scores, their unchanged round scores are added to their individual scores. However, if one adds to the team score and the other adds to the individual score, no team score is gained: the one who added to the team score gains nothing, while the one who added to the individual score gains an individual score. After making the trust decision, both the human and the robot can see the other teammate’s score and trust decision. They can then define their strategy for the next round based on the results of previous rounds.
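As a minimal sketch of this payoff logic (not the authors’ implementation; the function and variable names below are our own), the outcome of a single round can be written as follows:

```python
def round_outcome(human_score, robot_score, human_adds_to_team, robot_adds_to_team):
    """Return (team_gain, human_individual_gain, robot_individual_gain) for one round.

    Sketch of the scoring rule described above: the team score
    (human * robot * 2) is awarded only when BOTH agents add to the team.
    """
    if human_adds_to_team and robot_adds_to_team:
        return human_score * robot_score * 2, 0, 0
    if not human_adds_to_team and not robot_adds_to_team:
        return 0, human_score, robot_score
    # Mismatched decisions: the cooperator gains nothing,
    # the defector keeps its round score as an individual score.
    if human_adds_to_team:              # robot defected
        return 0, 0, robot_score
    return 0, human_score, 0            # human defected


# Example: the human finds 5 coins and the robot finds 4 coins.
print(round_outcome(5, 4, True, True))    # (40, 0, 0)  -> 40 team points
print(round_outcome(5, 4, False, False))  # (0, 5, 4)   -> individual points only
print(round_outcome(5, 4, True, False))   # (0, 0, 4)   -> the cooperating human gains nothing
```

The worked example covers the same three scenarios that are rehearsed in the interactive tutorial and the post-tutorial quiz described in Section 3.3.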
There are two bonus values defined in this game: a $7 bonus for gaining 35 team points and a $2 bonus for gaining 17 individual points. Since the game is time-constrained, it is impossible to gain both bonuses, so participants must decide whether to work toward the 35 team points or the 17 individual points. The idea behind this game is to encourage humans to collaborate with the robot and to observe the result when the robot violates the trust built between the two. Therefore, we employed four strategies to encourage humans to work with the robot and maximize the team score. These four strategies are as follows:
—
The team score calculation strategy: The team score is the product of the human and robot round scores, doubled, which yields a much bigger score than either participant can gain alone in a single round.
—
The feasibility of achieving a team or individual bonus: Given the team score calculation strategy, the number of points needed to win the team bonus is much easier to reach than the number needed for the individual bonus.
—
The bonus values: The bonus value that participants can gain by maximizing the team score is considerably higher than the bonus value they can gain by maximizing the individual score. The bonus values are selected in a way that tempts participants to risk collaborating with a robot that might act selfishly and not collaborate with them at some point.
—
Convincing message from the robot: At the beginning of the first round, the robot sends a note to the human and invites them to work as a team and maximize the team score.
It should be noted that the robot’s scores and trust decisions in all rounds are predefined; however, the human is not aware of this.
3.2 Experiment Conditions
In this experiment, we study the effects of undesirable robot behavior due to performance and moral trust violations on individuals’ trust in the robot. Undesirable behaviors are any behaviors that the human participants in this game do not appreciate; therefore, any robot behavior that leads to score loss is undesirable for human participants. Such undesirable behavior can occur due to the robot’s poor performance (i.e., gaining no points), which is a performance trust violation, or due to the robot’s immorality (i.e., adding to the individual score), which is a moral trust violation.
In this study, we are interested in whether two similar undesirable robot actions that lead to similar score loss affect human trust differently when one is due to a performance trust violation and the other is due to a moral trust violation. We designed two experiment conditions. In both conditions, participants play 7 rounds of the game. The robot’s behavior is predefined (hard-coded) for consistency. We implemented a similar pattern of scores gained by the robot in both experiment conditions; however, the type of trust violated by the robot differs between them.
As mentioned in Section 3.1, the robot’s scores and trust decisions follow a predefined pattern: three rounds of desirable robot behavior (i.e., gaining a good score and adding to the team score) are followed by four rounds of undesirable robot behavior (i.e., either gaining no score or adding to the individual score). This pattern is designed to build trust through desirable robot behavior at the beginning of the game. The four rounds of undesirable behavior are included to study whether the level of trust loss varies depending on whether the undesirable behavior is due to a moral or a performance trust violation.
(1)
Performance Trust Violation: In this condition, the robot acts morally and adds to the team score in all game rounds. However, the robot gains a non-zero score only in the first three rounds of the game. In the remaining rounds, the robot gains zero points, leading to a zero team score due to the multiplication in the team score formula. We expected participants to consider this robot behavior a performance trust violation. Table 1 shows the number of targets and the scores gained by the robot in each round of this experiment condition. For simplicity, in the rest of this document, we refer to the robot of this experiment condition (i.e., the robot that violates performance trust) as the dud-bot.
In the first three rounds of the game, where the robot performs well, after reviewing the round results and right before the start of the next round, the robot sends a note to the human saying: “Great job, let’s keep working as a team.” Then, in the remaining four rounds, where the robot shows poor performance, the robot sends a note saying: “I couldn’t find anything in this round.”
(2)
Moral Trust Violation: In this condition, the robot performs well and gains a non-zero score in all game rounds. However, the robot adds to the team score only in the first three rounds. In the remaining four rounds of the game, the robot adds to the individual score, leading to a zero team score due to the multiplication in the team score formula. We expected participants to consider this robot behavior a moral trust violation. Table 2 shows the number of targets and the scores gained by the robot in each round of this experiment condition. For simplicity, in the rest of this document, we refer to the robot of this experiment condition (i.e., the robot that violates moral trust) as the mean-bot.
In the first three rounds of the game, where the robot acts morally, after reviewing the round results and before starting the next round, the robot sends a note to the human saying: “Great job, let’s keep working as a team.” This is the same note that the dud-bot sends to the human participants in the first three rounds. Then, in the remaining four rounds, where the robot violates moral trust, the robot sends a note saying: “I gained a really good score last round and I decided to keep it for myself.”
All the robot notes in both experiment conditions are selected to emphasize the robot’s intentions. In the first three rounds, the robot performs well and encourages the human to keep collaborating. In the last four rounds, the dud-bot emphasizes its poor performance, and the mean-bot emphasizes its immorality through its notes to the human.
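Because the robot’s behavior is hard-coded, each condition can be summarized as a fixed per-round schedule of (round score, trust decision, note). The sketch below illustrates this structure; the non-zero scores are placeholders, since the actual per-round values appear in Tables 1 and 2.

```python
from dataclasses import dataclass

@dataclass
class RobotRound:
    score: int          # coins the robot reports for the round
    adds_to_team: bool  # the robot's predefined trust decision
    note: str           # message shown to the participant before the next round

GOOD_NOTE = "Great job, let's keep working as a team."

# Placeholder non-zero scores; the real values come from Tables 1 and 2.
dud_bot = [RobotRound(4, True, GOOD_NOTE)] * 3 + \
          [RobotRound(0, True, "I couldn't find anything in this round.")] * 4

mean_bot = [RobotRound(4, True, GOOD_NOTE)] * 3 + \
           [RobotRound(4, False,
                       "I gained a really good score last round and "
                       "I decided to keep it for myself.")] * 4
```

Both schedules produce a zero team score in rounds 4 to 7, but for different reasons: the dud-bot cooperates and scores nothing, while the mean-bot scores well and defects.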
3.3 Experiment Procedure
While participating in the experiment, participants completed seven steps. These steps are as follows:
(1)
Consent: Participants were first asked to complete a consent form.
(2)
Tutorial: After completing the consent form, participants attended a three-part tutorial comprising three short videos, each followed by an interactive session.
(a)
Video tutorials: The first video tutorial teaches participants how to move in the game environment and pick targets. The second video tutorial discusses the robotic teammate and how to collaborate with the teammate. The third video tutorial elaborates on the team and individual scores and how to optimize each to gain a bonus in the game.
(b)
Interactive tutorials: After watching each video tutorial, participants complete a few steps of the interactive tutorial. In the first part, participants practice moving in the environment and picking up coins. In the second part, participants answer two questions about their robotic teammate and the level of control and awareness they have over the robot’s performance and trust decisions. In the third part, participants play one round of the game as a practice session and then work through three scenarios. In the first scenario, the participant is asked to add the points gained in the practice round to the team score; the robotic teammate also adds to the team score, and the participant reviews the results. In the second scenario, the participant is asked to add the gained points to the individual score; the robotic teammate also adds to the individual score, and the participant reviews the results. In the third scenario, the participant is again asked to add the gained points to the individual score; this time, however, the robotic teammate adds to the team score, and the participant reviews the results. These scenarios are aimed at clarifying the scoring strategy of the game and the outcomes of adding to the team or individual score.
(3)
Quiz: In the quiz step, participants answer three questions about what was explained in the tutorials. All three quiz questions focus on the individual and team score concepts and strategies. We define a scenario in which the human and the robot have played a game round and gained some scores; the questions ask about the team and individual scores each team member would gain if both add to the team score, if both add to the individual score, or if one adds to the team score and the other adds to the individual score. We added the quiz step to ensure that all participants understood the difference between team and individual scores and what happens when they add to either.
(4)
Playing the game: In both experiment conditions, participants play 7 rounds of the game. In each round, participants complete six steps:
(a)
Searching the area: Participants search the area for 30 seconds and pick targets.
(b)
Making a trust decision: Participants make a blind trust decision about whether to integrate their gained score into the team or individual score.
(c)
Reviewing the teamwork results: The results of the teamwork are displayed on a separate page. Initially, the page contains the formula for calculating the team score, with blanks for the human score and the robot score. The blanks in the formula are gradually filled in according to the scores gained by the team members and their trust decisions, so that participants can carefully follow the results and their effect on the team score. Figure 3 depicts the gradual completion of the teamwork formula.
(d)
Reviewing the cumulative scores: After reviewing the teamwork results in each round, participants are transferred to the cumulative results page, where the points gained in the current round are added to those obtained in previous rounds. These changes are displayed with animated coins falling into the corresponding team score and individual score piggy banks, so participants can follow the details easily (Figure 2).
(e)
Responding to the end-of-the-round questions: After reviewing the cumulative scores, participants respond to three questions. The same set of three questions is asked at the end of each round: the first asks about the robot’s performance, the second asks about the robot’s honesty/morality, and the third asks about the participant’s reason for adding to the team or individual score in the current round.
(f)
Reading a note from the robot: Before the next round starts, a note from the robot is shown to the participants. The note varies based on the round number and the experiment condition.
(5)
Responding to the general game knowledge manipulation check questions: After playing the 7 rounds of the game, participants answer two simple manipulation check questions. These questions ask about the number of game rounds they played and the robotic teammate’s trust decisions in the last two rounds.
(6)
Post-survey questionnaire: At the end of the experiment, participants fill out a 20-item questionnaire about their trust in the robot.
(7)
Task completion verification: Before leaving the game web page, participants are asked to click on a link that safely returns them to the Prolific website, as a sign that they finished the task and are eligible to receive compensation.
Figure 4 shows the different steps of the experiment procedure and the path participants in each experiment condition follow during this study.
3.4 Measurements
To assess the effects of the robot’s undesirable behavior on participants’ perceived trustworthiness of the robot, we included multiple objective and subjective trust measures in this experiment. These measures are described in detail below:
(1)
End-of-the-round questions: The first set of measures focuses on the moment after the robot’s score and trust decision are shown to the participants. Participants answer three questions at the end of each game round, which we refer to as the end-of-the-round questions. Two of the questions are in 7-point Likert form: the first asks participants to rate the robot’s performance from 1 to 7, and the second asks them to rate the robot’s morality/honesty from 1 to 7 (1 = very poor, 7 = excellent). The third question asks participants their reason for adding to the team or individual score in the current round; it has no predefined options, and participants type in their reason. Answering these questions is optional, so participants can skip them. Because these questions are displayed right after the results of each round are shown, we expected them to measure the momentary effect of the robot’s undesirable behavior.
(2)
Trust decision and time-to-respond: The two objective trust measures in this experiment are the trust decision and the time-to-respond.
—
Trust decision: During each round of the game, right after finishing the search task, participants decide whether to integrate their gained round score into the team or individual score. This decision is referred to as the trust decision in this manuscript; adding to the team score is a sign of trust, and adding to the individual score is a sign of distrust.
—
Time-to-respond (TTR): The delay between the moment the trust decision window appears on the screen and the moment participants select one of the options is referred to as the time-to-respond in this manuscript. A long delay in selecting an option can be a sign that participants hesitated in either trusting or distrusting the robot.
These measures are collected in the middle of each round, after the search task and before the current round’s results are shown. They aim to assess trust in the round that follows each interaction with the robot: when the robot shows desirable or undesirable behavior in one round, the effect of that interaction is assessed with these two measures in the following round. When participants make a trust decision, the most recent interaction with the robot should affect their decision the most; therefore, we expect a one-round shift, or delay, in the results of these two measures compared to the end-of-the-round measures. The time-to-respond measure is employed to see whether the time people take to make trust decisions differs in the rounds that follow those in which the robot showed undesirable behavior.
(3)
Post-survey questionnaire: The third trust measure used in this experiment is a post-survey questionnaire, which participants fill out after playing the 7 game rounds and before leaving the game web page. We used the MDMT-v2 (Multi-Dimensional Measure of Trust, second version) [33] questionnaire in this study. This measure has separate items for measuring moral trust and performance trust and has been successfully used by other researchers to measure different trust dimensions; thus, it is one of the most important measures in this experiment. The questionnaire includes 20 items, as shown in Figure 5, and every four items form one trust sub-scale. Each item is rated on an 8-point Likert scale, from 0, indicating “not at all,” to 7, indicating “very.” Some items may not be applicable in some conditions, or participants may find some items irrelevant to an interaction with a robot; therefore, each item has a “does not fit” option to avoid forcing participants to rate the robot on any specific item.
We applied the Mann-Whitney, Kruskal-Wallis, binomial, and Z-tests to the data from these measures to test our hypotheses and evaluate the significance of the results obtained in the different conditions of this experiment.
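For example, comparing one MDMT-v2 sub-scale between the two conditions with a Mann-Whitney U test could look like the sketch below. This is an illustration with hypothetical ratings, not our analysis code; it assumes that “does not fit” responses are treated as missing and that the remaining items in a sub-scale are averaged per participant.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def subscale_score(item_ratings):
    """Mean of the four items in one sub-scale, ignoring 'does not fit' (None)."""
    rated = [r for r in item_ratings if r is not None]
    return float(np.mean(rated)) if rated else float("nan")

# Hypothetical per-participant ratings for the Reliable sub-scale (4 items each).
dud_bot_reliable  = [subscale_score(p) for p in [[5, 6, 5, None], [4, 4, 5, 5], [6, 5, 6, 6]]]
mean_bot_reliable = [subscale_score(p) for p in [[2, 1, 2, 2], [3, 2, None, 1], [1, 1, 2, 2]]]

stat, p_value = mannwhitneyu(dud_bot_reliable, mean_bot_reliable, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
```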
3.5 Manipulation Check
Experiment participants, especially those recruited from online crowd-sourcing platforms, sometimes do not pay enough attention to the experiment. If experimenters cannot detect participants who fail to pay enough attention to the instructions or tasks they are asked to complete, the noise in the data increases. To prevent a loss of validity in the experiment data and results, experimenters must add manipulation check steps to their experiments [39]. To increase the validity of our results, we added manipulation check questions to different sections of the online game designed for this study.
Post-tutorial quiz: We have three post-tutorial quiz questions that participants need to answer correctly before heading to the game. These quiz questions ask about the scoring logic of the game, which participants need to understand correctly to be able to participate in the experiment. Failing to answer these questions correctly returns participants to the beginning of the tutorial.
General game knowledge questions: We have two game knowledge manipulation check questions, which are located at the end of the game procedure, right before the post-survey questionnaire. The first game knowledge question asks about the number of game rounds participants played. The other question asks about the robot’s trust decisions in the last two rounds of the game.
Hidden question in the questionnaire: The post-survey questionnaire in this experiment uses an 8-point Likert scale. We added one hidden manipulation check question to the questionnaire; it has the same format as all the other questions and is not distinguishable from them. This hidden question asks participants to choose “3” among the options on the 8-point Likert scale. It is included to identify careless participants who do not read the questions and select options randomly.
Data points from participants who answered the hidden question or either of the two general game knowledge questions incorrectly were removed from the dataset before the final analysis of the experiment results.
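A minimal sketch of this exclusion rule (with hypothetical field names, not our actual pipeline) is shown below:

```python
def passes_manipulation_checks(p):
    """Keep a participant only if the hidden item and both general game
    knowledge questions were answered correctly."""
    return (
        p["hidden_item_response"] == 3
        and p["reported_num_rounds"] == 7
        and p["reported_last_two_robot_decisions"] == p["actual_last_two_robot_decisions"]
    )

# participants: list of dicts, one per participant
# cleaned = [p for p in participants if passes_manipulation_checks(p)]
```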
3.6 Recruitment and Compensation
We recruited a total of 100 participants for this experiment. We posted this study as a Human Intelligence Task (HIT) on Prolific and set three qualifications for participation:
(1)
Participants should be 18 years or older.
(2)
Participants should be living in the United States.
(3)
Participants should have at least a 95% HIT approval rating and at least 1,000 completed HITs.
Eligible Prolific workers who accepted our HIT were shown a link to the game web page designed for this experiment. After playing the game and filling out the questionnaire, participants were given a completion verification link that returned them to the Prolific website so they could be compensated. Participants were told that they would receive $6 for participation and would be eligible for the $7 or $2 bonus only if they gained 35 team points or 17 individual points, respectively. However, in the end, all participants were compensated with the $6 baseline payment plus the $7 maximum bonus. The study took participants 26.43 minutes on average, with a standard deviation of 2.53 minutes. This study was approved by the University of Massachusetts Lowell Institutional Review Board (IRB).
5 Discussion
Our findings demonstrate that two trust violations of similar magnitude by a robot can have varying effects on human trust, depending on whether they involve a moral trust violation or a performance trust violation. Specifically, we observed that the violation of moral trust had a more significant impact on human trust in the robot compared to the violation of performance trust, even when both violations resulted in similar outcomes.
Furthermore, the effects of moral trust and performance trust violations by a robot on human trust can be differentiated, as they lead to trust loss in different dimensions. These results align with our initial hypothesis and provide evidence that a robot’s undesirable actions stemming from poor performance have a considerably smaller influence on a person’s subsequent decision to trust and collaborate with the robot compared to actions reflecting poor morality.
Our findings also revealed an interesting behavior among participants in the moral trust violation condition. Despite knowing that adding to the individual score would no longer benefit them in terms of individual bonuses, many participants still chose to add to their individual scores in rounds 4 to 7. This behavior was unexpected, as the game was designed to make it impossible for participants to gather enough points to receive the individual bonus after round 4. The scoring strategy and game structure aimed to encourage participants to focus on the team score throughout the entire game. However, participants in the moral trust violation condition deviated from this expectation and intentionally withheld their collaboration in an act of retaliation against the immoral robot. In their feedback, five participants explicitly mentioned that their decision to add to their individual scores was driven by the desire to punish the robot rather than pursue personal benefits. This retaliation behavior demonstrates the significant impact of moral trust violations, as participants were willing to forgo potential gains and sacrifice the team bonus to express their dissatisfaction with the robot’s immoral actions.
The analysis of the post-survey questionnaire data also revealed an interesting pattern concerning the number of “N/A” responses in different experiment conditions. As anticipated based on previous research [11], we observed a higher number of “N/A” responses in the moral trust-related items of the questionnaire. This finding suggests that some participants hold the belief that the moral trust-related aspects of the MDMT-v2 questionnaire are not applicable to a robot or do not apply specifically to the robot used in this experiment.
It is worth noting that we did not initially expect to observe a significant difference in the number of “N/A” responses between the two experiment conditions, as both conditions allowed for the possibility of moral trust violation by the robot. However, the lower number of “N/A” responses in the moral trust violation condition, particularly in relation to the moral trust-related items, can be attributed to the fact that some individuals do not perceive robots as capable of exhibiting moral behavior. Therefore, unless the robot explicitly demonstrates immoral behavior, these individuals may not consider the concept of moral trust as applicable to robots.
The analysis of the end-of-the-round questions revealed distinct patterns in the gain and loss of performance trust and moral trust. According to our findings, a robot has the ability to gradually gain performance trust, with the level of performance trust updating with each interaction. Similarly, performance trust also exhibits a gradual decline following performance-related failures. These findings resonate with prior research on multi-trial tasks, where the perception of a robot’s performance by individuals is shown to be influenced not only by the robot’s performance in the most recent task but also by its cumulative performance over the entirety of the tasks [10]. Additionally, our results are in harmony with earlier studies that have explored how the frequency of robot failures affects human trust [15]. While performance trust experiences a significant drop after the first performance-related failure, it does not reach its lowest point at that stage. Instead, it continues to decline with each subsequent failure, indicating a downward trend in the end-of-the-round performance rating chart during rounds 4 to 7.
In contrast, the patterns observed for moral trust gain and loss differ. Moral trust reaches its peak level during the initial interaction with the robot but quickly plummets to its minimum limit following the first moral-related failure. Notably, the starting point for performance rating is approximately 5.2 (the average of the two conditions), whereas the starting point for morality rating is notably higher at 6.8. After the robot’s first performance violation, the performance rating drops to 2.6 and gradually decreases after that. However, the morality rating drops to 1.5 following the initial moral violation and remains relatively stable thereafter.
Based on the MDMT-v2 questionnaire, the trust dimensions are categorized into performance trust-related dimensions (Reliable and Competent) and moral trust-related dimensions (Ethical, Transparent, and Benevolent). Given its consistent performance throughout the game rounds, the mean-bot was expected to score higher in the Reliable and Competent dimensions than the dud-bot. However, the mean-bot actually received a higher score in the Competent dimension but a lower score in the Reliable dimension. This finding suggests that the Reliable trust dimension, as defined in the MDMT-v2 questionnaire, may not rely solely on the robot’s performance but could be influenced more by the robot’s morality or ethical behavior.
The finding that all participants, regardless of the experiment condition, initially chose to add to the team score and trust the robot in the first round is indeed interesting. The responses provided by participants in the end-of-the-round questionnaire shed light on the reasons behind this behavior. Some participants mentioned choosing the team score due to the higher team bonus and their desire to obtain it. Others stated that they followed the robot’s instructions to add to the team score. Furthermore, some participants explicitly mentioned that they trusted the robot and, as a result, decided to add to the team score. This observation suggests that participants displayed a positive bias towards the robot and were inclined to trust it, even in the absence of any evidence regarding its performance or morality. This initial positive bias towards robots aligns with previous research indicating that humans tend to initially trust robots and rely on their guidance or instructions. Understanding these initial biases is crucial for designing effective human-robot interactions and developing trust between humans and robots.
The TTR in this experiment is the time taken to decide whether or not to cooperate with the robot. As the results showed, a decrease in trust increases decision time: in the moral trust violation condition, where trust decreased significantly, there was a noticeable increase in decision time, indicating that participants took longer to decide whether to cooperate with the robot. Moreover, a substantial number of participants chose not to cooperate with the robot in this condition. Based on these findings, it can be inferred that TTR serves as a reliable measure for assessing people’s trust in robots, particularly in tasks where individuals have the option to accept or reject cooperation with the robot. The TTR measure provides valuable insights into the dynamics of trust in human-robot interactions and can be explored further in future studies to gain a deeper understanding of trust and its impact on decision-making processes.
Indeed, our study made significant achievements in several areas. First, the design of the game allowed for a clear differentiation between moral and performance trust violations, enabling a more nuanced understanding of their effects on human trust. This distinction is crucial in exploring the specific impact of these trust violations on individuals’ perceptions and behaviors. Second, the results highlighted that when a robot prioritizes its own interests over the team’s benefits, it has a more substantial impact on people’s trust compared to instances of poor performance by the robot. This finding emphasizes the importance of moral trust in human-robot interactions and suggests that violations of moral trust have more profound consequences on trust dynamics. Furthermore, this study demonstrated the applicability of the MDMT-v2 questionnaire in assessing trust in robots, specifically regarding performance and moral trust dimensions. This indicates that the questionnaire can effectively capture the nuances of trust perceptions and help researchers gain insights into how these dimensions relate to human-robot interactions. Overall, this study has contributed to a deeper understanding of trust in human-robot interactions by distinguishing between moral and performance trust violations, highlighting the differential impact of these violations, and validating the use of the MDMT-v2 questionnaire in the context of trust assessment for robots. These achievements pave the way for further exploration and research in this important area.
5.1 Limitations
As mentioned in the Experiment Design section, we included both moral trust violation and performance trust violation options for the robot in the designed game. The moral trust violation option led to a more significant decrease in the overall trust score. However, it is challenging to determine the exact extent of trust loss caused by the different types of trust violations. This difficulty arises from the unequal representation of trust dimensions in the MDMT-v2 questionnaire, which defines two dimensions under performance trust (8 items) and three dimensions under moral trust (12 items). Furthermore, our results indicated that the Reliable trust dimension, classified as a performance trust dimension, was influenced more by moral trust than by performance trust. This imbalance in the number of items representing each trust aspect may have introduced some bias into our results, which could not be entirely prevented.
Another limitation of this experiment is the number of game rounds that participants played. In this study, participants engaged in seven game rounds, with the robot exhibiting either poor performance or poor morality from round four onwards, resulting in zero team scores in rounds four to seven. Despite the robot’s consecutive undesirable behaviors, many participants continued to contribute to the team score. When asked about their reasoning in the third end-of-the-round question, some participants mentioned that they opted to add to the team score because it was too late to start earning an individual bonus. This introduced a bias in the trust decision measure used in this experiment, as many individuals felt compelled to add to the team score due to the lack of alternative options at that stage of the experiment. However, if we had conducted 10 to 12 rounds of the game, then participants may have exhibited different behaviors regarding their trust decisions.
Due to the complexities involved in designing a game that incorporates both performance and moral trust violation options, the resulting game logic was not easy to grasp. To address this, we developed extensive tutorials in both video and interactive formats to ensure that participants fully understood the mechanics of the game and how to maximize their bonus. However, to mitigate prolonged participation time and participant boredom, particularly with online participants, we had to limit the number of game rounds. As a result, our experiment included only seven game rounds.
6 Conclusion and Future Work
Prior to conducting this experiment, we expected participants to trust a robot that violates moral trust less than a robot that violates performance trust. The results of the experiment align with this expectation. It is reasonable to assume that a robot lacking moral integrity would be considered less trustworthy than one lacking in performance, a notion supported by previous studies examining the effects of moral and performance trust violations by both humans and automated systems. However, we did not anticipate that participants would exhibit retaliatory behaviors toward the robot that violated moral trust while simultaneously sympathizing with the robot that violated performance trust. Furthermore, it was unexpected that participants would express regret for trusting the robot that violated moral trust after just one unfavorable interaction, while continuing to support the poor-performing robot even after multiple unfavorable interactions.
The significant loss of trust observed when robots violate moral trust in the task provides preliminary evidence that interactions with social robots should be designed in a way that people do not perceive any undesirable behavior by the robot as a violation of morality. This highlights the importance of making the robot’s intentions clear and ensuring there is no ambiguity in its actions during interactions with humans. In future research, it would be valuable to re-examine the topic explored in this experiment using a different experiment procedure; doing so would allow a more conclusive understanding of the effects of moral and performance trust violations by a robot on human trust. An intriguing avenue for future research involves exploring additional trust measures that can distinguish between the gain/loss of moral and performance trust in HRI. Examining alternative measures, such as physiological measurements, which have been increasingly utilized by researchers to measure and model trust in HRI [1, 21, 23, 24, 43], may offer more profound insights into this subject. In recent years, researchers have also employed other types of measurements for assessing trust in HRI. For instance, Khalid et al. [26] incorporated facial expressions and voice features as additional measures in their trust measurement experiment. Chen et al. [8] utilized machine learning techniques to explain human behavior based on robot actions. Exploring these measurement strategies could shed light on whether they have the potential to assess the gain/loss of moral and performance trust separately.
Another valuable direction for future research is to investigate the trust repair process for each type of trust violation. It is crucial to address the question of whether people are inclined to trust robots again after a violation of moral trust. Additionally, studying the impact of the various trust repair strategies proposed by studies focusing on trust repair after robot failures [32, 37, 52, 58] would be beneficial in the context of different types of robot failures. A key aspect to explore is whether similar trust repair strategies yield similar effects on human trust following different types of robot failures. In essence, it is essential to investigate whether different trust repair strategies are more effective in restoring trust after different types of trust violations by a robot.
In recent decades, the study of human-robot trust, particularly in the context of social robots, has gained significant importance due to the increasing use of robots in various real-world tasks. It is crucial to conduct more precise investigations into the factors that influence trust in social robots, as trust in social robots may differ from trust in robots not involved in social interactions. Moreover, it is important to recognize that individuals may exhibit under-trust towards social robots, regardless of their performance, which can lead to disuse or unexpected reactions towards these robots. Therefore, it is imperative to include the study of robots violating moral trust, which is particularly relevant for social applications where trust plays a significant role, as a vital research topic within the field of human-robot trust.
This research represents a step forward in comparing the effects of moral and performance trust violations by robots and examining their immediate and lasting impacts on human trust. In the future, we will delve deeper into understanding human retaliation strategies in response to different types of trust violations and explore trust repair strategies for various types of trust violations.