License: CC BY-SA 4.0
arXiv:2312.02034v1 [cs.HC] 04 Dec 2023
1 CITEC - Cognitive Interaction Technology
Bielefeld University, 33619 Bielefeld, Germany
Email: {rvisser, bhammer}@techfak.uni-bielefeld.de
2 Psychology
Paderborn University, 33098 Paderborn, Germany
Email: {tobias.peters, ingrid.scharlau}@uni-paderborn.de

Trust, distrust, and appropriate reliance in (X)AI: a survey of empirical evaluation of user trust

We gratefully acknowledge funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) for grant TRR 318/1 2021 – 438445824. Parts of this article have been accepted for publication in the Springer CCIS series of the 1st xAI world conference (Lisbon, Portugal; 2023). A preprint version prior to peer review is available at https://arxiv.longhoe.net/abs/2307.13601.

Roel Visser† (1, ORCID 0009-0006-3067-5545), Tobias M. Peters† (2, ORCID 0009-0008-5193-6243), Ingrid Scharlau (2, ORCID 0000-0003-2364-9489), Barbara Hammer (1, ORCID 0000-0002-0935-5591)
Abstract

A current concern in the field of Artificial Intelligence (AI) is to ensure the trustworthiness of AI systems. The development of explainability methods is one prominent way to address this, which has often resulted in the assumption that the use of explainability will lead to an increase in the trust of users and wider society. However, the dynamics between explainability and trust are not well established and empirical investigations of their relation remain mixed or inconclusive.

In this paper we provide a detailed description of the concepts of user trust and distrust in AI and their relation to appropriate reliance. For that we draw from the fields of machine learning, human-computer interaction, and the social sciences. Furthermore, we have created a survey of existing empirical studies that investigate the effects of AI systems and XAI methods on user (dis)trust. By clarifying the concepts and summarizing the empirical investigations, we aim to provide researchers who examine user trust in AI with an improved starting point for developing user studies to measure and evaluate the user’s attitude towards and reliance on AI systems.

Keywords: XAI, Psychology, Appropriate Trust, Distrust, Reliance, Trustworthy AI, Human-centric evaluation

† These authors contributed equally to this work and share first authorship.

1 Introduction

Intelligent systems and decision making supported by Artificial Intelligence (AI) are becoming ever more present and relevant within our everyday lives. Especially their use in high-stakes applications like medical diagnosis, credit scoring, and parole and bail decisions has led to concerns about the AI models (Rudin, 2019). This includes concerns about the AI’s transparency, interpretability, accountability, and fairness (Guidotti et al., 2019; Arrieta et al., 2020; Mohseni et al., 2021).

These concerns, which are further reinforced by the EU’s General Data Protection Regulation (GDPR, Art. 15), increased the research interest in making AI systems more trustworthy and reliable. A number of different guidelines have been set out to ensure the trustworthiness of AI (for an overview see Thiebes et al., 2021), which should help increase users’ or stakeholders’ trust. One prominent way to approach this is for modern (blackbox) AI methods to be able to explain their outputs (Arrieta et al., 2020). This led to a surge in the development of explainable AI (XAI) over the last years for a host of different applications, domains, and data types (Guidotti et al., 2019; Arrieta et al., 2020; Samek et al., 2021). As a result, explainability is often considered a means to increase user trust (Kastner et al., 2021). Problematically, the dynamics between explainability and trust are far from being clarified, with both terms lacking precise definitions (Ferrario & Loi, 2022) and results from empirical investigations of their relation remaining mixed and inconclusive (Kastner et al., 2021).

In the following, we approach these problems by, first, summarizing the insights from psychological trust research that are already being employed in some of the work related to automation, AI, and human-computer interaction and clarifying the involved terminology, which is partially based on the work covered in Peters and Visser (2023). From these insights we conclude that both trust and distrust may be of similar relevance for the interaction between explainability and appropriate reliance.

Secondly, we give an overview of recent work that studies the question of trust in the AI and XAI context broadly, as well as works that perform empirical studies evaluating the effects that (X)AI methods have on user trust. We provide an extensive summary of the typical application domains, (X)AI methods, and outcome measurements to evaluate the targeted AI objectives. With these contributions we want to support future research on user trust and distrust in the (X)AI context by identifying important considerations for studying trust in AI, giving recommendations for setting up empirical studies, and identifying gaps in the current research.

2 Trust in AI

Trust in AI can be defined as an attitude that a stakeholder has towards a system (Kastner et al., 2021), while trustworthiness is a property of the system that justifies trusting the system (Toreini et al., 2020). When looking at literature related to (X)AI, it is important to make this distinction between trust and trustworthiness. In some cases trust and trustworthiness are not clearly differentiated, or are even used interchangeably. For example, Arrieta et al. (2020) describe trustworthiness as the confidence that a model will act as intended when facing a given problem, which is a fitting description of trust. The differentiation is critical, because apart from the system’s trustworthiness there are other factors that can influence trust (Toreini et al., 2020).

According to research on trust between humans (so-called trustor and trustee), trustworthiness is characterized by the trustee’s ability, i.e. competence or expertise in the relevant context, the trustee’s benevolence towards the trustor, and the trustee’s integrity towards principles that the trustor finds acceptable (R. C. Mayer et al., 1995a). A high level of these three factors of trustworthiness does not necessarily lead to trust, and trust can also go along with little trustworthiness (R. C. Mayer et al., 1995a).

For the AI context, a meta-analysis (Kaplan et al., 2023) identified the expertise and personality traits of a trustor interacting with AI as significant predictors for trust in AI. Other relevant influences on trust, e.g., cultural differences, the type of technology, or perceived risk, were identified by previous research and are discussed in Section 7.1. Overall, trustworthiness influences trust but does not fully determine it. Even the most trustworthy model will not be trusted in every case by every person. Vice versa, people may, and often do, trust an untrustworthy model (Chen et al., 2023).

In complex interactions in which multiple outcomes with varying truth are possible, trust is essential for humans (Luhmann, 2009). By trusting, a person engages in the interaction as if only certain interpretations are possible (e.g., taking things at face value) and thus renders the interaction less complex (Luhmann, 2009). In analogy, trust is also important in human-AI interactions, because of the involved risk caused by the complexity and non-determinism of AI (Glikson & Woolley, 2020). Similarly, Hoff and Bashir (2015) argue that trust is not only important to interpersonal relations but can also be defining for the way people interact with technology. In other words, you can trust an AI to be correct in its recommendation or prediction, even though the AI could err and you may not be able to comprehend or retrace the way the AI came to its outputs.

The Integrative Model of Organizational Trust by R. C. Mayer et al. (1995a) is a prominent basis for trust in AI and automation (Stanton & Jensen, 2021). R. C. Mayer et al. (1995a) define trust as “[…] the willingness of a party to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor, irrespective of the ability to monitor or control that other party” (R. C. Mayer et al., 1995a, p. 712). Based on this definition they differentiate between factors that contribute to trust, trust itself, the role of risk, and the outcomes of trust. The authors highlight that for trust to be of concern, the situation of an interaction must involve some form of risk and vulnerability for the person who trusts. If there is no vulnerability involved, cooperation can occur without trust, and if there is no risk present, it is a situation of confidence, not of trust (R. C. Mayer et al., 1995a).

Drawing from R. C. Mayer et al. (1995a), definitions of trust in automation also consider the necessity of uncertainty (i.e., risk) (Hoff & Bashir, 2015; Lee & See, 2004) and vulnerability (Lee & See, 2004; Kohn et al., 2021). Trust in automated systems “plays a leading role in determining the willingness of humans to rely on automated systems in situations characterized by uncertainty” (Hoff & Bashir, 2015, p. 407), and is defined as “[…] the attitude that an agent will help achieve an individual’s goals in a situation characterized by uncertainty and vulnerability” (Lee & See, 2004, p. 54). Trust in a system and trust as defined by R. C. Mayer et al. (1995a) share two features: both influence the willingness to rely, and both require a situation of risk and vulnerability to be of importance. Muir and Moray (1996) observed that people used automated systems they trust but not those they do not trust. Lee and Moray (1994) report that operators did not use automation systems if their trust in them was less than their own self-confidence.

Trust is dynamic and can increase or decrease during interactions (Y. Guo & Yang, 2021; Glikson & Woolley, 2020; R. R. Hoffman et al., 2009). Furthermore, trust is not only influenced by the interactions and their trajectories but also by other factors outside the interaction (Hoff & Bashir, 2015). Hoff and Bashir (2015) identified dispositional, situational and initially learned factors of trust that lie outside the interaction. Such a distinction is especially relevant for the XAI context, as an explanation can be viewed as working on a micro-level (during an interaction) and including a macro-level (context of the interaction) (Rohlfing et al., 2021).

To conclude, different perspectives on trust should be distinguished. One could be interested in trust in a certain decision, a certain type of interaction, a certain type of (X)AI, or in trust in AI in general. Thus, different time points to assess indicators of trust are conceivable. This entails repeated measurements of trust during multiple interactions and measurements before and after the interactions, while it can also be sensible to assess the attitude towards AI in general. These distinctions are visible in the different types of assessment that are employed for measuring trust, which will be discussed in Section 7.3.

3 Appropriate reliance & appropriate trust

Reliance in the AI context can be understood as a human decision or action that takes into consideration the decision or recommendation of an AI. Trust is an attitude that benefits the decision to rely, as it has a critical role for human reliance on automation (Hoff & Bashir, 2015). Ideally, one would only rely on an (automated) system if it is correct. Two types of problems can occur here, namely overtrust and disuse. Disuse describes the situation in which it would be correct to rely but one does not, and overtrust describes the situation in which it would be wrong to rely but one still does. Preventing both disuse and overtrust would ensure appropriate reliance.
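The distinction between appropriate reliance, overtrust, and disuse can be illustrated with a minimal Python sketch. It assumes that each interaction is recorded as two booleans (whether the AI was correct and whether the user relied on it). The names `RelianceOutcome`, `classify_reliance`, and `outcome_rates`, as well as the label "appropriate non-reliance" for correctly disregarding a wrong recommendation, are hypothetical and chosen for illustration only; they are not taken from the cited literature.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Sequence, Tuple


class RelianceOutcome(Enum):
    # Labels are illustrative; only "overtrust" and "disuse" are terms from the text.
    APPROPRIATE_RELIANCE = "appropriate reliance"          # AI correct, user relied
    APPROPRIATE_NON_RELIANCE = "appropriate non-reliance"  # AI wrong, user did not rely
    OVERTRUST = "overtrust"                                # AI wrong, user relied anyway
    DISUSE = "disuse"                                      # AI correct, user did not rely


def classify_reliance(ai_correct: bool, user_relied: bool) -> RelianceOutcome:
    """Map one interaction to one of the four reliance outcomes."""
    if user_relied:
        return (RelianceOutcome.APPROPRIATE_RELIANCE if ai_correct
                else RelianceOutcome.OVERTRUST)
    return (RelianceOutcome.DISUSE if ai_correct
            else RelianceOutcome.APPROPRIATE_NON_RELIANCE)


def outcome_rates(trials: Sequence[Tuple[bool, bool]]) -> Dict[RelianceOutcome, float]:
    """Return the share of each outcome over (ai_correct, user_relied) pairs."""
    if not trials:
        raise ValueError("at least one trial is required")
    counts = Counter(classify_reliance(correct, relied) for correct, relied in trials)
    return {outcome: counts[outcome] / len(trials) for outcome in RelianceOutcome}


if __name__ == "__main__":
    # Hypothetical interaction log, not data from any study.
    log = [(True, True), (True, True), (False, True), (True, False)]
    for outcome, rate in outcome_rates(log).items():
        print(f"{outcome.value}: {rate:.2f}")
```

Under this encoding, appropriate reliance in the sense of the paragraph above corresponds to both the overtrust rate and the disuse rate being zero.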

The problems of disuse and overtrust are well discussed in the field of trust in automation. In XAI literature they are sometimes explicitly employed as well (Jacovi et al., 2021; Mohseni et al., 2021). More often they are only implicitly targeted when aiming for and discussing appropriate trust (see Section 3.2), which is why in the following we first draw the connection to previous, more well-founded work on trust in automation.

In the field of trust in automation the prevention of disuse and overtrust has been targeted by ensuring appropriate trust or calibrated trust. McBride and Morgan (2010) as well as McGuirl and Sarter (2006) define appropriate trust as the alignment between perceived and actual performance of an automated system. This relates to a user’s ability to recognize when the system is correct or incorrect and adjust their reliance on it accordingly. Han and Schulz (2020) describe calibrated trust as the alignment between actual trustworthiness and user trust. Within their model for trust calibration in human-robot teams, de Visser et al. (2020) define calibrated trust as given when a team member’s perception of trustworthiness of another team member matches the actual trustworthiness of that team member. If this is not given, either ‘undertrust’, which leads to disuse, or ‘overtrust’ can occur (de Visser et al., 2020; Parasuraman & Riley, 1997).
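As a rough illustration of calibration as the alignment between perceived and actual trustworthiness, the following Python sketch compares two scores. It assumes, purely for illustration, that both quantities can be expressed on a common 0 to 1 scale and that a fixed tolerance separates a calibrated state from undertrust and overtrust. Neither the scale, the tolerance value, nor the function names `calibration_gap` and `calibration_state` come from the cited works.

```python
def calibration_gap(perceived: float, actual: float) -> float:
    """Signed difference between perceived and actual trustworthiness.

    Both inputs are assumed to lie on a shared 0..1 scale (an assumption of
    this sketch). A positive gap means the system is perceived as more
    trustworthy than it is; a negative gap means the opposite.
    """
    for name, value in (("perceived", perceived), ("actual", actual)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {value}")
    return perceived - actual


def calibration_state(perceived: float, actual: float, tolerance: float = 0.1) -> str:
    """Label the gap as 'overtrust', 'undertrust', or 'calibrated'.

    The tolerance is a hypothetical threshold; how close perception and
    actual trustworthiness must be to count as calibrated is left open here.
    """
    gap = calibration_gap(perceived, actual)
    if gap > tolerance:
        return "overtrust"
    if gap < -tolerance:
        return "undertrust"
    return "calibrated"


if __name__ == "__main__":
    # Hypothetical values, not measurements from any study.
    print(calibration_state(perceived=0.9, actual=0.6))
    print(calibration_state(perceived=0.4, actual=0.8))
```

A single scalar gap is a strong simplification: it ignores that trust is dynamic and context-dependent, so the sketch only mirrors the definitional idea of matching perception to actual trustworthiness.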

Trust calibration in de Visser et al.’s sense aims to assure a healthy level of trust and to avoid unhealthy trust relationships. To this end, they establish a process of trust calibration which accompanies collaboration by establishing and continuously re-calibrating trust between the team members. To prevent people from overtrusting, so-called trust dampening methods are to be applied. According to the authors, these methods are especially worthwhile in interactions with machines and robots, as humans have a tendency to expect too much from automation (de Visser et al., 2020). The authors recommend presenting humans with exemplary failures, performance history, or likelihood alarms, or providing information about the system’s limitations. Moreover, they make the connection to the expanding field of XAI, arguing that explanation activities can help with calibrating trust. This idea is present in much work on XAI, but with an emphasis on preventing disuse and a neglect of the mitigation of overtrust.

3.1 Explainability and trust

A multitude of XAI studies implicitly or explicitly assume explainability to facilitate, or increase, trust (Kastner et al., 2021; Ferrario & Loi, 2022). In their summary of current XAI studies concerned with user trust, Kastner et al. (2021) call this the explainability-trust hypothesis. By connecting explainability with the facilitation of trust, authors focus on one of two utilities of explanations, namely the explanation’s utility to indicate correct predictions. Yet, explanations can also indicate false predictions and, thus, sometimes another utility of explanations is also identified, e.g., not trusting predictions (Ribeiro et al., 2016), critical reflection (Ehsan & Riedl, 2020), or enabling distrust (Jacovi et al., 2021). This second utility of explanations is entertained by Kastner et al. (2021) when discussing potential reasons for the mixed results of empirical investigations of the explainability-trust hypothesis. They state that explanations could actually reveal problems of the system that may have otherwise gone unnoticed and could lead a user not to trust the AI.

The utility of an explanation to reveal problems of an AI is also targeted in the paper on the LIME algorithm, one of the first popular XAI methods which has been used for deep models (Ribeiro et al., 2016). According to Ribeiro et al. (2016), explanations are not only helpful for deciding when one should trust a prediction, but also beneficial in identifying when not to trust one. Thereby, they differentiate between the explanation’s utility for trusting and not trusting, demonstrating the latter in an example where an explanation reveals a wrong causal relation underlying an automated decision. Yet, when generally discussing the benefit of explanations, Ribeiro et al. (2016) argue that “[…] explaining predictions is an important aspect in getting humans to trust and use machine learning effectively, if the explanations are faithful and intelligible” (pp. 1135-1136). The aim of getting humans to trust sets the focus on the utility of explanations to identify correct predictions, while the utility of explanations to identify wrong predictions receives too little attention.

3.2 Desideratum of appropriate trust

Even though the two utilities of explanations are identified in the literature, when speaking in broad terms explainability is connected to a facilitation of trust, and thus, the mitigation of disuse. More careful formulations can also be observed, which target appropriate trust instead (Kastner et al., 2021). The added appropriateness acknowledges that it would not be correct to trust in every case, but it lacks clarity on what this entails. Implied by the appropriateness of trust is that neither blind trust leading to overtrust, nor blind distrust leading to disuse is wanted.

In other words, current improvements in automated systems, like XAI methods, are regarded as beneficial for appropriate reliance by preventing disuse and overtrust. Ideally, appropriate reliance should be achieved by fostering appropriate trust. Several formulations of this underlying notion of appropriate trust can be observed across the literature, which often entail trust and terms that can be summarized under distrust (see Fig. 1).

Figure 1: Desideratum of appropriate trust in AI and the relation of trust and distrust to appropriate reliance with the goal of preventing both disuse and overtrust.
Footnotes 3-12: Hoffman et al. (2018), Ribeiro et al. (2016), Bansal et al. (2021), de Visser et al. (2020), Jacovi et al. (2021), Gunning and Aha (2019), Gaube et al. (2021), Ehsan and Riedl (2020), Jiang et al. (2022)

So, beneath the desideratum of increased appropriate trust lies the desideratum of increased appropriate reliance. Yet, how can such an appropriate trust, its influencing factors, and the relation to appropriate reliance be conceptualized? To rely appropriately one would consider correct decisions or recommendations of an AI and would disregard false ones. Trust does not lead to this, because trust is not only influenced by the correctness, i.e. the performance, of an AI. According to Hoff and Bashir (2015), the performance of an automated system is similar to the trustee’s ability in interpersonal trust, and the process and purpose of an automated system are analogous to benevolence and integrity. On top of that, and as mentioned before, trust is not fully determined by trustworthiness.

R. C. Mayer et al.’s influential work on trust demonstrates the difference between trust and trustworthiness, but for the mitigation of overtrust their model does not provide a basis to proceed. Fostering trust, i.e., increasing the willingness to rely, mitigates the problem of disuse. However, mitigating overtrust requires not an absence of the willingness to rely, but the ability to identify reasons not to rely. Trust in R. C. Mayer et al.’s model does not entail this, as they define trust “[…] irrespective of the ability to monitor or control that other party” (p. 712). While it may be true that a lower willingness to rely, i.e. lower trust, would decrease the likelihood of overtrust, it would also lead to less reliance overall. To mitigate overtrust, reliance should be prevented if, and only if, it would be wrong to rely.

More recent trust research highlights a distinction that might be of interest here. Several researchers provided evidence that trust and distrust are two related, yet separate dimensions (Lewicki et al., 1998; Vaske, 2016; Harrison McKnight & Chervany, 2001; Benamati et al., 2006). A recent preprint also argues in favor of considering trust and distrust in the XAI context (Scharowski & Perrig, 2023). We assume that this separation of trust and distrust might help solve the conceptual issue and propose that we need an understanding of both trust and distrust. Psychological insights on the potential benefit of considering distrust will be detailed in the next section.

4 Distrust as a separate dimension

While distrust is often regarded as the opposite of trust, the concept of a one-dimensional view of trust and distrust is being questioned and not widely accepted (Schweer et al., 2009; Lewicki et al., 1998; S.-L. Guo et al., 2017; Schoorman et al., 2007). In the two-dimensional approach, by definition, low trust is not the same as high distrust, and low distrust is not equal to high trust (Lewicki et al., 1998). This allows the coexistence of trust and distrust. Among others, trust is characterized by hope, faith, or assurance, and distrust by scepticism, fear, suspicion or vigilance (Benamati et al., 2006; Cho, 2006; Lewicki et al., 1998).

Lewicki et al. (1998) exemplify the separation of trust and distrust by contrasting low trust with high distrust. The authors regard expectations of beneficial actions being absent or present as antecedent to trust, and expectations of harmful actions being absent or present as antecedent to distrust. If the former is absent, low trust is expressed by a lack of hope, faith, and confidence. If the latter is present, high distrust is expressed by vigilance, scepticism, and wariness. The combination of high trust and high distrust is described by the authors as a relationship in which opportunities are pursued while risks and vulnerabilities are monitored.

When reviewing research that draws from two-dimensional approaches, concepts and terms like “critical trust”, “trust but verify”, and “healthy distrust” are used (Poortinga & Pidgeon, 2003; Lewicki et al., 1998; Vaske, 2016). These align well with the problem of mitigating overtrust. Yet, little consideration of the two-dimensional view on trust and distrust can be found when trust is considered in the technology context.

One, at least partial, reason for this is found in the field of organisational psychology. Even in this field, in which the conceptual critique towards the one-dimensional approach is the most visible, applied work still relies mostly on the model by R. C. Mayer et al. (Vaske, 2016). Discussing the trajectory of the conceptual debate on trust and distrust, Vaske (2016) describes that most of the early work on trust took one-dimensional approaches. From the mid-1980s onward these approaches were considered too simplistic.

Yet, efforts to resolve this debate and empirically test it remain scarce (Rusk, 2018). Instead of providing empirical evidence, work on the two-dimensional approach mostly reproduces common-sense assumptions (Vaske, 2016). Moreover, only the concept of trust is well researched and has a good theoretical background, while distrust remains in the state of conceptual debate and receives little research attention (Vaske, 2016).

The field of trust in XAI inherited this focus on trust and neglect of distrust, because it took prominent work on trust in automation (Lee & See, 2004; Hoff & Bashir, 2015) as a starting point (Thiebes et al., 2021), which drew from the model of R. C. Mayer et al. (1995a). Distrust often carries negative connotations (Lewicki et al., 1998; Vaske, 2016) and is sometimes explicitly considered something to be avoided (Frison et al., 2019; Seckler et al., 2015), or at least implied to be avoided. Yet considering the imperfection of contemporary ML models, distrust towards erroneous predictions and towards explanations that indicate them is not to be avoided, but fostered. Otherwise, a neglect of distrust remains, which is serious because it renders potential positive consequences of distrust invisible.

In a study by McKnight et al. (2004) the disposition to distrust predicted high-risk perceptions better than the disposition to trust did. For their study context of online expert advice sites they suggest that future research should study dispositional trust and also dispositional distrust. By identifying positive consequences of distrust, psychological studies also point to the benefit of considering distrust. Distrust or suspicion led, for example, to an increase of creativity (J. Mayer & Mussweiler, 2011) or a reduction of the correspondence bias (Fein, 1996). Moreover, a series of studies by Posten and Gino (2021) showed an increase of memory performance. Finally, Vaske (2016) identified a potential of distrust to improve critical reflection and innovation in the context of working in an organisational setting.

Looking at potential underlying mechanisms of distrust, Mayo’s (2015) review introduces a so-called distrust mindset as an explanation for the positive effects of distrust. The distrust mindset leads to an activation of incongruent and alternative associations, which aligns well with the increase of creativity, reflection, and innovation. According to Posten and Gino (2021), trust triggers a perception focus on similarities that makes it harder to remember single entities. Distrust shifts the perception focus towards differences and, therefore, increases memory performance. Interestingly, in one of their studies Posten and Gino (2021) observe a higher acceptance of misinformation in a trust condition, underlining the potential problem of the current trust focus in the (X)AI context and the danger of overtrust.

A conceptual example of how trust and distrust can be targeted is provided by Hoffman et al. (2018) in their work on measuring trust in XAI. They advocate that people experience a mixture of justified and unjustified trust, as well as justified and unjustified mistrust. Ideally, the user would develop the ability to trust the machine in certain tasks, goals, and problems, and also to appropriately distrust the machine in other tasks, goals, and problems. This requires them to be able to decide when to trust and when scepticism is warranted, so that distrust is correct.

In sum, although often connotated negatively, distrust also has positive consequences and merits separate from those of trust. Some work that considered both trust and distrust can be observed across different sub-fields of interaction with technology and will be summarised and discussed in Section 7.3.

5 Perceived versus actual trustworthiness

As we discussed, the trustworthiness of AI systems and the underlying models is a growing concern, leading to the need to develop trustworthy AI. When designing models to be trustworthy, some of the main concerns (from trustworthy AI) are model fairness, reliability and safety (Mohseni et al., 2021). Addressing each of these concerns poses challenges in its own right. To further complicate the matter of trustworthiness, users may perceive a system’s trustworthiness incorrectly and may, therefore, undertrust, or disuse, trustworthy systems and overtrust untrustworthy ones (Schlicker & Langer, 2021). To highlight this complication, Schlicker and Langer (2021) distinguish between actual and perceived trustworthiness and introduce the notion of cues, which the system can provide to users, to bridge the gap between the two.

In general terms, the actual trustworthiness of an AI system encompasses the system’s functionalities and capabilities, the underlying motivations and objectives of its developers, the conscientiousness of the development, design, and validation process, and the societal value ascribed to them (HLEG, 2019). The perceived trustworthiness, on the other hand, is based on the user’s assessment of the system’s actual trustworthiness. A system is perceived as trustworthy only with respect to its ability to perform a specific task within a specific context and at a specific point in time (Schlicker & Langer, 2021). The user bases their assessment on their cognitive and affective evaluation of the system, which can include the fairness or ability of the system or the purpose for which it was developed (Baer & Colquitt, 2018; Lee & See, 2004; Madsen & Gregor, 2000; Schlicker & Langer, 2021).

With the term cues, Schlicker and Langer (2021) group any observable piece of information that provides the user with insight into the actual trustworthiness of the system. The cues form an interface between the user’s perceived trustworthiness of the system and its actual trustworthiness, where the degree of agreement between these two reflects the accuracy of the user’s assessment of the system (Schlicker & Langer, 2021). The accuracy of the assessment of the actual trustworthiness depends on the ecological validity of the system cues and how the cues are applied by the user (Schlicker & Langer, 2021). It is not possible to observe the actual trustworthiness of an AI system directly; therefore, users utilize the different cues to form their perceived trustworthiness (Schlicker & Langer, 2021). For example, it is not possible to test and observe most models for all situations they might encounter. If the user is able to perfectly assess the actual trustworthiness, they neither over- nor underestimate the system’s capabilities (Schlicker & Langer, 2021). When interacting with an AI system, users are constantly searching for, using, and interpreting cues to assess its trustworthiness, which results in perceived trustworthiness (Schlicker & Langer, 2021). As a result, users must continuously evaluate the system’s actual trustworthiness to determine their own perceived trustworthiness of the system (Schlicker & Langer, 2021).

Chen et al. (2023) highlight the risk of overreliance on untrustworthy AI. They show a number of studies that suggest that explanations can increase users’ reliance on AI predictions even if the AI system’s predictions are wrong. This highlights the difficulties the user has in correctly calibrating between perceived trustworthiness, actual trustworthiness, and reliance on the system. This is even more of a problem as the current understanding of user trust is limited. A common impulse, for example, is to nudge people into trusting the AI system (i.e. to find an explanation plausible and to accept it) rather than to distrust it (Ehsan & Riedl, 2020). The importance of accurate assessment of actual trustworthiness and appropriate reliance is highlighted by the risk of misleading users with XAI. Several studies investigate or observe the misleading effects that explanations can have (Eiband et al., 2019; Lakkaraju & Bastani, 2020; Dimanov et al., 2020; Suresh et al., 2020). This makes an exclusive focus on trust, rather than healthy scepticism and distrust, potentially dangerous, as a mismatch between actual and perceived trustworthiness can result in overtrust in the system.

6 Human-centric evaluation of XAI

As we described, the development and application of explainability methods is one way to make AI systems more transparent and trustworthy. The application of XAI can serve a number of purposes, namely it can be used to justify, to control and manage, to evaluate and improve, to discover and learn, or to calibrate trust Adadi \BBA Berrada (\APACyear2018); Meske \BOthers. (\APACyear2020); van der Waa \BOthers. (\APACyear2021); Tintarev \BBA Masthoff (\APACyear2012). XAI systems allow users to assess how reliable the system is and, therefore, to correctly calibrate their perception of the system’s accuracy Mohseni \BOthers. (\APACyear2021) or, in other words, to calibrate their perceived trustworthiness of the system. Schmidt \BBA Biessmann (\APACyear2019) note that providing explanations to users more than doubled their productivity in annotation tasks, which indicates that explanations not only serve to calibrate reliance but can also benefit other outcomes, such as productivity.

When developing explainability methods two main criteria need to be met. Firstly, we need to determine whether the developed method is mathematically and computationally sound and whether it correctly represents the underlying model (i.e. does not come up with random or meaningless explanations that are not in line with the model). Secondly, we need to consider whether the method has the originally intended effect on the user (e.g. providing them with meaningful insight into why the model came up with some output).

As explainability is an inherently human-centric property Lopes \BOthers. (\APACyear2022), Mohseni \BOthers. (\APACyear2021) and Liao \BBA Varshney (\APACyear2021) believe that HCI and human-centric evaluation can contribute to addressing the limitations of XAI algorithms and applications. Lopes \BOthers. (\APACyear2022) point out that most XAI methods are developed in a more technically and computationally focused environment and, as a result, potentially useful contributions and insights from other fields, such as HCI, are often ignored. They note that the lack of a multidisciplinary approach is a potential pitfall for both the development and evaluation of XAI methods. In their survey, Adadi \BBA Berrada (\APACyear2018) found that of the works related to the development of interpretable ML only 5% considered the evaluation of these methods and the quantification of their relevance.

Another problem is that while many measurements exist for the computational evaluation of XAI (e.g. completeness, soundness, and fidelity of explanations) Zhou \BOthers. (\APACyear2021); Nauta \BOthers. (\APACyear2022); Mohseni \BOthers. (\APACyear2021); Lopes \BOthers. (\APACyear2022), there is a clear lack of well-defined and validated measurements for the user-centric evaluation of XAI R\BPBIR. Hoffman \BOthers. (\APACyear2018). Given these facts, there is a need for properly defined metrics in order to be able to meaningfully compare how explainable a model is Arrieta \BOthers. (\APACyear2020).

Common human-centric evaluation criteria that are used in the context of XAI are related to understanding, the user’s mental model, trust and reliance, satisfaction, usefulness and usability, performance, and fairness Mueller \BOthers. (\APACyear2021); Mohseni \BOthers. (\APACyear2021); Lai \BOthers. (\APACyear2021). Vilone \BBA Longo (\APACyear2020) distinguish two types of human-centric evaluation studies: qualitative and quantitative. Qualitative studies make use of open-ended questions aimed at obtaining deeper insights, while quantitative studies make use of close-ended questions that can be analysed in a statistical manner. Similarly, Zhou \BOthers. (\APACyear2021) distinguish between subjective and objective metrics for human-centric evaluation of XAI.

The quality criteria of validity and reliability from the social sciences provide standards for scientifically sound user-centric evaluation of XAI van der Waa \BOthers. (\APACyear2021). The validity of a method refers to its ability to accurately measure what it sets out to measure Field \BOthers. (\APACyear2012), which may be harmed by poor design, ill-defined constructs, or arbitrarily selected measurements van der Waa \BOthers. (\APACyear2021). Reliability refers to the extent to which a method’s interpretation is consistent across different situations Field \BOthers. (\APACyear2012), which may be harmed by a lack of documentation, application in an unsuitable use case, or noisy measurements van der Waa \BOthers. (\APACyear2021). In order to be able to properly compare the results of different studies and experiments it is necessary that a user evaluation is both reliable and valid Joppe (\APACyear2000). van der Waa \BOthers. (\APACyear2021) state that this can be (partially) achieved by developing different types of measurements for common constructs, for example using both self-reported subjective and behavioral measurements to measure task performance or trust.

In the development and usage of XAI systems several potential user types and various other stakeholders exist. Stakeholders that do not directly interact with the AI system itself might still have an impact on the design of the system. Some of these stakeholders are AI regulators or individuals that may be affected by decisions made by or based upon AI Meske \BOthers. (\APACyear2020). Different users have different objectives and levels of expertise and might, therefore, be impacted differently by different types of applications Liao \BBA Varshney (\APACyear2021). This means that the choice of a particular XAI method should be guided by the needs of the specific type of user Lopes \BOthers. (\APACyear2022); Liao \BBA Varshney (\APACyear2021). Many ways already exist to group and identify different user types and stakeholders in the context of XAI Mohseni \BOthers. (\APACyear2021); Liao \BBA Varshney (\APACyear2021); Meske \BOthers. (\APACyear2020). Still, one of the issues in existing research is that diverse and dynamic user objectives are often not explicitly considered when developing XAI algorithms Liao \BBA Varshney (\APACyear2021). Algorithms are often developed based on the intuitions of the AI researchers about what constitutes a good explanation rather than on the needs of the intended users Liao \BBA Varshney (\APACyear2021). While the user studies in our survey focus primarily on direct users of AI systems, it is important to note that passive and indirect stakeholders should also be considered a type of user, something which is mostly ignored by current research on trust in AI Lukyanenko \BOthers. (\APACyear2022). These indirect stakeholders are affected by the outcomes of AI and it may, therefore, be important to also study their perceptions of and views on AI systems.

In our survey we focus on the stakeholders, or user groups, that directly interact with an AI system across the different stages of the ML life-cycle (from design and development to being used by actual end-users). This includes ML researchers, engineers and developers, domain experts, and end- or lay-users. We categorize the user types according to their objectives, in what stage of the ML development life-cycle they interact with the system (which informs their objectives), and their level of expertise. This led to the following grouping:

  • Interacting stakeholders

    • (X)AI Developers, Designers, ML experts

    • Domain experts

    • Lay-users

  • Non-user/Other stakeholders:

    • Regulators/regulatory bodies

    • Business owners or administrators

    • Impacted groups

7 Studying the effects of (X)AI on (user) trust

In empirical research the calibration of user trust and distrust has two primary objectives: achieving appropriate reliance and improving task performance. Appropriate reliance in this case should increase task performance by causing people to neither overtrust nor disuse an AI system’s outputs. When studying the effects that (X)AI systems have on their users and on user trust, a number of different components need to be considered. Relevant questions to consider are:

  • How does the user interface, or interact, with the system?

  • What factors can influence user trust?

  • Which information can be provided to the user to affect their trust in the system?

  • What ML and XAI methods can be used to provide such information?

  • How to measure user trust (and other important evaluation criteria)?

  • What constitutes a good outcome and what are the main objectives, e.g. aiming for maximizing user trust versus calibrating trust and distrust for fostering appropriate reliance?
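The notion of appropriate reliance introduced at the start of this section can be illustrated with a small sketch. It assumes one particular, simplified operationalisation that is not prescribed by the surveyed literature: each trial records whether the AI recommendation was correct and whether the user followed it. Following a wrong recommendation is counted as overreliance and rejecting a correct one as underreliance. The names `Trial` and `reliance_summary` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trial:
    ai_correct: bool     # assumption: ground truth is known to the experimenter
    user_followed: bool  # did the user adopt the AI recommendation?


def reliance_summary(trials: List[Trial]) -> Dict[str, float]:
    """Share of trials with appropriate reliance, overreliance, and underreliance."""
    if not trials:
        raise ValueError("at least one trial is required")
    n = len(trials)
    # Overreliance: the user followed a recommendation that was wrong.
    over = sum(1 for t in trials if not t.ai_correct and t.user_followed)
    # Underreliance (disuse): the user rejected a recommendation that was correct.
    under = sum(1 for t in trials if t.ai_correct and not t.user_followed)
    return {
        "appropriate": (n - over - under) / n,
        "overreliance": over / n,
        "underreliance": under / n,
    }


if __name__ == "__main__":
    # Illustrative, made-up trials; not data from any surveyed study.
    demo = [Trial(True, True), Trial(True, False), Trial(False, True), Trial(False, False)]
    print(reliance_summary(demo))
```

Under this simplification, maximizing trust would push `user_followed` towards always true and so raise overreliance whenever the system errs, which is why calibrating trust and distrust is listed above as a separate objective.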

In order to give a comprehensive overview of the evaluation of the effects of (X)AI methods on user trust, we have performed a survey of both theoretical and empirical research on this subject. For this we have developed a taxonomy of components and considerations that are important when studying the effects of (X)AI systems on user trust and provide an overview of existing work that has been done for each of these elements. Additionally, based on the survey of existing work we determine what the gaps and open questions are within the existing literature and provide a number of recommendations for each component. For the survey we collected the following information for each paper:

  • Application domain & type: the application domain(s) in which the experiments were performed (see Section 7.1.1)

  • Main aims: a summary of the main aims and objectives of the researchers.

  • (X)AI system (description of relevant features of the AI system used in the experiments):

    • ML methods: the machine learning methods that were used in the studies (see Section 7.2.1)

    • ML model modulation: if applicable, how the ML models were varied in order to observe the effects this had on users (e.g. variety of models with different overall accuracy/performance) (see Section 7.2.1)

    • (Model) descriptors: the descriptors used in the experiments (e.g. showing uncertainty estimate or example-based local explanation to the user) (see Sections 7.2 and 7.2.1 for more details)

    • Descriptors modulation: if applicable, how the descriptors were applied or varied in the experiments (e.g. showing versus not showing an uncertainty estimate or local output explanation) (see Section 7.2.1)

    • Interface modality & modulation: the modality of the interface between the system and the user (e.g. image-based outputs and explanations, textual explanations, or the researchers providing feedback and explanations) (see Section 7.2.2)

  • Experimental setup (description of relevant information about the experiments besides the XAI system used):

    • Methodology: a summary description of the methodology for the empirical studies and experimental setups

    • Targeted users: which types of users were considered (e.g. domain experts, lay-users, or users with varying levels of AI expertise) (see Section 7.1.2)

    • Evaluation criteria: evaluation criteria that were considered besides trust/reliance (e.g. understanding, user’s mental models, etc.) (see Section 7.3)

    • Trust measurement type: the specific trust measurements that were used (see Section 7.3)

    • Trust outcomes/effects: summary of the reported outcomes of the experiments with respect to the measurement and evaluation of the effects on user trust (see Section 7.3)

  • Conclusions: summary of the conclusions of the paper and/or description of the main outcomes

  • Limitations: summary of the main limitations observed by the authors
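The coding scheme listed above can be read as a nested record per surveyed paper. The following is a minimal sketch of such a record, assuming free-text and list-of-string fields; the class and field names are hypothetical and do not describe how the survey data was actually stored.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class XAISystem:
    ml_methods: List[str] = field(default_factory=list)
    ml_model_modulation: str = ""   # e.g. models with different accuracy
    descriptors: List[str] = field(default_factory=list)
    descriptor_modulation: str = "" # e.g. shown versus not shown
    interface_modality: str = ""    # e.g. image-based, textual


@dataclass
class ExperimentalSetup:
    methodology: str = ""
    targeted_users: List[str] = field(default_factory=list)
    evaluation_criteria: List[str] = field(default_factory=list)
    trust_measurement_types: List[str] = field(default_factory=list)
    trust_outcomes: str = ""


@dataclass
class SurveyEntry:
    paper: str
    application_domains: List[str]
    main_aims: str
    system: XAISystem
    setup: ExperimentalSetup
    conclusions: str = ""
    limitations: str = ""


if __name__ == "__main__":
    # Placeholder values for illustration only; not an entry from the survey.
    entry = SurveyEntry(
        paper="HypotheticalStudy",
        application_domains=["proxy task"],
        main_aims="Effect of uncertainty estimates on reliance",
        system=XAISystem(descriptors=["uncertainty estimate"],
                         descriptor_modulation="shown versus not shown",
                         interface_modality="textual"),
        setup=ExperimentalSetup(targeted_users=["lay-users"],
                                trust_measurement_types=["self-reported"]),
    )
    print(asdict(entry))
```

Keeping the (X)AI system and the experimental setup as separate sub-records mirrors the two grouped bullets in the list and makes it straightforward to filter entries, for instance by descriptor or by targeted user type.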

In the following sections we first describe the factors that can influence user trust. We connect these factors to the ML life-cycle and the (X)AI system that a user can interact with. Finally, we give a detailed description of the types of trust measurement methods and evaluation criteria that exist and are used in the surveyed papers.

7.1 Factors of trust

When designing human-centric evaluation of user trust in AI it is important to keep in mind which factors can influence trust. There are factors that are directly influenced within the human-AI interaction and other, potentially confounding, factors that are not influenced by the AI system and its design directly, but that may impact whether and to what degree a user (dis)trusts the system.

Many different models to classify and categorize the factors that influence trust (i.e. its antecedents) exist Toreini \BOthers. (\APACyear2020); Siau \BBA Wang (\APACyear2018); Hoff \BBA Bashir (\APACyear2015); Schaefer \BOthers. (\APACyear2016); Riegelsberger \BOthers. (\APACyear2005); Kee \BBA Knox (\APACyear1970); Kaplan \BOthers. (\APACyear2023); R. Yang \BBA Wibowo (\APACyear2022); R\BPBIC. Mayer \BOthers. (\APACyear1995\APACexlab\BCnt2); Dietz \BBA Hartog (\APACyear2006). The first of these models on trust Kee \BBA Knox (\APACyear1970); R\BPBIC. Mayer \BOthers. (\APACyear1995\APACexlab\BCnt2); Dietz \BBA Hartog (\APACyear2006) stem from social sciences, including organisational science and psychology, and are concerned with interpersonal trust. These earlier models have later been adapted to the context of human-technology, human-automation, and human-AI relations Toreini \BOthers. (\APACyear2020); Siau \BBA Wang (\APACyear2018); Hoff \BBA Bashir (\APACyear2015); Schaefer \BOthers. (\APACyear2016); Riegelsberger \BOthers. (\APACyear2005); Kaplan \BOthers. (\APACyear2023); R. Yang \BBA Wibowo (\APACyear2022).

\citeAYang2022 give a comprehensive survey of the different models and provide a conceptual framework detailing the components of trust in relation to AI. Similarly, \citeASiau2018 and \citeAToreini2020 provide a comparison and overview of different earlier models and their adaptation to the context of AI. Many of these works derive (partial) inspiration from the work by \citeAhoff2015trust in which the authors give a detailed account of the factors that influence trust in automation. \citeAToreini2020 establish a conceptualization that integrates the factors of trustworthiness (Ability, Benevolence, Integrity) with the human, environmental, and technological categorization of trust factors and connects these to the (Trustworthy AI) objectives important to regulators and policy makers, e.g. FAES (fairness, explainability, auditability, safety) and FATE (Fairness, Accountability, Transparency & Ethics). \citeAKoerber2018 provide a model of trust in automation with factors related to perceived trustworthiness based on the dimensions proposed in the \citeAMayer1995 and \citeAlee2004trust models, thereby placing the ABI notions of trustworthiness within the context of automation.

In this paper we use a categorization of the factors influencing trust similar to the one used by \citeAKaplan2023. These factors are related to the human, the AI, and the context Kaplan \BOthers. (\APACyear2023). When looking at the human-centric evaluation of the effect of (X)AI on user trust we want to focus on the factors that may influence the user, differentiating between the factors that the AI system designers and developers (e.g. researchers) can control and those that they have no direct control over (e.g. demographic or cultural aspects of the user, or societal views on AI in general). For the factors that have been studied there is a relatively robust picture of which individual antecedents are effective at predicting trust and which are not; however, there is a lack of empirical research for many of the other factors Kaplan \BOthers. (\APACyear2023). Additionally, the interactive effects between the antecedents of trust have been largely unexplored Kaplan \BOthers. (\APACyear2023).

7.1.1 Contextual factors of trust

As said, there are many factors outside of the XAI system itself that can influence user trust. These can include cultural, organisational factors (e.g. organisational setting), organisational trust, and institutional factors, i.e. the trust that people have in the institutions that develop or regulate the AI systems (i.e. institution-based trust) Hoff \BBA Bashir (\APACyear2015); Siau \BBA Wang (\APACyear2018); R. Yang \BBA Wibowo (\APACyear2022), as well as task related ones such as the type of system, system complexity, task complexity (e.g. tasks of varying cognitive load Wang \BBA Yin (\APACyear2022); Jiang \BOthers. (\APACyear2022)), workload, perceived risks (e.g. using AI in a medical application Bussone \BOthers. (\APACyear2015)), perceived benefits, and framing of task Hoff \BBA Bashir (\APACyear2015); Siau \BBA Wang (\APACyear2018); Schaefer \BOthers. (\APACyear2016).

Research into XAI systems covers a wide variety of application domains Adadi \BBA Berrada (\APACyear2018); Ferreira \BBA Monteiro (\APACyear2020); Lai \BOthers. (\APACyear2021), which is supported by our survey. The surveyed papers are concerned with various domains, ranging from higher risk ones (e.g. medical diagnosis, self-driving cars, finance, or law, e.g. recidivism prediction) to low-medium risk (e.g. entertainment or social media related such as movie recommendation) applications. Table 1 shows the most frequently specified application domains. Many studies did not target a specific application domain but rather had users perform a mock or proxy task, such as a generic image classification or some other ML recommendation task.

Different types of applications are used in the user studies, such as decision support systems Wanner \BOthers. (\APACyear2020); Sarah Bayer (\APACyear2021); Anik \BBA Bunt (\APACyear2021); Thaler \BBA Schmid (\APACyear2021); Wang \BBA Yin (\APACyear2023); Chen \BOthers. (\APACyear2023), recommender systems Guesmi \BOthers. (\APACyear2021); Shin (\APACyear2021); Bansal \BOthers. (\APACyear2021); Eslami \BOthers. (\APACyear2018); Cramer \BOthers. (\APACyear2008); Eiband \BOthers. (\APACyear2019); Gurney \BOthers. (\APACyear2022); Kunkel \BOthers. (\APACyear2019); Berkovsky \BOthers. (\APACyear2017); Zhao \BOthers. (\APACyear2019); Suresh \BOthers. (\APACyear2020); Guesmi \BOthers. (\APACyear2023), image classification Yu \BOthers. (\APACyear2018, \APACyear2017, \APACyear2019); Nourani \BOthers. (\APACyear2019); Mahsan Nourani (\APACyear2020); Leichtmann \BOthers. (\APACyear2023); F. Yang \BOthers. (\APACyear2020); Yu \BOthers. (\APACyear2016); Suresh \BOthers. (\APACyear2020), text annotation Papenmeier \BOthers. (\APACyear2021, \APACyear2022); Schmidt \BBA Biessmann (\APACyear2019); Linder \BOthers. (\APACyear2021), or speech recognition Anik \BBA Bunt (\APACyear2021).

Table 1: Application domain categories

| Application domain | Papers |
|---|---|
| Entertainment | \citeAkulesza2013too, Ehsan2020, Cramer2008, Kunkel2019, Berkovsky2017, Bayer2021, Schmidt2020, Schmidt2019 |
| Law/Legal | \citeAWang2021, Wang2022, Bansal2021, Lakkaraju2020, Anik2021 |
| Medicine/Healthcare | \citeABussone2015, Leffrang2021, Jiang2022, Alam2021 |
| Social Media/News | \citeAPapenmeier2022, Papenmeier2021, Linder2021, Eslami2018 |
| Education | \citeAKizilcec2016, Cheng2019, Anik2021 |
| Transportation | \citeAOmeiza2021, Koerber2018 |
| Finance | \citeAWang2023 |
| Scientific Research | \citeAThaler2021, Guesmi2023 |

7.1.2 Factors related to the user

Users themselves can also have an impact on the human-AI system interaction and how user trust develops. Common user-related terms that are frequently employed to analyze and evaluate trust include user knowledge, technical proficiency, familiarity, confidence, beliefs, faith, emotions, and personal attachments Mohseni \BOthers. (\APACyear2021). A person’s personality (disposition to trust) Siau \BBA Wang (\APACyear2018); Hoff \BBA Bashir (\APACyear2015) and ability are of concern Siau \BBA Wang (\APACyear2018). Similarly, a user’s self-confidence, mood, and emotional state can also have an impact on the level of trust Schaefer \BOthers. (\APACyear2016); Hoff \BBA Bashir (\APACyear2015). Additionally, demographic factors such as gender, age, or culture Hoff \BBA Bashir (\APACyear2015) and various cultural aspects such as individualism and power relations within a culture can impact a person’s propensity to trust an AI system Chien \BOthers. (\APACyear2015, \APACyear2016). Data experts, similar to AI novices, benefit from interpretability in order to assess model uncertainty and trustworthiness Mohseni \BOthers. (\APACyear2021).

Another important factor that can influence the effect of explanations on users is their domain knowledge Wang \BBA Yin (\APACyear2023). Domain experts are capable of dynamically adjusting the perceived trustworthiness of an AI model by using its explanations Mahsan Nourani (\APACyear2020). Furthermore, Mahsan Nourani (\APACyear2020) observe that novice (non-expert) users suffer from over-reliance due to their lack of knowledge, which results in their inability to properly detect errors. However, while domain expertise does impact user trust, Ooge \BBA Verbert (\APACyear2021) observe that, as expectations and personal experiences play a significant role, domain expertise alone cannot fully predict people’s trust in a model.

In our survey we find that most papers do not target a specific type of user (31 papers). Often a generic, mixed group of participants is recruited via crowdsourcing platforms, such as Amazon Mechanical Turk or Prolific. Other authors specifically select one user type such as non-experts Yu \BOthers. (\APACyear2017, \APACyear2019); Leffrang \BBA Müller (\APACyear2021); Mahsan Nourani (\APACyear2020); Jiang \BOthers. (\APACyear2022); Cheng \BOthers. (\APACyear2019); F. Yang \BOthers. (\APACyear2020); Ming Yin (\APACyear2019); Ehsan \BBA Riedl (\APACyear2020); Eslami \BOthers. (\APACyear2018); Rechkemmer \BBA Yin (\APACyear2022); Körber \BOthers. (\APACyear2018) or domain experts Lakkaraju \BBA Bastani (\APACyear2020); Wanner \BOthers. (\APACyear2020); Zhou \BOthers. (\APACyear2019); Drozdal \BOthers. (\APACyear2020); Kunkel \BOthers. (\APACyear2019); Berkovsky \BOthers. (\APACyear2017). In most studies a variety of demographic information about the user is collected (e.g. education level Omeiza \BOthers. (\APACyear2021)), even when the authors do not target any group specifically. Moreover, a number of studies look specifically at a variety of groups with varying attributes, such as varying levels of AI Nourani \BOthers. (\APACyear2019); Anik \BBA Bunt (\APACyear2021); Suresh \BOthers. (\APACyear2020); Wang \BBA Yin (\APACyear2023) or domain Sarah Bayer (\APACyear2021); Suresh \BOthers. (\APACyear2020); Wang \BBA Yin (\APACyear2023) expertise.

While many studies evaluate user trust as a static property, it is essential, when interacting with complex AI systems, to take into account the evolution of users’ experience and learning over time Mohseni \BOthers. (\APACyear2021); Lopes \BOthers. (\APACyear2022). The long term evaluation of XAI systems can help in the estimation of valuable user experience factors such as over- and undertrust Mohseni \BOthers. (\APACyear2021). Prior knowledge and belief are important factors in shaping a person’s initial trust Mohseni \BOthers. (\APACyear2021). Users of ML systems are in a constant learning state; therefore, even without model updates, their mental models and trust depend on their knowledge and familiarity with the system Lopes \BOthers. (\APACyear2022).

7.2 (X)AI system

The factors of trust related to the AI are often categorized according to performance, process, and purpose factors Lee \BBA See (\APACyear2004); Siau \BBA Wang (\APACyear2018); Hoff \BBA Bashir (\APACyear2015). Many of the factors are not related to model performance, but instead depend on its design, appearance or usability Kaplan \BOthers. (\APACyear2023); Hoff \BBA Bashir (\APACyear2015). The type of technology (e.g. using a DNN (deep neural network) blackbox model Bansal \BOthers. (\APACyear2021) or decision tree Zhang \BOthers. (\APACyear2020)) can also have an impact on trust Schaefer \BOthers. (\APACyear2016). When defining and constructing an XAI system a distinction between two key components should be considered, namely the (model) descriptors and the (explanation) interface. \citeANauta2022 define an explanation as “a presentation of (aspects of) the reasoning, functioning and/or behavior of a machine learning model in human-understandable terms”. Similarly, the concept of cues introduced by \citeASchlicker2021 contains both the notion of the presentation (e.g. the aesthetics of the interface Schlicker \BBA Langer (\APACyear2021)) and content of explanations (e.g. descriptors such as inputs or outputs of the system Schlicker \BBA Langer (\APACyear2021)). So, from the perspective of studying human-centric evaluation of XAI, an explanation consists of both its content (e.g. the descriptive information about the reasoning, functioning, and behavior of the system and underlying ML model) and the way in which it is presented (e.g. the user interface, its modality, or a researcher describing how the system is supposed to work or what its limitations are).

The explanation content can be any descriptive information about the model and its outputs (e.g. local output explanation or descriptive statistics of the training data), which we refer to as the (model/system) descriptors that can be used. The presentation, in contrast, is the way in which this information is presented to the user, for example the design of the user interface and the modality in which the explanation is presented (e.g. visual or textual). Important factors for the descriptors and interface of the XAI system are the relevance, availability, detection and utilization of the cues that they provide to the users Schlicker \BBA Langer (\APACyear2021). In the following sections we describe both the different model descriptors (Section 7.2.1) that can be used as the content and the presentation of the interface (Section 7.2.2).

7.2.1 Model descriptors

When asking what information to provide the user we need to consider what questions users may ask about or of the system. \citeALiao2021, Lim2009 provide a list of such questions. These questions range from asking what inputs the model used to asking how the system came to its output and what reasoning it applied. \citeAMohseni2021, Anik2021, Liao2021 give an overview of different explanation approaches, or types, that are already used to answer such questions. Several authors Schlicker \BBA Langer (\APACyear2021); Lai \BOthers. (\APACyear2021); Vilone \BBA Longo (\APACyear2020); Liao \BBA Varshney (\APACyear2021) have focused on the types of information or feedback users can get from an AI system. As we mentioned, \citeASchlicker2021 describe the cues that users can make use of. Similarly, \citeALai2021 describe AI assistance elements, by which they mean additional information that can be provided to the user besides simply giving them the model’s prediction, such as information about the output (e.g. explanations or uncertainty estimates), the model or training data (e.g. feature importances, model performance, or what type of algorithm is used), or other AI system elements that affect a user’s agency or experience. \citeAVilone2020, Vilone2021 mention the use of explanators, which are what the end-user will interact with; this is similar to \citeASchlicker2021 and \citeALai2021’s concepts of cues and assistance elements. These similar approaches highlight an overlapping sense of what information users would require from the system and how this information might be provided by explanations. We summarize the available descriptive information that can be used in the AI system’s interface and accompanying explanations as the system or model descriptors. The descriptors used provide the content of the explanations that the user is given about the system.

From this we can see that an explanation does not have to be confined to the outputs generated by explainability methods, but can be any descriptive information about the model and its outputs. Based on the observations from our survey and the related literature, which we described in the previous paragraph, we summarize the descriptors that have already been of interest in user studies concerned with user trust. Therefore, to help the identification of existing insights on descriptors and effects of different explanatory content, we provide the following list:

  • Input sample (e.g. (tabular) feature values, text, image)

  • Output prediction

  • Explanations of outputs

    • Uncertainty estimates

    • Local explanation (e.g. example based, counterfactuals, salience maps)

    • Local feature importances / input influences

  • Underlying model

    • Training data (e.g. data distribution, features used, modality of data)

    • Algorithm type / training procedure

    • Inherently interpretable model information (e.g. Decision Tree)

    • Global feature ranking / importance

    • Model performance metrics (e.g. model accuracy, precision/recall scores)

In our survey we find that the most commonly used descriptors are different types of local explanations (depending on input data modality and application) or uncertainty estimates. Many types of local explanations are used, namely example based F. Yang \BOthers. (\APACyear2020); Chen \BOthers. (\APACyear2023); Alam \BBA Mueller (\APACyear2021); Leichtmann \BOthers. (\APACyear2023); Kulesza \BOthers. (\APACyear2013), contrastive Omeiza \BOthers. (\APACyear2021), (pixel) attribution Nourani \BOthers. (\APACyear2019); Mahsan Nourani (\APACyear2020); Leichtmann \BOthers. (\APACyear2023); Bussone \BOthers. (\APACyear2015), counterfactual or ’What if’ Guesmi \BOthers. (\APACyear2023); Wang \BBA Yin (\APACyear2022), and nearest neighbor Wang \BBA Yin (\APACyear2022) explanations, as well as descriptions of the model’s reasoning or logical rationale for the output. As we detail later in this section, it is common for a mock model to be used instead of an actual underlying ML model. In these cases the specific local explanation method is not always described in detail; rather, only a high-level description of the local explanation that users receive is given. Other studies make use of feature importance or feature contributions Cheng \BOthers. (\APACyear2019); Wang \BBA Yin (\APACyear2022); Buçinca \BOthers. (\APACyear2021); Zhou \BOthers. (\APACyear2019); Lu \BBA Yin (\APACyear2021); Chen \BOthers. (\APACyear2023) or display or describe the input features of the model and the output sample Cheng \BOthers. (\APACyear2019); Lim \BOthers. (\APACyear2009); Zhou \BOthers. (\APACyear2019); Drozdal \BOthers. (\APACyear2020); Zhao \BOthers. (\APACyear2019); Rechkemmer \BBA Yin (\APACyear2022); Lu \BBA Yin (\APACyear2021); Wang \BBA Yin (\APACyear2023); Chen \BOthers. (\APACyear2023).

Users are also often provided with some form of model performance metrics, such as the accuracy Papenmeier \BOthers. (\APACyear2022, \APACyear2021); Ming Yin (\APACyear2019); Lim \BOthers. (\APACyear2009); Rechkemmer \BBA Yin (\APACyear2022); Lu \BBA Yin (\APACyear2021); Thaler \BBA Schmid (\APACyear2021); Suresh \BOthers. (\APACyear2020); Drozdal \BOthers. (\APACyear2020), confusion matrices Drozdal \BOthers. (\APACyear2020), or class-based error rates Thaler \BBA Schmid (\APACyear2021). Some less frequently used descriptors provide the users with (introductory) information about the system and underlying model Körber \BOthers. (\APACyear2018); Chen \BOthers. (\APACyear2023), such as the model architecture Suresh \BOthers. (\APACyear2020), which algorithm was used, system limitations Körber \BOthers. (\APACyear2018), what the explanatory process is and how it works Zhao \BOthers. (\APACyear2019), or information about the training data, data distributions, and process of feature engineering Anik \BBA Bunt (\APACyear2021); Drozdal \BOthers. (\APACyear2020).

In order to test the impact of different descriptors, they are manipulated in various ways. This can be done either in a within-subject or a between-subject design. The most common strategy is to either show or not show a specific descriptor, which is usually applied for various local explanations and uncertainty estimates. Another strategy is to supply the user with explanations of various levels of detail (e.g. only showing an uncertainty estimate, or combining it with a pixel attribution) or various combinations of different explanations (e.g. combining uncertainty estimates with an example-based explanation). The modality of the explanation and the interface can also vary or be combined (e.g. showing a text-based explanation along with an input image). We discuss the design and modality of the user interface in the following section (Section 7.2.2).

Additionally, authors may try to determine the effects of the trustworthiness of the explanations and underlying model. In most of the papers the actual underlying model that is used is not clearly specified. Often this is because they use a mock or proxy task with a simulated model. This type of setup is referred to as a Wizard of Oz setup Lai \BOthers. (\APACyear2021). The benefit of this is that it gives the researchers full control over the outputs that the user sees and allows them to specifically tune the (actual) trustworthiness of the system to determine to what extent this affects the users Lai \BOthers. (\APACyear2021). For example, it allows the researchers to control many elements of the explanation interface, such as the use of different descriptor classes (rather than actual XAI methods) and interface designs. The drawback is, of course, that such a setup does not tell us anything about a specific XAI method or algorithm. Additionally, given the complexity of model behavior, it can be challenging to design realistic studies using this method, and a lack of realism can hinder the validity and generalizability of study results Lai \BOthers. (\APACyear2021).
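To illustrate how a Wizard of Oz setup can fix the actual trustworthiness of a simulated system, the following minimal Python sketch generates mock "model" outputs with a chosen accuracy. It is not taken from any of the surveyed papers; the function names `mock_predictions` and `observed_accuracy`, the binary labels, and the 20 trials are hypothetical choices made only for illustration.

```python
import random


def mock_predictions(ground_truth, accuracy, labels=(0, 1), seed=0):
    """Simulate model outputs that agree with the ground truth at a fixed rate.

    Hypothetical helper: no ML model is involved, the experimenter
    decides in advance how many outputs are wrong.
    """
    rng = random.Random(seed)
    n = len(ground_truth)
    n_wrong = round((1.0 - accuracy) * n)
    # Pick which trials will show an incorrect output.
    wrong_idx = set(rng.sample(range(n), n_wrong))
    preds = []
    for i, y in enumerate(ground_truth):
        if i in wrong_idx:
            # Show any label other than the correct one.
            preds.append(rng.choice([lab for lab in labels if lab != y]))
        else:
            preds.append(y)
    return preds


def observed_accuracy(ground_truth, preds):
    """Fraction of simulated outputs that match the ground truth."""
    return sum(y == p for y, p in zip(ground_truth, preds)) / len(ground_truth)


truth = [0, 1] * 10  # 20 hypothetical trials with known ground truth
high = mock_predictions(truth, accuracy=0.9)  # "trustworthy" condition
low = mock_predictions(truth, accuracy=0.6)   # "less trustworthy" condition
print(observed_accuracy(truth, high), observed_accuracy(truth, low))
```

Fixing the number of errors, rather than drawing each error independently, keeps the shown accuracy identical for every participant in a condition, which is one way to use the control that such a setup offers.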

The papers that do state a specific ML method make use of a range of different models of varying complexity and interpretability, from decision trees Wanner \BOthers. (\APACyear2020); Gurney \BOthers. (\APACyear2022), SVMs Rechkemmer \BBA Yin (\APACyear2022), logistic regression Wang \BBA Yin (\APACyear2022), and Naive Bayes Rechkemmer \BBA Yin (\APACyear2022) models to Random Forest Rechkemmer \BBA Yin (\APACyear2022), CNN (e.g. ResNet) Papenmeier \BOthers. (\APACyear2021); Wanner \BOthers. (\APACyear2020); Zhou \BOthers. (\APACyear2019); Leichtmann \BOthers. (\APACyear2023), language (e.g. LSTM) Wanner \BOthers. (\APACyear2020); Bansal \BOthers. (\APACyear2021), time series Leffrang \BBA Müller (\APACyear2021), and Reinforcement Learning Pynadath \BOthers. (\APACyear2018) models. Most researchers study the effects of different models, whether they are proxy or mock models (i.e. Wizard of Oz setups) or actual underlying models. The most common approach is to use models of varying accuracy Yu \BOthers. (\APACyear2018, \APACyear2017, \APACyear2019); Bansal \BOthers. (\APACyear2021); Donald R. Honeycutt (\APACyear2020); Ming Yin (\APACyear2019); Zhou \BOthers. (\APACyear2018, \APACyear2019); Nourani \BOthers. (\APACyear2019) and to observe whether this has an effect on the user, or whether users can accurately detect the change and adjust their perceived trustworthiness of the model and their reliance accordingly. \citeAWang2023 observe the effects of model changes. \citeAMiller2016 use two models with different levels of automation capability. \citeAYu2017, Yu2019 study the effects of varying accuracy over time. \citeAZhang2020 use models that are based on different sets of features, where one uses all available features and another uses only part of the feature set.

7.2.2 AI interface

Along with the explanations’ content (i.e. used descriptors), the presentation of this information also has an impact on the user Hoff \BBA Bashir (\APACyear2015). \citeAhoff2015trust describe several design features related to trust in automation and the development of trustworthy automation, such as communication style, ease-of-use, transparency and feedback, level of control, and appearance. They provide a number of recommendations for the design of these features. \citeANauta2022 describe three quality properties, or criteria, related to the presentation of the explanation: compactness (i.e. the size of the explanation), compositionality (i.e. formatting and organization of the information), and confidence (i.e. presence and accuracy of probability information). In our survey we find different interface modalities being used, where textual interfaces, such as showing a worded explanation of the output, are most frequently used (45 papers), followed by visual or graphical descriptions (28 papers), such as image-based methods (e.g. salience maps or displaying the input image) Yu \BOthers. (\APACyear2018, \APACyear2017, \APACyear2019); Nourani \BOthers. (\APACyear2019); Mahsan Nourani (\APACyear2020); Leichtmann \BOthers. (\APACyear2023); Thaler \BBA Schmid (\APACyear2021); Wang \BBA Yin (\APACyear2021, \APACyear2022); F. Yang \BOthers. (\APACyear2020); Donald R. Honeycutt (\APACyear2020); Wang \BBA Yin (\APACyear2023), graphs Leffrang \BBA Müller (\APACyear2021); Cheng \BOthers. (\APACyear2019); Bansal \BOthers. (\APACyear2021); Ming Yin (\APACyear2019); Linder \BOthers. (\APACyear2021); Wanner \BOthers. (\APACyear2020); Buçinca \BOthers. (\APACyear2021); Zhou \BOthers. (\APACyear2019); Drozdal \BOthers. (\APACyear2020); Gurney \BOthers. (\APACyear2022); Pynadath \BOthers. (\APACyear2018); Ooge \BBA Verbert (\APACyear2021); Rechkemmer \BBA Yin (\APACyear2022); Lu \BBA Yin (\APACyear2021); Chen \BOthers. (\APACyear2023); Guesmi \BOthers. (\APACyear2023), or video Suresh \BOthers. (\APACyear2020).

Many studies make use of feature-based display methods Kulesza \BOthers. (\APACyear2013); Bussone \BOthers. (\APACyear2015); Zhang \BOthers. (\APACyear2020); Guesmi \BOthers. (\APACyear2021); Jiang \BOthers. (\APACyear2022); Cheng \BOthers. (\APACyear2019); Wang \BBA Yin (\APACyear2022); Ming Yin (\APACyear2019); Linder \BOthers. (\APACyear2021); Lim \BOthers. (\APACyear2009); Wanner \BOthers. (\APACyear2020); Buçinca \BOthers. (\APACyear2021); Zhou \BOthers. (\APACyear2019); Drozdal \BOthers. (\APACyear2020); Lu \BBA Yin (\APACyear2021), such as input features (e.g. showing the values of the input features for the sample) and output explanation (e.g. local feature importance), or global feature importance to describe the model. Some papers perform visual highlighting of different elements of the interface Papenmeier \BOthers. (\APACyear2022, \APACyear2021); Bansal \BOthers. (\APACyear2021); Linder \BOthers. (\APACyear2021); Schmidt \BBA Biessmann (\APACyear2019); Wang \BBA Yin (\APACyear2023); Chen \BOthers. (\APACyear2023); Guesmi \BOthers. (\APACyear2023), such as highlighting the words with the highest feature importances, to make them stand out more compared to the other parts of the interface and/or explanations. \citeAMiller2016 provide users with both visual and voice cues about navigation in an automated driving system. \citeAGuesmi2023 make use of an interactive interface that the user can use to explore the problem domain, model, outputs, and output explanations, in order to determine the effects that different levels of interactivity can have on users.

In our survey we find that several studies attempt to discern the effects of different kinds of interface design rather than the descriptor content. \citeAGuesmi2021, Bansal2021, Bayer2021, Alam2021, Linder2021, Miller2016, Omeiza2021, Gurney2022 make use of varying presentation formats with different combinations of explanations in different types of modalities. \citeAYang2020 investigate the effect of different kinds of spatial layouts for displaying graphical example-based explanations, comparing grid, tree, and graph-based structures, as well as using the input image versus a rose chart of features. \citeAKulms2019 investigate the effects of different kinds of anthropomorphism.

7.3 Evaluation & trust measurement

In Section 6 we mentioned the evaluation criteria which are often used in the context of the human-centric evaluation of XAI. For each of the surveyed papers we have collected the evaluation criteria that are applied alongside trust or reliance related criteria. The criteria found in the surveyed papers fall into four main categories: Trust and Reliance, Understanding and Mental Models, System Satisfaction and Usability, and Task Performance. Some miscellaneous criteria we classify in the ’Other’ category. Figure 2 shows for each of the categories how many studies use criteria related to that category. In Table 2 we provide a full overview of the specific criteria that are applied for each category and which papers use each criterion.

Figure 2: Number of papers that make use of one or more evaluation criteria per category.
Table 2: Evaluation criteria per category
Category

Criteria

Trust & Reliance

trust (ABI) Guesmi \BOthers. (\APACyear2021); Berkovsky \BOthers. (\APACyear2017), trust change Yu \BOthers. (\APACyear2017); Mahsan Nourani (\APACyear2020); Wang \BBA Yin (\APACyear2023), trust in automation Papenmeier \BOthers. (\APACyear2022), trust score Kulesza \BOthers. (\APACyear2013), propensity to trust Papenmeier \BOthers. (\APACyear2022); F. Yang \BOthers. (\APACyear2020); Schmidt \BOthers. (\APACyear2020), reliance Bussone \BOthers. (\APACyear2015); Yu \BOthers. (\APACyear2018, \APACyear2017); Wang \BBA Yin (\APACyear2021, \APACyear2022); D. Miller \BOthers. (\APACyear2016); Yu \BOthers. (\APACyear2016); Buçinca \BOthers. (\APACyear2021); Körber \BOthers. (\APACyear2018); Lu \BBA Yin (\APACyear2021); Schmidt \BBA Biessmann (\APACyear2019); Chen \BOthers. (\APACyear2023), willingness to accept Wanner \BOthers. (\APACyear2020), willingness to follow predictions Leffrang \BBA Müller (\APACyear2021); F. Yang \BOthers. (\APACyear2020); Wanner \BOthers. (\APACyear2020); Suresh \BOthers. (\APACyear2020), ability to form appropriate trust (control factor) Kulms \BBA Kopp (\APACyear2019), perceived user trust Omeiza \BOthers. (\APACyear2021), effect demographic characteristic of users on trust Papenmeier \BOthers. (\APACyear2022), acceptance Jiang \BOthers. (\APACyear2022); Bansal \BOthers. (\APACyear2021); Cramer \BOthers. (\APACyear2008); Wanner \BOthers. (\APACyear2020), switch rate Leffrang \BBA Müller (\APACyear2021); Chen \BOthers. (\APACyear2023), advice adoption Jiang \BOthers. (\APACyear2022); Bansal \BOthers. (\APACyear2021); Wang \BBA Yin (\APACyear2023), agreement rate Mahsan Nourani (\APACyear2020); Wang \BBA Yin (\APACyear2021, \APACyear2022); Chen \BOthers. (\APACyear2023), disagreement rate Leffrang \BBA Müller (\APACyear2021), advice seeking Kulms \BBA Kopp (\APACyear2019)

Understanding & Mental Models

understanding Kulesza \BOthers. (\APACyear2013); Papenmeier \BOthers. (\APACyear2022); Omeiza \BOthers. (\APACyear2021); Cheng \BOthers. (\APACyear2019); Wang \BBA Yin (\APACyear2021, \APACyear2022); F. Yang \BOthers. (\APACyear2020); Linder \BOthers. (\APACyear2021); Alam \BBA Mueller (\APACyear2021); Eslami \BOthers. (\APACyear2018); Lim \BOthers. (\APACyear2009); Cramer \BOthers. (\APACyear2008); Eiband \BOthers. (\APACyear2019); Drozdal \BOthers. (\APACyear2020); Wang \BBA Yin (\APACyear2023), self-reported understanding Cheng \BOthers. (\APACyear2019), objective understanding Cheng \BOthers. (\APACyear2019), perceived understanding Papenmeier \BOthers. (\APACyear2021); Eiband \BOthers. (\APACyear2019); Zhao \BOthers. (\APACyear2019), comprehension Leichtmann \BOthers. (\APACyear2023); Cheng \BOthers. (\APACyear2019); Thaler \BBA Schmid (\APACyear2021), perception of accuracy Alam \BBA Mueller (\APACyear2021), user’s mental model fidelity Kulesza \BOthers. (\APACyear2013), uncertainty awareness Wang \BBA Yin (\APACyear2021, \APACyear2022), perceived system ability Holliday \BOthers. (\APACyear2016), perceived system accuracy Nourani \BOthers. (\APACyear2019); Mahsan Nourani (\APACyear2020); Donald R. Honeycutt (\APACyear2020); Alam \BBA Mueller (\APACyear2021); Rechkemmer \BBA Yin (\APACyear2022), perceived system performance Yu \BOthers. (\APACyear2019); Shin (\APACyear2021), perceived task performance Yu \BOthers. (\APACyear2019), perceived model accuracy Wang \BBA Yin (\APACyear2023), perceived model change Donald R. Honeycutt (\APACyear2020)

Task Performance

task performance Omeiza \BOthers. (\APACyear2021); Linder \BOthers. (\APACyear2021); Lim \BOthers. (\APACyear2009); Buçinca \BOthers. (\APACyear2021); Suresh \BOthers. (\APACyear2020); Schmidt \BBA Biessmann (\APACyear2019); Wang \BBA Yin (\APACyear2023); Chen \BOthers. (\APACyear2023), combined accuracy Zhang \BOthers. (\APACyear2020); Schmidt \BOthers. (\APACyear2020), combined performance Zhang \BOthers. (\APACyear2020); Leichtmann \BOthers. (\APACyear2023); F. Yang \BOthers. (\APACyear2020); Bansal \BOthers. (\APACyear2021); Thaler \BBA Schmid (\APACyear2021), complementary performance Bansal \BOthers. (\APACyear2021), correctness and response time as measures of human performance Linder \BOthers. (\APACyear2021), decision performance / combined performance Zhou \BOthers. (\APACyear2018), task execution speed Schmidt \BBA Biessmann (\APACyear2019), take-over performance Körber \BOthers. (\APACyear2018), team performance Bansal \BOthers. (\APACyear2021); Kulms \BBA Kopp (\APACyear2019); Gurney \BOthers. (\APACyear2022); Pynadath \BOthers. (\APACyear2018); Körber \BOthers. (\APACyear2018)

System Satisfaction & Usability

usability F. Yang \BOthers. (\APACyear2020), usefulness Bansal \BOthers. (\APACyear2021); Alam \BBA Mueller (\APACyear2021); Anik \BBA Bunt (\APACyear2021), satisfaction Guesmi \BOthers. (\APACyear2021); Alam \BBA Mueller (\APACyear2021); Wang \BBA Yin (\APACyear2023); Guesmi \BOthers. (\APACyear2023), user satisfaction Ehsan \BBA Riedl (\APACyear2020), user confidence Zhou \BOthers. (\APACyear2018); Wanner \BOthers. (\APACyear2020); Thaler \BBA Schmid (\APACyear2021); Guesmi \BOthers. (\APACyear2023), perceived ease of use Cramer \BOthers. (\APACyear2008); Guesmi \BOthers. (\APACyear2023), perceived usefulness Cramer \BOthers. (\APACyear2008), recommendation quality Kunkel \BOthers. (\APACyear2019), perceived control Holliday \BOthers. (\APACyear2016); Guesmi \BOthers. (\APACyear2023), mental demand Buçinca \BOthers. (\APACyear2021), perceived competence Cramer \BOthers. (\APACyear2008); Berkovsky \BOthers. (\APACyear2017); Sarah Bayer (\APACyear2021), persuasiveness Guesmi \BOthers. (\APACyear2021), quality of interpretability Schmidt \BBA Biessmann (\APACyear2019), explanation quality Kunkel \BOthers. (\APACyear2019), explanation type preference Chen \BOthers. (\APACyear2023), perceived accountability Shin (\APACyear2021), perceived consistency of AI explanations with prior knowledge Wang \BBA Yin (\APACyear2023), perceived quality of explanation Guesmi \BOthers. (\APACyear2023), perceived system complexity Buçinca \BOthers. (\APACyear2021), perceived transparency Shin (\APACyear2021); Cramer \BOthers. (\APACyear2008); Holliday \BOthers. (\APACyear2016), perceived trustworthiness Kulms \BBA Kopp (\APACyear2019); Bruzzese \BOthers. (\APACyear2020); Berkovsky \BOthers. (\APACyear2017); Sarah Bayer (\APACyear2021), perceived fairness Shin (\APACyear2021); Anik \BBA Bunt (\APACyear2021), perceived predictability Holliday \BOthers. (\APACyear2016), perceived need for explanation Cramer \BOthers. (\APACyear2008), expectation violation Kizilcec (\APACyear2016)

Others

plausibility Ehsan \BBA Riedl (\APACyear2020), completeness Alam \BBA Mueller (\APACyear2021), effectiveness Guesmi \BOthers. (\APACyear2021), efficiency Guesmi \BOthers. (\APACyear2021), knowledge gaps Kulesza \BOthers. (\APACyear2013), self-confidence in decision F. Yang \BOthers. (\APACyear2020); Wanner \BOthers. (\APACyear2020), situation awareness Pynadath \BOthers. (\APACyear2018), sufficiency Alam \BBA Mueller (\APACyear2021), transparency Guesmi \BOthers. (\APACyear2021); Berkovsky \BOthers. (\APACyear2017); Zhao \BOthers. (\APACyear2019); Guesmi \BOthers. (\APACyear2023), familiarity Papenmeier \BOthers. (\APACyear2022)

In \citeAKohn.2021 the authors give a review of different measurements of trust in automation. They describe three categories in which these methods can be placed: self-report, physiological, and behavioral. Self-report measures are measures where respondents report on their own behaviors, beliefs, attitudes, or intentions by receiving a question or prompt and selecting or detailing a response Kohn \BOthers. (\APACyear2021). Self-report measures are typically set up as surveys or questionnaires. \citeAKohn.2021 describe 16 different types of self-report methods used in trust in automation. Behavioral methods use the observation of participants’ behavioral processes or tendencies Kohn \BOthers. (\APACyear2021). Examples of behavioral measures include the tracking of combined human-machine performance, agreement rate, decision or response time, delegation or reliance, and economic trust games. Physiological methods are based on the measurement of biological responses from the user, such as muscle movements, heart rate, or neural activation Kohn \BOthers. (\APACyear2021).
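As a rough illustration of how two of the behavioral measures named above could be computed from logged trials, consider the following Python sketch. The `Trial` record and the exact operationalizations (agreement as the final decision matching the AI output, switching as adopting the AI output after initially disagreeing) are assumptions for illustration; individual studies may define these measures differently.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical log entry for one decision in a two-step task."""
    initial: int  # participant's decision before seeing the AI output
    ai: int       # AI recommendation shown to the participant
    final: int    # participant's decision after seeing the AI output


def agreement_rate(trials):
    """Share of trials in which the final decision equals the AI output."""
    return sum(t.final == t.ai for t in trials) / len(trials)


def switch_rate(trials):
    """Share of initially disagreeing trials in which the AI output was adopted."""
    disagreed = [t for t in trials if t.initial != t.ai]
    if not disagreed:
        return None  # undefined when the participant never disagreed
    return sum(t.final == t.ai for t in disagreed) / len(disagreed)


trials = [
    Trial(initial=1, ai=1, final=1),
    Trial(initial=0, ai=1, final=1),  # switched to the AI output
    Trial(initial=0, ai=1, final=0),  # kept own decision
    Trial(initial=1, ai=0, final=0),  # switched to the AI output
    Trial(initial=1, ai=1, final=1),
]
print(agreement_rate(trials), switch_rate(trials))
```

Under these assumptions the switch rate only considers trials with initial disagreement, so it is less affected than the agreement rate by cases in which participant and AI would have agreed anyway.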

With respect to the use of self-report and behavioral measures, \citeAscharowski2023distrust clarify that trust as an attitude should be measured via questionnaires whereas behavioural measures are suitable to assess reliance. The outcome of questionnaires is influenced by the participants’ ability to reflect on their attitude towards a system, which can be difficult for some participants Papenmeier \BOthers. (\APACyear2022). Such questionnaires measure the participants’ perception of their trust T. Miller (\APACyear2022), which does not always correspond to their actions. Across multiple studies in our overview, a discrepancy between self-report and behavioural measures is observed, e.g., Papenmeier \BOthers. (\APACyear2022, \APACyear2021); Wang \BBA Yin (\APACyear2023).

In practice, self-reported trust is employed most often. Typically, this entails a short questionnaire in which the participants rate their agreement with different statements (items) on a Likert scale. Frequently, custom self-report scales are used for assessing trust in automation Kohn \BOthers. (\APACyear2021), a procedure that is also observed in the evaluation of XAI methods Lopes \BOthers. (\APACyear2022) and in the broad context of trust in information systems Rusk (\APACyear2018). \citeAKohn.2021 highlight that researchers either state that they developed their own way of measuring trust, or they do not cite any source for their method. These custom scales can even take the form of single-item measurements Spain \BOthers. (\APACyear2008), which, in essence, simply ask “Did you trust our system?” Kohn \BOthers. (\APACyear2021). Despite the advantage in efficiency of single-item measures, they are usually less reliable than multiple items and narrow down the complex concept of trust to a single (dis)agreement.

The customization of self-report items has the further problem that their validity has not been tested. This hinders the interpretation of results because one cannot be certain whether they are actually measuring trust. Moreover, this contributes to a lack of generalization of trust measurement, because results obtained with different self-report items cannot be easily compared. A difference in outcomes could be due either to the method or manipulation that is investigated or to the measurement applied. Thus, without standardized measurements, the comparison between results and methods is obstructed. Consistent and standardized items would allow researchers to develop a better understanding of trust and distrust Rusk (\APACyear2018).

As discussed in Section 4, the distinction of trust and distrust may be of interest in the context of appropriate reliance. Examples demonstrating the merit of considering trust and distrust as separate dimensions can be found across different subfields of human-technology interaction Kohn \BOthers. (\APACyear2021); Harrison McKnight \BBA Chervany (\APACyear2001); McKnight \BOthers. (\APACyear2004); Benamati \BOthers. (\APACyear2006); Ou \BBA Sia (\APACyear2010); Fang \BOthers. (\APACyear2015); Thielsch \BOthers. (\APACyear2018). A difference between dispositional trust and dispositional distrust was observed in the context of online expert advice McKnight \BOthers. (\APACyear2004), and trust and distrust co-existed as distinct constructs in the context of online banking Benamati \BOthers. (\APACyear2006) and online shopping Ou \BBA Sia (\APACyear2010). A study on website design showed that trust and distrust are affected by different antecedents, and the performance of a trust-aware recommender system was improved by predicting not only trust but also distrust Fang \BOthers. (\APACyear2015). \citeAThielsch.2018 investigated work-related information systems and also identified trust and distrust as related yet separate influences on different outcome variables.

\citeAKohn.2021 note, however, that notwithstanding the evidence for the two-dimensional conceptualisation, uni-dimensional scales are the common form for assessing trust in automation Kohn \BOthers. (\APACyear2021). Of these, the Checklist for Trust between People and Automation Jian \BOthers. (\APACyear2000) is the most frequently used Kohn \BOthers. (\APACyear2021). This checklist measures trust and distrust as polar opposites along a single dimension. Five of the 12 items (statements rated by the user) measure distrust. In practice, these items are often reverse-scored and summed with the trust items to form one trust score, which was also suggested by the original authors of the scale Spain \BOthers. (\APACyear2008). In a critical validation attempt of this scale by \citeASpain.2008, a one-factor model (indicating the polar opposites along a single dimension) and a two-factor model were compared. This factor analysis provided evidence for the conceptualization of trust and distrust as separate, yet related constructs Spain \BOthers. (\APACyear2008). Reverse scoring distrust items to then sum them with the trust items entails a problematic entanglement of the two factors identified by \citeASpain.2008 and disregards the incremental insight gained by measuring trust and distrust individually.
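The difference between the two scoring approaches can be made concrete with a small Python sketch. It is only illustrative: the positions of the five distrust items (here assumed to be the first five of the 12), the 7-point response range, and the two response patterns are assumptions, not data from any study.

```python
# Assumed layout: first five items measure distrust, remaining seven trust.
DISTRUST_ITEMS = range(0, 5)
TRUST_ITEMS = range(5, 12)
SCALE_MIN, SCALE_MAX = 1, 7  # assumed 7-point response scale


def combined_score(responses):
    """One-dimensional scoring: reverse-score distrust items, then sum all 12."""
    reversed_distrust = [SCALE_MIN + SCALE_MAX - responses[i] for i in DISTRUST_ITEMS]
    trust = [responses[i] for i in TRUST_ITEMS]
    return sum(reversed_distrust) + sum(trust)


def separate_scores(responses):
    """Two-dimensional scoring: mean trust and mean distrust kept apart."""
    trust = sum(responses[i] for i in TRUST_ITEMS) / len(TRUST_ITEMS)
    distrust = sum(responses[i] for i in DISTRUST_ITEMS) / len(DISTRUST_ITEMS)
    return trust, distrust


# Two hypothetical participants with different response patterns.
neutral = [4, 4, 4, 4, 4] + [4, 4, 4, 4, 4, 4, 4]
ambivalent = [6, 6, 6, 6, 6] + [6, 6, 6, 5, 5, 5, 5]

print(combined_score(neutral), combined_score(ambivalent))
print(separate_scores(neutral), separate_scores(ambivalent))
```

In this constructed example both participants receive the same combined score, although the second reports both higher trust and higher distrust than the first; only the separate scores make this pattern visible.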

In his dissertation \citeARusk.2018 sets out to close a research gap and introduces a scale that measures trust and distrust separately and provides a first validation. The scale is developed for the context of information systems and might be applicable to the (X)AI context as well. Importantly, the author points out that the results need to be independently re-evaluated in a different study. In that regard we could not find any work with that aim and highlight this as promising work for future research.

Referring back to the clarification of the underlying aim of appropriate trust (Section 3.2), the work of \citeAWang2021, Wang2022 should be highlighted. When assessing behavioural trust they distinguished between appropriate trust, undertrust and overtrust, depending on the participants’ usage of the model and the correctness of the model. When a ground truth is known, this is a useful approach to measure the effects of an introduced method on the user in a clarified manner. For instance, in one study this approach made it possible to observe that feature importance and feature contribution slightly increased appropriate trust and decreased undertrust, and that, for feature importance, this came at the cost of a slight increase in overtrust Wang \BBA Yin (\APACyear2022).
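A simplified reading of this distinction can be sketched in Python as follows. The mapping used here (following a wrong model counts as overtrust, not following a correct model counts as undertrust, everything else as appropriate) is an assumption for illustration and may differ from the exact operationalization used by the cited authors; the trial data are hypothetical.

```python
from collections import Counter


def classify_trial(followed_model, model_correct):
    """Label one trial, given whether the user followed the model and
    whether the model output matched the known ground truth."""
    if followed_model and not model_correct:
        return "overtrust"
    if not followed_model and model_correct:
        return "undertrust"
    # Followed a correct model, or did not follow a wrong one.
    return "appropriate"


def trust_profile(trials):
    """Share of trials per category for a list of (followed, correct) pairs."""
    counts = Counter(classify_trial(f, c) for f, c in trials)
    n = len(trials)
    return {k: counts[k] / n for k in ("appropriate", "undertrust", "overtrust")}


# Hypothetical trials: (user followed model, model was correct)
trials = [(True, True), (True, False), (False, True), (False, False), (True, True)]
print(trust_profile(trials))
```

Reporting the three shares separately, rather than a single reliance rate, shows whether an intervention reduces one reliance problem at the expense of the other.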

Figure 3 shows the number of surveyed papers that use self-report trust measurements, behavioral trust measurements, or a combination of both. For each of the experiments in the papers we have summarized the outcomes related to the effects for the different measurements (e.g. whether the intervention (changing the AI system) had a positive measured effect on behavioral trust). In Figure 4 we provide a bar plot summarizing the effects for self-report and behavioral trust. From this we can see that the application of explanations mostly has a (moderately) positive effect on user trust, while a clear negative effect is not as common, especially for behavioral trust measures. Additionally, we observe that many experiments result in mixed or inconclusive outcomes. In Section 0.A.2 in the appendix we provide an overview of the outcomes for each of the experiments in the papers. \citeAZhou2018 is the only study in our survey in which a physiological trust measurement is applied; therefore, we did not include it in the strategies shown in Figure 3.

Figure 3: Number of papers that use self-report and/or behavioral trust measurement strategies.
Figure 4: Stated effects of methods on trust (counted across all studies). If a paper includes multiple experiments or uses both self-report and behavioral it may be counted multiple times.

8 Open questions & recommendations

From our survey we find that, while there is growing interest in the empirical evaluation of trust, several open questions and gaps in the field remain, from which we derive a number of recommendations. \citeAvanderWaa2021 provide a number of recommendations for the improvement of user-centric evaluation of XAI, such as on the constructs, use cases and experimental context, and measurements. \citeAMiller2022 lay out four requirements for the proper evaluation of warranted and unwarranted trust and distrust, namely the need to take into account and measure task performance, the need for risk, the need to allow users to choose to rely or not, and the ability to manipulate the trustworthiness of the AI system.

We find that only a limited part of the possible types of explanations, descriptors, and ways of presenting them has been investigated so far. This leaves ample opportunities for researchers, not least by integrating the evaluation of user trust and reliance into existing user studies and workflows related to (X)AI. Many different setups for various application domains, types, and users remain un(der)investigated for many of the potential antecedent and confounding factors of trust.

When comparing the descriptors, modulation strategies, and (XAI) methods used in the surveyed studies with the user questions and needs, types of explanations, descriptors, and XAI methods that are described in the various overviews, we find that there are still a number of areas and methods that have either not been investigated yet or for which results are currently inconclusive.

For researchers it is important to note that they can control certain factors that influence trust but not others. The factors that lie beyond the control of the developer should be considered as potential confounding variables in user studies. Potential confounders can be addressed in multiple ways. They can be controlled for by keeping them at a constant level, by balancing them between conditions, or by randomisation Sedlmeier (\APACyear2013). They can also be treated as further independent variables and be manipulated as well, for example by recruiting lay-users and domain experts as two experimental groups that can be compared. Most importantly, one should be conscious of the potential confounders and address them within the limits of the planned experiment. Considering which of the known confounders may be present in one’s study and addressing the relevant ones improves the quality of results, their interpretation, and the potential comparability of study outcomes.
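One way to combine balancing and randomisation is to assign participants to conditions at random within each level of a known confounder. The following Python sketch shows this under stated assumptions: the function `assign_conditions`, the expertise levels, the participant identifiers, and the condition names are all hypothetical and serve only as an example.

```python
import random
from collections import defaultdict


def assign_conditions(participants, conditions, seed=0):
    """Randomly assign participants to conditions, balanced within each stratum.

    participants: list of (participant_id, stratum) pairs, where the stratum
    is the level of a potential confounder (e.g. expertise).
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for pid, stratum in participants:
        by_stratum[stratum].append(pid)

    assignment = {}
    for pids in by_stratum.values():
        rng.shuffle(pids)  # random order within the stratum
        for i, pid in enumerate(pids):
            # Cycle through the conditions so group sizes differ by at most one.
            assignment[pid] = conditions[i % len(conditions)]
    return assignment


# Eight hypothetical participants: four lay-users and four domain experts.
participants = [(f"P{i}", "lay" if i < 4 else "expert") for i in range(8)]
assignment = assign_conditions(participants, ["explanation", "no_explanation"])
print(assignment)
```

With this scheme each condition contains the same number of lay-users and experts, so a difference between conditions cannot be attributed to an unequal distribution of expertise.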

Given that a number of the surveyed papers observe discrepancies between self-report and behavioural measurements, a combination of measurement types would be beneficial. This allows researchers to study potential differences between the reported attitude and the actual reliance. For example, in a low-risk scenario a person who reports trust may also rely on the system. Under higher risk, however, the same person may report trust but be more hesitant and not act upon it. By combining measurement types, such hypothetical patterns can be observed and investigated.

To at least partially overcome the mixed and inconclusive results in empirical research on the explainability-trust hypothesis, we make three suggestions. First, make explicit what you mean by trust by indicating which definition of trust you are following. Studies on trust (and distrust) in the XAI context need to continue to draw from the established work on (dis)trust without conflating it with common-sense reasoning on trust and distrust. Furthermore, state which perspective on trust is of interest for your study. There is an important difference between studying whether an XAI method affects the user’s trust in a single interaction and whether it affects the general attitude towards AI; the two should not be mixed. Second, standardized and validated scales should be used for self-report questionnaires to facilitate comparability across studies and generalization of observed results.

Third, consider that the distinction of trust and distrust may be of interest in the XAI context. Because explanations help identify both correct and wrong outputs, XAI methods may affect both the user’s trust and distrust. As a starting point to advance the consideration and evaluation of trust and distrust, we suggest the scale developed by \citeARusk.2018, which requires further independent validation as recommended by the author. This would provide further evidence for the two-dimensional concept of trust and distrust and make it easier to consider and evaluate trust and distrust separately.

In general, one should be aware of the context of one’s study. As the varying results of the surveyed studies show, the application domain, the targeted users, and their level of expertise are relevant to the effects of a method on the user’s trust and on other outcomes in general. The application domain and the entailed risk may inform which reliance problem (disuse or overtrust) is more important. Depending on the risk, unwarranted trust or unwarranted distrust can be problematic in different ways.

9 Conclusion

We have shown that there is growing interest in user trust in (X)AI and the development of methods that can foster appropriate reliance in trustworthy AI methods. We aimed to give an overview of the current theoretical view on trust in (X)AI and its related concepts. We provided a survey of existing empirical studies that perform human-centric evaluation of the effects that (X)AI methods have on their users, which factors are most important, and the extent to which they can affect user trust and distrust. We hope that this gives researchers and ML practitioners insight into concepts such as trust in AI, appropriate reliance, and trustworthiness of AI, and supports their ability to construct valuable studies on the effects of ML methods on user trust and its evaluation.

Not everything discussed here is necessarily relevant for each study or context. Yet, this contribution seeks to stimulate awareness of one’s application domain, user types and needs, the involved risk, the chosen type of measurement, and the implications all this encompasses. We find that work needs to be done both on the theoretical underpinnings of the constructs of trust and distrust and on how they can be properly measured and evaluated in the context of (X)AI.

Appendix 0.A Appendix

0.A.1 Description of surveyed papers on empirical evaluation of the effects of ML methods on user trust

Table 3: Survey papers.
Columns: Paper; Trust (Self-report, Behavioral, Physiological); Other criteria (Understanding & Mental models, Task Performance, System satisfaction & usability, Others)

\citeAkulesza2013too

\citeAKizilcec2016

\citeABussone2015

\citeAPapenmeier2022

\citeAZhou2018

\citeAYu2018

\citeAYu2017

\citeAYu2019

\citeALeffrang2021
\citeAZhang2020

\citeAnourani2019effects

\citeAOmeiza2021

\citeANourani2020

\citeAGuesmi2021

\citeAJiang2022
\citeAShin2021

\citeALeichtmann2023

\citeAPapenmeier2021

\citeACheng2019

\citeAWang2021

\citeAWang2022

\citeAYang2020

\citeABansal2021

\citeAHoneycutt2020

\citeAMingYin2019

\citeALinder2021

\citeAMiller2016

\citeALakkaraju2020

\citeAEhsan2020

\citeAAlam2021

\citeAEslami2018

\citeAKulms2019

\citeAYu2016

\citeALim2009a

\citeACramer2008

\citeAEiband2019

\citeABruzzese2020

\citeAWanner2020

\citeABucinca2021

\citeAZhou2019

\citeAThaler2021

\citeAdrozdal2020trust

\citeAGurney2022

\citeAKunkel2019

\citeAPynadath2018

\citeABerkovsky2017

\citeABayer2021

\citeAZhao2019

\citeASchmidt2020

\citeAOoge2021

\citeARechkemmer2022

\citeAKoerber2018

\citeAAnik2021

\citeASuresh2020

\citeALu2021

\citeASchmidt2019

\citeAHolliday2016

\citeAWang2023

\citeAChen2023

\citeAGuesmi2023

A.2 Trust measurement outcomes

Table 4: Measurement outcomes

| Paper | Measurement Type | Measured Effect |
|---|---|---|
| \citeAkulesza2013too | self-report | positive |
| \citeAKizilcec2016 | self-report | mixed |
| \citeABussone2015 | self-report | positive |
| | behavioral | mixed |
| \citeAPapenmeier2022 | self-report | mixed |
| | behavioral | no effect |
| \citeAYu2018 | self-report | positive |
| | behavioral | no effect |
| \citeAYu2017 | self-report | positive |
| | behavioral | positive |
| \citeAYu2019 | self-report | positive |
| | behavioral | positive |
| \citeALeffrang2021 | behavioral | no effect |
| \citeAZhang2020 | behavioral | positive |
| \citeAnourani2019effects | self-report | positive |
| \citeAOmeiza2021 | self-report | negative |
| \citeANourani2020 | self-report | negative |
| \citeAGuesmi2021 | self-report | mixed |
| \citeAJiang2022 | self-report | mixed |
| \citeAShin2021 | self-report | positive |
| \citeALeichtmann2023 | self-report | negative |
| \citeAPapenmeier2021 | self-report | mixed |
| | behavioral | positive |
| \citeACheng2019 | self-report | no effect |
| \citeAWang2021 | behavioral | mixed |
| \citeAWang2022 | behavioral | mixed |
| \citeAYang2020 | behavioral | positive |
| \citeABansal2021 | behavioral | mixed |
| \citeAMingYin2019 | self-report | positive |
| | behavioral | positive |
| \citeALinder2021 | behavioral | no effect |
| \citeAMiller2016 | self-report | no effect |
| | behavioral | no effect |
| \citeALakkaraju2020 | self-report | positive |
| \citeAAlam2021 | self-report | positive |
| \citeAEslami2018 | self-report | inconclusive |
| \citeAKulms2019 | self-report | positive |
| | behavioral | no effect |
| \citeAYu2016 | self-report | positive |
| \citeALim2009a | self-report | positive |
| \citeACramer2008 | self-report | no effect |
| \citeAEiband2019 | self-report | positive (but not fully desired) |
| \citeABruzzese2020 | self-report | mixed |
| \citeABucinca2021 | self-report | inconclusive |
| \citeAZhou2019 | self-report | positive |
| \citeAThaler2021 | self-report | positive |
| \citeAdrozdal2020trust | self-report | positive |
| \citeAKunkel2019 | self-report | positive |
| \citeAPynadath2018 | self-report | positive |
| \citeABerkovsky2017 | self-report | mixed |
| \citeABayer2021 | self-report | mixed |
| \citeASchmidt2020 | behavioral | negative |
| \citeAOoge2021 | self-report | mixed |
| \citeARechkemmer2022 | self-report | positive |
| | behavioral | positive |
| \citeAKoerber2018 | behavioral | positive |
| \citeAAnik2021 | self-report | positive |
| \citeASuresh2020 | behavioral | positive |
| \citeASchmidt2019 | behavioral | positive |
| \citeAHolliday2016 | self-report | mixed |
| \citeAWang2023 | self-report | positive |
| | behavioral | no effect |
| \citeAChen2023 | behavioral | positive |
| \citeAGuesmi2023 | self-report | positive |
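
Readers who want to compare how outcomes distribute across self-report and behavioral measurements can tally the rows of Table 4 programmatically. The following is a minimal sketch and not part of the original survey: `ROWS` contains only the first six rows of Table 4 as an illustrative excerpt, and the names `ROWS` and `tally_effects` are hypothetical.

```python
from collections import Counter, defaultdict

# Illustrative excerpt of Table 4 (first six rows only):
# (citation key, measurement type, measured effect).
ROWS = [
    ("kulesza2013too", "self-report", "positive"),
    ("Kizilcec2016", "self-report", "mixed"),
    ("Bussone2015", "self-report", "positive"),
    ("Bussone2015", "behavioral", "mixed"),
    ("Papenmeier2022", "self-report", "mixed"),
    ("Papenmeier2022", "behavioral", "no effect"),
]


def tally_effects(rows):
    """Count measured effects separately for each measurement type."""
    counts = defaultdict(Counter)
    for _paper, measurement, effect in rows:
        counts[measurement][effect] += 1
    return counts


if __name__ == "__main__":
    for measurement, effects in tally_effects(ROWS).items():
        print(measurement, dict(effects))
```

Extending `ROWS` with the remaining rows of the table would give the full distribution; keeping the two measurement types separate reflects that self-report and behavioral measures of the same study can disagree.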

References

  • Adadi, A., & Berrada, M. (2018). Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access, 6, 52138–52160. https://doi.org/10.1109/ACCESS.2018.2870052
  • Alam, L., & Mueller, S. (2021). Examining the effect of explanation on satisfaction and trust in AI diagnostic systems. BMC Medical Informatics and Decision Making, 21(1), 178. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01542-6
  • Anik, A. I., & Bunt, A. (2021). Data-Centric Explanations: Explaining Training Data of Machine Learning Systems to Promote Transparency. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3411764.3445736
  • Arrieta, A., Díaz-Rodríguez, N., Ser, J., Bennetot, A., Tabik, S., Barbado, A., … Herrera, F. (2020, December 30). Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. https://www.sciencedirect.com/science/article/abs/pii/S1566253519308103
  • Baer, M., & Colquitt, J. A. (2018). Moving toward a more comprehensive consideration of the antecedents of trust. Routledge Companion to Trust, 163–182.
  • Bansal, G., Wu, T., Zhou, J., Fok, R., Nushi, B., Kamar, E., … Weld, D. (2021). Does the Whole Exceed Its Parts? The Effect of AI Explanations on Complementary Team Performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3411764.3445717
  • Barredo Arrieta, A., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., … Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115. https://doi.org/10.1016/j.inffus.2019.12.012
  • Benamati, J., Serva, M. A., & Fuller, M. A. (2006). Are Trust and Distrust Distinct Constructs? An Empirical Study of the Effects of Trust and Distrust among Online Banking Users. In Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS’06). IEEE. https://doi.org/10.1109/hicss.2006.63
  • Berkovsky, S., Taib, R., & Conway, D. (2017). How to Recommend? User Trust Factors in Movie Recommender Systems. In Proceedings of the 22nd International Conference on Intelligent User Interfaces (pp. 287–300). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3025171.3025209
  • Bruzzese, T., Gao, I., Dietz, G., Ding, C., & Romanos, A. (2020). Effect of Confidence Indicators on Trust in AI-Generated Profiles. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–8). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3334480.3382842
  • Buçinca, Z., Malaya, M. B., & Gajos, K. Z. (2021, April). To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-Assisted Decision-Making. Proc. ACM Hum.-Comput. Interact., 5(CSCW1). https://doi.org/10.1145/3449287
  • Bussone, A., Stumpf, S., & O’Sullivan, D. (2015). The Role of Explanations on Trust and Reliance in Clinical Decision Support Systems. In 2015 International Conference on Healthcare Informatics (pp. 160–169). https://doi.org/10.1109/ICHI.2015.26
  • Chen, V., Liao, Q. V., Vaughan, J. W., & Bansal, G. (2023). Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations. ArXiv, abs/2301.07255.
  • Cheng, H.-F., Wang, R., Zhang, Z., O’Connell, F., Gray, T., Harper, F. M., & Zhu, H. (2019). Explaining decision-making algorithms through UI: Strategies to help non-expert stakeholders. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–12).
  • Chien, S.-Y., Lewis, M., Hergeth, S., Semnani-Azad, Z., & Sycara, K. (2015, September). Cross-Country Validation of a Cultural Scale in Measuring Trust in Automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 59(1), 686–690. https://doi.org/10.1177/1541931215591149
  • Chien, S.-Y., Lewis, M., & Sycara, K. (2016). Influence of Cultural Factors in Dynamic Trust in Automation. In 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2016), October 9-12, 2016, Budapest, Hungary. IEEE. https://ieeexplore.ieee.org/abstract/document/7844677
  • Cho, J. (2006). The mechanism of trust and distrust formation and their relational outcomes. Journal of Retailing, 82(1), 25–35. https://doi.org/10.1016/j.jretai.2005.11.002
  • Cramer, H., Evers, V., Ramlal, S., Van Someren, M., Rutledge, L., Stash, N., … Wielinga, B. (2008). The effects of transparency on trust in and acceptance of a content-based art recommender. User Modeling and User-Adapted Interaction, 18(5), 455–496. https://link.springer.com/article/10.1007/s11257-008-9051-3
  • de Visser, E. J., Peeters, M. M. M., Jung, M. F., Kohn, S., Shaw, T. H., Pak, R., & Neerincx, M. A. (2020). Towards a Theory of Longitudinal Trust Calibration in Human–Robot Teams. International Journal of Social Robotics, 12(2), 459–478. https://doi.org/10.1007/s12369-019-00596-x
  • Dietz, G., & Hartog, D. N. D. (2006, September). Measuring trust inside organisations. Personnel Review, 35(5), 557–588. https://doi.org/10.1108/00483480610682299
  • Dimanov, B., Bhatt, U., Jamnik, M., & Weller, A. (2020). You Shouldn’t Trust Me: Learning Models Which Conceal Unfairness From Multiple Explanation Methods.
  • Honeycutt, D. R., Nourani, M., & Ragan, E. D. (2020). Soliciting Human-in-the-Loop User Feedback for Interactive Machine Learning Reduces User Trust and Impressions of Model Accuracy. In Proceedings of the Eighth AAAI Conference on Human Computation and Crowdsourcing (HCOMP-20).
  • Drozdal, J., Weisz, J., Wang, D., Dass, G., Yao, B., Zhao, C., … Su, H. (2020). Trust in AutoML: Exploring Information Needs for Establishing Trust in Automated Machine Learning Systems. In Proceedings of the 25th International Conference on Intelligent User Interfaces (pp. 297–307). https://doi.org/10.1145/3377325.3377501
  • Ehsan, U., & Riedl, M. O. (2020). Human-Centered Explainable AI: Towards a Reflective Sociotechnical Approach. In International Conference on Human-Computer Interaction (pp. 449–466). Springer, Cham. https://doi.org/10.1007/978-3-030-60117-1_33
  • Eiband, M., Buschek, D., Kremer, A., & Hussmann, H. (2019). The Impact of Placebic Explanations on Trust in Intelligent Systems. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–6). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3290607.3312787
  • Eslami, M., Krishna Kumaran, S. R., Sandvig, C., & Karahalios, K. (2018). Communicating Algorithmic Process in Online Behavioral Advertising. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1–13). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3173574.3174006
  • Fang, H., Guo, G., & Zhang, J. (2015). Multi-faceted trust and distrust prediction for recommender systems. Decision Support Systems, 71, 37–47. https://doi.org/10.1016/j.dss.2015.01.005
  • Fein, S. (1996). Effects of suspicion on attributional thinking and the correspondence bias. Journal of Personality and Social Psychology, 70(6), 1164–1184. https://doi.org/10.1037/0022-3514.70.6.1164
  • Ferrario, A., & Loi, M. (2022). How explainability contributes to trust in AI. In 2022 ACM Conference on Fairness, Accountability, and Transparency (pp. 1457–1466).
  • Ferreira, J. J., & Monteiro, M. S. (2020). What are people doing about XAI user experience? A survey on AI explainability research and practice. In International Conference on Human-Computer Interaction (pp. 56–73). https://link.springer.com/chapter/10.1007/978-3-030-49760-6_4
  • Field, Z., Miles, J., & Field, A. (2012). Discovering statistics using R. Sage.
  • Frison, A.-K., Wintersberger, P., Riener, A., Schartmüller, C., Boyle, L. N., Miller, E., & Weigl, K. (2019). In UX We Trust. In S. Brewster (Ed.), Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–13). New York, NY, United States: Association for Computing Machinery. https://doi.org/10.1145/3290605.3300374
  • Gaube, S., Suresh, H., Raue, M., Merritt, A., Berkowitz, S. J., Lermer, E., … Ghassemi, M. (2021). Do as AI say: susceptibility in deployment of clinical decision-aids. NPJ Digital Medicine, 4(1), 31.
  • Glikson, E., & Woolley, A. W. (2020). Human Trust in Artificial Intelligence: Review of Empirical Research. Academy of Management Annals, 14(2), 627–660. https://doi.org/10.5465/annals.2018.0057
  • Guesmi, M., Chatti, M. A., Joarder, S., Ain, Q. U., Alatrash, R., Siepmann, C., & Vahidi, T. (2023). Interactive Explanation with Varying Level of Details in an Explainable Scientific Literature Recommender System. ArXiv, abs/2306.05809.
  • Guesmi, M., Chatti, M. A., Vorgerd, L., Joarder, S. A., Ain, Q. U., Ngo, T., … Muslim, A. (2021). Input or Output: Effects of Explanation Focus on the Perception of Explainable Recommendation with Varying Level of Details. In IntRS@RecSys (pp. 55–72).
  • Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2019, September). A Survey of Methods for Explaining Black Box Models. ACM Computing Surveys, 51(5), 1–42. https://doi.org/10.1145/3236009
  • Gunning, D., & Aha, D. (2019). DARPA’s Explainable Artificial Intelligence (XAI) Program. AI Magazine, 40(2), 44–58. https://doi.org/10.1609/aimag.v40i2.2850
  • Guo, S.-L., Lumineau, F., & Lewicki, R. J. (2017). Revisiting the Foundations of Organizational Distrust. Foundations and Trends® in Management, 1(1), 1–88. https://doi.org/10.1561/3400000001
  • Guo, Y., & Yang, X. J. (2021). Modeling and Predicting Trust Dynamics in Human–Robot Teaming: A Bayesian Inference Approach. International Journal of Social Robotics, 13(8), 1899–1909. https://link.springer.com/article/10.1007/s12369-020-00703-3 https://doi.org/10.1007/s12369-020-00703-3
  • Gurney, N., Pynadath, D. V., & Wang, N. (2022). Measuring and Predicting Human Trust in Recommendations from an AI Teammate. In H. Degen & S. Ntoa (Eds.), Artificial Intelligence in HCI (Vol. 13336, pp. 22–34). Springer International Publishing. https://doi.org/10.1007/978-3-031-05643-7_2
  • Han, W., & Schulz, H.-J. (2020). Beyond Trust Building — Calibrating Trust in Visual Analytics. In 2020 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX) (pp. 9–15). https://doi.org/10.1109/TREX51495.2020.00006
  • Harrison McKnight, D., & Chervany, N. L. (2001). Trust and Distrust Definitions: One Bite at a Time. In Trust in Cyber-societies (pp. 27–54). Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45547-7_3
  • HLEG (\APACyear2019) \APACinsertmetastarhlegAI{APACrefauthors}HLEG, A.  \APACrefYearMonthDay2019. \APACrefbtitleEthics guidelines for trustworthy AI. Ethics guidelines for trustworthy ai. {APACrefURL} [22.03.2023]https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai \PrintBackRefs\CurrentBib
  • Hoff \BBA Bashir (\APACyear2015) \APACinsertmetastarhoff2015trust{APACrefauthors}Hoff, K\BPBIA.\BCBT \BBA Bashir, M.  \APACrefYearMonthDay2015. \BBOQ\APACrefatitleTrust in automation: Integrating empirical evidence on factors that influence trust Trust in automation: Integrating empirical evidence on factors that influence trust.\BBCQ \APACjournalVolNumPagesHuman factors573407–434. {APACrefDOI} \doi10.1177/0018720814547570 \PrintBackRefs\CurrentBib
  • R. Hoffman \BOthers. (\APACyear2018) \APACinsertmetastarHoffman.2018{APACrefauthors}Hoffman, R., Mueller, S\BPBIT., Klein, G.\BCBL \BBA Litman, J.  \APACrefYear2018. \APACrefbtitleMeasuring Trust in the XAI Context: Technical Report, DARPA Explainable AI Program. Measuring trust in the xai context: Technical report, darpa explainable ai program. {APACrefDOI} \doi10.31234/osf.io/e3kv9 \PrintBackRefs\CurrentBib
  • R\BPBIR. Hoffman \BOthers. (\APACyear2009) \APACinsertmetastarhoffman2009dynamics{APACrefauthors}Hoffman, R\BPBIR., Lee, J\BPBID., Woods, D\BPBID., Shadbolt, N., Miller, J.\BCBL \BBA Bradshaw, J\BPBIM.  \APACrefYearMonthDay2009. \BBOQ\APACrefatitleThe dynamics of trust in cyberdomains The dynamics of trust in cyberdomains.\BBCQ \APACjournalVolNumPagesIEEE Intelligent Systems2465–11. \PrintBackRefs\CurrentBib
  • R\BPBIR. Hoffman \BOthers. (\APACyear2018) \APACinsertmetastarhoffman2018metrics{APACrefauthors}Hoffman, R\BPBIR., Mueller, S\BPBIT., Klein, G.\BCBL \BBA Litman, J.  \APACrefYearMonthDay2018. \BBOQ\APACrefatitleMetrics for explainable AI: Challenges and prospects Metrics for explainable ai: Challenges and prospects.\BBCQ \APACjournalVolNumPagesarXiv preprint arXiv:1812.04608. {APACrefURL} https://arxiv.longhoe.net/abs/1812.04608 \PrintBackRefs\CurrentBib
  • Holliday \BOthers. (\APACyear2016) \APACinsertmetastarHolliday2016{APACrefauthors}Holliday, D., Wilson, S.\BCBL \BBA Stumpf, S.  \APACrefYearMonthDay2016. \BBOQ\APACrefatitleUser Trust in Intelligent Systems: A Journey Over Time User trust in intelligent systems: A journey over time.\BBCQ \BIn \APACrefbtitleProceedings of the 21st International Conference on Intelligent User Interfaces Proceedings of the 21st international conference on intelligent user interfaces (\BPG 164–168). \APACaddressPublisherNew York, NY, USAAssociation for Computing Machinery. {APACrefURL} https://doi.org/10.1145/2856767.2856811 {APACrefDOI} \doi10.1145/2856767.2856811 \PrintBackRefs\CurrentBib
  • Jacovi \BOthers. (\APACyear2021) \APACinsertmetastarJacovi.2021{APACrefauthors}Jacovi, A., Marasović, A., Miller, T.\BCBL \BBA Goldberg, Y.  \APACrefYearMonthDay2021. \BBOQ\APACrefatitleFormalizing Trust in Artificial Intelligence Formalizing trust in artificial intelligence.\BBCQ \BIn \APACrefbtitleProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency Proceedings of the 2021 acm conference on fairness, accountability, and transparency (\BPGS 624–635). \APACaddressPublisherNew York,NY,United StatesAssociation for Computing Machinery. {APACrefDOI} \doi10.1145/3442188.3445923 \PrintBackRefs\CurrentBib
  • Jian \BOthers. (\APACyear2000) \APACinsertmetastarJian.2000{APACrefauthors}Jian, J\BHBIY., Bisantz, A\BPBIM.\BCBL \BBA Drury, C\BPBIG.  \APACrefYearMonthDay2000. \BBOQ\APACrefatitleFoundations for an Empirically Determined Scale of Trust in Automated Systems Foundations for an empirically determined scale of trust in automated systems.\BBCQ \APACjournalVolNumPagesInternational Journal of Cognitive Ergonomics4153–71. {APACrefDOI} \doi10.1207/S15327566 IJCE0401_04 \PrintBackRefs\CurrentBib
  • Jiang \BOthers. (\APACyear2022) \APACinsertmetastarJiang2022{APACrefauthors}Jiang, J., Kahai, S.\BCBL \BBA Yang, M.  \APACrefYearMonthDay2022. \BBOQ\APACrefatitleWho needs explanation and when? Juggling explainable AI and user epistemic uncertainty Who needs explanation and when? juggling explainable ai and user epistemic uncertainty.\BBCQ \APACjournalVolNumPagesInternational Journal of Human-Computer Studies165102839. {APACrefURL} https://www.sciencedirect.com/science/article/pii/S1071581922000660 {APACrefDOI} \doi10.1016/j.ijhcs.2022.102839 \PrintBackRefs\CurrentBib
  • Joppe (\APACyear2000) \APACinsertmetastarJoppe2000{APACrefauthors}Joppe, M.  \APACrefYearMonthDay2000. \APACrefbtitleThe Research Process. Retrieved February 25, 1998. The research process. retrieved february 25, 1998. \PrintBackRefs\CurrentBib
  • Kaplan, A. D., Kessler, T. T., Brill, J. C., & Hancock, P. A. (2023). Trust in artificial intelligence: Meta-analytic findings. Human Factors, 65(2), 337–359. PMID: 34048287. https://doi.org/10.1177/00187208211013988
  • Kastner, L., Langer, M., Lazar, V., Schomacker, A., Speith, T., & Sterz, S. (2021). On the relation of trust and explainability: Why to engineer for trustworthiness. In Proceedings, 29th IEEE International Requirements Engineering Conference Workshops: REW 2021: September 20-24 2021, online event (pp. 169–175). Los Alamitos, California: IEEE Computer Society, Conference Publishing Services. https://doi.org/10.1109/REW53955.2021.00031
  • Kee, H. W., & Knox, R. E. (1970). Conceptual and methodological considerations in the study of trust and suspicion. Journal of Conflict Resolution, 14(3), 357–366.
  • Kizilcec, R. F. (2016, May). How much information? Effects of transparency on trust in an algorithmic interface. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM. https://doi.org/10.1145/2858036.2858402
  • Kohn, S. C., de Visser, E. J., Wiese, E., Lee, Y.-C., & Shaw, T. H. (2021). Measurement of trust in automation: A narrative review and reference guide. Frontiers in Psychology, 12, 604977. https://doi.org/10.3389/fpsyg.2021.604977
  • Kulesza, T., Stumpf, S., Burnett, M., Yang, S., Kwan, I., & Wong, W.-K. (2013). Too much, too little, or just right? Ways explanations impact end users’ mental models. In 2013 IEEE Symposium on Visual Languages and Human Centric Computing (pp. 3–10). https://doi.org/10.1109/VLHCC.2013.6645235
  • Kulms, P., & Kopp, S. (2019). More human-likeness, more trust? The effect of anthropomorphism on self-reported and behavioral trust in continued and interdependent human-agent cooperation. In Proceedings of Mensch und Computer 2019 (pp. 31–42). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3340764.3340793
  • Kunkel, J., Donkers, T., Michael, L., Barbu, C.-M., & Ziegler, J. (2019). Let me explain: Impact of personal and impersonal explanations on trust in recommender systems. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–12). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3290605.3300717
  • Körber, M., Baseler, E., & Bengler, K. (2018, January). Introduction matters: Manipulating trust in automation and reliance in automated driving. Applied Ergonomics, 66, 18–31. https://doi.org/10.1016/j.apergo.2017.07.006
  • Lai, V., Chen, C., Liao, Q. V., Smith-Renner, A., & Tan, C. (2021). Towards a science of human-AI decision making: A survey of empirical studies.
  • Lakkaraju, H., & Bastani, O. (2020, February). “How do I fool you?” Manipulating user trust via misleading black box explanations. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 79–85). ACM. https://doi.org/10.1145/3375627.3375833
  • Lee, J. D., & Moray, N. (1994). Trust, self-confidence, and operators’ adaptation to automation. International Journal of Human-Computer Studies, 40(1), 153–184. https://doi.org/10.1006/ijhc.1994.1007
  • Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46(1), 50–80. https://journals.sagepub.com/doi/abs/10.1518/hfes.46.1.50_30392
  • Leffrang, D., & Müller, O. (2021). Should I follow this model? The effect of uncertainty visualization on the acceptance of time series forecasts. In 2021 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX) (pp. 20–26). https://doi.org/10.1109/TREX53765.2021.00009
  • Leichtmann, B., Humer, C., Hinterreiter, A., Streit, M., & Mara, M. (2023). Effects of explainable artificial intelligence on trust and human behavior in a high-risk decision task. Computers in Human Behavior, 139, 107539. https://doi.org/10.1016/j.chb.2022.107539
  • Lewicki, R. J., McAllister, D. J., & Bies, R. J. (1998). Trust and distrust: New relationships and realities. Academy of Management Review, 23(3), 438–458. https://doi.org/10.5465/amr.1998.926620
  • Liao, Q. V., & Varshney, K. R. (2021). Human-centered explainable AI (XAI): From algorithms to user experiences. arXiv preprint arXiv:2110.10790.
  • Lim, B. Y., & Dey, A. K. (2009). Assessing demand for intelligibility in context-aware applications. In Proceedings of the 11th International Conference on Ubiquitous Computing (pp. 195–204). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1620545.1620576
  • Lim, B. Y., Dey, A. K., & Avrahami, D. (2009). Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 2119–2128). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1518701.1519023
  • Linder, R., Mohseni, S., Yang, F., Pentyala, S. K., Ragan, E. D., & Hu, X. B. (2021). How level of explanation detail affects human performance in interpretable intelligent systems: A study on explainable fact checking. Applied AI Letters, 2(4), e49. https://doi.org/10.1002/ail2.49
  • Lopes, P., Silva, E., Braga, C., Oliveira, T., & Rosado, L. (2022). XAI systems evaluation: A review of human and computer-centred methods. Applied Sciences, 12(19). https://doi.org/10.3390/app12199423
  • Lu, Z., & Yin, M. (2021). Human reliance on machine learning models when performance feedback is limited: Heuristics and risks. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3411764.3445562
  • Luhmann, N. (2009). Vertrauen: Ein Mechanismus der Reduktion sozialer Komplexität (4th ed., reprint). Stuttgart: Lucius & Lucius.
  • Lukyanenko, R., Maass, W., & Storey, V. C. (2022, December 1). Trust in artificial intelligence: From a foundational trust framework to emerging research opportunities. Electronic Markets, 32(4), 1993–2020. https://doi.org/10.1007/s12525-022-00605-4
  • Madsen, M., & Gregor, S. (2000). Measuring human-computer trust. In 11th Australasian Conference on Information Systems (Vol. 53, pp. 6–8).
  • Nourani, M., King, J. T., & Ragan, E. D. (2020). The role of domain expertise in user trust and the impact of first impressions with intelligent systems. In Proceedings of the Eighth AAAI Conference on Human Computation and Crowdsourcing (HCOMP-20). https://ojs.aaai.org/index.php/HCOMP/article/view/7469
  • Mayer, J., & Mussweiler, T. (2011). Suspicious spirits, flexible minds: When distrust enhances creativity. Journal of Personality and Social Psychology, 101(6), 1262–1277. https://doi.org/10.1037/a0024407
  • Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995a). An integrative model of organizational trust. Academy of Management Review, 20(3), 709–734. https://doi.org/10.5465/amr.1995.9508080335
  • Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995b). An integrative model of organizational trust (Vol. 20, pp. 709–734). Academy of Management, Briarcliff Manor, NY 10510.
  • Mayo, R. (2015). Cognition is a matter of trust: Distrust tunes cognitive processes. European Review of Social Psychology, 26(1), 283–327. https://doi.org/10.1080/10463283.2015.1117249
  • McBride, M., & Morgan, S. (2010). Trust calibration for automated decision aids. Institute for Homeland Security Solutions, 1–11.
  • McGuirl, J. M., & Sarter, N. B. (2006). Supporting trust calibration and the effective use of decision aids by presenting dynamic system confidence information. Human Factors, 48(4), 656–665.
  • McKnight, Kacmar, & Choudhury. (2004). Dispositional trust and distrust distinctions in predicting high- and low-risk Internet expert advice site perceptions. e-Service Journal, 3(2), 35. https://doi.org/10.2979/esj.2004.3.2.35
  • Meske, C., Bunde, E., Schneider, J., & Gersch, M. (2020, December). Explainable artificial intelligence: Objectives, stakeholders, and future research opportunities. Information Systems Management, 39(1), 53–63. https://doi.org/10.1080/10580530.2020.1849465
  • Miller, D., Johns, M., Mok, B., Gowda, N., Sirkin, D., Lee, K., & Ju, W. (2016, September). Behavioral measurement of trust in automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 60(1), 1849–1853. https://doi.org/10.1177/1541931213601422
  • Miller, T. (2022). Are we measuring trust correctly in explainability, interpretability, and transparency research? ArXiv, abs/2209.00651.
  • Yin, M., Wortman Vaughan, J., & Wallach, H. (2019). Understanding the effect of accuracy on trust in machine learning models. https://doi.org/10.1145/3290605.3300509
  • Mohseni, S., Zarei, N., & Ragan, E. D. (2021, September). A multidisciplinary survey and framework for design and evaluation of explainable AI systems. ACM Trans. Interact. Intell. Syst., 11(3–4). https://doi.org/10.1145/3387166
  • Mueller, S. T., Veinott, E. S., Hoffman, R. R., Klein, G., Alam, L., Mamun, T., & Clancey, W. J. (2021). Principles of explanation in human-AI systems. CoRR, abs/2102.04972. https://arxiv.longhoe.net/abs/2102.04972
  • Muir, B. M., & Moray, N. (1996). Trust in automation. Part II. Experimental studies of trust and human intervention in a process control simulation. Ergonomics, 39(3), 429–460.
  • Nauta, M., Trienes, J., Nguyen, E., Peters, M., Schmitt, Y., Schlötterer, J., & Seifert, C. (2022, January 20). From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI.
  • Nourani, M., Kabir, S., Mohseni, S., & Ragan, E. D. (2019). The effects of meaningful and meaningless explanations on trust and perceived system accuracy in intelligent systems. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing (Vol. 7, pp. 97–105). https://ojs.aaai.org/index.php/HCOMP/article/view/5284
  • Omeiza, D., Kollnig, K., Web, H., Jirotka, M., & Kunze, L. (2021, July). Why not explain? Effects of explanations on human perceptions of autonomous driving. In 2021 IEEE International Conference on Advanced Robotics and Its Social Impacts (ARSO). IEEE. https://doi.org/10.1109/arso51874.2021.9542835
  • Ooge, J., & Verbert, K. (2021, October). Trust in prediction models: A mixed-methods pilot study on the impact of domain expertise. In 2021 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX). IEEE. https://doi.org/10.1109/trex53765.2021.00007
  • Ou, C. X., & Sia, C. L. (2010). Consumer trust and distrust: An issue of website design. International Journal of Human-Computer Studies, 68(12), 913–934. https://doi.org/10.1016/j.ijhcs.2010.08.003
  • Papenmeier, A., Englebienne, G., & Seifert, C. (2021). How model accuracy and explanation fidelity influence user trust in AI.
  • Papenmeier, A., Kern, D., Englebienne, G., & Seifert, C. (2022, August). It’s complicated: The relationship between user trust, model accuracy and explanations in AI. ACM Transactions on Computer-Human Interaction, 29(4), 1–33. https://doi.org/10.1145/3495013
  • Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39(2), 230–253. https://doi.org/10.1518/001872097778543886
  • Peters, T. M., & Visser, R. W. (2023). The importance of distrust in AI.
  • Poortinga, W., & Pidgeon, N. F. (2003). Exploring the dimensionality of trust in risk regulation. Risk Analysis: An Official Publication of the Society for Risk Analysis, 23(5), 961–972. https://doi.org/10.1111/1539-6924.00373
  • Posten, A.-C., & Gino, F. (2021). How trust and distrust shape perception and memory. Journal of Personality and Social Psychology, 121(1), 43–58. https://doi.org/10.1037/pspa0000269
  • Pynadath, D. V., Barnes, M. J., Wang, N., & Chen, J. Y. (2018). Transparency communication for machine learning in human-automation interaction. Human and Machine Learning: Visible, Explainable, Trustworthy and Transparent, 75–90. https://link.springer.com/chapter/10.1007/978-3-319-90403-0_5
  • Rechkemmer, A., & Yin, M. (2022). When confidence meets accuracy: Exploring the effects of multiple performance indicators on trust in machine learning models. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3491102.3501967
  • Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144).
  • Riegelsberger, J., Sasse, M. A., & McCarthy, J. D. (2005, March). The mechanics of trust: A framework for research and design. International Journal of Human-Computer Studies, 62(3), 381–422. https://doi.org/10.1016/j.ijhcs.2005.01.001
  • Rohlfing, K. J., Cimiano, P., Scharlau, I., Matzner, T., Buhl, H. M., Buschmeier, H., … Wrede, B. (2021). Explanation as a social practice: Toward a conceptual framework for the social design of AI systems. IEEE Transactions on Cognitive and Developmental Systems, 13(3), 717–728. https://doi.org/10.1109/tcds.2020.3044366
  • Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215.
  • Rusk, J. D. (2018). Trust and distrust scale development: Operationalization and instrument validation.
  • Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K.-R. (2021). Explaining deep neural networks and beyond: A review of methods and applications. Proceedings of the IEEE, 109(3), 247–278. https://doi.org/10.1109/JPROC.2021.3060483
  • Bayer, S., Gimpel, H., & Markgraf, M. (2021). The role of domain expertise in trusting and following explainable AI decision support systems. https://doi.org/10.1080/12460125.2021.1958505
  • Schaefer, K. E., Chen, J. Y. C., Szalma, J. L., & Hancock, P. A. (2016, March). A meta-analysis of factors influencing the development of trust in automation. Human Factors: The Journal of the Human Factors and Ergonomics Society, 58(3), 377–400. https://doi.org/10.1177/0018720816634228
  • Scharowski, N., & Perrig, S. A. (2023). Distrust in (X)AI – measurement artifact or distinct construct? arXiv preprint arXiv:2303.16495.
  • Schlicker, N., & Langer, M. (2021). Towards warranted trust: A model on the relation between actual and perceived system trustworthiness. In Proceedings of Mensch und Computer 2021 (pp. 325–329).
  • Schmidt, P., & Biessmann, F. (2019). Quantifying interpretability and trust in machine learning systems. arXiv preprint arXiv:1901.08558.
  • Schmidt, P., Biessmann, F., & Teubner, T. (2020). Transparency and trust in artificial intelligence systems. Journal of Decision Systems, 29(4), 260–278. https://doi.org/10.1080/12460125.2020.1819094
  • Schoorman, F. D., Mayer, R. C., & Davis, J. H. (2007). An integrative model of organizational trust: Past, present, and future. Academy of Management Review, 32(2), 344–354. https://doi.org/10.5465/amr.2007.24348410
  • Schweer, M., Vaske, C., & Vaske, A.-K. (2009). Zur Funktionalität und Dysfunktionalität von Misstrauen in virtuellen Organisationen. https://dl.gi.de/handle/20.500.12116/35191
  • Seckler, M., Heinz, S., Forde, S., Tuch, A. N., & Opwis, K. (2015). Trust and distrust on the web: User experiences and website characteristics. Computers in Human Behavior, 45, 39–50. https://doi.org/10.1016/j.chb.2014.11.064
  • Sedlmeier, P. (2013). Forschungsmethoden und Statistik für Psychologen und Sozialwissenschaftler. Pearson Deutschland GmbH.
  • Shin, D. (2021). The effects of explainability and causability on perception, trust, and acceptance: Implications for explainable AI. International Journal of Human-Computer Studies, 146, 102551. https://www.sciencedirect.com/science/article/pii/S1071581920301531 https://doi.org/10.1016/j.ijhcs.2020.102551
  • Siau, K., & Wang, W. (2018). Building trust in artificial intelligence, machine learning, and robotics.
  • Spain, R. D., Bustamante, E. A., & Bliss, J. P. (2008). Towards an empirically developed scale for system trust: Take two. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 52(19), 1335–1339. https://doi.org/10.1177/154193120805201907
  • Stanton, B., & Jensen, T. (2021). Trust and artificial intelligence. https://doi.org/10.6028/nist.ir.8332-draft
  • Suresh, H., Lao, N., & Liccardi, I. (2020). Misplaced trust: Measuring the interference of machine learning in human decision-making. In Proceedings of the 12th ACM Conference on Web Science (pp. 315–324). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3394231.3397922
  • Thaler, A. M., & Schmid, U. (2021). Explaining machine learned relational concepts in visual domains-effects of perceived accuracy on joint performance and trust. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 43).
  • Thiebes, S., Lins, S., & Sunyaev, A. (2021). Trustworthy artificial intelligence. Electronic Markets, 31(2), 447–464. https://doi.org/10.1007/s12525-020-00441-4
  • Thielsch, M. T., Meeßen, S. M., & Hertel, G. (2018). Trust and distrust in information systems at the workplace. PeerJ, 6, e5483. https://doi.org/10.7717/peerj.5483
  • Tintarev, N., & Masthoff, J. (2012, October 1). Evaluating the effectiveness of explanations for recommender systems. User Modeling and User-Adapted Interaction, 22(4), 399–439. https://doi.org/10.1007/s11257-011-9117-5
  • Toreini, E., Aitken, M., Coopamootoo, K., Elliott, K., Zelaya, C. G., & Van Moorsel, A. (2020). The relationship between trust in AI and trustworthy machine learning technologies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (pp. 272–283).
  • van der Waa, J., Nieuwburg, E., Cremers, A., & Neerincx, M. (2021). Evaluating XAI: A comparison of rule-based and example-based explanations. Artificial Intelligence, 291, 103404. https://www.sciencedirect.com/science/article/pii/S0004370220301533 https://doi.org/10.1016/j.artint.2020.103404
  • Vaske, C. (2016). Misstrauen und Vertrauen. Universität Vechta.
  • Vilone, G., & Longo, L. (2020, October 13). Explainable artificial intelligence: A systematic review.
  • Vilone, G., & Longo, L. (2021, December 25). Notions of explainability and evaluation approaches for explainable artificial intelligence. Information Fusion, 76, 89–106. https://doi.org/10.1016/j.inffus.2021.05.009
  • Wang, X., & Yin, M. (2021, April 16). Are explanations helpful? A comparative study of the effects of explanations in AI-assisted decision-making. In 26th International Conference on Intelligent User Interfaces. ACM. https://doi.org/10.1145/3397481.3450650
  • Wang, X., & Yin, M. (2022, April). Effects of explanations in AI-assisted decision making: Principles and comparisons. ACM Transactions on Interactive Intelligent Systems. https://doi.org/10.1145/3519266
  • Wang, X., & Yin, M. (2023). Watch out for updates: Understanding the effects of model explanation updates in AI-assisted decision making. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3544548.3581366
  • Wanner, J., Herm, L.-V., Heinrich, K., Janiesch, C., & Zschech, P. (2020). White, grey, black: Effects of XAI augmentation on the confidence in AI-based decision support systems [Short paper].
  • Yang, F., Huang, Z., Scholtz, J., & Arendt, D. L. (2020). How do visual explanations foster end users’ appropriate trust in machine learning? In Proceedings of the 25th International Conference on Intelligent User Interfaces (pp. 189–201). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3377325.3377480
  • Yang, R., & Wibowo, S. (2022, December 1). User trust in artificial intelligence: A comprehensive conceptual framework. Electronic Markets, 32(4), 2053–2077. https://doi.org/10.1007/s12525-022-00592-6
  • Yu, K., Berkovsky, S., Conway, D., Taib, R., Zhou, J., & Chen, F. (2016, July). Trust and reliance based on system accuracy. In Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization. ACM. https://doi.org/10.1145/2930238.2930290
  • Yu, K., Berkovsky, S., Conway, D., Taib, R., Zhou, J., & Chen, F. (2018). Do I trust a machine? Differences in user trust based on system performance. In Human and Machine Learning (pp. 245–264). Springer International Publishing. https://doi.org/10.1007/978-3-319-90403-0_12
  • Yu, K., Berkovsky, S., Taib, R., Conway, D., Zhou, J., & Chen, F. (2017, March). User trust dynamics: An investigation driven by differences in system performance. In Proceedings of the 22nd International Conference on Intelligent User Interfaces. ACM. https://doi.org/10.1145/3025171.3025219
  • Yu, K., Berkovsky, S., Taib, R., Zhou, J., & Chen, F. (2019). Do I trust my machine teammate? An investigation from perception to decision. In Proceedings of the 24th International Conference on Intelligent User Interfaces (pp. 460–468). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3301275.3302277
  • Zhang, Y., Liao, Q. V., & Bellamy, R. K. E. (2020, January). Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM. https://doi.org/10.1145/3351095.3372852
  • Zhao, R., Benbasat, I., & Cavusoglu, H. (2019). Do users always want to know more? Investigating the relationship between system transparency and users’ trust in advice-giving systems. In European Conference on Information Systems. https://aisel.aisnet.org/ecis2019_rip/42/
  • Zhou, J., Gandomi, A. H., Chen, F., & Holzinger, A. (2021). Evaluating the quality of machine learning explanations: A survey on methods and metrics. https://doi.org/10.3390/electronics10050593
  • Zhou, J., Li, Z., Hu, H., Yu, K., Chen, F., Li, Z., & Wang, Y. (2019, May). Effects of influence on user trust in predictive decision making. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. ACM. https://doi.org/10.1145/3290607.3312962
  • Zhou, J., Yu, K., & Chen, F. (2018). Revealing user confidence in machine learning-based decision making. In J. Zhou & F. Chen (Eds.), Human and Machine Learning: Visible, Explainable, Trustworthy and Transparent (pp. 225–244). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-90403-0_11