XAI—Explainable artificial intelligence


Science Robotics  18 Dec 2019:
Vol. 4, Issue 37, eaay7120
DOI: 10.1126/scirobotics.aay7120


Explainability is essential for users to effectively understand, trust, and manage powerful artificial intelligence applications.

Recent successes in machine learning (ML) have led to a new wave of artificial intelligence (AI) applications that offer extensive benefits to a diverse range of fields. However, many of these systems are not able to explain their autonomous decisions and actions to human users. Explanations may not be essential for certain AI applications, and some AI researchers argue that the emphasis on explanation is misplaced, too difficult to achieve, and perhaps unnecessary. However, for many critical applications in defense, medicine, finance, and law, explanations are essential for users to understand, trust, and effectively manage these new, artificially intelligent partners [see recent reviews (1–3)].

Recent AI successes are largely attributed to new ML techniques that construct models in their internal representations. These include support vector machines (SVMs), random forests, probabilistic graphical models, reinforcement learning (RL), and deep learning (DL) neural networks. Although these models exhibit high performance, they are opaque in terms of explainability. There may be inherent conflict between ML performance (e.g., predictive accuracy) and explainability. Often, the highest performing methods (e.g., DL) are the least explainable, and the most explainable (e.g., decision trees) are the least accurate. Figure 1 illustrates this with a notional graph of the performance-explainability tradeoff for some of the ML techniques.
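
To make the explainable end of this spectrum concrete, here is a minimal, hypothetical sketch: a rule-list model (in the spirit of a decision tree or Bayesian rule list) exposes its complete reasoning with every prediction, which is precisely the transparency that higher-performing but opaque models give up. The feature names and thresholds below are invented for illustration.

```python
# Hypothetical, fully interpretable rule-list model: every prediction is
# produced by human-readable rules, and the rule that fired is reported.
# The thresholds and features are invented for illustration only.

def interpretable_credit_rule(income, debt):
    """A two-rule model whose entire decision process can be read directly."""
    if debt > 0.5 * income:
        return "deny", "debt exceeds 50% of income"
    if income < 20_000:
        return "deny", "income below 20,000 threshold"
    return "approve", "passed both rules"

decision, reason = interpretable_credit_rule(income=30_000, debt=20_000)
# decision == "deny"; the explanation is simply the rule that fired:
# "debt exceeds 50% of income"
```

A deep network trained on the same task might predict more accurately, but it could not report a comparably direct reason for each individual decision.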

Fig. 1 Performance versus explainability tradeoff for ML techniques.

(A) Learning techniques and explainability. Concept adapted from (9). (B) Interpretable models: ML techniques that learn more structured, interpretable, or causal models. Early examples included Bayesian rule lists, Bayesian program learning, learning models of causal relationships, and using stochastic grammars to learn more interpretable structure. Deep learning: Several design choices might produce more explainable representations (e.g., training data selection, architectural layers, loss functions, regularization, optimization techniques, and training sequences). Model agnostic: Techniques that experiment with any given ML model, as a black box, to infer an approximate explainable model.

Credit: A. Kitterman/Science Robotics
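
The model-agnostic family described in the legend can be sketched in a few lines: probe any black-box predictor near a point of interest and fit a simple local surrogate whose weights serve as an approximate explanation (the idea behind methods such as LIME). The black-box function below is a stand-in, and the finite-difference fit is a deliberate simplification of how such methods sample perturbations.

```python
# Model-agnostic explanation sketch (hypothetical): treat the predictor as a
# black box and fit a local linear surrogate around one input point. The
# surrogate's weights approximate each feature's local influence.

def black_box(x1, x2):
    # Stand-in for an opaque model (e.g., a deep network).
    return x1 * x1 + 3 * x2

def local_linear_surrogate(f, x1, x2, eps=1e-3):
    # Central finite differences: the slope along each input acts as a
    # local feature-attribution weight for the surrogate.
    w1 = (f(x1 + eps, x2) - f(x1 - eps, x2)) / (2 * eps)
    w2 = (f(x1, x2 + eps) - f(x1, x2 - eps)) / (2 * eps)
    return {"x1": w1, "x2": w2}

weights = local_linear_surrogate(black_box, x1=2.0, x2=1.0)
# Near (2, 1), x1 carries local weight ~4 and x2 carries ~3, so the
# surrogate reports x1 as the more influential feature at this point.
```

Real model-agnostic methods sample many perturbations and weight them by proximity to the point of interest; the key property illustrated here is that nothing about the black box's internals is needed.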


The purpose of an explainable AI (XAI) system is to make its behavior more intelligible to humans by providing explanations. There are some general principles to help create effective, more human-understandable AI systems: The XAI system should be able to explain its capabilities and understandings; explain what it has done, what it is doing now, and what will happen next; and disclose the salient information that it is acting on (4).

However, every explanation is set within a context that depends on the task, abilities, and expectations of the user of the AI system. The definitions of interpretability and explainability are thus domain dependent and cannot be given independently of a domain. Explanations can be full or partial. Models that are fully interpretable give full and completely transparent explanations. Models that are partially interpretable reveal important pieces of their reasoning process. Interpretable models obey “interpretability constraints” that are defined according to the domain (e.g., monotonicity with respect to certain variables, or correlated variables obeying particular relationships), whereas black box or unconstrained models do not necessarily obey these constraints. Partial explanations may include variable importance measures, local models that approximate global models at specific points, and saliency maps.
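
One of the partial explanations listed above, a variable importance measure, can be sketched as permutation importance: shuffle one feature at a time and record how much the model's error grows. The model and data below are hypothetical stand-ins chosen to make the effect obvious.

```python
import random

# Permutation-importance sketch (hypothetical model and data): shuffling a
# feature the model relies on should increase its error; shuffling a
# feature the model ignores should leave the error unchanged.

def model(row):
    return 2.0 * row[0]  # this stand-in model uses only feature 0

def mse(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, feature, seed=0):
    rng = random.Random(seed)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    perturbed = [list(r) for r in rows]
    for r, value in zip(perturbed, column):
        r[feature] = value
    return mse(perturbed, targets) - mse(rows, targets)

rows = [[float(i), float(i % 3)] for i in range(30)]
targets = [model(r) for r in rows]  # the stand-in model fits perfectly
# Shuffling feature 0 raises the error (positive importance); shuffling
# feature 1, which the model ignores, yields an importance of zero.
```

The resulting scores are a partial explanation: they say which inputs matter overall, without revealing how the model combines them.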


XAI assumes that an explanation is provided to an end user who depends on the decisions, recommendations, or actions produced by an AI system, yet there could be many different kinds of users, often at different time points in the development and use of the system (5). For example, one type of user might be an intelligence analyst, a judge, or an operator. Other users who demand an explanation might be developers or test operators, who need to understand where there are areas for improvement. Yet another user group might be policy makers, who are trying to assess the fairness of the system. Each user group may have a preferred explanation type that communicates information in the most effective way. An effective explanation will take into account the system's target user group, whose members may vary in background knowledge and in their needs for what should be explained.


A number of ways of evaluating and measuring the effectiveness of an explanation have been proposed; however, there is currently no common means of measuring whether an XAI system is more intelligible to a user than a non-XAI system. Some measures are subjective, taken from the user's point of view, such as user satisfaction, which can be assessed through a subjective rating of the clarity and utility of an explanation. A more objective measure of an explanation's effectiveness might be task performance, i.e., whether the explanation improves the user's decision-making. Reliable and consistent measurement of the effects of explanations is still an open research question. Evaluation and measurement for XAI systems include evaluation frameworks, common ground [different thinking and mutual understanding (6)], common sense, and argumentation [why (7)].


There remain many active issues and challenges at the intersection of ML and explanation.

1) Starting from computers versus starting from people (8). Should XAI systems tailor explanations to particular users? Should they consider the knowledge that users lack? How can we exploit explanations to aid interactive and human-in-the-loop learning, including enabling users to interact with explanations to provide feedback and steer learning?

2) Accuracy versus interpretability. A major thread of XAI research on explanation explores techniques and limitations of interpretability. Interpretability needs to consider tradeoffs involving accuracy and fidelity and to strike a balance between accuracy, interpretability, and tractability.

3) Using abstractions to simplify explanations. High-level patterns are the basis for describing big plans in big steps. Automating the discovery of abstractions has long been a challenge, and understanding the discovery and sharing of abstractions in learning and explanation are at the frontier of XAI research today.

4) Explaining competencies versus explaining decisions. A sign of mastery in highly qualified experts is that they can reflect on new situations. It is necessary to help end users understand an AI system's competencies: what competencies the system has, how those competencies should be measured, and whether the system has blind spots, that is, classes of solutions it can never find.

From a human-centered research perspective, research on competencies and knowledge could take XAI beyond explaining a particular system and helping its users determine appropriate trust. In the future, XAI systems may eventually take on substantial social roles. These roles could include not only learning and explaining to individuals but also coordinating with other agents to connect knowledge, developing cross-disciplinary insights and common ground, partnering in teaching people and other agents, and drawing on previously discovered knowledge to accelerate the further discovery and application of knowledge. From such a social perspective of knowledge understanding and generation, the future of XAI is just beginning.


Funding: J.C. was supported by an Institute for Information and Communications Technology Planning and Evaluation (IITP) grant (no. 2017-0-01779; A machine learning and statistical inference framework for explainable artificial intelligence). Material within this technical publication is based on the work supported by the Defense Advanced Research Projects Agency (DARPA) under contract FA8650-17-C-7710 (to M.S.). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the official policy or position of the Department of Defense or the U.S. government.