TraCE: Trajectory Counterfactual Explanation Scores

Jeffrey N. Clark^† (Corresponding Author: [email protected]), University of Bristol, UK
Edward A. Small (Equal Contribution), University of Bristol, UK, and Royal Melbourne Institute of Technology, Australia
Nawid Keshtmand, University of Bristol, UK
Michelle W.L. Wan, University of Bristol, UK
Elena Fillola Mayoral, University of Bristol, UK
Enrico Werner, University of Bristol, UK
Christopher P. Bourdeaux, University Hospitals Bristol NHS Foundation Trust, UK
Raul Santos-Rodriguez, University of Bristol, UK

Abstract

Counterfactual explanations, and their associated algorithmic recourse, are typically leveraged to understand and explain predictions of individual instances coming from a black-box classifier. In this paper, we propose to extend the use of counterfactuals to evaluate progress in sequential decision making tasks. To this end, we introduce a model-agnostic modular framework, TraCE (Trajectory Counterfactual Explanation) scores, to distill and condense progress in highly complex scenarios into a single value. We demonstrate TraCE’s utility by showcasing its main properties in two case studies spanning healthcare and climate change.

1 Introduction

Counterfactual explanations can aid interpretation of predictions and address a lack of model transparency [1]. For example, counterfactuals have been applied to the prediction of patient survival within an intensive care unit [2]. For an unwell patient predicted not to survive, a counterfactual and algorithmic recourse may demonstrate the feature changes necessary to result in positive survival classification. In this way counterfactuals aid users in understanding the model and may provide actionable input to support decisions.

Figure 1: TraCE for 2-D toy data set classification with three classes: light orange (current class), blue (desired class), and red (undesired class). The factual, $x$ , moves over the sequence, as do the respective target counterfactual points (stars). Between segments of the true trajectory (e.g. $x_{1}$ , $x_{2}$ ) TraCE measures alignment in angle, $R_{1}$ , and the “best move” given the angle, $R_{2}$ , with respect to counterfactual target points (stars in the left panel). In this example the TraCE score for moving from $x_{0}$ to $x_{1}$ is negative (-0.1855) because it aligns more with the negative counterfactual (red class), whereas the trajectory from $x_{1}$ to $x_{2}$ is away from the negative counterfactual and towards the positive counterfactual (blue class), hence the positive score (0.4056).

Many counterfactual explainers have been developed and most commonly they are applied to single-step decision making processes involving one data point per individual [3]. Relatively limited research has been conducted into more complex counterfactual techniques and applications for sequential and time series applications. Such research has mostly focused on counterfactuals in the context of multivariate time series explainability [4, 5], recourse as a sequence of actions [6], and suggested alterations to particular regions of an individual time series [7].

We hypothesise that counterfactual explanations could provide insights beyond their current role in the development of explainable systems by utilising them as benchmarks to evaluate trajectories or sequences of decisions. To this end, we introduce TraCE (Trajectory Counterfactual Explanation) scores, which consider the sequence of steps in a task and compare each step to counterfactual examples, including both desirable and undesirable targets. In the example of the intensive care unit patient, at each point in the patient’s stay, TraCE’s objective is to evaluate the true trajectory against a potential path towards survival (desirable counterfactual), and mortality (undesirable counterfactual). TraCE scores aim to provide an easily understandable sequential assessment of trajectory, enabling progress tracking in a specified task for laypeople and domain experts alike.

2 Preliminaries

Counterfactual explanations are often used to assess what actions are required to push a query point (the factual) over the decision boundary of a model in order to produce a different outcome (the counterfactual) [1]. Adversarial examples stand in stark contrast to counterfactual explanations, as they explicitly seek to misclassify the factual by deceitfully perturbing its features [8].

In essence, counterfactual explanations encapsulate the thought experiment:

$Y$ was my outcome, but if I had done $Z$

then $Y^{\prime}$ would have occurred instead.

Therefore, given a decision maker $f$ , a set of possible outcomes $\{y,y^{\prime}\}$ and a query point $x$ , a counterfactual looks like:

f(x)=y,\quad f(x+z)=y^{\prime}

where $z$ is the change applied to $x$ in order to achieve $y^{\prime}$ . In the hospital example, $x$ is the patient, $y$ is their predicted outcome (for example mortality), $y^{\prime}$ is the counterfactual representing an alternative outcome (for example successful discharge), and $z$ is the set of feature changes required to lead to this alternative outcome. We can constrain $z$ to fulfil certain criteria, such as minimising complexity (sparse $z$ ) or length (small $\lVert z\rVert$ ) [9], maximising feasibility (follow probability distributions) [10] or agency (follow multiple possibilities) [11].
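As a minimal sketch of the definition above, consider a hypothetical linear decision maker $f(x)=\mathbb{1}[\langle w\;,\;x\rangle+b>0]$ . This classifier, its weights, and the helper names below are illustrative assumptions and are not taken from the case studies. For such a model the smallest change $z$ (in Euclidean norm) that flips the outcome is a step along $w$ to the decision boundary, plus a small overshoot:

```python
import math


def predict(w, b, x):
    """Hypothetical linear decision maker f: 1 if <w, x> + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def minimal_change(w, b, x, margin=1e-3):
    """Smallest-norm z such that x + z lies just across the boundary <w, x> + b = 0.

    `margin` is an illustrative overshoot so the point lands strictly on the
    other side; a point exactly on the boundary (score 0) is not handled.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    scale = -(score / norm_sq) * (1.0 + margin)
    return [scale * wi for wi in w]


# Illustrative numbers only.
w, b = [1.0, 2.0], -4.0
x = [1.0, 1.0]                                   # factual, f(x) = y
z = minimal_change(w, b, x)                      # change on x
x_cf = [xi + zi for xi, zi in zip(x, z)]         # f(x + z) = y'
z_length = math.hypot(*z)
```

Other constraints on $z$ mentioned above (sparsity, feasibility, agency) would replace the minimum-norm criterion used in this sketch.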

Notation

We define scalar values as Greek letters e.g., $\alpha$ , and an input space $\mathcal{X}$ without loss of generality. That is to say, $\mathcal{X}$ can take the form of a set of one-dimensional features (vector), image space, a compressed/latent space, etc. We require that $\mathcal{X}$ is a real vector space with a well-defined inner product $\langle\cdot\;,\;\cdot\rangle:\mathcal{X}\times\mathcal{X}\mapsto\mathbbm{R}$ which follows the usual properties. The inner-product induced norm is defined as $\lVert v\rVert=\sqrt{\langle v\;,\;v\rangle}$ . We could also use a sensible distance function $d:\mathcal{X}\times\mathcal{X}\mapsto\mathbbm{R}_{+}$ which must follow the usual axioms of a distance function.

We take $x_{t}\in\mathcal{X}$ as a singular instance taken at time $t$ from the input space, with $x_{t}^{\prime}$ to be the target point associated with $x_{t}$ . $x^{\prime}$ can be defined using any arbitrary process, e.g. a counterfactual generated with a model or a goal set by a domain expert. We then define the true change, $v_{t}$ , and the desired change, $v_{t}^{\prime}$ :

\displaystyle v_{t}=x_{t+1}-x_{t},\quad v_{t}^{\prime}=x_{t}^{\prime}-x_{t}

(1)

Theorem 1.

Given $a,b,c\in\mathbbm{R}^{n}$ , the closest point $d$ to $a$ in the vector direction $c-b$ is:

d=b+\frac{h}{\lVert h\rVert}\cdot\lVert g\rVert\cdot\theta

(2)

where $h=c-b$ , $g=a-b$ and $\theta=\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}$ .

Proof in Appendix A.1.
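Equation 2 can be checked numerically with a short sketch. The function and variable names are illustrative, and the sketch assumes $h$ and $g$ are both non-zero so that $\theta$ is defined:

```python
import math


def dot(u, v):
    """Standard inner product on R^n."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Norm induced by the inner product."""
    return math.sqrt(dot(u, u))


def closest_point(a, b, c):
    """Point d on the line through b with direction c - b that is closest to a (Eq. 2)."""
    h = [ci - bi for ci, bi in zip(c, b)]
    g = [ai - bi for ai, bi in zip(a, b)]
    theta = dot(h, g) / (norm(h) * norm(g))  # normalised inner product
    step = norm(g) * theta                   # signed distance to travel from b
    return [bi + hi / norm(h) * step for bi, hi in zip(b, h)]


a, b, c = [1.0, 1.0], [0.0, 0.0], [2.0, 0.0]
d = closest_point(a, b, c)
# The residual a - d should be perpendicular to the direction c - b.
residual = [ai - di for ai, di in zip(a, d)]
direction = [ci - bi for ci, bi in zip(c, b)]
```

In this example $d$ is the foot of the perpendicular from $a$ onto the horizontal axis.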

3 TraCE

Trajectory Counterfactual Explanation (TraCE) scores $S:\mathcal{X}\times\mathcal{X}\mapsto[-1,1]$ condense the complex task of tracking progress towards successive counterfactual targets through time into a single number between $-1$ and $1$ . This single number requires no expertise or domain knowledge to interpret. Simply put:

• $S<0$ implies that $x_{t+1}$ is further from $x_{t}^{\prime}$ than $x_{t}$ , with $S\to-1\implies\lVert x_{t}^{\prime}-x_{t+1}\rVert\gg\lVert x_{t}^{\prime}-x_{t}\rVert$ . For the hospital patient example, when applied to a desirable counterfactual, $S<0$ implies that the patient is moving further from the desired region (discharge) and is deteriorating;

• $S>0$ implies that $x_{t+1}$ is closer to $x_{t}^{\prime}$ than $x_{t}$ , with $S\to 1\implies\lVert x_{t}^{\prime}-x_{t+1}\rVert\ll\lVert x_{t}^{\prime}-x_{t}\rVert$ , suggesting that the patient is improving and getting closer to successful discharge; and

• $S=0$ implies no movement towards or away from a target, so $\lVert x_{t}^{\prime}-x_{t+1}\rVert=\lVert x_{t}^{\prime}-x_{t}\rVert$ , suggesting that the patient is getting neither better nor worse relative to the counterfactual target(s).

In order to do this, we track two metrics: (1) the angle between the real change and the desired change $R_{1}(x_{t},x^{\prime}_{t})$ ; and (2) the distance travelled relative to the angle $R_{2}(x_{t},x^{\prime}_{t})$ .

The angle between the true trajectory and desired trajectory can simply be measured using the normalised dot product:

R_{1}(x_{t},x^{\prime}_{t})=\frac{\langle v_{t}\;,\;v^{\prime}_{t}\rangle}{\lVert v_{t}\rVert\lVert v^{\prime}_{t}\rVert}

(3)

From Theorem 1, given the angle score $R_{1}(x_{t},x^{\prime}_{t})=\theta_{t}$ , if $\theta_{t}>0$ then the closest point $\hat{x}_{t}$ to $x^{\prime}_{t}$ along the direction of travel $v_{t}$ is:

\hat{x}_{t}=x_{t}+\frac{v_{t}}{\lVert v_{t}\rVert}\lVert v^{\prime}_{t}\rVert\theta_{t}

whereas if $\theta_{t}\leq 0$ the distance from $x^{\prime}_{t}$ is increasing, and so $\hat{x}_{t}=x_{t}$ . Thus:

R_{2}(x_{t},x^{\prime}_{t})=\Big\lvert\frac{\langle\hat{v}_{t}\;,\;v^{*}_{t}\rangle}{\lVert\hat{v}_{t}\rVert\lVert v^{*}_{t}\rVert}\Big\rvert

(4)

where:

\hat{v}_{t}=x^{\prime}_{t}-\hat{x}_{t},\quad v^{*}_{t}=x^{\prime}_{t}-x_{t+1}

Thus $R_{2}=1$ when $x_{t+1}=\hat{x}_{t}$ . We then combine Equations 3 and 4 into a single score:

S(x_{t},x^{\prime}_{t})=\lambda R_{1}(x_{t},x^{\prime}_{t})+(1-\lambda)R_{2}(x_{t},x^{\prime}_{t})

(5)

where $\lambda\in[0,1]$ is a weight which can be either a scalar value or a function.
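Equations 1 and 3 to 5 can be put together in a short sketch for a single step and a single target. This is not the released implementation: the function names, the tolerance `EPS`, and the conventions used when a vector has zero length (where the normalised inner product is undefined) are assumptions made for illustration, and $\lambda$ is taken to be a scalar.

```python
import math

EPS = 1e-12  # assumption: norms below this tolerance are treated as zero


def _sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def _norm(u):
    return math.sqrt(_dot(u, u))


def trace_components(x_t, x_next, x_target):
    """Return (R1, R2) for one step from x_t to x_next with target x_target."""
    v = _sub(x_next, x_t)        # true change v_t (Eq. 1)
    v_des = _sub(x_target, x_t)  # desired change v'_t (Eq. 1)
    n_v, n_des = _norm(v), _norm(v_des)

    # R1 (Eq. 3): cosine of the angle between true and desired change.
    # Assumption: a zero-length vector gives R1 = 0.
    r1 = 0.0 if n_v < EPS or n_des < EPS else _dot(v, v_des) / (n_v * n_des)

    # Best reachable point x_hat along the direction of travel (Theorem 1).
    if r1 > 0:
        step = n_des * r1 / n_v
        x_hat = [xi + step * vi for xi, vi in zip(x_t, v)]
    else:
        x_hat = list(x_t)

    # R2 (Eq. 4): alignment of the remaining gaps from x_hat and from x_next.
    v_hat = _sub(x_target, x_hat)
    v_star = _sub(x_target, x_next)
    n_hat, n_star = _norm(v_hat), _norm(v_star)
    if n_hat < EPS and n_star < EPS:
        r2 = 1.0  # assumption: the target has been reached
    elif n_hat < EPS or n_star < EPS:
        r2 = 0.0  # assumption: cosine undefined, treated as 0
    else:
        r2 = abs(_dot(v_hat, v_star) / (n_hat * n_star))
    return r1, r2


def trace_score(x_t, x_next, x_target, lam=0.9):
    """Eq. 5: weighted combination of the angle and distance terms."""
    r1, r2 = trace_components(x_t, x_next, x_target)
    return lam * r1 + (1.0 - lam) * r2


# Illustrative 2-D steps from the origin.
towards = trace_score([0.0, 0.0], [1.0, 1.0], [2.0, 0.0])
away = trace_score([0.0, 0.0], [-1.0, 0.0], [2.0, 0.0])
reached = trace_score([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])
```

In the `away` example the step points directly away from the target, so $R_{1}=-1$ ; under Equation 4 both remaining gaps are parallel, so $R_{2}=1$ and the score is $-0.8$ for $\lambda=0.9$ .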

TraCE can consider progress towards a single class (as presented in Section 4.2), or multiple classes encompassing both desirable and undesirable counterfactuals (Section 4.1). Figure 1 encapsulates the latter scenario, where we assess progress towards two classes, one desirable and one undesirable, via an average between measured progress towards each outcome as the factual changes. Here we can see that if the distance between sequential factual instances is small, and/or if two counterfactual points from different classes are in close proximity (relative to their distance from the factual), it can be difficult to assess how any change to the factual may contribute to the final outcome. TraCE addresses this. $\lambda>\frac{1}{2}$ implies we care more about the trajectory angle than the distance travelled. When $\lambda\neq 1$ , $S=1$ implies $x_{t+1}=x^{\prime}_{t}$ , and so the goal has been achieved. Code implementing TraCE is available at https://github.com/jeffnclark/TraCE.
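The averaging over desirable and undesirable targets can be sketched as follows. The exact combination rule is not spelled out in the text, so this sketch assumes that scores measured against undesirable targets enter with a flipped sign (moving towards an undesirable target counts against the total) and that the two groups are weighted equally. The function name and example values are hypothetical.

```python
from statistics import mean


def combined_trace(desired_scores, undesired_scores=()):
    """Combine per-target TraCE scores into one value for a single time step.

    desired_scores:   scores S measured against desirable targets.
    undesired_scores: scores S measured against undesirable targets
                      (may be empty when only one class is tracked).
    """
    towards_desired = mean(desired_scores)
    if not undesired_scores:
        return towards_desired
    # Assumption: progress towards an undesirable target is penalised.
    away_from_undesired = -mean(undesired_scores)
    return (towards_desired + away_from_undesired) / 2.0


# Hypothetical per-target scores for one step, three targets per class.
total = combined_trace([0.6, 0.5, 0.7], [-0.2, -0.1, -0.3])
single_class = combined_trace([0.6, 0.5, 0.7])
```

With this convention the combined value stays within $[-1,1]$ whenever the individual scores do.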

4 Case Studies

Here we demonstrate the use of TraCE scores in two real-world case studies.

4.1 Intensive care unit outcomes

Clinical care involves a huge number of dynamic variables which must be considered when making decisions. Clinical scores, such as APACHE and NEWS, are widely used to provide a snapshot of a patient’s current status relative to established benchmarks [12]. However, these scores fail to capture dynamics and lack personalization to a patient’s scenario. TraCE is able to overcome these shortcomings in existing clinical scores by better capturing the dynamic progress of an individual patient. Here we demonstrate the application of TraCE to intensive care unit (ICU) patients, relative to counterfactuals for successful discharge and in-hospital mortality.

4.1.1 Methods

Time series intensive care unit data were extracted from the MIMIC IV 2.0 data set [13]. Seventeen features, including vital signs such as heart rate and respiratory rate, were identified for TraCE, following existing research [14]. Outcome labels were generated using known outcomes, for discharge to home or mortality. Patients discharged to locations other than home were removed, leaving a total of 327270 time points across 30860 hospital stays (26089 patients) for analysis. All time points prior to the final time point were labelled as not ready for discharge. Missing values later in a time series were completed using forward fill, and missing initial values using backward fill. Numerical features empty across a patient’s whole stay were filled with the class average, while absent categorical features were filled with the class mode. All features were normalised.
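The forward and backward fill steps can be illustrated with a minimal sketch for a single feature of a single stay. Missing values are represented here by `None`, and the function name and example values are hypothetical. A series that is empty for the whole stay is left untouched, to be filled with the class average or mode as described above.

```python
def forward_backward_fill(series):
    """Fill gaps with the last observed value, then leading gaps with the first."""
    filled = list(series)

    # Forward fill: carry the most recent observation forward in time.
    last_seen = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last_seen
        else:
            last_seen = value

    # Backward fill: only leading gaps remain, so copy the first observation back.
    next_seen = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_seen
        else:
            next_seen = filled[i]
    return filled


heart_rate = [None, 80, None, None, 76, None]  # hypothetical readings
completed = forward_backward_fill(heart_rate)
```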

Using scikit-learn, a multi-layer perceptron classifier, with two hidden layers of 10 neurons each, was trained for a maximum of 10 epochs on individual patient time steps to predict three classes: not ready for discharge, ready for discharge, mortality. All other hyperparameters were as default. Classes were balanced by undersampling and an 80:20 train:test split was utilised.

TraCE analysis was carried out as follows for 1000 hospital stays in the test set: 500 ending in successful discharge to home and 500 ending in in-hospital mortality. KDTrees for each outcome class were generated from the corpus of known outcomes within the training set. For each time step in a patient’s hospital stay, counterfactuals (n = 3) were sampled from each KDTree, resulting in ready for discharge (desired) counterfactuals and in-hospital mortality (undesired) counterfactuals. TraCE was implemented ( $\lambda=0.9$ ) against each of these counterfactuals and compared with class probabilities calculated by the classifier. Static features which did not differ between the factual and counterfactual were omitted from TraCE analysis, as were time steps where no features changed. Welch’s t-test was performed to test if average TraCE scores differed between the two outcome groups.
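The counterfactual selection step can be sketched without a tree structure. The sketch below assumes that "sampled" means taking the three nearest neighbours of the factual in Euclidean distance, which is what a KDTree query would return; a brute-force search is swapped in here purely to keep the example dependency-free. The corpus values and names are hypothetical.

```python
import heapq
import math


def nearest_targets(x, corpus, k=3):
    """Return the k corpus points closest to x (brute-force stand-in for a KDTree query)."""
    return heapq.nsmallest(k, corpus, key=lambda point: math.dist(x, point))


# Hypothetical normalised feature vectors for stays with a known outcome.
discharge_corpus = [
    [0.20, 0.10],
    [0.90, 0.80],
    [0.25, 0.20],
    [0.30, 0.15],
    [0.70, 0.90],
]
factual = [0.30, 0.20]
desired_counterfactuals = nearest_targets(factual, discharge_corpus, k=3)
```

The same query against a mortality corpus would give the undesired counterfactuals, and each returned point then serves as a target $x^{\prime}_{t}$ for that time step.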

4.1.2 Results and Discussion

The multilayer perceptron classifier achieved test set accuracy of 0.95. The average TraCE score for 500 patients known to be successfully discharged to home was 0.0821 (SD 0.1373). For 500 unsuccessfully discharged patients (in-hospital mortality), their average TraCE score was -0.0302 (SD 0.0675). The difference in average TraCE score was statistically significant ( $p<.00001$ ). Since a patient is typically not ready for discharge (NRFD) for most of the stay, an average near $0$ is expected. More intelligent weighting of variables, coupled with expertise provided by clinicians, is likely to further increase the TraCE score gap between patients with positive and negative outcomes.
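The size of the reported difference can be sanity-checked from the summary statistics alone. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom, assuming each of the 500 stays per group contributes one average score and that the reported SDs are taken across stays. It does not compute a p-value.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    se1_sq = sd1 ** 2 / n1
    se2_sq = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    dof = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, dof


# Means and SDs as reported above; group sizes assumed to be 500 each.
t_stat, dof = welch_t(0.0821, 0.1373, 500, -0.0302, 0.0675, 500)
```

Under these assumptions the statistic is approximately 16.4 on roughly 727 degrees of freedom, in line with the very small p-value reported.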

Instantaneous TraCE scores between successive time points are expected to be more useful than average scores in a potential deployment; plots of these are shown in Figure 2 for two patients with different outcomes.

TraCE scores plotted for a patient successfully discharged to home show signs of positive progress towards discharge early in the stay, as indicated by the high alignment with desirable counterfactuals (Figure 2(a), top). The MLP classifier does not capture this progress, with stable probabilities for all three classes until the final timepoint (patient discharge), and in fact higher likelihood of mortality than readiness for discharge for the majority of the ICU stay (Figure 2(a), bottom). An additional example trajectory of successful discharge can be found in Appendix 5(a). For cases such as these, real-time observation of TraCE scores could provide early insights into patient improvement.

We also present a negative outcome ICU stay which resulted in in-hospital mortality (Figure 2(b)). For the first half of the stay the classifier's most likely prediction is not ready for discharge (NRFD), closely followed by mortality. The high mortality probability is reflected in the instantaneous TraCE score, which aligns more with the undesirable (mortality) counterfactual than the desirable (ready for discharge) one, and in the negative trend in total TraCE score. Patient deterioration is indicated by the TraCE scores at timepoint 2 (increasing undesirable TraCE component), whereas the classifier does not increase the risk of mortality until timepoint 3. Plots for an additional negative outcome patient trajectory are presented in Appendix 5(b). In instances of patient decline, early intervention is critical and TraCE may provide additional insights to complement existing tools.

TraCE enables determination of the optimal vector for any single time point which would maximise the TraCE score by considering not just positive alignment with the desired outcome but also negative alignment with the undesired outcome. In a clinical setting, this insight could be applied prospectively, by suggesting optimal actions for a current patient in ICU. Likewise, clinicians are able to specify desirable and undesirable counterfactual targets which could be personalised for a given patient. For example, if it may not be reasonably expected that a patient will make a full recovery, the desirable counterfactual could be adjusted to match expectations such as discharge to a nursing facility.

With refinement, the presentation of TraCE scores in a clinical dashboard could provide clinicians with a digestible real-time summary of patient progress. Future work in developing TraCE for this application, such as weighting TraCE to certain events, analysing the gradient and stability of TraCE scores during the ICU stay and considering counterfactual path feasibility, may yield an improved understanding of a patient’s health trajectory to inform and improve quality of care.

4.2 Monitoring sustainable global development

To address the ongoing climate emergency, it is critical to reconcile global socioeconomic development with environmental sustainability. However, it is difficult to holistically evaluate a region’s overall development trajectory, due to multifaceted social, economic, and environmental considerations. In 2017, five development narratives were published in the form of Shared Socioeconomic Pathways (SSPs): (1) Sustainability, (2) Middle of the Road, (3) Regional Rivalry, (4) Inequality, (5) Fossil-fueled Development [16, 15]. These characterise changing socioeconomic factors for the next century, and the associated changes in emissions of greenhouse gases and air pollutants. In this application, TraCE quantifies the overall development sustainability of different countries, relative to each of these established SSP scenarios, with a view to monitoring alignment with the development trajectories to date.

4.2.1 Methods

Global time series data for socioeconomic and environmental features was extracted for the years 2015-2022. For the environmental features (surface temperature, precipitation, methane concentration), ERA5 reanalysis data [17] and satellite data [18] were used to represent the factual historical features, and counterfactuals were represented by CMIP6 projections for the baseline scenario of each SSP [24, 19, 20, 21, 22, 23]. Factual and counterfactual representations for the socioeconomic features (population, GDP) were similarly obtained from OECD historical datasets [25, 26] and SSP projections [16, 28, 27] respectively. To address differences in spatiotemporal resolutions, spatial coverage, and missing data points in the datasets, the chosen feature data was aggregated to monthly mean values and normalised at the country level, resulting in features for 34 countries. For each SSP, TraCE scores were calculated ( $\lambda=0.9$ ) between the actual feature data and the matching monthly SSP projection data as the target point. No undesirable counterfactual point was assigned. TraCE scores for each SSP were then compared, to quantify the alignment of a given country’s development trajectory with the different SSPs.

4.2.2 Results and Discussion

Analysis of the average TraCE scores for 15 different countries found that most countries in the study fit a common pattern. An overview of the countries’ alignments with SSP projections is shown in the heatmap (Figure 3) for the study period 2015-2022. Comparisons can be made between SSPs for a single country, and across different countries. A common pattern emerges across most countries, with SSP5 (Fossil-fueled Development) ranking highest, followed by SSP1 (Sustainability), closely tracked by SSP4 (Inequality), and finally, SSP2 (Middle of the Road) and SSP3 (Regional Rivalry). Some notable results stand out: several countries, including Germany, Greece, Italy, Mexico, and Portugal, exhibit lower TraCE scores across all SSPs. This indicates that their observed data features are less similar to their corresponding SSP projections, when compared to most other countries in the study. Additionally, some countries deviate from the majority SSP ranking pattern. For example, Greece aligns most closely with SSP4, followed by SSP2 and SSP3, with SSP1 and SSP5 ranking the lowest. Italy aligns most with SSP3, showing strong divergence from the remaining SSPs, which have similar TraCE scores. Poland closely aligns with SSP4, followed by SSP3, with TraCE scores diverging significantly from the other SSPs.

Importantly, this work does not provide evidence for attributing specific actions or responsibility to particular countries. This is because the observed data features for a given country can be influenced by the actions of other countries. Instead, TraCE scores can serve as a monitoring metric, or an output metric in simulation experiments, because they quantify the alignment of observed data features with SSP projections.

Figure 4 shows the cumulative TraCE score time series (2015-2022) for Norway, which was identified as a representative country. The TraCE score trajectories are consistently positive across all SSPs, in agreement with the expectation that they were developed as realistic scenarios in alignment with the factual historical data. Of note is the visible flattening around the year 2020, which coincides with the onset of the COVID-19 pandemic. This flattening likely occurs because the SSP projections did not anticipate the pandemic, so the observed data features deviate from these trajectories, resulting in low or negative instantaneous TraCE scores. Overall, Figure 4 indicates that SSP5 consistently ranks the highest from 2016 onwards, while other SSPs score more closely together. However, starting in mid-2021, the SSP4 and SSP1 TraCE scores begin to diverge above those of SSP2 and SSP3. With refinement, future work could correlate temporal TraCE scores with societal events and political decisions, such as legislation. Additional plots presenting the findings for Poland, as a contrasting example, are available in Appendix A.3, including a heatmap of feature importance to provide preliminary explainability of TraCE.
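A cumulative score series of this kind can be sketched as a running sum of instantaneous scores. The paper does not state the accumulation rule, so a plain running sum is an assumption here, and the monthly values are made up for illustration. Runs of near-zero or negative instantaneous scores then appear as a flattening of the cumulative curve.

```python
from itertools import accumulate


def cumulative_trace(instantaneous_scores):
    """Running sum of instantaneous TraCE scores over successive time steps."""
    return list(accumulate(instantaneous_scores))


# Hypothetical monthly scores: steady alignment, then a flat period.
monthly_scores = [0.2, 0.1, -0.05, 0.0, 0.15]
cumulative = cumulative_trace(monthly_scores)
```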

It must be emphasised that this study serves as a proof of concept, and requires input from experts across multiple domains to ensure safe and trustworthy implementation. This includes the selection of data features for monitoring, and their weighting, which has been equally distributed in this demonstration. Different weighting schemes will yield distinct results and should be developed in accordance with the priorities and specific questions of the user. Additionally, the data used and results obtained are contingent on the model source for SSP projections.

The utility of TraCE scores in this application lies in the capability to reconcile complex and occasionally conflicting variables into a single value. This allows experts and non-experts to quickly assess alignment with the established SSP scenarios, via an explainable method based on direction and distance in the data feature space. Visually assessing such alignment from the raw data itself can be challenging, particularly as the number of included features increases. The TraCE method is therefore useful for communication and understanding between stakeholder groups, and with refinement could aid monitoring of region sustainability against established development pathways.

5 Conclusion

TraCE provides a model-agnostic modular framework from which to assess progress over time towards an assigned goal. As demonstrated, the modularity of TraCE enables application-specific adaptation. Counterfactual target points can be defined as most appropriate, such as: model-generated counterfactuals, corpus of examples, expert-selected landmarks, or industry benchmarks.

The presented case studies involve at most 17 features. TraCE’s utility is expected to become even more evident with higher complexity scenarios which likely involve larger neural networks. In this paper we present TraCE scores in several forms: instantaneous (ICU study, Section 4.1); average and cumulative (SSP study, Section 4.2). More sophisticated methods to harness the temporal dimension could be considered after calculating TraCE scores, such as quantifying instability, gradients through successive time steps, or time-dependent score weighting. The implementations of TraCE for the presented applications are for illustrative purposes; deployment and interpretation of TraCE should be guided by domain experts. Further work is required for robust implementation, including feature selection and tuning of $\lambda$ .

By distilling high dimensional dynamic sequential tasks into a single value, TraCE scores enable experts and laypeople alike to quantify and better understand progress in sequential tasks.

6 Acknowledgements

We thank Thea Barnes for SQL scripts for MIMIC IV data extraction. JNC, MWLW and RSR are funded by the UKRI Turing AI Fellowship [grant number EP/V024817/1]. EAS is funded by the ARC Centre of Excellence for Automated Decision-Making and Society (project number CE200100005), funded by the Australian Government through the Australian Research Council. EFM is funded by a Google PhD Fellowship. Part of this work was done within the University of Bristol’s Machine Learning and Computer Vision (MaVi) Summer Research Program 2023.

References

[1] S Wachter, B Mittelstadt and C Russell “Counterfactual explanations without opening the black box: automated decisions and the GDPR” In Harvard Journal of Law and Technology 31.2 Harvard Law School, 2018, pp. 841–887
[2] Zhendong Wang, Isak Samsten and Panagiotis Papapetrou “Counterfactual explanations for survival prediction of cardiovascular ICU patients” In Artificial Intelligence in Medicine: 19th International Conference on Artificial Intelligence in Medicine, AIME 2021, Virtual Event, June 15–18, 2021, Proceedings, 2021, pp. 338–348 Springer DOI: 10.1007/978-3-030-77211-6_38
[3] Riccardo Guidotti “Counterfactual explanations and how to find them: literature review and benchmarking” In Data Mining and Knowledge Discovery Springer, 2022, pp. 1–55 DOI: 10.1007/s10618-022-00831-6
[4] Emre Ates, Burak Aksar, Vitus J Leung and Ayse K Coskun “Counterfactual explanations for multivariate time series” In 2021 International Conference on Applied Artificial Intelligence (ICAPAI), 2021, pp. 1–8 IEEE DOI: 10.1109/ICAPAI49758.2021.9462056
[5] Jacqueline Höllig, Cedric Kulbach and Steffen Thoma “TSEvo: Evolutionary counterfactual explanations for time series classification” In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 2022, pp. 29–36 IEEE DOI: 10.1109/ICMLA55696.2022.00013
[6] Stratis Tsirtsis, Abir De and Manuel Rodriguez “Counterfactual explanations in sequential decision making under uncertainty” In Advances in Neural Information Processing Systems 34, 2021, pp. 30127–30139
[7] Eoin Delaney, Derek Greene and Mark T Keane “Instance-based counterfactual explanations for time series classification” In International Conference on Case-Based Reasoning, 2021, pp. 32–47 Springer DOI: 10.1007/978-3-030-86957-1_3
[8] Ian J Goodfellow, Jonathon Shlens and Christian Szegedy “Explaining and harnessing adversarial examples” In arXiv preprint arXiv:1412.6572, 2014
[9] Marco Virgolin and Saverio Fracaros “On the robustness of sparse counterfactual explanations to adverse perturbations” In Artificial Intelligence 316, 2023, pp. 103840 DOI: 10.1016/j.artint.2022.103840
[10] Rafael Poyiadzi et al. “FACE: feasible and actionable counterfactual explanations” In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2020, pp. 344–350
[11] Kacper Sokol, Edward Small and Yueqing Xuan “Navigating Explanatory Multiverse Through Counterfactual Path Geometry” In International Conference on Machine Learning Workshop on Counterfactuals in Minds and Machines, 2023 arXiv:2306.02786 [cs.LG]
[12] Stephen Gerry et al. “Early warning scores for detecting deterioration in adult hospital patients: systematic review and critical appraisal of methodology” In bmj 369 British Medical Journal Publishing Group, 2020 DOI: 10.1136/bmj.m1501
[13] A. Johnson et al. “MIMIC-IV (version 2.0)” In PhysioNet, 2022 DOI: 10.13026/7vcr-e114
[14] Christopher J McWilliams et al. “Towards a decision support tool for intensive care discharge: machine learning algorithm development using electronic healthcare data from MIMIC-III and Bristol, UK” In BMJ open 9.3 British Medical Journal Publishing Group, 2019, pp. e025925 DOI: 10.1136/bmjopen-2018-025925
[15] Brian C. O’Neill et al. “The roads ahead: Narratives for shared socioeconomic pathways describing world futures in the 21st century” In Global Environmental Change 42, 2017, pp. 169–180 DOI: 10.1016/j.gloenvcha.2015.01.004
[16] Keywan Riahi et al. “The Shared Socioeconomic Pathways and their energy, land use, and greenhouse gas emissions implications: An overview” In Global Environmental Change 42 Elsevier BV, 2017, pp. 153–168 DOI: 10.1016/j.gloenvcha.2016.05.009
[17] H. Hersbach et al. “ERA5 Monthly Averaged Data on Single Levels from 1940 to Present” Accessed on 17-08-2023, Copernicus Climate Change Service (C3S) Climate Data Store (CDS), 2023 DOI: 10.24381/cds.f17050d7
[18] Copernicus Climate Change Service, Climate Data Store “Methane Data from 2002 to Present Derived from Satellite Observations” Accessed on 01-09-2023, Copernicus Climate Change Service (C3S) Climate Data Store (CDS), 2018 DOI: 10.24381/cds.b25419f8
[19] NASA Goddard Institute Space Studies (NASA/GISS) “NASA-GISS GISS-E2.1H model output prepared for CMIP6 ScenarioMIP ssp126” Earth System Grid Federation, 2020 DOI: 10.22033/ESGF/CMIP6.7411
[20] NASA Goddard Institute Space Studies (NASA/GISS) “NASA-GISS GISS-E2.1H model output prepared for CMIP6 ScenarioMIP ssp245” Earth System Grid Federation, 2020 DOI: 10.22033/ESGF/CMIP6.7416
[21] NASA Goddard Institute Space Studies (NASA/GISS) “NASA-GISS GISS-E2.1H model output prepared for CMIP6 ScenarioMIP ssp370” Earth System Grid Federation, 2020 DOI: 10.22033/ESGF/CMIP6.7427
[22] NASA Goddard Institute Space Studies (NASA/GISS) “NASA-GISS GISS-E2.1H model output prepared for CMIP6 ScenarioMIP ssp460” Earth System Grid Federation, 2020 DOI: 10.22033/ESGF/CMIP6.7453
[23] NASA Goddard Institute Space Studies (NASA/GISS) “NASA-GISS GISS-E2.1H model output prepared for CMIP6 ScenarioMIP ssp585” Earth System Grid Federation, 2020 DOI: 10.22033/ESGF/CMIP6.7461
[24] Copernicus Climate Change Service, Climate Data Store “CMIP6 Climate Projections” Accessed on 17-08-2023, Copernicus Climate Change Service (C3S) Climate Data Store (CDS), 2021 DOI: 10.24381/cds.c866074c
[25] OECD “Historical Population” Accessed on 22-08-2023, 2023 URL: https://doi.org/10.1787/data-00285-en
[26] OECD “Gross Domestic Product (GDP) (indicator)” Accessed on 22-08-2023, 2023 DOI: 10.1787/dc2f7aec-en
[27] Rob Dellink, Jean Chateau, Elisa Lanzi and Bertrand Magné “Long-term economic growth projections in the Shared Socioeconomic Pathways” In Global Environmental Change 42 Elsevier BV, 2017, pp. 200–214 DOI: 10.1016/j.gloenvcha.2015.06.004
[28] Samir KC and Wolfgang Lutz “The human core of the shared socioeconomic pathways: Population scenarios by age, sex and level of education for all countries to 2100” In Global Environmental Change 42 Elsevier BV, 2017, pp. 181–192 DOI: 10.1016/j.gloenvcha.2014.06.004

Appendix A Appendix

A.1 Proof of Theorem 1

Claim

Given $a,b,c\in\mathbbm{R}^{n}$ , the closest point $d$ to $a$ in the vector direction $c-b$ is:

d=b+\frac{h}{\lVert h\rVert}\cdot\lVert g\rVert\cdot\theta

where $h=c-b$ , $g=a-b$ and $\theta=\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}$ .

Proof.

In $n$ -dimensional space, the points $a,b,c\in\mathbbm{R}^{n}$ form a triangle. Define $\alpha=\lVert g\rVert=\lVert a-b\rVert$ and $\beta=\lVert h\rVert=\lVert c-b\rVert$ . The closest point on a line to a point off that line is reached by a perpendicular, so for $d$ to be the closest point along the vector direction $c-b$ , the points $a,b,d$ must form a right-angled triangle with the right angle at $d$ , as shown in Figure 5. Defining $\epsilon=\lVert d-c\rVert$ and $\kappa=\lVert a-d\rVert$ , Pythagoras’ theorem gives:

\begin{aligned}(\beta+\epsilon)^{2}+\kappa^{2}&=\alpha^{2}\\ \implies(\beta+\epsilon)&=\sqrt{\alpha^{2}-\kappa^{2}}\end{aligned}

From trigonometric identities, $\kappa=\alpha\sin(\phi)$ and

\phi=\arccos\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)

thus:

\kappa=\alpha\sqrt{1-\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}}

giving:

(\beta+\epsilon)=\sqrt{\alpha^{2}-\bigg(\alpha\sqrt{1-\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}}\bigg)^{2}}

Since the normalised dot product always lies in $[-1,1]$ :

0\leq\sqrt{1-\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}}\leq 1

therefore:

\alpha\geq\alpha\sqrt{1-\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}}

and so $\beta+\epsilon\in\mathbbm{R}_{+}$ , giving:

\begin{aligned}(\beta+\epsilon)&=\sqrt{\alpha^{2}-\bigg(\alpha\sqrt{1-\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}}\bigg)^{2}}\\ &=\sqrt{\alpha^{2}-\alpha^{2}\bigg(1-\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}\bigg)}\\ &=\sqrt{\alpha^{2}\bigg(\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\bigg)^{2}}\\ &=\alpha\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}\end{aligned}

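$\beta+\epsilon$ is the distance we must travel along the vector direction $c-b$ to get from $b$ to $d$ . The final step of the derivation takes the non-negative root, which is valid when the normalised dot product is non-negative, i.e. $\phi\leq\pi/2$ , as in the configuration considered here with $d$ lying at or beyond $c$ . Therefore: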

d=b+\frac{h}{\lVert h\rVert}(\beta+\epsilon)\tag{6}

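Substituting $\beta+\epsilon=\alpha\,\frac{\langle h\;,\;g\rangle}{\lVert h\rVert\lVert g\rVert}=\lVert g\rVert\cdot\theta$ into Equation 6 gives Equation 2. ∎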
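As a numerical sanity check on the claim, the following is a minimal sketch that evaluates the closed form for $d$ with NumPy. The function name `closest_point_on_direction` and the example points are hypothetical choices for illustration and do not come from the paper or its code. The sketch also guards two degenerate inputs ($b=c$ and $a=b$) that the proof does not discuss.

```python
import numpy as np


def closest_point_on_direction(a, b, c):
    """Point d on the line through b with direction c - b that is closest to a.

    Evaluates d = b + (h / ||h||) * ||g|| * theta with h = c - b, g = a - b and
    theta the normalised dot product of h and g (hypothetical helper name).
    """
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    h = c - b  # direction along which we travel
    g = a - b  # vector from b to the point a
    norm_h = np.linalg.norm(h)
    norm_g = np.linalg.norm(g)
    if norm_h == 0.0:
        # Assumption of this sketch: b == c gives no direction, so reject it.
        raise ValueError("b and c must differ so that c - b defines a direction")
    if norm_g == 0.0:
        # a coincides with b, so b itself is the closest point.
        return b.copy()
    theta = np.dot(h, g) / (norm_h * norm_g)  # cosine of the angle at b
    return b + (h / norm_h) * norm_g * theta


# Illustrative points: d lies beyond c, matching the set-up of the proof.
a = np.array([3.0, 4.0])
b = np.array([0.0, 0.0])
c = np.array([2.0, 0.0])
d = closest_point_on_direction(a, b, c)

# The residual a - d should be perpendicular to the direction c - b.
residual_dot = float(np.dot(a - d, c - b))
```

For these points $h=(2,0)$ , $g=(3,4)$ and $\theta=0.6$ , so the closed form gives $d=(3,0)$ , and the residual $a-d=(0,4)$ is orthogonal to $c-b$ , as the right-angled triangle in the proof requires.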

A.2 Intensive care unit outcomes

Figure 5(a) shows TraCE applied to another ICU patient who was successfully discharged to home. For the first two-thirds of the stay, the patient’s predicted probability of mortality was higher than for successful discharge (RFD), which is reflected by the stronger alignment with the undesirable counterfactuals (mortality) in this portion of the stay. However, the patient does recover and goes on to be successfully discharged. The TraCE score begins to increase (timepoint 7) prior to the patient’s improved health being reflected in the classifier probabilities (timepoint 8).

An ICU patient who was not successfully discharged is shown in Figure 5(b). Here the TraCE score makes it evident throughout the stay that the patient is deteriorating, given the consistently higher alignment with the undesirable (mortality) counterfactuals than with the desirable (discharged to home) counterfactuals. This demonstrates the patient’s increasing proximity to the undesirable outcome, mortality. The classifier probabilities (Figure 5(b), bottom) do not reflect this increasing risk of mortality until the patient’s final timepoint; until then, the probability plot appears very similar to that of the previously described patient (Figure 5(a)). This suggests that, with refinement, TraCE could provide utility as part of a clinician’s toolkit to support decisions and ultimately improve patient care.

A.3 Monitoring sustainable global development

SSP TraCE scores for Poland

TraCE score analysis of the 34 countries in the global study found a common pattern across most countries, with SSP5 (Fossil-fueled Development) alignment ranking highest (Figure 3). Several countries deviated from this pattern, such as Poland, whose TraCE score time series is shown in Figure 7. The TraCE score for SSP4 (Inequality) is consistently high throughout the time series, with SSP3 (Regional Rivalry) tracking it closely and overtaking it in some instances. SSP4 then begins to diverge, leading as the highest-ranked SSP from 2019 onwards. Unlike in other countries in the study, SSP5 (Fossil-fueled Development) and SSP1 (Sustainability) are consistently ranked lowest throughout the time series.

Interpretation of these results can be informed by Figure 8, which shows the feature-level heatmap of average SSP TraCE scores over the study period (2015–2022). These scores were determined by applying the TraCE method to each feature individually, to indicate how each feature on its own aligns with the corresponding SSP projection for that feature. Note that, because of the way in which TraCE is formulated, these per-feature scores are not a linear disaggregation of the overall TraCE score for the country.

In Figure 8, the high TraCE score for SSP4 is dominated by the GDP feature, with other features also scoring highly for this SSP. SSP3 is dominated by the GDP and temperature features. The heatmap also shows that the features most closely aligned with their SSP projections are GDP, temperature, and precipitation, with poor alignment for methane (CH4) projections across all SSPs.
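The remark that per-feature scores are not a linear disaggregation of the overall score can be illustrated with a small sketch. It does not implement the TraCE score itself. As an assumption made only for illustration, it uses plain cosine similarity between a step and the direction to a target as a stand-in for an angle-based alignment term. The function name `cosine_alignment` and the two-feature vectors are hypothetical.

```python
import numpy as np


def cosine_alignment(step, to_target):
    """Cosine similarity between a step and the direction to a target.

    Stand-in for an angle-based alignment term (an assumption of this sketch,
    not the paper's TraCE score). Returns 0.0 if either vector is zero.
    """
    step = np.asarray(step, dtype=float)
    to_target = np.asarray(to_target, dtype=float)
    denom = np.linalg.norm(step) * np.linalg.norm(to_target)
    return 0.0 if denom == 0.0 else float(np.dot(step, to_target) / denom)


# Hypothetical two-feature example: a large move in feature 0 towards the
# target and a small move in feature 1 away from it.
step = np.array([1.0, 0.1])
to_target = np.array([1.0, -1.0])

# Alignment computed on the full feature vector.
overall = cosine_alignment(step, to_target)

# Alignment computed on each feature in isolation (1-D slices).
per_feature = [
    cosine_alignment(step[i:i + 1], to_target[i:i + 1])
    for i in range(step.size)
]

# The mean of the per-feature values differs from the joint value.
mean_per_feature = float(np.mean(per_feature))
```

In one dimension the cosine reduces to a sign, so each per-feature value records only whether that feature moved towards or away from its target, while the joint value also weighs the relative size of the moves. Under this stand-in, averaging the per-feature values therefore does not recover the joint value, which is consistent with reading Figure 8 as a qualitative guide and not as an additive breakdown.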