-
Learn to explain yourself, when you can: Equipping Concept Bottleneck Models with the ability to abstain on their concept predictions
Authors:
Joshua Lockhart,
Daniele Magazzeni,
Manuela Veloso
Abstract:
The Concept Bottleneck Models (CBMs) of Koh et al. [2020] provide a means to ensure that a neural network based classifier bases its predictions solely on human understandable concepts. The concept labels, or rationales as we refer to them, are learned by the concept labeling component of the CBM. Another component learns to predict the target classification label from these predicted concept labels. Unfortunately, these models are heavily reliant on human provided concept labels for each datapoint. To enable CBMs to behave robustly when these labels are not readily available, we show how to equip them with the ability to abstain from predicting concepts when the concept labeling component is uncertain. In other words, our model learns to provide rationales for its predictions, but only whenever it is sure the rationale is correct.
Submitted 18 December, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.
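The abstention behaviour described in the abstract above can be illustrated with a minimal sketch. It assumes a fixed confidence threshold applied to per-concept probabilities; the function name `abstain_on_concepts` and the threshold value are hypothetical, and the paper's model learns when to abstain rather than using a hand-set cut-off.

```python
from typing import List, Optional


def abstain_on_concepts(probs: List[float], threshold: float = 0.9) -> List[Optional[int]]:
    """Return 1 or 0 for confidently predicted concepts and None where we abstain.

    Assumption: `probs[i]` is the predicted probability that concept i is true.
    """
    labels: List[Optional[int]] = []
    for p in probs:
        confidence = max(p, 1.0 - p)  # probability of the more likely label
        if confidence >= threshold:
            labels.append(1 if p >= 0.5 else 0)
        else:
            labels.append(None)  # abstain: no rationale is offered for this concept
    return labels


# Hypothetical concept probabilities for a single datapoint.
concept_probs = [0.97, 0.55, 0.02, 0.80]
print(abstain_on_concepts(concept_probs))  # [1, None, 0, None]
```

A downstream label predictor would then need to handle the `None` entries, for example by treating them as missing inputs.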
-
Towards learning to explain with concept bottleneck models: mitigating information leakage
Authors:
Joshua Lockhart,
Nicolas Marchesotti,
Daniele Magazzeni,
Manuela Veloso
Abstract:
Concept bottleneck models perform classification by first predicting which of a list of human provided concepts are true about a datapoint. Then a downstream model uses these predicted concept labels to predict the target label. The predicted concepts act as a rationale for the target prediction. Model trust issues emerge in this paradigm when soft concept labels are used: it has previously been observed that extra information about the data distribution leaks into the concept predictions. In this work we show how Monte-Carlo Dropout can be used to attain soft concept predictions that do not contain leaked information.
Submitted 7 November, 2022;
originally announced November 2022.
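As a rough illustration of Monte-Carlo Dropout as mentioned in the abstract above, the sketch below averages several stochastic forward passes of a toy logistic concept predictor with dropout left switched on at prediction time. The one-layer model, the function names, and all parameter values are assumptions for illustration; the abstract does not specify the architecture or how the samples are used.

```python
import math
import random
from typing import List


def stochastic_concept_prob(x: List[float], weights: List[float], bias: float,
                            drop_rate: float, rng: random.Random) -> float:
    """One forward pass of a toy logistic concept predictor with dropout active."""
    keep = 1.0 - drop_rate
    z = bias
    for xi, wi in zip(x, weights):
        if rng.random() < keep:      # this input unit survives dropout
            z += wi * xi / keep      # inverted-dropout rescaling
    return 1.0 / (1.0 + math.exp(-z))


def mc_dropout_concept_prob(x: List[float], weights: List[float], bias: float,
                            drop_rate: float = 0.5, n_samples: int = 200,
                            seed: int = 0) -> float:
    """Average the concept probability over `n_samples` stochastic passes."""
    rng = random.Random(seed)
    samples = [stochastic_concept_prob(x, weights, bias, drop_rate, rng)
               for _ in range(n_samples)]
    return sum(samples) / n_samples


# Hypothetical features and weights for a single concept.
features = [0.2, -1.0, 0.7]
weights = [1.5, 0.3, -0.8]
print(mc_dropout_concept_prob(features, weights, bias=0.1))
```

The averaged value is a soft concept prediction derived from repeated sampling rather than from a single deterministic pass.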
-
Feature Importance for Time Series Data: Improving KernelSHAP
Authors:
Mattia Villani,
Joshua Lockhart,
Daniele Magazzeni
Abstract:
Feature importance techniques have enjoyed widespread attention in the explainable AI literature as a means of determining how trained machine learning models make their predictions. We consider Shapley value based approaches to feature importance, applied in the context of time series data. We present closed form solutions for the SHAP values of a number of time series models, including VARMAX. We also show how KernelSHAP can be applied to time series tasks, and how the feature importances that come from this technique can be combined to perform "event detection". Finally, we explore the use of Time Consistent Shapley values for feature importance.
Submitted 5 October, 2022;
originally announced October 2022.
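To make the Shapley value idea in the abstract above concrete, the sketch below computes exact Shapley values by enumerating feature orderings, with absent features replaced by a single baseline point. For a linear model this recovers the well-known closed form w_i (x_i - baseline_i). The model, inputs, and function names are hypothetical; KernelSHAP approximates these values by weighted regression instead of enumeration, and the paper's closed forms for time series models such as VARMAX are not reproduced here.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def exact_shapley(f: Callable[[Sequence[float]], float],
                  x: Sequence[float],
                  baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values of f at x, using `baseline` values for absent features."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_value = f(current)
        for i in order:
            current[i] = x[i]                  # add feature i to the coalition
            value = f(current)
            phi[i] += value - previous_value   # marginal contribution of feature i
            previous_value = value
    return [total / len(orderings) for total in phi]


# Hypothetical linear model over three features.
model_weights = [2.0, -1.0, 0.5]


def linear_model(z: Sequence[float]) -> float:
    return sum(w * zi for w, zi in zip(model_weights, z))


x = [1.0, 3.0, 4.0]
baseline = [0.0, 1.0, 2.0]
print(exact_shapley(linear_model, x, baseline))  # approximately [2.0, -2.0, 1.0]
```

Enumeration costs n! model evaluations, which is why sampling-based approximations are used for anything beyond a handful of features.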
-
Reductive MDPs: A Perspective Beyond Temporal Horizons
Authors:
Thomas Spooner,
Rui Silva,
Joshua Lockhart,
Jason Long,
Vacslav Glukhov
Abstract:
Solving general Markov decision processes (MDPs) is a computationally hard problem. Solving finite-horizon MDPs, on the other hand, is highly tractable with well known polynomial-time algorithms. What drives this extreme disparity, and do problems exist that lie between these diametrically opposed complexities? In this paper we identify and analyse a sub-class of stochastic shortest path problems (SSPs) for general state-action spaces whose dynamics satisfy a particular drift condition. This construction generalises the traditional, temporal notion of a horizon via decreasing reachability: a property called reductivity. It is shown that optimal policies can be recovered in polynomial-time for reductive SSPs -- via an extension of backwards induction -- with an efficient analogue in reductive MDPs. The practical considerations of the proposed approach are discussed, and numerical verification provided on a canonical optimal liquidation problem.
Submitted 15 May, 2022;
originally announced May 2022.
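For context on the finite-horizon case that the abstract above contrasts with general MDPs, here is a minimal sketch of standard backwards induction on a tabular MDP. It is not the paper's extension to reductive SSPs; the two-state example, the array layout `P[a][s][s2]` and `R[s][a]`, and the function name are assumptions for illustration.

```python
from typing import List, Tuple


def backward_induction(P: List[List[List[float]]],
                       R: List[List[float]],
                       horizon: int) -> Tuple[List[float], List[List[int]]]:
    """Finite-horizon dynamic programming.

    P[a][s][s2] is the probability of moving from s to s2 under action a.
    R[s][a] is the immediate reward. Returns the optimal values at time 0 and
    a time-indexed policy, where policy[t][s] is the action at step t.
    """
    n_states, n_actions = len(R), len(R[0])
    values = [0.0] * n_states          # value with zero steps remaining
    policy: List[List[int]] = []
    for _ in range(horizon):
        new_values, step_policy = [], []
        for s in range(n_states):
            q = [R[s][a] + sum(P[a][s][s2] * values[s2] for s2 in range(n_states))
                 for a in range(n_actions)]
            best = max(range(n_actions), key=lambda a: q[a])
            step_policy.append(best)
            new_values.append(q[best])
        values = new_values
        policy.insert(0, step_policy)  # earlier time steps go to the front
    return values, policy


# Hypothetical MDP: action 0 stays put, action 1 switches state;
# staying in state 1 earns reward 1, everything else earns 0.
P = [[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
     [[0.0, 1.0], [1.0, 0.0]]]   # action 1: switch
R = [[0.0, 0.0],
     [1.0, 0.0]]
values, policy = backward_induction(P, R, horizon=3)
print(values)     # [2.0, 3.0]
print(policy[0])  # [1, 0]
```

Each of the `horizon` sweeps touches every state, action, and successor once, which is the polynomial cost the abstract refers to for finite-horizon problems.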
-
Asynchronous Collaborative Learning Across Data Silos
Authors:
Tiffany Tuor,
Joshua Lockhart,
Daniele Magazzeni
Abstract:
Machine learning algorithms can perform well when trained on large datasets. While large organisations often have considerable data assets, it can be difficult for these assets to be unified in a manner that makes training possible. Data is very often 'siloed' in different parts of the organisation, with little to no access between silos. This fragmentation of data assets is especially prevalent in heavily regulated industries like financial services or healthcare. In this paper we propose a framework to enable asynchronous collaborative training of machine learning models across data silos. This allows data science teams to collaboratively train a machine learning model, without sharing data with one another. Our proposed approach enhances conventional federated learning techniques to make them suitable for this asynchronous training in this intra-organisation, cross-silo setting. We validate our proposed approach via extensive experiments.
Submitted 23 March, 2022;
originally announced March 2022.
-
SURF: Improving classifiers in production by learning from busy and noisy end users
Authors:
Joshua Lockhart,
Samuel Assefa,
Ayham Alajdad,
Andrew Alexander,
Tucker Balch,
Manuela Veloso
Abstract:
Supervised learning classifiers inevitably make mistakes in production, perhaps mis-labeling an email, or flagging an otherwise routine transaction as fraudulent. It is vital that the end users of such a system are provided with a means of relabeling data points that they deem to have been mislabeled. The classifier can then be retrained on the relabeled data points in the hope of performance improvement. To reduce noise in this feedback data, well known algorithms from the crowdsourcing literature can be employed. However, the feedback setting provides a new challenge: how do we know what to do in the case of user non-response? If a user provides us with no feedback on a label then it can be dangerous to assume they implicitly agree: a user can be busy, lazy, or no longer a user of the system! We show that conventional crowdsourcing algorithms struggle in this user feedback setting, and present a new algorithm, SURF, that can cope with this non-response ambiguity.
Submitted 12 October, 2020;
originally announced October 2020.
-
Some people aren't worth listening to: periodically retraining classifiers with feedback from a team of end users
Authors:
Joshua Lockhart,
Samuel Assefa,
Tucker Balch,
Manuela Veloso
Abstract:
Document classification is ubiquitous in a business setting, but often the end users of a classifier are engaged in an ongoing feedback-retrain loop with the team that maintain it. We consider this feedback-retrain loop from a multi-agent point of view, considering the end users as autonomous agents that provide feedback on the labelled data provided by the classifier. This allows us to examine the effect on the classifier's performance of unreliable end users who provide incorrect feedback. We demonstrate a classifier that can learn which users tend to be unreliable, filtering their feedback out of the loop, thus improving performance in subsequent iterations.
Submitted 27 April, 2020;
originally announced April 2020.
-
How to Evaluate Trading Strategies: Single Agent Market Replay or Multiple Agent Interactive Simulation?
Authors:
Tucker Hybinette Balch,
Mahmoud Mahfouz,
Joshua Lockhart,
Maria Hybinette,
David Byrd
Abstract:
We show how a multi-agent simulator can support two important but distinct methods for assessing a trading strategy: Market Replay and Interactive Agent-Based Simulation (IABS). Our solution is important because each method offers strengths and weaknesses that expose or conceal flaws in the subject strategy. A key weakness of Market Replay is that the simulated market does not substantially adapt to or respond to the presence of the experimental strategy. IABS methods provide an artificial market for the experimental strategy using a population of background trading agents. Because the background agents attend to market conditions and current price as part of their strategy, the overall market is responsive to the presence of the experimental strategy. Even so, IABS methods have their own weaknesses, primarily that it is unclear if the market environment they provide is realistic. We describe our approach in detail, and illustrate its use in an example application: The evaluation of market impact for various size orders.
Submitted 27 June, 2019;
originally announced June 2019.
-
The Computational Complexity of Portal and Other 3D Video Games
Authors:
Erik D. Demaine,
Joshua Lockhart,
Jayson Lynch
Abstract:
We classify the computational complexity of the popular video games Portal and Portal 2. We isolate individual mechanics of the game and prove NP-hardness, PSPACE-completeness, or (pseudo)polynomiality depending on the specific game mechanics allowed. One of our proofs generalizes to prove NP-hardness of many other video games such as Half-Life 2, Halo, Doom, Elder Scrolls, Fallout, Grand Theft Auto, Left 4 Dead, Mass Effect, Deus Ex, Metal Gear Solid, and Resident Evil.
These results build on the established literature on the complexity of video games.
Submitted 30 November, 2016;
originally announced November 2016.
-
Glued trees algorithm under phase damping
Authors:
J. Lockhart,
C. Di Franco,
M. Paternostro
Abstract:
We study the behaviour of the glued trees algorithm described by Childs et al. in [STOC '03, Proc. 35th ACM Symposium on Theory of Computing (2004) 59] under decoherence. We consider a discrete time reformulation of the continuous time quantum walk protocol and apply a phase damping channel to the coin state, investigating the effect of such a mechanism on the probability of the walker appearing on the target vertex of the graph. We pay particular attention to any potential advantage coming from the use of weak decoherence for the spreading of the walk across the glued trees graph.
Submitted 4 January, 2014; v1 submitted 21 March, 2013;
originally announced March 2013.