-
A Modelling Investigation for Solar Flare X-ray Stereoscopy with Solar Orbiter/STIX and Earth Orbiting Missions
Authors:
Natasha L. S. Jeffrey,
Säm Krucker,
Morgan Stores,
Eduard P. Kontar,
Pascal Saint-Hilaire,
Andrea F. Battaglia,
Laura Hayes,
Hannah Collier,
Astrid Veronig,
Yang Su,
Srikar Paavan Tadepalli,
Fanxiaoyu Xia
Abstract:
The Spectrometer/Telescope for Imaging X-rays (STIX) on board Solar Orbiter (SolO) provides a unique opportunity to systematically perform stereoscopic X-ray observations of solar flares with current and upcoming X-ray missions at Earth. These observations will produce the first reliable measurements of hard X-ray (HXR) directivity in decades, providing a new diagnostic of the flare-accelerated electron angular distribution and helping to constrain the processes that accelerate electrons in flares. However, such observations must be compared to modelling, taking into account electron and X-ray transport effects and realistic plasma conditions, all of which can change the properties of the measured HXR directivity. Here, we show how HXR directivity, defined as the ratio of X-ray spectra at different spacecraft viewing angles, varies with different electron and flare properties (e.g., electron angular distribution, highest energy electrons, and magnetic configuration), and how modelling can be used to extract these typically unknown properties from the data. Lastly, we present a preliminary HXR directivity analysis of two flares, observed by the Fermi Gamma-ray Burst Monitor (GBM) and SolO/STIX, demonstrating the feasibility and challenges associated with such observations, and how HXR directivity can be extracted by comparison with the modelling presented here.
Submitted 29 January, 2024;
originally announced January 2024.
-
Adversarial Attacks on Combinatorial Multi-Armed Bandits
Authors:
Rishab Balasubramanian,
Jiawei Li,
Prasad Tadepalli,
Huazheng Wang,
Qingyun Wu,
Haoyu Zhao
Abstract:
We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first provide a sufficient and necessary condition for the attackability of CMAB, a notion to capture the vulnerability and robustness of CMAB. The attackability condition depends on the intrinsic properties of the corresponding CMAB instance such as the reward distributions of super arms and outcome distributions of base arms. Additionally, we devise an attack algorithm for attackable CMAB instances. Contrary to prior understanding of multi-armed bandits, our work reveals a surprising fact that the attackability of a specific CMAB instance also depends on whether the bandit instance is known or unknown to the adversary. This finding indicates that adversarial attacks on CMAB are difficult in practice and a general attack strategy for any CMAB instance does not exist since the environment is mostly unknown to the adversary. We validate our theoretical findings via extensive experiments on real-world CMAB applications including probabilistic maximum covering problem, online minimum spanning tree, cascading bandits for online ranking, and online shortest path.
Submitted 3 June, 2024; v1 submitted 8 October, 2023;
originally announced October 2023.
-
ARB: Advanced Reasoning Benchmark for Large Language Models
Authors:
Tomohiro Sawada,
Daniel Paleka,
Alexander Havrilla,
Pranav Tadepalli,
Paula Vidas,
Alexander Kranias,
John J. Nay,
Kshitij Gupta,
Aran Komatsuzaki
Abstract:
Large Language Models (LLMs) have demonstrated remarkable performance on various quantitative reasoning and knowledge benchmarks. However, many of these benchmarks are losing utility as LLMs get increasingly high scores, despite not yet reaching expert performance in these domains. We introduce ARB, a novel benchmark composed of advanced reasoning problems in multiple fields. ARB presents a more challenging test than prior benchmarks, featuring problems in mathematics, physics, biology, chemistry, and law. As a subset of ARB, we introduce a challenging set of math and physics problems which require advanced symbolic reasoning and domain knowledge. We evaluate recent models such as GPT-4 and Claude on ARB and demonstrate that current models score well below 50% on more demanding tasks. In order to improve both automatic and assisted evaluation capabilities, we introduce a rubric-based evaluation approach, allowing GPT-4 to score its own intermediate reasoning steps. Further, we conduct a human evaluation of the symbolic subset of ARB, finding promising agreement between annotators and GPT-4 rubric evaluation scores.
Submitted 27 July, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Parametrically Retargetable Decision-Makers Tend To Seek Power
Authors:
Alexander Matt Turner,
Prasad Tadepalli
Abstract:
If capable AI agents are generally incentivized to seek power in service of the objectives we specify for them, then these systems will pose enormous risks, in addition to enormous benefits. In fully observable environments, most reward functions have an optimal policy which seeks power by keeping options open and staying alive. However, the real world is neither fully observable, nor must trained agents be even approximately reward-optimal. We consider a range of models of AI decision-making, from optimal, to random, to choices informed by learning and interacting with an environment. We discover that many decision-making functions are retargetable, and that retargetability is sufficient to cause power-seeking tendencies. Our functional criterion is simple and broad. We show that a range of qualitatively dissimilar decision-making procedures incentivize agents to seek power. We demonstrate the flexibility of our results by reasoning about learned policy incentives in Montezuma's Revenge. These results suggest a safety risk: Eventually, retargetable training procedures may train real-world agents which seek power over humans.
Submitted 11 October, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Formalizing the Problem of Side Effect Regularization
Authors:
Alexander Matt Turner,
Aseem Saxena,
Prasad Tadepalli
Abstract:
AI objectives are often hard to specify properly. Some approaches tackle this problem by regularizing the AI's side effects: Agents must weigh "how much of a mess they make" against an imperfectly specified proxy objective. We propose a formal criterion for side effect regularization via the assistance game framework. In these games, the agent solves a partially observable Markov decision process (POMDP) representing its uncertainty about the objective function it should optimize. We consider the setting where the true objective is revealed to the agent at a later time step. We show that this POMDP is solved by trading off the proxy reward with the agent's ability to achieve a range of future tasks. We empirically demonstrate the reasonableness of our problem formalization via ground-truth evaluation in two gridworld environments.
Submitted 8 November, 2022; v1 submitted 23 June, 2022;
originally announced June 2022.
-
Explainable Models via Compression of Tree Ensembles
Authors:
Siwen Yan,
Sriraam Natarajan,
Saket Joshi,
Roni Khardon,
Prasad Tadepalli
Abstract:
Ensemble models (bagging and gradient-boosting) of relational decision trees have proved to be one of the most effective learning methods in the area of probabilistic logic models (PLMs). While effective, they lose one of the most important aspects of PLMs -- interpretability. In this paper we consider the problem of compressing a large set of learned trees into a single explainable model. To this effect, we propose CoTE -- Compression of Tree Ensembles -- that produces a single small decision list as a compressed representation. CoTE first converts the trees to decision lists and then performs the combination and compression with the aid of the original training set. An experimental evaluation demonstrates the effectiveness of CoTE in several benchmark relational data sets.
Submitted 16 June, 2022;
originally announced June 2022.
-
Dynamic probabilistic logic models for effective abstractions in RL
Authors:
Harsha Kokel,
Arjun Manoharan,
Sriraam Natarajan,
Balaraman Ravindran,
Prasad Tadepalli
Abstract:
State abstraction enables sample-efficient learning and better task transfer in complex reinforcement learning environments. Recently, we proposed RePReL (Kokel et al. 2021), a hierarchical framework that leverages a relational planner to provide useful state abstractions for learning. We present a brief overview of this framework and the use of a dynamic probabilistic logic model to design these state abstractions. Our experiments show that RePReL not only achieves better performance and efficient learning on the task at hand but also demonstrates better generalization to unseen tasks.
Submitted 15 October, 2021;
originally announced October 2021.
-
From Heatmaps to Structural Explanations of Image Classifiers
Authors:
Li Fuxin,
Zhongang Qi,
Saeed Khorram,
Vivswan Shitole,
Prasad Tadepalli,
Minsuk Kahng,
Alan Fern
Abstract:
This paper summarizes our endeavors in the past few years in terms of explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts with describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps users understand network classifications that are less intuitive and substantially improves user performance on a difficult fine-grained classification task of discriminating among different species of seagulls.
Realizing that an important missing piece is a reliable heatmap visualization tool, we have developed I-GOS and iGOS++ utilizing integrated gradients to avoid local optima in heatmap generation, which improved the performance across all resolutions. During the development of those visualizations, we realized that for a significant number of images, the classifier has multiple different paths to reach a confident prediction. This has led to our recent development of structured attention graphs (SAGs), an approach that utilizes beam search to locate multiple coarse heatmaps for a single image, and compactly visualizes a set of heatmaps by capturing how different combinations of image regions impact the confidence of a classifier.
Through the research process, we have learned much about insights in building deep network explanations, the existence and frequency of multiple explanations, and various tricks of the trade that make explanations work. In this paper, we attempt to share those insights and opinions with the readers with the hope that some of them will be informative for future researchers on explainable deep learning.
Submitted 13 September, 2021;
originally announced September 2021.
-
Improving Multilingual Translation by Representation and Gradient Regularization
Authors:
Yilin Yang,
Akiko Eriguchi,
Alexandre Muzio,
Prasad Tadepalli,
Stefan Lee,
Hany Hassan
Abstract:
Multilingual Neural Machine Translation (NMT) enables one model to serve all translation directions, including ones that are unseen during training, i.e. zero-shot translation. Despite being theoretically attractive, current models often produce low quality translations -- commonly failing to even produce outputs in the right target language. In this work, we observe that off-target translation is dominant even in strong multilingual systems, trained on massive multilingual corpora. To address this issue, we propose a joint approach to regularize NMT models at both representation-level and gradient-level. At the representation level, we leverage an auxiliary target language prediction task to regularize decoder outputs to retain information about the target language. At the gradient level, we leverage a small amount of direct data (in thousands of sentence pairs) to regularize model gradients. Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance by +5.59 and +10.38 BLEU on WMT and OPUS datasets respectively. Moreover, experiments show that our method also works well when the small amount of direct data is not available.
Submitted 18 January, 2022; v1 submitted 10 September, 2021;
originally announced September 2021.
-
Imaging and Spectral Observations of a Type-II Radio Burst Revealing the Section of the CME-Driven Shock that Accelerates Electrons
Authors:
Satabdwa Majumdar,
Srikar Paavan Tadepalli,
Samriddhi Sankar Maity,
Ketaki Deshpande,
Anshu Kumari,
Ritesh Patel,
Nat Gopalswamy
Abstract:
We report on a multi-wavelength analysis of the 26 January 2014 solar eruption involving a coronal mass ejection (CME) and a Type-II radio burst, performed by combining data from various space- and ground-based instruments. An increasing standoff distance with height shows the presence of a strong shock, which further manifests itself in the continuation of the metric Type-II burst into the decameter-hectometric (DH) domain. A plot of speed versus position angle (PA) shows that different points on the CME leading edge travelled with different speeds. From the starting frequency of the Type-II burst and white-light data, we find that the shock signature producing the Type-II burst might be coming from the flanks of the CME. Measuring the speeds of the CME flanks, we find the southern flank to be at a higher speed than the northern flank; further, the radio contours from Type-II imaging data showed that the burst source was coming from the southern flank of the CME. From the standoff distance at the CME nose, we find that the local Alfvén speed is close to the white-light shock speed, thus causing the Mach number to be small there. Also, the presence of a streamer near the southern flank appears to have provided additional favorable conditions for the generation of shock-associated radio emission. These results provide conclusive evidence that the Type-II emission could originate from the flanks of the CME, which in our study is from the southern flank of the CME.
Submitted 17 March, 2021;
originally announced March 2021.
-
One Explanation is Not Enough: Structured Attention Graphs for Image Classification
Authors:
Vivswan Shitole,
Li Fuxin,
Minsuk Kahng,
Prasad Tadepalli,
Alan Fern
Abstract:
Attention maps are a popular way of explaining the decisions of convolutional networks for image classification. Typically, for each image of interest, a single attention map is produced, which assigns weights to pixels based on their importance to the classification. A single attention map, however, provides an incomplete understanding since there are often many other maps that explain a classification equally well. In this paper, we introduce structured attention graphs (SAGs), which compactly represent sets of attention maps for an image by capturing how different combinations of image regions impact a classifier's confidence. We propose an approach to compute SAGs and a visualization for SAGs so that deeper insight can be gained into a classifier's decisions. We conduct a user study comparing the use of SAGs to traditional attention maps for answering counterfactual questions about image classifications. Our results show that the users are more correct when answering comparative counterfactual questions based on SAGs compared to the baselines.
Submitted 7 November, 2021; v1 submitted 12 November, 2020;
originally announced November 2020.
-
DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs
Authors:
Aayam Shrestha,
Stefan Lee,
Prasad Tadepalli,
Alan Fern
Abstract:
We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience. This approach can be applied on top of any learned representation and has the potential to easily support multiple solution objectives as well as zero-shot adjustment to changing environments and goals. Our main contribution is to introduce the Deep Averagers with Costs MDP (DAC-MDP) and to investigate its solutions for offline RL. DAC-MDPs are a non-parametric model that can leverage deep representations and account for limited data by introducing costs for exploiting under-represented parts of the model. In theory, we show conditions that allow for lower-bounding the performance of DAC-MDP solutions. We also investigate the empirical behavior in a number of environments, including those with image-based observations. Overall, the experiments demonstrate that the framework can work in practice and scale to large complex offline RL problems.
Submitted 17 October, 2020;
originally announced October 2020.
-
On the Sub-Layer Functionalities of Transformer Decoder
Authors:
Yilin Yang,
Longyue Wang,
Shuming Shi,
Prasad Tadepalli,
Stefan Lee,
Zhaopeng Tu
Abstract:
There have been significant efforts to interpret the encoder of Transformer-based encoder-decoder architectures for neural machine translation (NMT); meanwhile, the decoder remains largely unexamined despite its critical role. During translation, the decoder must predict output tokens by considering both the source-language text from the encoder and the target-language prefix produced in previous steps. In this work, we study how Transformer-based decoders leverage information from the source and target languages -- developing a universal probe task to assess how information is propagated through each module of each decoder layer. We perform extensive experiments on three major translation datasets (WMT En-De, En-Fr, and En-Zh). Our analysis provides insight on when and where decoders leverage different sources. Based on these insights, we demonstrate that the residual feed-forward module in each Transformer decoder layer can be dropped with minimal loss of performance -- a significant reduction in computation and number of parameters, and consequently a significant boost to both training and inference speed.
Submitted 6 October, 2020;
originally announced October 2020.
-
Avoiding Side Effects in Complex Environments
Authors:
Alexander Matt Turner,
Neale Ratzlaff,
Prasad Tadepalli
Abstract:
Reward function specification can be difficult. Rewarding the agent for making a widget may be easy, but penalizing the multitude of possible negative side effects is hard. In toy environments, Attainable Utility Preservation (AUP) avoided side effects by penalizing shifts in the ability to achieve randomly generated goals. We scale this approach to large, randomly generated environments based on Conway's Game of Life. By preserving optimal value for a single randomly generated reward function, AUP incurs modest overhead while leading the agent to complete the specified task and avoid many side effects. Videos and code are available at https://avoiding-side-effects.github.io/.
Submitted 22 October, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Relation Extraction with Explanation
Authors:
Hamed Shahbazi,
Xiaoli Z. Fern,
Reza Ghaeini,
Prasad Tadepalli
Abstract:
Recent neural models for relation extraction with distant supervision alleviate the impact of irrelevant sentences in a bag by learning importance weights for the sentences. Efforts thus far have focused on improving extraction accuracy but little is known about their explainability. In this work we annotate a test set with ground-truth sentence-level explanations to evaluate the quality of explanations afforded by the relation extraction models. We demonstrate that replacing the entity mentions in the sentences with their fine-grained entity types not only enhances extraction accuracy but also improves explanation. We also propose to automatically generate "distractor" sentences to augment the bags and train the model to ignore the distractors. Evaluations on the widely used FB-NYT dataset show that our methods achieve new state-of-the-art accuracy while improving model explainability.
Submitted 28 May, 2020;
originally announced May 2020.
-
Optimal Policies Tend to Seek Power
Authors:
Alexander Matt Turner,
Logan Smith,
Rohin Shah,
Andrew Critch,
Prasad Tadepalli
Abstract:
Some researchers speculate that intelligent reinforcement learning (RL) agents would be incentivized to seek resources and power in pursuit of their objectives. Other researchers point out that RL agents need not have human-like power-seeking instincts. To clarify this discussion, we develop the first formal theory of the statistical tendencies of optimal policies. In the context of Markov decision processes, we prove that certain environmental symmetries are sufficient for optimal policies to tend to seek power over the environment. These symmetries exist in many environments in which the agent can be shut down or destroyed. We prove that in these environments, most reward functions make it optimal to seek power by keeping a range of options available and, when maximizing average reward, by navigating towards larger sets of potential terminal states.
Submitted 28 January, 2023; v1 submitted 3 December, 2019;
originally announced December 2019.
-
The Choice Function Framework for Online Policy Improvement
Authors:
Murugeswari Issakkimuthu,
Alan Fern,
Prasad Tadepalli
Abstract:
There are notable examples of online search improving over hand-coded or learned policies (e.g. AlphaZero) for sequential decision making. It is not clear, however, whether or not policy improvement is guaranteed for many of these approaches, even when given a perfect evaluation function and transition model. Indeed, simple counter examples show that seemingly reasonable online search procedures can hurt performance compared to the original policy. To address this issue, we introduce the choice function framework for analyzing online search procedures for policy improvement. A choice function specifies the actions to be considered at every node of a search tree, with all other actions being pruned. Our main contribution is to give sufficient conditions for stationary and non-stationary choice functions to guarantee that the value achieved by online search is no worse than the original policy. In addition, we describe a general parametric class of choice functions that satisfy those conditions and present an illustrative use case of the framework's empirical utility.
Submitted 7 October, 2019; v1 submitted 1 October, 2019;
originally announced October 2019.
-
Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation
Authors:
Hamed Shahbazi,
Xiaoli Z. Fern,
Reza Ghaeini,
Rasha Obeidat,
Prasad Tadepalli
Abstract:
We present a new local entity disambiguation system. The key to our system is a novel approach for learning entity representations. In our approach we learn an entity-aware extension of Embedding for Language Model (ELMo) which we call Entity-ELMo (E-ELMo). Given a paragraph containing one or more named entity mentions, each mention is first defined as a function of the entire paragraph (including other mentions), then they predict the referent entities. Utilizing E-ELMo for local entity disambiguation, we outperform all of the state-of-the-art local and global models on the popular benchmarks by improving about 0.5% on micro average accuracy for AIDA test-b with Yago candidate set. The evaluation setup of the training data and candidate set are the same as our baselines for fair comparison.
Submitted 22 August, 2019; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Conservative Agency via Attainable Utility Preservation
Authors:
Alexander Matt Turner,
Dylan Hadfield-Menell,
Prasad Tadepalli
Abstract:
Reward functions are easy to misspecify; although designers can make corrections after observing mistakes, an agent pursuing a misspecified reward function can irreversibly change the state of its environment. If that change precludes optimization of the correctly specified reward function, then correction is futile. For example, a robotic factory assistant could break expensive equipment due to a reward misspecification; even if the designers immediately correct the reward function, the damage is done. To mitigate this risk, we introduce an approach that balances optimization of the primary reward function with preservation of the ability to optimize auxiliary reward functions. Surprisingly, even when the auxiliary reward functions are randomly generated and therefore uninformative about the correctly specified reward function, this approach induces conservative, effective behavior.
Submitted 10 June, 2020; v1 submitted 25 February, 2019;
originally announced February 2019.
-
Saliency Learning: Teaching the Model Where to Pay Attention
Authors:
Reza Ghaeini,
Xiaoli Z. Fern,
Hamed Shahbazi,
Prasad Tadepalli
Abstract:
Deep learning has emerged as a compelling solution to many NLP tasks, with remarkable performance. However, due to their opacity, such models are hard to interpret and trust. Recent work on explaining deep models has introduced approaches that provide insights into a model's behaviour and predictions, which are helpful for assessing the reliability of those predictions. However, such methods do not improve the model's reliability. In this paper, we aim to teach the model to make the right prediction for the right reason by providing explanation training and ensuring that the model's explanation aligns with the ground-truth explanation. Our experimental results on multiple tasks and datasets demonstrate the effectiveness of the proposed method, which produces more reliable predictions while delivering better results than traditionally trained models.
Submitted 4 April, 2019; v1 submitted 22 February, 2019;
originally announced February 2019.
-
Interactive Naming for Explaining Deep Neural Networks: A Formative Study
Authors:
Mandana Hamidi-Haines,
Zhongang Qi,
Alan Fern,
Fuxin Li,
Prasad Tadepalli
Abstract:
We consider the problem of explaining the decisions of deep neural networks for image recognition in terms of human-recognizable visual concepts. In particular, given a test set of images, we aim to explain each classification in terms of a small number of image regions, or activation maps, which have been associated with semantic concepts by a human annotator. This allows for generating summary views of the typical reasons for classifications, which can help build trust in a classifier and/or identify example types for which the classifier may not be trusted. For this purpose, we developed a user interface for "interactive naming," which allows a human annotator to manually cluster significant activation maps in a test set into meaningful groups called "visual concepts". The main contribution of this paper is a systematic study of the visual concepts produced by five human annotators using the interactive naming interface. In particular, we consider the adequacy of the concepts for explaining the classification of test-set images, correspondence of the concepts to activations of individual neurons, and the inter-annotator agreement of visual concepts. We find that a large fraction of the activation maps have recognizable visual concepts, and that there is significant agreement between the different annotators about their denotations. Our work is an exploratory study of the interplay between machine learning and human recognition mediated by visualizations of the results of learning.
Submitted 20 December, 2018; v1 submitted 17 December, 2018;
originally announced December 2018.
-
Learning Scripts as Hidden Markov Models
Authors:
J. Walker Orr,
Prasad Tadepalli,
Janardhan Rao Doppa,
Xiaoli Fern,
Thomas G. Dietterich
Abstract:
Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models (HMMs). Our framework supports robust inference and learning algorithms, which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural datasets. The results show that our algorithm is superior to several informed baselines for predicting missing events in partial observation sequences.
Submitted 11 September, 2018;
originally announced September 2018.
-
Attentional Multi-Reading Sarcasm Detection
Authors:
Reza Ghaeini,
Xiaoli Z. Fern,
Prasad Tadepalli
Abstract:
Recognizing sarcasm often requires a deep understanding of multiple sources of information, including the utterance, the conversational context, and real world facts. Most of the current sarcasm detection systems consider only the utterance in isolation. There are some limited attempts toward taking into account the conversational context. In this paper, we propose an interpretable end-to-end model that combines information from both the utterance and the conversational context to detect sarcasm, and demonstrate its effectiveness through empirical evaluations. We also study the behavior of the proposed model to provide explanations for the model's decisions. Importantly, our model is capable of determining the impact of utterance and conversational context on the model's decisions. Finally, we provide an ablation study to illustrate the impact of different components of the proposed model.
Submitted 9 September, 2018;
originally announced September 2018.
-
Event Detection with Neural Networks: A Rigorous Empirical Evaluation
Authors:
J. Walker Orr,
Prasad Tadepalli,
Xiaoli Fern
Abstract:
Detecting events and classifying them into predefined types is an important step in knowledge extraction from natural language texts. While neural network models have generally led the state of the art, the differences in performance between architectures have not been rigorously studied. In this paper, we present a novel GRU-based model that combines syntactic information with temporal structure through an attention mechanism. We show that it is competitive with other neural network architectures through empirical evaluations under different random initializations and training-validation-test splits of the ACE2005 dataset.
Submitted 26 August, 2018;
originally announced August 2018.
-
Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference
Authors:
Reza Ghaeini,
Xiaoli Z. Fern,
Prasad Tadepalli
Abstract:
Deep learning models have achieved remarkable success in natural language inference (NLI) tasks. While these models are widely explored, they are hard to interpret and it is often unclear how and why they actually work. In this paper, we take a step toward explaining such deep learning based models through a case study on a popular neural model for NLI. In particular, we propose to interpret the intermediate layers of NLI models by visualizing the saliency of attention and LSTM gating signals. We present several examples for which our methods are able to reveal interesting insights and identify the critical information contributing to the model decisions.
Submitted 12 August, 2018;
originally announced August 2018.
-
Joint Neural Entity Disambiguation with Output Space Search
Authors:
Hamed Shahbazi,
Xiaoli Z. Fern,
Reza Ghaeini,
Chao Ma,
Rasha Obeidat,
Prasad Tadepalli
Abstract:
In this paper, we present a novel model for entity disambiguation that combines local contextual information and global evidence through Limited Discrepancy Search (LDS). Given an input document, we start from a complete solution constructed by a local model and search the space of possible corrections to improve the local solution from a global viewpoint. Our search utilizes a heuristic function to focus more on the least confident local decisions and a pruning function to score the global solutions based on their local fitness and the global coherence among the predicted entities. Experimental results on the CoNLL 2003 and TAC 2010 benchmarks verify the effectiveness of our model.
Submitted 19 June, 2018;
originally announced June 2018.
-
Dependent Gated Reading for Cloze-Style Question Answering
Authors:
Reza Ghaeini,
Xiaoli Z. Fern,
Hamed Shahbazi,
Prasad Tadepalli
Abstract:
We present a novel deep learning architecture to address the cloze-style question answering task. Existing approaches employ reading mechanisms that do not fully exploit the interdependency between the document and the query. In this paper, we propose a novel dependent gated reading bidirectional GRU network (DGR) to efficiently model the relationship between the document and the query during encoding and decision making. Our evaluation shows that DGR obtains highly competitive performance on well-known machine comprehension benchmarks such as the Children's Book Test (CBT-NE and CBT-CN) and Who Did What (WDW, Strict and Relaxed). Finally, we extensively analyze and validate our model by ablation and attention studies.
Submitted 1 June, 2018; v1 submitted 26 May, 2018;
originally announced May 2018.
-
Event Nugget Detection with Forward-Backward Recurrent Neural Networks
Authors:
Reza Ghaeini,
Xiaoli Z. Fern,
Liang Huang,
Prasad Tadepalli
Abstract:
Traditional event detection methods rely heavily on manually engineered rich features. Recent deep learning approaches alleviate this problem through automatic feature engineering. But such efforts, like traditional methods, have so far focused only on single-token event mentions, whereas in practice events can also be phrases. We instead use forward-backward recurrent neural networks (FBRNNs) to detect events that can be either words or phrases. To the best of our knowledge, this is one of the first efforts to handle multi-word events and also the first attempt to use RNNs for event detection. Experimental results demonstrate that FBRNN is competitive with the state-of-the-art methods on the ACE 2005 and the Rich ERE 2015 event detection tasks.
Submitted 15 February, 2018;
originally announced February 2018.
-
Coactive Learning for Locally Optimal Problem Solving
Authors:
Robby Goetschalckx,
Alan Fern,
Prasad Tadepalli
Abstract:
Coactive learning is an online problem-solving setting where the solutions provided by a solver are interactively improved by a domain expert, which in turn drives learning. In this paper, we extend the study of coactive learning to problems where obtaining a globally optimal or near-optimal solution may be intractable or where an expert can only be expected to make small, local improvements to a candidate solution. The goal of learning in this new setting is to minimize the cost as measured by the expert effort over time. We first establish theoretical bounds on the average cost of the existing coactive Perceptron algorithm. In addition, we consider new online algorithms that use cost-sensitive and Passive-Aggressive (PA) updates, showing similar or improved theoretical bounds. We provide an empirical evaluation of the learners in various domains, which shows that the Perceptron-based algorithms are quite effective and that, unlike the case for online classification, the PA algorithms do not yield significant performance gains.
Submitted 18 April, 2014;
originally announced April 2014.
-
Solving Relational MDPs with Exogenous Events and Additive Rewards
Authors:
S. Joshi,
R. Khardon,
P. Tadepalli,
A. Raghavan,
A. Fern
Abstract:
We formalize a simple but natural subclass of service domains for relational planning problems with object-centered, independent exogenous events and additive rewards capturing, for example, problems in inventory control. Focusing on this subclass, we present a new symbolic planning algorithm which is the first algorithm that has explicit performance guarantees for relational MDPs with exogenous events. In particular, under some technical conditions, our planning algorithm provides a monotonic lower bound on the optimal value function. To support this algorithm we present novel evaluation and reduction techniques for generalized first order decision diagrams, a knowledge representation for real-valued functions over relational world states. Our planning algorithm uses a set of focus states, which serves as a training set, to simplify and approximate the symbolic solution, and can thus be seen to perform learning for planning. A preliminary experimental evaluation demonstrates the validity of our approach.
Submitted 27 June, 2013; v1 submitted 26 June, 2013;
originally announced June 2013.
-
Output Space Search for Structured Prediction
Authors:
Janardhan Rao Doppa,
Alan Fern,
Prasad Tadepalli
Abstract:
We consider a framework for structured prediction based on search in the space of complete structured outputs. Given a structured input, an output is produced by running a time-bounded search procedure guided by a learned cost function, and then returning the least cost output uncovered during the search. This framework can be instantiated for a wide range of search spaces and search procedures, and easily incorporates arbitrary structured-prediction loss functions. In this paper, we make two main technical contributions. First, we define the limited-discrepancy search space over structured outputs, which is able to leverage powerful classification learning algorithms to improve the search space quality. Second, we give a generic cost function learning approach, where the key idea is to learn a cost function that attempts to mimic the behavior of conducting searches guided by the true loss function. Our experiments on six benchmark domains demonstrate that using our framework with only a small amount of search is sufficient for significantly improving on state-of-the-art structured-prediction performance.
Submitted 27 June, 2012;
originally announced June 2012.
-
A Formal Framework for Speedup Learning from Problems and Solutions
Authors:
P. Tadepalli,
B. K. Natarajan
Abstract:
Speedup learning seeks to improve the computational efficiency of problem solving with experience. In this paper, we develop a formal framework for learning efficient problem solving from random problems and their solutions. We apply this framework to two different representations of learned knowledge, namely control rules and macro-operators, and prove theorems that identify sufficient conditions for learning in each representation. Our proofs are constructive in that they are accompanied by learning algorithms. Our framework captures both empirical and explanation-based speedup learning in a unified fashion. We illustrate our framework with implementations in two domains: symbolic integration and the Eight Puzzle. This work integrates many strands of experimental and theoretical work in machine learning, including empirical learning of control rules, macro-operator learning, Explanation-Based Learning (EBL), and Probably Approximately Correct (PAC) Learning.
Submitted 30 April, 1996;
originally announced May 1996.