Search | arXiv e-print repository

A Modelling Investigation for Solar Flare X-ray Stereoscopy with Solar Orbiter/STIX and Earth Orbiting Missions

Authors: Natasha L. S. Jeffrey, Säm Krucker, Morgan Stores, Eduard P. Kontar, Pascal Saint-Hilaire, Andrea F. Battaglia, Laura Hayes, Hannah Collier, Astrid Veronig, Yang Su, Srikar Paavan Tadepalli, Fanxiaoyu Xia

Abstract: The Spectrometer/Telescope for Imaging X-rays (STIX) on board Solar Orbiter (SolO) provides a unique opportunity to systematically perform stereoscopic X-ray observations of solar flares with current and upcoming X-ray missions at Earth. These observations will produce the first reliable measurements of hard X-ray (HXR) directivity in decades, providing a new diagnostic of the flare-accelerated electron angular distribution and helping to constrain the processes that accelerate electrons in flares. However, such observations must be compared to modelling, taking into account electron and X-ray transport effects and realistic plasma conditions, all of which can change the properties of the measured HXR directivity. Here, we show how HXR directivity, defined as the ratio of X-ray spectra at different spacecraft viewing angles, varies with different electron and flare properties (e.g., electron angular distribution, highest energy electrons, and magnetic configuration), and how modelling can be used to extract these typically unknown properties from the data. Lastly, we present a preliminary HXR directivity analysis of two flares, observed by the Fermi Gamma-ray Burst Monitor (GBM) and SolO/STIX, demonstrating the feasibility and challenges associated with such observations, and how HXR directivity can be extracted by comparison with the modelling presented here.

Submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted for publication in ApJ (January 2024)

arXiv:2310.05308 [pdf, other]

Adversarial Attacks on Combinatorial Multi-Armed Bandits

Authors: Rishab Balasubramanian, Jiawei Li, Prasad Tadepalli, Huazheng Wang, Qingyun Wu, Haoyu Zhao

Abstract: We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first provide a sufficient and necessary condition for the attackability of CMAB, a notion to capture the vulnerability and robustness of CMAB. The attackability condition depends on the intrinsic properties of the corresponding CMAB instance such as the reward distributions of super arms and outcome distributions of base arms. Additionally, we devise an attack algorithm for attackable CMAB instances. Contrary to prior understanding of multi-armed bandits, our work reveals a surprising fact that the attackability of a specific CMAB instance also depends on whether the bandit instance is known or unknown to the adversary. This finding indicates that adversarial attacks on CMAB are difficult in practice and a general attack strategy for any CMAB instance does not exist since the environment is mostly unknown to the adversary. We validate our theoretical findings via extensive experiments on real-world CMAB applications including probabilistic maximum covering problem, online minimum spanning tree, cascading bandits for online ranking, and online shortest path.

Submitted 3 June, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

Comments: 28 pages, Accepted to ICML 2024

arXiv:2307.13692 [pdf, other]

ARB: Advanced Reasoning Benchmark for Large Language Models

Authors: Tomohiro Sawada, Daniel Paleka, Alexander Havrilla, Pranav Tadepalli, Paula Vidas, Alexander Kranias, John J. Nay, Kshitij Gupta, Aran Komatsuzaki

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance on various quantitative reasoning and knowledge benchmarks. However, many of these benchmarks are losing utility as LLMs get increasingly high scores, despite not yet reaching expert performance in these domains. We introduce ARB, a novel benchmark composed of advanced reasoning problems in multiple fields. ARB presents a more challenging test than prior benchmarks, featuring problems in mathematics, physics, biology, chemistry, and law. As a subset of ARB, we introduce a challenging set of math and physics problems which require advanced symbolic reasoning and domain knowledge. We evaluate recent models such as GPT-4 and Claude on ARB and demonstrate that current models score well below 50% on more demanding tasks. In order to improve both automatic and assisted evaluation capabilities, we introduce a rubric-based evaluation approach, allowing GPT-4 to score its own intermediate reasoning steps. Further, we conduct a human evaluation of the symbolic subset of ARB, finding promising agreement between annotators and GPT-4 rubric evaluation scores.

Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: Submitted to NeurIPS Datasets and Benchmarks Track

arXiv:2206.13477 [pdf, other]

Parametrically Retargetable Decision-Makers Tend To Seek Power

Authors: Alexander Matt Turner, Prasad Tadepalli

Abstract: If capable AI agents are generally incentivized to seek power in service of the objectives we specify for them, then these systems will pose enormous risks, in addition to enormous benefits. In fully observable environments, most reward functions have an optimal policy which seeks power by keeping options open and staying alive. However, the real world is neither fully observable, nor must trained agents be even approximately reward-optimal. We consider a range of models of AI decision-making, from optimal, to random, to choices informed by learning and interacting with an environment. We discover that many decision-making functions are retargetable, and that retargetability is sufficient to cause power-seeking tendencies. Our functional criterion is simple and broad. We show that a range of qualitatively dissimilar decision-making procedures incentivize agents to seek power. We demonstrate the flexibility of our results by reasoning about learned policy incentives in Montezuma's Revenge. These results suggest a safety risk: Eventually, retargetable training procedures may train real-world agents which seek power over humans.

Submitted 11 October, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: 10-page main paper, 36 pages total, poster at NeurIPS 2022

arXiv:2206.11812 [pdf, other]

Formalizing the Problem of Side Effect Regularization

Authors: Alexander Matt Turner, Aseem Saxena, Prasad Tadepalli

Abstract: AI objectives are often hard to specify properly. Some approaches tackle this problem by regularizing the AI's side effects: Agents must weigh off "how much of a mess they make" with an imperfectly specified proxy objective. We propose a formal criterion for side effect regularization via the assistance game framework. In these games, the agent solves a partially observable Markov decision process (POMDP) representing its uncertainty about the objective function it should optimize. We consider the setting where the true objective is revealed to the agent at a later time step. We show that this POMDP is solved by trading off the proxy reward with the agent's ability to achieve a range of future tasks. We empirically demonstrate the reasonableness of our problem formalization via ground-truth evaluation in two gridworld environments.

Submitted 8 November, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

Comments: 14 pages, accepted to ML Safety Workshop at NeurIPS 2022. Alexander Turner and Aseem Saxena contributed equally

arXiv:2206.07904 [pdf, other]

Explainable Models via Compression of Tree Ensembles

Authors: Siwen Yan, Sriraam Natarajan, Saket Joshi, Roni Khardon, Prasad Tadepalli

Abstract: Ensemble models (bagging and gradient-boosting) of relational decision trees have proved to be one of the most effective learning methods in the area of probabilistic logic models (PLMs). While effective, they lose one of the most important aspects of PLMs -- interpretability. In this paper we consider the problem of compressing a large set of learned trees into a single explainable model. To this effect, we propose CoTE -- Compression of Tree Ensembles -- that produces a single small decision list as a compressed representation. CoTE first converts the trees to decision lists and then performs the combination and compression with the aid of the original training set. An experimental evaluation demonstrates the effectiveness of CoTE in several benchmark relational data sets.

Submitted 16 June, 2022; originally announced June 2022.

Comments: 24 pages, 14 figures

arXiv:2110.08318 [pdf, other]

Dynamic probabilistic logic models for effective abstractions in RL

Authors: Harsha Kokel, Arjun Manoharan, Sriraam Natarajan, Balaraman Ravindran, Prasad Tadepalli

Abstract: State abstraction enables sample-efficient learning and better task transfer in complex reinforcement learning environments. Recently, we proposed RePReL (Kokel et al. 2021), a hierarchical framework that leverages a relational planner to provide useful state abstractions for learning. We present a brief overview of this framework and the use of a dynamic probabilistic logic model to design these state abstractions. Our experiments show that RePReL not only achieves better performance and efficient learning on the task at hand but also demonstrates better generalization to unseen tasks.

Submitted 15 October, 2021; originally announced October 2021.

Comments: Accepted at StarAI 2021 (held in conjunction with IJCLR 2021)

arXiv:2109.06365 [pdf, other]

doi 10.1002/ail2.46

From Heatmaps to Structural Explanations of Image Classifiers

Authors: Li Fuxin, Zhongang Qi, Saeed Khorram, Vivswan Shitole, Prasad Tadepalli, Minsuk Kahng, Alan Fern

Abstract: This paper summarizes our endeavors in the past few years in terms of explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts with describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps users understand network classifications that are less intuitive and substantially improves user performance on a difficult fine-grained classification task of discriminating among different species of seagulls. Realizing that an important missing piece is a reliable heatmap visualization tool, we have developed I-GOS and iGOS++ utilizing integrated gradients to avoid local optima in heatmap generation, which improved the performance across all resolutions. During the development of those visualizations, we realized that for a significant number of images, the classifier has multiple different paths to reach a confident prediction. This has led to our recent development of structured attention graphs (SAGs), an approach that utilizes beam search to locate multiple coarse heatmaps for a single image, and compactly visualizes a set of heatmaps by capturing how different combinations of image regions impact the confidence of a classifier. Through the research process, we have learned much about insights in building deep network explanations, the existence and frequency of multiple explanations, and various tricks of the trade that make explanations work. In this paper, we attempt to share those insights and opinions with the readers with the hope that some of them will be informative for future researchers on explainable deep learning.

Submitted 13 September, 2021; originally announced September 2021.

Comments: Submitted to Applied AI Letters

Journal ref: Applied AI Letters.2021;2:e46

arXiv:2109.04778 [pdf, other]

Improving Multilingual Translation by Representation and Gradient Regularization

Authors: Yilin Yang, Akiko Eriguchi, Alexandre Muzio, Prasad Tadepalli, Stefan Lee, Hany Hassan

Abstract: Multilingual Neural Machine Translation (NMT) enables one model to serve all translation directions, including ones that are unseen during training, i.e. zero-shot translation. Despite being theoretically attractive, current models often produce low quality translations -- commonly failing to even produce outputs in the right target language. In this work, we observe that off-target translation is dominant even in strong multilingual systems, trained on massive multilingual corpora. To address this issue, we propose a joint approach to regularize NMT models at both representation-level and gradient-level. At the representation level, we leverage an auxiliary target language prediction task to regularize decoder outputs to retain information about the target language. At the gradient level, we leverage a small amount of direct data (in thousands of sentence pairs) to regularize model gradients. Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance by +5.59 and +10.38 BLEU on WMT and OPUS datasets respectively. Moreover, experiments show that our method also works well when the small amount of direct data is not available.

Submitted 18 January, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

Comments: EMNLP 2021 (Oral). Code and data: https://github.com/yilinyang7/fairseq_multi_fix

arXiv:2103.09536 [pdf, other]

doi 10.1007/s11207-021-01810-8

Imaging and Spectral Observations of a Type-II Radio Burst Revealing the Section of the CME-Driven Shock that Accelerates Electrons

Authors: Satabdwa Majumdar, Srikar Paavan Tadepalli, Samriddhi Sankar Maity, Ketaki Deshpande, Anshu Kumari, Ritesh Patel, Nat Gopalswamy

Abstract: We report on a multi-wavelength analysis of the 26 January 2014 solar eruption involving a coronal mass ejection (CME) and a Type-II radio burst, performed by combining data from various space- and ground-based instruments. An increasing standoff distance with height shows the presence of a strong shock, which further manifests itself in the continuation of the metric Type-II burst into the decameter-hectometric (DH) domain. A plot of speed versus position angle (PA) shows different points on the CME leading edge travelled with different speeds. From the starting frequency of the Type-II burst and white-light data, we find that the shock signature producing the Type-II burst might be coming from the flanks of the CME. Measuring the speeds of the CME flanks, we find the southern flank to be at a higher speed than the northern flank; further the radio contours from Type-II imaging data showed that the burst source was coming from the southern flank of the CME. From the standoff distance at the CME nose, we find that the local Alfven speed is close to the white-light shock speed, thus causing the Mach number to be small there. Also, the presence of a streamer near the southern flank appears to have provided additional favorable conditions for the generation of shock-associated radio emission. These results provide conclusive evidence that the Type-II emission could originate from the flanks of the CME, which in our study is from the southern flank of the CME.

Submitted 17 March, 2021; originally announced March 2021.

Comments: 19 pages, 7 Figures, 1 table; Accepted for publication in Solar Physics

arXiv:2011.06733 [pdf, other]

One Explanation is Not Enough: Structured Attention Graphs for Image Classification

Authors: Vivswan Shitole, Li Fuxin, Minsuk Kahng, Prasad Tadepalli, Alan Fern

Abstract: Attention maps are a popular way of explaining the decisions of convolutional networks for image classification. Typically, for each image of interest, a single attention map is produced, which assigns weights to pixels based on their importance to the classification. A single attention map, however, provides an incomplete understanding since there are often many other maps that explain a classification equally well. In this paper, we introduce structured attention graphs (SAGs), which compactly represent sets of attention maps for an image by capturing how different combinations of image regions impact a classifier's confidence. We propose an approach to compute SAGs and a visualization for SAGs so that deeper insight can be gained into a classifier's decisions. We conduct a user study comparing the use of SAGs to traditional attention maps for answering counterfactual questions about image classifications. Our results show that the users are more correct when answering comparative counterfactual questions based on SAGs compared to the baselines.

Submitted 7 November, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: 26 pages, 25 figures

Journal ref: NeurIPS 2021

arXiv:2010.08891 [pdf, other]

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs

Authors: Aayam Shrestha, Stefan Lee, Prasad Tadepalli, Alan Fern

Abstract: We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience. This approach can be applied on top of any learned representation and has the potential to easily support multiple solution objectives as well as zero-shot adjustment to changing environments and goals. Our main contribution is to introduce the Deep Averagers with Costs MDP (DAC-MDP) and to investigate its solutions for offline RL. DAC-MDPs are a non-parametric model that can leverage deep representations and account for limited data by introducing costs for exploiting under-represented parts of the model. In theory, we show conditions that allow for lower-bounding the performance of DAC-MDP solutions. We also investigate the empirical behavior in a number of environments, including those with image-based observations. Overall, the experiments demonstrate that the framework can work in practice and scale to large complex offline RL problems.

Submitted 17 October, 2020; originally announced October 2020.

Comments: Preprint. Under review at ICLR 2021

arXiv:2010.02648 [pdf, other]

On the Sub-Layer Functionalities of Transformer Decoder

Authors: Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu

Abstract: There have been significant efforts to interpret the encoder of Transformer-based encoder-decoder architectures for neural machine translation (NMT); meanwhile, the decoder remains largely unexamined despite its critical role. During translation, the decoder must predict output tokens by considering both the source-language text from the encoder and the target-language prefix produced in previous steps. In this work, we study how Transformer-based decoders leverage information from the source and target languages -- developing a universal probe task to assess how information is propagated through each module of each decoder layer. We perform extensive experiments on three major translation datasets (WMT En-De, En-Fr, and En-Zh). Our analysis provides insight on when and where decoders leverage different sources. Based on these insights, we demonstrate that the residual feed-forward module in each Transformer decoder layer can be dropped with minimal loss of performance -- a significant reduction in computation and number of parameters, and consequently a significant boost to both training and inference speed.

Submitted 6 October, 2020; originally announced October 2020.

Comments: Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (Long)

arXiv:2006.06547 [pdf, other]

Avoiding Side Effects in Complex Environments

Authors: Alexander Matt Turner, Neale Ratzlaff, Prasad Tadepalli

Abstract: Reward function specification can be difficult. Rewarding the agent for making a widget may be easy, but penalizing the multitude of possible negative side effects is hard. In toy environments, Attainable Utility Preservation (AUP) avoided side effects by penalizing shifts in the ability to achieve randomly generated goals. We scale this approach to large, randomly generated environments based on Conway's Game of Life. By preserving optimal value for a single randomly generated reward function, AUP incurs modest overhead while leading the agent to complete the specified task and avoid many side effects. Videos and code are available at https://avoiding-side-effects.github.io/.

Submitted 22 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

Comments: Accepted as spotlight paper at NeurIPS 2020. 10 pages main paper; 19 pages with appendices

arXiv:2005.14271 [pdf, other]

Relation Extraction with Explanation

Authors: Hamed Shahbazi, Xiaoli Z. Fern, Reza Ghaeini, Prasad Tadepalli

Abstract: Recent neural models for relation extraction with distant supervision alleviate the impact of irrelevant sentences in a bag by learning importance weights for the sentences. Efforts thus far have focused on improving extraction accuracy but little is known about their explainability. In this work we annotate a test set with ground-truth sentence-level explanations to evaluate the quality of explanations afforded by the relation extraction models. We demonstrate that replacing the entity mentions in the sentences with their fine-grained entity types not only enhances extraction accuracy but also improves explanation. We also propose to automatically generate "distractor" sentences to augment the bags and train the model to ignore the distractors. Evaluations on the widely used FB-NYT dataset show that our methods achieve new state-of-the-art accuracy while improving model explainability.

Submitted 28 May, 2020; originally announced May 2020.

Comments: Accepted by ACL 2020

Journal ref: ACL.2020

arXiv:1912.01683 [pdf, other]

Optimal Policies Tend to Seek Power

Authors: Alexander Matt Turner, Logan Smith, Rohin Shah, Andrew Critch, Prasad Tadepalli

Abstract: Some researchers speculate that intelligent reinforcement learning (RL) agents would be incentivized to seek resources and power in pursuit of their objectives. Other researchers point out that RL agents need not have human-like power-seeking instincts. To clarify this discussion, we develop the first formal theory of the statistical tendencies of optimal policies. In the context of Markov decision processes, we prove that certain environmental symmetries are sufficient for optimal policies to tend to seek power over the environment. These symmetries exist in many environments in which the agent can be shut down or destroyed. We prove that in these environments, most reward functions make it optimal to seek power by keeping a range of options available and, when maximizing average reward, by navigating towards larger sets of potential terminal states.

Submitted 28 January, 2023; v1 submitted 3 December, 2019; originally announced December 2019.

Comments: Accepted to NeurIPS 2021 as spotlight paper. 12 pages, 44 pages with appendices. Since the 2021 acceptance, we updated the paper to point out that optimal policies can be qualitatively divorced from real-world learned policies

arXiv:1910.00614 [pdf, other]

The Choice Function Framework for Online Policy Improvement

Authors: Murugeswari Issakkimuthu, Alan Fern, Prasad Tadepalli

Abstract: There are notable examples of online search improving over hand-coded or learned policies (e.g. AlphaZero) for sequential decision making. It is not clear, however, whether or not policy improvement is guaranteed for many of these approaches, even when given a perfect evaluation function and transition model. Indeed, simple counter examples show that seemingly reasonable online search procedures can hurt performance compared to the original policy. To address this issue, we introduce the choice function framework for analyzing online search procedures for policy improvement. A choice function specifies the actions to be considered at every node of a search tree, with all other actions being pruned. Our main contribution is to give sufficient conditions for stationary and non-stationary choice functions to guarantee that the value achieved by online search is no worse than the original policy. In addition, we describe a general parametric class of choice functions that satisfy those conditions and present an illustrative use case of the framework's empirical utility.

Submitted 7 October, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

arXiv:1908.05762 [pdf, ps, other]

Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation

Authors: Hamed Shahbazi, Xiaoli Z. Fern, Reza Ghaeini, Rasha Obeidat, Prasad Tadepalli

Abstract: We present a new local entity disambiguation system. The key to our system is a novel approach for learning entity representations. In our approach we learn an entity aware extension of Embedding for Language Model (ELMo) which we call Entity-ELMo (E-ELMo). Given a paragraph containing one or more named entity mentions, each mention is first defined as a function of the entire paragraph (including other mentions), then they predict the referent entities. Utilizing E-ELMo for local entity disambiguation, we outperform all of the state-of-the-art local and global models on the popular benchmarks by improving about 0.5% on micro average accuracy for AIDA test-b with Yago candidate set. The evaluation setup of the training data and candidate set are the same as our baselines for fair comparison.

Submitted 22 August, 2019; v1 submitted 13 August, 2019; originally announced August 2019.

arXiv:1902.09725 [pdf, other]

doi 10.1145/3375627.3375851

Conservative Agency via Attainable Utility Preservation

Authors: Alexander Matt Turner, Dylan Hadfield-Menell, Prasad Tadepalli

Abstract: Reward functions are easy to misspecify; although designers can make corrections after observing mistakes, an agent pursuing a misspecified reward function can irreversibly change the state of its environment. If that change precludes optimization of the correctly specified reward function, then correction is futile. For example, a robotic factory assistant could break expensive equipment due to a reward misspecification; even if the designers immediately correct the reward function, the damage is done. To mitigate this risk, we introduce an approach that balances optimization of the primary reward function with preservation of the ability to optimize auxiliary reward functions. Surprisingly, even when the auxiliary reward functions are randomly generated and therefore uninformative about the correctly specified reward function, this approach induces conservative, effective behavior.

Submitted 10 June, 2020; v1 submitted 25 February, 2019; originally announced February 2019.

Comments: Published in AI, Ethics, and Society 2020

arXiv:1902.08649 [pdf, other]

Saliency Learning: Teaching the Model Where to Pay Attention

Authors: Reza Ghaeini, Xiaoli Z. Fern, Hamed Shahbazi, Prasad Tadepalli

Abstract: Deep learning has emerged as a compelling solution to many NLP tasks with remarkable performances. However, due to their opacity, such models are hard to interpret and trust. Recent work on explaining deep models has introduced approaches to provide insights toward the model's behaviour and predictions, which are helpful for assessing the reliability of the model's predictions. However, such methods do not improve the model's reliability. In this paper, we aim to teach the model to make the right prediction for the right reason by providing explanation training and ensuring the alignment of the model's explanation with the ground truth explanation. Our experimental results on multiple tasks and datasets demonstrate the effectiveness of the proposed method, which produces more reliable predictions while delivering better results compared to traditionally trained models.

Submitted 4 April, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

Comments: Accepted as a short paper at NAACL 2019. 10 pages, 2 figures, 6 tables

Journal ref: NAACL 2019

arXiv:1812.07150 [pdf, other]

Interactive Naming for Explaining Deep Neural Networks: A Formative Study

Authors: Mandana Hamidi-Haines, Zhongang Qi, Alan Fern, Fuxin Li, Prasad Tadepalli

Abstract: We consider the problem of explaining the decisions of deep neural networks for image recognition in terms of human-recognizable visual concepts. In particular, given a test set of images, we aim to explain each classification in terms of a small number of image regions, or activation maps, which have been associated with semantic concepts by a human annotator. This allows for generating summary views of the typical reasons for classifications, which can help build trust in a classifier and/or identify example types for which the classifier may not be trusted. For this purpose, we developed a user interface for "interactive naming," which allows a human annotator to manually cluster significant activation maps in a test set into meaningful groups called "visual concepts". The main contribution of this paper is a systematic study of the visual concepts produced by five human annotators using the interactive naming interface. In particular, we consider the adequacy of the concepts for explaining the classification of test-set images, correspondence of the concepts to activations of individual neurons, and the inter-annotator agreement of visual concepts. We find that a large fraction of the activation maps have recognizable visual concepts, and that there is significant agreement between the different annotators about their denotations. Our work is an exploratory study of the interplay between machine learning and human recognition mediated by visualizations of the results of learning.

Submitted 20 December, 2018; v1 submitted 17 December, 2018; originally announced December 2018.

arXiv:1809.03680 [pdf, other]

Learning Scripts as Hidden Markov Models

Authors: J. Walker Orr, Prasad Tadepalli, Janardhan Rao Doppa, Xiaoli Fern, Thomas G. Dietterich

Abstract: Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models (HMMs). Our framework supports robust inference and learning algorithms, which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural datasets. The results show that our algorithm is superior to several informed baselines for predicting missing events in partial observation sequences.

Submitted 11 September, 2018; originally announced September 2018.

Comments: 7 pages, AAAI 2014

arXiv:1809.03051 [pdf, other]

Attentional Multi-Reading Sarcasm Detection

Authors: Reza Ghaeini, Xiaoli Z. Fern, Prasad Tadepalli

Abstract: Recognizing sarcasm often requires a deep understanding of multiple sources of information, including the utterance, the conversational context, and real world facts. Most of the current sarcasm detection systems consider only the utterance in isolation. There are some limited attempts toward taking into account the conversational context. In this paper, we propose an interpretable end-to-end model that combines information from both the utterance and the conversational context to detect sarcasm, and demonstrate its effectiveness through empirical evaluations. We also study the behavior of the proposed model to provide explanations for the model's decisions. Importantly, our model is capable of determining the impact of utterance and conversational context on the model's decisions. Finally, we provide an ablation study to illustrate the impact of different components of the proposed model.

Submitted 9 September, 2018; originally announced September 2018.

arXiv:1808.08504 [pdf, other]

Event Detection with Neural Networks: A Rigorous Empirical Evaluation

Authors: J. Walker Orr, Prasad Tadepalli, Xiaoli Fern

Abstract: Detecting events and classifying them into predefined types is an important step in knowledge extraction from natural language texts. While neural network models have generally led the state-of-the-art, the differences in performance between different architectures have not been rigorously studied. In this paper we present a novel GRU-based model that combines syntactic information along with temporal structure through an attention mechanism. We show that it is competitive with other neural network architectures through empirical evaluations under different random initializations and training-validation-test splits of the ACE2005 dataset.

Submitted 26 August, 2018; originally announced August 2018.

Comments: 5 pages, EMNLP2018

arXiv:1808.03894 [pdf, other]

Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference

Authors: Reza Ghaeini, Xiaoli Z. Fern, Prasad Tadepalli

Abstract: Deep learning models have achieved remarkable success in natural language inference (NLI) tasks. While these models are widely explored, they are hard to interpret and it is often unclear how and why they actually work. In this paper, we take a step toward explaining such deep learning based models through a case study on a popular neural model for NLI. In particular, we propose to interpret the intermediate layers of NLI models by visualizing the saliency of attention and LSTM gating signals. We present several examples for which our methods are able to reveal interesting insights and identify the critical information contributing to the model decisions.

Submitted 12 August, 2018; originally announced August 2018.

Comments: 11 pages, 11 figures, accepted as a short paper at EMNLP 2018

Journal ref: EMNLP 2018

arXiv:1806.07495 [pdf, other]

Joint Neural Entity Disambiguation with Output Space Search

Authors: Hamed Shahbazi, Xiaoli Z. Fern, Reza Ghaeini, Chao Ma, Rasha Obeidat, Prasad Tadepalli

Abstract: In this paper, we present a novel model for entity disambiguation that combines both local contextual information and global evidences through Limited Discrepancy Search (LDS). Given an input document, we start from a complete solution constructed by a local model and conduct a search in the space of possible corrections to improve the local solution from a global view point. Our search utilizes a heuristic function to focus more on the least confident local decisions and a pruning function to score the global solutions based on their local fitness and the global coherences among the predicted entities. Experimental results on CoNLL 2003 and TAC 2010 benchmarks verify the effectiveness of our model.

Submitted 19 June, 2018; originally announced June 2018.

Comments: Accepted as a long paper at COLING 2018, 11 pages

Journal ref: Proceedings of COLING 2018

arXiv:1805.10528 [pdf, other]

Dependent Gated Reading for Cloze-Style Question Answering

Authors: Reza Ghaeini, Xiaoli Z. Fern, Hamed Shahbazi, Prasad Tadepalli

Abstract: We present a novel deep learning architecture to address the cloze-style question answering task. Existing approaches employ reading mechanisms that do not fully exploit the interdependency between the document and the query. In this paper, we propose a novel dependent gated reading bidirectional GRU network (DGR) to efficiently model the relationship between the document and the query during encoding and decision making. Our evaluation shows that DGR obtains highly competitive performance on well-known machine comprehension benchmarks such as the Children's Book Test (CBT-NE and CBT-CN) and Who DiD What (WDW, Strict and Relaxed). Finally, we extensively analyze and validate our model by ablation and attention studies.

Submitted 1 June, 2018; v1 submitted 26 May, 2018; originally announced May 2018.

Comments: Accepted as a long paper at COLING 2018, 16 pages, 12 figures

Journal ref: COLING 2018

arXiv:1802.05672 [pdf, other]

Event Nugget Detection with Forward-Backward Recurrent Neural Networks

Authors: Reza Ghaeini, Xiaoli Z. Fern, Liang Huang, Prasad Tadepalli

Abstract: Traditional event detection methods heavily rely on manually engineered rich features. Recent deep learning approaches alleviate this problem by automatic feature engineering. But such efforts, like traditional methods, have so far only focused on single-token event mentions, whereas in practice events can also be a phrase. We instead use forward-backward recurrent neural networks (FBRNNs) to detect events that can be either words or phrases. To the best of our knowledge, this is one of the first efforts to handle multi-word events and also the first attempt to use RNNs for event detection. Experimental results demonstrate that FBRNN is competitive with the state-of-the-art methods on the ACE 2005 and the Rich ERE 2015 event detection tasks.

Submitted 15 February, 2018; originally announced February 2018.

Comments: Published as a short paper at ACL 2016. The main purpose of this submission is to add this paper to arxiv

Report number: http://www.aclweb.org/anthology/P16-2060

Journal ref: ACL 2016

arXiv:1404.5511 [pdf, other]

Coactive Learning for Locally Optimal Problem Solving

Authors: Robby Goetschalckx, Alan Fern, Prasad Tadepalli

Abstract: Coactive learning is an online problem solving setting where the solutions provided by a solver are interactively improved by a domain expert, which in turn drives learning. In this paper we extend the study of coactive learning to problems where obtaining a globally optimal or near-optimal solution may be intractable or where an expert can only be expected to make small, local improvements to a candidate solution. The goal of learning in this new setting is to minimize the cost as measured by the expert effort over time. We first establish theoretical bounds on the average cost of the existing coactive Perceptron algorithm. In addition, we consider new online algorithms that use cost-sensitive and Passive-Aggressive (PA) updates, showing similar or improved theoretical bounds. We provide an empirical evaluation of the learners in various domains, which shows that the Perceptron based algorithms are quite effective and that, unlike the case for online classification, the PA algorithms do not yield significant performance gains.

Submitted 18 April, 2014; originally announced April 2014.

Comments: AAAI 2014 paper, including appendices

arXiv:1306.6302 [pdf, other]

Solving Relational MDPs with Exogenous Events and Additive Rewards

Authors: S. Joshi, R. Khardon, P. Tadepalli, A. Raghavan, A. Fern

Abstract: We formalize a simple but natural subclass of service domains for relational planning problems with object-centered, independent exogenous events and additive rewards capturing, for example, problems in inventory control. Focusing on this subclass, we present a new symbolic planning algorithm which is the first algorithm that has explicit performance guarantees for relational MDPs with exogenous events. In particular, under some technical conditions, our planning algorithm provides a monotonic lower bound on the optimal value function. To support this algorithm we present novel evaluation and reduction techniques for generalized first order decision diagrams, a knowledge representation for real-valued functions over relational world states. Our planning algorithm uses a set of focus states, which serves as a training set, to simplify and approximate the symbolic solution, and can thus be seen to perform learning for planning. A preliminary experimental evaluation demonstrates the validity of our approach.

Submitted 27 June, 2013; v1 submitted 26 June, 2013; originally announced June 2013.

Comments: This is an extended version of our ECML/PKDD 2013 paper including all proofs. (v2 corrects typos and updates ref [10] to cite this report as the full version)

arXiv:1206.6460 [pdf]

Output Space Search for Structured Prediction

Authors: Janardhan Rao Doppa, Alan Fern, Prasad Tadepalli

Abstract: We consider a framework for structured prediction based on search in the space of complete structured outputs. Given a structured input, an output is produced by running a time-bounded search procedure guided by a learned cost function, and then returning the least cost output uncovered during the search. This framework can be instantiated for a wide range of search spaces and search procedures, and easily incorporates arbitrary structured-prediction loss functions. In this paper, we make two main technical contributions. First, we define the limited-discrepancy search space over structured outputs, which is able to leverage powerful classification learning algorithms to improve the search space quality. Second, we give a generic cost function learning approach, where the key idea is to learn a cost function that attempts to mimic the behavior of conducting searches guided by the true loss function. Our experiments on six benchmark domains demonstrate that using our framework with only a small amount of search is sufficient for significantly improving on state-of-the-art structured-prediction performance.

Submitted 27 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

arXiv:cs/9605105 [pdf, ps]

A Formal Framework for Speedup Learning from Problems and Solutions

Authors: P. Tadepalli, B. K. Natarajan

Abstract: Speedup learning seeks to improve the computational efficiency of problem solving with experience. In this paper, we develop a formal framework for learning efficient problem solving from random problems and their solutions. We apply this framework to two different representations of learned knowledge, namely control rules and macro-operators, and prove theorems that identify sufficient conditions for learning in each representation. Our proofs are constructive in that they are accompanied with learning algorithms. Our framework captures both empirical and explanation-based speedup learning in a unified fashion. We illustrate our framework with implementations in two domains: symbolic integration and Eight Puzzle. This work integrates many strands of experimental and theoretical work in machine learning, including empirical learning of control rules, macro-operator learning, Explanation-Based Learning (EBL), and Probably Approximately Correct (PAC) Learning.

Submitted 30 April, 1996; originally announced May 1996.

Comments: See http://www.jair.org/ for any accompanying files

Journal ref: Journal of Artificial Intelligence Research, Vol 4, (1996), 445-475

Showing 1–32 of 32 results for author: Tadepalli, P