Showing 1–50 of 54 results for author: Fern, A

  1. arXiv:2406.17279  [pdf, other]

    cs.RO cs.AI

    Learning Decentralized Multi-Biped Control for Payload Transport

    Authors: Bikram Pandit, Ashutosh Gupta, Mohitvishnu S. Gadde, Addison Johnson, Aayam Kumar Shrestha, Helei Duan, Jeremy Dao, Alan Fern

    Abstract: Payload transport over flat terrain via multi-wheel robot carriers is well-understood, highly effective, and configurable. In this paper, our goal is to provide similar effectiveness and configurability for transport over rough terrain that is more suitable for legs rather than wheels. For this purpose, we consider multi-biped robot carriers, where wheels are replaced by multiple bipedal robots at…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Submitted to CoRL 2024, Project website: decmbc.github.io

  2. arXiv:2404.19173  [pdf, other]

    cs.RO

    Revisiting Reward Design and Evaluation for Robust Humanoid Standing and Walking

    Authors: Bart van Marum, Aayam Shrestha, Helei Duan, Pranay Dugar, Jeremy Dao, Alan Fern

    Abstract: A necessary capability for humanoid robots is the ability to stand and walk while rejecting natural disturbances. Recent progress has been made using sim-to-real reinforcement learning (RL) to train such locomotion controllers, with approaches differing mainly in their reward functions. However, prior works lack a clear method to systematically test new reward functions and compare controller perf…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 8 pages, 5 figs

  3. arXiv:2311.03388  [pdf, other]

    cs.LG cs.AI physics.ao-ph

    Attention-based Models for Snow-Water Equivalent Prediction

    Authors: Krishu K. Thapa, Bhupinderjeet Singh, Supriya Savalkar, Alan Fern, Kirti Rajagopalan, Ananth Kalyanaraman

    Abstract: Snow Water-Equivalent (SWE) -- the amount of water available if snowpack is melted -- is a key decision variable used by water management agencies to make irrigation, flood control, power generation and drought management decisions. SWE values vary spatiotemporally -- affected by weather, topography and other environmental factors. While daily SWE can be measured by Snow Telemetry (SNOTEL) station…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: 7 pages, To be published in Proceedings of The Thirty-Sixth Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24)

    ACM Class: I.2

  4. arXiv:2310.03191  [pdf, other]

    cs.RO

    Sim-to-Real Learning for Humanoid Box Loco-Manipulation

    Authors: Jeremy Dao, Helei Duan, Alan Fern

    Abstract: In this work we propose a learning-based approach to box loco-manipulation for a humanoid robot. This is a particularly challenging problem due to the need for whole-body coordination in order to lift boxes of varying weight, position, and orientation while maintaining balance. To address this challenge, we present a sim-to-real reinforcement learning approach for training general box pickup and c…

    Submitted 4 October, 2023; originally announced October 2023.

  5. arXiv:2309.14594  [pdf, other]

    cs.RO

    Learning Vision-Based Bipedal Locomotion for Challenging Terrain

    Authors: Helei Duan, Bikram Pandit, Mohitvishnu S. Gadde, Bart van Marum, Jeremy Dao, Chanho Kim, Alan Fern

    Abstract: Reinforcement learning (RL) for bipedal locomotion has recently demonstrated robust gaits over moderate terrains using only proprioceptive sensing. However, such blind controllers will fail in environments where robots must anticipate and adapt to local terrain, which requires visual perception. In this paper, we propose a fully-learned system that allows bipedal robots to react to local terrain w…

    Submitted 8 July, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: IEEE International Conference on Robotics and Automation 2024

  6. arXiv:2301.01815  [pdf, other]

    cs.LG

    Multi-Task Learning for Budbreak Prediction

    Authors: Aseem Saxena, Paola Pesantez-Cabrera, Rohan Ballapragada, Markus Keller, Alan Fern

    Abstract: Grapevine budbreak is a key phenological stage of seasonal development, which serves as a signal for the onset of active growth. This is also when grape plants are most vulnerable to damage from freezing temperatures. Hence, it is important for winegrowers to anticipate the day of budbreak occurrence to protect their vineyards from late spring frost events. This work investigates deep learning for…

    Submitted 4 January, 2023; originally announced January 2023.

    Comments: Accepted at AIFS Workshop AAAI 2023. arXiv admin note: text overlap with arXiv:2209.10585

  7. arXiv:2209.10585  [pdf, other]

    cs.LG

    Grape Cold Hardiness Prediction via Multi-Task Learning

    Authors: Aseem Saxena, Paola Pesantez-Cabrera, Rohan Ballapragada, Kin-Ho Lam, Markus Keller, Alan Fern

    Abstract: Cold temperatures during fall and spring have the potential to cause frost damage to grapevines and other fruit plants, which can significantly decrease harvest yields. To help prevent these losses, farmers deploy expensive frost mitigation measures such as sprinklers, heaters, and wind machines when they judge that damage may occur. This judgment, however, is challenging because the cold hardines…

    Submitted 4 January, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: 6 pages, 2 figures, accepted at IAAI-23

  8. arXiv:2207.07835  [pdf, other]

    cs.RO

    Dynamic Bipedal Maneuvers through Sim-to-Real Reinforcement Learning

    Authors: Fangzhou Yu, Ryan Batke, Jeremy Dao, Jonathan Hurst, Kevin Green, Alan Fern

    Abstract: For legged robots to match the athletic capabilities of humans and animals, they must not only produce robust periodic walking and running, but also seamlessly switch between nominal locomotion gaits and more specialized transient maneuvers. Despite recent advancements in controls of bipedal robots, there has been little focus on producing highly dynamic behaviors. Recent work utilizing reinforcem…

    Submitted 16 July, 2022; originally announced July 2022.

    Comments: In review for the 2022 IEEE-RAS International Conference on Humanoid Robots. 8 pages, 8 figures, 3 tables

  9. arXiv:2207.04163  [pdf, other]

    cs.RO

    Optimizing Bipedal Maneuvers of Single Rigid-Body Models for Reinforcement Learning

    Authors: Ryan Batke, Fangzhou Yu, Jeremy Dao, Jonathan Hurst, Ross L. Hatton, Alan Fern, Kevin Green

    Abstract: In this work, we propose a method to generate reduced-order model reference trajectories for general classes of highly dynamic maneuvers for bipedal robots for use in sim-to-real reinforcement learning. Our approach is to utilize a single rigid-body model (SRBM) to optimize libraries of trajectories offline to be used as expert references in the reward function of a learned policy. This method tra…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: 8 pages, 6 figures

  10. arXiv:2206.02039  [pdf, other]

    cs.AI cs.LG

    Beyond Value: CHECKLIST for Testing Inferences in Planning-Based RL

    Authors: Kin-Ho Lam, Delyar Tabatabai, Jed Irvine, Donald Bertucci, Anita Ruangrotsakun, Minsuk Kahng, Alan Fern

    Abstract: Reinforcement learning (RL) agents are commonly evaluated via their expected value over a distribution of test scenarios. Unfortunately, this evaluation approach provides limited evidence for post-deployment generalization beyond the test distribution. In this paper, we address this limitation by extending the recent CheckList testing methodology from natural language processing to planning-based…

    Submitted 7 June, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: This work will appear in the Proceedings of the 32nd International Conference on Automated Planning and Scheduling (ICAPS2022) https://icaps22.icaps-conference.org/papers

  11. arXiv:2205.10739  [pdf, other]

    cs.LG cs.AI

    Offline Policy Comparison with Confidence: Benchmarks and Baselines

    Authors: Anurag Koul, Mariano Phielipp, Alan Fern

    Abstract: Decision makers often wish to use offline historical data to compare sequential-action policies at various world states. Importantly, computational tools should produce confidence values for such offline policy comparison (OPC) to account for statistical variance and limited data coverage. Nevertheless, there is little work that directly evaluates the quality of confidence values for OPC. In this…

    Submitted 22 May, 2022; originally announced May 2022.

  12. arXiv:2205.01807  [pdf, other]

    cs.RO

    Learning Dynamic Bipedal Walking Across Stepping Stones

    Authors: Helei Duan, Ashish Malik, Mohitvishnu S. Gadde, Jeremy Dao, Alan Fern, Jonathan Hurst

    Abstract: In this work, we propose a learning approach for 3D dynamic bipedal walking when footsteps are constrained to stepping stones. While recent work has shown progress on this problem, real-world demonstrations have been limited to relatively simple open-loop, perception-free scenarios. Our main contribution is a more advanced learning approach that enables real-world demonstrations, using the Cassie…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Video will be uploaded later

  13. arXiv:2204.04340  [pdf, other]

    cs.RO cs.LG

    Sim-to-Real Learning for Bipedal Locomotion Under Unsensed Dynamic Loads

    Authors: Jeremy Dao, Kevin Green, Helei Duan, Alan Fern, Jonathan Hurst

    Abstract: Recent work on sim-to-real learning for bipedal locomotion has demonstrated new levels of robustness and agility over a variety of terrains. However, that work, and most prior bipedal locomotion work, have not considered locomotion under a variety of external loads that can significantly influence the overall system dynamics. In many applications, robots will need to maintain robust locomotion und…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: Accepted to ICRA 2022. Video attachment: https://youtu.be/IeSUM_ej8wE

  14. arXiv:2203.07589  [pdf, other]

    cs.RO

    Sim-to-Real Learning of Footstep-Constrained Bipedal Dynamic Walking

    Authors: Helei Duan, Ashish Malik, Jeremy Dao, Aseem Saxena, Kevin Green, Jonah Siekmann, Alan Fern, Jonathan Hurst

    Abstract: Recently, work on reinforcement learning (RL) for bipedal robots has successfully learned controllers for a variety of dynamic gaits with robust sim-to-real demonstrations. In order to maintain balance, the learned controllers have full freedom of where to place the feet, resulting in highly robust gaits. In the real world however, the environment will often impose constraints on the feasible foot…

    Submitted 3 May, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Accepted at ICRA 2022. Video at https://www.youtube.com/watch?v=-zim1QQgA2s

  15. arXiv:2109.13978  [pdf, other]

    cs.AI

    Identifying Reasoning Flaws in Planning-Based RL Using Tree Explanations

    Authors: Kin-Ho Lam, Zhengxian Lin, Jed Irvine, Jonathan Dodge, Zeyad T Shureih, Roli Khanna, Minsuk Kahng, Alan Fern

    Abstract: Enabling humans to identify potential flaws in an agent's decision making is an important Explainable AI application. We consider identifying such flaws in a planning-based deep reinforcement learning (RL) agent for a complex real-time strategy game. In particular, the agent makes decisions via tree search using a learned model and evaluation function over interpretable states and actions. This gi…

    Submitted 28 September, 2021; originally announced September 2021.

  16. arXiv:2109.06365  [pdf, other]

    cs.CV cs.LG

    From Heatmaps to Structural Explanations of Image Classifiers

    Authors: Li Fuxin, Zhongang Qi, Saeed Khorram, Vivswan Shitole, Prasad Tadepalli, Minsuk Kahng, Alan Fern

    Abstract: This paper summarizes our endeavors in the past few years in terms of explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts with describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps us…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Submitted to Applied AI Letters

    Journal ref: Applied AI Letters.2021;2:e46

  17. arXiv:2107.04982  [pdf, other]

    cs.LG cs.AI

    Out-of-Distribution Dynamics Detection: RL-Relevant Benchmarks and Results

    Authors: Mohamad H Danesh, Alan Fern

    Abstract: We study the problem of out-of-distribution dynamics (OODD) detection, which involves detecting when the dynamics of a temporal process change compared to the training-distribution dynamics. This is relevant to applications in control, reinforcement learning (RL), and multi-variate time-series, where changes to test time dynamics can impact the performance of learning controllers/predictors in unk…

    Submitted 24 May, 2022; v1 submitted 11 July, 2021; originally announced July 2021.

    Comments: ICML 2021 Workshop on Uncertainty and Robustness in Deep Learning

  18. arXiv:2106.06621  [pdf, other]

    cs.LG cs.AI

    Piecewise-constant Neural ODEs

    Authors: Sam Greydanus, Stefan Lee, Alan Fern

    Abstract: Neural networks are a popular tool for modeling sequential data but they generally do not treat time as a continuous variable. Neural ODEs represent an important exception: they parameterize the time derivative of a hidden state with a neural network and then integrate over arbitrary amounts of time. But these parameterizations, which have arbitrary curvature, can be hard to integrate and thus tra…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: 8 pages, 5 figures (not counting appendix)

  19. arXiv:2105.08328  [pdf, other]

    cs.RO

    Blind Bipedal Stair Traversal via Sim-to-Real Reinforcement Learning

    Authors: Jonah Siekmann, Kevin Green, John Warila, Alan Fern, Jonathan Hurst

    Abstract: Accurate and precise terrain estimation is a difficult problem for robot locomotion in real-world environments. Thus, it is useful to have systems that do not depend on accurate estimation to the point of fragility. In this paper, we explore the limits of such an approach by investigating the problem of traversing stair-like terrain without any external perception or terrain models on a bipedal ro…

    Submitted 18 May, 2021; originally announced May 2021.

    Comments: Accepted to RSS 2021. Submission video available at https://youtu.be/MPhEmC6b6XU and video of a supplemental robustness test at https://youtu.be/nuhHiKEtaZQ

  20. arXiv:2105.00137  [pdf, other]

    cs.LG

    Deep Convolution for Irregularly Sampled Temporal Point Clouds

    Authors: Erich Merrill, Stefan Lee, Li Fuxin, Thomas G. Dietterich, Alan Fern

    Abstract: We consider the problem of modeling the dynamics of continuous spatial-temporal processes represented by irregular samples through both space and time. Such processes occur in sensor networks, citizen science, multi-robot systems, and many others. We propose a new deep model that is able to directly learn and predict over this irregularly sampled data, without voxelization, by leveraging a recent…

    Submitted 30 April, 2021; originally announced May 2021.

    Comments: 12 pages, submitted to ICLR 2021

  21. Optimizing Discrete Spaces via Expensive Evaluations: A Learning to Search Framework

    Authors: Aryan Deshwal, Syrine Belakaria, Janardhan Rao Doppa, Alan Fern

    Abstract: We consider the problem of optimizing expensive black-box functions over discrete spaces (e.g., sets, sequences, graphs). The key challenge is to select a sequence of combinatorial structures to evaluate, in order to identify high-performing structures as quickly as possible. Our main contribution is to introduce and evaluate a new learning-to-search framework for this problem called L2S-DISCO. Th…

    Submitted 14 December, 2020; originally announced December 2020.

    Comments: 9 pages, 8 figures

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. 34, 04 (Apr. 2020), 3773-3780

  22. arXiv:2011.06733  [pdf, other]

    cs.CV cs.LG

    One Explanation is Not Enough: Structured Attention Graphs for Image Classification

    Authors: Vivswan Shitole, Li Fuxin, Minsuk Kahng, Prasad Tadepalli, Alan Fern

    Abstract: Attention maps are a popular way of explaining the decisions of convolutional networks for image classification. Typically, for each image of interest, a single attention map is produced, which assigns weights to pixels based on their importance to the classification. A single attention map, however, provides an incomplete understanding since there are often many other maps that explain a classifi…

    Submitted 7 November, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: 26 pages, 25 figures

    Journal ref: NeurIPS 2021

  23. arXiv:2011.04741  [pdf, other]

    cs.RO

    Learning Task Space Actions for Bipedal Locomotion

    Authors: Helei Duan, Jeremy Dao, Kevin Green, Taylor Apgar, Alan Fern, Jonathan Hurst

    Abstract: Recent work has demonstrated the success of reinforcement learning (RL) for training bipedal locomotion policies for real robots. This prior work, however, has focused on learning joint-coordination controllers based on an objective of following joint trajectories produced by already available controllers. As such, it is difficult to train these approaches to achieve higher-level goals of legged l…

    Submitted 5 May, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted at ICRA 2021. Video supplement at https://www.youtube.com/watch?v=8OCOzPqZcGM

  24. arXiv:2011.01387  [pdf, other]

    cs.RO

    Sim-to-Real Learning of All Common Bipedal Gaits via Periodic Reward Composition

    Authors: Jonah Siekmann, Yesh Godse, Alan Fern, Jonathan Hurst

    Abstract: We study the problem of realizing the full spectrum of bipedal locomotion on a real robot with sim-to-real reinforcement learning (RL). A key challenge of learning legged locomotion is describing different gaits, via reward functions, in a way that is intuitive for the designer and specific enough to reliably learn the gait across different initial random seeds or hyperparameters. A common approac…

    Submitted 11 March, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted for presentation at ICRA 2021. The first two authors contributed equally to this work

  25. arXiv:2010.11234  [pdf, other]

    cs.RO

    Learning Spring Mass Locomotion: Guiding Policies with a Reduced-Order Model

    Authors: Kevin Green, Yesh Godse, Jeremy Dao, Ross L. Hatton, Alan Fern, Jonathan Hurst

    Abstract: In this paper, we describe an approach to achieve dynamic legged locomotion on physical robots which combines existing methods for control with reinforcement learning. Specifically, our goal is a control hierarchy in which highest-level behaviors are planned through reduced-order models, which describe the fundamental physics of legged locomotion, and lower level controllers utilize a learned poli…

    Submitted 11 March, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: 7 pages, 8 figures. Accepted to IEEE Robotics and Automation Letters (RA-L) with ICRA 2021 presentation option. Video supplement: https://youtu.be/80oJeaAd8CE Code: https://github.com/osudrl/ASLIP-RL

  26. arXiv:2010.09832  [pdf, other]

    cs.LG cs.AI

    Dream and Search to Control: Latent Space Planning for Continuous Control

    Authors: Anurag Koul, Varun V. Kumar, Alan Fern, Somdeb Majumdar

    Abstract: Learning and planning with latent space dynamics has been shown to be useful for sample efficiency in model-based reinforcement learning (MBRL) for discrete and continuous control tasks. In particular, recent work, for discrete action spaces, demonstrated the effectiveness of latent-space planning via Monte-Carlo Tree Search (MCTS) for bootstrapping MBRL during learning and at test time. However,…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: Preprint

  27. arXiv:2010.08891  [pdf, other]

    cs.LG cs.AI stat.ML

    DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs

    Authors: Aayam Shrestha, Stefan Lee, Prasad Tadepalli, Alan Fern

    Abstract: We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience. This approach can be applied on top of any learned representation and has the potential to easily support multiple solution objectives as well as zero-shot adjustment to changing environments and goals. Our main contribution is to introduce t…

    Submitted 17 October, 2020; originally announced October 2020.

    Comments: Preprint. Under review at ICLR 2021

  28. arXiv:2010.05180  [pdf, other]

    cs.AI

    Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions

    Authors: Zhengxian Lin, Kim-Ho Lam, Alan Fern

    Abstract: We investigate a deep reinforcement learning (RL) architecture that supports explaining why a learned agent prefers one action over another. The key idea is to learn action-values that are directly represented via human-understandable properties of expected futures. This is realized via the embedded self-prediction (ESP) model, which learns said properties in terms of human provided features. Actio…

    Submitted 17 January, 2021; v1 submitted 11 October, 2020; originally announced October 2020.

    Comments: Published (Oral) at ICLR 2021

  29. arXiv:2006.03745  [pdf, other]

    cs.LG stat.ML

    Re-understanding Finite-State Representations of Recurrent Policy Networks

    Authors: Mohamad H. Danesh, Anurag Koul, Alan Fern, Saeed Khorram

    Abstract: We introduce an approach for understanding control policies represented as recurrent neural networks. Recent work has approached this problem by transforming such recurrent policy networks into finite-state machines (FSM) and then analyzing the equivalent minimized FSM. While this led to interesting insights, the minimization process can obscure a deeper understanding of a machine's operation by m…

    Submitted 11 July, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: ICML 2021

  30. arXiv:2006.02402  [pdf, other]

    cs.RO

    Learning Memory-Based Control for Human-Scale Bipedal Locomotion

    Authors: Jonah Siekmann, Srikar Valluri, Jeremy Dao, Lorenzo Bermillo, Helei Duan, Alan Fern, Jonathan Hurst

    Abstract: Controlling a non-statically stable biped is a difficult problem largely due to the complex hybrid dynamics involved. Recent work has demonstrated the effectiveness of reinforcement learning (RL) for simulation-based training of neural network controllers that successfully transfer to real bipeds. The existing work, however, has primarily used simple memoryless network architectures, even though m…

    Submitted 3 June, 2020; originally announced June 2020.

    Comments: 8 pages, 5 figures, submitted to Robotics: Science and Systems 2020

  31. arXiv:1910.00614  [pdf, other]

    cs.AI

    The Choice Function Framework for Online Policy Improvement

    Authors: Murugeswari Issakkimuthu, Alan Fern, Prasad Tadepalli

    Abstract: There are notable examples of online search improving over hand-coded or learned policies (e.g. AlphaZero) for sequential decision making. It is not clear, however, whether or not policy improvement is guaranteed for many of these approaches, even when given a perfect evaluation function and transition model. Indeed, simple counter examples show that seemingly reasonable online search procedures c…

    Submitted 7 October, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

  32. arXiv:1903.09708  [pdf, other]

    cs.HC cs.AI

    Explaining Reinforcement Learning to Mere Mortals: An Empirical Study

    Authors: Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern, Margaret Burnett

    Abstract: We present a user study to investigate the impact of explanations on non-experts' understanding of reinforcement learning (RL) agents. We investigate both a common RL visualization, saliency maps (the focus of attention), and a more recent explanation type, reward-decomposition bars (predictions of future types of rewards). We designed a 124 participant, four-treatment experiment to compare partic…

    Submitted 18 June, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

    Comments: 7 pages

  33. arXiv:1812.07150  [pdf, other]

    cs.LG cs.CV stat.ML

    Interactive Naming for Explaining Deep Neural Networks: A Formative Study

    Authors: Mandana Hamidi-Haines, Zhongang Qi, Alan Fern, Fuxin Li, Prasad Tadepalli

    Abstract: We consider the problem of explaining the decisions of deep neural networks for image recognition in terms of human-recognizable visual concepts. In particular, given a test set of images, we aim to explain each classification in terms of a small number of image regions, or activation maps, which have been associated with semantic concepts by a human annotator. This allows for generating summary v…

    Submitted 20 December, 2018; v1 submitted 17 December, 2018; originally announced December 2018.

  34. arXiv:1811.12530  [pdf, other]

    cs.LG stat.ML

    Learning Finite State Representations of Recurrent Policy Networks

    Authors: Anurag Koul, Sam Greydanus, Alan Fern

    Abstract: Recurrent neural networks (RNNs) are an effective representation of control policies for a wide range of reinforcement and imitation learning problems. RNN policies, however, are particularly difficult to explain, understand, and analyze due to their use of continuous-valued memory vectors and observation features. In this paper, we introduce a new technique, Quantized Bottleneck Insertion, to lea…

    Submitted 29 November, 2018; originally announced November 2018.

    Comments: Preprint. Under review at ICLR 2019

  35. arXiv:1808.00529  [pdf, other]

    cs.LG stat.ML

    Open Category Detection with PAC Guarantees

    Authors: Si Liu, Risheek Garrepalli, Thomas G. Dietterich, Alan Fern, Dan Hendrycks

    Abstract: Open category detection is the problem of detecting "alien" test instances that belong to categories or classes that were not present in the training data. In many applications, reliably detecting such aliens is central to ensuring the safety and accuracy of test set predictions. Unfortunately, there are no algorithms that provide theoretical guarantees on their ability to detect aliens under gene…

    Submitted 1 August, 2018; originally announced August 2018.

  36. arXiv:1711.00138  [pdf, other]

    cs.AI

    Visualizing and Understanding Atari Agents

    Authors: Sam Greydanus, Anurag Koul, Jonathan Dodge, Alan Fern

    Abstract: While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating u…

    Submitted 10 September, 2018; v1 submitted 31 October, 2017; originally announced November 2017.

    Comments: ICML 2018 conference paper. Code: https://github.com/greydanus/visualize_atari Blog: https://greydanus.github.io/2017/11/01/visualize-atari/

  37. arXiv:1710.01420  [pdf, other]

    cs.DB cs.LG

    Usable & Scalable Learning Over Relational Data With Automatic Language Bias

    Authors: Jose Picado, Arash Termehchy, Sudhanshu Pathak, Alan Fern, Praveen Ilango, Yunqiao Cai

    Abstract: Relational databases are valuable resources for learning novel and interesting relations and concepts. In order to constrain the search through the large space of candidate definitions, users must tune the algorithm by specifying a language bias. Unfortunately, specifying the language bias is done via trial and error and is guided by the expert's intuitions. We propose AutoBias, a system that lev…

    Submitted 6 April, 2020; v1 submitted 3 October, 2017; originally announced October 2017.

  38. arXiv:1708.09441  [pdf, other]

    cs.LG cs.AI stat.ML

    Incorporating Feedback into Tree-based Anomaly Detection

    Authors: Shubhomoy Das, Weng-Keen Wong, Alan Fern, Thomas G. Dietterich, Md Amran Siddiqui

    Abstract: Anomaly detectors are often used to produce a ranked list of statistical anomalies, which are examined by human analysts in order to extract the actual anomalies of interest. Unfortunately, in real-world applications, this process can be exceedingly difficult for the analyst since a large fraction of high-ranking anomalies are false positives and not interesting from the application perspective. In…

    Submitted 30 August, 2017; originally announced August 2017.

    Comments: 8 Pages, KDD 2017 Workshop on Interactive Data Exploration and Analytics (IDEA'17), August 14th, 2017, Halifax, Nova Scotia, Canada

    ACM Class: I.2.6; I.5.5

  39. arXiv:1607.07770  [pdf, ps, other]

    cs.CV

    Approximate Policy Iteration for Budgeted Semantic Video Segmentation

    Authors: Behrooz Mahasseni, Sinisa Todorovic, Alan Fern

    Abstract: This paper formulates and presents a solution to the new problem of budgeted semantic video segmentation. Given a video, the goal is to accurately assign a semantic class label to every pixel in the video within a specified time budget. Typical approaches to such labeling problems, such as Conditional Random Fields (CRFs), focus on maximizing accuracy but do not provide a principled method for sat…

    Submitted 26 July, 2016; originally announced July 2016.

  40. arXiv:1508.03846  [pdf, other]

    cs.DB cs.AI cs.LG cs.LO

    Schema Independent Relational Learning

    Authors: Jose Picado, Arash Termehchy, Alan Fern, Parisa Ataei

    Abstract: Learning novel concepts and relations from relational databases is an important problem with many applications in database systems and machine learning. Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. Nevertheless, the same data set may be represented under different schemas for various reasons, such as efficiency, data quality,…

    Submitted 6 November, 2017; v1 submitted 16 August, 2015; originally announced August 2015.

  41. arXiv:1503.01158  [pdf, other]

    cs.AI cs.LG stat.ML

    A Meta-Analysis of the Anomaly Detection Problem

    Authors: Andrew Emmott, Shubhomoy Das, Thomas Dietterich, Alan Fern, Weng-Keen Wong

    Abstract: This article provides a thorough meta-analysis of the anomaly detection problem. To accomplish this we first identify approaches to benchmarking anomaly detection algorithms across the literature and produce a large corpus of anomaly detection benchmarks that vary in their construction across several dimensions we deem important to real-world applications: (a) point difficulty, (b) relative freque…

    Submitted 26 August, 2016; v1 submitted 3 March, 2015; originally announced March 2015.

  42. arXiv:1503.00038  [pdf, other]

    cs.AI cs.LG stat.ML

    Sequential Feature Explanations for Anomaly Detection

    Authors: Md Amran Siddiqui, Alan Fern, Thomas G. Dietterich, Weng-Keen Wong

    Abstract: In many applications, an anomaly detection system presents the most anomalous data instance to a human analyst, who then must determine whether the instance is truly of interest (e.g. a threat in a security setting). Unfortunately, most anomaly detectors provide no explanation about why an instance was considered anomalous, leaving the analyst with no guidance about where to begin the investigatio…

    Submitted 27 February, 2015; originally announced March 2015.

    Comments: 9 pages, 4 figures and submitted to KDD 2015

  43. arXiv:1409.2553  [pdf, other]

    cs.DB

    Representation Independent Analytics Over Structured Data

    Authors: Yodsawalai Chodpathumwan, Jose Picado, Arash Termehchy, Alan Fern, Yizhou Sun

    Abstract: Database analytics algorithms leverage quantifiable structural properties of the data to predict interesting concepts and relationships. The same information, however, can be represented using many different structures and the structural properties observed over particular representations do not necessarily hold for alternative structures. Thus, there is no guarantee that current database analytic…

    Submitted 8 September, 2014; originally announced September 2014.

  44. arXiv:1404.5511  [pdf, other]

    cs.LG

    Coactive Learning for Locally Optimal Problem Solving

    Authors: Robby Goetschalckx, Alan Fern, Prasad Tadepalli

    Abstract: Coactive learning is an online problem solving setting where the solutions provided by a solver are interactively improved by a domain expert, which in turn drives learning. In this paper we extend the study of coactive learning to problems where obtaining a globally optimal or near-optimal solution may be intractable or where an expert can only be expected to make small, local improvements to a c…

    Submitted 18 April, 2014; originally announced April 2014.

    Comments: AAAI 2014 paper, including appendices

  45. arXiv:1402.0911  [pdf, ps, other]

    eess.SY physics.soc-ph

    A Policy Switching Approach to Consolidating Load Shedding and Islanding Protection Schemes

    Authors: Rich Meier, Eduardo Cotilla-Sanchez, Alan Fern

    Abstract: In recent years there have been many improvements in the reliability of critical infrastructure systems. Despite these improvements, the power systems industry has seen relatively small advances in this regard. For instance, power quality deficiencies, a high number of localized contingencies, and large cascading outages are still too widespread. Though progress has been made in improving generati…

    Submitted 9 August, 2014; v1 submitted 4 February, 2014; originally announced February 2014.

    Comments: Full Paper Accepted to PSCC 2014 - IEEE Co-Sponsored Conference. 7 Pages, 2 Figures, 2 Tables

  46. arXiv:1306.6302  [pdf, other]

    cs.AI cs.LG

    Solving Relational MDPs with Exogenous Events and Additive Rewards

    Authors: S. Joshi, R. Khardon, P. Tadepalli, A. Raghavan, A. Fern

    Abstract: We formalize a simple but natural subclass of service domains for relational planning problems with object-centered, independent exogenous events and additive rewards capturing, for example, problems in inventory control. Focusing on this subclass, we present a new symbolic planning algorithm which is the first algorithm that has explicit performance guarantees for relational MDPs with exogenous e…

    Submitted 27 June, 2013; v1 submitted 26 June, 2013; originally announced June 2013.

    Comments: This is an extended version of our ECML/PKDD 2013 paper including all proofs. (v2 corrects typos and updates ref [10] to cite this report as the full version)

  47. arXiv:1301.0614  [pdf]

    cs.AI

    Inductive Policy Selection for First-Order MDPs

    Authors: Sung Wook Yoon, Alan Fern, Robert Givan

    Abstract: We select policies for large Markov Decision Processes (MDPs) with compact first-order representations. We find policies that generalize well as the number of objects in the domain grows, potentially without bound. Existing dynamic-programming approaches based on flat, propositional, or first-order representations either are impractical here or do not naturally scale as the number of objects grows…

    Submitted 12 December, 2012; originally announced January 2013.

    Comments: Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)

    Report number: UAI-P-2002-PG-568-576

  48. arXiv:1210.4880  [pdf]

    cs.AI cs.GT cs.LG

    Inferring Strategies from Limited Reconnaissance in Real-time Strategy Games

    Authors: Jesse Hostetler, Ethan W. Dereszynski, Thomas G. Dietterich, Alan Fern

    Abstract: In typical real-time strategy (RTS) games, enemy units are visible only when they are within sight range of a friendly unit. Knowledge of an opponent's disposition is limited to what can be observed through scouting. Information is costly, since units dedicated to scouting are unavailable for other purposes, and the enemy will resist scouting attempts. It is important to infer as much as possible…

    Submitted 16 October, 2012; originally announced October 2012.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-367-376

  49. arXiv:1210.4876  [pdf]

    cs.LG stat.ML

    Active Imitation Learning via Reduction to I.I.D. Active Learning

    Authors: Kshitij Judah, Alan Fern, Thomas G. Dietterich

    Abstract: In standard passive imitation learning, the goal is to learn a target policy by passively observing full execution trajectories of it. Unfortunately, generating such trajectories can require substantial expert effort and be impractical in some cases. In this paper, we consider active imitation learning with the goal of reducing this effort by querying the expert about the desired action at individ…

    Submitted 16 October, 2012; originally announced October 2012.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-428-437

  50. arXiv:1206.6460  [pdf]

    cs.LG cs.AI stat.ML

    Output Space Search for Structured Prediction

    Authors: Janardhan Rao Doppa, Alan Fern, Prasad Tadepalli

    Abstract: We consider a framework for structured prediction based on search in the space of complete structured outputs. Given a structured input, an output is produced by running a time-bounded search procedure guided by a learned cost function, and then returning the least cost output uncovered during the search. This framework can be instantiated for a wide range of search spaces and search procedures, a…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)