Showing 1–23 of 23 results for author: Arumugam, D

  1. arXiv:2403.03175 [pdf]

    physics.app-ph eess.SP quant-ph

    Remote sensing of soil moisture using Rydberg atoms and satellite signals of opportunity

    Authors: Darmindra Arumugam, Jun-Hee Park, Brook Feyissa, Jack Bush, Srinivas Prasad Mysore Nagaraja

    Abstract: Spaceborne radar remote sensing of the earth system is essential to study natural and man-made changes in the ecosystem, water and energy cycles, weather and air quality, sea level, and surface dynamics. A major challenge with current approaches is the lack of broad spectrum tunability due to narrow band microwave electronics, that limit systems to specific science variable retrievals. This result…

    Submitted 5 March, 2024; originally announced March 2024.

  2. arXiv:2310.17769 [pdf, other]

    cs.CL cs.AI

    Social Contract AI: Aligning AI Assistants with Implicit Group Norms

    Authors: Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman

    Abstract: We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions. To validate our proposal, we run proof-of-concept simulations in the economic ultimatum game, formalizing user preferences as policies that guide the actions of simulated players. We find that the AI assistant accurately aligns its behavior to match standard policies fro…

    Submitted 3 December, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: SoLaR NeurIPS 2023 Workshop (https://solar-neurips.github.io/)

  3. arXiv:2309.09495 [pdf, other]

    cs.HC cs.SE

    PwR: Exploring the Role of Representations in Conversational Programming

    Authors: Pradyumna YM, Vinod Ganesan, Dinesh Kumar Arumugam, Meghna Gupta, Nischith Shadagopan, Tanay Dixit, Sameer Segal, Pratyush Kumar, Mohit Jain, Sriram Rajamani

    Abstract: Large Language Models (LLMs) have revolutionized programming and software engineering. AI programming assistants such as GitHub Copilot X enable conversational programming, narrowing the gap between human intent and code generation. However, prior literature has identified a key challenge--there is a gap between user's mental model of the system's understanding after a sequence of natural language…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: 23 pages, 3 figures, 2 tables, under submission for ACM CHI 2024

    ACM Class: H.5.2

  4. arXiv:2307.11897 [pdf, other]

    cs.LG cs.AI

    Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning

    Authors: Akash Velu, Skanda Vaidyanath, Dilip Arumugam

    Abstract: Oftentimes, environments for sequential decision-making problems can be quite sparse in the provision of evaluative feedback to guide reinforcement-learning agents. In the extreme case, long trajectories of behavior are merely punctuated with a single terminal feedback signal, leading to a significant temporal delay between the observation of a non-trivial reward and the individual steps of behavi…

    Submitted 18 August, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

  5. arXiv:2305.11455 [pdf, other]

    cs.CL cs.AI cs.LG

    Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

    Authors: Wanqiao Xu, Shi Dong, Dilip Arumugam, Benjamin Van Roy

    Abstract: A centerpiece of the ever-popular reinforcement learning from human feedback (RLHF) approach to fine-tuning autoregressive language models is the explicit training of a reward model to emulate human feedback, distinct from the language model itself. This reward model is then coupled with policy-gradient methods to dramatically improve the alignment between language model outputs and desired respon…

    Submitted 19 May, 2023; originally announced May 2023.

  6. arXiv:2305.03263 [pdf, other]

    cs.LG cs.AI

    Bayesian Reinforcement Learning with Limited Cognitive Load

    Authors: Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

    Abstract: All biological and artificial agents must learn and make decisions given limits on their ability to process information. As such, a general theory of adaptive behavior should be able to account for the complex interactions between an agent's learning history, decisions, and capacity constraints. Recent work in computer science has begun to clarify the principles that shape these dynamics by bridgi…

    Submitted 4 May, 2023; originally announced May 2023.

  7. arXiv:2212.12633 [pdf, other]

    cs.LG cs.AI

    Inclusive Artificial Intelligence

    Authors: Dilip Arumugam, Shi Dong, Benjamin Van Roy

    Abstract: Prevailing methods for assessing and comparing generative AIs incentivize responses that serve a hypothetical representative individual. Evaluating models in these terms presumes homogeneous preferences across the population and engenders selection of agglomerative AIs, which fail to represent the diverse range of interests across individuals. We propose an alternative evaluation method that inste…

    Submitted 3 March, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

  8. arXiv:2210.16877 [pdf, ps, other]

    cs.LG cs.AI

    On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

    Authors: Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

    Abstract: Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources. Prior work has drawn inspiration from this fact and leveraged an information-theoretic model of such behaviors or policies as communication cha…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Accepted to the NeurIPS Workshop on Information-Theoretic Principles in Cognitive Systems (InfoCog) 2022. arXiv admin note: text overlap with arXiv:2206.02072

  9. arXiv:2210.16872 [pdf, ps, other]

    cs.LG stat.ML

    Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

    Authors: Dilip Arumugam, Satinder Singh

    Abstract: The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning. As the computation of exact solutions to Bayesian reinforcement-learning problems is intractable, much of the literature has focused on developing suitable approximation algorithms. In this work, before diving into algorithm design, we…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2022

  10. arXiv:2206.02072 [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

    Authors: Dilip Arumugam, Benjamin Van Roy

    Abstract: The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment. Recent empirical successes in model-based reinforcement learning with function approximation, however, eschew the true model in favor of a surrogate that, while ignoring various facets of the environment, still facilitates effective plan…

    Submitted 30 October, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2022

  11. arXiv:2206.02025 [pdf, ps, other]

    cs.LG cs.IT

    Between Rate-Distortion Theory & Value Equivalence in Model-Based Reinforcement Learning

    Authors: Dilip Arumugam, Benjamin Van Roy

    Abstract: The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment. Recent empirical successes in model-based reinforcement learning with function approximation, however, eschew the true model in favor of a surrogate that, while ignoring various facets of the environment, still facilitates effective plan…

    Submitted 4 June, 2022; originally announced June 2022.

    Comments: Accepted to the Multi-Disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2022

  12. arXiv:2110.13973 [pdf, other]

    cs.LG cs.IT

    The Value of Information When Deciding What to Learn

    Authors: Dilip Arumugam, Benjamin Van Roy

    Abstract: All sequential decision-making agents explore so as to acquire knowledge about a particular target. It is often the responsibility of the agent designer to construct this target which, in rich and complex environments, constitutes an onerous burden; without full knowledge of the environment itself, a designer may forge a sub-optimal learning target that poorly balances the amount of information an…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2021

  13. arXiv:2110.03424 [pdf, other]

    cs.LG cs.AI

    Bad-Policy Density: A Measure of Reinforcement Learning Hardness

    Authors: David Abel, Cameron Allen, Dilip Arumugam, D. Ellis Hershkowitz, Michael L. Littman, Lawson L. S. Wong

    Abstract: Reinforcement learning is hard in general. Yet, in many specific environments, learning is easy. What makes learning easy in one environment, but difficult in another? We address this question by proposing a simple measure of reinforcement-learning hardness called the bad-policy density. This quantity measures the fraction of the deterministic stationary policy space that is below a desired thresh…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: Presented at the 2021 ICML Workshop on Reinforcement Learning Theory

  14. arXiv:2103.06224 [pdf, ps, other]

    cs.LG cs.IT

    An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning

    Authors: Dilip Arumugam, Peter Henderson, Pierre-Luc Bacon

    Abstract: How do we formalize the challenge of credit assignment in reinforcement learning? Common intuition would draw attention to reward sparsity as a key contributor to difficult credit assignment and traditional heuristics would look to temporal recency for the solution, calling upon the classic eligibility trace. We posit that it is not the sparsity of the reward itself that causes difficulty in credi…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: Workshop on Biological and Artificial Reinforcement Learning (NeurIPS 2020)

  15. arXiv:2101.06197 [pdf, other]

    cs.LG cs.IT

    Deciding What to Learn: A Rate-Distortion Approach

    Authors: Dilip Arumugam, Benjamin Van Roy

    Abstract: Agents that learn to select optimal actions represent a prominent focus of the sequential decision-making literature. In the face of a complex environment or constraints on time and resources, however, aiming to synthesize such an optimal policy can become infeasible. These scenarios give rise to an important trade-off between the information an agent must acquire to learn and the sub-optimality o…

    Submitted 21 June, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

  16. arXiv:2010.02383 [pdf, other]

    cs.LG cs.AI stat.ML

    Randomized Value Functions via Posterior State-Abstraction Sampling

    Authors: Dilip Arumugam, Benjamin Van Roy

    Abstract: State abstraction has been an essential tool for dramatically improving the sample efficiency of reinforcement-learning algorithms. Indeed, by exposing and accentuating various types of latent structure within the environment, different classes of state abstraction have enabled improved theoretical guarantees and empirical performance. When dealing with state abstractions that capture structure in…

    Submitted 17 June, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted to the Workshop on Biological and Artificial Reinforcement Learning (NeurIPS 2020)

  17. arXiv:2006.10810 [pdf, other]

    cs.LG stat.ML

    Reparameterized Variational Divergence Minimization for Stable Imitation

    Authors: Dilip Arumugam, Debadeepta Dey, Alekh Agarwal, Asli Celikyilmaz, Elnaz Nouri, Bill Dolan

    Abstract: While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, recent works exploring the imitation learning from observation (ILO) setting, where trajectories only contain expert observations, have not been met with the same success. Inspired by recent investigations of $f$-divergence manipulation for the standard imitation learning setting (Ke et al.…

    Submitted 18 June, 2020; originally announced June 2020.

  18. arXiv:2004.10876 [pdf, other]

    cs.AI cs.RO

    Flexible and Efficient Long-Range Planning Through Curious Exploration

    Authors: Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins

    Abstract: Identifying algorithms that flexibly and efficiently discover temporally-extended multi-phase plans is an essential step for the advancement of robotics and model-based reinforcement learning. The core problem of long-range planning is finding an efficient way to search through the tree of possible action sequences. Existing non-learned planning solutions from the Task and Motion Planning (TAMP) l…

    Submitted 8 July, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

  19. arXiv:1902.04257 [pdf, other]

    cs.LG stat.ML

    Deep Reinforcement Learning from Policy-Dependent Human Feedback

    Authors: Dilip Arumugam, Jun Ki Lee, Sophie Saskin, Michael L. Littman

    Abstract: To widen their accessibility and increase their utility, intelligent agents must be able to learn complex behaviors as specified by (non-expert) human users. Moreover, they will need to learn these behaviors within a reasonable amount of time while efficiently leveraging the sparse feedback a human trainer is capable of providing. Recent work has shown that human feedback can be characterized as a…

    Submitted 12 February, 2019; originally announced February 2019.

  20. arXiv:1812.01129 [pdf, other]

    cs.LG cs.AI

    Mitigating Planner Overfitting in Model-Based Reinforcement Learning

    Authors: Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman

    Abstract: An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model. Alternatively, it can take a more conservative stance and eschew its model in favor of optimizing its behavior solely via real-world interaction. This latter approach can be exceedingly slo…

    Submitted 19 March, 2020; v1 submitted 3 December, 2018; originally announced December 2018.

  21. arXiv:1707.08668 [pdf, other]

    cs.AI cs.CL

    A Tale of Two DRAGGNs: A Hybrid Approach for Interpreting Action-Oriented and Goal-Oriented Instructions

    Authors: Siddharth Karamcheti, Edward C. Williams, Dilip Arumugam, Mina Rhee, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex

    Abstract: Robots operating alongside humans in diverse, stochastic environments must be able to accurately interpret natural language commands. These instructions often fall into one of two categories: those that specify a goal condition or target state, and those that specify explicit actions, or how to perform a given task. Recent approaches have used reward functions as a semantic representation of goal-…

    Submitted 26 July, 2017; originally announced July 2017.

    Comments: Accepted at the 1st Workshop on Language Grounding for Robotics at ACL 2017

  22. arXiv:1706.00536 [pdf, other]

    cs.AI

    Modeling Latent Attention Within Neural Networks

    Authors: Christopher Grimm, Dilip Arumugam, Siddharth Karamcheti, David Abel, Lawson L. S. Wong, Michael L. Littman

    Abstract: Deep neural networks are able to solve tasks across a variety of domains and modalities of data. Despite many empirical successes, we lack the ability to clearly understand and interpret the learned internal mechanisms that contribute to such effective behaviors or, more critically, failure modes. In this work, we present a general method for visualizing an arbitrary neural network's inner mechani…

    Submitted 30 December, 2017; v1 submitted 1 June, 2017; originally announced June 2017.

  23. Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities

    Authors: Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex

    Abstract: Humans can ground natural language commands to tasks at both abstract and fine-grained levels of specificity. For instance, a human forklift operator can be instructed to perform a high-level action, like "grab a pallet" or a low-level action like "tilt back a little bit." While robots are also capable of grounding language commands to tasks, previous methods implicitly assume that all commands an…

    Submitted 19 June, 2018; v1 submitted 21 April, 2017; originally announced April 2017.

    Comments: Updated with final version - Published as Conference Paper in Robotics: Science and Systems 2017