Showing 1–14 of 14 results for author: Gangwani, T

Searching in archive cs.
  1. arXiv:2311.13159  [pdf, other]

    cs.LG math.OC stat.ML

    Multi-Objective Optimization via Wasserstein-Fisher-Rao Gradient Flow

    Authors: Yinuo Ren, Tesi Xiao, Tanmay Gangwani, Anshuka Rangi, Holakou Rahmanian, Lexing Ying, Subhajit Sanyal

    Abstract: Multi-objective optimization (MOO) aims to optimize multiple, possibly conflicting objectives with widespread applications. We introduce a novel interacting particle method for MOO inspired by molecular dynamics simulations. Our approach combines overdamped Langevin and birth-death dynamics, incorporating a "dominance potential" to steer particles toward global Pareto optimality. In contrast to pr…

    Submitted 21 November, 2023; originally announced November 2023.

  2. arXiv:2302.00284  [pdf, other]

    cs.LG cs.AI

    Selective Uncertainty Propagation in Offline RL

    Authors: Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Anshuka Rangi

    Abstract: We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step h in dynamic programming (DP) algorithms. To learn this, it is sufficient to evaluate the treatment effect of deviating from the behavioral policy at step h after having optimized the policy for all future steps. Since the policy at any step can affect n…

    Submitted 12 February, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

  3. arXiv:2204.11446  [pdf, other]

    cs.LG stat.ML

    Imitation Learning from Observations under Transition Model Disparity

    Authors: Tanmay Gangwani, Yuan Zhou, Jian Peng

    Abstract: Learning to perform tasks by leveraging a dataset of expert observations, also known as imitation learning from observations (ILO), is an important paradigm for learning skills without access to the expert reward function or the expert actions. We consider ILO in the setting where the expert and the learner agents operate in different environments, with the source of the discrepancy being the tran…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: ICLR 2022 camera-ready

  4. arXiv:2109.09031  [pdf, other]

    cs.LG stat.ML

    Hindsight Foresight Relabeling for Meta-Reinforcement Learning

    Authors: Michael Wan, Jian Peng, Tanmay Gangwani

    Abstract: Meta-reinforcement learning (meta-RL) algorithms allow for agents to learn new behaviors from small amounts of experience, mitigating the sample inefficiency problem in RL. However, while meta-RL agents can adapt quickly to new tasks at test time after experiencing only a few trajectories, the meta-training process is still sample-inefficient. Prior works have found that in the multi-task RL setti…

    Submitted 25 April, 2022; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: ICLR 2022 camera-ready

  5. arXiv:2107.12958  [pdf, other]

    cs.DC cs.CR cs.IT cs.LG

    Adaptive Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning

    Authors: Tingting Tang, Ramy E. Ali, Hanieh Hashemi, Tynan Gangwani, Salman Avestimehr, Murali Annavaram

    Abstract: Stragglers, Byzantine workers, and data privacy are the main bottlenecks in distributed cloud computing. Some prior works proposed coded computing strategies to jointly address all three challenges. They require either a large number of workers, a significant communication cost or a significant computational complexity to tolerate Byzantine workers. Much of the overhead in prior schemes comes from…

    Submitted 22 March, 2022; v1 submitted 27 July, 2021; originally announced July 2021.

  6. arXiv:2011.02614  [pdf, other]

    cs.LG stat.ML

    Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

    Authors: Tanmay Gangwani, Jian Peng, Yuan Zhou

    Abstract: Quality-Diversity (QD) is a concept from Neuroevolution with some intriguing applications to Reinforcement Learning. It facilitates learning a population of agents where each member is optimized to simultaneously accumulate high task-returns and exhibit behavioral diversity compared to other members. In this paper, we build on a recent kernel-based method for training a QD policy ensemble with Ste…

    Submitted 4 November, 2020; originally announced November 2020.

    Comments: CoRL 2020 camera-ready

  7. arXiv:2010.12718  [pdf, other]

    cs.LG stat.ML

    Learning Guidance Rewards with Trajectory-space Smoothing

    Authors: Tanmay Gangwani, Yuan Zhou, Jian Peng

    Abstract: Long-term temporal credit assignment is an important challenge in deep reinforcement learning (RL). It refers to the ability of the agent to attribute actions to consequences that may occur after a long time interval. Existing policy-gradient and Q-learning algorithms typically rely on dense environmental rewards that provide rich short-term supervision and help with credit assignment. However, th…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020 camera-ready

  8. arXiv:2006.07041  [pdf, other]

    stat.ML cs.LG

    Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch

    Authors: Michael Wan, Tanmay Gangwani, Jian Peng

    Abstract: Deep reinforcement learning (RL) algorithms have achieved great success on a wide variety of sequential decision-making tasks. However, many of these algorithms suffer from high sample complexity when learning from scratch using environmental rewards, due to issues such as credit-assignment and high-variance gradients, among others. Transfer learning, in which knowledge gained on a source task is…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: Conference on Uncertainty in Artificial Intelligence (UAI 2020)

  9. arXiv:2002.11879  [pdf, other]

    stat.ML cs.LG

    State-only Imitation with Transition Dynamics Mismatch

    Authors: Tanmay Gangwani, Jian Peng

    Abstract: Imitation Learning (IL) is a popular paradigm for training agents to achieve complicated goals by leveraging expert behavior, rather than dealing with the hardships of designing a correct reward function. With the environment modeled as a Markov Decision Process (MDP), most of the existing IL algorithms are contingent on the availability of expert demonstrations in the same MDP as the one in which…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: ICLR 2020 camera-ready

  10. arXiv:1906.09510  [pdf, other]

    cs.LG stat.ML

    Learning Belief Representations for Imitation Learning in POMDPs

    Authors: Tanmay Gangwani, Joel Lehman, Qiang Liu, Jian Peng

    Abstract: We consider the problem of imitation learning from expert demonstrations in partially observable Markov decision processes (POMDPs). Belief representations, which characterize the distribution over the latent states in a POMDP, have been modeled using recurrent neural networks and probabilistic latent variable models, and shown to be effective for reinforcement learning in POMDPs. In this work, we…

    Submitted 22 June, 2019; originally announced June 2019.

    Comments: Conference on Uncertainty in Artificial Intelligence (UAI 2019)

  11. arXiv:1905.08178  [pdf, ps, other]

    cs.PL

    Partial Redundancy Elimination using Lazy Code Motion

    Authors: Sandeep Dasgupta, Tanmay Gangwani

    Abstract: Partial Redundancy Elimination (PRE) is a compiler optimization that eliminates expressions that are redundant on some but not necessarily all paths through a program. In this project, we implemented a PRE optimization pass in LLVM and measured results on a variety of applications. We chose PRE because it is a powerful technique that subsumes Common Subexpression Elimination (CSE) and Loop Invaria…

    Submitted 20 May, 2019; originally announced May 2019.

  12. arXiv:1811.10296  [pdf, other]

    cs.CR

    Distributed and Secure ML with Self-tallying Multi-party Aggregation

    Authors: Yunhui Long, Tanmay Gangwani, Haris Mughees, Carl Gunter

    Abstract: Privacy preserving multi-party computation has many applications in areas such as medicine and online advertisements. In this work, we propose a framework for distributed, secure machine learning among untrusted individuals. The framework consists of two parts: a two-step training protocol based on homomorphic addition and a zero knowledge proof for data validity. By combining these two techniques…

    Submitted 26 November, 2018; originally announced November 2018.

    Comments: NeurIPS 2018 Workshop on PPML

  13. arXiv:1805.10309  [pdf, other]

    stat.ML cs.LG

    Learning Self-Imitating Diverse Policies

    Authors: Tanmay Gangwani, Qiang Liu, Jian Peng

    Abstract: The success of popular algorithms for deep reinforcement learning, such as policy-gradients and Q-learning, relies heavily on the availability of an informative reward signal at each timestep of the sequential decision-making process. When rewards are only sparsely available during an episode, or a rewarding feedback is provided only after episode termination, these algorithms perform sub-optimall…

    Submitted 22 February, 2019; v1 submitted 25 May, 2018; originally announced May 2018.

    Comments: ICLR 2019

  14. arXiv:1711.01012  [pdf, other]

    stat.ML cs.LG

    Policy Optimization by Genetic Distillation

    Authors: Tanmay Gangwani, Jian Peng

    Abstract: Genetic algorithms have been widely used in many practical optimization problems. Inspired by natural selection, operators, including mutation, crossover and selection, provide effective heuristics for search and black-box optimization. However, they have not been shown useful for deep reinforcement learning, possibly due to the catastrophic consequence of parameter crossovers of neural networks.…

    Submitted 12 March, 2018; v1 submitted 2 November, 2017; originally announced November 2017.