Showing 1–5 of 5 results for author: Mediratta, I

  1. arXiv:2312.05742  [pdf, other]

    cs.LG cs.AI

    The Generalization Gap in Offline Reinforcement Learning

    Authors: Ishita Mediratta, Qingfei You, Minqi Jiang, Roberta Raileanu

    Abstract: Despite recent progress in offline learning, these methods are still trained and tested on the same environment. In this paper, we compare the generalization abilities of widely used online and offline learning methods such as online reinforcement learning (RL), offline RL, sequence modeling, and behavioral cloning. Our experiments show that offline learning algorithms perform worse on new environ…

    Submitted 14 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: Published as a conference paper at ICLR 2024; First two authors contributed equally

  2. arXiv:2310.06452  [pdf, other]

    cs.LG cs.AI cs.CL

    Understanding the Effects of RLHF on LLM Generalisation and Diversity

    Authors: Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu

    Abstract: Large language models (LLMs) fine-tuned with reinforcement learning from human feedback (RLHF) have been used in some of the most widely deployed AI models to date, such as OpenAI's ChatGPT or Anthropic's Claude. While there has been significant work developing these methods, our understanding of the benefits and downsides of each stage in RLHF is still limited. To fill this gap, we present an ext…

    Submitted 19 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Code available here: https://github.com/facebookresearch/rlfh-gen-div

  3. arXiv:2308.10797  [pdf, other]

    cs.LG cs.AI

    Stabilizing Unsupervised Environment Design with a Learned Adversary

    Authors: Ishita Mediratta, Minqi Jiang, Jack Parker-Holder, Michael Dennis, Eugene Vinitsky, Tim Rocktäschel

    Abstract: A key challenge in training generally-capable agents is the design of training tasks that facilitate broad generalization and robustness to environment variations. This challenge motivates the problem setting of Unsupervised Environment Design (UED), whereby a student agent trains on an adaptive distribution of tasks proposed by a teacher agent. A pioneering approach for UED is PAIRED, which uses…

    Submitted 22 August, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: CoLLAs 2023 - Oral; Second and third authors contributed equally

  4. arXiv:2112.08879  [pdf, other]

    cs.CV cs.CL

    Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds

    Authors: Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki

    Abstract: Most models tasked to ground referential utterances in 2D and 3D scenes learn to select the referred object from a pool of object proposals provided by a pre-trained detector. This is limiting because an utterance may refer to visual entities at various levels of granularity, such as the chair, the leg of the chair, or the tip of the front leg of the chair, which may be missed by the detector. We…

    Submitted 21 July, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: First two authors contributed equally | ECCV 2022 Camera Ready

  5. arXiv:2104.03851  [pdf, other]

    cs.CV

    CoCoNets: Continuous Contrastive 3D Scene Representations

    Authors: Shamit Lal, Mihir Prabhudesai, Ishita Mediratta, Adam W. Harley, Katerina Fragkiadaki

    Abstract: This paper explores self-supervised learning of amodal 3D feature representations from RGB and RGB-D posed images and videos, agnostic to object and scene semantic content, and evaluates the resulting scene representations in the downstream tasks of visual correspondence, object tracking, and object detection. The model infers a latent 3D representation of the scene in the form of 3D feature points…

    Submitted 8 April, 2021; originally announced April 2021.