Showing 1–6 of 6 results for author: Gotmare, A D

  1. arXiv:2306.00029  [pdf, other]

    cs.SE cs.AI

    CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

    Authors: Nghi D. Q. Bui, Hung Le, Yue Wang, Junnan Li, Akhilesh Deepak Gotmare, Steven C. H. Hoi

    Abstract: Code intelligence plays a key role in transforming modern software engineering. Recently, deep learning-based models, especially Transformer-based large language models (LLMs), have demonstrated remarkable potential in tackling these tasks by leveraging massive open-source code data and programming language features. However, the development and deployment of such models often require expertise in…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Ongoing work - Draft Preview

  2. arXiv:2305.07922  [pdf, other]

    cs.CL cs.LG cs.PL

    CodeT5+: Open Code Large Language Models for Code Understanding and Generation

    Authors: Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, Steven C. H. Hoi

    Abstract: Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations in terms of architecture and pretraining tasks. First, they often adopt a specific architecture (encoder-only or decoder-only) or rely on a unified encoder-decoder network for different downstream tasks. The former paradigm is limi…

    Submitted 20 May, 2023; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: 26 pages, preprint

  3. arXiv:2207.01780  [pdf, other]

    cs.LG cs.CL cs.PL

    CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning

    Authors: Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi

    Abstract: Program synthesis or code generation aims to generate a program that satisfies a problem specification. Recent approaches using large-scale pretrained language models (LMs) have shown promising results, yet they have some critical limitations. In particular, they often follow a standard supervised fine-tuning procedure to train a code generation model only from the pairs of natural-language proble…

    Submitted 3 November, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: An earlier version of the work was accepted to NeurIPS 2022

  4. arXiv:2110.07811  [pdf, other]

    cs.CL cs.PL

    Cascaded Fast and Slow Models for Efficient Semantic Code Search

    Authors: Akhilesh Deepak Gotmare, Junnan Li, Shafiq Joty, Steven C. H. Hoi

    Abstract: The goal of natural language semantic code search is to retrieve a semantically relevant code snippet from a fixed set of candidates using a natural language query. Existing approaches are neither effective nor efficient enough towards a practical semantic code search system. In this paper, we propose an efficient and accurate semantic code search framework with cascaded fast and slow models, in w…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: 12 pages

  5. arXiv:2107.07651  [pdf, other]

    cs.CV cs.AI

    Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

    Authors: Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi

    Abstract: Large-scale vision and language representation learning has shown promising improvements on various vision-language tasks. Most existing methods employ a transformer-based multimodal encoder to jointly model visual tokens (region-based image features) and word tokens. Because the visual tokens and word tokens are unaligned, it is challenging for the multimodal encoder to learn image-text interacti…

    Submitted 7 October, 2021; v1 submitted 15 July, 2021; originally announced July 2021.

  6. arXiv:2009.06367  [pdf, other]

    cs.CL cs.LG

    GeDi: Generative Discriminator Guided Sequence Generation

    Authors: Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq Joty, Richard Socher, Nazneen Fatema Rajani

    Abstract: While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. We propose GeDi as an efficient method for us…

    Submitted 22 October, 2020; v1 submitted 14 September, 2020; originally announced September 2020.