Showing 1–11 of 11 results for author: Bieber, D

Searching in archive cs.
  1. arXiv:2208.07461  [pdf, other]

    cs.LG cs.PL cs.SE

    A Library for Representing Python Programs as Graphs for Machine Learning

    Authors: David Bieber, Kensen Shi, Petros Maniatis, Charles Sutton, Vincent Hellendoorn, Daniel Johnson, Daniel Tarlow

    Abstract: Graph representations of programs are commonly a central element of machine learning for code research. We introduce an open source Python library python_graphs that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. Our library admits the construction of control-flow graphs, data-flow graphs, and composite ``program graphs'…

    Submitted 15 August, 2022; originally announced August 2022.

    Comments: 21 pages, 14 figures

  2. arXiv:2207.10342  [pdf, ps, other]

    cs.CL cs.AI

    Language Model Cascades

    Authors: David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-Dickstein, Kevin Murphy, Charles Sutton

    Abstract: Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with cont…

    Submitted 28 July, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: Presented as spotlight at the Beyond Bayes workshop at ICML 2022 (https://beyond-bayes.github.io)

  3. arXiv:2203.03771  [pdf, other]

    cs.LG cs.PL

    Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions

    Authors: David Bieber, Rishab Goel, Daniel Zheng, Hugo Larochelle, Daniel Tarlow

    Abstract: The execution behavior of a program often depends on external resources, such as program inputs or file contents, and so cannot be run in isolation. Nevertheless, software developers benefit from fast iteration loops where automated tools identify errors as early as possible, even before programs can be compiled and run. This presents an interesting machine learning challenge: can we predict runti…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 20 pages, 7 figures

  4. arXiv:2112.00114  [pdf, other]

    cs.LG cs.NE

    Show Your Work: Scratchpads for Intermediate Computation with Language Models

    Authors: Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton, Augustus Odena

    Abstract: Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs. However, they struggle with tasks that require unbounded multi-step computation, such as adding integers or executing programs. Surprisingly, we find that these same models are able to perform complex multi-step computations -- even…

    Submitted 30 November, 2021; originally announced December 2021.

  5. arXiv:2010.12621  [pdf, other]

    cs.LG

    Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks

    Authors: David Bieber, Charles Sutton, Hugo Larochelle, Daniel Tarlow

    Abstract: Graph neural networks (GNNs) have emerged as a powerful tool for learning software engineering tasks including code completion, bug finding, and program repair. They benefit from leveraging program structure like control flow graphs, but they are not well-suited to tasks like program execution that require far more sequential reasoning steps than number of GNN propagation steps. Recurrent neural n…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted at NeurIPS 2020

  6. arXiv:2007.14381  [pdf, other]

    cs.PL cs.LG stat.ML

    BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration

    Authors: Augustus Odena, Kensen Shi, David Bieber, Rishabh Singh, Charles Sutton, Hanjun Dai

    Abstract: Program synthesis is challenging largely because of the difficulty of search in a large space of programs. Human programmers routinely tackle the task of writing complex programs by writing sub-programs and then analyzing their intermediate results to compose them in appropriate ways. Motivated by this intuition, we present a new synthesis approach that leverages learning to guide a bottom-up sear…

    Submitted 30 September, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

  7. arXiv:2003.09040  [pdf, other]

    cs.PL cs.LG stat.ML

    TF-Coder: Program Synthesis for Tensor Manipulations

    Authors: Kensen Shi, David Bieber, Rishabh Singh

    Abstract: The success and popularity of deep learning is on the rise, partially due to powerful deep learning frameworks such as TensorFlow and PyTorch that make it easier to develop deep learning models. However, these libraries also come with steep learning curves, since programming in these frameworks is quite different from traditional imperative programming with explicit loops and conditionals. In this…

    Submitted 7 April, 2022; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: Published in ACM Transactions on Programming Languages and Systems (TOPLAS) with presentation at PLDI 2022

  8. arXiv:2002.09067  [pdf, other]

    cs.LG cs.DS stat.ML

    Incremental Sampling Without Replacement for Sequence Models

    Authors: Kensen Shi, David Bieber, Charles Sutton

    Abstract: Sampling is a fundamental technique, and sampling without replacement is often desirable when duplicate samples are not beneficial. Within machine learning, sampling is useful for generating diverse outputs from a trained model. We present an elegant procedure for sampling without replacement from a broad class of randomized programs, including generative neural models that construct outputs seque…

    Submitted 19 July, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

  9. arXiv:1904.02818  [pdf, other]

    cs.LG cs.CL cs.SE stat.ML

    Neural Networks for Modeling Source Code Edits

    Authors: Rui Zhao, David Bieber, Kevin Swersky, Daniel Tarlow

    Abstract: Programming languages are emerging as a challenging and interesting domain for machine learning. A core task, which has received significant attention in recent years, is building generative models of source code. However, to our knowledge, previous generative models have always been framed in terms of generating static snapshots of code. In this work, we instead treat source code as a dynamic obj…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: Deanonymized version of ICLR 2019 submission

  10. arXiv:1904.01720  [pdf, other]

    cs.LG stat.ML

    Neural Program Repair by Jointly Learning to Localize and Repair

    Authors: Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, Rishabh Singh

    Abstract: Due to its potential to improve programmer productivity and software quality, automated program repair has been an active topic of research. Newer techniques harness neural networks to learn directly from examples of buggy programs and their fixes. In this work, we consider a recently identified class of bugs called variable-misuse bugs. The state-of-the-art solution for variable misuse enumerates…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: ICLR 2019

  11. arXiv:1705.07208  [pdf, other]

    cs.CV cs.LG

    PixColor: Pixel Recursive Colorization

    Authors: Sergio Guadarrama, Ryan Dahl, David Bieber, Mohammad Norouzi, Jonathon Shlens, Kevin Murphy

    Abstract: We propose a novel approach to automatically produce multiple colorized versions of a grayscale image. Our method results from the observation that the task of automated colorization is relatively easy given a low-resolution version of the color image. We first train a conditional PixelCNN to generate a low resolution color for a given grayscale image. Then, given the generated low-resolution colo…

    Submitted 5 June, 2017; v1 submitted 19 May, 2017; originally announced May 2017.