Showing 1–30 of 30 results for author: Brockschmidt, M

  1. arXiv:2302.01170  [pdf, other]

    stat.ML cond-mat.stat-mech cs.LG physics.chem-ph

    Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics

    Authors: Leon Klein, Andrew Y. K. Foong, Tor Erlend Fjelde, Bruno Mlodozeniec, Marc Brockschmidt, Sebastian Nowozin, Frank Noé, Ryota Tomioka

    Abstract: Molecular dynamics (MD) simulation is a widely used technique to simulate molecular systems, most commonly at the all-atom resolution where equations of motion are integrated with timesteps on the order of femtoseconds ($1\textrm{fs}=10^{-15}\textrm{s}$). MD is often used to compute equilibrium properties, which requires sampling from an equilibrium distribution such as the Boltzmann distribution.…

    Submitted 1 December, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

  2. arXiv:2206.06986  [pdf, other]

    cs.AI cs.LG

    Exploring Representation of Horn Clauses using GNNs (Extended Technical Report)

    Authors: Chencheng Liang, Philipp Rümmer, Marc Brockschmidt

    Abstract: Learning program semantics from raw source code is challenging due to the complexity of real-world programming language syntax and due to the difficulty of reconstructing long-distance relational information implicitly represented in programs using identifiers. Addressing the first point, we consider Constrained Horn Clauses (CHCs) as a standard representation of program verification problems, pro…

    Submitted 26 July, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

  3. arXiv:2201.12113  [pdf, other]

    cs.LG cs.SE

    HEAT: Hyperedge Attention Networks

    Authors: Dobrik Georgiev, Marc Brockschmidt, Miltiadis Allamanis

    Abstract: Learning from structured data is a core machine learning task. Commonly, such data is represented as graphs, which normally only consider (typed) binary relationships between pairs of nodes. This is a substantial limitation for many domains with highly-structured data. One important such domain is source code, where hypergraph-based representations can better capture the semantically rich and stru…

    Submitted 5 September, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: Published in TMLR

  4. arXiv:2106.10158  [pdf, other]

    cs.LG cs.SE

    Learning to Complete Code with Sketches

    Authors: Daya Guo, Alexey Svyatkovskiy, Jian Yin, Nan Duan, Marc Brockschmidt, Miltiadis Allamanis

    Abstract: Code completion is usually cast as a language modelling problem, i.e., continuing an input in a left-to-right fashion. However, in practice, some parts of the completion (e.g., string literals) may be very hard to predict, whereas subsequent parts directly follow from the context. To handle this, we instead consider the scenario of generating code completions with "holes" inserted in places where…

    Submitted 23 January, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: Published in ICLR 2022

  5. arXiv:2105.12787  [pdf, other]

    cs.LG cs.SE

    Self-Supervised Bug Detection and Repair

    Authors: Miltiadis Allamanis, Henry Jackson-Flux, Marc Brockschmidt

    Abstract: Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a…

    Submitted 16 November, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: Published in NeurIPS 2021

  6. arXiv:2103.03864  [pdf, other]

    cs.LG q-bio.QM

    Learning to Extend Molecular Scaffolds with Structural Motifs

    Authors: Krzysztof Maziarz, Henry Jackson-Flux, Pashmina Cameron, Finton Sirockin, Nadine Schneider, Nikolaus Stiefl, Marwin Segler, Marc Brockschmidt

    Abstract: Recent advancements in deep learning-based modeling of molecules promise to accelerate in silico drug discovery. A plethora of generative models is available, building molecules either atom-by-atom and bond-by-bond or fragment-by-fragment. However, many drug discovery projects require a fixed scaffold to be present in the generated molecule, and incorporating that constraint has only recently been…

    Submitted 12 May, 2024; v1 submitted 5 March, 2021; originally announced March 2021.

    Comments: Published at the 10th International Conference on Learning Representations (ICLR 2022)

  7. arXiv:2006.04771  [pdf, other]

    cs.LG stat.ML

    Copy that! Editing Sequences by Copying Spans

    Authors: Sheena Panthaplackel, Miltiadis Allamanis, Marc Brockschmidt

    Abstract: Neural sequence-to-sequence models are finding increasing use in editing of documents, for example in correcting a text document or repairing source code. In this paper, we argue that common seq2seq models (with a facility to copy single tokens) are not a natural fit for such tasks, as they have to explicitly copy each unchanged token. We present an extension of seq2seq models capable of copying e…

    Submitted 14 December, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: Published in AAAI 2021

  8. arXiv:1912.07942  [pdf, other]

    cs.LG cs.CL cs.CR stat.ML

    Analyzing Information Leakage of Updates to Natural Language Models

    Authors: Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople, Victor Rühle, Andrew Paverd, Olga Ohrimenko, Boris Köpf, Marc Brockschmidt

    Abstract: To continuously improve quality and reflect changes in data, machine learning applications have to regularly retrain and update their core models. We show that a differential analysis of language model snapshots before and after an update can reveal a surprising amount of detailed information about changes in the training data. We propose two new metrics, differential score and diffe…

    Submitted 5 August, 2021; v1 submitted 17 December, 2019; originally announced December 2019.

  9. arXiv:1911.01077  [pdf, other]

    cs.LO cs.PL

    Inferring Lower Runtime Bounds for Integer Programs

    Authors: Florian Frohn, Matthias Naaf, Marc Brockschmidt, Jürgen Giesl

    Abstract: We present a technique to infer lower bounds on the worst-case runtime complexity of integer programs, where in contrast to earlier work, our approach is not restricted to tail-recursion. Our technique constructs symbolic representations of program executions using a framework for iterative, under-approximating program simplification. The core of this simplification is a method for (under-approxim…

    Submitted 28 September, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

  10. arXiv:1910.05639  [pdf, other]

    cs.LG cs.SI nlin.CD stat.ML

    Disentangling Interpretable Generative Parameters of Random and Real-World Graphs

    Authors: Niklas Stoehr, Emine Yilmaz, Marc Brockschmidt, Jan Stuehmer

    Abstract: While a wide range of interpretable generative procedures for graphs exist, matching observed graph topologies with such procedures and choices for its parameters remains an open problem. Devising generative models that closely reproduce real-world graphs requires domain knowledge and time-consuming simulation. While existing deep learning approaches rely on less manual modelling, they offer littl…

    Submitted 6 November, 2019; v1 submitted 12 October, 2019; originally announced October 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Workshop on Graph Representation Learning

  11. arXiv:1909.09436  [pdf, other]

    cs.LG cs.IR cs.SE stat.ML

    CodeSearchNet Challenge: Evaluating the State of Semantic Code Search

    Authors: Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, Marc Brockschmidt

    Abstract: Semantic code search is the task of retrieving relevant code given a natural language query. While related to other information retrieval tasks, it requires bridging the gap between the language used in code (often abbreviated and highly technical) and natural language more suitable to describe vague concepts and ideas. To enable evaluation of progress on code search, we are releasing the CodeSe…

    Submitted 8 June, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

    Comments: Updated evaluation numbers after fixing indexing bug

  12. arXiv:1906.12192  [pdf, other]

    cs.LG stat.ML

    GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation

    Authors: Marc Brockschmidt

    Abstract: This paper presents a new Graph Neural Network (GNN) type using feature-wise linear modulation (FiLM). Many standard GNN variants propagate information along the edges of a graph by computing "messages" based only on the representation of the source of each edge. In GNN-FiLM, the representation of the target node of an edge is additionally used to compute a transformation that can be applied to al…

    Submitted 26 June, 2020; v1 submitted 28 June, 2019; originally announced June 2019.

    Comments: As published in ICML 2020 proceedings

  13. arXiv:1906.10816  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL stat.ML

    Program Synthesis and Semantic Parsing with Learned Code Idioms

    Authors: Richard Shin, Miltiadis Allamanis, Marc Brockschmidt, Oleksandr Polozov

    Abstract: Program synthesis of general-purpose source code from natural language specifications is challenging due to the need to reason about high-level patterns in the target program and low-level implementation details at the same time. In this work, we present PATOIS, a system that allows a neural program synthesizer to explicitly interleave high-level and low-level reasoning at every generation step. I…

    Submitted 4 November, 2019; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS) 2019. 13 pages total, 9 pages of main text

  14. arXiv:1811.01824  [pdf, ps, other]

    cs.LG cs.CL cs.SE stat.ML

    Structured Neural Summarization

    Authors: Patrick Fernandes, Miltiadis Allamanis, Marc Brockschmidt

    Abstract: Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input. Based on the promising results of graph neural networks on highly structured data, we develop a framework to extend existing sequence encoders with a graph component that can reason about long-distance relationships in weakly structured data s…

    Submitted 3 February, 2021; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: Published in ICLR 2019 https://openreview.net/forum?id=H1ersoRqtm

  15. arXiv:1810.13337  [pdf, other]

    cs.LG cs.SE stat.ML

    Learning to Represent Edits

    Authors: Pengcheng Yin, Graham Neubig, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt

    Abstract: We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to captu…

    Submitted 22 February, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

    Comments: ICLR 2019

  16. arXiv:1807.03100  [pdf, other]

    cs.CL cs.AI cs.DB cs.LG cs.PL

    Robust Text-to-SQL Generation with Execution-Guided Decoding

    Authors: Chenglong Wang, Kedar Tatwawadi, Marc Brockschmidt, Po-Sen Huang, Yi Mao, Oleksandr Polozov, Rishabh Singh

    Abstract: We consider the problem of neural semantic parsing, which translates natural language questions into executable SQL queries. We introduce a new mechanism, execution guidance, to leverage the semantics of SQL. It detects and excludes faulty programs during the decoding procedure by conditioning on the execution of partially generated program. The mechanism can be used with any autoregressive genera…

    Submitted 12 September, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

  17. arXiv:1805.09076  [pdf, other]

    cs.LG stat.ML

    Constrained Graph Variational Autoencoders for Molecule Design

    Authors: Qi Liu, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt

    Abstract: Graphs are ubiquitous data structures for representing interactions between entities. With an emphasis on the use of graphs to represent chemical molecules, we explore the task of learning to generate graphs that conform to a distribution observed in training data. We propose a variational autoencoder model in which both encoder and decoder are graph-structured. Our decoder assumes a sequential or…

    Submitted 7 March, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

    Comments: 8 pages, 5 figures

  18. arXiv:1805.08490  [pdf, other]

    cs.LG cs.PL stat.ML

    Generative Code Modeling with Graphs

    Authors: Marc Brockschmidt, Miltiadis Allamanis, Alexander L. Gaunt, Oleksandr Polozov

    Abstract: Generative models for source code are an interesting structured prediction problem, requiring to reason about both hard syntactic and semantic constraints as well as about natural, likely programs. We present a novel model for this problem that uses a graph to represent the intermediate state of the generated output. The generative procedure interleaves grammar-driven expansion steps with graph au…

    Submitted 16 April, 2019; v1 submitted 22 May, 2018; originally announced May 2018.

  19. arXiv:1803.06272  [pdf, other]

    cs.LG stat.ML

    Graph Partition Neural Networks for Semi-Supervised Classification

    Authors: Renjie Liao, Marc Brockschmidt, Daniel Tarlow, Alexander L. Gaunt, Raquel Urtasun, Richard Zemel

    Abstract: We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs. GPNNs alternate between locally propagating information between nodes in small subgraphs and globally propagating information between the subgraphs. To efficiently partition graphs, we experiment with several partitioning algorithms and also propose a novel variant…

    Submitted 16 March, 2018; originally announced March 2018.

  20. arXiv:1711.00740  [pdf, other]

    cs.LG cs.AI cs.PL cs.SE

    Learning to Represent Programs with Graphs

    Authors: Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

    Abstract: Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered. We propose to use graphs to represent…

    Submitted 4 May, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

    Comments: Published in ICLR 2018. arXiv admin note: text overlap with arXiv:1705.07867

  21. arXiv:1705.07867  [pdf, ps, other]

    cs.LG cs.SE

    SmartPaste: Learning to Adapt Source Code

    Authors: Miltiadis Allamanis, Marc Brockschmidt

    Abstract: Deep Neural Networks have been shown to succeed at a range of natural language tasks such as machine translation and text summarization. While tasks on source code (i.e., formal languages) have been considered recently, most work in this area does not attempt to capitalize on the unique opportunities offered by its known syntax and structure. In this work, we introduce SmartPaste, a first task that…

    Submitted 22 May, 2017; originally announced May 2017.

  22. arXiv:1612.00817  [pdf, other]

    cs.LG cs.AI cs.NE

    Summary - TerpreT: A Probabilistic Programming Language for Program Induction

    Authors: Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, Daniel Tarlow

    Abstract: We study machine learning formulations of inductive program synthesis; that is, given input-output examples, synthesize source code that maps inputs to corresponding outputs. Our key contribution is TerpreT, a domain-specific language for expressing program synthesis problems. A TerpreT model is composed of a specification of a program representation and an interpreter that describes how programs…

    Submitted 2 December, 2016; originally announced December 2016.

    Comments: 7 pages, 2 figures, 4 tables in 1st Workshop on Neural Abstract Machines & Program Induction (NAMPI), @NIPS 2016

  23. arXiv:1611.02109  [pdf, other]

    cs.LG

    Differentiable Programs with Neural Libraries

    Authors: Alexander L. Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow

    Abstract: We develop a framework for combining differentiable programming languages with neural networks. Using this framework we create end-to-end trainable systems that learn to write interpretable algorithms with perceptual components. We explore the benefits of inductive biases for strong generalization and modularity that come from the program-like structure of our models. In particular, modularity all…

    Submitted 2 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

  24. arXiv:1611.01989  [pdf, other]

    cs.LG

    DeepCoder: Learning to Write Programs

    Authors: Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, Daniel Tarlow

    Abstract: We develop a first line of attack for solving programming competition-style problems from input-output examples using deep learning. The approach is to train a neural network to predict properties of the program that generated the outputs from the inputs. We use the neural network's predictions to augment search techniques from the programming languages community, including enumerative search and…

    Submitted 8 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

    Comments: Submitted to ICLR 2017

  25. arXiv:1611.01988  [pdf, ps, other]

    cs.PL cs.LG

    Differentiable Functional Program Interpreters

    Authors: John K. Feser, Marc Brockschmidt, Alexander L. Gaunt, Daniel Tarlow

    Abstract: Programming by Example (PBE) is the task of inducing computer programs from input-output examples. It can be seen as a type of machine learning where the hypothesis space is the set of legal programs in some programming language. Recent work on differentiable interpreters relaxes the discrete space of programs into a continuous space so that search over programs can be performed using gradient-bas…

    Submitted 2 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

  26. arXiv:1608.04428  [pdf, other]

    cs.LG cs.AI cs.NE

    TerpreT: A Probabilistic Programming Language for Program Induction

    Authors: Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, Daniel Tarlow

    Abstract: We study machine learning formulations of inductive program synthesis; given input-output examples, we try to synthesize source code that maps inputs to corresponding outputs. Our aims are to develop new machine learning approaches based on neural networks and graphical models, and to understand the capabilities of machine learning techniques relative to traditional alternatives, such as those bas…

    Submitted 15 August, 2016; originally announced August 2016.

    Comments: 50 pages, 20 figures, 4 tables

  27. arXiv:1512.08689  [pdf, other]

    cs.LO

    T2: Temporal Property Verification

    Authors: Marc Brockschmidt, Byron Cook, Samin Ishtiaq, Heidy Khlaaf, Nir Piterman

    Abstract: We present the open-source tool T2, the first public release from the TERMINATOR project. T2 has been extended over the past decade to support automatic temporal-logic proving techniques and to handle a general class of user-provided liveness and safety properties. Input can be provided in a native format and in C, via the support of the LLVM compiler framework. We briefly discuss T2's architectur…

    Submitted 6 January, 2016; v1 submitted 29 December, 2015; originally announced December 2015.

    Comments: Full version of TACAS'16 tool paper

  28. arXiv:1511.05493  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Gated Graph Sequence Neural Networks

    Authors: Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel

    Abstract: Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then exten…

    Submitted 22 September, 2017; v1 submitted 17 November, 2015; originally announced November 2015.

    Comments: Published as a conference paper in ICLR 2016. Fixed a typo

  29. arXiv:1507.03851  [pdf, ps, other]

    cs.LO

    Compositional Safety Verification with Max-SMT

    Authors: Marc Brockschmidt, Daniel Larraz, Albert Oliveras, Enric Rodriguez-Carbonell, Albert Rubio

    Abstract: We present an automated compositional program verification technique for safety properties based on conditional inductive invariants. For a given program part (e.g., a single loop) and a postcondition $\varphi$, we show how, using a Max-SMT solver, an inductive invariant together with a precondition can be synthesized so that the precondition ensures the validity of the invariant and that the i…

    Submitted 3 August, 2015; v1 submitted 14 July, 2015; originally announced July 2015.

    Comments: Extended technical report version of the conference paper at FMCAD'15

  30. arXiv:1406.3988  [pdf, ps, other]

    cs.LO

    CTL+FO Verification as Constraint Solving

    Authors: Tewodros A. Beyene, Marc Brockschmidt, Andrey Rybalchenko

    Abstract: Expressing program correctness often requires relating program data throughout (different branches of) an execution. Such properties can be represented using CTL+FO, a logic that allows mixing temporal and first-order quantification. Verifying that a program satisfies a CTL+FO property is a challenging problem that requires both temporal and data reasoning. Temporal quantifiers require discovery o…

    Submitted 21 June, 2014; v1 submitted 16 June, 2014; originally announced June 2014.