Search | arXiv e-print repository

Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics

Authors: Leon Klein, Andrew Y. K. Foong, Tor Erlend Fjelde, Bruno Mlodozeniec, Marc Brockschmidt, Sebastian Nowozin, Frank Noé, Ryota Tomioka

Abstract: Molecular dynamics (MD) simulation is a widely used technique to simulate molecular systems, most commonly at the all-atom resolution where equations of motion are integrated with timesteps on the order of femtoseconds ($1\textrm{fs}=10^{-15}\textrm{s}$). MD is often used to compute equilibrium properties, which requires sampling from an equilibrium distribution such as the Boltzmann distribution.… ▽ More Molecular dynamics (MD) simulation is a widely used technique to simulate molecular systems, most commonly at the all-atom resolution where equations of motion are integrated with timesteps on the order of femtoseconds ($1\textrm{fs}=10^{-15}\textrm{s}$). MD is often used to compute equilibrium properties, which requires sampling from an equilibrium distribution such as the Boltzmann distribution. However, many important processes, such as binding and folding, occur over timescales of milliseconds or beyond, and cannot be efficiently sampled with conventional MD. Furthermore, new MD simulations need to be performed for each molecular system studied. We present Timewarp, an enhanced sampling method which uses a normalising flow as a proposal distribution in a Markov chain Monte Carlo method targeting the Boltzmann distribution. The flow is trained offline on MD trajectories and learns to make large steps in time, simulating the molecular dynamics of $10^{5} - 10^{6}\:\textrm{fs}$. Crucially, Timewarp is transferable between molecular systems: once trained, we show that it generalises to unseen small peptides (2-4 amino acids) at all-atom resolution, exploring their metastable states and providing wall-clock acceleration of sampling compared to standard MD. Our method constitutes an important step towards general, transferable algorithms for accelerating MD. △ Less

Submitted 1 December, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2206.06986 [pdf, other]

Exploring Representation of Horn Clauses using GNNs (Extended Technical Report)

Authors: Chencheng Liang, Philipp Rümmer, Marc Brockschmidt

Abstract: Learning program semantics from raw source code is challenging due to the complexity of real-world programming language syntax and due to the difficulty of reconstructing long-distance relational information implicitly represented in programs using identifiers. Addressing the first point, we consider Constrained Horn Clauses (CHCs) as a standard representation of program verification problems, pro… ▽ More Learning program semantics from raw source code is challenging due to the complexity of real-world programming language syntax and due to the difficulty of reconstructing long-distance relational information implicitly represented in programs using identifiers. Addressing the first point, we consider Constrained Horn Clauses (CHCs) as a standard representation of program verification problems, providing a simple and programming language-independent syntax. For the second challenge, we explore graph representations of CHCs, and propose a new Relational Hypergraph Neural Network (R-HyGNN) architecture to learn program features. We introduce two different graph representations of CHCs. One is called constraint graph (CG), and emphasizes syntactic information of CHCs by translating the symbols and their relations in CHCs as typed nodes and binary edges, respectively, and constructing the constraints as abstract syntax trees. The second one is called control- and data-flow hypergraph (CDHG), and emphasizes semantic information of CHCs by representing the control and data flow through ternary hyperedges. We then propose a new GNN architecture, R-HyGNN, extending Relational Graph Convolutional Networks, to handle hypergraphs. To evaluate the ability of R-HyGNN to extract semantic information from programs, we use R-HyGNNs to train models on the two graph representations, and on five proxy tasks with increasing difficulty, using benchmarks from CHC-COMP 2021 as training data. The most difficult proxy task requires the model to predict the occurrence of clauses in counter-examples, which subsumes satisfiability of CHCs. CDHG achieves 90.59% accuracy in this task. Furthermore, R-HyGNN has perfect predictions on one of the graphs consisting of more than 290 clauses. Overall, our experiments indicate that R-HyGNN can capture intricate program features for guiding verification problems. △ Less

Submitted 26 July, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

arXiv:2201.12113 [pdf, other]

HEAT: Hyperedge Attention Networks

Authors: Dobrik Georgiev, Marc Brockschmidt, Miltiadis Allamanis

Abstract: Learning from structured data is a core machine learning task. Commonly, such data is represented as graphs, which normally only consider (typed) binary relationships between pairs of nodes. This is a substantial limitation for many domains with highly-structured data. One important such domain is source code, where hypergraph-based representations can better capture the semantically rich and stru… ▽ More Learning from structured data is a core machine learning task. Commonly, such data is represented as graphs, which normally only consider (typed) binary relationships between pairs of nodes. This is a substantial limitation for many domains with highly-structured data. One important such domain is source code, where hypergraph-based representations can better capture the semantically rich and structured nature of code. In this work, we present HEAT, a neural model capable of representing typed and qualified hypergraphs, where each hyperedge explicitly qualifies how participating nodes contribute. It can be viewed as a generalization of both message passing neural networks and Transformers. We evaluate HEAT on knowledge base completion and on bug detection and repair using a novel hypergraph representation of programs. In both settings, it outperforms strong baselines, indicating its power and generality. △ Less

Submitted 5 September, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

Comments: Published in TMLR

arXiv:2106.10158 [pdf, other]

Learning to Complete Code with Sketches

Authors: Daya Guo, Alexey Svyatkovskiy, Jian Yin, Nan Duan, Marc Brockschmidt, Miltiadis Allamanis

Abstract: Code completion is usually cast as a language modelling problem, i.e., continuing an input in a left-to-right fashion. However, in practice, some parts of the completion (e.g., string literals) may be very hard to predict, whereas subsequent parts directly follow from the context. To handle this, we instead consider the scenario of generating code completions with "holes" inserted in places where… ▽ More Code completion is usually cast as a language modelling problem, i.e., continuing an input in a left-to-right fashion. However, in practice, some parts of the completion (e.g., string literals) may be very hard to predict, whereas subsequent parts directly follow from the context. To handle this, we instead consider the scenario of generating code completions with "holes" inserted in places where a model is uncertain. We develop Grammformer, a Transformer-based model that guides code generation by the programming language grammar, and compare it to a variety of more standard sequence models. We train the models on code completion for C# and Python given partial code context. To evaluate models, we consider both ROUGE as well as a new metric RegexAcc that measures success of generating completions matching long outputs with as few holes as possible. In our experiments, Grammformer generates 10-50% more accurate completions compared to traditional generative models and 37-50% longer sketches compared to sketch-generating baselines trained with similar techniques. △ Less

Submitted 23 January, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: Published in ICLR 2022

arXiv:2105.12787 [pdf, other]

Self-Supervised Bug Detection and Repair

Authors: Miltiadis Allamanis, Henry Jackson-Flux, Marc Brockschmidt

Abstract: Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a… ▽ More Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a detector model that learns to detect and repair bugs in code, (2) a selector model that learns to create buggy code for the detector to use as training data. A Python implementation of BugLab improves by up to 30% upon baseline methods on a test dataset of 2374 real-life bugs and finds 19 previously unknown bugs in open-source software. △ Less

Submitted 16 November, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

Comments: Published in NeurIPS 2021

arXiv:2103.03864 [pdf, other]

Learning to Extend Molecular Scaffolds with Structural Motifs

Authors: Krzysztof Maziarz, Henry Jackson-Flux, Pashmina Cameron, Finton Sirockin, Nadine Schneider, Nikolaus Stiefl, Marwin Segler, Marc Brockschmidt

Abstract: Recent advancements in deep learning-based modeling of molecules promise to accelerate in silico drug discovery. A plethora of generative models is available, building molecules either atom-by-atom and bond-by-bond or fragment-by-fragment. However, many drug discovery projects require a fixed scaffold to be present in the generated molecule, and incorporating that constraint has only recently been… ▽ More Recent advancements in deep learning-based modeling of molecules promise to accelerate in silico drug discovery. A plethora of generative models is available, building molecules either atom-by-atom and bond-by-bond or fragment-by-fragment. However, many drug discovery projects require a fixed scaffold to be present in the generated molecule, and incorporating that constraint has only recently been explored. Here, we propose MoLeR, a graph-based model that naturally supports scaffolds as initial seed of the generative procedure, which is possible because it is not conditioned on the generation history. Our experiments show that MoLeR performs comparably to state-of-the-art methods on unconstrained molecular optimization tasks, and outperforms them on scaffold-based tasks, while being an order of magnitude faster to train and sample from than existing approaches. Furthermore, we show the influence of a number of seemingly minor design choices on the overall performance. △ Less

Submitted 12 May, 2024; v1 submitted 5 March, 2021; originally announced March 2021.

Comments: Published at the 10th International Conference on Learning Representations (ICLR 2022)

arXiv:2006.04771 [pdf, other]

Copy that! Editing Sequences by Copying Spans

Authors: Sheena Panthaplackel, Miltiadis Allamanis, Marc Brockschmidt

Abstract: Neural sequence-to-sequence models are finding increasing use in editing of documents, for example in correcting a text document or repairing source code. In this paper, we argue that common seq2seq models (with a facility to copy single tokens) are not a natural fit for such tasks, as they have to explicitly copy each unchanged token. We present an extension of seq2seq models capable of copying e… ▽ More Neural sequence-to-sequence models are finding increasing use in editing of documents, for example in correcting a text document or repairing source code. In this paper, we argue that common seq2seq models (with a facility to copy single tokens) are not a natural fit for such tasks, as they have to explicitly copy each unchanged token. We present an extension of seq2seq models capable of copying entire spans of the input to the output in one step, greatly reducing the number of decisions required during inference. This extension means that there are now many ways of generating the same output, which we handle by deriving a new objective for training and a variation of beam search for inference that explicitly handles this problem. In our experiments on a range of editing tasks of natural language and source code, we show that our new model consistently outperforms simpler baselines. △ Less

Submitted 14 December, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: Published in AAAI 2021

arXiv:1912.07942 [pdf, other]

doi 10.1145/3372297.3417880

Analyzing Information Leakage of Updates to Natural Language Models

Authors: Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople, Victor Rühle, Andrew Paverd, Olga Ohrimenko, Boris Köpf, Marc Brockschmidt

Abstract: To continuously improve quality and reflect changes in data, machine learning applications have to regularly retrain and update their core models. We show that a differential analysis of language model snapshots before and after an update can reveal a surprising amount of detailed information about changes in the training data. We propose two new metrics---\emph{differential score} and \emph{diffe… ▽ More To continuously improve quality and reflect changes in data, machine learning applications have to regularly retrain and update their core models. We show that a differential analysis of language model snapshots before and after an update can reveal a surprising amount of detailed information about changes in the training data. We propose two new metrics---\emph{differential score} and \emph{differential rank}---for analyzing the leakage due to updates of natural language models. We perform leakage analysis using these metrics across models trained on several different datasets using different methods and configurations. We discuss the privacy implications of our findings, propose mitigation strategies and evaluate their effect. △ Less

Submitted 5 August, 2021; v1 submitted 17 December, 2019; originally announced December 2019.

arXiv:1911.01077 [pdf, other]

Inferring Lower Runtime Bounds for Integer Programs

Authors: Florian Frohn, Matthias Naaf, Marc Brockschmidt, Jürgen Giesl

Abstract: We present a technique to infer lower bounds on the worst-case runtime complexity of integer programs, where in contrast to earlier work, our approach is not restricted to tail-recursion. Our technique constructs symbolic representations of program executions using a framework for iterative, under-approximating program simplification. The core of this simplification is a method for (under-approxim… ▽ More We present a technique to infer lower bounds on the worst-case runtime complexity of integer programs, where in contrast to earlier work, our approach is not restricted to tail-recursion. Our technique constructs symbolic representations of program executions using a framework for iterative, under-approximating program simplification. The core of this simplification is a method for (under-approximating) program acceleration based on recurrence solving and a variation of ranking functions. Afterwards, we deduce asymptotic lower bounds from the resulting simplified programs using a special-purpose calculus and an SMT encoding. We implemented our technique in our tool LoAT and show that it infers non-trivial lower bounds for a large class of examples. △ Less

Submitted 28 September, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

arXiv:1910.05639 [pdf, other]

Disentangling Interpretable Generative Parameters of Random and Real-World Graphs

Authors: Niklas Stoehr, Emine Yilmaz, Marc Brockschmidt, Jan Stuehmer

Abstract: While a wide range of interpretable generative procedures for graphs exist, matching observed graph topologies with such procedures and choices for its parameters remains an open problem. Devising generative models that closely reproduce real-world graphs requires domain knowledge and time-consuming simulation. While existing deep learning approaches rely on less manual modelling, they offer littl… ▽ More While a wide range of interpretable generative procedures for graphs exist, matching observed graph topologies with such procedures and choices for its parameters remains an open problem. Devising generative models that closely reproduce real-world graphs requires domain knowledge and time-consuming simulation. While existing deep learning approaches rely on less manual modelling, they offer little interpretability. This work approaches graph generation (decoding) as the inverse of graph compression (encoding). We show that in a disentanglement-focused deep autoencoding framework, specifically Beta-Variational Autoencoders (Beta-VAE), choices of generative procedures and their parameters arise naturally in the latent space. Our model is capable of learning disentangled, interpretable latent variables that represent the generative parameters of procedurally generated random graphs and real-world graphs. The degree of disentanglement is quantitatively measured using the Mutual Information Gap (MIG). When training our Beta-VAE model on ER random graphs, its latent variables have a near one-to-one map** to the ER random graph parameters n and p. We deploy the model to analyse the correlation between graph topology and node attributes measuring their mutual dependence without handpicking topological properties. △ Less

Submitted 6 November, 2019; v1 submitted 12 October, 2019; originally announced October 2019.

Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Workshop on Graph Representation Learning

arXiv:1909.09436 [pdf, other]

CodeSearchNet Challenge: Evaluating the State of Semantic Code Search

Authors: Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, Marc Brockschmidt

Abstract: Semantic code search is the task of retrieving relevant code given a natural language query. While related to other information retrieval tasks, it requires bridging the gap between the language used in code (often abbreviated and highly technical) and natural language more suitable to describe vague concepts and ideas. To enable evaluation of progress on code search, we are releasing the CodeSe… ▽ More Semantic code search is the task of retrieving relevant code given a natural language query. While related to other information retrieval tasks, it requires bridging the gap between the language used in code (often abbreviated and highly technical) and natural language more suitable to describe vague concepts and ideas. To enable evaluation of progress on code search, we are releasing the CodeSearchNet Corpus and are presenting the CodeSearchNet Challenge, which consists of 99 natural language queries with about 4k expert relevance annotations of likely results from CodeSearchNet Corpus. The corpus contains about 6 million functions from open-source code spanning six programming languages (Go, Java, JavaScript, PHP, Python, and Ruby). The CodeSearchNet Corpus also contains automatically generated query-like natural language for 2 million functions, obtained from mechanically scra** and preprocessing associated function documentation. In this article, we describe the methodology used to obtain the corpus and expert labels, as well as a number of simple baseline solutions for the task. We hope that CodeSearchNet Challenge encourages researchers and practitioners to study this interesting task further and will host a competition and leaderboard to track the progress on the challenge. We are also keen on extending CodeSearchNet Challenge to more queries and programming languages in the future. △ Less

Submitted 8 June, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

Comments: Updated evaluation numbers after fixing indexing bug

arXiv:1906.12192 [pdf, other]

GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation

Authors: Marc Brockschmidt

Abstract: This paper presents a new Graph Neural Network (GNN) type using feature-wise linear modulation (FiLM). Many standard GNN variants propagate information along the edges of a graph by computing "messages" based only on the representation of the source of each edge. In GNN-FiLM, the representation of the target node of an edge is additionally used to compute a transformation that can be applied to al… ▽ More This paper presents a new Graph Neural Network (GNN) type using feature-wise linear modulation (FiLM). Many standard GNN variants propagate information along the edges of a graph by computing "messages" based only on the representation of the source of each edge. In GNN-FiLM, the representation of the target node of an edge is additionally used to compute a transformation that can be applied to all incoming messages, allowing feature-wise modulation of the passed information. Results of experiments comparing different GNN architectures on three tasks from the literature are presented, based on re-implementations of baseline methods. Hyperparameters for all methods were found using extensive search, yielding somewhat surprising results: differences between baseline models are smaller than reported in the literature. Nonetheless, GNN-FiLM outperforms baseline methods on a regression task on molecular graphs and performs competitively on other tasks. △ Less

Submitted 26 June, 2020; v1 submitted 28 June, 2019; originally announced June 2019.

Comments: As published in ICML 2020 proceedings

arXiv:1906.10816 [pdf, other]

Program Synthesis and Semantic Parsing with Learned Code Idioms

Authors: Richard Shin, Miltiadis Allamanis, Marc Brockschmidt, Oleksandr Polozov

Abstract: Program synthesis of general-purpose source code from natural language specifications is challenging due to the need to reason about high-level patterns in the target program and low-level implementation details at the same time. In this work, we present PATOIS, a system that allows a neural program synthesizer to explicitly interleave high-level and low-level reasoning at every generation step. I… ▽ More Program synthesis of general-purpose source code from natural language specifications is challenging due to the need to reason about high-level patterns in the target program and low-level implementation details at the same time. In this work, we present PATOIS, a system that allows a neural program synthesizer to explicitly interleave high-level and low-level reasoning at every generation step. It accomplishes this by automatically mining common code idioms from a given corpus, incorporating them into the underlying language for neural synthesis, and training a tree-based neural synthesizer to use these idioms during code generation. We evaluate PATOIS on two complex semantic parsing datasets and show that using learned code idioms improves the synthesizer's accuracy. △ Less

Submitted 4 November, 2019; v1 submitted 25 June, 2019; originally announced June 2019.

Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS) 2019. 13 pages total, 9 pages of main text

arXiv:1811.01824 [pdf, ps, other]

Structured Neural Summarization

Authors: Patrick Fernandes, Miltiadis Allamanis, Marc Brockschmidt

Abstract: Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input. Based on the promising results of graph neural networks on highly structured data, we develop a framework to extend existing sequence encoders with a graph component that can reason about long-distance relationships in weakly structured data s… ▽ More Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input. Based on the promising results of graph neural networks on highly structured data, we develop a framework to extend existing sequence encoders with a graph component that can reason about long-distance relationships in weakly structured data such as text. In an extensive evaluation, we show that the resulting hybrid sequence-graph models outperform both pure sequence models as well as pure graph models on a range of summarization tasks. △ Less

Submitted 3 February, 2021; v1 submitted 5 November, 2018; originally announced November 2018.

Comments: Published in ICLR 2019 https://openreview.net/forum?id=H1ersoRqtm

arXiv:1810.13337 [pdf, other]

Learning to Represent Edits

Authors: Pengcheng Yin, Graham Neubig, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt

Abstract: We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to captu… ▽ More We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to capture the structure and semantics of edits. We hope that this interesting task and data source will inspire other researchers to work further on this problem. △ Less

Submitted 22 February, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

Comments: ICLR 2019

arXiv:1807.03100 [pdf, other]

Robust Text-to-SQL Generation with Execution-Guided Decoding

Authors: Chenglong Wang, Kedar Tatwawadi, Marc Brockschmidt, Po-Sen Huang, Yi Mao, Oleksandr Polozov, Rishabh Singh

Abstract: We consider the problem of neural semantic parsing, which translates natural language questions into executable SQL queries. We introduce a new mechanism, execution guidance, to leverage the semantics of SQL. It detects and excludes faulty programs during the decoding procedure by conditioning on the execution of partially generated program. The mechanism can be used with any autoregressive genera… ▽ More We consider the problem of neural semantic parsing, which translates natural language questions into executable SQL queries. We introduce a new mechanism, execution guidance, to leverage the semantics of SQL. It detects and excludes faulty programs during the decoding procedure by conditioning on the execution of partially generated program. The mechanism can be used with any autoregressive generative model, which we demonstrate on four state-of-the-art recurrent or template-based semantic parsing models. We demonstrate that execution guidance universally improves model performance on various text-to-SQL datasets with different scales and query complexity: WikiSQL, ATIS, and GeoQuery. As a result, we achieve new state-of-the-art execution accuracy of 83.8% on WikiSQL. △ Less

Submitted 12 September, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

arXiv:1805.09076 [pdf, other]

Constrained Graph Variational Autoencoders for Molecule Design

Authors: Qi Liu, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt

Abstract: Graphs are ubiquitous data structures for representing interactions between entities. With an emphasis on the use of graphs to represent chemical molecules, we explore the task of learning to generate graphs that conform to a distribution observed in training data. We propose a variational autoencoder model in which both encoder and decoder are graph-structured. Our decoder assumes a sequential or… ▽ More Graphs are ubiquitous data structures for representing interactions between entities. With an emphasis on the use of graphs to represent chemical molecules, we explore the task of learning to generate graphs that conform to a distribution observed in training data. We propose a variational autoencoder model in which both encoder and decoder are graph-structured. Our decoder assumes a sequential ordering of graph extension steps and we discuss and analyze design choices that mitigate the potential downsides of this linearization. Experiments compare our approach with a wide range of baselines on the molecule generation task and show that our method is more successful at matching the statistics of the original dataset on semantically important metrics. Furthermore, we show that by using appropriate sha** of the latent space, our model allows us to design molecules that are (locally) optimal in desired properties. △ Less

Submitted 7 March, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

Comments: 8 pages, 5 figures

arXiv:1805.08490 [pdf, other]

Generative Code Modeling with Graphs

Authors: Marc Brockschmidt, Miltiadis Allamanis, Alexander L. Gaunt, Oleksandr Polozov

Abstract: Generative models for source code are an interesting structured prediction problem, requiring to reason about both hard syntactic and semantic constraints as well as about natural, likely programs. We present a novel model for this problem that uses a graph to represent the intermediate state of the generated output. The generative procedure interleaves grammar-driven expansion steps with graph au… ▽ More Generative models for source code are an interesting structured prediction problem, requiring to reason about both hard syntactic and semantic constraints as well as about natural, likely programs. We present a novel model for this problem that uses a graph to represent the intermediate state of the generated output. The generative procedure interleaves grammar-driven expansion steps with graph augmentation and neural message passing steps. An experimental evaluation shows that our new model can generate semantically meaningful expressions, outperforming a range of strong baselines. △ Less

Submitted 16 April, 2019; v1 submitted 22 May, 2018; originally announced May 2018.

arXiv:1803.06272 [pdf, other]

Graph Partition Neural Networks for Semi-Supervised Classification

Authors: Renjie Liao, Marc Brockschmidt, Daniel Tarlow, Alexander L. Gaunt, Raquel Urtasun, Richard Zemel

Abstract: We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs. GPNNs alternate between locally propagating information between nodes in small subgraphs and globally propagating information between the subgraphs. To efficiently partition graphs, we experiment with several partitioning algorithms and also propose a novel variant… ▽ More We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs. GPNNs alternate between locally propagating information between nodes in small subgraphs and globally propagating information between the subgraphs. To efficiently partition graphs, we experiment with several partitioning algorithms and also propose a novel variant for fast processing of large scale graphs. We extensively test our model on a variety of semi-supervised node classification tasks. Experimental results indicate that GPNNs are either superior or comparable to state-of-the-art methods on a wide variety of datasets for graph-based semi-supervised classification. We also show that GPNNs can achieve similar performance as standard GNNs with fewer propagation steps. △ Less

Submitted 16 March, 2018; originally announced March 2018.

arXiv:1711.00740 [pdf, other]

Learning to Represent Programs with Graphs

Authors: Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

Abstract: Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered. We propose to use graphs to represent… ▽ More Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered. We propose to use graphs to represent both the syntactic and semantic structure of code and use graph-based deep learning methods to learn to reason over program structures. In this work, we present how to construct graphs from source code and how to scale Gated Graph Neural Networks training to such large graphs. We evaluate our method on two tasks: VarNaming, in which a network attempts to predict the name of a variable given its usage, and VarMisuse, in which the network learns to reason about selecting the correct variable that should be used at a given program location. Our comparison to methods that use less structured program representations shows the advantages of modeling known structure, and suggests that our models learn to infer meaningful names and to solve the VarMisuse task in many cases. Additionally, our testing showed that VarMisuse identifies a number of bugs in mature open-source projects. △ Less

Submitted 4 May, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

Comments: Published in ICLR 2018. arXiv admin note: text overlap with arXiv:1705.07867

arXiv:1705.07867 [pdf, ps, other]

SmartPaste: Learning to Adapt Source Code

Authors: Miltiadis Allamanis, Marc Brockschmidt

Abstract: Deep Neural Networks have been shown to succeed at a range of natural language tasks such as machine translation and text summarization. While tasks on source code (ie, formal languages) have been considered recently, most work in this area does not attempt to capitalize on the unique opportunities offered by its known syntax and structure. In this work, we introduce SmartPaste, a first task that… ▽ More Deep Neural Networks have been shown to succeed at a range of natural language tasks such as machine translation and text summarization. While tasks on source code (ie, formal languages) have been considered recently, most work in this area does not attempt to capitalize on the unique opportunities offered by its known syntax and structure. In this work, we introduce SmartPaste, a first task that requires to use such information. The task is a variant of the program repair problem that requires to adapt a given (pasted) snippet of code to surrounding, existing source code. As first solutions, we design a set of deep neural models that learn to represent the context of each variable location and variable usage in a data flow-sensitive way. Our evaluation suggests that our models can learn to solve the SmartPaste task in many cases, achieving 58.6% accuracy, while learning meaningful representation of variable usages. △ Less

Submitted 22 May, 2017; originally announced May 2017.

arXiv:1612.00817 [pdf, other]

Summary - TerpreT: A Probabilistic Programming Language for Program Induction

Authors: Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, Daniel Tarlow

Abstract: We study machine learning formulations of inductive program synthesis; that is, given input-output examples, synthesize source code that maps inputs to corresponding outputs. Our key contribution is TerpreT, a domain-specific language for expressing program synthesis problems. A TerpreT model is composed of a specification of a program representation and an interpreter that describes how programs… ▽ More We study machine learning formulations of inductive program synthesis; that is, given input-output examples, synthesize source code that maps inputs to corresponding outputs. Our key contribution is TerpreT, a domain-specific language for expressing program synthesis problems. A TerpreT model is composed of a specification of a program representation and an interpreter that describes how programs map inputs to outputs. The inference task is to observe a set of input-output examples and infer the underlying program. From a TerpreT model we automatically perform inference using four different back-ends: gradient descent (thus each TerpreT model can be seen as defining a differentiable interpreter), linear program (LP) relaxations for graphical models, discrete satisfiability solving, and the Sketch program synthesis system. TerpreT has two main benefits. First, it enables rapid exploration of a range of domains, program representations, and interpreter models. Second, it separates the model specification from the inference algorithm, allowing proper comparisons between different approaches to inference. We illustrate the value of TerpreT by develo** several interpreter models and performing an extensive empirical comparison between alternative inference algorithms on a variety of program models. To our knowledge, this is the first work to compare gradient-based search over program space to traditional search-based alternatives. Our key empirical finding is that constraint solvers dominate the gradient descent and LP-based formulations. This is a workshop summary of a longer report at arXiv:1608.04428 △ Less

Submitted 2 December, 2016; originally announced December 2016.

Comments: 7 pages, 2 figures, 4 tables in 1st Workshop on Neural Abstract Machines & Program Induction (NAMPI), @NIPS 2016

arXiv:1611.02109 [pdf, other]

Differentiable Programs with Neural Libraries

Authors: Alexander L. Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow

Abstract: We develop a framework for combining differentiable programming languages with neural networks. Using this framework we create end-to-end trainable systems that learn to write interpretable algorithms with perceptual components. We explore the benefits of inductive biases for strong generalization and modularity that come from the program-like structure of our models. In particular, modularity all… ▽ More We develop a framework for combining differentiable programming languages with neural networks. Using this framework we create end-to-end trainable systems that learn to write interpretable algorithms with perceptual components. We explore the benefits of inductive biases for strong generalization and modularity that come from the program-like structure of our models. In particular, modularity allows us to learn a library of (neural) functions which grows and improves as more tasks are solved. Empirically, we show that this leads to lifelong learning systems that transfer knowledge to new tasks more effectively than baselines. △ Less

Submitted 2 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

arXiv:1611.01989 [pdf, other]

DeepCoder: Learning to Write Programs

Authors: Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, Daniel Tarlow

Abstract: We develop a first line of attack for solving programming competition-style problems from input-output examples using deep learning. The approach is to train a neural network to predict properties of the program that generated the outputs from the inputs. We use the neural network's predictions to augment search techniques from the programming languages community, including enumerative search and… ▽ More We develop a first line of attack for solving programming competition-style problems from input-output examples using deep learning. The approach is to train a neural network to predict properties of the program that generated the outputs from the inputs. We use the neural network's predictions to augment search techniques from the programming languages community, including enumerative search and an SMT-based solver. Empirically, we show that our approach leads to an order of magnitude speedup over the strong non-augmented baselines and a Recurrent Neural Network approach, and that we are able to solve problems of difficulty comparable to the simplest problems on programming competition websites. △ Less

Submitted 8 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

Comments: Submitted to ICLR 2017

arXiv:1611.01988 [pdf, ps, other]

Differentiable Functional Program Interpreters

Authors: John K. Feser, Marc Brockschmidt, Alexander L. Gaunt, Daniel Tarlow

Abstract: Programming by Example (PBE) is the task of inducing computer programs from input-output examples. It can be seen as a type of machine learning where the hypothesis space is the set of legal programs in some programming language. Recent work on differentiable interpreters relaxes the discrete space of programs into a continuous space so that search over programs can be performed using gradient-bas… ▽ More Programming by Example (PBE) is the task of inducing computer programs from input-output examples. It can be seen as a type of machine learning where the hypothesis space is the set of legal programs in some programming language. Recent work on differentiable interpreters relaxes the discrete space of programs into a continuous space so that search over programs can be performed using gradient-based optimization. While conceptually powerful, so far differentiable interpreter-based program synthesis has only been capable of solving very simple problems. In this work, we study modeling choices that arise when constructing a differentiable programming language and their impact on the success of synthesis. The main motivation for the modeling choices comes from functional programming: we study the effect of memory allocation schemes, immutable data, type systems, and built-in control-flow structures. Empirically we show that incorporating functional programming ideas into differentiable programming languages allows us to learn much more complex programs than is possible with existing differentiable languages. △ Less

Submitted 2 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

arXiv:1608.04428 [pdf, other]

TerpreT: A Probabilistic Programming Language for Program Induction

Authors: Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, Daniel Tarlow

Abstract: We study machine learning formulations of inductive program synthesis; given input-output examples, we try to synthesize source code that maps inputs to corresponding outputs. Our aims are to develop new machine learning approaches based on neural networks and graphical models, and to understand the capabilities of machine learning techniques relative to traditional alternatives, such as those bas… ▽ More We study machine learning formulations of inductive program synthesis; given input-output examples, we try to synthesize source code that maps inputs to corresponding outputs. Our aims are to develop new machine learning approaches based on neural networks and graphical models, and to understand the capabilities of machine learning techniques relative to traditional alternatives, such as those based on constraint solving from the programming languages community. Our key contribution is the proposal of TerpreT, a domain-specific language for expressing program synthesis problems. TerpreT is similar to a probabilistic programming language: a model is composed of a specification of a program representation (declarations of random variables) and an interpreter describing how programs map inputs to outputs (a model connecting unknowns to observations). The inference task is to observe a set of input-output examples and infer the underlying program. TerpreT has two main benefits. First, it enables rapid exploration of a range of domains, program representations, and interpreter models. Second, it separates the model specification from the inference algorithm, allowing like-to-like comparisons between different approaches to inference. From a single TerpreT specification we automatically perform inference using four different back-ends. These are based on gradient descent, linear program (LP) relaxations for graphical models, discrete satisfiability solving, and the Sketch program synthesis system. We illustrate the value of TerpreT by develo** several interpreter models and performing an empirical comparison between alternative inference algorithms. Our key empirical finding is that constraint solvers dominate the gradient descent and LP-based formulations. We conclude with suggestions for the machine learning community to make progress on program synthesis. △ Less

Submitted 15 August, 2016; originally announced August 2016.

Comments: 50 pages, 20 figures, 4 tables

arXiv:1512.08689 [pdf, other]

T2: Temporal Property Verification

Authors: Marc Brockschmidt, Byron Cook, Samin Ishtiaq, Heidy Khlaaf, Nir Piterman

Abstract: We present the open-source tool T2, the first public release from the TERMINATOR project. T2 has been extended over the past decade to support automatic temporal-logic proving techniques and to handle a general class of user-provided liveness and safety properties. Input can be provided in a native format and in C, via the support of the LLVM compiler framework. We briefly discuss T2's architectur… ▽ More We present the open-source tool T2, the first public release from the TERMINATOR project. T2 has been extended over the past decade to support automatic temporal-logic proving techniques and to handle a general class of user-provided liveness and safety properties. Input can be provided in a native format and in C, via the support of the LLVM compiler framework. We briefly discuss T2's architecture, its underlying techniques, and conclude with an experimental illustration of its competitiveness and directions for future extensions. △ Less

Submitted 6 January, 2016; v1 submitted 29 December, 2015; originally announced December 2015.

Comments: Full version of TACAS'16 tool paper

arXiv:1511.05493 [pdf, other]

Gated Graph Sequence Neural Networks

Authors: Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel

Abstract: Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then exten… ▽ More Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then extend to output sequences. The result is a flexible and broadly useful class of neural network models that has favorable inductive biases relative to purely sequence-based models (e.g., LSTMs) when the problem is graph-structured. We demonstrate the capabilities on some simple AI (bAbI) and graph algorithm learning tasks. We then show it achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures. △ Less

Submitted 22 September, 2017; v1 submitted 17 November, 2015; originally announced November 2015.

Comments: Published as a conference paper in ICLR 2016. Fixed a typo

arXiv:1507.03851 [pdf, ps, other]

Compositional Safety Verification with Max-SMT

Authors: Marc Brockschmidt, Daniel Larraz, Albert Oliveras, Enric Rodriguez-Carbonell, Albert Rubio

Abstract: We present an automated compositional program verification technique for safety properties based on conditional inductive invariants. For a given program part (e.g., a single loop) and a postcondition $\varphi$, we show how to, using a Max-SMT solver, an inductive invariant together with a precondition can be synthesized so that the precondition ensures the validity of the invariant and that the i… ▽ More We present an automated compositional program verification technique for safety properties based on conditional inductive invariants. For a given program part (e.g., a single loop) and a postcondition $\varphi$, we show how to, using a Max-SMT solver, an inductive invariant together with a precondition can be synthesized so that the precondition ensures the validity of the invariant and that the invariant implies $\varphi$. From this, we build a bottom-up program verification framework that propagates preconditions of small program parts as postconditions for preceding program parts. The method recovers from failures to prove the validity of a precondition, using the obtained intermediate results to restrict the search space for further proof attempts. As only small program parts need to be handled at a time, our method is scalable and distributable. The derived conditions can be viewed as implicit contracts between different parts of the program, and thus enable an incremental program analysis. △ Less

Submitted 3 August, 2015; v1 submitted 14 July, 2015; originally announced July 2015.

Comments: Extended technical report version of the conference paper at FMCAD'15

arXiv:1406.3988 [pdf, ps, other]

CTL+FO Verification as Constraint Solving

Authors: Tewodros A. Beyene, Marc Brockschmidt, Andrey Rybalchenko

Abstract: Expressing program correctness often requires relating program data throughout (different branches of) an execution. Such properties can be represented using CTL+FO, a logic that allows mixing temporal and first-order quantification. Verifying that a program satisfies a CTL+FO property is a challenging problem that requires both temporal and data reasoning. Temporal quantifiers require discovery o… ▽ More Expressing program correctness often requires relating program data throughout (different branches of) an execution. Such properties can be represented using CTL+FO, a logic that allows mixing temporal and first-order quantification. Verifying that a program satisfies a CTL+FO property is a challenging problem that requires both temporal and data reasoning. Temporal quantifiers require discovery of invariants and ranking functions, while first-order quantifiers demand instantiation techniques. In this paper, we present a constraint-based method for proving CTL+FO properties automatically. Our method makes the interplay between the temporal and first-order quantification explicit in a constraint encoding that combines recursion and existential quantification. By integrating this constraint encoding with an off-the-shelf solver we obtain an automatic verifier for CTL+FO. △ Less

Submitted 21 June, 2014; v1 submitted 16 June, 2014; originally announced June 2014.

Showing 1–30 of 30 results for author: Brockschmidt, M