Showing 1–24 of 24 results for author: Radul, A

  1. arXiv:2210.04729

    cs.PL

    The Foil: Capture-Avoiding Substitution With No Sharp Edges

    Authors: Dougal Maclaurin, Alexey Radul, Adam Paszke

    Abstract: Correctly manipulating program terms in a compiler is surprisingly difficult because of the need to avoid name capture. The rapier from "Secrets of the Glasgow Haskell Compiler inliner" is a cutting-edge technique for fast, stateless capture-avoiding substitution for expressions represented with explicit names. It is, however, a sharp tool: its invariants are tricky and need to be maintained throu…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: Presented at IFL 2022

  2. arXiv:2204.10923

    cs.PL

    You Only Linearize Once: Tangents Transpose to Gradients

    Authors: Alexey Radul, Adam Paszke, Roy Frostig, Matthew Johnson, Dougal Maclaurin

    Abstract: Automatic differentiation (AD) is conventionally understood as a family of distinct algorithms, rooted in two "modes" -- forward and reverse -- which are typically presented (and implemented) separately. Can there be only one? Following up on the AD systems developed in the JAX and Dex projects, we formalize a decomposition of reverse-mode AD into (i) forward-mode AD followed by (ii) unzipping the…

    Submitted 6 December, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

  3. arXiv:2105.09469

    cs.PL cs.LG

    Decomposing reverse-mode automatic differentiation

    Authors: Roy Frostig, Matthew J. Johnson, Dougal Maclaurin, Adam Paszke, Alexey Radul

    Abstract: We decompose reverse-mode automatic differentiation into (forward-mode) linearization followed by transposition. Doing so isolates the essential difference between forward- and reverse-mode AD, and simplifies their joint implementation. In particular, once forward-mode AD rules are defined for every primitive operation in a source language, only linear primitives require an additional transpositio…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: Presented at the LAFI 2021 workshop at POPL, 17 January 2021

  4. arXiv:2104.05372

    cs.PL

    Getting to the Point. Index Sets and Parallelism-Preserving Autodiff for Pointful Array Programming

    Authors: Adam Paszke, Daniel Johnson, David Duvenaud, Dimitrios Vytiniotis, Alexey Radul, Matthew Johnson, Jonathan Ragan-Kelley, Dougal Maclaurin

    Abstract: We present a novel programming language design that attempts to combine the clarity and safety of high-level functional languages with the efficiency and parallelism of low-level numerical languages. We treat arrays as eagerly-memoized functions on typed index sets, allowing abstract function manipulations, such as currying, to work on arrays. In contrast to composing primitive bulk-array operatio…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 31 pages with appendix, 11 figures. A conference submission is still under review

  5. arXiv:2010.09647

    cs.PL

    The Base Measure Problem and its Solution

    Authors: Alexey Radul, Boris Alexeev

    Abstract: Probabilistic programming systems generally compute with probability density functions, leaving the base measure of each such function implicit. This mostly works, but creates problems when densities with respect to different base measures are accidentally combined or compared. Mistakes also happen when computing volume corrections for continuous changes of variables, which in general depend on th…

    Submitted 10 December, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

  6. arXiv:2001.05035

    stat.CO

    FunMC: A functional API for building Markov Chains

    Authors: Pavel Sountsov, Alexey Radul, Srinivas Vasudevan

    Abstract: Constant-memory algorithms, also loosely called Markov chains, power the vast majority of probabilistic inference and machine learning applications today. A lot of progress has been made in constructing user-friendly APIs around these algorithms. Such APIs, however, rarely make it easy to research new algorithms of this type. In this work we present FunMC, a minimal Python library for doing method…

    Submitted 26 May, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Updated source code to reflect API; updated link to point to new location

  7. arXiv:1910.11141

    cs.DC cs.LG cs.PL

    Automatically Batching Control-Intensive Programs for Modern Accelerators

    Authors: Alexey Radul, Brian Patton, Dougal Maclaurin, Matthew D. Hoffman, Rif A. Saurous

    Abstract: We present a general approach to batching arbitrary computations for accelerators such as GPUs. We show orders-of-magnitude speedups using our method on the No U-Turn Sampler (NUTS), a workhorse algorithm in Bayesian statistics. The central challenge of batching NUTS and other Markov chain Monte Carlo algorithms is data-dependent control flow and recursion. We overcome this by mechanically transfo…

    Submitted 12 March, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: 10 pages; Machine Learning and Systems 2020

  8. arXiv:1811.02091

    stat.ML cs.LG cs.PL

    Simple, Distributed, and Accelerated Probabilistic Programming

    Authors: Dustin Tran, Matthew Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, Alexey Radul, Matthew Johnson, Rif A. Saurous

    Abstract: We describe a simple, low-level approach for embedding probabilistic programming in a deep learning ecosystem. In particular, we distill probabilistic programming down to a single abstraction---the random variable. Our lightweight implementation in TensorFlow enables numerous applications: a model-parallel variational auto-encoder (VAE) with 2nd-generation tensor processing units (TPUv2s); a data-…

    Submitted 28 November, 2018; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: Appears in Neural Information Processing Systems, 2018. Code available at http://bit.ly/2JpFipt

  9. arXiv:1704.04977

    cs.AI

    Probabilistic programs for inferring the goals of autonomous agents

    Authors: Marco F. Cusumano-Towner, Alexey Radul, David Wingate, Vikash K. Mansinghka

    Abstract: Intelligent systems sometimes need to infer the probable goals of people, cars, and robots, based on partial observations of their motion. This paper introduces a class of probabilistic programs for formulating and solving these problems. The formulation uses randomized path planning algorithms as the basis for probabilistic models of the process by which autonomous agents plan to achieve their go…

    Submitted 18 April, 2017; v1 submitted 17 April, 2017; originally announced April 2017.

  10. arXiv:1611.07051

    stat.ML

    Time Series Structure Discovery via Probabilistic Program Synthesis

    Authors: Ulrich Schaechtle, Feras Saad, Alexey Radul, Vikash Mansinghka

    Abstract: There is a widespread need for techniques that can discover structure from time series data. Recently introduced techniques such as Automatic Bayesian Covariance Discovery (ABCD) provide a way to find structure within a single time series by searching through a space of covariance kernels that is generated using a simple grammar. While ABCD can identify a broad class of temporal patterns, it is di…

    Submitted 22 May, 2017; v1 submitted 21 November, 2016; originally announced November 2016.

    Comments: The first two authors contributed equally to this work

  11. arXiv:1610.00831

    cs.PL

    Notes on Pure Dataflow Matrix Machines: Programming with Self-referential Matrix Transformations

    Authors: Michael Bukatin, Steve Matthews, Andrey Radul

    Abstract: Dataflow matrix machines are self-referential generalized recurrent neural nets. The self-referential mechanism is provided via a stream of matrices defining the connectivity and weights of the network in question. A natural question is: what should play the role of untyped lambda-calculus for this programming architecture? The proposed answer is a discipline of programming with only one kind of s…

    Submitted 2 November, 2018; v1 submitted 3 October, 2016; originally announced October 2016.

    Comments: 7 pages (v3 - update page 7)

  12. arXiv:1606.09470

    cs.PL cs.NE

    Programming Patterns in Dataflow Matrix Machines and Generalized Recurrent Neural Nets

    Authors: Michael Bukatin, Steve Matthews, Andrey Radul

    Abstract: Dataflow matrix machines arise naturally in the context of synchronous dataflow programming with linear streams. They can be viewed as a rather powerful generalization of recurrent neural networks. Similarly to recurrent neural networks, large classes of dataflow matrix machines are described by matrices of numbers, and therefore dataflow matrix machines can be synthesized by computing their matri…

    Submitted 3 August, 2018; v1 submitted 30 June, 2016; originally announced June 2016.

    Comments: 13 pages (v2 - update references)

  13. arXiv:1605.05296

    cs.NE cs.PL

    Dataflow matrix machines as programmable, dynamically expandable, self-referential generalized recurrent neural networks

    Authors: Michael Bukatin, Steve Matthews, Andrey Radul

    Abstract: Dataflow matrix machines are a powerful generalization of recurrent neural networks. They work with multiple types of linear streams and multiple types of neurons, including higher-order neurons which dynamically update the matrix describing weights and topology of the network in question while the network is running. It seems that the power of dataflow matrix machines is sufficient for them to be…

    Submitted 20 June, 2018; v1 submitted 17 May, 2016; originally announced May 2016.

    Comments: 9 pages (v2 - update references)

  14. arXiv:1603.09002

    cs.NE

    Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks

    Authors: Michael Bukatin, Steve Matthews, Andrey Radul

    Abstract: Dataflow matrix machines are a powerful generalization of recurrent neural networks. They work with multiple types of arbitrary linear streams, multiple types of powerful neurons, and allow to incorporate higher-order constructions. We expect them to be useful in machine learning and probabilistic programming, and in the synthesis of dynamic systems and of deterministic and probabilistic programs.

    Submitted 28 May, 2018; v1 submitted 29 March, 2016; originally announced March 2016.

    Comments: 4 pages position paper (v2 - update references)

  15. arXiv:1512.05665

    cs.LG cs.AI stat.ML

    Probabilistic Programming with Gaussian Process Memoization

    Authors: Ulrich Schaechtle, Ben Zinberg, Alexey Radul, Kostas Stathis, Vikash K. Mansinghka

    Abstract: Gaussian Processes (GPs) are widely used tools in statistics, machine learning, robotics, computer vision, and scientific computation. However, despite their popularity, they can be difficult to apply; all but the simplest classification or regression applications require specification and inference over complex covariance functions that do not admit simple analytical posteriors. This paper shows…

    Submitted 5 January, 2016; v1 submitted 17 December, 2015; originally announced December 2015.

    Comments: 36 pages, 9 figures

  16. arXiv:1502.05767

    cs.SC cs.LG stat.ML

    Automatic differentiation in machine learning: a survey

    Authors: Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, Jeffrey Mark Siskind

    Abstract: Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established…

    Submitted 5 February, 2018; v1 submitted 19 February, 2015; originally announced February 2015.

    Comments: 43 pages, 5 figures

    MSC Class: 68W30; 65D25; 68T05 ACM Class: G.1.4; I.2.6

    Journal ref: Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, Jeffrey Mark Siskind. Automatic differentiation in machine learning: a survey. The Journal of Machine Learning Research, 18(153):1--43, 2018

  17. arXiv:1211.4892

    cs.SC cs.MS math.DG

    Confusion of Tagged Perturbations in Forward Automatic Differentiation of Higher-Order Functions

    Authors: Oleksandr Manzyuk, Barak A. Pearlmutter, Alexey Andreyevich Radul, David R. Rush, Jeffrey Mark Siskind

    Abstract: Forward Automatic Differentiation (AD) is a technique for augmenting programs to compute derivatives. The essence of Forward AD is to attach perturbations to each number, and propagate these through the computation. When derivatives are nested, the distinct derivative calculations, and their associated perturbations, must be distinguished. This is typically accomplished by creating a unique tag fo…

    Submitted 29 June, 2019; v1 submitted 20 November, 2012; originally announced November 2012.

  18. arXiv:1203.1450

    cs.PL cs.MS math.NA

    AD in Fortran, Part 2: Implementation via Prepreprocessor

    Authors: Alexey Radul, Barak A. Pearlmutter, Jeffrey Mark Siskind

    Abstract: We describe an implementation of the Farfel Fortran AD extensions. These extensions integrate forward and reverse AD directly into the programming model, with attendant benefits to flexibility, modularity, and ease of use. The implementation we describe is a "prepreprocessor" that generates input to existing Fortran-based AD tools. In essence, blocks of code which are targeted for AD by Farfel con…

    Submitted 8 March, 2012; v1 submitted 7 March, 2012; originally announced March 2012.

    Journal ref: Recent Advances in Algorithmic Differentiation, Springer Lecture Notes in Computational Science and Engineering volume 87, 2012, ISBN 978-3-642-30022-6, pages 273-284

  19. arXiv:1203.1448

    cs.PL cs.MS math.NA

    AD in Fortran, Part 1: Design

    Authors: Alexey Radul, Barak A. Pearlmutter, Jeffrey Mark Siskind

    Abstract: We propose extensions to Fortran which integrate forward and reverse Automatic Differentiation (AD) directly into the programming model. Irrespective of implementation technology, embedding AD constructs directly into the language extends the reach and convenience of AD while allowing abstraction of concepts of interest to scientific-computing practice, such as root finding, optimization, and find…

    Submitted 8 March, 2012; v1 submitted 7 March, 2012; originally announced March 2012.

  20. arXiv:1110.1556

    math.HO

    Jewish Problems

    Authors: Tanya Khovanova, Alexey Radul

    Abstract: This is a special collection of problems that were given to select applicants during oral entrance exams to the math department of Moscow State University. These problems were designed to prevent Jews and other undesirables from getting a passing grade. Among problems that were used by the department to blackball unwanted candidate students, these problems are distinguished by having a simple solu…

    Submitted 15 October, 2011; v1 submitted 7 October, 2011; originally announced October 2011.

    Comments: 21 pages, 14 figures

    Journal ref: Published as "Killer Problems" in The American Mathematical Monthly, Vol. 119, No. 10 (2012), pp. 815-82

  21. arXiv:1003.3406

    math.CO cs.DM cs.DS math.HO

    Baron Munchhausen's Sequence

    Authors: Tanya Khovanova, Konstantin Knop, Alexey Radul

    Abstract: We investigate a coin-weighing puzzle that appeared in the all-Russian math Olympiad in 2000. We liked the puzzle because the methods of analysis differ from classical coin-weighing puzzles. We generalize the puzzle by varying the number of participating coins, and deduce a complete solution, perhaps surprisingly, the objective can be achieved in no more than two weighings regardless of the number…

    Submitted 17 March, 2010; originally announced March 2010.

    Comments: 26 pages

    MSC Class: 11B99; 00A08; 00A08

    Journal ref: Journal of Integer Sequences, v.13 (2010), Article 10.8.7

  22. arXiv:hep-th/9512150

    hep-th

    Representation theory of the vertex algebra $W_{1 + \infty}$

    Authors: Victor Kac, Andrey Radul

    Abstract: In our paper~\cite{KR} we began a systematic study of representations of the universal central extension $\widehat{\Cal D}\/$ of the Lie algebra of differential operators on the circle. This study was continued in the paper~\cite{FKRW} in the framework of vertex algebra theory. It was shown that the associated to $\widehat {\Cal D}\/$ simple vertex algebra $W_{1+ \infty, N}\/$ with positive inte…

    Submitted 18 December, 1995; originally announced December 1995.

    Comments: 26 pages, AMS-TeX, all macros included

  23. W_{1+\infty} and W(gl_N) with central charge N

    Authors: E. Frenkel, V. Kac, A. Radul, W. Wang

    Abstract: We study representations of the central extension of the Lie algebra of differential operators on the circle, the W-infinity algebra. We obtain complete and specialized character formulas for a large class of representations, which we call primitive; these include all quasi-finite irreducible unitary representations. We show that any primitive representation with central charge N has a canonical…

    Submitted 3 October, 1994; v1 submitted 18 May, 1994; originally announced May 1994.

    Comments: 29 pages, Latex, uses file amssym.def (a few remarks added, typos corrected)

    Journal ref: Commun.Math.Phys. 170 (1995) 337-358

  24. Quasifinite highest weight modules over the Lie algebra of differential operators on the circle

    Authors: Victor G. Kac, A. Radul

    Abstract: We classify positive energy representations with finite degeneracies of the Lie algebra $W_{1+\infty}\/$ and construct them in terms of representation theory of the Lie algebra $\hatgl ( \infty R_m )\/$ of infinite matrices with finite number of non-zero diagonals over the algebra $R_m = \C [ t ] / ( t^{m + 1} )\/$. The unitary ones are classified as well. Similar results are obtained for the s…

    Submitted 31 August, 1993; originally announced August 1993.

    Journal ref: Commun.Math.Phys. 157 (1993) 429-457