-
TorchBench: Benchmarking PyTorch with High API Surface Coverage
Authors:
Yueming Hao,
Xu Zhao,
Bin Bao,
David Berard,
Will Constable,
Adnan Aziz,
Xu Liu
Abstract:
Deep learning (DL) has been a revolutionary technique in various domains. To facilitate model development and deployment, many deep learning frameworks have been proposed, among which PyTorch is one of the most popular solutions. The performance of the ecosystem around PyTorch is critically important, as it reduces the cost of training models and the response time of model inference. In this paper, we propose TorchBench, a novel benchmark suite to study the performance of the PyTorch software stack. Unlike existing benchmark suites, TorchBench encloses many representative models, covering a large PyTorch API surface. TorchBench is able to comprehensively characterize the performance of the PyTorch software stack, guiding performance optimization across models, the PyTorch framework, and GPU libraries. We show two practical use cases of TorchBench. (1) We profile TorchBench to identify GPU performance inefficiencies in PyTorch. We are able to fix many performance bugs and upstream patches to the official PyTorch repository. (2) We integrate TorchBench into the PyTorch continuous integration system. We are able to identify performance regressions in multiple daily code check-ins, preventing the PyTorch repository from introducing performance bugs. TorchBench is open source and keeps evolving.
Submitted 24 June, 2023; v1 submitted 27 April, 2023;
originally announced April 2023.
-
Feasibility and stability in large Lotka Volterra systems with interaction structure
Authors:
Xiaoyuan Liu,
George W. A. Constable,
Jonathan W. Pitchford
Abstract:
Complex system stability can be studied via linear stability analysis using Random Matrix Theory (RMT) or via feasibility (requiring positive equilibrium abundances). Both approaches highlight the importance of interaction structure. Here we show, analytically and numerically, how RMT and feasibility approaches can be complementary. In generalised Lotka-Volterra (GLV) models with random interaction matrices, feasibility increases when predator-prey interactions increase; increasing competition/mutualism has the opposite effect. These changes have crucial impact on the stability of the GLV model.
Submitted 20 April, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Using Python for Model Inference in Deep Learning
Authors:
Zachary DeVito,
Jason Ansel,
Will Constable,
Michael Suo,
Ailing Zhang,
Kim Hazelwood
Abstract:
Python has become the de-facto language for training deep neural networks, coupling a large suite of scientific computing libraries with efficient libraries for tensor computation such as PyTorch or TensorFlow. However, when models are used for inference they are typically extracted from Python as TensorFlow graphs or TorchScript programs in order to meet performance and packaging constraints. The extraction process can be time consuming, impeding fast prototyping. We show how it is possible to meet these performance and packaging constraints while performing inference in Python. In particular, we present a way of using multiple Python interpreters within a single process to achieve scalable inference and describe a new container format for models that contains both native Python code and data. This approach simplifies the model deployment story by eliminating the model extraction step, and makes it easier to integrate existing performance-enhancing Python libraries. We evaluate our design on a suite of popular PyTorch models on GitHub, showing how they can be packaged in our inference format, and comparing their performance to TorchScript. For larger models, our packaged Python models perform the same as TorchScript, and for smaller models where there is some Python overhead, our multi-interpreter approach ensures inference is still scalable.
Submitted 1 April, 2021;
originally announced April 2021.
-
Fluctuation spectra of large random dynamical systems reveal hidden structure in ecological networks
Authors:
Yvonne Krumbeck,
Qian Yang,
George W. A. Constable,
Tim Rogers
Abstract:
Understanding the relationship between complexity and stability in large dynamical systems -- such as ecosystems -- remains a key open question in complexity theory which has inspired a rich body of work developed over more than fifty years. The vast majority of this theory addresses asymptotic linear stability around equilibrium points, but the idea of `stability' in fact has other uses in the empirical ecological literature. The important notion of `temporal stability' describes the character of fluctuations in population dynamics, driven by intrinsic or extrinsic noise. Here we apply tools from random matrix theory to the problem of temporal stability, deriving analytical predictions for the fluctuation spectra of complex ecological networks. We show that different network structures leave distinct signatures in the spectrum of fluctuations, and demonstrate the application of our theory to the analysis of ecological timeseries data of plankton abundances.
Submitted 12 May, 2021; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Fitness differences suppress the number of mating types in evolving isogamous species
Authors:
Yvonne Krumbeck,
George W. A. Constable,
Tim Rogers
Abstract:
Sexual reproduction is not always synonymous with the existence of two morphologically different sexes; isogamous species produce sex cells of equal size, typically falling into multiple distinct self-incompatible classes, termed mating types. A long-standing open question in evolutionary biology is: what governs the number of these mating types across species? Simple theoretical arguments imply an advantage to rare types, suggesting the number of types should grow consistently; however, empirical observations are very different. While some isogamous species exhibit thousands of mating types, such species are exceedingly rare, and most have fewer than ten. In this paper, we present a mathematical analysis to quantify the role of fitness variation - characterised by different mortality rates - in determining the number of mating types emerging in simple evolutionary models. We predict that the number of mating types decreases as the variance of mortality increases.
Submitted 13 December, 2019; v1 submitted 17 June, 2019;
originally announced June 2019.
-
Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning
Authors:
Scott Cyphers,
Arjun K. Bansal,
Anahita Bhiwandiwalla,
Jayaram Bobba,
Matthew Brookhart,
Avijit Chakraborty,
Will Constable,
Christian Convey,
Leona Cook,
Omar Kanawi,
Robert Kimball,
Jason Knight,
Nikolay Korovaiko,
Varun Kumar,
Yixing Lao,
Christopher R. Lishka,
Jaikrishnan Menon,
Jennifer Myers,
Sandeep Aswath Narayana,
Adam Procter,
Tristan J. Webb
Abstract:
The Deep Learning (DL) community sees many novel topologies published each year. Achieving high performance on each new topology remains challenging, as each requires some level of manual effort. This issue is compounded by the proliferation of frameworks and hardware platforms. The current approach, which we call "direct optimization", requires deep changes within each framework to improve the training performance for each hardware backend (CPUs, GPUs, FPGAs, ASICs) and requires $\mathcal{O}(fp)$ effort; where $f$ is the number of frameworks and $p$ is the number of platforms. While optimized kernels for deep-learning primitives are provided via libraries like Intel Math Kernel Library for Deep Neural Networks (MKL-DNN), there are several compiler-inspired ways in which performance can be further optimized. Building on our experience creating neon (a fast deep learning library on GPUs), we developed Intel nGraph, a soon to be open-sourced C++ library to simplify the realization of optimized deep learning performance across frameworks and hardware platforms. Initially-supported frameworks include TensorFlow, MXNet, and Intel neon framework. Initial backends are Intel Architecture CPUs (CPU), the Intel(R) Nervana Neural Network Processor(R) (NNP), and NVIDIA GPUs. Currently supported compiler optimizations include efficient memory management and data layout abstraction. In this paper, we describe our overall architecture and its core components. In the future, we envision extending nGraph API support to a wider range of frameworks, hardware (including FPGAs and ASICs), and compiler optimizations (training versus inference optimizations, multi-node and multi-device scaling via efficient sub-graph partitioning, and HW-specific compounding of operations).
Submitted 29 January, 2018; v1 submitted 24 January, 2018;
originally announced January 2018.
-
Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
Authors:
Urs Köster,
Tristan J. Webb,
Xin Wang,
Marcel Nassar,
Arjun K. Bansal,
William H. Constable,
Oğuz H. Elibol,
Scott Gray,
Stewart Hall,
Luke Hornof,
Amir Khosrowshahi,
Carey Kloss,
Ruby J. Pai,
Naveen Rao
Abstract:
Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network and a generative adversarial network, using a simulator implemented with the neon deep learning framework. We demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters. Our results suggest Flexpoint as a promising numerical format for future hardware for training and inference.
Submitted 2 December, 2017; v1 submitted 6 November, 2017;
originally announced November 2017.
-
Exploiting fast-variables to understand population dynamics and evolution
Authors:
George W. A. Constable,
Alan J. McKane
Abstract:
We describe a continuous-time modelling framework for biological population dynamics that accounts for demographic noise. In the spirit of the methodology used by statistical physicists, transitions between the states of the system are caused by individual events while the dynamics are described in terms of the time-evolution of a probability density function. In general, the application of the diffusion approximation still leaves a description that is quite complex. However, in many biological applications one or more of the processes happen slowly relative to the system's other processes, and the dynamics can be approximated as occurring within a slow low-dimensional subspace. We review these time-scale separation arguments and analyse the more simple stochastic dynamics that result in a number of cases. We stress that it is important to retain the demographic noise derived in this way, and emphasise this point by showing that it can alter the direction of selection compared to the prediction made from an analysis of the corresponding deterministic model.
Submitted 17 July, 2018; v1 submitted 25 July, 2017;
originally announced July 2017.
-
A mapping of the stochastic Lotka-Volterra model to models of population genetics and game theory
Authors:
George W. A. Constable,
Alan J. McKane
Abstract:
The relationship between the M-species stochastic Lotka-Volterra competition (SLVC) model and the M-allele Moran model of population genetics is explored via timescale separation arguments. When selection for species is weak and the population size is large but finite, precise conditions are determined for the stochastic dynamics of the SLVC model to be mappable to the neutral Moran model, the Moran model with frequency-independent selection and the Moran model with frequency-dependent selection (equivalently, a game-theoretic formulation of the Moran model). We demonstrate how these mappings can be used to calculate extinction probabilities and the times until a species' extinction in the SLVC model.
Submitted 25 April, 2017;
originally announced April 2017.
-
Demographic noise can reverse the direction of deterministic selection
Authors:
George W. A. Constable,
Tim Rogers,
Alan J. McKane,
Corina E. Tarnita
Abstract:
Deterministic evolutionary theory robustly predicts that populations displaying altruistic behaviours will be driven to extinction by mutant cheats that absorb common benefits but do not themselves contribute. Here we show that when demographic stochasticity is accounted for, selection can in fact act in the reverse direction to that predicted deterministically, instead favouring cooperative behaviors that appreciably increase the carrying capacity of the population. Populations that exist in larger numbers experience a selective advantage by being more stochastically robust to invasions than smaller populations, and this advantage can persist even in the presence of reproductive costs. We investigate this general effect in the specific context of public goods production and find conditions for stochastic selection reversal leading to the success of public good producers. This insight, developed here analytically, is missed by both the deterministic analysis as well as standard game theoretic models that enforce a fixed population size. The effect is found to be amplified by space; in this scenario we find that selection reversal occurs within biologically reasonable parameter regimes for microbial populations. Beyond the public good problem, we formulate a general mathematical framework for models that may exhibit stochastic selection reversal. In this context, we describe a stochastic analogue to r-K theory, by which small populations can evolve to higher densities in the absence of disturbance.
Submitted 11 August, 2016;
originally announced August 2016.
-
Stationary solutions for metapopulation Moran models with mutation and selection
Authors:
George W. A. Constable,
Alan J. McKane
Abstract:
We construct an individual-based metapopulation model of population genetics featuring migration, mutation, selection and genetic drift. In the case of a single `island', the model reduces to the Moran model. Using the diffusion approximation and timescale separation arguments, an effective one-variable description of the model is developed. The effective description bears similarities to the well-mixed Moran model with effective parameters which depend on the network structure and island sizes, and is amenable to analysis. Predictions from the reduced theory match the results from stochastic simulations across a range of parameters. The nature of the fast-variable elimination technique we adopt is further studied by applying it to a linear system, where it provides a precise description of the slow-dynamics in the limit of large timescale separation.
Submitted 14 April, 2015; v1 submitted 19 December, 2014;
originally announced December 2014.
-
Models of genetic drift as limiting forms of the Lotka-Volterra competition model
Authors:
George W. A. Constable,
Alan J. McKane
Abstract:
The relationship between the Moran model and stochastic Lotka-Volterra competition (SLVC) model is explored via timescale separation arguments. For neutral systems the two are found to be equivalent at long times. For systems with selective pressure, their behavior differs. It is argued that the SLVC is preferable to the Moran model since in the SLVC population size is regulated by competition, rather than arbitrarily fixed as in the Moran model. As a consequence, ambiguities found in the Moran model associated with the introduction of more complex processes, such as selection, are avoided.
Submitted 14 April, 2015; v1 submitted 31 July, 2014;
originally announced July 2014.
-
Population genetics on islands connected by an arbitrary network: An analytic approach
Authors:
George W A Constable,
Alan J McKane
Abstract:
We analyse a model consisting of a population of individuals which is subdivided into a finite set of demes, each of which has a fixed but differing number of individuals. The individuals can reproduce, die and migrate between the demes according to an arbitrary migration network. They are haploid, with two alleles present in the population; frequency independent selection is also incorporated, where the strength and direction of selection can vary from deme to deme. The system is formulated as an individual-based model, and the diffusion approximation systematically applied to express it as a set of nonlinear coupled stochastic differential equations. These can be made amenable to analysis through the elimination of fast-time variables. The resulting reduced model is analysed in a number of situations, including migration-selection balance leading to a polymorphic equilibrium of the two alleles, and an illustration of how the subdivision of the population can lead to non-trivial behaviour in the case where the network is a simple hub. The method we develop is systematic, may be applied to any network, and agrees well with the results of simulations in all cases studied and across a wide range of parameter values.
Submitted 11 February, 2014;
originally announced February 2014.
-
Fast-mode elimination in stochastic metapopulation models
Authors:
George W. A. Constable,
Alan J. McKane
Abstract:
We investigate the stochastic dynamics of entities which are confined to a set of islands, between which they migrate. They are assumed to be one of two types, and in addition to migration, they also reproduce and die. Systems which fall into this class are common in biology and social science, occurring in ecology, population genetics, epidemiology, biochemistry, linguistics, opinion dynamics, and other areas. In all these cases the governing equations are intractable, consisting as they do of multidimensional Fokker-Planck equations or, equivalently, coupled nonlinear stochastic differential equations with multiplicative noise. We develop a methodology which exploits a separation in time scales between fast and slow variables to reduce these equations so that they resemble those for a single island, which are amenable to analysis. The technique is generally applicable, but we choose to discuss it in the context of population genetics, in part because of the extra features that appear due to selection. The idea behind the method is simple, its application systematic, and the results in very good agreement with simulations of the full model for a range of parameter values.
Submitted 1 April, 2014; v1 submitted 6 February, 2014;
originally announced February 2014.
-
Stochastic dynamics on slow manifolds
Authors:
George W A Constable,
Alan J McKane,
Tim Rogers
Abstract:
The theory of slow manifolds is an important tool in the study of deterministic dynamical systems, giving a practical method by which to reduce the number of relevant degrees of freedom in a model, thereby often resulting in a considerable simplification. In this article we demonstrate how the same basic methodology may also be applied to stochastic dynamical systems, by examining the behaviour of trajectories conditioned on the event that they do not depart the slow manifold. We apply the method to two models: one from ecology and one from epidemiology, achieving a reduction in model dimension and illustrating the high quality of the analytical approximations.
Submitted 28 June, 2013; v1 submitted 31 January, 2013;
originally announced January 2013.