Search | arXiv e-print repository

TURNIP: A "Nondeterministic" GPU Runtime with CPU RAM Offload

Authors: Zhimin Ding, Jiawen Yao, Brianna Barrow, Tania Lorido Botran, Christopher Jermaine, Yuxin Tang, Jiehui Li, Xinyu Yao, Sleem Mahmoud Abdelghafar, Daniel Bourgeois

Abstract: An obvious way to alleviate memory difficulties in GPU-based AI computing is via CPU offload, where data are moved between GPU and CPU RAM, so inexpensive CPU RAM is used to increase the amount of storage available. While CPU offload is an obvious idea, it can greatly slow down a computation, due to the relatively slow transfer rate between CPU RAM and GPU RAM. Thus, any system for CPU offload needs to ensure that when such a transfer needs to happen, no computation is blocked waiting for the transfer to finish. One of the key challenges when using CPU offload is that memory transfers introduce nondeterminacy into the system: it is not possible to know before runtime when the transfers will finish, and hence what is the best order of operations to run to ensure there is no blocking. In this paper, we describe TURNIP, which is a system for running AI computations using CPU offload. The key innovation in TURNIP is the compilation of the AI computation into a dependency graph that gives the TURNIP runtime freedom to run operations such as GPU kernel calls in many different orders; at runtime, TURNIP chooses the best order in response to real-time events.

Submitted 27 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

arXiv:2306.00088 [pdf, other]

Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

Authors: Yuxin Tang, Zhimin Ding, Dimitrije Jankov, Binhang Yuan, Daniel Bourgeois, Chris Jermaine

Abstract: The relational data model was designed to facilitate large-scale data management and analytics. We consider the problem of how to differentiate computations expressed relationally. We show experimentally that a relational engine running an auto-differentiated relational algorithm can easily scale to very large datasets, and is competitive with state-of-the-art, special-purpose systems for large-scale distributed machine learning.

Submitted 7 June, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: ICML 2023

arXiv:2009.00524 [pdf, other]

Tensor Relational Algebra for Machine Learning System Design

Authors: Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, Chris Jermaine

Abstract: We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra based on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA is easily executed with high efficiency in a parallel or distributed environment, and amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.

Submitted 9 August, 2021; v1 submitted 1 September, 2020; originally announced September 2020.

arXiv:1904.07539 [pdf, other]

doi 10.1145/3308558.3313526

A Dynamic Embedding Model of the Media Landscape

Authors: Jeremie Rappaz, Dylan Bourgeois, Karl Aberer

Abstract: Information about world events is disseminated through a wide variety of news channels, each with specific considerations in the choice of their reporting. Although the multiplicity of these outlets should ensure a variety of viewpoints, recent reports suggest that the rising concentration of media ownership may void this assumption. This observation motivates the study of the impact of ownership on the global media landscape and its influence on the coverage the actual viewer receives. To this end, the selection of reported events has been shown to be informative about the high-level structure of the news ecosystem. However, existing methods only provide a static view into an inherently dynamic system, providing underperforming statistical models and hindering our understanding of the media landscape as a whole. In this work, we present a dynamic embedding method that learns to capture the decision process of individual news sources in their selection of reported events while also enabling the systematic detection of large-scale transformations in the media landscape over prolonged periods of time. In an experiment covering over 580M real-world event mentions, we show our approach to outperform static embedding methods in predictive terms. We demonstrate the potential of the method for news monitoring applications and investigative journalism by shedding light on important changes in programming induced by mergers and acquisitions, policy changes, or network-wide content diffusion. These findings offer evidence of strong content convergence trends inside large broadcasting groups, influencing the news ecosystem in a time of increasing media ownership concentration.

Submitted 16 April, 2019; originally announced April 2019.

Journal ref: The Web Conference 2019 (WWW '19)

arXiv:1904.07536 [pdf, other]

doi 10.1145/3184558.3188724

Selection Bias in News Coverage: Learning it, Fighting it

Authors: Dylan Bourgeois, Jeremie Rappaz, Karl Aberer

Abstract: News entities must select and filter the coverage they broadcast through their respective channels since the set of world events is too large to be treated exhaustively. The subjective nature of this filtering induces biases due to, among other things, resource constraints, editorial guidelines, ideological affinities, or even the fragmented nature of the information at a journalist's disposal. The magnitude and direction of these biases are, however, widely unknown. The absence of ground truth, the sheer size of the event space, or the lack of an exhaustive set of absolute features to measure make it difficult to observe the bias directly, to characterize the leaning's nature and to factor it out to ensure a neutral coverage of the news. In this work, we introduce a methodology to capture the latent structure of media's decision process on a large scale. Our contribution is multi-fold. First, we show media coverage to be predictable using personalization techniques, and evaluate our approach on a large set of events collected from the GDELT database. We then show that a personalized and parametrized approach not only exhibits higher accuracy in coverage prediction, but also provides an interpretable representation of the selection bias. Last, we propose a method able to select a set of sources by leveraging the latent representation. These selected sources provide a more diverse and egalitarian coverage, all while retaining the most actively covered events.

Submitted 16 April, 2019; originally announced April 2019.

Journal ref: The Web Conference 2018 (WWW '18) pages 535-543

arXiv:1903.03894 [pdf, other]

GNNExplainer: Generating Explanations for Graph Neural Networks

Authors: Rex Ying, Dylan Bourgeois, Jiaxuan You, Marinka Zitnik, Jure Leskovec

Abstract: Graph Neural Networks (GNNs) are a powerful tool for machine learning on graphs. GNNs combine node feature information with the graph structure by recursively passing neural messages along edges of the input graph. However, incorporating both graph structure and feature information leads to complex models, and explaining predictions made by GNNs remains unsolved. Here we propose GNNExplainer, the first general, model-agnostic approach for providing interpretable explanations for predictions of any GNN-based model on any graph-based machine learning task. Given an instance, GNNExplainer identifies a compact subgraph structure and a small subset of node features that have a crucial role in GNN's prediction. Further, GNNExplainer can generate consistent and concise explanations for an entire class of instances. We formulate GNNExplainer as an optimization task that maximizes the mutual information between a GNN's prediction and distribution of possible subgraph structures. Experiments on synthetic and real-world graphs show that our approach can identify important graph structures as well as node features, and outperforms baselines by 17.1% on average. GNNExplainer provides a variety of benefits, from the ability to visualize semantically relevant structures to interpretability, to giving insights into errors of faulty GNNs.

Submitted 13 November, 2019; v1 submitted 9 March, 2019; originally announced March 2019.

arXiv:1808.06689 [pdf, other]

Bayesian Function-on-Scalars Regression for High Dimensional Data

Authors: Daniel R. Kowal, Daniel C. Bourgeois

Abstract: We develop a fully Bayesian framework for function-on-scalars regression with many predictors. The functional data response is modeled nonparametrically using unknown basis functions, which produces a flexible and data-adaptive functional basis. We incorporate shrinkage priors that effectively remove unimportant scalar covariates from the model and reduce sensitivity to the number of (unknown) basis functions. For variable selection in functional regression, we propose a decision theoretic posterior summarization technique, which identifies a subset of covariates that retains nearly the predictive accuracy of the full model. Our approach is broadly applicable for Bayesian functional regression models, and unlike existing methods provides joint rather than marginal selection of important predictor variables. Computationally scalable posterior inference is achieved using a Gibbs sampler with linear time complexity in the number of predictors. The resulting algorithm is empirically faster than existing frequentist and Bayesian techniques, and provides joint estimation of model parameters, prediction and imputation of functional trajectories, and uncertainty quantification via the posterior distribution. A simulation study demonstrates improvements in estimation accuracy, uncertainty quantification, and variable selection relative to existing alternatives. The methodology is applied to actigraphy data to investigate the association between intraday physical activity and responses to a sleep questionnaire.

Submitted 23 October, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

arXiv:1808.00711 [pdf, other]

Using holistic event information in the trigger

Authors: Dylan Bourgeois, Conor Fitzpatrick, Sascha Stahl

Abstract: In order to achieve the data rates proposed for the future Run 3 upgrade of the LHCb detector, new processing models must be developed to deal with the increased throughput. For this reason, we aim to investigate the feasibility of purely data-driven holistic methods, with the constraint of introducing minimal computational overhead, hence using only raw detector information. These filters should be unbiased - having a neutral effect with respect to the studied physics channels. In particular, the use of machine learning based methods seems particularly suitable, potentially providing a natural formulation for heuristic-free, unbiased filters whose objective would be to optimize between throughput and bandwidth.

Submitted 8 August, 2018; v1 submitted 2 August, 2018; originally announced August 2018.

Report number: LHCb-PUB-2018-010

arXiv:1501.01472 [pdf]

doi 10.1016/j.str.2004.07.013

Structure of superoxide reductase bound to ferrocyanide and active site expansion upon X-ray-induced photo-reduction

Authors: Virgile Adam, Antoine Royant, Vincent Nivière, Fernando P Molina-Heredia, Dominique Bourgeois

Abstract: Some sulfate-reducing and microaerophilic bacteria rely on the enzyme superoxide reductase (SOR) to eliminate the toxic superoxide anion radical (O2*-). SOR catalyses the one-electron reduction of O2*- to hydrogen peroxide at a nonheme ferrous iron center. The structures of Desulfoarculus baarsii SOR (mutant E47A) alone and in complex with ferrocyanide were solved to 1.15 and 1.7 Å resolution, respectively. The latter structure, the first ever reported of a complex between ferrocyanide and a protein, reveals that this organo-metallic compound entirely plugs the SOR active site, coordinating the active iron through a bent cyano bridge. The subtle structural differences between the mixed-valence and the fully reduced SOR-ferrocyanide adducts were investigated by taking advantage of the photoelectrons induced by X-rays. The results reveal that photo-reduction from Fe(III) to Fe(II) of the iron center, a very rapid process under a powerful synchrotron beam, induces an expansion of the SOR active site.

Submitted 7 January, 2015; originally announced January 2015.

Journal ref: Structure, 2004, 12, pp.1729-40

arXiv:1412.5040 [pdf]

Raman-assisted crystallography reveals end-on peroxide intermediates in a nonheme iron enzyme

Authors: Gergely Katona, Philippe Carpentier, Vincent Nivière, Patricia Amara, Virgile Adam, Jérémy Ohana, Nikolay Tsanov, Dominique Bourgeois

Abstract: Iron-peroxide intermediates are central in the reaction cycle of many iron-containing biomolecules. We trapped iron(III)-(hydro)peroxo species in crystals of superoxide reductase (SOR), a nonheme mononuclear iron enzyme that scavenges superoxide radicals. X-ray diffraction data at 1.95 angstrom resolution and Raman spectra recorded in crystallo revealed iron-(hydro)peroxo intermediates with the (hydro)peroxo group bound end-on. The dynamic SOR active site promotes the formation of transient hydrogen bond networks, which presumably assist the cleavage of the iron-oxygen bond in order to release the reaction product, hydrogen peroxide.

Submitted 16 December, 2014; originally announced December 2014.

Journal ref: Science, American Association for the Advancement of Science, 2007, pp.449-53

Showing 1–10 of 10 results for author: Bourgeois, D