Search | arXiv e-print repository

AI Age Discrepancy: A Novel Parameter for Frailty Assessment in Kidney Tumor Patients

Authors: Rikhil Seshadri, Jayant Siva, Angelica Bartholomew, Clara Goebel, Gabriel Wallerstein-King, Beatriz López Morato, Nicholas Heller, Jason Scovell, Rebecca Campbell, Andrew Wood, Michal Ozery-Flato, Vesna Barros, Maria Gabrani, Michal Rosen-Zvi, Resha Tejpaul, Vidhyalakshmi Ramesh, Nikolaos Papanikolopoulos, Subodh Regmi, Ryan Ward, Robert Abouassaly, Steven C. Campbell, Erick Remer, Christopher Weight

Abstract: Kidney cancer is a global health concern, and accurate assessment of patient frailty is crucial for optimizing surgical outcomes. This paper introduces AI Age Discrepancy, a novel metric derived from machine learning analysis of preoperative abdominal CT scans, as a potential indicator of frailty and postoperative risk in kidney cancer patients. This retrospective study of 599 patients from the 20… ▽ More Kidney cancer is a global health concern, and accurate assessment of patient frailty is crucial for optimizing surgical outcomes. This paper introduces AI Age Discrepancy, a novel metric derived from machine learning analysis of preoperative abdominal CT scans, as a potential indicator of frailty and postoperative risk in kidney cancer patients. This retrospective study of 599 patients from the 2023 Kidney Tumor Segmentation (KiTS) challenge dataset found that a higher AI Age Discrepancy is significantly associated with longer hospital stays and lower overall survival rates, independent of established factors. This suggests that AI Age Discrepancy may provide valuable insights into patient frailty and could thus inform clinical decision-making in kidney cancer treatment. △ Less

Submitted 2 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

Comments: 10 pages, 3 figures, 2 tables

arXiv:2312.13565 [pdf, other]

Automatic Curriculum Learning with Gradient Reward Signals

Authors: Ryan Campbell, Junsang Yoon

Abstract: This paper investigates the impact of using gradient norm reward signals in the context of Automatic Curriculum Learning (ACL) for deep reinforcement learning (DRL). We introduce a framework where the teacher model, utilizing the gradient norm information of a student model, dynamically adapts the learning curriculum. This approach is based on the hypothesis that gradient norms can provide a nuanc… ▽ More This paper investigates the impact of using gradient norm reward signals in the context of Automatic Curriculum Learning (ACL) for deep reinforcement learning (DRL). We introduce a framework where the teacher model, utilizing the gradient norm information of a student model, dynamically adapts the learning curriculum. This approach is based on the hypothesis that gradient norms can provide a nuanced and effective measure of learning progress. Our experimental setup involves several reinforcement learning environments (PointMaze, AntMaze, and AdroitHandRelocate), to assess the efficacy of our method. We analyze how gradient norm rewards influence the teacher's ability to craft challenging yet achievable learning sequences, ultimately enhancing the student's performance. Our results show that this approach not only accelerates the learning process but also leads to improved generalization and adaptability in complex tasks. The findings underscore the potential of gradient norm signals in creating more efficient and robust ACL systems, opening new avenues for research in curriculum learning and reinforcement learning. △ Less

Submitted 20 December, 2023; originally announced December 2023.

Comments: 11 pages, 15 figures

arXiv:2312.12655 [pdf, other]

Can Transformers Learn Sequential Function Classes In Context?

Authors: Ryan Campbell, Emma Guo, Evan Hu, Reya Vir, Ethan Hsiao

Abstract: In-context learning (ICL) has revolutionized the capabilities of transformer models in NLP. In our project, we extend the understanding of the mechanisms underpinning ICL by exploring whether transformers can learn from sequential, non-textual function class data distributions. We introduce a novel sliding window sequential function class and employ toy-sized transformers with a GPT-2 architecture… ▽ More In-context learning (ICL) has revolutionized the capabilities of transformer models in NLP. In our project, we extend the understanding of the mechanisms underpinning ICL by exploring whether transformers can learn from sequential, non-textual function class data distributions. We introduce a novel sliding window sequential function class and employ toy-sized transformers with a GPT-2 architecture to conduct our experiments. Our analysis indicates that these models can indeed leverage ICL when trained on non-textual sequential function classes. Additionally, our experiments with randomized y-label sequences highlights that transformers retain some ICL capabilities even when the label associations are obfuscated. We provide evidence that transformers can reason with and understand sequentiality encoded within function classes, as reflected by the effective learning of our proposed tasks. Our results also show that the performance deteriorated with increasing randomness in the labels, though not to the extent one might expect, implying a potential robustness of learned sequentiality against label noise. Future research may want to look into how previous explanations of transformers, such as induction heads and task vectors, relate to sequentiality in ICL in these toy examples. Our investigation lays the groundwork for further research into how transformers process and perceive sequential data. △ Less

Submitted 20 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

Comments: 8 pages, 8 figures

arXiv:2303.08774 [pdf, other]

GPT-4 Technical Report

Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko , et al. (256 additional authors not shown)

Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo… ▽ More We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was develo** infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4. △ Less

Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 100 pages; updated authors list; fixed author names and added citation

arXiv:2301.11615 [pdf, other]

Decompositions into two linear forests of bounded lengths

Authors: Rutger Campbell, Florian Hörsch, Benjamin Moore

Abstract: For some $k \in \mathbb{Z}_{\geq 0}\cup \infty$, we call a linear forest $k$-bounded if each of its components has at most $k$ edges. We will say a $(k,\ell)$-bounded linear forest decomposition of a graph $G$ is a partition of $E(G)$ into the edge sets of two linear forests $F_k,F_\ell$ where $F_k$ is $k$-bounded and $F_\ell$ is $\ell$-bounded. We show that the problem of deciding whether a given… ▽ More For some $k \in \mathbb{Z}_{\geq 0}\cup \infty$, we call a linear forest $k$-bounded if each of its components has at most $k$ edges. We will say a $(k,\ell)$-bounded linear forest decomposition of a graph $G$ is a partition of $E(G)$ into the edge sets of two linear forests $F_k,F_\ell$ where $F_k$ is $k$-bounded and $F_\ell$ is $\ell$-bounded. We show that the problem of deciding whether a given graph has such a decomposition is NP-complete if both $k$ and $\ell$ are at least $2$, NP-complete if $k\geq 9$ and $\ell =1$, and is in P for $(k,\ell)=(2,1)$. Before this, the only known NP-complete cases were the $(2,2)$ and $(3,3)$ cases. Our hardness result answers a question of Bermond et al. from 1984. We also show that planar graphs of girth at least nine decompose into a linear forest and a matching, which in particular is stronger than $3$-edge-colouring such graphs. △ Less

Submitted 27 January, 2023; originally announced January 2023.

arXiv:2206.02395 [pdf, other]

Product structure of graph classes with bounded treewidth

Authors: Rutger Campbell, Katie Clinch, Marc Distel, J. Pascal Gollin, Kevin Hendrey, Robert Hickingbotham, Tony Huynh, Freddie Illingworth, Youri Tamitegama, Jane Tan, David R. Wood

Abstract: We show that many graphs with bounded treewidth can be described as subgraphs of the strong product of a graph with smaller treewidth and a bounded-size complete graph. To this end, define the "underlying treewidth" of a graph class $\mathcal{G}$ to be the minimum non-negative integer $c$ such that, for some function $f$, for every graph ${G \in \mathcal{G}}$ there is a graph $H$ with… ▽ More We show that many graphs with bounded treewidth can be described as subgraphs of the strong product of a graph with smaller treewidth and a bounded-size complete graph. To this end, define the "underlying treewidth" of a graph class $\mathcal{G}$ to be the minimum non-negative integer $c$ such that, for some function $f$, for every graph ${G \in \mathcal{G}}$ there is a graph $H$ with ${\text{tw}(H) \leq c}$ such that $G$ is isomorphic to a subgraph of ${H \boxtimes K_{f(\text{tw}(G))}}$. We introduce disjointed coverings of graphs and show they determine the underlying treewidth of any graph class. Using this result, we prove that the class of planar graphs has underlying treewidth 3; the class of $K_{s,t}$-minor-free graphs has underlying treewidth $s$ (for ${t \geq \max\{s,3\}}$); and the class of $K_t$-minor-free graphs has underlying treewidth ${t-2}$. In general, we prove that a monotone class has bounded underlying treewidth if and only if it excludes some fixed topological minor. We also study the underlying treewidth of graph classes defined by an excluded subgraph or excluded induced subgraph. We show that the class of graphs with no $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a subdivided star, and that the class of graphs with no induced $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a star. △ Less

Submitted 17 July, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

arXiv:2109.03362 [pdf, other]

On the space of coefficients of a Feed Forward Neural Network

Authors: Dinesh Valluri, Rory Campbell

Abstract: We define and establish the conditions for `equivalent neural networks' - neural networks with different weights, biases, and threshold functions that result in the same associated function. We prove that given a neural network $\mathcal{N}$ with piece-wise linear activation, the space of coefficients describing all equivalent neural networks is given by a semialgebraic set. This result is obtaine… ▽ More We define and establish the conditions for `equivalent neural networks' - neural networks with different weights, biases, and threshold functions that result in the same associated function. We prove that given a neural network $\mathcal{N}$ with piece-wise linear activation, the space of coefficients describing all equivalent neural networks is given by a semialgebraic set. This result is obtained by studying different representations of a given piece-wise linear function using the Tarski-Seidenberg theorem. △ Less

Submitted 7 September, 2021; originally announced September 2021.

Comments: 13 pages, 5 figures

MSC Class: 68T07; 14P10 ACM Class: I.1.0

arXiv:2103.03221 [pdf, ps, other]

GenoML: Automated Machine Learning for Genomics

Authors: Mary B. Makarious, Hampton L. Leonard, Dan Vitale, Hirotaka Iwaki, David Saffo, Lana Sargent, Anant Dadu, Eduardo Salmerón Castaño, John F. Carter, Melina Maleknia, Juan A. Botia, Cornelis Blauwendraat, Roy H. Campbell, Sayed Hadi Hashemi, Andrew B. Singleton, Mike A. Nalls, Faraz Faghri

Abstract: GenoML is a Python package automating machine learning workflows for genomics (genetics and multi-omics) with an open science philosophy. Genomics data require significant domain expertise to clean, pre-process, harmonize and perform quality control of the data. Furthermore, tuning, validation, and interpretation involve taking into account the biology and possibly the limitations of the underlyin… ▽ More GenoML is a Python package automating machine learning workflows for genomics (genetics and multi-omics) with an open science philosophy. Genomics data require significant domain expertise to clean, pre-process, harmonize and perform quality control of the data. Furthermore, tuning, validation, and interpretation involve taking into account the biology and possibly the limitations of the underlying data collection, protocols, and technology. GenoML's mission is to bring machine learning for genomics and clinical data to non-experts by develo** an easy-to-use tool that automates the full development, evaluation, and deployment process. Emphasis is put on open science to make workflows easily accessible, replicable, and transferable within the scientific community. Source code and documentation is available at https://genoml.com. △ Less

Submitted 4 March, 2021; originally announced March 2021.

arXiv:2010.02508 [pdf, other]

Adversarial Boot Camp: label free certified robustness in one epoch

Authors: Ryan Campbell, Chris Finlay, Adam M Oberman

Abstract: Machine learning models are vulnerable to adversarial attacks. One approach to addressing this vulnerability is certification, which focuses on models that are guaranteed to be robust for a given perturbation size. A drawback of recent certified models is that they are stochastic: they require multiple computationally expensive model evaluations with random noise added to a given input. In our wor… ▽ More Machine learning models are vulnerable to adversarial attacks. One approach to addressing this vulnerability is certification, which focuses on models that are guaranteed to be robust for a given perturbation size. A drawback of recent certified models is that they are stochastic: they require multiple computationally expensive model evaluations with random noise added to a given input. In our work, we present a deterministic certification approach which results in a certifiably robust model. This approach is based on an equivalence between training with a particular regularized loss, and the expected values of Gaussian averages. We achieve certified models on ImageNet-1k by retraining a model with this loss for one epoch without the use of label information. △ Less

Submitted 5 October, 2020; originally announced October 2020.

Comments: 13 pages, 5 figures, 5 tables. Under review as a conference paper at ICLR 2021. arXiv admin note: substantial text overlap with arXiv:2006.06061

arXiv:2006.06061 [pdf, other]

Deterministic Gaussian Averaged Neural Networks

Authors: Ryan Campbell, Chris Finlay, Adam M Oberman

Abstract: We present a deterministic method to compute the Gaussian average of neural networks used in regression and classification. Our method is based on an equivalence between training with a particular regularized loss, and the expected values of Gaussian averages. We use this equivalence to certify models which perform well on clean data but are not robust to adversarial perturbations. In terms of cer… ▽ More We present a deterministic method to compute the Gaussian average of neural networks used in regression and classification. Our method is based on an equivalence between training with a particular regularized loss, and the expected values of Gaussian averages. We use this equivalence to certify models which perform well on clean data but are not robust to adversarial perturbations. In terms of certified accuracy and adversarial robustness, our method is comparable to known stochastic methods such as randomized smoothing, but requires only a single model evaluation during inference. △ Less

Submitted 10 June, 2020; originally announced June 2020.

arXiv:2004.14020 [pdf, other]

Caramel: Accelerating Decentralized Distributed Deep Learning with Computation Scheduling

Authors: Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, Brighten Godfrey, Roy Campbell

Abstract: The method of choice for parameter aggregation in Deep Neural Network (DNN) training, a network-intensive task, is shifting from the Parameter Server model to decentralized aggregation schemes (AllReduce) inspired by theoretical guarantees of better performance. However, current implementations of AllReduce overlook the interdependence of communication and computation, resulting in significant per… ▽ More The method of choice for parameter aggregation in Deep Neural Network (DNN) training, a network-intensive task, is shifting from the Parameter Server model to decentralized aggregation schemes (AllReduce) inspired by theoretical guarantees of better performance. However, current implementations of AllReduce overlook the interdependence of communication and computation, resulting in significant performance degradation. In this paper, we develop Caramel, a system that accelerates decentralized distributed deep learning through model-aware computation scheduling and communication optimizations for AllReduce. Caramel achieves this goal through (a) computation DAG scheduling that expands the feasible window of transfer for each parameter (transfer boundaries), and (b) network optimizations for smoothening of the load including adaptive batching and pipelining of parameter transfers. Caramel maintains the correctness of the dataflow model, is hardware-independent, and does not require any user-level or framework-level changes. We implement Caramel over TensorFlow and show that the iteration time of DNN training can be improved by up to 3.62x in a cloud environment. △ Less

Submitted 29 April, 2020; originally announced April 2020.

arXiv:1904.05456 [pdf, other]

doi 10.1145/2814576.2814808

R-Storm: Resource-Aware Scheduling in Storm

Authors: Boyang Peng, Mohammad Hosseini, Zhihao Hong, Reza Farivar, Roy Campbell

Abstract: The era of big data has led to the emergence of new systems for real-time distributed stream processing, e.g., Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems lacks an intelligent scheduling mechanism. The default round-robin scheduling currently deployed in Storm disregards resource demands and availabi… ▽ More The era of big data has led to the emergence of new systems for real-time distributed stream processing, e.g., Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems lacks an intelligent scheduling mechanism. The default round-robin scheduling currently deployed in Storm disregards resource demands and availability, and can therefore be inefficient at times. We present R-Storm (Resource-Aware Storm), a system that implements resource-aware scheduling within Storm. R-Storm is designed to increase overall throughput by maximizing resource utilization while minimizing network latency. When scheduling tasks, R-Storm can satisfy both soft and hard resource constraints as well as minimizing network distance between components that communicate with each other. We evaluate R-Storm on set of micro-benchmark Storm applications as well as Storm applications used in production at Yahoo! Inc. From our experimental results we conclude that R-Storm achieves 30-47% higher throughput and 69-350% better CPU utilization than default Storm for the micro-benchmarks. For the Yahoo! Storm applications, R-Storm outperforms default Storm by around 50% based on overall throughput. We also demonstrate that R-Storm performs much better when scheduling multiple Storm applications than default Storm. △ Less

Submitted 10 April, 2019; originally announced April 2019.

Comments: Proceedings of the 16th Annual ACM Middleware Conference, Pages 149-161, Vancouver, BC, Canada, December 07 - 11, 2015

Journal ref: In Proceedings of the 16th Annual Middleware Conference (Middleware 2015). ACM, New York, NY, USA, 149-161

arXiv:1903.00374 [pdf, other]

Model-Based Reinforcement Learning for Atari

Authors: Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

Abstract: Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and… ▽ More Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in low data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude. △ Less

Submitted 3 April, 2024; v1 submitted 1 March, 2019; originally announced March 2019.

arXiv:1812.00546 [pdf, other]

Learning the progression and clinical subtypes of Alzheimer's disease from longitudinal clinical data

Authors: Vipul Satone, Rachneet Kaur, Faraz Faghri, Mike A Nalls, Andrew B Singleton, Roy H Campbell

Abstract: Alzheimer's disease (AD) is a degenerative brain disease impairing a person's ability to perform day to day activities. The clinical manifestations of Alzheimer's disease are characterized by heterogeneity in age, disease span, progression rate, impairment of memory and cognitive abilities. Due to these variabilities, personalized care and treatment planning, as well as patient counseling about th… ▽ More Alzheimer's disease (AD) is a degenerative brain disease impairing a person's ability to perform day to day activities. The clinical manifestations of Alzheimer's disease are characterized by heterogeneity in age, disease span, progression rate, impairment of memory and cognitive abilities. Due to these variabilities, personalized care and treatment planning, as well as patient counseling about their individual progression is limited. Recent developments in machine learning to detect hidden patterns in complex, multi-dimensional datasets provides significant opportunities to address this critical need. In this work, we use unsupervised and supervised machine learning approaches for subtype identification and prediction. We apply machine learning methods to the extensive clinical observations available at the Alzheimer's Disease Neuroimaging Initiative (ADNI) data set to identify patient subtypes and to predict disease progression. Our analysis depicts the progression space for the Alzheimer's disease into low, moderate and high disease progression zones. The proposed work will enable early detection and characterization of distinct disease subtypes based on clinical heterogeneity. We anticipate that our models will enable patient counseling, clinical trial design, and ultimately individualized clinical care. △ Less

Submitted 5 December, 2018; v1 submitted 2 December, 2018; originally announced December 2018.

Comments: This volume represents the accepted submissions from the Machine Learning for Health (ML4H) workshop at the conference on Neural Information Processing Systems (NeurIPS) 2018, held on December 8, 2018 in Montreal, Canada

Report number: ML4H/2018/206

arXiv:1810.06983 [pdf, other]

Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models

Authors: Kaspar Märtens, Kieran R. Campbell, Christopher Yau

Abstract: The interpretation of complex high-dimensional data typically requires the use of dimensionality reduction techniques to extract explanatory low-dimensional representations. However, in many real-world problems these representations may not be sufficient to aid interpretation on their own, and it would be desirable to interpret the model in terms of the original features themselves. Our goal is to… ▽ More The interpretation of complex high-dimensional data typically requires the use of dimensionality reduction techniques to extract explanatory low-dimensional representations. However, in many real-world problems these representations may not be sufficient to aid interpretation on their own, and it would be desirable to interpret the model in terms of the original features themselves. Our goal is to characterise how feature-level variation depends on latent low-dimensional representations, external covariates, and non-linear interactions between the two. In this paper, we propose to achieve this through a structured kernel decomposition in a hybrid Gaussian Process model which we call the Covariate Gaussian Process Latent Variable Model (c-GPLVM). We demonstrate the utility of our model on simulated examples and applications in disease progression modelling from high-dimensional gene expression data in the presence of additional phenotypes. In each setting we show how the c-GPLVM can extract low-dimensional structures from high-dimensional data sets whilst allowing a breakdown of feature-level variability that is not present in other commonly used dimensionality reduction approaches. △ Less

Submitted 24 June, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

Journal ref: Proceedings of the 36th International Conference on Machine Learning (ICML 2019)

arXiv:1803.03288 [pdf, other]

TicTac: Accelerating Distributed Deep Learning with Communication Scheduling

Authors: Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, Roy H. Campbell

Abstract: State-of-the-art deep learning systems rely on iterative distributed training to tackle the increasing complexity of models and input data. The iteration time in these communication-heavy systems depends on the computation time, communication time and the extent of overlap of computation and communication. In this work, we identify a shortcoming in systems with graph representation for computati… ▽ More State-of-the-art deep learning systems rely on iterative distributed training to tackle the increasing complexity of models and input data. The iteration time in these communication-heavy systems depends on the computation time, communication time and the extent of overlap of computation and communication. In this work, we identify a shortcoming in systems with graph representation for computation, such as TensorFlow and PyTorch, that result in high variance in iteration time --- random order of received parameters across workers. We develop a system, TicTac, to improve the iteration time by fixing this issue in distributed deep learning with Parameter Servers while guaranteeing near-optimal overlap of communication and computation. TicTac identifies and enforces an order of network transfers which improves the iteration time using prioritization. Our system is implemented over TensorFlow and requires no changes to the model or developer inputs. TicTac improves the throughput by up to $37.7\%$ in inference and $19.2\%$ in training, while also reducing straggler effect by up to $2.3\times$. Our code is publicly available. △ Less

Submitted 3 October, 2018; v1 submitted 8 March, 2018; originally announced March 2018.

arXiv:1710.11252 [pdf, other]

Stochastic Variational Video Prediction

Authors: Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, Sergey Levine

Abstract: Predicting the future in real-world settings, particularly from raw sensory observations such as images, is exceptionally challenging. Real-world events can be stochastic and unpredictable, and the high dimensionality and complexity of natural images requires the predictive model to build an intricate understanding of the natural world. Many existing methods tackle this problem by making simplifyi… ▽ More Predicting the future in real-world settings, particularly from raw sensory observations such as images, is exceptionally challenging. Real-world events can be stochastic and unpredictable, and the high dimensionality and complexity of natural images requires the predictive model to build an intricate understanding of the natural world. Many existing methods tackle this problem by making simplifying assumptions about the environment. One common assumption is that the outcome is deterministic and there is only one plausible future. This can lead to low-quality predictions in real-world settings with stochastic dynamics. In this paper, we develop a stochastic variational video prediction (SV2P) method that predicts a different possible future for each sample of its latent variables. To the best of our knowledge, our model is the first to provide effective stochastic multi-frame prediction for real-world video. We demonstrate the capability of the proposed method in predicting detailed future frames of videos on multiple real-world datasets, both action-free and action-conditioned. We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods. Our SV2P implementation will be open sourced upon publication. △ Less

Submitted 6 March, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

arXiv:1710.00112 [pdf]

Toward Scalable Machine Learning and Data Mining: the Bioinformatics Case

Authors: Faraz Faghri, Sayed Hadi Hashemi, Mohammad Babaeizadeh, Mike A. Nalls, Saurabh Sinha, Roy H. Campbell

Abstract: In an effort to overcome the data deluge in computational biology and bioinformatics and to facilitate bioinformatics research in the era of big data, we identify some of the most influential algorithms that have been widely used in the bioinformatics community. These top data mining and machine learning algorithms cover classification, clustering, regression, graphical model-based learning, and d… ▽ More In an effort to overcome the data deluge in computational biology and bioinformatics and to facilitate bioinformatics research in the era of big data, we identify some of the most influential algorithms that have been widely used in the bioinformatics community. These top data mining and machine learning algorithms cover classification, clustering, regression, graphical model-based learning, and dimensionality reduction. The goal of this study is to guide the focus of scalable computing experts in the endeavor of applying new storage and scalable computation designs to bioinformatics algorithms that merit their attention most, following the engineering maxim of "optimize the common case". △ Less

Submitted 29 September, 2017; originally announced October 2017.

arXiv:1710.00110 [pdf, other]

Decentralized User-Centric Access Control using PubSub over Blockchain

Authors: Sayed Hadi Hashemi, Faraz Faghri, Roy H Campbell

Abstract: We present a mechanism that puts users in the center of control and empowers them to dictate the access to their collections of data. Revisiting the fundamental mechanisms in security for providing protection, our solution uses capabilities, access lists, and access rights following well-understood formal notions for reasoning about access. This contribution presents a practical, correct, auditabl… ▽ More We present a mechanism that puts users in the center of control and empowers them to dictate the access to their collections of data. Revisiting the fundamental mechanisms in security for providing protection, our solution uses capabilities, access lists, and access rights following well-understood formal notions for reasoning about access. This contribution presents a practical, correct, auditable, transparent, distributed, and decentralized mechanism that is well-matched to the current emerging environments including Internet of Things, smart city, precision medicine, and autonomous cars. It is based on well-tested principles and practices used in a distributed authorization, cryptocurrencies, and scalable computing. △ Less

Submitted 29 September, 2017; originally announced October 2017.

arXiv:1708.09538 [pdf]

A Novel Scheduling Framework Leveraging Hardware Cache Partitioning for Cache-Side-Channel Elimination in Clouds

Authors: Read Sprabery, Konstantin Evchenko, Abhilash Raj, Rakesh B. Bobba, Sibin Mohan, Roy H. Campbell

Abstract: While there exist many isolation mechanisms that are available to cloud service providers, including virtual machines, containers, etc., the problem of side-channel increases in importance as a remaining security vulnerability, particularly in the presence of shared caches and multicore processors. In this paper we present a hardware-software mechanism that improves the isolation of cloud processe… ▽ More While there exist many isolation mechanisms that are available to cloud service providers, including virtual machines, containers, etc., the problem of side-channel increases in importance as a remaining security vulnerability, particularly in the presence of shared caches and multicore processors. In this paper we present a hardware-software mechanism that improves the isolation of cloud processes in the presence of shared caches on multicore chips. Combining the Intel CAT architecture that enables cache partitioning on the fly with novel scheduling techniques and state cleansing mechanisms, we enable cache-side-channel free computing for Linux-based containers and virtual machines, in particular, those managed by KVM. We do a preliminary evaluation of our system using a CPU bound workload. Our system allows Simultaneous Multithreading (SMT) to remain enabled and does not require application level changes. △ Less

Submitted 30 August, 2017; originally announced August 2017.

arXiv:1704.06001 [pdf, other]

Fast Generation for Convolutional Autoregressive Models

Authors: Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang

Abstract: Convolutional autoregressive models have recently demonstrated state-of-the-art performance on a number of generation tasks. While fast, parallel training methods have been crucial for their success, generation is typically implemented in a naïve fashion where redundant computations are unnecessarily repeated. This results in slow generation, making such models infeasible for production environmen… ▽ More Convolutional autoregressive models have recently demonstrated state-of-the-art performance on a number of generation tasks. While fast, parallel training methods have been crucial for their success, generation is typically implemented in a naïve fashion where redundant computations are unnecessarily repeated. This results in slow generation, making such models infeasible for production environments. In this work, we describe a method to speed up generation in convolutional autoregressive models. The key idea is to cache hidden states to avoid redundant computation. We apply our fast generation method to the Wavenet and PixelCNN++ models and achieve up to $21\times$ and $183\times$ speedups respectively. △ Less

Submitted 20 April, 2017; originally announced April 2017.

Comments: Accepted at ICLR 2017 Workshop

arXiv:1612.00521 [pdf, other]

Performance Modeling of Distributed Deep Neural Networks

Authors: Sayed Hadi Hashemi, Shadi A. Noghabi, William Gropp, Roy H Campbell

Abstract: During the past decade, machine learning has become extremely popular and can be found in many aspects of our every day life. Nowayadays with explosion of data while rapid growth of computation capacity, Distributed Deep Neural Networks (DDNNs) which can improve their performance linearly with more computation resources, have become hot and trending. However, there has not been an in depth study o… ▽ More During the past decade, machine learning has become extremely popular and can be found in many aspects of our every day life. Nowayadays with explosion of data while rapid growth of computation capacity, Distributed Deep Neural Networks (DDNNs) which can improve their performance linearly with more computation resources, have become hot and trending. However, there has not been an in depth study of the performance of these systems, and how well they scale. In this paper we analyze CNTK, one of the most commonly used DDNNs, by first building a performance model and then evaluating the system two settings: a small cluster with all nodes in a single rack connected to a top of rack switch, and in large scale using Blue Waters with arbitary placement of nodes. Our main focus was the scalability of the system with respect to adding more nodes. Based on our results, this system has an excessive initialization overhead because of poor I/O utilization which dominates the whole execution time. Because of this, the system does not scale beyond a few nodes (4 in Blue Waters). Additionally, due to a single server-multiple worker design the server becomes a bottleneck after 16 nodes limiting the scalability of the CNTK. △ Less

Submitted 14 December, 2016; v1 submitted 1 December, 2016; originally announced December 2016.

arXiv:1611.06211 [pdf, other]

NoiseOut: A Simple Way to Prune Neural Networks

Authors: Mohammad Babaeizadeh, Paris Smaragdis, Roy H. Campbell

Abstract: Neural networks are usually over-parameterized with significant redundancy in the number of required neurons which results in unnecessary computation and memory usage at inference time. One common approach to address this issue is to prune these big networks by removing extra neurons and parameters while maintaining the accuracy. In this paper, we propose NoiseOut, a fully automated pruning algori… ▽ More Neural networks are usually over-parameterized with significant redundancy in the number of required neurons which results in unnecessary computation and memory usage at inference time. One common approach to address this issue is to prune these big networks by removing extra neurons and parameters while maintaining the accuracy. In this paper, we propose NoiseOut, a fully automated pruning algorithm based on the correlation between activations of neurons in the hidden layers. We prove that adding additional output neurons with entirely random targets results into a higher correlation between neurons which makes pruning by NoiseOut even more efficient. Finally, we test our method on various networks and datasets. These experiments exhibit high pruning rates while maintaining the accuracy of the original network. △ Less

Submitted 18 November, 2016; originally announced November 2016.

arXiv:cs/0509095 [pdf]

Leveraging Social-Network Infrastructure to Improve Peer-to-Peer Overlay Performance: Results from Orkut

Authors: Zahid Anwar, William Yurcik, Vivek Pandey, Asim Shankar, Indranil Gupta, Roy H. Campbell

Abstract: Application-level peer-to-peer (P2P) network overlays are an emerging paradigm that facilitates decentralization and flexibility in the scalable deployment of applications such as group communication, content delivery, and data sharing. However the construction of the overlay graph topology optimized for low latency, low link and node stress and lookup performance is still an open problem. We pr… ▽ More Application-level peer-to-peer (P2P) network overlays are an emerging paradigm that facilitates decentralization and flexibility in the scalable deployment of applications such as group communication, content delivery, and data sharing. However the construction of the overlay graph topology optimized for low latency, low link and node stress and lookup performance is still an open problem. We present a design of an overlay constructed on top of a social network and show that it gives a sizable improvement in lookups, average round-trip delay and scalability as opposed to other overlay topologies. We build our overlay on top of the topology of a popular real-world social network namely Orkut. We show Orkuts suitability for our purposes by evaluating the clustering behavior of its graph structure and the socializing pattern of its members. △ Less

Submitted 28 September, 2005; originally announced September 2005.

Comments: 9 pages 8 figures

ACM Class: C.2.2

Showing 1–24 of 24 results for author: Campbell, R