Showing 1–7 of 7 results for author: Houston, M

  1. arXiv:2201.11990  [pdf, other]

    cs.CL

    Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

    Authors: Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro

    Abstract: Pretrained general-purpose language models can achieve state-of-the-art accuracies in various natural language processing domains by adapting to downstream tasks via zero-shot, few-shot and fine-tuning techniques. Because of their success, the size of these models has increased rapidly, requiring high-performance hardware, software, and algorithmic techniques to enable training such large models.…

    Submitted 4 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: Shaden Smith and Mostofa Patwary contributed equally

  2. Strategies for Maximizing Detection Rate in Radio SETI

    Authors: Kenneth M. Houston, Andrew P. V. Siemion, Steve Croft

    Abstract: The Search for Extraterrestrial Intelligence (SETI) is a scientific and cultural effort seeking evidence of intelligent life beyond Earth. Radio SETI observes the radio spectrum for "technosignatures" that could be produced by an advanced ET society. This work models radio SETI as an end-to-end system, and focuses on narrow-band intentional transmissions. We look at strategies to maximize the exp…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: Accepted for publication in AJ

  3. arXiv:1910.13444  [pdf, other]

    physics.comp-ph cs.LG stat.ML

    Highly-scalable, physics-informed GANs for learning solutions of stochastic PDEs

    Authors: Liu Yang, Sean Treichler, Thorsten Kurth, Keno Fischer, David Barajas-Solano, Josh Romero, Valentin Churavy, Alexandre Tartakovsky, Michael Houston, Prabhat, George Karniadakis

    Abstract: Uncertainty quantification for forward and inverse problems is a central challenge across physical and biomedical disciplines. We address this challenge for the problem of modeling subsurface flow at the Hanford Site by combining stochastic computational models with observational data using physics-informed GAN models. The geographic extent, spatial heterogeneity, and multiple correlation length s…

    Submitted 28 October, 2019; originally announced October 2019.

    Comments: 3rd Deep Learning on Supercomputers Workshop (DLS) at SC19

  4. arXiv:1810.01993  [pdf, other]

    cs.DC

    Exascale Deep Learning for Climate Analytics

    Authors: Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston

    Abstract: We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parall…

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: 12 pages, 5 tables, 4 figures, Super Computing Conference, November 11-16, 2018, Dallas, TX, USA

  5. arXiv:1710.03740  [pdf, other]

    cs.AI cs.LG stat.ML

    Mixed Precision Training

    Authors: Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu

    Abstract: Deep neural networks have enabled progress in a wide variety of applications. Growing the size of the neural network typically results in improved accuracy. As model sizes grow, the memory and compute requirements for training these models also increase. We introduce a technique to train deep neural networks using half precision floating point numbers. In our technique, weights, activations and g…

    Submitted 15 February, 2018; v1 submitted 10 October, 2017; originally announced October 2017.

    Comments: Published as a conference paper at ICLR 2018

  6. arXiv:1610.03521  [pdf, other]

    q-bio.QM q-bio.TO

    A Review of Mathematical Models for Muscular Dystrophy: A Systems Biology Approach

    Authors: Amanda N. Cameron, Matthew T. Houston, Juan B. Gutierrez

    Abstract: Muscular dystrophy (MD) describes generalized progressive muscular weakness due to the wasting of muscle fibers. The progression of the disease is affected by known immunological and mechanical factors, and possibly other unknown mechanisms. These dynamics have begun to be elucidated in the last two decades. This article reviews mathematical models of MD that characterize molecular and cellular co…

    Submitted 28 October, 2016; v1 submitted 11 October, 2016; originally announced October 2016.

    Comments: 23 pages, 2 figures

    MSC Class: 92C42

  7. arXiv:0706.3060  [pdf, ps, other]

    cs.CE cs.DC

    N-Body Simulations on GPUs

    Authors: Erich Elsen, V. Vishal, Mike Houston, Vijay Pande, Pat Hanrahan, Eric Darve

    Abstract: Commercial graphics processors (GPUs) have high compute capacity at very low cost, which makes them attractive for general purpose scientific computing. In this paper we show how graphics processors can be used for N-body simulations to obtain improvements in performance over current generation CPUs. We have developed a highly optimized algorithm for performing the O(N^2) force calculations that…

    Submitted 20 June, 2007; originally announced June 2007.