
Showing 1–5 of 5 results for author: Sewall, J

  1. arXiv:2406.11704  [pdf, other]

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation be…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2210.08046  [pdf, other]

    cs.GR cs.LG cs.MA

    Differentiable Hybrid Traffic Simulation

    Authors: Sanghyun Son, Yi-Ling Qiao, Jason Sewall, Ming C. Lin

    Abstract: We introduce a novel differentiable hybrid traffic simulator, which simulates traffic using a hybrid model of both macroscopic and microscopic models and can be directly integrated into a neural network for traffic control and flow optimization. This is the first differentiable traffic simulator for macroscopic and hybrid models that can compute gradients for traffic states across time steps and i…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: 13 pages, Siggraph Asia 2022 Journal Paper

    ACM Class: I.6.1; I.6.3

  3. arXiv:1808.04728  [pdf, other]

    astro-ph.CO astro-ph.IM cs.LG physics.comp-ph

    CosmoFlow: Using Deep Learning to Learn the Universe at Scale

    Authors: Amrita Mathuriya, Deborah Bard, Peter Mendygral, Lawrence Meadows, James Arnemann, Lei Shao, Siyu He, Tuomas Karna, Daina Moise, Simon J. Pennycook, Kristyn Maschoff, Jason Sewall, Nalini Kumar, Shirley Ho, Mike Ringenburg, Prabhat, Victor Lee

    Abstract: Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many el…

    Submitted 9 November, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

    Comments: 11 pages, 6 pages, presented at SuperComputing 2018

  4. arXiv:1710.08774  [pdf, other]

    cs.PF cs.DC

    High-Performance Code Generation though Fusion and Vectorization

    Authors: Jason Sewall, Simon J. Pennycook

    Abstract: We present a technique for automatically transforming kernel-based computations in disparate, nested loops into a fused, vectorized form that can reduce intermediate storage needs and lead to improved performance on contemporary hardware. We introduce representations for the abstract relationships and data dependencies of kernels in loop nests and algorithms for manipulating them into more effic…

    Submitted 24 October, 2017; originally announced October 2017.

  5. arXiv:1611.07409  [pdf, ps, other]

    cs.PF

    A Metric for Performance Portability

    Authors: S. J. Pennycook, J. D. Sewall, V. W. Lee

    Abstract: The term "performance portability" has been informally used in computing to refer to a variety of notions which generally include: 1) the ability to run one application across multiple hardware platforms; and 2) achieving some notional level of performance on these platforms. However, there has been a noticeable lack of consensus on the precise meaning of the term, and authors' conclusions regardi…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: 7 pages, in Proceedings of the 7th International Workshop in Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems