Showing 1–5 of 5 results for author: Pennycook, S J

  1. arXiv:2310.16122  [pdf, other]

    cs.PF astro-ph.CO cs.DC

    A Performance-Portable SYCL Implementation of CRK-HACC for Exascale

    Authors: Esteban M. Rangel, S. John Pennycook, Adrian Pope, Nicholas Frontiere, Zhiqiang Ma, Varsha Madananth

    Abstract: The first generation of exascale systems will include a variety of machine architectures, featuring GPUs from multiple vendors. As a result, many developers are interested in adopting portable programming models to avoid maintaining multiple versions of their code. It is necessary to document experiences with such programming models to assist developers in understanding the advantages and disadvan…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 12 pages, 13 figures, 2023 International Workshop on Performance, Portability & Productivity in HPC

    ACM Class: D.2.7; D.2.8; D.1.3; J.2

  2. arXiv:1808.04728  [pdf, other]

    astro-ph.CO astro-ph.IM cs.LG physics.comp-ph

    CosmoFlow: Using Deep Learning to Learn the Universe at Scale

    Authors: Amrita Mathuriya, Deborah Bard, Peter Mendygral, Lawrence Meadows, James Arnemann, Lei Shao, Siyu He, Tuomas Karna, Daina Moise, Simon J. Pennycook, Kristyn Maschoff, Jason Sewall, Nalini Kumar, Shirley Ho, Mike Ringenburg, Prabhat, Victor Lee

    Abstract: Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many el…

    Submitted 9 November, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

    Comments: 11 pages, 6 pages, presented at SuperComputing 2018

  3. arXiv:1710.08774  [pdf, other]

    cs.PF cs.DC

    High-Performance Code Generation though Fusion and Vectorization

    Authors: Jason Sewall, Simon J. Pennycook

    Abstract: We present a technique for automatically transforming kernel-based computations in disparate, nested loops into a fused, vectorized form that can reduce intermediate storage needs and lead to improved performance on contemporary hardware. We introduce representations for the abstract relationships and data dependencies of kernels in loop nests and algorithms for manipulating them into more effic…

    Submitted 24 October, 2017; originally announced October 2017.

  4. arXiv:1611.07409  [pdf, ps, other]

    cs.PF

    A Metric for Performance Portability

    Authors: S. J. Pennycook, J. D. Sewall, V. W. Lee

    Abstract: The term "performance portability" has been informally used in computing to refer to a variety of notions which generally include: 1) the ability to run one application across multiple hardware platforms; and 2) achieving some notional level of performance on these platforms. However, there has been a noticeable lack of consensus on the precise meaning of the term, and authors' conclusions regardi…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: 7 pages, in Proceedings of the 7th International Workshop in Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems

  5. arXiv:1503.08809  [pdf, other]

    cs.DC astro-ph.CO cs.PF

    Separable projection integrals for higher-order correlators of the cosmic microwave sky: Acceleration by factors exceeding 100

    Authors: J. P. Briggs, S. J. Pennycook, J. R. Fergusson, J. Jäykkä, E. P. S. Shellard

    Abstract: We present a case study describing efforts to optimise and modernise "Modal", the simulation and analysis pipeline used by the Planck satellite experiment for constraining general non-Gaussian models of the early universe via the bispectrum (or three-point correlator) of the cosmic microwave background radiation. We focus on one particular element of the code: the projection of bispectra from the…

    Submitted 26 January, 2016; v1 submitted 30 March, 2015; originally announced March 2015.

    Comments: Accepted by Journal of Computational Physics