Showing 1–7 of 7 results for author: Giles, M

Searching in archive cs.
  1. arXiv:2311.05341  [pdf, other]

    astro-ph.IM cs.DC

    Accelerating Dedispersion using Many-Core Architectures

    Authors: Jan Novotný, Karel Adámek, M. A. Clark, Mike Giles, Wesley Armour

    Abstract: Astrophysical radio signals are excellent probes of extreme physical processes that emit them. However, to reach Earth, electromagnetic radiation passes through the ionised interstellar medium (ISM), introducing a frequency-dependent time delay (dispersion) to the emitted signal. Removing dispersion enables searches for transient signals like Fast Radio Bursts (FRB) or repeating signals from isola…

    Submitted 9 November, 2023; originally announced November 2023.

    Journal ref: The Astrophysical Journal Supplement Series, Volume 269, Number 1, 2023

  2. arXiv:2010.16225  [pdf, other]

    math.NA cs.PF math.PR

    Effects of round-to-nearest and stochastic rounding in the numerical solution of the heat equation in low precision

    Authors: Matteo Croci, Michael B. Giles

    Abstract: Motivated by the advent of machine learning, the last few years have seen the return of hardware-supported low-precision computing. Computations with fewer digits are faster and more memory and energy efficient, but can be extremely susceptible to rounding errors. As shown by recent studies into reduced-precision climate simulations, an application that can largely benefit from the advantages of l…

    Submitted 28 March, 2022; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: 30 pages, 4 figures

    MSC Class: 65G50; 65G30; 65M06; 65M12; 65M15; 65M22; 65Y99; 65C20

    Journal ref: IMA Journal of Numerical Analysis, April 2022

  3. arXiv:1910.01972  [pdf, ps, other]

    cs.MS cs.DC cs.PF

    GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory

    Authors: Karel Adámek, Sofia Dimoudi, Mike Giles, Wesley Armour

    Abstract: We present an implementation of the overlap-and-save method, a method for the convolution of very long signals with short response functions, which is tailored to GPUs. We have implemented several FFT algorithms (using the CUDA programming language) which exploit GPU shared memory, allowing for GPU accelerated convolution. We compare our implementation with an implementation of the overlap-and-sav…

    Submitted 10 April, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

    Comments: accepted to ACM TACO

    Journal ref: ACM Trans. Archit. Code Optim. 17, 3, Article 18 (September 2020)

  4. arXiv:1709.02125  [pdf, other]

    cs.DC

    Beyond 16GB: Out-of-Core Stencil Computations

    Authors: Istvan Z Reguly, Gihan R Mudalige, Michael B Giles

    Abstract: Stencil computations are a key class of applications, widely used in the scientific computing community, and a class that has particularly benefited from performance improvements on architectures with high memory bandwidth. Unfortunately, such architectures come with a limited amount of fast memory, which is limiting the size of the problems that can be efficiently solved. In this paper, we addres…

    Submitted 26 October, 2017; v1 submitted 7 September, 2017; originally announced September 2017.

  5. Loop Tiling in Large-Scale Stencil Codes at Run-time with OPS

    Authors: Istvan Z Reguly, Gihan R Mudalige, Mike B Giles

    Abstract: The key common bottleneck in most stencil codes is data movement, and prior research has shown that improving data locality through optimisations that schedule across loops do particularly well. However, in many large PDE applications it is not possible to apply such optimisations through compilers because there are many options, execution paths and data per grid point, many dependent on run-time…

    Submitted 26 June, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

  6. Acceleration of a Full-scale Industrial CFD Application with OP2

    Authors: István Z. Reguly, Gihan R. Mudalige, Carlo Bertolli, Michael B. Giles, Adam Betts, Paul H. J. Kelly, David Radford

    Abstract: Hydra is a full-scale industrial CFD application used for the design of turbomachinery at Rolls Royce plc. It consists of over 300 parallel loops with a code base exceeding 50K lines and is capable of performing complex simulations over highly detailed unstructured mesh geometries. Unlike simpler structured-mesh applications, which feature high speed-ups when accelerated by modern processor archit…

    Submitted 27 March, 2014; originally announced March 2014.

    Comments: Submitted to ACM Transactions on Parallel Computing

    ACM Class: C.4

    Journal ref: IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 5, pp. 1265-1278, May 1 2016. doi: 10.1109/TPDS.2015.2453972

  7. arXiv:1309.1780  [pdf, ps, other]

    cs.CE cs.MS cs.SE

    Software Abstractions and Methodologies for HPC Simulation Codes on Future Architectures

    Authors: A. Dubey, S. Brandt, R. Brower, M. Giles, P. Hovland, D. Q. Lamb, F. Loffler, B. Norris, B. OShea, C. Rebbi, M. Snir, R. Thakur

    Abstract: Large, complex, multi-scale, multi-physics simulation codes, running on high performance computing (HPC) platforms, have become essential to advancing science and engineering. These codes simulate multi-scale, multi-physics phenomena with unprecedented fidelity on petascale platforms, and are used by large communities. Continued ability of these codes to run on future platforms is as crucial to t…

    Submitted 6 September, 2013; originally announced September 2013.

    Comments: Position Paper