Showing 1–4 of 4 results for author: Giles, M B

Searching in archive cs.
  1. arXiv:2010.16225

    math.NA cs.PF math.PR

    Effects of round-to-nearest and stochastic rounding in the numerical solution of the heat equation in low precision

    Authors: Matteo Croci, Michael B. Giles

    Abstract: Motivated by the advent of machine learning, the last few years have seen the return of hardware-supported low-precision computing. Computations with fewer digits are faster and more memory and energy efficient, but can be extremely susceptible to rounding errors. As shown by recent studies into reduced-precision climate simulations, an application that can largely benefit from the advantages of l…

    Submitted 28 March, 2022; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: 30 pages, 4 figures

    MSC Class: 65G50; 65G30; 65M06; 65M12; 65M15; 65M22; 65Y99; 65C20

    Journal ref: IMA Journal of Numerical Analysis, April 2022

  2. arXiv:1709.02125

    cs.DC

    Beyond 16GB: Out-of-Core Stencil Computations

    Authors: Istvan Z Reguly, Gihan R Mudalige, Michael B Giles

    Abstract: Stencil computations are a key class of applications, widely used in the scientific computing community, and a class that has particularly benefited from performance improvements on architectures with high memory bandwidth. Unfortunately, such architectures come with a limited amount of fast memory, which is limiting the size of the problems that can be efficiently solved. In this paper, we addres…

    Submitted 26 October, 2017; v1 submitted 7 September, 2017; originally announced September 2017.

  3. Loop Tiling in Large-Scale Stencil Codes at Run-time with OPS

    Authors: Istvan Z Reguly, Gihan R Mudalige, Mike B Giles

    Abstract: The key common bottleneck in most stencil codes is data movement, and prior research has shown that improving data locality through optimisations that schedule across loops do particularly well. However, in many large PDE applications it is not possible to apply such optimisations through compilers because there are many options, execution paths and data per grid point, many dependent on run-time…

    Submitted 26 June, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

  4. Acceleration of a Full-scale Industrial CFD Application with OP2

    Authors: István Z. Reguly, Gihan R. Mudalige, Carlo Bertolli, Michael B. Giles, Adam Betts, Paul H. J. Kelly, David Radford

    Abstract: Hydra is a full-scale industrial CFD application used for the design of turbomachinery at Rolls Royce plc. It consists of over 300 parallel loops with a code base exceeding 50K lines and is capable of performing complex simulations over highly detailed unstructured mesh geometries. Unlike simpler structured-mesh applications, which feature high speed-ups when accelerated by modern processor archit…

    Submitted 27 March, 2014; originally announced March 2014.

    Comments: Submitted to ACM Transactions on Parallel Computing

    ACM Class: C.4

    Journal ref: IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 5, pp. 1265-1278, May 1 2016. doi: 10.1109/TPDS.2015.2453972