Showing 1–9 of 9 results for author: Rajamanickam, S

Searching in archive math.
  1. arXiv:2304.04876  [pdf, other]

    math.NA cs.DC cs.MS

    An Experimental Study of Two-Level Schwarz Domain Decomposition Preconditioners on GPUs

    Authors: Ichitaro Yamazaki, Alexander Heinlein, Sivasankaran Rajamanickam

    Abstract: The generalized Dryja--Smith--Widlund (GDSW) preconditioner is a two-level overlapping Schwarz domain decomposition (DD) preconditioner that couples a classical one-level overlapping Schwarz preconditioner with an energy-minimizing coarse space. When used to accelerate the convergence rate of Krylov subspace iterative methods, the GDSW preconditioner provides robustness and scalability for the sol…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted for publication in IPDPS'23

  2. arXiv:2109.01232  [pdf, other]

    cs.DC cs.MS math.NA

    A Study of Mixed Precision Strategies for GMRES on GPUs

    Authors: Jennifer A. Loe, Christian A. Glusa, Ichitaro Yamazaki, Erik G. Boman, Sivasankaran Rajamanickam

    Abstract: Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems require double precision accuracy in several domains. This conflict between hardware trends and application needs has resulted in a need for mixed precision stra…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2105.07544

  3. arXiv:2105.07544  [pdf, other]

    math.NA cs.MS

    Experimental Evaluation of Multiprecision Strategies for GMRES on GPUs

    Authors: Jennifer A. Loe, Christian A. Glusa, Ichitaro Yamazaki, Erik G. Boman, Sivasankaran Rajamanickam

    Abstract: Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems require double precision accuracy in several domains. This conflict between hardware trends and application needs has resulted in a need for multiprecision strat…

    Submitted 16 May, 2021; originally announced May 2021.

    Comments: Accepted for publication in the IEEE IPDPS Accelerators and Hybrid Emerging Systems (AsHES) 11th Workshop, 2021

  4. arXiv:2104.01196  [pdf, other]

    math.NA cs.MS

    Two-Stage Gauss--Seidel Preconditioners and Smoothers for Krylov Solvers on a GPU cluster

    Authors: Luc Berger-Vergiat, Brian Kelley, Sivasankaran Rajamanickam, Jonathan Hu, Katarzyna Swirydowicz, Paul Mullowney, Stephen Thomas, Ichitaro Yamazaki

    Abstract: Gauss-Seidel (GS) relaxation is often employed as a preconditioner for a Krylov solver or as a smoother for Algebraic Multigrid (AMG). However, the requisite sparse triangular solve is difficult to parallelize on many-core architectures such as graphics processing units (GPUs). In the present study, the performance of the traditional GS relaxation based on a triangular solve is compared with two-s…

    Submitted 24 April, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

  5. arXiv:2007.06674  [pdf, other]

    cs.MS math.NA

    A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic

    Authors: Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin Carson, Terry Cojean, Jack Dongarra, Mark Gates, Thomas Grützmacher, Nicholas J. Higham, Sherry Li, Neil Lindquist, Yang Liu, Jennifer Loe, Piotr Luszczek, Pratik Nayak, Sri Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry Smith, Kasia Swirydowicz, Stephen Thomas, Stanimire Tomov, Yaohung M. Tsai, Ichitaro Yamazaki, Ulrike Meier Yang

    Abstract: Within the past years, hardware vendors have started designing low precision special function units in response to the demand of the Machine Learning community and their demand for high compute power in low precision formats. Also the server-line products are increasingly featuring low-precision special function units, such as the NVIDIA tensor cores in ORNL's Summit supercomputer providing more t…

    Submitted 13 July, 2020; originally announced July 2020.

    Comments: Technical report as a part of the Exascale computing project (ECP)

    ACM Class: G.1.3; G.4

  6. arXiv:1901.02971  [pdf, other]

    math.NA

    An Algebraic Sparsified Nested Dissection Algorithm Using Low-Rank Approximations

    Authors: Léopold Cambier, Chao Chen, Erik G Boman, Sivasankaran Rajamanickam, Raymond S. Tuminaro, Eric Darve

    Abstract: We propose a new algorithm for the fast solution of large, sparse, symmetric positive-definite linear systems, spaND -- sparsified Nested Dissection. It is based on nested dissection, sparsification and low-rank compression. After eliminating all interiors at a given level of the elimination tree, the algorithm sparsifies all separators corresponding to the interiors. This operation reduces the si…

    Submitted 27 January, 2020; v1 submitted 9 January, 2019; originally announced January 2019.

  7. A Robust Hierarchical Solver for Ill-conditioned Systems with Applications to Ice Sheet Modeling

    Authors: Chao Chen, Leopold Cambier, Erik G. Boman, Sivasankaran Rajamanickam, Raymond S. Tuminaro, Eric Darve

    Abstract: A hierarchical solver is proposed for solving sparse ill-conditioned linear systems in parallel. The solver is based on a modification of the LoRaSp method, but employs a deferred-compression technique, which provably reduces the approximation error and significantly improves efficiency. Moreover, the deferred-compression technique introduces minimal overhead and does not affect parallelism. As a…

    Submitted 29 November, 2018; v1 submitted 27 November, 2018; originally announced November 2018.

    Comments: corrected misspelled author names

    MSC Class: 65F99

  8. arXiv:1808.08172  [pdf, other]

    math.NA cs.DC

    Asynchronous One-Level and Two-Level Domain Decomposition Solvers

    Authors: Christian Glusa, Paritosh Ramanan, Erik G. Boman, Edmond Chow, Sivasankaran Rajamanickam

    Abstract: Parallel implementations of linear iterative solvers generally alternate between phases of data exchange and phases of local computation. Increasingly large problem sizes on more heterogeneous systems make load balancing and network layout very challenging tasks. In particular, global communication patterns such as inner products become increasingly limiting at scale. We explore the use of asynchr…

    Submitted 10 August, 2020; v1 submitted 24 August, 2018; originally announced August 2018.

    MSC Class: 68W10; 65Y05; 68W15; 65N55

  9. arXiv:1712.07297  [pdf, other]

    math.NA cs.MS

    A distributed-memory hierarchical solver for general sparse linear systems

    Authors: Chao Chen, Hadi Pouransari, Sivasankaran Rajamanickam, Erik G. Boman, Eric Darve

    Abstract: We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct s…

    Submitted 19 December, 2017; originally announced December 2017.

    MSC Class: 65F50