Showing 1–1 of 1 results for author: Faingnaert, T

  1. arXiv:2009.12263  [pdf, other]

    cs.MS cs.DC cs.LG cs.PF

    Flexible Performant GEMM Kernels on GPUs

    Authors: Thomas Faingnaert, Tim Besard, Bjorn De Sutter

    Abstract: General Matrix Multiplication or GEMM kernels take centre place in high performance computing and machine learning. Recent NVIDIA GPUs include GEMM accelerators, such as NVIDIA's Tensor Cores. Their exploitation is hampered by the two-language problem: it requires either low-level programming which implies low programmer productivity or using libraries that only offer a limited set of components.…

    Submitted 22 November, 2021; v1 submitted 25 September, 2020; originally announced September 2020.

    Comments: This paper was submitted to IEEE TPDS