Showing 1–9 of 9 results for author: Rupp, K

Searching in archive cs.
  1. arXiv:2112.10971  [pdf, other]

    stat.ML cs.LG q-bio.PE

    Differentiated uniformization: A new method for inferring Markov chains on combinatorial state spaces including stochastic epidemic models

    Authors: Kevin Rupp, Rudolf Schill, Jonas Süskind, Peter Georg, Maren Klever, Andreas Lösch, Lars Grasedyck, Tilo Wettig, Rainer Spang

    Abstract: Motivation: We consider continuous-time Markov chains that describe the stochastic evolution of a dynamical system by a transition-rate matrix $Q$ which depends on a parameter $θ$. Computing the probability distribution over states at time $t$ requires the matrix exponential $\exp(tQ)$, and inferring $θ$ from data requires its derivative $\partial\exp\!(tQ)/\partialθ$. Both are challenging to comp…

    Submitted 20 December, 2021; originally announced December 2021.

  2. arXiv:2011.00715  [pdf, other]

    cs.MS cs.DC

    Toward Performance-Portable PETSc for GPU-based Exascale Systems

    Authors: Richard Tran Mills, Mark F. Adams, Satish Balay, Jed Brown, Alp Dener, Matthew Knepley, Scott E. Kruger, Hannah Morgan, Todd Munson, Karl Rupp, Barry F. Smith, Stefano Zampini, Hong Zhang, Junchao Zhang

    Abstract: The Portable Extensible Toolkit for Scientific Computation (PETSc) library delivers scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization. The PETSc design for performance portability addresses fundamental GPU accelerator challenges and stresses flexibility and extensibility by separating the programming model used by the application from…

    Submitted 29 September, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: 15 pages, 10 figures, 2 tables

    Report number: ANL/MCS-P9401-1020 MSC Class: 65F10; 65F50; 68N99; 68W10 ACM Class: G.4

  3. arXiv:1607.04245  [pdf, other]

    cs.MS

    Finite Element Integration with Quadrature on the GPU

    Authors: Matthew G. Knepley, Karl Rupp, Andy R. Terrel

    Abstract: We present a novel, quadrature-based finite element integration method for low-order elements on GPUs, using a pattern we call thread transposition to avoid reductions while vectorizing aggressively. On the NVIDIA GTX580, which has a nominal single precision peak flop rate of 1.5 TF/s and a memory bandwidth of 192 GB/s, we achieve close to 300 GF/s for element integration on first-order d…

    Submitted 14 July, 2016; originally announced July 2016.

    Comments: 14 pages, 6 figures

    ACM Class: G.4; G.1.8

  4. arXiv:1604.07163  [pdf, other]

    cs.MS

    Extreme-scale Multigrid Components within PETSc

    Authors: Dave A. May, Patrick Sanan, Karl Rupp, Matthew G. Knepley, Barry F. Smith

    Abstract: Elliptic partial differential equations (PDEs) frequently arise in continuum descriptions of physical processes relevant to science and engineering. Multilevel preconditioners represent a family of scalable techniques for solving discrete PDEs of this type and thus are the method of choice for high-resolution simulations. The scalability and time-to-solution of massively parallel multilevel precon…

    Submitted 25 April, 2016; originally announced April 2016.

  5. arXiv:1510.01122  [pdf, other]

    cs.OH cs.MS

    On The Evolution Of User Support Topics in Computational Science and Engineering Software

    Authors: K. Rupp, S. Balay, J. Brown, M. Knepley, L. C. McInnes, B. Smith

    Abstract: We investigate ten years of user support emails in the large-scale solver library PETSc in order to identify changes in user requests. For this purpose we assign each email thread to one or several categories describing the type of support request. We find that despite several changes in hardware architecture as well as programming models, the relative share of emails for the individual categories do…

    Submitted 5 October, 2015; originally announced October 2015.

    Comments: 2 pages, 1 figure, whitepaper for the workshop "Computational Science & Engineering Software Sustainability and Productivity Challenges"

    MSC Class: 68N01 ACM Class: D.2.7

  6. arXiv:1410.4054  [pdf, ps, other]

    cs.MS cs.DC cs.PF

    Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units

    Authors: Karl Rupp, Josef Weinbub, Ansgar Jüngel, Tibor Grasser

    Abstract: We revisit the implementation of iterative solvers on discrete graphics processing units and demonstrate the benefit of implementations using extensive kernel fusion for pipelined formulations over conventional implementations of classical formulations. The proposed implementations with both CUDA and OpenCL are freely available in ViennaCL and are shown to be competitive with or even superior to o…

    Submitted 4 November, 2016; v1 submitted 15 October, 2014; originally announced October 2014.

    Comments: 27 pages, 9 figures, 3 tables

    MSC Class: 65F10 (Secondary); 65F50; 65Y05 (Primary); 65Y10 ACM Class: G.1.3

    Journal ref: ACM Transactions on Mathematical Software (TOMS), Volume 43, Issue 2, Article No. 11 (2016)

  7. arXiv:1409.0669  [pdf, other]

    cs.MS cs.DC cs.PF

    Performance Portability Study of Linear Algebra Kernels in OpenCL

    Authors: Karl Rupp, Philippe Tillet, Florian Rudolf, Josef Weinbub, Tibor Grasser, Ansgar Jüngel

    Abstract: The performance portability of OpenCL kernel implementations for common memory bandwidth limited linear algebra operations across different hardware generations of the same vendor as well as across vendors is studied. Certain combinations of kernel implementations and work sizes are found to exhibit good performance across compute kernels, hardware generations, and, to a lesser degree, vendors. As…

    Submitted 2 September, 2014; originally announced September 2014.

    Comments: 11 pages, 8 figures, 2 tables, International Workshop on OpenCL 2014

    Journal ref: Proceedings of the International Workshop on OpenCL 2013 & 2014 (IWOCL)

  8. arXiv:1309.1204  [pdf, other]

    cs.MS cs.CE

    Achieving High Performance with Unified Residual Evaluation

    Authors: Matthew G. Knepley, Jed Brown, Karl Rupp, Barry F. Smith

    Abstract: We examine residual evaluation, perhaps the most basic operation in numerical simulation. By raising the level of abstraction in this operation, we can eliminate specialized code, enable optimization, and greatly increase the extensibility of existing code.

    Submitted 6 September, 2013; v1 submitted 4 September, 2013; originally announced September 2013.

    Comments: 4 pages, 1 figure

  9. arXiv:1212.6326  [pdf, other]

    cs.MS cs.DC physics.comp-ph

    Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries

    Authors: Denis Demidov, Karsten Ahnert, Karl Rupp, Peter Gottschling

    Abstract: We present a comparison of several modern C++ libraries providing high-level interfaces for programming multi- and many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations and is based on odeint, a framework for the solution of systems of ordinary differential equations. Odeint is designed in a very flexible way and may be easily a…

    Submitted 26 April, 2013; v1 submitted 27 December, 2012; originally announced December 2012.

    Comments: 21 pages, 4 figures, submitted to SIAM Journal of Scientific Computing and accepted