
Showing 1–6 of 6 results for author: Vinter, B

Searching in archive cs.
  1. arXiv:2104.09768

    cs.AR

    SME: A High Productivity FPGA Tool for Software Programmers

    Authors: Carl-Johannes Johnsen, Alberte Thegler, Kenneth Skovhede, Brian Vinter

    Abstract: For several decades, the CPU has been the standard model to use in the majority of computing. While the CPU does excel in some areas, heterogeneous computing, such as reconfigurable hardware, is showing increasing potential in areas like parallelization, performance, and power usage. This is especially prominent in problems favoring deep pipelining or tight latency requirements. However, due to th…

    Submitted 20 April, 2021; originally announced April 2021.

  2. arXiv:1504.06474

    cs.MS cs.DC math.NA

    Speculative Segmented Sum for Sparse Matrix-Vector Multiplication on Heterogeneous Processors

    Authors: Weifeng Liu, Brian Vinter

    Abstract: Sparse matrix-vector multiplication (SpMV) is a central building block for scientific software and graph applications. Recently, heterogeneous processors composed of different types of cores attracted much attention because of their flexible core configuration and high energy efficiency. In this paper, we propose a compressed sparse row (CSR) format based SpMV algorithm utilizing both types of cor…

    Submitted 14 September, 2015; v1 submitted 24 April, 2015; originally announced April 2015.

    Comments: 22 pages, 8 figures, Published at Parallel Computing (PARCO)

    MSC Class: 65F50 ACM Class: G.4; G.1.3

  3. arXiv:1504.05022

    cs.MS cs.DC math.NA

    A Framework for General Sparse Matrix-Matrix Multiplication on GPUs and Heterogeneous Processors

    Authors: Weifeng Liu, Brian Vinter

    Abstract: General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for numerous applications such as algebraic multigrid method (AMG), breadth first search and shortest path problem. Compared to other sparse BLAS routines, an efficient parallel SpGEMM implementation has to handle extra irregularity from three aspects: (1) the number of nonzero entries in the resulting sparse matr…

    Submitted 13 September, 2015; v1 submitted 20 April, 2015; originally announced April 2015.

    Comments: 25 pages, 12 figures, published at Journal of Parallel and Distributed Computing (JPDC)

    MSC Class: 65F50 ACM Class: G.4; G.1.3

  4. arXiv:1503.05032

    cs.MS cs.DC math.NA

    CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication

    Authors: Weifeng Liu, Brian Vinter

    Abstract: Sparse matrix-vector multiplication (SpMV) is a fundamental building block for numerous applications. In this paper, we propose CSR5 (Compressed Sparse Row 5), a new storage format, which offers high-throughput SpMV on various platforms including CPUs, GPUs and Xeon Phi. First, the CSR5 format is insensitive to the sparsity structure of the input matrix. Thus the single format can support an SpMV…

    Submitted 9 April, 2015; v1 submitted 17 March, 2015; originally announced March 2015.

    Comments: 12 pages, 10 figures, In Proceedings of the 29th ACM International Conference on Supercomputing (ICS '15)

    MSC Class: 65F50 ACM Class: G.4; G.1.3

  5. arXiv:1210.7774

    cs.PL cs.DC

    cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications

    Authors: Mads Ruben Burgdorff Kristensen, Simon Andreas Frimann Lund, Troels Blum, Brian Vinter

    Abstract: Modern processor architectures, in addition to having still more cores, also require still more consideration to memory-layout in order to run at full capacity. The usefulness of most languages is deprecating as their abstractions, structures or objects are hard to map onto modern processor architectures efficiently. The work in this paper introduces a new abstract machine framework, cphVB, that…

    Submitted 25 March, 2013; v1 submitted 26 October, 2012; originally announced October 2012.

    Journal ref: Proceedings of The 11th Python In Science Conference (SciPy 2012)

  6. Managing Communication Latency-Hiding at Runtime for Parallel Programming Languages and Libraries

    Authors: Mads Ruben Burgdorff Kristensen, Brian Vinter

    Abstract: This work introduces a runtime model for managing communication with support for latency-hiding. The model enables non-computer science researchers to exploit communication latency-hiding techniques seamlessly. For compiled languages, it is often possible to create efficient schedules for communication, but this is not the case for interpreted languages. By maintaining data dependencies between sc…

    Submitted 18 January, 2012; originally announced January 2012.

    Comments: PREPRINT

    Journal ref: Proceeding HPCC '12 Proceedings of the 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems