Showing 1–14 of 14 results for author: Steuwer, M

  1. arXiv:2404.02218  [pdf, other]

    cs.DC cs.MS

    A shared compilation stack for distributed-memory parallelism in stencil DSLs

    Authors: George Bisbas, Anton Lydike, Emilien Bauer, Nick Brown, Mathieu Fehr, Lawrence Mitchell, Gabriel Rodriguez-Canal, Maurice Jamieson, Paul H. J. Kelly, Michel Steuwer, Tobias Grosser

    Abstract: Domain Specific Languages (DSLs) increase programmer productivity and provide high performance. Their targeted abstractions allow scientists to express problems at a high level, providing rich details that optimizing compilers can exploit to target current- and next-generation supercomputers. The convenience and performance of DSLs come with significant development and maintenance costs. The siloe…

    Submitted 2 April, 2024; originally announced April 2024.

  2. arXiv:2311.07422  [pdf, other]

    cs.PL

    Sidekick compilation with xDSL

    Authors: Mathieu Fehr, Michel Weber, Christian Ulmann, Alexandre Lopoukhine, Martin Lücke, Théo Degioanni, Michel Steuwer, Tobias Grosser

    Abstract: Traditionally, compiler researchers either conduct experiments within an existing production compiler or develop their own prototype compiler; both options come with trade-offs. On one hand, prototyping in a production compiler can be cumbersome, as they are often optimized for program compilation speed at the expense of software simplicity and development speed. On the other hand, the transition…

    Submitted 16 June, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 14 pages, 15 figures; updated twice to include acknowledgements

  3. arXiv:2305.03448  [pdf, other]

    cs.PL

    Descend: A Safe GPU Systems Programming Language

    Authors: Bastian Köpcke, Sergei Gorlatch, Michel Steuwer

    Abstract: Graphics Processing Units (GPU) offer tremendous computational power by following a throughput oriented computing paradigm where many thousand computational units operate in parallel. Programming this massively parallel hardware is challenging. Programmers must correctly and efficiently coordinate thousands of threads and their accesses to various shared memory spaces. Existing mainstream GPU prog…

    Submitted 5 May, 2023; originally announced May 2023.

  4. arXiv:2304.14154  [pdf, other]

    cs.PL

    Traced Types for Safe Strategic Rewriting

    Authors: Rongxiao Fu, Ornela Dardha, Michel Steuwer

    Abstract: Strategy languages enable programmers to compose rewrite rules into strategies and control their application. This is useful in programming languages, e.g., for describing program transformations compositionally, but also in automated theorem proving, where related ideas have been studied with tactics languages. Clearly, not all compositions of rewrites are correct, but how can we assist programme…

    Submitted 27 April, 2023; originally announced April 2023.

  5. Structural Subtyping as Parametric Polymorphism

    Authors: Wenhao Tang, Daniel Hillerström, James McKinna, Michel Steuwer, Ornela Dardha, Rongxiao Fu, Sam Lindley

    Abstract: Structural subtyping and parametric polymorphism provide similar flexibility and reusability to programmers. For example, both features enable the programmer to provide a wider record as an argument to a function that expects a narrower one. However, the means by which they do so differs substantially, and the precise details of the relationship between them exist, at best, as folklore in literat…

    Submitted 11 September, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: 47 pages, accepted by OOPSLA 2023

  6. arXiv:2212.11142  [pdf, other]

    cs.PL cs.LG cs.PF

    BaCO: A Fast and Portable Bayesian Compiler Optimization Framework

    Authors: Erik Hellsten, Artur Souza, Johannes Lenfers, Rubens Lacouture, Olivia Hsu, Adel Ejjeh, Fredrik Kjolstad, Michel Steuwer, Kunle Olukotun, Luigi Nardi

    Abstract: We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. Particularly, it deals with permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these…

    Submitted 11 April, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

  7. Primrose: Selecting Container Data Types by Their Properties

    Authors: Xueying Qin, Liam O'Connor, Michel Steuwer

    Abstract: Context: Container data types are ubiquitous in computer programming, enabling developers to efficiently store and process collections of data with an easy-to-use programming interface. Many programming languages offer a variety of container implementations in their standard libraries based on data structures offering different capabilities and performance characteristics. Inquiry: Choosing the…

    Submitted 20 February, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

    Journal ref: The Art, Science, and Engineering of Programming, 2023, Vol. 7, Issue 3, Article 11

  8. arXiv:2201.03611  [pdf, other]

    cs.PL

    RISE & Shine: Language-Oriented Compiler Design

    Authors: Michel Steuwer, Thomas Koehler, Bastian Köpcke, Federico Pizzuti

    Abstract: The trend towards specialization of software and hardware - fuelled by the end of Moore's law and the still accelerating interest in domain-specific computing, such as machine learning - forces us to radically rethink our compiler designs. The era of a universal compiler framework built around a single one-size-fits-all intermediate representation (IR) is over. This realization has sparked the cre…

    Submitted 10 January, 2022; originally announced January 2022.

  9. arXiv:2111.13040  [pdf, other]

    cs.PL cs.PF

    Sketch-Guided Equality Saturation: Scaling Equality Saturation to Complex Optimizations of Functional Programs

    Authors: Thomas Koehler, Phil Trinder, Michel Steuwer

    Abstract: Generating high-performance code for diverse hardware and application domains is challenging. Functional array programming languages with patterns like map and reduce have been successfully combined with term rewriting to define and explore optimization spaces. However, deciding what sequence of rewrites to apply is hard and has a huge impact on the performance of the rewritten program. Equality s…

    Submitted 3 June, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: 23 pages excluding references, submitted to OOPSLA 2022

  10. arXiv:2103.13390  [pdf, ps, other]

    cs.PL

    Row-Polymorphic Types for Strategic Rewriting

    Authors: Rongxiao Fu, Xueying Qin, Ornela Dardha, Michel Steuwer

    Abstract: We present a type system for strategy languages that express program transformations as compositions of rewrite rules. Our row-polymorphic type system assists compiler engineers to write correct strategies by statically rejecting non meaningful compositions of rewrites that otherwise would fail during rewriting at runtime. Furthermore, our type system enables reasoning about how rewriting transfor…

    Submitted 23 March, 2021; originally announced March 2021.

  11. arXiv:2002.02268  [pdf, other]

    cs.PL cs.PF

    A Language for Describing Optimization Strategies

    Authors: Bastian Hagedorn, Johannes Lenfers, Thomas Koehler, Sergei Gorlatch, Michel Steuwer

    Abstract: Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages - like C or OpenCL - force the programmer to intertwine the code describing functionality and optimizations. This results in a nightmare for portability which is particularly problematic given the accelerating trend towards specialized hardware d…

    Submitted 6 February, 2020; originally announced February 2020.

    Comments: https://elevate-lang.org/ https://github.com/elevate-lang

  12. arXiv:1710.08332  [pdf, other]

    cs.DC cs.PL

    Strategy Preserving Compilation for Parallel Functional Code

    Authors: Robert Atkey, Michel Steuwer, Sam Lindley, Christophe Dubach

    Abstract: Graphics Processing Units (GPUs) and other parallel devices are widely available and have the potential for accelerating a wide class of algorithms. However, expert programming skills are required to achieve maximum performance. These devices expose low-level hardware details through imperative programming interfaces where programmers explicitly encode device-specific optimisation strategies. This…

    Submitted 23 October, 2017; originally announced October 2017.

  13. arXiv:1511.02490  [pdf, other]

    cs.DC

    Autotuning OpenCL Workgroup Size for Stencil Patterns

    Authors: Chris Cummins, Pavlos Petoumenos, Michel Steuwer, Hugh Leather

    Abstract: Selecting an appropriate workgroup size is critical for the performance of OpenCL kernels, and requires knowledge of the underlying hardware, the data being operated on, and the implementation of the kernel. This makes portable performance of OpenCL programs a challenging goal, since simple heuristics and statically chosen values fail to exploit the available performance. To address this, we propo…

    Submitted 6 January, 2016; v1 submitted 8 November, 2015; originally announced November 2015.

    Comments: 8 pages, 6 figures, presented at the 6th International Workshop on Adaptive Self-tuning Computing Systems (ADAPT '16)

  14. arXiv:1502.02389  [pdf, other]

    cs.DC cs.PF cs.PL

    Patterns and Rewrite Rules for Systematic Code Generation (From High-Level Functional Patterns to High-Performance OpenCL Code)

    Authors: Michel Steuwer, Christian Fensch, Christophe Dubach

    Abstract: Computing systems have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at the cost of increased programming effort. This results in a tension between achieving performance and code portability. Code is either tuned using device-specific optimizations to achieve maximum performa…

    Submitted 9 February, 2015; originally announced February 2015.

    Comments: Technical Report

    ACM Class: D.3.3; D.3.4