BaCO: A Fast and Portable Bayesian Compiler Optimization Framework
Authors:
Erik Hellsten,
Artur Souza,
Johannes Lenfers,
Rubens Lacouture,
Olivia Hsu,
Adel Ejjeh,
Fredrik Kjolstad,
Michel Steuwer,
Kunle Olukotun,
Luigi Nardi
Abstract:
We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. Particularly, it deals with permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these…
▽ More
We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. Particularly, it deals with permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these parameter types and efficiently deliver high-quality code, BaCO uses Bayesian optimiza tion algorithms specialized towards the autotuning domain. We demonstrate BaCO's effectiveness on three modern compiler systems: TACO, RISE & ELEVATE, and HPVM2FPGA for CPUs, GPUs, and FPGAs respectively. For these domains, BaCO outperforms current state-of-the-art autotuners by delivering on average 1.36x-1.56x faster code with a tiny search budget, and BaCO is able to reach expert-level performance 2.9x-3.9x faster.
△ Less
Submitted 11 April, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
A Language for Describing Optimization Strategies
Authors:
Bastian Hagedorn,
Johannes Lenfers,
Thomas Koehler,
Sergei Gorlatch,
Michel Steuwer
Abstract:
Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages - like C or OpenCL - force the programmer to intertwine the code describing functionality and optimizations. This results in a nightmare for portability which is particularly problematic given the accelerating trend towards specialized hardware d…
▽ More
Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages - like C or OpenCL - force the programmer to intertwine the code describing functionality and optimizations. This results in a nightmare for portability which is particularly problematic given the accelerating trend towards specialized hardware devices to further increase efficiency.
Many emerging DSLs used in performance demanding domains such as deep learning, automatic differentiation, or image processing attempt to simplify or even fully automate the optimization process. Using a high-level - often functional - language, programmers focus on describing functionality in a declarative way. In some systems such as Halide or TVM, a separate schedule specifies how the program should be optimized. Unfortunately, these schedules are not written in well-defined programming languages. Instead, they are implemented as a set of ad-hoc predefined APIs that the compiler writers have exposed.
In this paper, we present Elevate: a functional language for describing optimization strategies. Elevate follows a tradition of prior systems used in different contexts that express optimization strategies as composition of rewrites. In contrast to systems with scheduling APIs, in Elevate programmers are not restricted to a set of built-in optimizations but define their own optimization strategies freely in a composable way. We show how user-defined optimization strategies in Elevate enable the effective optimization of programs expressed in a functional data-parallel language demonstrating competitive performance with Halide and TVM.
△ Less
Submitted 6 February, 2020;
originally announced February 2020.