Showing 1–2 of 2 results for author: Ferreron, A

Search v0.5.6 released 2020-02-24

arXiv:1803.09584 [pdf, other]

cs.PF cs.AR cs.DC

doi 10.1109/ISPASS.2017.7975275

Crossing the Architectural Barrier: Evaluating Representative Regions of Parallel HPC Applications

Authors: Alexandra Ferreron, Radhika Jagtap, Sascha Bischoff, Roxana Rusitoru

Abstract: Exascale computing will get mankind closer to solving important social, scientific and engineering problems. Due to high prototyping costs, High Performance Computing (HPC) system architects make use of simulation models for design space exploration and hardware-software co-design. However, as HPC systems reach exascale proportions, the cost of simulation increases, since simulators themselves are largely single-threaded. Tools for selecting representative parts of parallel applications to reduce running costs are widespread, e.g., BarrierPoint achieves this by analysing, in simulation, abstract characteristics such as basic blocks and reuse distances. However, architectures new to HPC have a limited set of tools available. In this work, we provide an independent cross-architectural evaluation on real hardware - across Intel and ARM - of the BarrierPoint methodology, when applied to parallel HPC proxy applications. We present both cases: when the methodology can be applied and when it cannot. In the former case, results show that we can predict the performance of full application execution by running shorter representative sections. In the latter case, we dive into the underlying issues and suggest improvements. We demonstrate a total simulation time reduction of up to 178x, whilst keeping the error below 2.3% for both cycles and instructions.

Submitted 20 March, 2018; originally announced March 2018.

Comments: 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

arXiv:1803.06955 [pdf, other]

cs.AR

AISC: Approximate Instruction Set Computer

Authors: Alexandra Ferreron, Jesus Alastruey-Benede, Dario Suarez-Gracia, Ulya R. Karpuzcu

Abstract: This paper makes the case for a single-ISA heterogeneous computing platform, AISC, where each compute engine (be it a core or an accelerator) supports a different subset of the very same ISA. An ISA subset may not be functionally complete, but the union of the (per compute engine) subsets renders a functionally complete, platform-wide single ISA. Tailoring the microarchitecture of each compute engine to the subset of the ISA that it supports can easily reduce hardware complexity. At the same time, the energy efficiency of computing can improve by exploiting algorithmic noise tolerance: by mapping code sequences that can tolerate (any potential inaccuracy induced by) the incomplete ISA-subsets to the corresponding compute engines.

Submitted 19 March, 2018; originally announced March 2018.
