Search | arXiv e-print repository

Computational Performance and Energy Efficiency of ARM based HPC servers

Abstract: HPC world is dominated by x86 ISA CPUs. This monoculture is not necessarily justified by best performance evaluation, but may inherit from e.g. SW related restrictions on the choice of HW platforms. To avoid running (further) into path dependency, alternate HW platforms need to be evaluated for performance compared to existing HPC setup. As a result, it may turn out alternate HW platforms are more efficient for HPC. In any case, even if performance differences are low, avoiding path dependencies that stem from HW choice restrictions simplifies switching to different HW platforms in future, should suitable systems evolve. Moreover, broadening the perspective to generic HW platforms may trigger cooperation and wield influence on HW platform development, resulting in HW/SW co-design advantages.

Submitted 12 June, 2024; originally announced July 2024.

Comments: 13 pages

arXiv:1807.03546

Parallel Architecture Hardware and General Purpose Operating System Co-design

Authors: Oskar Schirmer

Abstract: Because most optimisations to achieve higher computational performance eventually are limited, parallelism that scales is required. Parallelised hardware alone is not sufficient, but software that matches the architecture is required to gain best performance. For decades now, hardware design has been guided by the basic design of existing software, to avoid the higher cost to redesign the latter. In doing so, however, quite a variety of superior concepts is excluded a priori. Consequently, co-design of both hardware and software is crucial where highest performance is the goal. For special purpose application, this co-design is common practice. For general purpose application, however, a precondition for usability of a computer system is an operating system which is both comprehensive and dynamic. As no such operating system has ever been designed, a sketch for a comprehensive dynamic operating system is presented, based on a straightforward hardware architecture to demonstrate how design decisions regarding software and hardware do coexist and harmonise.

Submitted 10 July, 2018; originally announced July 2018.

Comments: 66 pages, 30 figures and tables

arXiv:1709.03404

A Domain-specific Language for High-reliability Software used in the JUICE SWI Instrument - The hO Language Manual

Authors: Felix Winkelmann, Oskar Schirmer

Abstract: hO is a custom restricted dialect of Oberon, developed at the Max-Planck Institute for Solar System Research in Göttingen and used in the SWI flight software for the JUICE mission. hO is applied to reduce the possibility of syntactically valid but incorrect code, provide better means of statically analyzing source code, is more readable than C and gives syntactic support for the software architecture used in the SWI instrument software. By using a higher-level, application-specific notation a whole range of possible errors is eliminated and source code size is reduced, while making the code itself easier to understand, review and analyze.

Submitted 11 September, 2017; originally announced September 2017.

Comments: 21 pages

arXiv:1612.06749

GuStL - An Experimental Guarded States Language

Authors: Oskar Schirmer

Abstract: Programming a parallel computing system that consists of several thousands or even up to a million message passing processing units may ask for a language that supports waiting for and sending messages over hardware channels. As programs are looked upon as state machines, the language provides syntax to implement a main event driven loop. The language presented herewith surely will not serve as a generic programming language for any arbitrary task. Its main purpose is to allow for a prototypical implementation of a dynamic software system as a proof of concept.

Submitted 10 July, 2018; v1 submitted 20 December, 2016; originally announced December 2016.

Comments: 12 pages

arXiv:1612.06748

NOP - A Simple Experimental Processor for Parallel Deployment

Authors: Oskar Schirmer

Abstract: The design of a parallel computing system using several thousands or even up to a million processors asks for processing units that are simple and thus small in space, to make as many processing units as possible fit on a single die. The design presented herewith is far from being optimised, it is not meant to compete with industry performance devices. Its main purpose is to allow for a prototypical implementation of a dynamic software system as a proof of concept.

Submitted 20 December, 2016; originally announced December 2016.

Comments: 28 pages, 2 figures

arXiv:1302.6911

Using Virtual Addresses with Communication Channels

Authors: Oskar Schirmer

Abstract: While for single processor and SMP machines, memory is the allocatable quantity, for machines made up of large amounts of parallel computing units, each with its own local memory, the allocatable quantity is a single computing unit. Where virtual address management is used to keep memory coherent and allow allocation of more than physical memory is actually available, virtual communication channel references can be used to make computing units stay connected across allocation and swapping.

Submitted 11 February, 2013; originally announced February 2013.

Comments: 5 pages, 4 figures

Showing 1–6 of 6 results for author: Schirmer, O