Search | arXiv e-print repository

doi 10.1145/3659994.3660315

Urgent Edge Computing

Authors: Patrizio Dazzi, Luca Ferrucci, Marco Danelutto, Konstantinos Tserpes, Antonis Makris, Theodoros Theodoropoulos, Jacopo Massa, Emanuele Carlini, Matteo Mordacchini

Abstract: This position paper introduces Urgent Edge Computing (UEC) as a paradigm shift addressing the evolving demands of time-sensitive applications in distributed edge environments, in time-critical scenarios. With a focus on ultra-low latency, availability, resource management, decentralization, self-organization, and robust security, UEC aims to facilitate operations in critical scenarios such as disaster response, environmental monitoring, and smart city management. This paper outlines and discusses the key requirements, challenges, and enablers along with a conceptual architecture. The paper also outlines the potential applications of Urgent Edge Computing.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2311.11015 [pdf, other]

Power Aware Scheduling of Tasks on FPGAs in Data Centers

Authors: Rourab Paul, Marco Danelutto

Abstract: A variety of computing platforms in data centers, such as Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs) and multicore Central Processing Units (CPUs), are suitable for accelerating data-intensive workloads. In particular, FPGA platforms in data centers are gaining popularity for high-performance computations due to their high speed, reconfigurable nature and cost effectiveness. Such heterogeneous, highly parallel computational architectures in data centers, combined with high-speed communication technologies like 5G, are becoming increasingly suitable for real-time applications. However, flexibility, cost-effectiveness, high computational capabilities, and energy efficiency remain challenging issues in FPGA-based data centers. In this context, an energy-efficient scheduling solution is required to maximize the resource profitability of FPGAs. This paper introduces a power-aware scheduling methodology aimed at accommodating periodic hardware tasks within the available FPGAs of a data center at their potentially maximum speed. The proposed methodology guarantees the execution of these tasks using the maximum number of parallel computation units that can be implemented in the FPGAs, with minimum power consumption. The proposed scheduling methodology is implemented in a data center with multiple Alveo-50 Xilinx-AMD FPGAs and the Vitis 2023 tool. The evidence from the implementation shows that the proposed scheduling methodology is efficient compared to existing solutions.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:1609.05002 [pdf, ps, other]

State access patterns in embarrassingly parallel computations

Authors: Marco Danelutto, Massimo Torquati, Peter Kilpatrick

Abstract: We introduce a set of state access patterns suitable for managing state in embarrassingly parallel computations on streams. The state access patterns are useful to model typical stream parallel applications. We present a classification of the patterns according to the extent and way in which the state is modified. We define precisely the state access patterns and discuss possible implementation schemas, performance, and possibilities to manage adaptivity (parallelism degree) in the patterns. We present experimental results from an implementation on top of the structured parallel programming framework FastFlow that demonstrate the feasibility and efficiency of the proposed access patterns.

Submitted 16 September, 2016; originally announced September 2016.

Comments: 8 pages, accepted and presented at HLPGPU 2016 (Prague, Czech Republic, Tuesday, Jan 19th 2016. Co-Located with HiPEAC 2016)

arXiv:1609.04567 [pdf, ps, other]

doi 10.1007/s11227-016-1871-z

A parallel pattern for iterative stencil + reduce

Authors: M. Aldinucci, M. Danelutto, M. Drocco, P. Kilpatrick, C. Misale, G. Peretti Pezzi, M. Torquati

Abstract: We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-parallel programs on heterogeneous multi-core platforms. Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, stencil-reduce, and, crucially, their usage in a loop in both data-parallel and streaming applications, or a combination of both. The pattern makes it possible to deploy a single stencil computation kernel on different GPUs. We discuss the implementation of Loop-of-stencil-reduce in FastFlow, a framework for the implementation of applications based on parallel patterns. Experiments are presented to illustrate the use of Loop-of-stencil-reduce in developing data-parallel kernels running on heterogeneous systems.

Submitted 15 September, 2016; originally announced September 2016.

arXiv:1204.5402 [pdf, other]

FastFlow tutorial

Authors: Marco Aldinucci, Marco Danelutto, Massimo Torquati

Abstract: FastFlow is a structured parallel programming framework targeting shared memory multicores. Its layered design and the optimized implementation of the communication mechanisms used to implement the FastFlow streaming networks provided to the application programmer as algorithmic skeletons support the development of efficient fine grain parallel applications. FastFlow is available (open source) at SourceForge (http://sourceforge.net/projects/mc-fastflow/). This work introduces FastFlow programming techniques and points out the different ways used to parallelize existing C/C++ code using FastFlow as a software accelerator. In short: this is a kind of tutorial on FastFlow.

Submitted 24 April, 2012; originally announced April 2012.

Comments: 49 pages + cover

Report number: TR-12-04

ACM Class: D.1.3; D.3.2; C.1.3

arXiv:1002.4668 [pdf, other]

Accelerating sequential programs using FastFlow and self-offloading

Authors: Marco Aldinucci, Marco Danelutto, Peter Kilpatrick, Massimiliano Meneghin, Massimo Torquati

Abstract: FastFlow is a programming environment specifically targeting cache-coherent shared-memory multi-cores. FastFlow is implemented as a stack of C++ template libraries built on top of lock-free (fence-free) synchronization mechanisms. In this paper we present a further evolution of FastFlow enabling programmers to offload part of their workload on a dynamically created software accelerator running on unused CPUs. The offloaded function can be easily derived from pre-existing sequential code. We emphasize in particular the effective trade-off between human productivity and execution efficiency of the approach.

Submitted 24 February, 2010; originally announced February 2010.

Comments: 17 pages + cover

Report number: TR-10-03

ACM Class: D.1.3; D.3.2; C.1.3

arXiv:0909.1517 [pdf, other]

doi 10.1007/978-1-4419-6794-7_8

Autonomic management of multiple non-functional concerns in behavioural skeletons

Authors: Marco Aldinucci, Marco Danelutto, Peter Kilpatrick

Abstract: We introduce and address the problem of concurrent autonomic management of different non-functional concerns in parallel applications built as a hierarchical composition of behavioural skeletons. We first define the problems arising when multiple concerns are dealt with by independent managers, then we propose a methodology supporting coordinated management, and finally we discuss how autonomic management of multiple concerns may be implemented in a typical use case. The paper concludes with an outline of the challenges involved in realizing the proposed methodology on distributed target architectures such as clusters and grids. Being based on the behavioural skeleton concept proposed in the CoreGRID GCM, it is anticipated that the methodology will be readily integrated into the current reference implementation of GCM based on Java ProActive and running on top of major grid middleware systems.

Submitted 8 September, 2009; originally announced September 2009.

Comments: 20 pages + cover page

Report number: TR-09-10

ACM Class: D.1.3; F.1.1; D.2.2; D.2.3

Showing 1–7 of 7 results for author: Danelutto, M