-
Harnessing the Computing Continuum across Personalized Healthcare, Maintenance and Inspection, and Farming 4.0
Authors:
Fatemeh Baghdadi,
Davide Cirillo,
Daniele Lezzi,
Francesc Lordan,
Fernando Vazquez,
Eugenio Lomurno,
Alberto Archetti,
Danilo Ardagna,
Matteo Matteucci
Abstract:
The AI-SPRINT project, launched in 2021 and funded by the European Commission, focuses on the development and implementation of AI applications across the computing continuum. This continuum ensures the coherent integration of computational resources and services from centralized data centers to edge devices, facilitating efficient and adaptive computation and application delivery. AI-SPRINT has achieved significant scientific advances, including streamlined processes, improved efficiency, and the ability to operate in real time, as evidenced by three practical use cases. This paper provides an in-depth examination of these applications -- Personalized Healthcare, Maintenance and Inspection, and Farming 4.0 -- highlighting their practical implementation and the objectives achieved with the integration of AI-SPRINT technologies. We analyze how the proposed toolchain effectively addresses a range of challenges and refines processes, discussing its relevance and impact in multiple domains. After a comprehensive overview of the main AI-SPRINT tools used in these scenarios, the paper summarizes the findings and key lessons learned.
Submitted 23 February, 2024;
originally announced March 2024.
-
The BioExcel methodology for developing dynamic, scalable, reliable and portable computational biomolecular workflows
Authors:
Jorge Ejarque,
Pau Andrio,
Adam Hospital,
Javier Conejero,
Daniele Lezzi,
Josep LL. Gelpi,
Rosa M. Badia
Abstract:
Developing complex biomolecular workflows is not always straightforward. It requires tedious development work to enable interoperability between the different biomolecular simulation and analysis tools. Moreover, the need to execute the pipelines on distributed systems increases the complexity of this work. To address these issues, we propose a methodology to simplify the implementation of these workflows on HPC infrastructures. It combines a library, the BioExcel Building Blocks (BioBBs), which allows scientists to implement biomolecular pipelines as Python scripts, and the PyCOMPSs programming framework, which makes it easy to convert Python scripts into task-based parallel workflows executed in distributed computing systems such as HPC clusters, clouds, containerized platforms, etc. Using this methodology, we have implemented a set of computational molecular workflows and we have performed several experiments to validate its portability, scalability, reliability and malleability.
Submitted 30 August, 2022;
originally announced August 2022.
-
Workflow environments for advanced cyberinfrastructure platforms
Authors:
Rosa M Badia,
Jorge Ejarque,
Francesc Lordan,
Daniele Lezzi,
Javier Conejero,
Javier Álvarez Cid-Fuentes,
Yolanda Becerra,
Anna Queralt
Abstract:
Progress in science is deeply bound to the effective use of high-performance computing infrastructures and to the efficient extraction of knowledge from vast amounts of data. Such data comes from different sources that follow a cycle composed of pre-processing steps for data curation and preparation for subsequent computing steps, and later analysis and analytics steps applied to the results. However, scientific workflows are currently fragmented in multiple components, with different processes for computing and data management, and with gaps in the viewpoints of the user profiles involved. Our vision is that future workflow environments and tools for the development of scientific workflows should follow a holistic approach, where both data and computing are integrated in a single flow built on simple, high-level interfaces. The topics of research that we propose involve novel ways to express the workflows that integrate the different data and compute processes, dynamic runtimes to support the execution of the workflows in complex and heterogeneous computing infrastructures in an efficient way, both in terms of performance and energy. These infrastructures include highly distributed resources, from sensors and instruments, and devices in the edge, to High-Performance Computing and Cloud computing resources. This paper presents our vision to develop these workflow environments and also the steps we are currently following to achieve it.
Submitted 12 June, 2020;
originally announced June 2020.
-
Calibration of LOFAR data on the cloud
Authors:
J. Sabater,
S. Sánchez Expósito,
P. N. Best,
J. Garrido,
L. Verdes-Montenegro,
D. Lezzi
Abstract:
New scientific instruments are starting to generate an unprecedented amount of data. LOFAR, one of the Square Kilometre Array pathfinders, is already producing data on a petabyte scale. The calibration of these data presents a huge challenge for final users: a) extensive storage and computing resources are required; b) the installation and maintenance of the processing software is not trivial; and c) the requirements of (experimental) calibration pipelines are quickly evolving. After encountering some limitations in classical infrastructures, we investigated the viability of cloud infrastructures as a solution. We found that the installation and operation of LOFAR data calibration pipelines is not only possible, but can also be efficient in cloud infrastructures. The main advantages were: (1) ease of software installation and maintenance, and the availability of standard APIs and tools, widely used in the industry; this reduces the requirement for significant manual intervention, which can have a highly negative impact; (2) flexibility to adapt the infrastructure to the needs of the problem, especially as those demands change over time; (3) on-demand consumption of (shared) resources. We found no significant impediments associated with the speed of data transfer, the use of external block storage, or the memory available. However, the availability of scratch storage areas of an appropriate size is critical. Finally, we considered the cost-effectiveness of a commercial cloud like Amazon Web Services. While it is more expensive than the operation of a large, fully-utilised cluster completely dedicated to LOFAR data reduction, its costs are competitive if the number of datasets to be analysed is not high, or if the costs of maintaining the dedicated system become high. Coupled with the advantages discussed above, this suggests that a cloud infrastructure may be favourable for many users.
Submitted 17 April, 2017;
originally announced April 2017.