-
Prototyping a ROOT-based distributed analysis workflow for HL-LHC: the CMS use case
Authors:
Tommaso Tedeschi,
Vincenzo Eduardo Padulano,
Daniele Spiga,
Diego Ciangottini,
Mirco Tracolli,
Enric Tejedor Saavedra,
Enrico Guiraud,
Massimo Biasotto
Abstract:
The challenges expected for the next era of the Large Hadron Collider (LHC), both in terms of storage and computing resources, provide LHC experiments with a strong motivation for evaluating ways of rethinking their computing models at many levels. Great efforts have been put into optimizing the computing resource utilization for the data analysis, which leads both to lower hardware requirements and faster turnaround for physics analyses. In this scenario, the Compact Muon Solenoid (CMS) collaboration is involved in several activities aimed at benchmarking different solutions for running High Energy Physics (HEP) analysis workflows. A promising solution is evolving software towards more user-friendly approaches featuring a declarative programming model and interactive workflows. The computing infrastructure should keep up with this trend by offering on the one side modern interfaces, and on the other side hiding the complexity of the underlying environment, while efficiently leveraging the already deployed grid infrastructure and scaling toward opportunistic resources like public cloud or HPC centers. This article presents the first example of using the ROOT RDataFrame technology to exploit such next-generation approaches for a production-grade CMS physics analysis. A new analysis facility is created to offer users a modern interactive web interface based on JupyterLab that can leverage HTCondor-based grid resources on different geographical sites. The physics analysis is converted from a legacy iterative approach to the modern declarative approach offered by RDataFrame and distributed over multiple computing nodes. The new scenario offers not only an overall improved programming experience, but also an order of magnitude speedup increase with respect to the previous approach.
Submitted 24 July, 2023;
originally announced July 2023.
-
A Serverless Engine for High Energy Physics Distributed Analysis
Authors:
Jacek Kuśnierz,
Vincenzo Eduardo Padulano,
Maciej Malawski,
Kamil Burkiewicz,
Enric Tejedor Saavedra,
Pedro Alonso-Jordá,
Michael Pitt,
Valentina Avati
Abstract:
The Large Hadron Collider (LHC) at CERN has generated in the last decade an unprecedented volume of data for the High-Energy Physics (HEP) field. Scientific collaborations interested in analysing such data very often require computing power beyond a single machine. This issue has been tackled traditionally by running analyses in distributed environments using stateful, managed batch computing systems. While this approach has been effective so far, current estimates for future computing needs of the field present large scaling challenges. Such a managed approach may not be the only viable way to tackle them and an interesting alternative could be provided by serverless architectures, to enable an even larger scaling potential.
This work describes a novel approach to running real HEP scientific applications through a distributed serverless computing engine. The engine is built upon ROOT, a well-established HEP data analysis software, and distributes its computations to a large pool of concurrent executions on Amazon Web Services Lambda Serverless Platform. Thanks to the developed tool, physicists are able to access datasets stored at CERN (also those that are under restricted access policies) and process it on remote infrastructures outside of their typical environment. The analysis of the serverless functions is monitored at runtime to gather performance metrics, both for data- and computation-intensive workloads.
Submitted 2 June, 2022;
originally announced June 2022.
-
HL-LHC Analysis With ROOT
Authors:
Axel Naumann,
Philippe Canal,
Enric Tejedor,
Enrico Guiraud,
Lorenzo Moneta,
Bertrand Bellenot,
Olivier Couet,
Alja Mrak Tadel,
Matevz Tadel,
Sergey Linev,
Javier Lopez Gomez,
Jonas Rembser,
Vincenzo Eduardo Padulano,
Jakob Blomer,
Jonas Hahnfeld,
Bernhard Manfred Gruber,
Vassil Vassilev
Abstract:
ROOT is high energy physics' software for storing and mining data in a statistically sound way, to publish results with scientific graphics. It is evolving since 25 years, now providing the storage format for more than one exabyte of data; virtually all high energy physics experiments use ROOT. With another significant increase in the amount of data to be handled scheduled to arrive in 2027, ROOT is preparing for a massive upgrade of its core ingredients. As part of a review of crucial software for high energy physics, the ROOT team has documented its R&D plans for the coming years.
Submitted 12 May, 2022;
originally announced May 2022.
-
ROOT for the HL-LHC: data format
Authors:
Axel Naumann,
Philippe Canal,
Enric Tejedor,
Enrico Guiraud,
Lorenzo Moneta,
Bertrand Bellenot,
Olivier Couet,
Alja Mrak Tadel,
Matevz Tadel,
Sergey Linev,
Javier Lopez Gomez,
Jonas Rembser,
Vincenzo Eduardo Padulano,
Jakob Blomer,
Jonas Hahnfeld,
Bernhard Manfred Gruber,
Vassil Vassilev
Abstract:
This document discusses the state, roadmap, and risks of the foundational components of ROOT with respect to the experiments at the HL-LHC (Run 4 and beyond). As foundational components, the document considers in particular the ROOT input/output (I/O) subsystem. The current HEP I/O is based on the TFile container file format and the TTree binary event data format. The work going into the new RNTuple event data format aims at superseding TTree, to make RNTuple the production ROOT event data I/O that meets the requirements of Run 4 and beyond.
Submitted 9 April, 2022;
originally announced April 2022.
-
Software Challenges For HL-LHC Data Analysis
Authors:
ROOT Team,
Kim Albertsson Brann,
Guilherme Amadio,
Sitong An,
Bertrand Bellenot,
Jakob Blomer,
Philippe Canal,
Olivier Couet,
Massimiliano Galli,
Enrico Guiraud,
Stephan Hageboeck,
Sergey Linev,
Pere Mato Vila,
Lorenzo Moneta,
Axel Naumann,
Alja Mrak Tadel,
Vincenzo Eduardo Padulano,
Fons Rademakers,
Oksana Shadura,
Matevz Tadel,
Enric Tejedor Saavedra,
Xavier Valls Pla,
Vassil Vassilev,
Stefan Wunsch
Abstract:
The high energy physics community is discussing where investment is needed to prepare software for the HL-LHC and its unprecedented challenges. The ROOT project is one of the central software players in high energy physics since decades. From its experience and expectations, the ROOT team has distilled a comprehensive set of areas that should see research and development in the context of data analysis software, for making best use of HL-LHC's physics potential. This work shows what these areas could be, why the ROOT team believes investing in them is needed, which gains are expected, and where related work is ongoing. It can serve as an indication for future research proposals and cooperations.
Submitted 4 May, 2020; v1 submitted 16 April, 2020;
originally announced April 2020.