-
Baler -- Machine Learning Based Compression of Scientific Data
Authors:
Fritjof Bengtsson,
Caterina Doglioni,
Per Alexander Ekman,
Axel Gallén,
Pratik Jawahar,
Alma Orucevic-Alagic,
Marta Camps Santasmasas,
Nicola Skidmore,
Oliver Woolland
Abstract:
Storing and sharing increasingly large datasets is a challenge across scientific research and industry. In this paper, we document the development and applications of Baler - a Machine Learning based data compression tool for use across scientific disciplines and industry. Here, we present Baler's performance for the compression of High Energy Physics (HEP) data, as well as its application to Computational Fluid Dynamics (CFD) toy data as a proof-of-principle. We also present suggestions for cross-disciplinary guidelines to enable feasibility studies for machine learning based compression for scientific data.
Submitted 16 February, 2024; v1 submitted 3 May, 2023;
originally announced May 2023.
-
Applications and Techniques for Fast Machine Learning in Science
Authors:
Allison McCarn Deiana,
Nhan Tran,
Joshua Agar,
Michaela Blott,
Giuseppe Di Guglielmo,
Javier Duarte,
Philip Harris,
Scott Hauck,
Mia Liu,
Mark S. Neubauer,
Jennifer Ngadiuba,
Seda Ogrenci-Memik,
Maurizio Pierini,
Thea Aarrestad,
Steffen Bahr,
Jurgen Becker,
Anne-Sophie Berthold,
Richard J. Bonventre,
Tomas E. Muller Bravo,
Markus Diefenthaler,
Zhen Dong,
Nick Fritzsche,
Amir Gholami,
Ekaterina Govorkova,
Kyle J Hazelwood,
et al. (62 additional authors not shown)
Abstract:
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.
Submitted 25 October, 2021;
originally announced October 2021.
-
The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
Authors:
T. Aarrestad,
M. van Beekveld,
M. Bona,
A. Boveia,
S. Caron,
J. Davies,
A. De Simone,
C. Doglioni,
J. M. Duarte,
A. Farbin,
H. Gupta,
L. Hendriks,
L. Heinrich,
J. Howarth,
P. Jawahar,
A. Jueid,
J. Lastow,
A. Leinweber,
J. Mamuzic,
E. Merényi,
A. Morandini,
P. Moskvitina,
C. Nellist,
J. Ngadiuba,
B. Ostdiek,
et al. (14 additional authors not shown)
Abstract:
We describe the outcome of a data challenge conducted as part of the Dark Machines Initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims at detecting signals of new physics at the LHC using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to $10~\rm{fb}^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Submitted 9 December, 2021; v1 submitted 28 May, 2021;
originally announced May 2021.
-
HL-LHC Computing Review: Common Tools and Community Software
Authors:
HEP Software Foundation,
Thea Aarrestad,
Simone Amoroso,
Markus Julian Atkinson,
Joshua Bendavid,
Tommaso Boccali,
Andrea Bocci,
Andy Buckley,
Matteo Cacciari,
Paolo Calafiura,
Philippe Canal,
Federico Carminati,
Taylor Childers,
Vitaliano Ciulli,
Gloria Corti,
Davide Costanzo,
Justin Gage Dezoort,
Caterina Doglioni,
Javier Mauricio Duarte,
Agnieszka Dziurda,
Peter Elmer,
Markus Elsing,
V. Daniel Elvira,
Giulio Eulisse,
et al. (85 additional authors not shown)
Abstract:
Common and community software packages, such as ROOT, Geant4 and event generators, have been a key part of the LHC's success so far, and continued development and optimisation will be critical in the future. The challenges are driven by an ambitious physics programme, notably the LHC accelerator upgrade to high-luminosity, HL-LHC, and the corresponding detector upgrades of ATLAS and CMS. In this document we address the issues for software that is used in multiple experiments (usually even more widely than ATLAS and CMS) and maintained by teams of developers who are either not linked to a particular experiment or who contribute to common software within the context of their experiment activity. We also give space to general considerations for future software and projects that tackle upcoming challenges, no matter who writes them, which is an area where community convergence on best practice is extremely useful.
Submitted 31 August, 2020;
originally announced August 2020.
-
HEP Community White Paper on Software trigger and event reconstruction: Executive Summary
Authors:
Johannes Albrecht,
Kenneth Bloom,
Tommaso Boccali,
Antonio Boveia,
Michel De Cian,
Caterina Doglioni,
Agnieszka Dziurda,
Amir Farbin,
Conor Fitzpatrick,
Frank Gaede,
Simon George,
Vladimir Gligorov,
Hadrien Grasland,
Lucia Grillo,
Benedikt Hegner,
William Kalderon,
Sami Kama,
Patrick Koppenburg,
Slava Krutelyov,
Rob Kutschke,
Walter Lampl,
David Lange,
Ed Moyse,
Andrew Norman,
Marko Petric,
et al. (17 additional authors not shown)
Abstract:
Realizing the physics programs of the planned and upgraded high-energy physics (HEP) experiments over the next 10 years will require the HEP community to address a number of challenges in the area of software and computing. For this reason, the HEP software community has engaged in a planning process over the past two years, with the objective of identifying and prioritizing the research and development required to enable the next generation of HEP detectors to fulfill their full physics potential. The aim is to produce a Community White Paper which will describe the community strategy and a roadmap for software and computing research and development in HEP for the 2020s. The topics of event reconstruction and software triggers were considered by a joint working group and are summarized together in this document.
Submitted 23 February, 2018;
originally announced February 2018.
-
HEP Community White Paper on Software trigger and event reconstruction
Authors:
Johannes Albrecht,
Kenneth Bloom,
Tommaso Boccali,
Antonio Boveia,
Michel De Cian,
Caterina Doglioni,
Agnieszka Dziurda,
Amir Farbin,
Conor Fitzpatrick,
Frank Gaede,
Simon George,
Vladimir Gligorov,
Hadrien Grasland,
Lucia Grillo,
Benedikt Hegner,
William Kalderon,
Sami Kama,
Patrick Koppenburg,
Slava Krutelyov,
Rob Kutschke,
Walter Lampl,
David Lange,
Ed Moyse,
Andrew Norman,
Marko Petric,
et al. (17 additional authors not shown)
Abstract:
Realizing the physics programs of the planned and upgraded high-energy physics (HEP) experiments over the next 10 years will require the HEP community to address a number of challenges in the area of software and computing. For this reason, the HEP software community has engaged in a planning process over the past two years, with the objective of identifying and prioritizing the research and development required to enable the next generation of HEP detectors to fulfill their full physics potential. The aim is to produce a Community White Paper which will describe the community strategy and a roadmap for software and computing research and development in HEP for the 2020s. The topics of event reconstruction and software triggers were considered by a joint working group and are summarized together in this document.
Submitted 23 February, 2018;
originally announced February 2018.
-
A Roadmap for HEP Software and Computing R&D for the 2020s
Authors:
Johannes Albrecht,
Antonio Augusto Alves Jr,
Guilherme Amadio,
Giuseppe Andronico,
Nguyen Anh-Ky,
Laurent Aphecetche,
John Apostolakis,
Makoto Asai,
Luca Atzori,
Marian Babik,
Giuseppe Bagliesi,
Marilena Bandieramonte,
Sunanda Banerjee,
Martin Barisits,
Lothar A. T. Bauerdick,
Stefano Belforte,
Douglas Benjamin,
Catrin Bernius,
Wahid Bhimji,
Riccardo Maria Bianchi,
Ian Bird,
Catherine Biscarat,
Jakob Blomer,
Kenneth Bloom,
Tommaso Boccali,
et al. (285 additional authors not shown)
Abstract:
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.
Submitted 19 December, 2018; v1 submitted 18 December, 2017;
originally announced December 2017.
-
Lithosphere-asthenosphere system in the Mediterranean region in the framework of polarized plate tectonics
Authors:
Reneta Blagoeva Raykova,
Giuliano Francesco Panza,
Carlo Doglioni
Abstract:
The velocity structure of the lithosphere-asthenosphere system, to a depth of about 350 km, is obtained for almost 400 cells, each sized 1 degree by 1 degree, in the Mediterranean region. The models are obtained by the following sequence of methods and tools: surface-wave dispersion measurements and collection; 2D tomography of dispersion relations; non-linear inversion of cellular dispersion relations; and a smoothing optimization method to select a preferred model for each cell. The 3D velocity model, which satisfies the Occam's razor principle, is obtained as a juxtaposition of selected cellular models. The reconstructed picture of the lithosphere-asthenosphere system shows the globally well-known asymmetry between the W- and E-directed subduction zones, attributed to the westward drift of the lithosphere relative to the mantle. Different relationships between slabs and mantle dynamics cause strong compositional differences in the upper mantle, as shown by large variations in seismic wave velocity, consistent with the Polarized Plate Tectonics model.
Submitted 9 November, 2015;
originally announced November 2015.