-
Democratizing LHC Data Analysis with ADL/CutLang
Authors:
Sezen Sekmen,
Gokhan Unel,
Harrison B. Prosper,
Aytul Adiguzel,
Burak Sen
Abstract:
Data analysis at the LHC has a very steep learning curve, which erects a formidable barrier between data and anyone who wishes to analyze it, either to study an idea or simply to understand how data analysis is performed. To make analysis more accessible, we designed the so-called Analysis Description Language (ADL), a domain specific language capable of describing the contents of an LHC analysis in a standard and unambiguous way, independent of any computing framework. ADL has an English-like, highly human-readable syntax and directly employs concepts relevant to HEP. It therefore eliminates the need to learn complex analysis frameworks written in general purpose languages such as C++ or Python, and shifts the focus directly to physics. Analyses written in ADL can be run on data using a runtime interpreter called CutLang, without the need for programming. ADL and CutLang are designed for use by anyone with an interest in, and/or knowledge of, LHC physics, ranging from experimentalists and phenomenologists to non-professional enthusiasts. ADL/CutLang were originally designed for research, but are equally intended for education and public use. This approach has already been employed to train undergraduate students with no programming experience in LHC analysis in two dedicated schools in Turkey and Vietnam, and is being adapted for use with LHC Open Data. Moreover, work is in progress towards piloting an educational module in particle physics data analysis for high school students and teachers. Here, we introduce ADL and CutLang and present the educational activities based on these practical tools.
Submitted 24 March, 2022;
originally announced March 2022.
-
Analysis Description Language: A DSL for HEP Analysis
Authors:
Harrison B. Prosper,
Sezen Sekmen,
Gokhan Unel
Abstract:
We propose to adopt a declarative domain specific language for describing the physics algorithm of a high energy physics (HEP) analysis in a standard and unambiguous way decoupled from analysis software frameworks, and argue that this approach provides an accessible and sustainable environment for analysis design, use and preservation. A prototype of such a language, called Analysis Description Language (ADL), and its associated tools are being developed and applied in various HEP physics studies. We present the motivations for using a DSL, the design principles of ADL and its runtime interpreter CutLang, along with current physics studies based on this approach. We also outline ideas and prospects for the future. Recent physics studies, hands-on workshops and surveys indicate that ADL is a feasible and effective approach with many advantages and benefits, and offers a direction to which the HEP field should give serious consideration.
Submitted 18 March, 2022;
originally announced March 2022.
-
Recent advances in ADL, CutLang and adl2tnm
Authors:
Harrison B. Prosper,
Sezen Sekmen,
Gokhan Unel,
Arpon Paul
Abstract:
This paper presents an overview and features of an Analysis Description Language (ADL) designed for HEP data analysis. ADL is a domain specific, declarative language that describes the physics content of an analysis in a standard and unambiguous way, independent of any computing framework. The paper also describes the infrastructures that render ADL executable, namely CutLang, a direct runtime interpreter (originally also a language), and adl2tnm, a transpiler converting ADL into C++ code. In ADL, analyses are described in human readable plain text files, clearly separating object, variable and event selection definitions in blocks, with a syntax that includes mathematical and logical operations, comparison and optimisation operators, reducers, four-vector algebra and commonly used functions. Recent studies demonstrate that adopting the ADL approach has numerous benefits for the experimental and phenomenological HEP communities. These include facilitating the abstraction, design, optimization, visualization, validation, combination, reproduction, interpretation and overall communication of the analysis contents and long term preservation of the analyses beyond the lifetimes of experiments. Here we also discuss some of the current ADL applications in physics studies and future prospects based on static analysis and differentiable programming.
Submitted 28 July, 2021;
originally announced August 2021.
-
The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
Authors:
T. Aarrestad,
M. van Beekveld,
M. Bona,
A. Boveia,
S. Caron,
J. Davies,
A. De Simone,
C. Doglioni,
J. M. Duarte,
A. Farbin,
H. Gupta,
L. Hendriks,
L. Heinrich,
J. Howarth,
P. Jawahar,
A. Jueid,
J. Lastow,
A. Leinweber,
J. Mamuzic,
E. Merényi,
A. Morandini,
P. Moskvitina,
C. Nellist,
J. Ngadiuba,
B. Ostdiek
, et al. (14 additional authors not shown)
Abstract:
We describe the outcome of a data challenge conducted as part of the Dark Machines Initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims at detecting signals of new physics at the LHC using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of more than 1 billion simulated LHC events corresponding to $10~\rm{fb}^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Submitted 9 December, 2021; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Analysis Description Languages for the LHC
Authors:
Sezen Sekmen,
Philippe Gras,
Lindsey Gray,
Benjamin Krikler,
Jim Pivarski,
Harrison B. Prosper,
Andrea Rizzi,
Gokhan Unel,
Gordon Watts
Abstract:
An analysis description language is a domain specific language capable of describing the contents of an LHC analysis in a standard and unambiguous way, independent of any computing framework. It is designed for use by anyone with an interest in, and knowledge of, LHC physics, i.e., experimentalists, phenomenologists and other enthusiasts. Adopting analysis description languages would bring numerous benefits for the LHC experimental and phenomenological communities, ranging from analysis preservation beyond the lifetimes of experiments or analysis software to facilitating the abstraction, design, visualization, validation, combination, reproduction, interpretation and overall communication of the analysis contents. Here, we introduce the analysis description language concept and summarize the ongoing efforts to develop such languages, and the tools needed to use them, in LHC analyses.
Submitted 3 November, 2020;
originally announced November 2020.
-
CutLang as an Analysis Description Language for Introducing Students to Analyses in Particle Physics
Authors:
Aytul Adiguzel,
Orhan Cakir,
Umit Kaya,
V. Erkcan Ozcan,
Sertac Ozturk,
Sezen Sekmen,
Ilkay Turk Cakir,
N. Gokhan Unel
Abstract:
The fifth edition of the "Computing Applications in Particle Physics" school was held on 3-7 February 2020, at Istanbul University, Turkey. This particular edition focused on the processing of simulated data from Large Hadron Collider collisions using an Analysis Description Language and its runtime interpreter, CutLang. 24 undergraduate and 6 graduate students were introduced to collider data analysis during the school. After 3 days of lectures and exercises, the students were grouped into teams of 3 or 4 and each team was assigned an analysis publication from the ATLAS or CMS experiments. After 1.5 days of independent study, each team was able to reproduce the assigned analysis using CutLang.
Submitted 26 March, 2021; v1 submitted 27 August, 2020;
originally announced August 2020.
-
CutLang: a cut-based HEP analysis description language and runtime interpreter
Authors:
Gokhan Unel,
Sezen Sekmen,
Anna Monica Toon
Abstract:
We present CutLang, an analysis description language and runtime interpreter for high energy collider physics data analyses. An analysis description language is a declarative domain specific language that can express all elements of a data analysis in an easy and unambiguous way. A full-fledged human readable analysis description language, incorporating logical and mathematical expressions, would eliminate many programming difficulties and errors, consequently allowing the scientist to focus on the goal rather than on the tool. In this paper, we discuss the guiding principles and scope of the CutLang language, the implementation of the CutLang runtime interpreter and the CutLang framework, and demonstrate an example of top pair reconstruction.
Submitted 23 September, 2019;
originally announced September 2019.
-
HEP Software Foundation Community White Paper Working Group - Detector Simulation
Authors:
HEP Software Foundation,
J Apostolakis,
M Asai,
S Banerjee,
R Bianchi,
P Canal,
R Cenci,
J Chapman,
G Corti,
G Cosmo,
S Easo,
L de Oliveira,
A Dotti,
V Elvira,
S Farrell,
L Fields,
K Genser,
A Gheata,
M Gheata,
J Harvey,
F Hariri,
R Hatcher,
K Herner,
M Hildreth
, et al. (40 additional authors not shown)
Abstract:
A working group on detector simulation was formed as part of the high-energy physics (HEP) Software Foundation's initiative to prepare a Community White Paper that describes the main software challenges and opportunities to be faced in the HEP field over the next decade. The working group met over a period of several months in order to review the current status of the Full and Fast simulation applications of HEP experiments and the improvements that will need to be made in order to meet the goals of future HEP experimental programmes. The scope of the topics covered includes the main components of a HEP simulation application, such as MC truth handling, geometry modeling, particle propagation in materials and fields, physics modeling of the interactions of particles with matter, the treatment of pileup and other backgrounds, as well as signal processing and digitisation. The resulting work programme described in this document focuses on the need to improve both the software performance and the physics of detector simulation. The goals are to increase the accuracy of the physics models and expand their applicability to future physics programmes, while achieving large factors in computing performance gains consistent with projections on available computing resources.
Submitted 12 March, 2018;
originally announced March 2018.
-
A Roadmap for HEP Software and Computing R&D for the 2020s
Authors:
Johannes Albrecht,
Antonio Augusto Alves Jr,
Guilherme Amadio,
Giuseppe Andronico,
Nguyen Anh-Ky,
Laurent Aphecetche,
John Apostolakis,
Makoto Asai,
Luca Atzori,
Marian Babik,
Giuseppe Bagliesi,
Marilena Bandieramonte,
Sunanda Banerjee,
Martin Barisits,
Lothar A. T. Bauerdick,
Stefano Belforte,
Douglas Benjamin,
Catrin Bernius,
Wahid Bhimji,
Riccardo Maria Bianchi,
Ian Bird,
Catherine Biscarat,
Jakob Blomer,
Kenneth Bloom,
Tommaso Boccali
, et al. (285 additional authors not shown)
Abstract:
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.
Submitted 19 December, 2018; v1 submitted 18 December, 2017;
originally announced December 2017.
-
Recent Developments in CMS Fast Simulation
Authors:
Sezen Sekmen
Abstract:
CMS has developed a fast detector simulation package, which serves as a fast and reliable alternative to the detailed GEANT4-based (full) simulation, and enables efficient simulation of large numbers of standard model and new physics events. Fast simulation becomes particularly important with the current increase in the LHC luminosity. Here, I will discuss the basic principles behind the CMS fast simulation framework, and how they are implemented in the different detector components in order to simulate and reconstruct sufficiently accurate physics objects for analysis. I will focus on recent developments in tracking and geometry interface, which improve the flexibility and emulation performance of the framework, and allow a better synchronization with the full simulation. I will then show how these developments have led to an improved agreement of basic analysis objects and event variables between fast and full simulation.
Submitted 13 January, 2017;
originally announced January 2017.
-
Priors for New Physics
Authors:
Maurizio Pierini,
Harrison B. Prosper,
Sezen Sekmen,
Maria Spiropulu
Abstract:
The interpretation of data in terms of multi-parameter models of new physics, using the Bayesian approach, requires the construction of multi-parameter priors. We propose a construction that uses elements of Bayesian reference analysis. Our idea is to initiate the chain of inference with the reference prior for a likelihood function that depends on a single parameter of interest that is a function of the parameters of the physics model. The reference posterior density of the parameter of interest induces on the parameter space of the physics model a class of posterior densities. We propose to continue the chain of inference with a particular density from this class, namely, the one for which indistinguishable models are equiprobable and use it as the prior for subsequent analysis. We illustrate our method by applying it to the constrained minimal supersymmetric Standard Model and two non-universal variants of it.
Submitted 2 August, 2011;
originally announced August 2011.
-
Model Inference with Reference Priors
Authors:
Maurizio Pierini,
Harrison Prosper,
Sezen Sekmen,
Maria Spiropulu
Abstract:
We describe the application of model inference based on reference priors to two concrete examples in high energy physics: the determination of the CKM matrix parameters $\bar{\rho}$ and $\bar{\eta}$ and the determination of the parameters $m_0$ and $m_{1/2}$ in a simplified version of the CMSSM SUSY model. We show how a 1-dimensional reference posterior can be mapped to the $n$-dimensional ($n$-D) parameter space of the given class of models, under a minimal set of conditions on the $n$-D function. This reference-based function can be used as a prior for the next iteration of inference, using Bayes' theorem recursively.
Submitted 14 July, 2011;
originally announced July 2011.