-
Knowledge Distillation for Anomaly Detection
Authors:
Adrian Alan Pol,
Ekaterina Govorkova,
Sonja Gronroos,
Nadezda Chernyavskaya,
Philip Harris,
Maurizio Pierini,
Isobel Ojalvo,
Peter Elmer
Abstract:
Unsupervised deep learning techniques are widely used to identify anomalous behaviour. The performance of such methods is a product of the amount of training data and the model size. However, the size is often a limiting factor for the deployment on resource-constrained devices. We present a novel procedure based on knowledge distillation for compressing an unsupervised anomaly detection model into a supervised deployable one and we suggest a set of techniques to improve the detection sensitivity. Compressed models perform comparably to their larger counterparts while significantly reducing the size and memory footprint.
Submitted 9 October, 2023;
originally announced October 2023.
-
Symbolic Regression on FPGAs for Fast Machine Learning Inference
Authors:
Ho Fung Tsoi,
Adrian Alan Pol,
Vladimir Loncar,
Ekaterina Govorkova,
Miles Cranmer,
Sridhara Dasu,
Peter Elmer,
Philip Harris,
Isobel Ojalvo,
Maurizio Pierini
Abstract:
The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to enhance physics sensitivity while still meeting data processing time constraints. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR). It searches the equation space to discover algebraic relations approximating a dataset. We use PySR (a software to uncover these expressions based on an evolutionary algorithm) and extend the functionality of hls4ml (a package for machine learning inference in FPGAs) to support PySR-generated expressions for resource-constrained production environments. Deep learning models often optimize the top metric by pinning the network size because the vast hyperparameter space prevents an extensive search for neural architecture. Conversely, SR selects a set of models on the Pareto front, which allows for optimizing the performance-resource trade-off directly. By embedding symbolic forms, our implementation can dramatically reduce the computational resources needed to perform critical tasks. We validate our method on a physics benchmark: the multiclass classification of jets produced in simulated proton-proton collisions at the CERN Large Hadron Collider. We show that our approach can approximate a 3-layer neural network using an inference model that achieves up to a 13-fold decrease in execution time, down to 5 ns, while still preserving more than 90% approximation accuracy.
Submitted 17 January, 2024; v1 submitted 6 May, 2023;
originally announced May 2023.
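The Pareto-front selection mentioned in the abstract above can be illustrated with a minimal sketch. This is not PySR's or hls4ml's API: the candidate names, error values, and cost values are hypothetical, and "cost" stands in for whatever resource measure (latency, FPGA resources) is being traded against accuracy.

```python
from typing import List, Tuple

# (name, error, cost): hypothetical candidate expressions; lower is better for both.
Candidate = Tuple[str, float, float]


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as good on both
    error and cost, and strictly better on at least one of them.
    """
    front = []
    for c in cands:
        dominated = any(
            o[1] <= c[1] and o[2] <= c[2] and (o[1] < c[1] or o[2] < c[2])
            for o in cands
        )
        if not dominated:
            front.append(c)
    # Sort by cost so the trade-off curve reads from cheapest to most accurate.
    return sorted(front, key=lambda c: c[2])


# Illustrative values only, not results from the paper.
candidates = [
    ("expr_a", 0.30, 1.0),
    ("expr_b", 0.20, 3.0),
    ("expr_c", 0.25, 4.0),  # worse than expr_b on both axes
    ("expr_d", 0.10, 9.0),
]
front = pareto_front(candidates)
```

Picking a point on `front` makes the performance-resource trade-off an explicit choice, rather than a side effect of fixing a network size in advance.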
-
The HEP Software Foundation Community
Authors:
Graeme A Stewart,
Peter Elmer,
Elizabeth Sexton-Kennedy
Abstract:
The HEP Software Foundation was founded in 2014 to tackle common problems of software development and sustainability for high-energy physics. In this paper we outline the motivation for the founding of the organisation and give a brief history of its development. We describe how the organisation functions today and what challenges remain to be faced in the future.
Submitted 17 May, 2022;
originally announced May 2022.
-
Graph Neural Networks for Charged Particle Tracking on FPGAs
Authors:
Abdelrahman Elabd,
Vesal Razavimaleki,
Shi-Yu Huang,
Javier Duarte,
Markus Atkinson,
Gage DeZoort,
Peter Elmer,
Scott Hauck,
Jin-Xuan Hu,
Shih-Chieh Hsu,
Bo-Cheng Lai,
Mark Neubauer,
Isobel Ojalvo,
Savannah Thais,
Matthew Trahms
Abstract:
The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by embedding tracker data as a graph -- nodes represent hits, while edges represent possible track segments -- and classifying the edges as true or fake track segments. However, their study in hardware- or software-based trigger applications has been limited due to their large computational cost. In this paper, we introduce an automated translation workflow, integrated into a broader tool called $\texttt{hls4ml}$, for converting GNNs into firmware for field-programmable gate arrays (FPGAs). We use this translation tool to implement GNNs for charged particle tracking, trained using the TrackML challenge dataset, on FPGAs with designs targeting different graph sizes, task complexities, and latency/throughput requirements. This work could enable the inclusion of charged particle tracking GNNs at the trigger level for HL-LHC experiments.
Submitted 23 March, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
Charged particle tracking via edge-classifying interaction networks
Authors:
Gage DeZoort,
Savannah Thais,
Javier Duarte,
Vesal Razavimaleki,
Markus Atkinson,
Isobel Ojalvo,
Mark Neubauer,
Peter Elmer
Abstract:
Recent work has demonstrated that geometric deep learning methods such as graph neural networks (GNNs) are well suited to address a variety of reconstruction problems in high energy particle physics. In particular, particle tracking data is naturally represented as a graph by identifying silicon tracker hits as nodes and particle trajectories as edges; given a set of hypothesized edges, edge-classifying GNNs identify those corresponding to real particle trajectories. In this work, we adapt the physics-motivated interaction network (IN) GNN toward the problem of particle tracking in pileup conditions similar to those expected at the high-luminosity Large Hadron Collider. Assuming idealized hit filtering at various particle momenta thresholds, we demonstrate the IN's excellent edge-classification accuracy and tracking efficiency through a suite of measurements at each stage of GNN-based tracking: graph construction, edge classification, and track building. The proposed IN architecture is substantially smaller than previously studied GNN tracking architectures; this is particularly promising as a reduction in size is critical for enabling GNN-based tracking in constrained computing environments. Furthermore, the IN may be represented as either a set of explicit matrix operations or a message passing GNN. Efforts are underway to accelerate each representation via heterogeneous computing resources towards both high-level and low-latency triggering applications.
Submitted 18 November, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
-
AwkwardForth: accelerating Uproot with an internal DSL
Authors:
Jim Pivarski,
Ianna Osborne,
Pratyush Das,
David Lange,
Peter Elmer
Abstract:
File formats for generic data structures, such as ROOT, Avro, and Parquet, pose a problem for deserialization: it must be fast, but its code depends on the type of the data structure, not known at compile-time. Just-in-time compilation can satisfy both constraints, but we propose a more portable solution: specialized virtual machines. AwkwardForth is a Forth-driven virtual machine for deserializing data into Awkward Arrays. As a language, it is not intended for humans to write, but it loosens the coupling between Uproot and Awkward Array. AwkwardForth programs for deserializing record-oriented formats (ROOT and Avro) are about as fast as C++ ROOT and 10-80$\times$ faster than fastavro. Columnar formats (simple TTrees, RNTuple, and Parquet) only require specialization to interpret metadata and are therefore faster with precompiled code.
Submitted 24 February, 2021;
originally announced February 2021.
-
Parallelizing the Unpacking and Clustering of Detector Data for Reconstruction of Charged Particle Tracks on Multi-core CPUs and Many-core GPUs
Authors:
Giuseppe Cerati,
Peter Elmer,
Brian Gravelle,
Matti Kortelainen,
Vyacheslav Krutelyov,
Steven Lantz,
Mario Masciovecchio,
Kevin McDermott,
Boyana Norris,
Allison Reinsvold Hall,
Micheal Reid,
Daniel Riley,
Matevž Tadel,
Peter Wittich,
Bei Wang,
Frank Würthwein,
Avraham Yagil
Abstract:
We present results from parallelizing the unpacking and clustering steps of the raw data from the silicon strip modules for reconstruction of charged particle tracks. Throughput is further improved by concurrently processing multiple events using nested OpenMP parallelism on CPU or CUDA streams on GPU. The new implementation along with earlier work in developing a parallelized and vectorized implementation of the combinatoric Kalman filter algorithm has enabled efficient global reconstruction of the entire event on modern computer architectures. We demonstrate the performance of the new implementation on Intel Xeon and NVIDIA GPU architectures.
Submitted 27 January, 2021;
originally announced January 2021.
-
Awkward Arrays in Python, C++, and Numba
Authors:
Jim Pivarski,
Peter Elmer,
David Lange
Abstract:
The Awkward Array library has been an important tool for physics analysis in Python since September 2018. However, some interface and implementation issues have been raised in Awkward Array's first year that argue for a reimplementation in C++ and Numba. We describe those issues, the new architecture, and present some examples of how the new interface will look to users. Of particular importance is the separation of kernel functions from data structure management, which allows a C++ implementation and a Numba implementation to share kernel functions, and the algorithm that transforms record-oriented data into columnar Awkward Arrays.
Submitted 2 July, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Using Big Data Technologies for HEP Analysis
Authors:
Matteo Cremonesi,
Claudio Bellini,
Bianny Bian,
Luca Canali,
Vasileios Dimakopoulos,
Peter Elmer,
Ian Fisk,
Maria Girone,
Oliver Gutsche,
Siew-Yan Hoh,
Bo Jayatilaka,
Viktor Khristenko,
Andrea Luiselli,
Andrew Melo,
Evangelos Evangelos,
Dominick Olivito,
Jacopo Pazzini,
Jim Pivarski,
Alexey Svyatkovskiy,
Marco Zanetti
Abstract:
The HEP community is approaching an era where the excellent performance of the particle accelerators in delivering collisions at a high rate will force the experiments to record a large amount of information. The growing size of the datasets could potentially become a limiting factor in the capability to produce scientific results in a timely and efficient manner. Recently, new technologies and new approaches have been developed in industry to answer the need to retrieve information as quickly as possible when analyzing PB and EB datasets. Providing the scientists with these modern computing tools will lead to rethinking the principles of data analysis in HEP, making the overall scientific process faster and smoother.
In this paper, we present the latest developments and the most recent results on the usage of Apache Spark for HEP analysis. The study aims at evaluating the efficiency of the application of the new tools both quantitatively, by measuring the performance, and qualitatively, focusing on the user experience. The first goal is achieved by developing a data reduction facility: working together with CERN Openlab and Intel, CMS replicates a real physics search using Spark-based technologies, with the ambition of reducing 1 PB of public data in 5 hours, collected by the CMS experiment, to 1 TB of data in a format suitable for physics analysis.
The second goal is achieved by implementing multiple physics use-cases in Apache Spark using as input preprocessed datasets derived from official CMS data and simulation. By performing different end-analyses up to the publication plots on different hardware, feasibility, usability and portability are compared to the ones of a traditional ROOT-based workflow.
Submitted 21 January, 2019;
originally announced January 2019.
-
CMS Analysis and Data Reduction with Apache Spark
Authors:
Oliver Gutsche,
Luca Canali,
Illia Cremer,
Matteo Cremonesi,
Peter Elmer,
Ian Fisk,
Maria Girone,
Bo Jayatilaka,
Jim Kowalkowski,
Viktor Khristenko,
Evangelos Motesnitsalis,
Jim Pivarski,
Saba Sehrish,
Kacper Surdy,
Alexey Svyatkovskiy
Abstract:
Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was among the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems for distributed data processing, collectively called "Big Data" technologies have emerged from industry and open source projects to support the analysis of Petabyte and Exabyte datasets in industry. While the principles of data analysis in HEP have not changed (filtering and transforming experiment-specific data formats), these new technologies use different approaches and tools, promising a fresh look at analysis of very large datasets that could potentially reduce the time-to-physics with increased interactivity. Moreover, these new tools are typically actively developed by large communities, often profiting from industry resources, and under open source licensing. These factors result in a boost for adoption and maturity of the tools and for the communities supporting them, at the same time helping in reducing the cost of ownership for the end-users. In this talk, we are presenting studies of using Apache Spark for end user data analysis. We are studying the HEP analysis workflow separated into two thrusts: the reduction of centrally produced experiment datasets and the end-analysis up to the publication plot. Studying the first thrust, CMS is working together with CERN openlab and Intel on the CMS Big Data Reduction Facility. The goal is to reduce 1 PB of official CMS data to 1 TB of ntuple output for analysis. We are presenting the progress of this 2-year project with first results of scaling up Spark-based HEP analysis. Studying the second thrust, we are presenting studies on using Apache Spark for a CMS Dark Matter physics search, comparing Spark's feasibility, usability and performance to the ROOT-based analysis.
Submitted 31 October, 2017;
originally announced November 2017.
-
Fast Access to Columnar, Hierarchically Nested Data via Code Transformation
Authors:
Jim Pivarski,
Peter Elmer,
Brian Bockelman,
Zhe Zhang
Abstract:
Big Data query systems represent data in a columnar format for fast, selective access, and in some cases (e.g. Apache Drill), perform calculations directly on the columnar data without row materialization, avoiding runtime costs.
However, many analysis procedures cannot be easily or efficiently expressed as SQL. In High Energy Physics, the majority of data processing requires nested loops with complex dependencies. When faced with tasks like these, the conventional approach is to convert the columnar data back into an object form, usually with a performance price.
This paper describes a new technique to transform procedural code so that it operates on hierarchically nested, columnar data natively, without row materialization. It can be viewed as a compiler pass on the typed abstract syntax tree, rewriting references to objects as columnar array lookups.
We will also present performance comparisons between transformed code and conventional object-oriented code in a High Energy Physics context.
Submitted 3 November, 2017; v1 submitted 20 August, 2017;
originally announced August 2017.
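The idea in the abstract above, rewriting references to objects as columnar array lookups, can be sketched by hand on a toy example. This is only an illustration under stated assumptions: the data, the offsets-plus-content layout, and the function names are hypothetical, and the rewrite is done manually here, whereas the paper describes an automated transformation of the syntax tree.

```python
# Hypothetical toy data: each event holds a variable-length list of pt values.
events = [[10.0, 20.0], [], [5.0]]


def to_columnar(nested):
    """Flatten nested lists into one content array plus per-event offsets."""
    offsets = [0]
    content = []
    for ev in nested:
        content.extend(ev)
        offsets.append(len(content))
    return offsets, content


def max_pt_objects(nested):
    """Conventional form: loop over materialized per-event objects."""
    return [max(ev) if ev else None for ev in nested]


def max_pt_columnar(offsets, content):
    """Same logic, with each object reference replaced by an index lookup."""
    result = []
    for i in range(len(offsets) - 1):
        best = None
        # Event i owns content[offsets[i]:offsets[i + 1]]; no list is built.
        for j in range(offsets[i], offsets[i + 1]):
            if best is None or content[j] > best:
                best = content[j]
        result.append(best)
    return result


offsets, content = to_columnar(events)
```

Both functions return the same per-event maxima, but the columnar version only indexes into flat arrays, which is the property that lets such code run without row materialization.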
-
Big Data in HEP: A comprehensive use case study
Authors:
Oliver Gutsche,
Matteo Cremonesi,
Peter Elmer,
Bo Jayatilaka,
Jim Kowalkowski,
Jim Pivarski,
Saba Sehrish,
Cristina Mantilla Surez,
Alexey Svyatkovskiy,
Nhan Tran
Abstract:
Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems collectively called Big Data technologies have emerged to support the analysis of Petabyte and Exabyte datasets in industry. While the principles of data analysis in HEP have not changed (filtering and transforming experiment-specific data formats), these new technologies use different approaches and promise a fresh look at analysis of very large datasets and could potentially reduce the time-to-physics with increased interactivity. In this talk, we present an active LHC Run 2 analysis, searching for dark matter with the CMS detector, as a testbed for Big Data technologies. We directly compare the traditional NTuple-based analysis with an equivalent analysis using Apache Spark on the Hadoop ecosystem and beyond. In both cases, we start the analysis with the official experiment data formats and produce publication physics plots. We will discuss advantages and disadvantages of each approach and give an outlook on further studies needed.
Submitted 12 March, 2017;
originally announced March 2017.
-
High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)
Authors:
Salman Habib,
Robert Roser,
Tom LeCompte,
Zach Marshall,
Anders Borgland,
Brett Viren,
Peter Nugent,
Makoto Asai,
Lothar Bauerdick,
Hal Finkel,
Steve Gottlieb,
Stefan Hoeche,
Paul Sheldon,
Jean-Luc Vay,
Peter Elmer,
Michael Kirby,
Simon Patton,
Maxim Potekhin,
Brian Yanny,
Paolo Calafiura,
Eli Dart,
Oliver Gutsche,
Taku Izubuchi,
Adam Lyon,
Don Petravick
Abstract:
Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlapping drivers -- 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document, and are presented along with introductory material.
Submitted 28 October, 2015;
originally announced October 2015.
-
Future Computing Platforms for Science in a Power Constrained Era
Authors:
David Abdurachmanov,
Peter Elmer,
Giulio Eulisse,
Robert Knight
Abstract:
Power consumption will be a key constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics (HEP). This makes performance-per-watt a crucial metric for selecting cost-efficient computing solutions. For this paper, we have done a wide survey of current and emerging architectures becoming available on the market including x86-64 variants, ARMv7 32-bit, ARMv8 64-bit, Many-Core and GPU solutions, as well as newer System-on-Chip (SoC) solutions. We compare performance and energy efficiency using an evolving set of standardized HEP-related benchmarks and power measurement techniques we have been developing. We evaluate the potential for use of such computing solutions in the context of DHTC systems, such as the Worldwide LHC Computing Grid (WLCG).
Submitted 28 July, 2015;
originally announced October 2015.
-
Designing Computing System Architecture and Models for the HL-LHC era
Authors:
Lothar Bauerdick,
Brian Bockelman,
Peter Elmer,
Stephen Gowdy,
Matevz Tadel,
Frank Wuerthwein
Abstract:
This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade.
Submitted 20 July, 2015;
originally announced July 2015.
-
Optimizing CMS build infrastructure via Apache Mesos
Authors:
David Abdurachmanov,
Alessandro Degano,
Peter Elmer,
Giulio Eulisse,
David Mendez,
Shahzad Muzaffar
Abstract:
The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general use open-source code. A critical ingredient to the success of the construction and early operation of the WLCG was the convergence, around the year 2000, on the use of a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuous integration system to schedule jobs on a relatively small Apache Mesos enabled cluster and how this resulted in better resource usage, higher peak performance and lower latency thanks to the dynamic scheduling capabilities of Mesos.
Submitted 28 July, 2015; v1 submitted 20 July, 2015;
originally announced July 2015.
-
Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi
Authors:
David Abdurachmanov,
Brian Bockelman,
Peter Elmer,
Giulio Eulisse,
Robert Knight,
Shahzad Muzaffar
Abstract:
Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
Submitted 10 October, 2014;
originally announced October 2014.
-
Techniques and tools for measuring energy efficiency of scientific software applications
Authors:
David Abdurachmanov,
Peter Elmer,
Giulio Eulisse,
Robert Knight,
Tapio Niemi,
Jukka K. Nurminen,
Filip Nyback,
Goncalo Pestana,
Zhonghong Ou,
Kashif Khan
Abstract:
The scale of scientific High Performance Computing (HPC) and High Throughput Computing (HTC) has increased significantly in recent years, and is becoming sensitive to total energy use and cost. Energy-efficiency has thus become an important concern in scientific fields such as High Energy Physics (HEP). There has been a growing interest in utilizing alternate architectures, such as low power ARM processors, to replace traditional Intel x86 architectures. Nevertheless, even though such solutions have been successfully used in mobile applications with low I/O and memory demands, it is unclear if they are suitable and more energy-efficient in the scientific computing environment. Furthermore, there is a lack of tools and experience to derive and compare power consumption between the architectures for various workloads, and eventually to support software optimizations for energy efficiency. To that end, we have performed several physical and software-based measurements of workloads from HEP applications running on ARM and Intel architectures, and compare their power consumption and performance. We leverage several profiling tools (both in hardware and software) to extract different characteristics of the power use. We report the results of these measurements and the experience gained in developing a set of measurement techniques and profiling tools to accurately assess the power consumption for scientific workloads.
Submitted 10 October, 2014;
originally announced October 2014.
-
Power-aware applications for scientific cluster and distributed computing
Authors:
David Abdurachmanov,
Peter Elmer,
Giulio Eulisse,
Paola Grosso,
Curtis Hillegas,
Burt Holzman,
Ruben L. Janssen,
Sander Klous,
Robert Knight,
Shahzad Muzaffar
Abstract:
The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The computing capacity required from this system is also expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest.
A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers we provide information on the large HEP-focused Tier-1 at FNAL, and the Tigress High Performance Computing Center at Princeton University, which provides HPC resources in a university context.
Submitted 22 October, 2014; v1 submitted 28 April, 2014;
originally announced April 2014.
-
Explorations of the viability of ARM and Xeon Phi for physics processing
Authors:
David Abdurachmanov,
Kapil Arya,
Josh Bendavid,
Tommaso Boccali,
Gene Cooperman,
Andrea Dotti,
Peter Elmer,
Giulio Eulisse,
Francesco Giacomini,
Christopher D. Jones,
Matteo Manzali,
Shahzad Muzaffar
Abstract:
We report on our investigations into the viability of the ARM processor and the Intel Xeon Phi co-processor for scientific computing. We describe our experience porting software to these processors and running benchmarks using real physics applications to explore the potential of these processors for production physics processing.
Submitted 21 January, 2014; v1 submitted 5 November, 2013;
originally announced November 2013.
-
Use of checkpoint-restart for complex HEP software on traditional architectures and Intel MIC
Authors:
Kapil Arya,
Gene Cooperman,
Andrea Dotti,
Peter Elmer
Abstract:
Process checkpoint-restart is a technology with great potential for use in HEP workflows. Use cases include debugging, reducing the startup time of applications both in offline batch jobs and the High Level Trigger, permitting job preemption in environments where spare CPU cycles are being used opportunistically, and efficient scheduling of a mix of multicore and single-threaded jobs. We report on tests of checkpoint-restart technology using CMS software, Geant4-MT (multi-threaded Geant4), and the DMTCP (Distributed Multithreaded Checkpointing) package. We analyze both single- and multi-threaded applications and test on both standard Intel x86 architectures and on Intel MIC. The tests with multi-threaded applications on Intel MIC are used to consider scalability and performance. These are considered an indicator of what the future may hold for many-core computing.
Submitted 22 January, 2014; v1 submitted 1 November, 2013;
originally announced November 2013.
-
Initial explorations of ARM processors for scientific computing
Authors:
David Abdurachmanov,
Peter Elmer,
Giulio Eulisse,
Shahzad Muzaffar
Abstract:
Power efficiency is becoming an ever more important metric for both high performance and high throughput computing. Over the course of the next decade it is expected that flops/watt will be a major driver for the evolution of computer architecture. Servers with large numbers of ARM processors, already ubiquitous in mobile computing, are a promising alternative to traditional x86-64 computing. We present the results of our initial investigations into the use of ARM processors for scientific computing applications. In particular we report the results from our work with a current generation ARMv7 development board to explore ARM-specific issues regarding the software development environment, operating system, performance benchmarks and issues for porting High Energy Physics software.
Submitted 22 January, 2014; v1 submitted 1 November, 2013;
originally announced November 2013.
-
The Need for an R&D and Upgrade Program for CMS Software and Computing
Authors:
Peter Elmer,
Salvatore Rappoccio,
Kevin Stenson,
Peter Wittich
Abstract:
Over the next ten years, the physics reach of the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) will be greatly extended through increases in the instantaneous luminosity of the accelerator and large increases in the amount of collected data. Due to changes in the way Moore's Law computing performance gains have been realized in the past decade, an aggressive program of R&D is needed to ensure that the computing capability of CMS will be up to the task of collecting and analyzing this data.
Submitted 6 August, 2013;
originally announced August 2013.
-
Distributed Offline Data Reconstruction in BaBar
Authors:
Teela Pulliam,
Peter Elmer,
Alvise Dorigo
Abstract:
The BaBar experiment at SLAC is in its fourth year of running. The data processing system has been continuously evolving to meet the challenges of higher luminosity running and the increasing bulk of data to re-process each year. To meet these goals a two-pass processing architecture has been adopted, where 'rolling calibrations' are quickly calculated on a small fraction of the events in the first pass and the bulk data reconstruction done in the second. This allows for quick detector feedback in the first pass and allows for the parallelization of the second pass over two or more separate farms. This two-pass system also allows for distribution of processing farms off-site. The first such site has been set up at INFN Padova. The challenges met here were many. The software was ported to a full Linux-based, commodity hardware system. The raw dataset, 90 TB, was imported from SLAC utilizing a 155 Mbps network link. A system for quality control and export of the processed data back to SLAC was developed. Between SLAC and Padova we are currently running three pass-one farms, with 32 CPUs each, and nine pass-two farms with 64 to 80 CPUs each. The pass-two farms can process between 2 and 4 million events per day. Details about the implementation and performance of the system will be presented.
Submitted 13 June, 2003;
originally announced June 2003.
-
The new BaBar Data Reconstruction Control System
Authors:
A. Ceseracciu,
M. Piemontese,
F. Safai Tehrani,
P. Elmer,
D. Johnson,
T. M. Pulliam
Abstract:
The BaBar experiment is characterized by extremely high luminosity, a complex detector, and a huge data volume, with increasing requirements each year. To fulfill these requirements a new control system has been designed and developed for the offline data reconstruction system. The new control system described in this paper provides the performance and flexibility needed to manage a large number of small computing farms, and takes full advantage of OO design. The infrastructure is well isolated from the processing layer; it is generic and flexible, based on a light framework providing message passing and cooperative multitasking. The system is actively distributed, enforces the separation between different processing tiers by using different naming domains, and glues them together by dedicated brokers. It provides a powerful Finite State Machine framework to describe custom processing models in a simple regular language. This paper describes this new control system, currently in use at SLAC and Padova on ~450 CPUs organized in 12 farms.
Submitted 11 June, 2003; v1 submitted 31 May, 2003;
originally announced June 2003.