Search | arXiv e-print repository

Knowledge Distillation for Anomaly Detection

Authors: Adrian Alan Pol, Ekaterina Govorkova, Sonja Gronroos, Nadezda Chernyavskaya, Philip Harris, Maurizio Pierini, Isobel Ojalvo, Peter Elmer

Abstract: Unsupervised deep learning techniques are widely used to identify anomalous behaviour. The performance of such methods is a product of the amount of training data and the model size. However, the size is often a limiting factor for the deployment on resource-constrained devices. We present a novel procedure based on knowledge distillation for compressing an unsupervised anomaly detection model int… ▽ More Unsupervised deep learning techniques are widely used to identify anomalous behaviour. The performance of such methods is a product of the amount of training data and the model size. However, the size is often a limiting factor for the deployment on resource-constrained devices. We present a novel procedure based on knowledge distillation for compressing an unsupervised anomaly detection model into a supervised deployable one and we suggest a set of techniques to improve the detection sensitivity. Compressed models perform comparably to their larger counterparts while significantly reducing the size and memory footprint. △ Less

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2305.04099 [pdf, other]

doi 10.1051/epjconf/202429509036

Symbolic Regression on FPGAs for Fast Machine Learning Inference

Authors: Ho Fung Tsoi, Adrian Alan Pol, Vladimir Loncar, Ekaterina Govorkova, Miles Cranmer, Sridhara Dasu, Peter Elmer, Philip Harris, Isobel Ojalvo, Maurizio Pierini

Abstract: The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to enhance physics sensitivity while still meeting data processing time constraints. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR). It searches the equati… ▽ More The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to enhance physics sensitivity while still meeting data processing time constraints. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR). It searches the equation space to discover algebraic relations approximating a dataset. We use PySR (a software to uncover these expressions based on an evolutionary algorithm) and extend the functionality of hls4ml (a package for machine learning inference in FPGAs) to support PySR-generated expressions for resource-constrained production environments. Deep learning models often optimize the top metric by pinning the network size because the vast hyperparameter space prevents an extensive search for neural architecture. Conversely, SR selects a set of models on the Pareto front, which allows for optimizing the performance-resource trade-off directly. By embedding symbolic forms, our implementation can dramatically reduce the computational resources needed to perform critical tasks. We validate our method on a physics benchmark: the multiclass classification of jets produced in simulated proton-proton collisions at the CERN Large Hadron Collider. We show that our approach can approximate a 3-layer neural network using an inference model that achieves up to a 13-fold decrease in execution time, down to 5 ns, while still preserving more than 90% approximation accuracy. △ Less

Submitted 17 January, 2024; v1 submitted 6 May, 2023; originally announced May 2023.

Comments: 9 pages. Accepted to 26th International Conference on Computing in High Energy & Nuclear Physics (CHEP 2023)

Journal ref: EPJ Web of Conferences 295, 09036 (2024)

arXiv:2205.08193 [pdf, ps, other]

The HEP Software Foundation Community

Authors: Graeme A Stewart, Peter Elmer, Elizabeth Sexton-Kennedy

Abstract: The HEP Software Foundation was founded in 2014 to tackle common problems of software development and sustainability for high-energy physics. In this paper we outline the motivation for the founding of the organisation and give a brief history of its development. We describe how the organisation functions today and what challenges remain to be faced in the future. The HEP Software Foundation was founded in 2014 to tackle common problems of software development and sustainability for high-energy physics. In this paper we outline the motivation for the founding of the organisation and give a brief history of its development. We describe how the organisation functions today and what challenges remain to be faced in the future. △ Less

Submitted 17 May, 2022; originally announced May 2022.

Report number: HSF-DOC-2022-01

arXiv:2112.02048 [pdf, other]

doi 10.3389/fdata.2022.828666

Graph Neural Networks for Charged Particle Tracking on FPGAs

Authors: Abdelrahman Elabd, Vesal Razavimaleki, Shi-Yu Huang, Javier Duarte, Markus Atkinson, Gage DeZoort, Peter Elmer, Scott Hauck, **-Xuan Hu, Shih-Chieh Hsu, Bo-Cheng Lai, Mark Neubauer, Isobel Ojalvo, Savannah Thais, Matthew Trahms

Abstract: The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by em… ▽ More The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by embedding tracker data as a graph -- nodes represent hits, while edges represent possible track segments -- and classifying the edges as true or fake track segments. However, their study in hardware- or software-based trigger applications has been limited due to their large computational cost. In this paper, we introduce an automated translation workflow, integrated into a broader tool called $\texttt{hls4ml}$, for converting GNNs into firmware for field-programmable gate arrays (FPGAs). We use this translation tool to implement GNNs for charged particle tracking, trained using the TrackML challenge dataset, on FPGAs with designs targeting different graph sizes, task complexites, and latency/throughput requirements. This work could enable the inclusion of charged particle tracking GNNs at the trigger level for HL-LHC experiments. △ Less

Submitted 23 March, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: 28 pages, 17 figures, 1 table, published version

Journal ref: Front. Big Data 5 (2022) 828666

arXiv:2103.16701 [pdf, other]

doi 10.1007/s41781-021-00073-z

Charged particle tracking via edge-classifying interaction networks

Authors: Gage DeZoort, Savannah Thais, Javier Duarte, Vesal Razavimaleki, Markus Atkinson, Isobel Ojalvo, Mark Neubauer, Peter Elmer

Abstract: Recent work has demonstrated that geometric deep learning methods such as graph neural networks (GNNs) are well suited to address a variety of reconstruction problems in high energy particle physics. In particular, particle tracking data is naturally represented as a graph by identifying silicon tracker hits as nodes and particle trajectories as edges; given a set of hypothesized edges, edge-class… ▽ More Recent work has demonstrated that geometric deep learning methods such as graph neural networks (GNNs) are well suited to address a variety of reconstruction problems in high energy particle physics. In particular, particle tracking data is naturally represented as a graph by identifying silicon tracker hits as nodes and particle trajectories as edges; given a set of hypothesized edges, edge-classifying GNNs identify those corresponding to real particle trajectories. In this work, we adapt the physics-motivated interaction network (IN) GNN toward the problem of particle tracking in pileup conditions similar to those expected at the high-luminosity Large Hadron Collider. Assuming idealized hit filtering at various particle momenta thresholds, we demonstrate the IN's excellent edge-classification accuracy and tracking efficiency through a suite of measurements at each stage of GNN-based tracking: graph construction, edge classification, and track building. The proposed IN architecture is substantially smaller than previously studied GNN tracking architectures; this is particularly promising as a reduction in size is critical for enabling GNN-based tracking in constrained computing environments. Furthermore, the IN may be represented as either a set of explicit matrix operations or a message passing GNN. Efforts are underway to accelerate each representation via heterogeneous computing resources towards both high-level and low-latency triggering applications. △ Less

Submitted 18 November, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

Comments: This is a post-peer-review, pre-copyedit version of this article. The final authenticated version is available online at: https://doi.org/10.1007/s41781-021-00073-z

Journal ref: Comput. Softw. Big Sci. 5, 26 (2021)

arXiv:2102.13516 [pdf, other]

doi 10.1051/epjconf/202125103002

AwkwardForth: accelerating Uproot with an internal DSL

Authors: Jim Pivarski, Ianna Osborne, Pratyush Das, David Lange, Peter Elmer

Abstract: File formats for generic data structures, such as ROOT, Avro, and Parquet, pose a problem for deserialization: it must be fast, but its code depends on the type of the data structure, not known at compile-time. Just-in-time compilation can satisfy both constraints, but we propose a more portable solution: specialized virtual machines. AwkwardForth is a Forth-driven virtual machine for deserializin… ▽ More File formats for generic data structures, such as ROOT, Avro, and Parquet, pose a problem for deserialization: it must be fast, but its code depends on the type of the data structure, not known at compile-time. Just-in-time compilation can satisfy both constraints, but we propose a more portable solution: specialized virtual machines. AwkwardForth is a Forth-driven virtual machine for deserializing data into Awkward Arrays. As a language, it is not intended for humans to write, but it loosens the coupling between Uproot and Awkward Array. AwkwardForth programs for deserializing record-oriented formats (ROOT and Avro) are about as fast as C++ ROOT and 10-80$\times$ faster than fastavro. Columnar formats (simple TTrees, RNTuple, and Parquet) only require specialization to interpret metadata and are therefore faster with precompiled code. △ Less

Submitted 24 February, 2021; originally announced February 2021.

Comments: 11 pages, 2 figures, submitted to the 25th International Conference on Computing in High Energy & Nuclear Physics

arXiv:2101.11489 [pdf, other]

Parallelizing the Unpacking and Clustering of Detector Data for Reconstruction of Charged Particle Tracks on Multi-core CPUs and Many-core GPUs

Authors: Giuseppe Cerati, Peter Elmer, Brian Gravelle, Matti Kortelainen, Vyacheslav Krutelyov, Steven Lantz, Mario Masciovecchio, Kevin McDermott, Boyana Norris, Allison Reinsvold Hall, Micheal Reid, Daniel Riley, Matevž Tadel, Peter Wittich, Bei Wang, Frank Würthwein, Avraham Yagil

Abstract: We present results from parallelizing the unpacking and clustering steps of the raw data from the silicon strip modules for reconstruction of charged particle tracks. Throughput is further improved by concurrently processing multiple events using nested OpenMP parallelism on CPU or CUDA streams on GPU. The new implementation along with earlier work in develo** a parallelized and vectorized imple… ▽ More We present results from parallelizing the unpacking and clustering steps of the raw data from the silicon strip modules for reconstruction of charged particle tracks. Throughput is further improved by concurrently processing multiple events using nested OpenMP parallelism on CPU or CUDA streams on GPU. The new implementation along with earlier work in develo** a parallelized and vectorized implementation of the combinatoric Kalman filter algorithm has enabled efficient global reconstruction of the entire event on modern computer architectures. We demonstrate the performance of the new implementation on Intel Xeon and NVIDIA GPU architectures. △ Less

Submitted 27 January, 2021; originally announced January 2021.

arXiv:2001.06307 [pdf, other]

doi 10.1051/epjconf/202024505023

Awkward Arrays in Python, C++, and Numba

Authors: Jim Pivarski, Peter Elmer, David Lange

Abstract: The Awkward Array library has been an important tool for physics analysis in Python since September 2018. However, some interface and implementation issues have been raised in Awkward Array's first year that argue for a reimplementation in C++ and Numba. We describe those issues, the new architecture, and present some examples of how the new interface will look to users. Of particular importance i… ▽ More The Awkward Array library has been an important tool for physics analysis in Python since September 2018. However, some interface and implementation issues have been raised in Awkward Array's first year that argue for a reimplementation in C++ and Numba. We describe those issues, the new architecture, and present some examples of how the new interface will look to users. Of particular importance is the separation of kernel functions from data structure management, which allows a C++ implementation and a Numba implementation to share kernel functions, and the algorithm that transforms record-oriented data into columnar Awkward Arrays. △ Less

Submitted 2 July, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: To be published in CHEP 2019 proceedings, EPJ Web of Conferences; post-review update

arXiv:1901.07143 [pdf, other]

doi 10.1051/epjconf/201921406030

Using Big Data Technologies for HEP Analysis

Authors: Matteo Cremonesi, Claudio Bellini, Bianny Bian, Luca Canali, Vasileios Dimakopoulos, Peter Elmer, Ian Fisk, Maria Girone, Oliver Gutsche, Siew-Yan Hoh, Bo Jayatilaka, Viktor Khristenko, Andrea Luiselli, Andrew Melo, Evangelos Evangelos, Dominick Olivito, Jacopo Pazzini, Jim Pivarski, Alexey Svyatkovskiy, Marco Zanetti

Abstract: The HEP community is approaching an era were the excellent performances of the particle accelerators in delivering collision at high rate will force the experiments to record a large amount of information. The growing size of the datasets could potentially become a limiting factor in the capability to produce scientific results timely and efficiently. Recently, new technologies and new approaches… ▽ More The HEP community is approaching an era were the excellent performances of the particle accelerators in delivering collision at high rate will force the experiments to record a large amount of information. The growing size of the datasets could potentially become a limiting factor in the capability to produce scientific results timely and efficiently. Recently, new technologies and new approaches have been developed in industry to answer to the necessity to retrieve information as quickly as possible to analyze PB and EB datasets. Providing the scientists with these modern computing tools will lead to rethinking the principles of data analysis in HEP, making the overall scientific process faster and smoother. In this paper, we are presenting the latest developments and the most recent results on the usage of Apache Spark for HEP analysis. The study aims at evaluating the efficiency of the application of the new tools both quantitatively, by measuring the performances, and qualitatively, focusing on the user experience. The first goal is achieved by develo** a data reduction facility: working together with CERN Openlab and Intel, CMS replicates a real physics search using Spark-based technologies, with the ambition of reducing 1 PB of public data in 5 hours, collected by the CMS experiment, to 1 TB of data in a format suitable for physics analysis. The second goal is achieved by implementing multiple physics use-cases in Apache Spark using as input preprocessed datasets derived from official CMS data and simulation. By performing different end-analyses up to the publication plots on different hardware, feasibility, usability and portability are compared to the ones of a traditional ROOT-based workflow. △ Less

Submitted 21 January, 2019; originally announced January 2019.

arXiv:1711.00375 [pdf, other]

CMS Analysis and Data Reduction with Apache Spark

Authors: Oliver Gutsche, Luca Canali, Illia Cremer, Matteo Cremonesi, Peter Elmer, Ian Fisk, Maria Girone, Bo Jayatilaka, Jim Kowalkowski, Viktor Khristenko, Evangelos Motesnitsalis, Jim Pivarski, Saba Sehrish, Kacper Surdy, Alexey Svyatkovskiy

Abstract: Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was among the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems for distributed data processing, collectively called "Big Data" technologies have emerged from industry and open source projects to support the a… ▽ More Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was among the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems for distributed data processing, collectively called "Big Data" technologies have emerged from industry and open source projects to support the analysis of Petabyte and Exabyte datasets in industry. While the principles of data analysis in HEP have not changed (filtering and transforming experiment-specific data formats), these new technologies use different approaches and tools, promising a fresh look at analysis of very large datasets that could potentially reduce the time-to-physics with increased interactivity. Moreover these new tools are typically actively developed by large communities, often profiting of industry resources, and under open source licensing. These factors result in a boost for adoption and maturity of the tools and for the communities supporting them, at the same time hel** in reducing the cost of ownership for the end-users. In this talk, we are presenting studies of using Apache Spark for end user data analysis. We are studying the HEP analysis workflow separated into two thrusts: the reduction of centrally produced experiment datasets and the end-analysis up to the publication plot. Studying the first thrust, CMS is working together with CERN openlab and Intel on the CMS Big Data Reduction Facility. The goal is to reduce 1 PB of official CMS data to 1 TB of ntuple output for analysis. We are presenting the progress of this 2-year project with first results of scaling up Spark-based HEP analysis. Studying the second thrust, we are presenting studies on using Apache Spark for a CMS Dark Matter physics search, comparing Spark's feasibility, usability and performance to the ROOT-based analysis. △ Less

Submitted 31 October, 2017; originally announced November 2017.

Comments: Proceedings for 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2017). arXiv admin note: text overlap with arXiv:1703.04171

arXiv:1708.08319 [pdf, other]

Fast Access to Columnar, Hierarchically Nested Data via Code Transformation

Authors: Jim Pivarski, Peter Elmer, Brian Bockelman, Zhe Zhang

Abstract: Big Data query systems represent data in a columnar format for fast, selective access, and in some cases (e.g. Apache Drill), perform calculations directly on the columnar data without row materialization, avoiding runtime costs. However, many analysis procedures cannot be easily or efficiently expressed as SQL. In High Energy Physics, the majority of data processing requires nested loops with c… ▽ More Big Data query systems represent data in a columnar format for fast, selective access, and in some cases (e.g. Apache Drill), perform calculations directly on the columnar data without row materialization, avoiding runtime costs. However, many analysis procedures cannot be easily or efficiently expressed as SQL. In High Energy Physics, the majority of data processing requires nested loops with complex dependencies. When faced with tasks like these, the conventional approach is to convert the columnar data back into an object form, usually with a performance price. This paper describes a new technique to transform procedural code so that it operates on hierarchically nested, columnar data natively, without row materialization. It can be viewed as a compiler pass on the typed abstract syntax tree, rewriting references to objects as columnar array lookups. We will also present performance comparisons between transformed code and conventional object-oriented code in a High Energy Physics context. △ Less

Submitted 3 November, 2017; v1 submitted 20 August, 2017; originally announced August 2017.

Comments: 10 pages, 2 figures, submitted to IEEE Big Data

arXiv:1703.04171 [pdf, other]

doi 10.1088/1742-6596/898/7/072012

Big Data in HEP: A comprehensive use case study

Authors: Oliver Gutsche, Matteo Cremonesi, Peter Elmer, Bo Jayatilaka, Jim Kowalkowski, Jim Pivarski, Saba Sehrish, Cristina Mantilla Surez, Alexey Svyatkovskiy, Nhan Tran

Abstract: Experimental Particle Physics has been at the forefront of analyzing the worlds largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems collectively called Big Data technologies have emerged to support the analysis of Petabyte and Exabyte datasets in industry. While the principles of dat… ▽ More Experimental Particle Physics has been at the forefront of analyzing the worlds largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems collectively called Big Data technologies have emerged to support the analysis of Petabyte and Exabyte datasets in industry. While the principles of data analysis in HEP have not changed (filtering and transforming experiment-specific data formats), these new technologies use different approaches and promise a fresh look at analysis of very large datasets and could potentially reduce the time-to-physics with increased interactivity. In this talk, we present an active LHC Run 2 analysis, searching for dark matter with the CMS detector, as a testbed for Big Data technologies. We directly compare the traditional NTuple-based analysis with an equivalent analysis using Apache Spark on the Hadoop ecosystem and beyond. In both cases, we start the analysis with the official experiment data formats and produce publication physics plots. We will discuss advantages and disadvantages of each approach and give an outlook on further studies needed. △ Less

Submitted 12 March, 2017; originally announced March 2017.

Comments: Proceedings for 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2016)

arXiv:1510.08545 [pdf, ps, other]

High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)

Authors: Salman Habib, Robert Roser, Tom LeCompte, Zach Marshall, Anders Borgland, Brett Viren, Peter Nugent, Makoto Asai, Lothar Bauerdick, Hal Finkel, Steve Gottlieb, Stefan Hoeche, Paul Sheldon, Jean-Luc Vay, Peter Elmer, Michael Kirby, Simon Patton, Maxim Potekhin, Brian Yanny, Paolo Calafiura, Eli Dart, Oliver Gutsche, Taku Izubuchi, Adam Lyon, Don Petravick

Abstract: Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence… ▽ More Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlap** drivers -- 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document, and are presented along with introductory material. △ Less

Submitted 28 October, 2015; originally announced October 2015.

Comments: 72 pages

arXiv:1510.03676 [pdf, other]

doi 10.1088/1742-6596/664/9/092007

Future Computing Platforms for Science in a Power Constrained Era

Authors: David Abdurachmanov, Peter Elmer, Giulio Eulisse, Robert Knight

Abstract: Power consumption will be a key constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics (HEP). This makes performance-per-watt a crucial metric for selecting cost-efficient computing solutions. For this paper, we have done a wide survey of current and emerging architectures becoming available on the market including x86-64 variants, ARMv7 32-b… ▽ More Power consumption will be a key constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics (HEP). This makes performance-per-watt a crucial metric for selecting cost-efficient computing solutions. For this paper, we have done a wide survey of current and emerging architectures becoming available on the market including x86-64 variants, ARMv7 32-bit, ARMv8 64-bit, Many-Core and GPU solutions, as well as newer System-on-Chip (SoC) solutions. We compare performance and energy efficiency using an evolving set of standardized HEP-related benchmarks and power measurement techniques we have been develo**. We evaluate the potential for use of such computing solutions in the context of DHTC systems, such as the Worldwide LHC Computing Grid (WLCG). △ Less

Submitted 28 July, 2015; originally announced October 2015.

Comments: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan

arXiv:1507.07430 [pdf, other]

doi 10.1088/1742-6596/664/3/032010

Designing Computing System Architecture and Models for the HL-LHC era

Authors: Lothar Bauerdick, Brian Bockelman, Peter Elmer, Stephen Gowdy, Matevz Tadel, Frank Wuerthwein

Abstract: This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade. This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade. △ Less

Submitted 20 July, 2015; originally announced July 2015.

Comments: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan

arXiv:1507.07429 [pdf, ps, other]

doi 10.1088/1742-6596/664/6/062013

Optimizing CMS build infrastructure via Apache Mesos

Authors: David Abdurachmanov, Alessandro Degano, Peter Elmer, Giulio Eulisse, David Mendez, Shahzad Muzaffar

Abstract: The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general use open-source code. A critical ingredient to the success of the construction and early operation of the WLCG was the convergence, around the year 2000, on the use of a homogeneous… ▽ More The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general use open-source code. A critical ingredient to the success of the construction and early operation of the WLCG was the convergence, around the year 2000, on the use of a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuos integration system to schedule jobs on a relatively small Apache Mesos enabled cluster and how this resulted in better resource usage, higher peak performance and lower latency thanks to the dynamic scheduling capabilities of Mesos. △ Less

Submitted 28 July, 2015; v1 submitted 20 July, 2015; originally announced July 2015.

Comments: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan

arXiv:1410.3441 [pdf, other]

Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

Authors: David Abdurachmanov, Brian Bockelman, Peter Elmer, Giulio Eulisse, Robert Knight, Shahzad Muzaffar

Abstract: Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with special… ▽ More Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG). △ Less

Submitted 10 October, 2014; originally announced October 2014.

Comments: Submitted to proceedings of 16th International workshop on Advanced Computing and Analysis Techniques in physics research (ACAT 2014), Prague

arXiv:1410.3440 [pdf, other]

Techniques and tools for measuring energy efficiency of scientific software applications

Authors: David Abdurachmanov, Peter Elmer, Giulio Eulisse, Robert Knight, Tapio Niemi, Jukka K. Nurminen, Filip Nyback, Goncalo Pestana, Zhonghong Ou, Kashif Khan

Abstract: The scale of scientific High Performance Computing (HPC) and High Throughput Computing (HTC) has increased significantly in recent years, and is becoming sensitive to total energy use and cost. Energy-efficiency has thus become an important concern in scientific fields such as High Energy Physics (HEP). There has been a growing interest in utilizing alternate architectures, such as low power ARM p… ▽ More The scale of scientific High Performance Computing (HPC) and High Throughput Computing (HTC) has increased significantly in recent years, and is becoming sensitive to total energy use and cost. Energy-efficiency has thus become an important concern in scientific fields such as High Energy Physics (HEP). There has been a growing interest in utilizing alternate architectures, such as low power ARM processors, to replace traditional Intel x86 architectures. Nevertheless, even though such solutions have been successfully used in mobile applications with low I/O and memory demands, it is unclear if they are suitable and more energy-efficient in the scientific computing environment. Furthermore, there is a lack of tools and experience to derive and compare power consumption between the architectures for various workloads, and eventually to support software optimizations for energy efficiency. To that end, we have performed several physical and software-based measurements of workloads from HEP applications running on ARM and Intel architectures, and compare their power consumption and performance. We leverage several profiling tools (both in hardware and software) to extract different characteristics of the power use. We report the results of these measurements and the experience gained in develo** a set of measurement techniques and profiling tools to accurately assess the power consumption for scientific workloads. △ Less

Submitted 10 October, 2014; originally announced October 2014.

Comments: Submitted to proceedings of 16th International workshop on Advanced Computing and Analysis Techniques in physics research (ACAT 2014), Prague

arXiv:1404.6929 [pdf, other]

Power-aware applications for scientific cluster and distributed computing

Authors: David Abdurachmanov, Peter Elmer, Giulio Eulisse, Paola Grosso, Curtis Hillegas, Burt Holzman, Ruben L. Janssen, Sander Klous, Robert Knight, Shahzad Muzaffar

Abstract: The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The comput… ▽ More The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The computing capacity required from this system is also expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest. A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers we provide information on the large HEP-focused Tier-1 at FNAL, and the Tigress High Performance Computing Center at Princeton University, which provides HPC resources in a university context. △ Less

Submitted 22 October, 2014; v1 submitted 28 April, 2014; originally announced April 2014.

Comments: Submitted to proceedings of International Symposium on Grids and Clouds (ISGC) 2014, 23-28 March 2014, Academia Sinica, Taipei, Taiwan

arXiv:1311.1001 [pdf, other]

doi 10.1088/1742-6596/513/5/052008

Explorations of the viability of ARM and Xeon Phi for physics processing

Authors: David Abdurachmanov, Kapil Arya, Josh Bendavid, Tommaso Boccali, Gene Cooperman, Andrea Dotti, Peter Elmer, Giulio Eulisse, Francesco Giacomini, Christopher D. Jones, Matteo Manzali, Shahzad Muzaffar

Abstract: We report on our investigations into the viability of the ARM processor and the Intel Xeon Phi co-processor for scientific computing. We describe our experience porting software to these processors and running benchmarks using real physics applications to explore the potential of these processors for production physics processing. We report on our investigations into the viability of the ARM processor and the Intel Xeon Phi co-processor for scientific computing. We describe our experience porting software to these processors and running benchmarks using real physics applications to explore the potential of these processors for production physics processing. △ Less

Submitted 21 January, 2014; v1 submitted 5 November, 2013; originally announced November 2013.

Comments: Submitted to proceedings of the 20th International Conference on Computing in High Energy and Nuclear Physics (CHEP13), Amsterdam

arXiv:1311.0272 [pdf, other]

doi 10.1088/1742-6596/523/1/012015

Use of checkpoint-restart for complex HEP software on traditional architectures and Intel MIC

Authors: Kapil Arya, Gene Cooperman, Andrea Dotti, Peter Elmer

Abstract: Process checkpoint-restart is a technology with great potential for use in HEP workflows. Use cases include debugging, reducing the startup time of applications both in offline batch jobs and the High Level Trigger, permitting job preemption in environments where spare CPU cycles are being used opportunistically and efficient scheduling of a mix of multicore and single-threaded jobs. We report on… ▽ More Process checkpoint-restart is a technology with great potential for use in HEP workflows. Use cases include debugging, reducing the startup time of applications both in offline batch jobs and the High Level Trigger, permitting job preemption in environments where spare CPU cycles are being used opportunistically and efficient scheduling of a mix of multicore and single-threaded jobs. We report on tests of checkpoint-restart technology using CMS software, Geant4-MT (multi-threaded Geant4), and the DMTCP (Distributed Multithreaded Checkpointing) package. We analyze both single- and multi-threaded applications and test on both standard Intel x86 architectures and on Intel MIC. The tests with multi-threaded applications on Intel MIC are used to consider scalability and performance. These are considered an indicator of what the future may hold for many-core computing. △ Less

Submitted 22 January, 2014; v1 submitted 1 November, 2013; originally announced November 2013.

Comments: Submitted to proceedings of the 15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT2013), Bei**g

arXiv:1311.0269 [pdf, ps, other]

doi 10.1088/1742-6596/523/1/012009

Initial explorations of ARM processors for scientific computing

Authors: David Abdurachmanov, Peter Elmer, Giulio Eulisse, Shahzad Muzaffar

Abstract: Power efficiency is becoming an ever more important metric for both high performance and high throughput computing. Over the course of next decade it is expected that flops/watt will be a major driver for the evolution of computer architecture. Servers with large numbers of ARM processors, already ubiquitous in mobile computing, are a promising alternative to traditional x86-64 computing. We prese… ▽ More Power efficiency is becoming an ever more important metric for both high performance and high throughput computing. Over the course of next decade it is expected that flops/watt will be a major driver for the evolution of computer architecture. Servers with large numbers of ARM processors, already ubiquitous in mobile computing, are a promising alternative to traditional x86-64 computing. We present the results of our initial investigations into the use of ARM processors for scientific computing applications. In particular we report the results from our work with a current generation ARMv7 development board to explore ARM-specific issues regarding the software development environment, operating system, performance benchmarks and issues for porting High Energy Physics software. △ Less

Submitted 22 January, 2014; v1 submitted 1 November, 2013; originally announced November 2013.

Comments: Submitted to proceedings of the 15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT2013), Bei**g. arXiv admin note: text overlap with arXiv:1311.1001

arXiv:1308.1247 [pdf, other]

The Need for an R&D and Upgrade Program for CMS Software and Computing

Authors: Peter Elmer, Salvatore Rappoccio, Kevin Stenson, Peter Wittich

Abstract: Over the next ten years, the physics reach of the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) will be greatly extended through increases in the instantaneous luminosity of the accelerator and large increases in the amount of collected data. Due to changes in the way Moore's Law computing performance gains have been realized in the past decade, an aggressive… ▽ More Over the next ten years, the physics reach of the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) will be greatly extended through increases in the instantaneous luminosity of the accelerator and large increases in the amount of collected data. Due to changes in the way Moore's Law computing performance gains have been realized in the past decade, an aggressive program of R&D is needed to ensure that the computing capability of CMS will be up to the task of collecting and analyzing this data. △ Less

Submitted 6 August, 2013; originally announced August 2013.

arXiv:cs/0306069 [pdf, ps, other]

Distributed Offline Data Reconstruction in BaBar

Authors: Teela Pulliam, Peter Elmer, Alvise Dorigo

Abstract: The BaBar experiment at SLAC is in its fourth year of running. The data processing system has been continuously evolving to meet the challenges of higher luminosity running and the increasing bulk of data to re-process each year. To meet these goals a two-pass processing architecture has been adopted, where 'rolling calibrations' are quickly calculated on a small fraction of the events in the fi… ▽ More The BaBar experiment at SLAC is in its fourth year of running. The data processing system has been continuously evolving to meet the challenges of higher luminosity running and the increasing bulk of data to re-process each year. To meet these goals a two-pass processing architecture has been adopted, where 'rolling calibrations' are quickly calculated on a small fraction of the events in the first pass and the bulk data reconstruction done in the second. This allows for quick detector feedback in the first pass and allows for the parallelization of the second pass over two or more separate farms. This two-pass system allows also for distribution of processing farms off-site. The first such site has been setup at INFN Padova. The challenges met here were many. The software was ported to a full Linux-based, commodity hardware system. The raw dataset, 90 TB, was imported from SLAC utilizing a 155 Mbps network link. A system for quality control and export of the processed data back to SLAC was developed. Between SLAC and Padova we are currently running three pass-one farms, with 32 CPUs each, and nine pass-two farms with 64 to 80 CPUs each. The pass-two farms can process between 2 and 4 million events per day. Details about the implementation and performance of the system will be presented. △ Less

Submitted 13 June, 2003; originally announced June 2003.

Comments: CHEP03 paper, MODT012

Report number: SLAC-PUB-9903 ACM Class: C.3

arXiv:cs/0306008 [pdf, ps, other]

The new BaBar Data Reconstruction Control System

Authors: A. Ceseracciu, M. Piemontese, F. Safai Tehrani, P. Elmer, D. Johnson, T. M. Pulliam

Abstract: The BaBar experiment is characterized by extremely high luminosity, a complex detector, and a huge data volume, with increasing requirements each year. To fulfill these requirements a new control system has been designed and developed for the offline data reconstruction system. The new control system described in this paper provides the performance and flexibility needed to manage a large number… ▽ More The BaBar experiment is characterized by extremely high luminosity, a complex detector, and a huge data volume, with increasing requirements each year. To fulfill these requirements a new control system has been designed and developed for the offline data reconstruction system. The new control system described in this paper provides the performance and flexibility needed to manage a large number of small computing farms, and takes full benefit of OO design. The infrastructure is well isolated from the processing layer, it is generic and flexible, based on a light framework providing message passing and cooperative multitasking. The system is actively distributed, enforces the separation between different processing tiers by using different naming domains, and glues them together by dedicated brokers. It provides a powerful Finite State Machine framework to describe custom processing models in a simple regular language. This paper describes this new control system, currently in use at SLAC and Padova on ~450 CPUs organized in 12 farms. △ Less

Submitted 11 June, 2003; v1 submitted 31 May, 2003; originally announced June 2003.

Comments: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 9 pages, LaTeX, 2 eps figures PSN TUDT011

Report number: SLAC-PUB-9873 ACM Class: H.2.4

Showing 1–25 of 25 results for author: Elmer, P