-
Evaluating Portable Parallelization Strategies for Heterogeneous Architectures in High Energy Physics
Authors:
Mohammad Atif,
Meghna Battacharya,
Paolo Calafiura,
Taylor Childers,
Mark Dewing,
Zhihua Dong,
Oliver Gutsche,
Salman Habib,
Kyle Knoepfel,
Matti Kortelainen,
Ka Hei Martin Kwok,
Charles Leggett,
Meifeng Lin,
Vincent Pascuzzi,
Alexei Strelchenko,
Vakhtang Tsulaia,
Brett Viren,
Tianle Wang,
Beomki Yeo,
Haiwang Yu
Abstract:
High-energy physics (HEP) experiments have developed millions of lines of code over decades that are optimized to run on traditional x86 CPU systems. However, we are seeing a rapidly increasing fraction of floating-point computing power in leadership-class computing facilities and traditional data centers coming from new accelerator architectures, such as GPUs. HEP experiments are now faced with the untenable prospect of rewriting millions of lines of x86 CPU code for the increasingly dominant architectures found in these computational accelerators. This task is made more challenging by the architecture-specific languages and APIs promoted by manufacturers such as NVIDIA, Intel and AMD. Producing multiple, architecture-specific implementations is not a viable scenario, given the available person power and code maintenance issues.
The Portable Parallelization Strategies team of the HEP Center for Computational Excellence is investigating the use of Kokkos, SYCL, OpenMP, std::execution::parallel and alpaka as potential portability solutions that promise to execute on multiple architectures from the same source code, using representative use cases from major HEP experiments, including the DUNE experiment of the Long Baseline Neutrino Facility, and the ATLAS and CMS experiments of the Large Hadron Collider. This cross-cutting evaluation of portability solutions using real applications will help inform and guide the HEP community when choosing their software and hardware suites for the next generation of experimental frameworks. We present the outcomes of our studies, including performance metrics, porting challenges, API evaluations, and build system integration.
Submitted 27 June, 2023;
originally announced June 2023.
-
A reconfigurable neural network ASIC for detector front-end data compression at the HL-LHC
Authors:
Giuseppe Di Guglielmo,
Farah Fahim,
Christian Herwig,
Manuel Blanco Valentin,
Javier Duarte,
Cristian Gingu,
Philip Harris,
James Hirschauer,
Martin Kwok,
Vladimir Loncar,
Yingyi Luo,
Llovizna Miranda,
Jennifer Ngadiuba,
Daniel Noonan,
Seda Ogrenci-Memik,
Maurizio Pierini,
Sioni Summers,
Nhan Tran
Abstract:
Despite advances in the programmable logic capabilities of modern trigger systems, a significant bottleneck remains in the amount of data to be transported from the detector to off-detector logic where trigger decisions are made. We demonstrate that a neural network autoencoder model can be implemented in a radiation-tolerant ASIC to perform lossy data compression, alleviating the data transmission problem while preserving critical information of the detector energy profile. For our application, we consider the high-granularity calorimeter from the CMS experiment at the CERN Large Hadron Collider. The advantage of the machine learning approach is in the flexibility and configurability of the algorithm. By changing the neural network weights, a unique data compression algorithm can be deployed for each sensor in different detector regions, and changing detector or collider conditions. To meet area, performance, and power constraints, we perform quantization-aware training to create an optimized neural network hardware implementation. The design is achieved through the use of high-level synthesis tools and the hls4ml framework, and was processed through synthesis and physical layout flows based on an LP CMOS 65 nm technology node. The flow anticipates 200 Mrad of ionizing radiation to select gates, and reports a total area of 3.6 mm^2 and a power consumption of 95 mW. The simulated energy consumption per inference is 2.4 nJ. This is the first radiation-tolerant on-detector ASIC implementation of a neural network that has been designed for particle physics applications.
Submitted 4 May, 2021;
originally announced May 2021.
-
End-to-End Real-time Catheter Segmentation with Optical Flow-Guided Warping during Endovascular Intervention
Authors:
Anh Nguyen,
Dennis Kundrat,
Giulio Dagnino,
Wenqiang Chi,
Mohamed E. M. K. Abdelaziz,
Yao Guo,
YingLiang Ma,
Trevor M. Y. Kwok,
Celia Riga,
Guang-Zhong Yang
Abstract:
Accurate real-time catheter segmentation is an important pre-requisite for robot-assisted endovascular intervention. Most of the existing learning-based methods for catheter segmentation and tracking are only trained on small-scale datasets or synthetic data due to the difficulties of ground-truth annotation. Furthermore, the temporal continuity in intraoperative imaging sequences is not fully utilised. In this paper, we present FW-Net, an end-to-end and real-time deep learning framework for endovascular intervention. The proposed FW-Net has three modules: a segmentation network with encoder-decoder architecture, a flow network to extract optical flow information, and a novel flow-guided warping function to learn the frame-to-frame temporal continuity. We show that by effectively learning temporal continuity, the network can successfully segment and track the catheters in real-time sequences using only raw ground-truth for training. Detailed validation results confirm that our FW-Net outperforms state-of-the-art techniques while achieving real-time performance.
Submitted 16 June, 2020;
originally announced June 2020.