Search | arXiv e-print repository

doi 10.1051/epjconf/202125103036

Event Classification with Multi-step Machine Learning

Authors: Masahiko Saito, Tomoe Kishimoto, Yuya Kaneta, Taichi Itoh, Yoshiaki Umeda, Junichi Tanaka, Yutaro Iiyama, Ryu Sawada, Koji Terashi

Abstract: The usefulness and value of Multi-step Machine Learning (ML), where a task is organized into connected sub-tasks with known intermediate inference goals, as opposed to a single large model learned end-to-end without intermediate sub-tasks, is presented. Pre-optimized ML models are connected and better performance is obtained by re-optimizing the connected one. The selection of an ML model from sev… ▽ More The usefulness and value of Multi-step Machine Learning (ML), where a task is organized into connected sub-tasks with known intermediate inference goals, as opposed to a single large model learned end-to-end without intermediate sub-tasks, is presented. Pre-optimized ML models are connected and better performance is obtained by re-optimizing the connected one. The selection of an ML model from several small ML model candidates for each sub-task has been performed by using the idea based on Neural Architecture Search (NAS). In this paper, Differentiable Architecture Search (DARTS) and Single Path One-Shot NAS (SPOS-NAS) are tested, where the construction of loss functions is improved to keep all ML models smoothly learning. Using DARTS and SPOS-NAS as an optimization and selection as well as the connections for multi-step machine learning systems, we find that (1) such a system can quickly and successfully select highly performant model combinations, and (2) the selected models are consistent with baseline algorithms, such as grid search, and their outputs are well controlled. △ Less

Submitted 4 June, 2021; originally announced June 2021.

Journal ref: EPJ Web of Conferences 251, 03036 (2021)

arXiv:2101.07571 [pdf, ps, other]

An Improvement of Object Detection Performance using Multi-step Machine Learnings

Authors: Tomoe Kishimoto, Masahiko Saito, Junichi Tanaka, Yutaro Iiyama, Ryu Sawada, Koji Terashi

Abstract: Connecting multiple machine learning models into a pipeline is effective for handling complex problems. By breaking down the problem into steps, each tackled by a specific component model of the pipeline, the overall solution can be made accurate and explainable. This paper describes an enhancement of object detection based on this multi-step concept, where a post-processing step called the calibr… ▽ More Connecting multiple machine learning models into a pipeline is effective for handling complex problems. By breaking down the problem into steps, each tackled by a specific component model of the pipeline, the overall solution can be made accurate and explainable. This paper describes an enhancement of object detection based on this multi-step concept, where a post-processing step called the calibration model is introduced. The calibration model consists of a convolutional neural network, and utilizes rich contextual information based on the domain knowledge of the input. Improvements of object detection performance by 0.8-1.9 in average precision metric over existing object detectors have been observed using the new model. △ Less

Submitted 19 January, 2021; originally announced January 2021.

Comments: Submitted to ICIP 2021

arXiv:2101.05108 [pdf, other]

doi 10.1088/2632-2153/ac0ea1

Fast convolutional neural networks on FPGAs with hls4ml

Authors: Thea Aarrestad, Vladimir Loncar, Nicolò Ghielmetti, Maurizio Pierini, Sioni Summers, Jennifer Ngadiuba, Christoffer Petersson, Hampus Linander, Yutaro Iiyama, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Dylan Rankin, Sergo **dariani, Kevin Pedro, Nhan Tran, Mia Liu, Edward Kreinar, Zhenbin Wu, Duc Hoang

Abstract: We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of $5\,μ$s using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Num… ▽ More We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of $5\,μ$s using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation. △ Less

Submitted 29 April, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

Comments: 18 pages, 18 figures, 4 tables

Journal ref: Mach. Learn.: Sci. Technol. 2 045015 (2021)

arXiv:2008.03601 [pdf, other]

doi 10.3389/fdata.2020.598927

Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics

Authors: Yutaro Iiyama, Gianluca Cerminara, Abhijay Gupta, Jan Kieseler, Vladimir Loncar, Maurizio Pierini, Shah Rukh Qasim, Marcel Rieger, Sioni Summers, Gerrit Van Onsem, Kinga Wozniak, Jennifer Ngadiuba, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Dylan Rankin, Sergo **dariani, Mia Liu, Kevin Pedro, Nhan Tran, Edward Kreinar, Zhenbin Wu

Abstract: Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FGPA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how t… ▽ More Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FGPA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1$μ\mathrm{s}$ on an FPGA. To do so, we consider a representative task associated to particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the $\mathtt{hls4ml}$ library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage. △ Less

Submitted 3 February, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

Comments: 15 pages, 4 figures

Report number: FERMILAB-PUB-20-405-E-SCD

Journal ref: Frontiers in Big Data 3 (2021) 44

arXiv:2003.11409 [pdf, other]

doi 10.1007/s41781-021-00054-2

Dynamo -- Handling Scientific Data Across Sites and Storage Media

Authors: Yutaro Iiyama, Benedikt Maier, Daniel Abercrombie, Maxim Goncharov, Christoph Paus

Abstract: Dynamo is a full-stack software solution for scientific data management. Dynamo's architecture is modular, extensible, and customizable, making the software suitable for managing data in a wide range of installation scales, from a few terabytes stored at a single location to hundreds of petabytes distributed across a worldwide computing grid. This article documents the core system design of Dynamo… ▽ More Dynamo is a full-stack software solution for scientific data management. Dynamo's architecture is modular, extensible, and customizable, making the software suitable for managing data in a wide range of installation scales, from a few terabytes stored at a single location to hundreds of petabytes distributed across a worldwide computing grid. This article documents the core system design of Dynamo and describes the applications that implement various data management tasks. A brief report is also given on the operational experiences of the system at the CMS experiment at the CERN Large Hadron Collider and at a small scale analysis facility. △ Less

Submitted 16 May, 2021; v1 submitted 25 March, 2020; originally announced March 2020.

Comments: 18 pages, 9 figures

Journal ref: Computing and Software for Big Science 5, 11 (2021)

arXiv:1902.07987 [pdf, other]

doi 10.1140/epjc/s10052-019-7113-9

Learning representations of irregular particle-detector geometry with distance-weighted graph networks

Authors: Shah Rukh Qasim, Jan Kieseler, Yutaro Iiyama, Maurizio Pierini

Abstract: We explore the use of graph networks to deal with irregular-geometry detectors in the context of particle reconstruction. Thanks to their representation-learning capabilities, graph networks can exploit the full detector granularity, while natively managing the event sparsity and arbitrarily complex detector geometries. We introduce two distance-weighted graph network architectures, dubbed GarNet… ▽ More We explore the use of graph networks to deal with irregular-geometry detectors in the context of particle reconstruction. Thanks to their representation-learning capabilities, graph networks can exploit the full detector granularity, while natively managing the event sparsity and arbitrarily complex detector geometries. We introduce two distance-weighted graph network architectures, dubbed GarNet and GravNet layers, and apply them to a typical particle reconstruction task. The performance of the new architectures is evaluated on a data set of simulated particle interactions on a toy model of a highly granular calorimeter, loosely inspired by the endcap calorimeter to be installed in the CMS detector for the High-Luminosity LHC phase. We study the clustering of energy depositions, which is the basis for calorimetric particle reconstruction, and provide a quantitative comparison to alternative approaches. The proposed algorithms provide an interesting alternative to existing methods, offering equally performing or less resource-demanding solutions with less underlying assumptions on the detector geometry and, consequently, the possibility to generalize to other detectors. △ Less

Submitted 24 July, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

Comments: 10p, v2: published version

Journal ref: Eur. Phys. J. C, 79 7 (2019) 608

Showing 1–6 of 6 results for author: Iiyama, Y