Search | arXiv e-print repository

Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

Authors: Shashank Subramanian, Peter Harrington, Kurt Keutzer, Wahid Bhimji, Dmitriy Morozov, Michael Mahoney, Amir Gholami

Abstract: Pre-trained machine learning (ML) models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained… ▽ More Pre-trained machine learning (ML) models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained model size is scaled, (ii) the downstream training dataset size is scaled, (iii) the physics parameters are systematically pushed out of distribution, and (iv) how a single model pre-trained on a mixture of different physics problems can be adapted to various downstream applications. We find that-when fine-tuned appropriately-transfer learning can help reach desired accuracy levels with orders of magnitude fewer downstream examples (across different tasks that can even be out-of-distribution) than training from scratch, with consistent behavior across a wide range of downstream examples. We also find that fine-tuning these models yields more performance gains as model size increases, compared to training from scratch on new downstream tasks. These results hold for a broad range of PDE learning tasks. All in all, our results demonstrate the potential of the "pre-train and fine-tune" paradigm for SciML problems, demonstrating a path towards building SciML foundation models. We open-source our code for reproducibility. △ Less

Submitted 31 May, 2023; originally announced June 2023.

Comments: 16 pages, 11 figures

Journal ref: NeurIPS 2023

arXiv:2210.05822 [pdf, other]

The Future of High Energy Physics Software and Computing

Authors: V. Daniel Elvira, Steven Gottlieb, Oliver Gutsche, Benjamin Nachman, S. Bailey, W. Bhimji, P. Boyle, G. Cerati, M. Carrasco Kind, K. Cranmer, G. Davies, V. D. Elvira, R. Gardner, K. Heitmann, M. Hildreth, W. Hopkins, T. Humble, M. Lin, P. Onyisi, J. Qiang, K. Pedro, G. Perdue, A. Roberts, M. Savage, P. Shanahan , et al. (3 additional authors not shown)

Abstract: Software and Computing (S&C) are essential to all High Energy Physics (HEP) experiments and many theoretical studies. The size and complexity of S&C are now commensurate with that of experimental instruments, playing a critical role in experimental design, data acquisition/instrumental control, reconstruction, and analysis. Furthermore, S&C often plays a leading role in driving the precision of th… ▽ More Software and Computing (S&C) are essential to all High Energy Physics (HEP) experiments and many theoretical studies. The size and complexity of S&C are now commensurate with that of experimental instruments, playing a critical role in experimental design, data acquisition/instrumental control, reconstruction, and analysis. Furthermore, S&C often plays a leading role in driving the precision of theoretical calculations and simulations. Within this central role in HEP, S&C has been immensely successful over the last decade. This report looks forward to the next decade and beyond, in the context of the 2021 Particle Physics Community Planning Exercise ("Snowmass") organized by the Division of Particles and Fields (DPF) of the American Physical Society. △ Less

Submitted 8 November, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: Computational Frontier Report Contribution to Snowmass 2021; 41 pages, 1 figure. v2: missing ref and added missing topical group conveners. v3: fixed typos

arXiv:2209.08868 [pdf, other]

Snowmass 2021 Computational Frontier CompF4 Topical Group Report: Storage and Processing Resource Access

Authors: W. Bhimji, D. Carder, E. Dart, J. Duarte, I. Fisk, R. Gardner, C. Guok, B. Jayatilaka, T. Lehman, M. Lin, C. Maltzahn, S. McKee, M. S. Neubauer, O. Rind, O. Shadura, N. V. Tran, P. van Gemmeren, G. Watts, B. A. Weaver, F. Würthwein

Abstract: Computing plays a significant role in all areas of high energy physics. The Snowmass 2021 CompF4 topical group's scope is facilities R&D, where we consider "facilities" as the computing hardware and software infrastructure inside the data centers plus the networking between data centers, irrespective of who owns them, and what policies are applied for using them. In other words, it includes commer… ▽ More Computing plays a significant role in all areas of high energy physics. The Snowmass 2021 CompF4 topical group's scope is facilities R&D, where we consider "facilities" as the computing hardware and software infrastructure inside the data centers plus the networking between data centers, irrespective of who owns them, and what policies are applied for using them. In other words, it includes commercial clouds, federally funded High Performance Computing (HPC) systems for all of science, and systems funded explicitly for a given experimental or theoretical program. This topical group report summarizes the findings and recommendations for the storage, processing, networking and associated software service infrastructures for future high energy physics research, based on the discussions organized through the Snowmass 2021 community study. △ Less

Submitted 29 September, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

Comments: Snowmass 2021 Computational Frontier CompF4 topical group report. v2: Expanded introduction. Updated author list. 52 pages, 6 figures

arXiv:2205.04601 [pdf, other]

Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence

Authors: Ashesh Chattopadhyay, Jaideep Pathak, Ebrahim Nabizadeh, Wahid Bhimji, Pedram Hassanzadeh

Abstract: Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models if trained on observations can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lo… ▽ More Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models if trained on observations can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lot of training data which may not be available from reanalysis (observational data) products. Moreover, an accurate, noise-free, initial condition to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift, which makes these data-driven models unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on a large amount of imperfect long-term climate model simulations and then re-train them on available observational data. In this paper, we propose a convolutional variational autoencoder-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation from a 2-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting with a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder-decoder-based convolutional model in terms of short-term skills while remaining stable for long-term climate simulations yielding accurate climatology. △ Less

Submitted 9 May, 2022; originally announced May 2022.

arXiv:2203.07645 [pdf, other]

Software and Computing for Small HEP Experiments

Authors: Dave Casper, Maria Elena Monzani, Benjamin Nachman, Costas Andreopoulos, Stephen Bailey, Deborah Bard, Wahid Bhimji, Giuseppe Cerati, Grigorios Chachamis, Jacob Daughhetee, Miriam Diamond, V. Daniel Elvira, Alden Fan, Krzysztof Genser, Paolo Girotti, Scott Kravitz, Robert Kutschke, Vincent R. Pascuzzi, Gabriel N. Perdue, Erica Snider, Elizabeth Sexton-Kennedy, Graeme Andrew Stewart, Matthew Szydagis, Eric Torrence, Christopher Tunnell

Abstract: This white paper briefly summarized key conclusions of the recent US Community Study on the Future of Particle Physics (Snowmass 2021) workshop on Software and Computing for Small High Energy Physics Experiments. This white paper briefly summarized key conclusions of the recent US Community Study on the Future of Particle Physics (Snowmass 2021) workshop on Software and Computing for Small High Energy Physics Experiments. △ Less

Submitted 27 December, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021

Report number: FERMILAB-CONF-22-138

arXiv:2105.12880 [pdf, other]

The Petascale DTN Project: High Performance Data Transfer for HPC Facilities

Authors: Eli Dart, William Allcock, Wahid Bhimji, Tim Boerner, Ravinderjeet Cheema, Andrew Cherry, Brent Draney, Salman Habib, Damian Hazen, Jason Hill, Matt Kollross, Suzanne Parete-Koon, Daniel Pelfrey, Adrian Pope, Jeff Porter, David Wheeler

Abstract: The movement of large-scale (tens of Terabytes and larger) data sets between high performance computing (HPC) facilities is an important and increasingly critical capability. A growing number of scientific collaborations rely on HPC facilities for tasks which either require large-scale data sets as input or produce large-scale data sets as output. In order to enable the transfer of these data sets… ▽ More The movement of large-scale (tens of Terabytes and larger) data sets between high performance computing (HPC) facilities is an important and increasingly critical capability. A growing number of scientific collaborations rely on HPC facilities for tasks which either require large-scale data sets as input or produce large-scale data sets as output. In order to enable the transfer of these data sets as needed by the scientific community, HPC facilities must design and deploy the appropriate data transfer capabilities to allow users to do data placement at scale. This paper describes the Petascale DTN Project, an effort undertaken by four HPC facilities, which succeeded in achieving routine data transfer rates of over 1PB/week between the facilities. We describe the design and configuration of the Data Transfer Node (DTN) clusters used for large-scale data transfers at these facilities, the software tools used, and the performance tuning that enabled this capability. △ Less

Submitted 8 September, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

arXiv:2002.06304 [pdf, other]

doi 10.1051/epjconf/202024505012

Optimization of Software on High Performance Computing Platforms for the LUX-ZEPLIN Dark Matter Experiment

Authors: Venkitesh Ayyar, Wahid Bhimji, Maria Elena Monzani, Andrew Naylor, Simon Patton, Craig E. Tull

Abstract: High Energy Physics experiments like the LUX-ZEPLIN dark matter experiment face unique challenges when running their computation on High Performance Computing resources. In this paper, we describe some strategies to optimize memory usage of simulation codes with the help of profiling tools. We employed this approach and achieved memory reduction of 10-30\%. While this has been performed in the con… ▽ More High Energy Physics experiments like the LUX-ZEPLIN dark matter experiment face unique challenges when running their computation on High Performance Computing resources. In this paper, we describe some strategies to optimize memory usage of simulation codes with the help of profiling tools. We employed this approach and achieved memory reduction of 10-30\%. While this has been performed in the context of the LZ experiment, it has wider applicability to other HEP experimental codes that face these challenges on modern computer architectures. △ Less

Submitted 14 February, 2020; originally announced February 2020.

Comments: Contribution to Proceedings of CHEP 2019, Nov 4-8, Adelaide, Australia

Journal ref: EPJ Web of Conferences 245, 05012 (2020)

arXiv:2002.05761 [pdf, other]

doi 10.1051/epjconf/202024506003

The use of Convolutional Neural Networks for signal-background classification in Particle Physics experiments

Authors: Venkitesh Ayyar, Wahid Bhimji, Lisa Gerhardt, Sally Robertson, Zahra Ronaghi

Abstract: The success of Convolutional Neural Networks (CNNs) in image classification has prompted efforts to study their use for classifying image data obtained in Particle Physics experiments. Here, we discuss our efforts to apply CNNs to 2D and 3D image data from particle physics experiments to classify signal from background. In this work we present an extensive convolutional neural architecture searc… ▽ More The success of Convolutional Neural Networks (CNNs) in image classification has prompted efforts to study their use for classifying image data obtained in Particle Physics experiments. Here, we discuss our efforts to apply CNNs to 2D and 3D image data from particle physics experiments to classify signal from background. In this work we present an extensive convolutional neural architecture search, achieving high accuracy for signal/background discrimination for a HEP classification use-case based on simulated data from the Ice Cube neutrino observatory and an ATLAS-like detector. We demonstrate among other things that we can achieve the same accuracy as complex ResNet architectures with CNNs with less parameters, and present comparisons of computational requirements, training and inference times. △ Less

Submitted 13 February, 2020; originally announced February 2020.

Comments: Contribution to Proceedings of CHEP 2019, Nov 4-8, Adelaide, Australia

Journal ref: EPJ Web of Conferences 245, 06003 (2020)

arXiv:1907.03382 [pdf, other]

doi 10.1145/3295500.3356180

Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale

Authors: Atılım Güneş Baydin, Lei Shao, Wahid Bhimji, Lukas Heinrich, Lawrence Meadows, Jialin Liu, Andreas Munk, Saeid Naderiparizi, Bradley Gram-Hansen, Gilles Louppe, Mingfei Ma, Xiaohui Zhao, Philip Torr, Victor Lee, Kyle Cranmer, Prabhat, Frank Wood

Abstract: Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL frame… ▽ More Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN--LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k: achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL. △ Less

Submitted 27 August, 2019; v1 submitted 7 July, 2019; originally announced July 2019.

Comments: 14 pages, 8 figures

MSC Class: 68T37; 68T05; 62P35 ACM Class: G.3; I.2.6; J.2

Journal ref: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19), November 17--22, 2019

arXiv:1903.02556 [pdf, other]

doi 10.1007/JHEP05(2019)181

Machine Learning Templates for QCD Factorization in the Search for Physics Beyond the Standard Model

Authors: Joshua Lin, Wahid Bhimji, Benjamin Nachman

Abstract: High-multiplicity all-hadronic final states are an important, but difficult final state for searching for physics beyond the Standard Model. A powerful search method is to look for large jets with accidental substructure due to multiple hard partons falling within a single jet. One way for estimating the background in this search is to exploit an approximate factorization in quantum chromodynamics… ▽ More High-multiplicity all-hadronic final states are an important, but difficult final state for searching for physics beyond the Standard Model. A powerful search method is to look for large jets with accidental substructure due to multiple hard partons falling within a single jet. One way for estimating the background in this search is to exploit an approximate factorization in quantum chromodynamics whereby the jet mass distribution is determined only by its kinematic properties. Traditionally, this approach has been executed using histograms constructed in a background-rich region. We propose a new approach based on Generative Adversarial Networks (GANs). These neural network approaches are naturally unbinned and can be readily conditioned on multiple jet properties. In addition to using vanilla GANs for this purpose, a modification to the traditional WGAN approach has been investigated where weight clip** is replaced with a naturally compact set (in this case, the circle). Both the vanilla and modified WGAN approaches significantly outperform the histogram method, especially when modeling the dependence on features not used in the histogram construction. These results can be useful for enhancing the sensitivity of LHC searches to high-multiplicity final states involving many quarks and gluons and serve as a useful benchmark where GANs may have immediate benefit to the HEP community. △ Less

Submitted 27 May, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

Comments: 17 pages, 6 figures. v2: Updated to journal version

arXiv:1902.08324 [pdf, other]

A pattern recognition algorithm for quantum annealers

Authors: Frederic Bapst, Wahid Bhimji, Paolo Calafiura, Heather Gray, Wim Lavrijsen, Lucy Linder

Abstract: The reconstruction of charged particles will be a key computing challenge for the high-luminosity Large Hadron Collider (HL-LHC) where increased data rates lead to large increases in running time for current pattern recognition algorithms. An alternative approach explored here expresses pattern recognition as a Quadratic Unconstrained Binary Optimization (QUBO) using software and quantum annealing… ▽ More The reconstruction of charged particles will be a key computing challenge for the high-luminosity Large Hadron Collider (HL-LHC) where increased data rates lead to large increases in running time for current pattern recognition algorithms. An alternative approach explored here expresses pattern recognition as a Quadratic Unconstrained Binary Optimization (QUBO) using software and quantum annealing. At track densities comparable with current LHC conditions, our approach achieves physics performance competitive with state-of-the-art pattern recognition algorithms. More research will be needed to achieve comparable performance in HL-LHC conditions, as increasing track density decreases the purity of the QUBO track segment classifier. △ Less

Submitted 21 February, 2019; originally announced February 2019.

arXiv:1809.06166 [pdf, other]

Graph Neural Networks for IceCube Signal Classification

Authors: Nicholas Choma, Federico Monti, Lisa Gerhardt, Tomasz Palczewski, Zahra Ronaghi, Prabhat, Wahid Bhimji, Michael M. Bronstein, Spencer R. Klein, Joan Bruna

Abstract: Tasks involving the analysis of geometric (graph- and manifold-structured) data have recently gained prominence in the machine learning community, giving birth to a rapidly develo** field of geometric deep learning. In this work, we leverage graph neural networks to improve signal detection in the IceCube neutrino observatory. The IceCube detector array is modeled as a graph, where vertices are… ▽ More Tasks involving the analysis of geometric (graph- and manifold-structured) data have recently gained prominence in the machine learning community, giving birth to a rapidly develo** field of geometric deep learning. In this work, we leverage graph neural networks to improve signal detection in the IceCube neutrino observatory. The IceCube detector array is modeled as a graph, where vertices are sensors and edges are a learned function of the sensors' spatial coordinates. As only a subset of IceCube's sensors is active during a given observation, we note the adaptive nature of our GNN, wherein computation is restricted to the input signal support. We demonstrate the effectiveness of our GNN architecture on a task classifying IceCube events, where it outperforms both a traditional physics-based method as well as classical 3D convolution neural networks. △ Less

Submitted 17 September, 2018; originally announced September 2018.

arXiv:1807.07706 [pdf, other]

Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model

Authors: Atılım Güneş Baydin, Lukas Heinrich, Wahid Bhimji, Lei Shao, Saeid Naderiparizi, Andreas Munk, Jialin Liu, Bradley Gram-Hansen, Gilles Louppe, Lawrence Meadows, Philip Torr, Victor Lee, Prabhat, Kyle Cranmer, Frank Wood

Abstract: We present a novel probabilistic programming framework that couples directly to existing large-scale simulators through a cross-platform probabilistic execution protocol, which allows general-purpose inference engines to record and control random number draws within simulators in a language-agnostic way. The execution of existing simulators as probabilistic programs enables highly interpretable po… ▽ More We present a novel probabilistic programming framework that couples directly to existing large-scale simulators through a cross-platform probabilistic execution protocol, which allows general-purpose inference engines to record and control random number draws within simulators in a language-agnostic way. The execution of existing simulators as probabilistic programs enables highly interpretable posterior inference in the structured model defined by the simulator code base. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the tau lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of a Markov chain Monte Carlo baseline. △ Less

Submitted 17 February, 2020; v1 submitted 20 July, 2018; originally announced July 2018.

Comments: 20 pages, 9 figures

MSC Class: 68T37; 68T05; 62P35 ACM Class: G.3; I.2.6; J.2

Journal ref: In Advances in Neural Information Processing Systems 33 (NeurIPS), Vancouver, Canada, 2019

arXiv:1807.02876 [pdf, other]

Machine Learning in High Energy Physics Community White Paper

Authors: Kim Albertsson, Piero Altoe, Dustin Anderson, John Anderson, Michael Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian Bevan, Wahid Bhimji, Daniele Bonacorsi, Bjorn Burkle, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Yi-fan Chen, Taylor Childers, Yann Coadou, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Andrea De Simone , et al. (103 additional authors not shown)

Abstract: Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We d… ▽ More Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit. △ Less

Submitted 16 May, 2019; v1 submitted 8 July, 2018; originally announced July 2018.

Comments: Editors: Sergei Gleyzer, Paul Seyfert and Steven Schramm

arXiv:1712.07901 [pdf, other]

Improvements to Inference Compilation for Probabilistic Programming in Large-Scale Scientific Simulators

Authors: Mario Lezcano Casado, Atilim Gunes Baydin, David Martinez Rubio, Tuan Anh Le, Frank Wood, Lukas Heinrich, Gilles Louppe, Kyle Cranmer, Karen Ng, Wahid Bhimji, Prabhat

Abstract: We consider the problem of Bayesian inference in the family of probabilistic models implicitly defined by stochastic generative models of data. In scientific fields ranging from population biology to cosmology, low-level mechanistic components are composed to create complex generative models. These models lead to intractable likelihoods and are typically non-differentiable, which poses challenges… ▽ More We consider the problem of Bayesian inference in the family of probabilistic models implicitly defined by stochastic generative models of data. In scientific fields ranging from population biology to cosmology, low-level mechanistic components are composed to create complex generative models. These models lead to intractable likelihoods and are typically non-differentiable, which poses challenges for traditional approaches to inference. We extend previous work in "inference compilation", which combines universal probabilistic programming and deep learning methods, to large-scale scientific simulators, and introduce a C++ based probabilistic programming library called CPProb. We successfully use CPProb to interface with SHERPA, a large code-base used in particle physics. Here we describe the technical innovations realized and planned for this library. △ Less

Submitted 21 December, 2017; originally announced December 2017.

Comments: 7 pages, 2 figures

MSC Class: 68T37; 68T05; 62P35 ACM Class: G.3; I.2.6; J.2

arXiv:1712.06982 [pdf, other]

doi 10.1007/s41781-018-0018-8

A Roadmap for HEP Software and Computing R&D for the 2020s

Authors: Johannes Albrecht, Antonio Augusto Alves Jr, Guilherme Amadio, Giuseppe Andronico, Nguyen Anh-Ky, Laurent Aphecetche, John Apostolakis, Makoto Asai, Luca Atzori, Marian Babik, Giuseppe Bagliesi, Marilena Bandieramonte, Sunanda Banerjee, Martin Barisits, Lothar A. T. Bauerdick, Stefano Belforte, Douglas Benjamin, Catrin Bernius, Wahid Bhimji, Riccardo Maria Bianchi, Ian Bird, Catherine Biscarat, Jakob Blomer, Kenneth Bloom, Tommaso Boccali , et al. (285 additional authors not shown)

Abstract: Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the shear amounts of data to be recorded. In planning for… ▽ More Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the shear amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade. △ Less

Submitted 19 December, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

Report number: HSF-CWP-2017-01

Journal ref: Comput Softw Big Sci (2019) 3, 7

arXiv:1711.03573 [pdf, other]

Deep Neural Networks for Physics Analysis on low-level whole-detector data at the LHC

Authors: Wahid Bhimji, Steven Andrew Farrell, Thorsten Kurth, Michela Paganini, Prabhat, Evan Racah

Abstract: There has been considerable recent activity applying deep convolutional neural nets (CNNs) to data from particle physics experiments. Current approaches on ATLAS/CMS have largely focussed on a subset of the calorimeter, and for identifying objects or particular particle types. We explore approaches that use the entire calorimeter, combined with track information, for directly conducting physics an… ▽ More There has been considerable recent activity applying deep convolutional neural nets (CNNs) to data from particle physics experiments. Current approaches on ATLAS/CMS have largely focussed on a subset of the calorimeter, and for identifying objects or particular particle types. We explore approaches that use the entire calorimeter, combined with track information, for directly conducting physics analyses: i.e. classifying events as known-physics background or new-physics signals. We use an existing RPV-Supersymmetry analysis as a case study and explore CNNs on multi-channel, high-resolution sparse images: applied on GPU and multi-node CPU architectures (including Knights Landing (KNL) Xeon Phi nodes) on the Cori supercomputer at NERSC. △ Less

Submitted 29 November, 2017; v1 submitted 9 November, 2017; originally announced November 2017.

Comments: Presented at ACAT 2017 Conference, Submitted to J. Phys. Conf. Ser

arXiv:1708.05256 [pdf, other]

Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data

Authors: Thorsten Kurth, Jian Zhang, Nadathur Satish, Ioannis Mitliagkas, Evan Racah, Mostofa Ali Patwary, Tareq Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey

Abstract: This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation… ▽ More This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation obtains $\sim$2TFLOP/s on a single Cori Phase-II Xeon-Phi node. We use a hybrid strategy employing synchronous node-groups, while using asynchronous communication across groups. We use this strategy to scale training of a single model to $\sim$9600 Xeon-Phi nodes; obtaining peak performance of 11.73-15.07 PFLOP/s and sustained performance of 11.41-13.27 PFLOP/s. At scale, our HEP architecture produces state-of-the-art classification accuracy on a dataset with 10M images, exceeding that achieved by selections on high-level physics-motivated features. Our semi-supervised architecture successfully extracts weather patterns in a 15TB climate dataset. Our results demonstrate that Deep Learning can be optimized and scaled effectively on many-core, HPC systems. △ Less

Submitted 17 August, 2017; originally announced August 2017.

Comments: 12 pages, 9 figures

arXiv:1706.02390 [pdf, other]

doi 10.1186/s40668-019-0029-9

CosmoGAN: creating high-fidelity weak lensing convergence maps using Generative Adversarial Networks

Authors: Mustafa Mustafa, Deborah Bard, Wahid Bhimji, Zarija Lukić, Rami Al-Rfou, Jan M. Kratochvil

Abstract: Inferring model parameters from experimental data is a grand challenge in many sciences, including cosmology. This often relies critically on high fidelity numerical simulations, which are prohibitively computationally expensive. The application of deep learning techniques to generative modeling is renewing interest in using high dimensional density estimators as computationally inexpensive emulat… ▽ More Inferring model parameters from experimental data is a grand challenge in many sciences, including cosmology. This often relies critically on high fidelity numerical simulations, which are prohibitively computationally expensive. The application of deep learning techniques to generative modeling is renewing interest in using high dimensional density estimators as computationally inexpensive emulators of fully-fledged simulations. These generative models have the potential to make a dramatic shift in the field of scientific simulations, but for that shift to happen we need to study the performance of such generators in the precision regime needed for science applications. To this end, in this work we apply Generative Adversarial Networks to the problem of generating weak lensing convergence maps. We show that our generator network produces maps that are described by, with high statistical confidence, the same summary statistics as the fully simulated maps. △ Less

Submitted 22 May, 2019; v1 submitted 7 June, 2017; originally announced June 2017.

Comments: 11 pages, 8 figures

Journal ref: Computational Astrophysics and CosmologySimulations, Data Analysis and Algorithms 2019 6:1

arXiv:1607.08220 [pdf, other]

doi 10.1109/IPDPS.2016.57

PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures

Authors: Md. Mostofa Ali Patwary, Nadathur Rajagopalan Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey

Abstract: Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to its inherent sequentiality, linear algorithms are being used in practice. This limits the applicability of such methods to millions of data points, with limited s… ▽ More Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to its inherent sequentiality, linear algorithms are being used in practice. This limits the applicability of such methods to millions of data points, with limited scalability for Big Data analytics challenges in the scientific domain. In this paper, we present parallel and highly optimized kd-tree based KNN algorithms (both construction and querying) suitable for distributed architectures. Our algorithm includes novel approaches for pruning search space and improving load balancing and partitioning among nodes and threads. Using TB-sized datasets from three science applications: astrophysics, plasma physics, and particle physics, we show that our implementation can construct kd-tree of 189 billion particles in 48 seconds on utilizing $\sim$50,000 cores. We also demonstrate computation of KNN of 19 billion queries in 12 seconds. We demonstrate almost linear speedup both for shared and distributed memory computers. Our algorithms outperforms earlier implementations by more than order of magnitude; thereby radically improving the applicability of our implementation to state-of-the-art Big Data analytics problems. In addition, we showcase performance and scalability on the recently released Intel Xeon Phi processor showing that our algorithm scales well even on massively parallel architectures. △ Less

Submitted 27 July, 2016; originally announced July 2016.

Comments: 11 pages in PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures, Md. Mostofa Ali Patwary et.al., IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016

arXiv:1601.07621 [pdf, other]

doi 10.1109/ICMLA.2016.0160

Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks

Authors: Evan Racah, Seyoon Ko, Peter Sadowski, Wahid Bhimji, Craig Tull, Sang-Yun Oh, Pierre Baldi, Prabhat

Abstract: Experiments in particle physics produce enormous quantities of data that must be analyzed and interpreted by teams of physicists. This analysis is often exploratory, where scientists are unable to enumerate the possible types of signal prior to performing the experiment. Thus, tools for summarizing, clustering, visualizing and classifying high-dimensional data are essential. In this work, we show… ▽ More Experiments in particle physics produce enormous quantities of data that must be analyzed and interpreted by teams of physicists. This analysis is often exploratory, where scientists are unable to enumerate the possible types of signal prior to performing the experiment. Thus, tools for summarizing, clustering, visualizing and classifying high-dimensional data are essential. In this work, we show that meaningful physical content can be revealed by transforming the raw data into a learned high-level representation using deep neural networks, with measurements taken at the Daya Bay Neutrino Experiment as a case study. We further show how convolutional deep neural networks can provide an effective classification filter with greater than 97% accuracy across different classes of physics events, significantly better than other machine learning approaches. △ Less

Submitted 6 December, 2016; v1 submitted 27 January, 2016; originally announced January 2016.

arXiv:1406.6311 [pdf]

doi 10.1140/epjc/s10052-014-3026-9

The Physics of the B Factories

Authors: A. J. Bevan, B. Golob, Th. Mannel, S. Prell, B. D. Yabsley, K. Abe, H. Aihara, F. Anulli, N. Arnaud, T. Aushev, M. Beneke, J. Beringer, F. Bianchi, I. I. Bigi, M. Bona, N. Brambilla, J. B rodzicka, P. Chang, M. J. Charles, C. H. Cheng, H. -Y. Cheng, R. Chistov, P. Colangelo, J. P. Coleman, A. Drutskoy , et al. (2009 additional authors not shown)

Abstract: This work is on the Physics of the B Factories. Part A of this book contains a brief description of the SLAC and KEK B Factories as well as their detectors, BaBar and Belle, and data taking related issues. Part B discusses tools and methods used by the experiments in order to obtain results. The results themselves can be found in Part C. Please note that version 3 on the archive is the auxiliary… ▽ More This work is on the Physics of the B Factories. Part A of this book contains a brief description of the SLAC and KEK B Factories as well as their detectors, BaBar and Belle, and data taking related issues. Part B discusses tools and methods used by the experiments in order to obtain results. The results themselves can be found in Part C. Please note that version 3 on the archive is the auxiliary version of the Physics of the B Factories book. This uses the notation alpha, beta, gamma for the angles of the Unitarity Triangle. The nominal version uses the notation phi_1, phi_2 and phi_3. Please cite this work as Eur. Phys. J. C74 (2014) 3026. △ Less

Submitted 31 October, 2015; v1 submitted 24 June, 2014; originally announced June 2014.

Comments: 928 pages, version 3 (arXiv:1406.6311v3) corresponds to the alpha, beta, gamma version of the book, the other versions use the phi1, phi2, phi3 notation

Report number: SLAC-PUB-15968, KEK Preprint 2014-3

Journal ref: Eur. Phys. J. C74 (2014) 3026

arXiv:1102.3114 [pdf, ps, other]

doi 10.1088/1742-6596/331/5/052019

Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration

Authors: Samuel C Skipsey, Wahid Bhimji, Mike Kenyon

Abstract: Solid State Disk technologies are increasingly replacing high-speed hard disks as the storage technology in high-random-I/O environments. There are several potentially I/O bound services within the typical LHC Tier-2 - in the back-end, with the trend towards many-core architectures continuing, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can b… ▽ More Solid State Disk technologies are increasingly replacing high-speed hard disks as the storage technology in high-random-I/O environments. There are several potentially I/O bound services within the typical LHC Tier-2 - in the back-end, with the trend towards many-core architectures continuing, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can both exhibit I/O limited efficiency. We estimate the effectiveness of affordable SSDs in the context of worker nodes, on a large Tier-2 production setup using both low level tools and real LHC I/O intensive data analysis jobs comparing and contrasting with high performance spinning disk based solutions. We consider the applicability of each solution in the context of its price/performance metrics, with an eye on the pragmatic issues facing Tier-2 provision and upgrades △ Less

Submitted 15 February, 2011; originally announced February 2011.

Comments: 6 pages, 1 figure, 4 tables. Conference proceedings for CHEP2010

Showing 1–23 of 23 results for author: Bhimji, W