Search | arXiv e-print repository

Search for fractionally charged particles with CUORE

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, J. Cao, S. Capelli, C. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi, et al. (95 additional authors not shown)

Abstract: The Cryogenic Underground Observatory for Rare Events (CUORE) is a detector array composed of 988 5$\;$cm$\times$5$\;$cm$\times$5$\;$cm TeO$_2$ crystals held below 20 mK, primarily searching for neutrinoless double-beta decay in $^{130}$Te. Unprecedented in size amongst cryogenic calorimetric experiments, CUORE provides a promising setting for the study of exotic through-going particles. Using the first tonne-year of CUORE's exposure, we perform a search for hypothesized fractionally charged particles (FCPs), which are well-motivated by various Standard Model extensions and would have suppressed interactions with matter. No excess of FCP candidate tracks is observed over background, setting leading limits on the underground FCP flux with charges between $e/24$ and $e/5$ at 90\% confidence level. Using the low background environment and segmented geometry of CUORE, we establish the sensitivity of tonne-scale sub-Kelvin detectors to diverse signatures of new physics.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 7 pages, 5 figures

arXiv:2406.04199 [pdf]

High-Fidelity Electron Spin Gates in a Scalable Diamond Quantum Register

Authors: Timo Joas, Florian Ferlemann, Roberto Sailer, Philipp J. Vetter, Jingfu Zhang, Ressa S. Said, Tokuyuki Teraji, Shinobu Onoda, Tommaso Calarco, Genko Genov, Matthias M. Müller, Fedor Jelezko

Abstract: Diamond is a promising platform for quantum information processing as it can host highly coherent qubits that might allow for the construction of large quantum registers. A prerequisite for such devices is a coherent interaction between electron nitrogen vacancy (NV) spins. Entanglement between dipolar-coupled NV spin pairs has been demonstrated, but with a limited entanglement fidelity and its error sources have not been characterized. Here, we design a robust, easy to implement entangling gate between NV spins in diamond and quantify the influence of multiple error sources on the gate performance. Experimentally, we demonstrate a record gate fidelity of $F=(96.0 \pm 2.5)$ % under ambient conditions. Our identification of the dominant errors paves the way towards NV-NV gates beyond the error correction threshold.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.17937 [pdf, other]

Data-driven background model for the CUORE experiment

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, J. Cao, S. Capelli, C. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi, et al. (93 additional authors not shown)

Abstract: We present the model we developed to reconstruct the CUORE radioactive background based on the analysis of an experimental exposure of 1038.4 kg yr. The data reconstruction relies on a simultaneous Bayesian fit applied to energy spectra over a broad energy range. The high granularity of the CUORE detector, together with the large exposure and extended stable operations, allows for an in-depth exploration of both spatial and time dependence of backgrounds. We achieve high sensitivity to both bulk and surface activities of the materials of the setup, detecting levels as low as 10 nBq kg$^{-1}$ and 0.1 nBq cm$^{-2}$, respectively. We compare the contamination levels we extract from the background model with prior radio-assay data, which informs future background risk mitigation strategies. The results of this background model play a crucial role in constructing the background budget for the CUPID experiment as it will exploit the same CUORE infrastructure.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2404.14366 [pdf]

Lessons Learned in Performing a Trustworthy AI and Fundamental Rights Assessment

Authors: Marjolein Boonstra, Frédérick Bruneault, Subrata Chakraborty, Tjitske Faber, Alessio Gallucci, Eleanore Hickman, Gerard Kema, Heejin Kim, Jaap Kooiker, Elisabeth Hildt, Annegret Lamadé, Emilie Wiinblad Mathez, Florian Möslein, Genien Pathuis, Giovanni Sartor, Marijke Steege, Alice Stocco, Willy Tadema, Jarno Tuimala, Isabel van Vledder, Dennis Vetter, Jana Vetter, Magnus Westerlund, Roberto V. Zicari

Abstract: This report shares the experiences, results and lessons learned in conducting a pilot project ``Responsible use of AI'' in cooperation with the Province of Friesland, Rijks ICT Gilde, part of the Ministry of the Interior and Kingdom Relations (BZK) (both in The Netherlands), and a group of members of the Z-Inspection$^{\small{\circledR}}$ Initiative. The pilot project took place from May 2022 through January 2023. During the pilot, the practical application of a deep learning algorithm from the province of Fryslân was assessed. The AI maps heathland grassland by means of satellite images for monitoring nature reserves. Environmental monitoring is one of the crucial activities carried on by society for several purposes ranging from maintaining standards on drinkable water to quantifying the CO2 emissions of a particular state or region. Using satellite imagery and machine learning to support decisions is becoming an important part of environmental monitoring. The main focus of this report is to share the experiences, results and lessons learned from performing both a Trustworthy AI assessment using the Z-Inspection$^{\small{\circledR}}$ process and the EU framework for Trustworthy AI, and combining it with a Fundamental Rights assessment using the Fundamental Rights and Algorithms Impact Assessment (FRAIA) as recommended by the Dutch government for the use of AI algorithms by the Dutch public authorities.

Submitted 22 April, 2024; originally announced April 2024.

Comments: On behalf of the Z-Inspection$^{\small{\circledR}}$ Initiative

arXiv:2404.13602 [pdf, other]

The environmental low-frequency background for macro-calorimeters at the millikelvin scale

Authors: L. Aragão, A. Armigliato, R. Brancaccio, C. Brofferio, S. Castellaro, A. D'Addabbo, G. De Luca, F. Del Corso, S. Di Sabatino, R. Liu, L. Marini, I. Nutini, S. Quitadamo, P. Ruggieri, K. J. Vetter, M. Zavatarelli, S. Zucchelli

Abstract: Many of the most sensitive physics experiments searching for rare events, like neutrinoless double beta ($0νββ$) decay and dark matter interactions, rely on cryogenic macro-calorimeters operating at the mK-scale. Located underground at the Gran Sasso National Laboratory (LNGS), in central Italy, CUORE (Cryogenic Underground Observatory for Rare Events) is one of the leading experiments in the search for $0νββ$ decay, implementing the low-temperature calorimetric technology. We present a novel multi-detector analysis to correlate environmental phenomena with the low-frequency noise of low-temperature calorimeters. Indeed, the correlation of marine and seismic data with data from a pair of CUORE detectors indicates that cryogenic detectors are sensitive not only to intense vibrations generated by earthquakes, but also to the much fainter vibrations induced by marine microseisms in the Mediterranean Sea due to the motion of sea waves. Proving that cryogenic macro-calorimeters are sensitive to such environmental sources of noise opens the possibility of studying their impact on the detectors' physics-case sensitivity. Moreover, this study could pave the road for technology developments dedicated to the mitigation of the noise induced by marine microseisms, from which the entire community of cryogenic calorimeters can benefit.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.04453 [pdf, other]

With or without $ν$? Hunting for the seed of the matter-antimatter asymmetry

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, J. Cao, S. Capelli, C. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi, et al. (93 additional authors not shown)

Abstract: The matter-antimatter asymmetry underlines the incompleteness of the current understanding of particle physics. Neutrinoless double-beta ($0νββ$) decay may help explain this asymmetry, while unveiling the Majorana nature of the neutrino. The CUORE experiment searches for $0νββ$ decay of $^{130}$Te using a tonne-scale cryogenic calorimeter operated at milli-kelvin temperatures. We report no evidence for $0νββ$ decay and place a lower limit on the half-life of T$_{1/2}$ $>$ 3.8 $\times$ 10$^{25}$ years (90% C.I.) with over 2 tonne$\cdot$year TeO$_2$ exposure. The tools and techniques developed for this result and the 5 year stable operation of nearly 1000 detectors demonstrate the infrastructure for a next-generation experiment capable of searching for $0νββ$ decay across multiple isotopes.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2403.12636 [pdf, other]

A Practical Guide to Statistical Distances for Evaluating Generative Models in Science

Authors: Sebastian Bischoff, Alana Darcher, Michael Deistler, Richard Gao, Franziska Gerken, Manuel Gloeckler, Lisa Haxel, Jaivardhan Kapoor, Janne K Lappalainen, Jakob H Macke, Guy Moss, Matthijs Pals, Felix Pei, Rachel Rapp, A Erdem Sağtekin, Cornelius Schröder, Auguste Schulz, Zinovia Stefanidi, Shoji Toyota, Linda Ulmer, Julius Vetter

Abstract: Generative models are invaluable in many fields of science because of their ability to capture high-dimensional and complicated distributions, such as photo-realistic images, protein structures, and connectomes. How do we evaluate the samples these models generate? This work aims to provide an accessible entry point to understanding popular notions of statistical distances, requiring only foundational knowledge in mathematics and statistics. We focus on four commonly used notions of statistical distances representing different methodologies: Using low-dimensional projections (Sliced-Wasserstein; SW), obtaining a distance using classifiers (Classifier Two-Sample Tests; C2ST), using embeddings through kernels (Maximum Mean Discrepancy; MMD), or neural networks (Fréchet Inception Distance; FID). We highlight the intuition behind each distance and explain their merits, scalability, complexity, and pitfalls. To demonstrate how these distances are used in practice, we evaluate generative models from different scientific domains, namely a model of decision making and a model generating medical images. We showcase that distinct distances can give different results on similar data. Through this guide, we aim to help researchers to use, interpret, and evaluate statistical distances for generative models in science.

Submitted 19 March, 2024; originally announced March 2024.
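
Illustrative sketch (not taken from the paper): the abstract above names the Sliced-Wasserstein distance as a distance built from low-dimensional projections. One common way to estimate it, assuming two samples of equal size, is to project both onto random unit directions, sort the one-dimensional projections, and average the p-th power of the gaps. The function name `sliced_wasserstein` and its parameters are hypothetical and chosen only for this example.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_projections=64, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    xs, ys: equal-length lists of points (each a sequence of floats of the
    same dimension). Equal sample sizes are an assumption of this sketch.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        # Draw a random direction on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in direction))
        direction = [c / norm for c in direction]
        # Project both samples onto the direction and sort the 1D values.
        px = sorted(sum(a * b for a, b in zip(pt, direction)) for pt in xs)
        py = sorted(sum(a * b for a, b in zip(pt, direction)) for pt in ys)
        # In 1D the optimal transport plan matches sorted values pairwise.
        total += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    return (total / n_projections) ** (1.0 / p)
```

Comparing a sample with itself gives zero, while shifting every point away from the original sample gives a strictly positive value; the estimate depends on the number of projections and on the seed.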

arXiv:2403.00616 [pdf, other]

Gate-set evaluation metrics for closed-loop optimal control on nitrogen-vacancy center ensembles in diamond

Authors: Philipp J. Vetter, Thomas Reisser, Maximilian G. Hirsch, Tommaso Calarco, Felix Motzoi, Fedor Jelezko, Matthias M. Müller

Abstract: A recurring challenge in quantum science and technology is the precise control of their underlying dynamics that lead to the desired quantum operations, often described by a set of quantum gates. These gates can be subject to application-specific errors, leading to a dependence of their controls on the chosen circuit, the quality measure and the gate-set itself. A natural solution would be to apply quantum optimal control in an application-oriented fashion. In turn, this requires the definition of a meaningful measure of the contextual gate-set performance. Therefore, we explore and compare the applicability of quantum process tomography, linear inversion gate-set tomography, randomized linear gate-set tomography, and randomized benchmarking as measures for closed-loop quantum optimal control experiments, using a macroscopic ensemble of nitrogen-vacancy centers in diamond as a test-bed. Our work demonstrates the relative trade-offs between those measures and how to significantly enhance the gate-set performance, leading to an improvement across all investigated methods.

Submitted 24 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.07808 [pdf, other]

Sourcerer: Sample-based Maximum Entropy Source Distribution Estimation

Authors: Julius Vetter, Guy Moss, Cornelius Schröder, Richard Gao, Jakob H. Macke

Abstract: Scientific modeling applications often require estimating a distribution of parameters consistent with a dataset of observations - an inference task also known as source distribution estimation. This problem can be ill-posed, however, since many different source distributions might produce the same distribution of data-consistent simulations. To make a principled choice among many equally valid sources, we propose an approach which targets the maximum entropy distribution, i.e., prioritizes retaining as much uncertainty as possible. Our method is purely sample-based - leveraging the Sliced-Wasserstein distance to measure the discrepancy between the dataset and simulations - and thus suitable for simulators with intractable likelihoods. We benchmark our method on several tasks, and show that it can recover source distributions with substantially higher entropy than recent source estimation methods, without sacrificing the fidelity of the simulations. Finally, to demonstrate the utility of our approach, we infer source distributions for parameters of the Hodgkin-Huxley model from experimental datasets with thousands of single-neuron measurements. In summary, we propose a principled method for inferring source distributions of scientific simulator parameters while retaining as much uncertainty as possible.

Submitted 15 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

arXiv:2311.01131 [pdf, other]

Improving the Performance of Cryogenic Calorimeters with Nonlinear Multivariate Noise Cancellation Algorithms

Authors: Kenneth J. Vetter, Mattia Beretta, Chiara Capelli, Francesca Del Corso, Erin V. Hansen, Roger G. Huang, Yury G. Kolomensky, Laura Marini, Irene Nutini, Vivek Singh, Aaron Torres, Bradford Welliver, Sergio Zimmermann, Stefano Zucchelli

Abstract: State-of-the-art physics experiments require high-resolution, low-noise, and low-threshold detectors to achieve competitive scientific results. However, experimental environments invariably introduce sources of noise, such as electrical interference or microphonics. The sources of this environmental noise can often be monitored by adding specially designed "auxiliary devices" (e.g. microphones, accelerometers, seismometers, magnetometers, and antennae). A model can then be constructed to predict the detector noise based on the auxiliary device information, which can then be subtracted from the true detector signal. Here, we present a multivariate noise cancellation algorithm which can be used in a variety of settings to improve the performance of detectors using multiple auxiliary devices. To validate this approach, we apply it to simulated data to remove noise due to electromagnetic interference and microphonic vibrations. We then apply the algorithm to a cryogenic light detector in the laboratory and show an improvement in the detector performance. Finally, we motivate the use of nonlinear terms to better model vibrational contributions to the noise in thermal detectors. We show a further improvement in the performance of a particular channel of the CUORE detector when using the nonlinear algorithm in combination with optimal filtering techniques.

Submitted 6 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

Comments: 21 pages, 15 figures, 7 tables

ACM Class: J.2
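
Illustrative sketch (not the paper's algorithm): the abstract above describes predicting detector noise from auxiliary-device data and subtracting the prediction. The simplest instance of that idea, assuming a single auxiliary channel and a purely linear, instantaneous coupling, is an ordinary least-squares fit of one gain. The multivariate and nonlinear models of the paper are not reproduced here, and the names `fit_gain` and `cancel_noise` are hypothetical.

```python
def fit_gain(aux, detector):
    """Least-squares gain relating one auxiliary channel to the detector."""
    n = len(aux)
    mean_a = sum(aux) / n
    mean_d = sum(detector) / n
    var_a = sum((a - mean_a) ** 2 for a in aux)
    cov = sum((a - mean_a) * (d - mean_d) for a, d in zip(aux, detector))
    return cov / var_a


def cancel_noise(aux, detector):
    """Subtract the predicted noise while keeping the detector baseline."""
    gain = fit_gain(aux, detector)
    mean_a = sum(aux) / len(aux)
    # Only the fluctuating part of the auxiliary signal is subtracted, so the
    # mean level of the detector trace is left unchanged.
    return [d - gain * (a - mean_a) for a, d in zip(aux, detector)]
```

For a detector trace that is a constant baseline plus a scaled copy of the auxiliary signal, `cancel_noise` returns the flat baseline. Real data would also contain uncorrelated noise and genuine pulses that this fit does not distinguish from the coupled noise.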

arXiv:2309.10292 [pdf, other]

doi 10.1145/3624062.3624278

Julia as a unifying end-to-end workflow language on the Frontier exascale system

Authors: William F. Godoy, Pedro Valero-Lara, Caira Anderson, Katrina W. Lee, Ana Gainaru, Rafael Ferreira da Silva, Jeffrey S. Vetter

Abstract: We evaluate Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy's first exascale supercomputer. We evaluate the performance, scaling, and trade-offs of (i) the computational kernel on AMD's MI250x GPUs, (ii) weak scaling up to 4,096 MPI processes/GPUs or 512 nodes, (iii) parallel I/O writes using the ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive analysis. Results suggest that although Julia generates a reasonable LLVM-IR, a nearly 50% performance difference exists vs. native AMD HIP stencil codes when running on the GPUs. As expected, we observed near-zero overhead when using MPI and parallel I/O bindings for system-wide installed implementations. Consequently, Julia emerges as a compelling high-performance and high-productivity workflow composition language, as measured on the fastest supercomputer in the world.

Submitted 27 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 11 pages, 8 figures, accepted at the 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23), IEEE/ACM The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC23
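
Illustrative sketch (not the paper's Julia code): the abstract above refers to a Gray-Scott two-variable diffusion-reaction application built on a 7-point stencil. A minimal serial version, assuming a cubic grid with periodic boundaries, unit grid spacing, and explicit Euler time stepping, might look as follows. The parameter values and the names `laplacian_7pt` and `gray_scott_step` are hypothetical choices for this example.

```python
def laplacian_7pt(field, i, j, k):
    """7-point Laplacian on an n*n*n grid with periodic boundaries."""
    n = len(field)
    return (
        field[(i - 1) % n][j][k] + field[(i + 1) % n][j][k]
        + field[i][(j - 1) % n][k] + field[i][(j + 1) % n][k]
        + field[i][j][(k - 1) % n] + field[i][j][(k + 1) % n]
        - 6.0 * field[i][j][k]
    )


def gray_scott_step(u, v, du=0.2, dv=0.1, feed=0.02, kill=0.05, dt=1.0):
    """Advance the Gray-Scott fields u and v by one explicit Euler step."""
    n = len(u)
    new_u = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    new_v = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                uvv = u[i][j][k] * v[i][j][k] ** 2  # reaction term u*v^2
                new_u[i][j][k] = u[i][j][k] + dt * (
                    du * laplacian_7pt(u, i, j, k) - uvv + feed * (1.0 - u[i][j][k])
                )
                new_v[i][j][k] = v[i][j][k] + dt * (
                    dv * laplacian_7pt(v, i, j, k) + uvv - (feed + kill) * v[i][j][k]
                )
    return new_u, new_v
```

Each grid point reads its own value and its six face neighbours and performs only a handful of arithmetic operations, which is why kernels of this kind tend to be limited by memory bandwidth rather than by compute.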

arXiv:2309.07103 [pdf, other]

Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

Authors: Pedro Valero-Lara, Alexis Huante, Mustafa Al Lail, William F. Godoy, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

Abstract: We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous work that is based on the OpenAI Codex, which is a descendant of GPT-3, to generate similar kernels with simple prompts via GitHub Copilot. Our goal is to compare the accuracy of Llama-2 and our original GPT-3 baseline by using a similar metric. Llama-2 has a simplified model that shows competitive or even superior accuracy. We also report on the differences between these foundational large language models as generative AI continues to redefine human-computer interactions. Overall, Copilot generates codes that are more reliable but less optimized, whereas codes generated by Llama-2 are less reliable but more optimized when correct.

Submitted 11 September, 2023; originally announced September 2023.

Comments: Accepted at LCPC 2023, The 36th International Workshop on Languages and Compilers for Parallel Computing http://www.lcpcworkshop.org/LCPC23/ . 13 pages, 5 figures, 1 table

arXiv:2307.11242 [pdf, other]

On-Sensor Data Filtering using Neuromorphic Computing for High Energy Physics Experiments

Authors: Shruti R. Kulkarni, Aaron Young, Prasanna Date, Narasinga Rao Miniskar, Jeffrey S. Vetter, Farah Fahim, Benjamin Parpillon, Jennet Dickinson, Nhan Tran, Jieun Yoo, Corrinne Mills, Morris Swartz, Petar Maksimovic, Catherine D. Schuman, Alice Bean

Abstract: This work describes the investigation of neuromorphic computing-based spiking neural network (SNN) models used to filter data from sensor electronics in high energy physics experiments conducted at the High Luminosity Large Hadron Collider. We present our approach for developing a compact neuromorphic model that filters out the sensor data based on the particle's transverse momentum with the goal of reducing the amount of data being sent to the downstream electronics. The incoming charge waveforms are converted to streams of binary-valued events, which are then processed by the SNN. We present our insights on the various system design choices - from data encoding to optimal hyperparameters of the training algorithm - for an accurate and compact SNN optimized for hardware deployment. Our results show that an SNN trained with an evolutionary algorithm and an optimized set of hyperparameters obtains a signal efficiency of about 91% with nearly half as many parameters as a deep neural network.

Submitted 20 July, 2023; originally announced July 2023.

Comments: Manuscript accepted at ICONS'23

arXiv:2306.15121 [pdf, other]

doi 10.1145/3605731.3605886

Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

Authors: William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

Abstract: We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SYCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., numpy, Numba, cuPy, and pyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl). We use the GitHub Copilot capabilities powered by OpenAI Codex available in Visual Studio Code as of April 2023 to generate a vast amount of implementations given simple <kernel> + <programming model> + <optional hints> prompt variants. To quantify and compare the results, we propose a proficiency metric around the initial 10 suggestions given for each prompt. Results suggest that the OpenAI Codex outputs for C++ correlate with the adoption and maturity of programming models. For example, OpenMP and CUDA score really high, whereas HIP is still lacking. We found that prompts from either a targeted language such as Fortran or the more general-purpose Python can benefit from adding code keywords, while Julia prompts perform acceptably well for its mature programming models (e.g., Threads and CUDA.jl). We expect these benchmarks to provide a point of reference for each programming model's community.
Overall, understanding the convergence of large language models, AI, and HPC is crucial due to its rapidly evolving nature and how it is redefining human-computer interactions.

Submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted at the Sixteenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2023 to be held in conjunction with ICPP 2023: The 52nd International Conference on Parallel Processing. 10 pages, 6 figures, 5 tables

arXiv:2304.04674 [pdf, other]

doi 10.1088/1748-0221/18/06/P06033

A first test of CUPID prototypal light detectors with NTD-Ge sensors in a pulse-tube cryostat

Authors: CUPID collaboration, K. Alfonso, A. Armatol, C. Augier, F. T. Avignone III, O. Azzolini, M. Balata, A. S. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, V. Berest, M. Beretta, M. Bettelli, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Campani, C. Capelli, et al. (154 additional authors not shown)

Abstract: CUPID is a next-generation bolometric experiment aiming at searching for neutrinoless double-beta decay with ~250 kg of isotopic mass of $^{100}$Mo. It will operate at $\sim$10 mK in a cryostat currently hosting a similar-scale bolometric array for the CUORE experiment at the Gran Sasso National Laboratory (Italy). CUPID will be based on large-volume scintillating bolometers consisting of $^{100}$Mo-enriched Li$_2$MoO$_4$ crystals, facing thin Ge-wafer-based bolometric light detectors. In the CUPID design, the detector structure is novel and needs to be validated. In particular, the CUORE cryostat presents a high level of mechanical vibrations due to the use of pulse tubes and the effect of vibrations on the detector performance must be investigated. In this paper we report the first test of the CUPID-design bolometric light detectors with NTD-Ge sensors in a dilution refrigerator equipped with a pulse tube in an above-ground lab. Light detectors are characterized in terms of sensitivity, energy resolution, pulse time constants, and noise power spectrum. Despite the challenging noisy environment due to pulse-tube-induced vibrations, we demonstrate that all four tested light detectors comply with the CUPID goal in terms of intrinsic energy resolution of 100 eV RMS baseline noise. Indeed, we have measured 70--90 eV RMS for the four devices, which show an excellent reproducibility.
We have also obtained outstanding energy resolutions at the 356 keV line from a $^{133}$Ba source with one light detector achieving 0.71(5) keV FWHM, which is -- to our knowledge -- the best ever obtained when compared to $γ$ detectors of any technology in this energy range.

Submitted 10 April, 2023; originally announced April 2023.

Comments: Prepared for submission to JINST; 16 pages, 7 figures, and 1 table

arXiv:2304.04611 [pdf, other]

doi 10.1088/1748-0221/18/06/P06018

Twelve-crystal prototype of Li$_2$MoO$_4$ scintillating bolometers for CUPID and CROSS experiments

Authors: CUPID, CROSS collaborations, K. Alfonso, A. Armatol, C. Augier, F. T. Avignone III, O. Azzolini, M. Balata, I. C. Bandac, A. S. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, V. Berest, M. Beretta, M. Bettelli, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, et al. (160 additional authors not shown)

Abstract: An array of twelve 0.28 kg lithium molybdate (LMO) low-temperature bolometers equipped with 16 bolometric Ge light detectors, aiming at optimization of detector structure for CROSS and CUPID double-beta decay experiments, was constructed and tested in a low-background pulse-tube-based cryostat at the Canfranc underground laboratory in Spain. Performance of the scintillating bolometers was studied depending on the size of phonon NTD-Ge sensors glued to both LMO and Ge absorbers, shape of the Ge light detectors (circular vs. square, from two suppliers), in different light collection conditions (with and without reflector, with aluminum coated LMO crystal surface). The scintillating bolometer array was operated over 8 months in the low-background conditions that allowed to probe a very low, $μ$Bq/kg, level of the LMO crystals radioactive contamination by $^{228}$Th and $^{226}$Ra.

Submitted 10 April, 2023; originally announced April 2023.

Comments: Prepared for submission to JINST; 23 pages, 9 figures, and 4 tables

arXiv:2303.06195 [pdf, other]

doi 10.1109/IPDPSW59300.2023.00068

Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes

Authors: William F. Godoy, Pedro Valero-Lara, T. Elise Dettling, Christian Trefftz, Ian Jorquera, Thomas Sheehy, Ross G. Miller, Marc Gonzalez-Tallada, Jeffrey S. Vetter, Valentin Churavy

Abstract: We explore the performance and portability of the high-level programming models: the LLVM-based Julia and Python/Numba, and Kokkos on high-performance computing (HPC) nodes: AMD Epyc CPUs and MI250X graphical processing units (GPUs) on Frontier's test bed Crusher system and Ampere's Arm-based CPUs and NVIDIA's A100 GPUs on the Wombat system at the Oak Ridge Leadership Computing Facilities. We compare the default performance of a hand-rolled dense matrix multiplication algorithm on CPUs against vendor-compiled C/OpenMP implementations, and on each GPU against CUDA and HIP. Rather than focusing on the kernel optimization per-se, we select this naive approach to resemble exploratory work in science and as a lower-bound for performance to isolate the effect of each programming model. Julia and Kokkos perform comparably with C/OpenMP on CPUs, while Julia implementations are competitive with CUDA and HIP on GPUs. Performance gaps are identified on NVIDIA A100 GPUs for Julia's single precision and Kokkos, and for Python/Numba in all scenarios. We also comment on half-precision support, productivity, performance portability metrics, and platform readiness. We expect to contribute to the understanding and direction for high-level, high-productivity languages in HPC as the first-generation exascale systems are deployed.

Submitted 10 March, 2023; originally announced March 2023.

Comments: Accepted at the 28th HIPS workshop, held in conjunction with IPDPS 2023. 10 pages, 9 figures

arXiv:2212.11144 [pdf, other]

doi 10.1016/j.cpc.2023.108782

QuOCS: The Quantum Optimal Control Suite

Authors: Marco Rossignolo, Thomas Reisser, Alastair Marshall, Phila Rembold, Alice Pagano, Philipp J. Vetter, Ressa S. Said, Matthias M. Müller, Felix Motzoi, Tommaso Calarco, Fedor Jelezko, Simone Montangero

Abstract: Quantum optimal control includes the family of pulse-shaping algorithms that aim to unlock the full potential of a variety of quantum technologies. Our Quantum Optimal Control Suite (QuOCS) unites experimental focus and model-based approaches in a unified framework. The easy usage and installation of QuOCS and the availability of various combinable optimization strategies is designed to improve the performance of many quantum technology platforms, such as color defects in diamond, superconducting qubits, atom- or ion-based quantum computers. It can also be applied to the study of more general phenomena in physics. In this paper, we describe the software and the main toolbox of gradient-free and gradient-based algorithms. We then show how the user can connect it to their experiment. In addition, we provide illustrative examples where our optimization suite solves typical quantum optimal control problems, in both open- and closed-loop settings. Integration into existing experimental control software is already provided for the experiment control software Qudi [J. M. Binder et al., SoftwareX, 6, 85-90, (2017)], and further extensions are investigated and highly encouraged. QuOCS is available from GitHub, under Apache License 2.0, and can be found on the PyPI repository.

Submitted 22 December, 2022; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: 24 pages, 7 figures

arXiv:2211.02740 [pdf, other]

Bridging HPC Communities through the Julia Programming Language

Authors: Valentin Churavy, William F Godoy, Carsten Bauer, Hendrik Ranocha, Michael Schlottke-Lakemper, Ludovic Räss, Johannes Blaschke, Mosè Giordano, Erik Schnetter, Samuel Omlin, Jeffrey S. Vetter, Alan Edelman

Abstract: The Julia programming language has evolved into a modern alternative to fill existing gaps in scientific computing and data science applications. Julia leverages a unified and coordinated single-language and ecosystem paradigm and has a proven track record of achieving high performance without sacrificing user productivity. These aspects make Julia a viable alternative to high-performance computing's (HPC's) existing and increasingly costly many-body workflow composition strategy in which traditional HPC languages (e.g., Fortran, C, C++) are used for simulations, and higher-level languages (e.g., Python, R, MATLAB) are used for data analysis and interactive computing. Julia's rapid growth in language capabilities, package ecosystem, and community make it a promising universal language for HPC. This paper presents the views of a multidisciplinary group of researchers from academia, government, and industry that advocate for an HPC software development paradigm that emphasizes developer productivity, workflow portability, and low barriers for entry. We believe that the Julia programming language, its ecosystem, and its community provide modern and powerful capabilities that enable this group's objectives. Crucially, we believe that Julia can provide a feasible and less costly approach to programming scientific applications and workflows that target HPC facilities.
In this work, we examine the current practice and role of Julia as a common, end-to-end programming model to address major challenges in scientific reproducibility, data-driven AI/machine learning, co-design and workflows, scalability and performance portability in heterogeneous computing, network communication, data management, and community education. As a result, the diversification of current investments to fulfill the needs of the upcoming decade is crucial as more supercomputing centers prepare for the exascale era.

Submitted 10 November, 2022; v1 submitted 4 November, 2022; originally announced November 2022.

Comments: 20 pages; improved image quality

arXiv:2210.15619 [pdf, other]

Large area photon calorimeter with Ir-Pt bilayer transition-edge sensor for the CUPID experiment

Authors: V. Singh, G. Benato, M. Beretta, C. Capelli, C. L. Chang, B. K. Fujikawa, E. V. Hansen, Yu. G. Kolomensky, WK. Kwok, M. Lisovenko, L. Marini, V. Novosad, J. Pearson, B. Schmidt, K. J. Vetter, G. Wang, B. Welliver, U. Welp, V. Yefremenko, J. Zhang

Abstract: CUPID is a next-generation neutrinoless double-beta decay experiment that will require cryogenic light detectors to improve background suppression, via the simultaneous readout of heat and light channels from its scintillating crystals. In this work we showcase light detectors based on a novel Ir-Pt bilayer transition edge sensor. We have performed a systematic study to improve the thermal coupling between the photon absorber and the sensor, and thereby its responsivity. Our first devices meet CUPID's baseline noise requirement of <100~eV rms. Our detectors have risetimes of $\sim$180 $μ$s and measured timing jitter of <20 $μ$s for the expected signal-to-noise at the Q-value of the decay, which achieves the CUPID's criterion of rejecting two-neutrino double-beta decay pileup events. The current work will inform the fabrication of future devices, culminating in the final TES design and a scaleable readout scheme for CUPID.

Submitted 30 October, 2022; v1 submitted 27 October, 2022; originally announced October 2022.

Comments: 12 pages, 14 figures

arXiv:2208.07468 [pdf, other]

Encoding Integers and Rationals on Neuromorphic Computers using Virtual Neuron

Authors: Prasanna Date, Shruti Kulkarni, Aaron Young, Catherine Schuman, Thomas Potok, Jeffrey Vetter

Abstract: Neuromorphic computers perform computations by emulating the human brain, and use extremely low power. They are expected to be indispensable for energy-efficient computing in the future. While they are primarily used in spiking neural network-based machine learning applications, neuromorphic computers are known to be Turing-complete, and thus, capable of general-purpose computation. However, to fully realize their potential for general-purpose, energy-efficient computing, it is important to devise efficient mechanisms for encoding numbers. Current encoding approaches have limited applicability and may not be suitable for general-purpose computation. In this paper, we present the virtual neuron as an encoding mechanism for integers and rational numbers. We evaluate the performance of the virtual neuron on physical and simulated neuromorphic hardware and show that it can perform an addition operation using 23 nJ of energy on average using a mixed-signal memristor-based neuromorphic processor. We also demonstrate its utility by using it in some of the mu-recursive functions, which are the building blocks of general-purpose computation.

Submitted 15 August, 2022; originally announced August 2022.

ACM Class: B.2; B.6

arXiv:2205.04549 [pdf, other]

An Energy-dependent Electro-thermal Response Model of CUORE Cryogenic Calorimeter

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, L. Canonica, X. G. Cao, S. Capelli, C. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali , et al. (96 additional authors not shown)

Abstract: The Cryogenic Underground Observatory for Rare Events (CUORE) is the most sensitive experiment searching for neutrinoless double-beta decay ($0νββ$) in $^{130}\text{Te}$. CUORE uses a cryogenic array of 988 TeO$_2$ calorimeters operated at $\sim$10 mK with a total mass of 741 kg. To further increase the sensitivity, the detector response must be well understood. Here, we present a non-linear thermal model for the CUORE experiment on a detector-by-detector basis. We have examined both equilibrium and dynamic electro-thermal models of detectors by numerically fitting non-linear differential equations to the detector data of a subset of CUORE channels which are well characterized and representative of all channels. We demonstrate that the hot-electron effect and electric-field dependence of resistance in NTD-Ge thermistors alone are inadequate to describe our detectors' energy dependent pulse shapes. We introduce an empirical second-order correction factor in the exponential temperature dependence of the thermistor, which produces excellent agreement with energy-dependent pulse shape data up to 6 MeV. We also present a noise analysis using the fitted thermal parameters and show that the intrinsic thermal noise is negligible compared to the observed noise for our detectors.

Submitted 28 July, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

Comments: 34 pages, 14 figures, 6 tables

arXiv:2205.03132 [pdf, other]

doi 10.1103/PhysRevLett.129.222501

New direct limit on neutrinoless double beta decay half-life of $^{128}$Te with CUORE

Authors: D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, L. Canonica, X. G. Cao, C. Capelli, S. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi , et al. (95 additional authors not shown)

Abstract: The Cryogenic Underground Observatory for Rare Events (CUORE) at Laboratori Nazionali del Gran Sasso of INFN in Italy is an experiment searching for neutrinoless double beta (0$νββ$) decay. Its main goal is to investigate this decay in $^{130}$Te, but its ton-scale mass and low background make CUORE sensitive to other rare processes as well. In this work, we present our first results on the search for 0$νββ$ decay of $^{128}$Te, the Te isotope with the second highest natural isotopic abundance. We find no evidence for this decay, and using a Bayesian analysis we set a lower limit on the $^{128}$Te 0$νββ$ decay half-life of T$_{1/2} > 3.6 \times 10^{24}$ yr (90\% CI). This represents the most stringent limit on the half-life of this isotope, improving by over a factor 30 the previous direct search results, and exceeding those from geochemical experiments for the first time.

Submitted 6 May, 2022; originally announced May 2022.

arXiv:2204.07336 [pdf, ps, other]

Preparing for the Future -- Rethinking Proxy Apps

Authors: Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Ray Bair, Andrew A. Chien, Jeffrey S. Vetter, John Shalf

Abstract: A considerable amount of research and engineering went into designing proxy applications, which represent common high-performance computing workloads, to co-design and evaluate the current generation of supercomputers, e.g., RIKEN's Supercomputer Fugaku, ANL's Aurora, or ORNL's Frontier. This process was necessary to standardize the procurement while avoiding duplicated effort at each HPC center to develop their own benchmarks. Unfortunately, proxy applications force HPC centers and providers (vendors) into an undesirable state of rigidity, in contrast to the fast-moving trends of current technology and future heterogeneity. To accommodate an extremely-heterogeneous future, we have to reconsider how to co-design supercomputers during the next decade, and avoid repeating the past mistakes. This position paper outlines the current state-of-the-art in system co-design, challenges encountered over the past years, and a proposed plan to move forward.

Submitted 15 April, 2022; originally announced April 2022.

arXiv:2203.08684 [pdf, other]

doi 10.1103/PhysRevC.105.065504

Search for Neutrinoless $β^+EC$ Decay of $^{120}$Te with CUORE

Authors: D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, L. Canonica, X. G. Cao, C. Capelli, S. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi , et al. (96 additional authors not shown)

Abstract: CUORE is a large scale cryogenic experiment searching for neutrinoless double beta decay ($0νββ$) in $^{130}$Te. The CUORE detector is made of natural tellurium, providing the possibility of rare event searches on isotopes other than $^{130}$Te. In this work we describe a search for neutrinoless positron emitting electron capture ($0νβ^+EC$) decay in $^{120}$Te with a total TeO$_2$ exposure of 355.7 kg $\cdot$ yr, corresponding to 0.2405 kg $\cdot$ yr of $^{120}$Te. Albeit $0 νββ$ with two final state electrons represents the most promising channel, the emission of a positron and two 511-keV $γ$s make $0νβ^+EC$ decay signature extremely clear. To fully exploit the potential offered by the detector modularity we include events with different topology and perform a simultaneous fit of five selected signal signatures. Using blinded data we extract a median exclusion sensitivity of $3.4 \cdot 10^{22}$ yr at 90% Credibility Interval (C.I.). After unblinding we find no evidence of $0νβ^+EC$ signal and set a 90% C.I. Bayesian lower limit of $2.9 \cdot 10^{22}$ yr on $^{120}$Te half-life. This result improves by an order of magnitude the existing limit from the combined analysis of CUORE-0 and Cuoricino.

Submitted 18 July, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

arXiv:2203.08386 [pdf, other]

Toward CUPID-1T

Authors: A. Armatol, C. Augier, F. T. Avignone III, O. Azzolini, M. Balata, K. Ballen, A. S. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, M. Beretta, M. Bettelli, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, C. Capelli, S. Capelli, L. Cappelli, L. Cardani , et al. (150 additional authors not shown)

Abstract: Current experiments to search for broken lepton-number symmetry through the observation of neutrinoless double-beta decay ($0\mathrm{νββ}$) provide the most stringent limits on the Majorana nature of neutrinos and the effective Majorana neutrino mass ($m_{ββ}$). The next-generation experiments will focus on the sensitivity to the $0\mathrm{νββ}$ half-life of $\mathcal{O}(10^{27}$--$10^{28}$~years$)$ and $m_{ββ}\lesssim15$~meV, which would provide complete coverage of the so-called Inverted Ordering region of the neutrino mass parameter space. By taking advantage of recent technological breakthroughs, new, future calorimetric experiments at the 1-ton scale can increase the sensitivity by at least another order of magnitude, exploring the large fraction of the parameter space that corresponds to the Normal neutrino mass ordering. In case of a discovery, such experiments could provide important insights toward a new understanding of the mechanism of $0\mathrm{νββ}$. We present here a series of projects underway that will provide advancements in background reduction, cryogenic readout, and physics searches beyond $0\mathrm{νββ}$, all moving toward the next-to-next generation CUPID-1T detector.

Submitted 8 April, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

Comments: contribution to Snowmass 2021

arXiv:2202.06279 [pdf, other]

Optimization of the first CUPID detector module

Authors: CUPID collaboration, A. Armatol, C. Augier, F. T. Avignone III, O. Azzolini, M. Balata, K. Ballen, A. S. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, M. Beretta, M. Bettelli, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, C. Capelli, S. Capelli, L. Cappelli , et al. (153 additional authors not shown)

Abstract: CUPID will be a next generation experiment searching for the neutrinoless double $β$ decay, whose discovery would establish the Majorana nature of the neutrino. Based on the experience achieved with the CUORE experiment, presently taking data at LNGS, CUPID aims to reach a background free environment by means of scintillating Li$_{2}$$^{100}$MoO$_4$ crystals coupled to light detectors. Indeed, the simultaneous heat and light detection allows us to reject the dominant background of $α$ particles, as proven by the CUPID-0 and CUPID-Mo demonstrators. In this work we present the results of the first test of the CUPID baseline module. In particular, we propose a new optimized detector structure and light sensors design to enhance the engineering and the light collection, respectively. We characterized the heat detectors, achieving an energy resolution of (5.9 $\pm$ 0.2) keV FWHM at the $Q$-value of $^{100}$Mo (about 3034 keV). We studied the light collection of the baseline CUPID design with respect to an alternative configuration which features gravity-assisted light detectors' mounting. In both cases we obtained an improvement in the light collection with respect to past measures and we validated the particle identification capability of the detector, which ensures an $α$ particle rejection higher than 99.9%, fully satisfying the requirements for CUPID.

Submitted 13 February, 2022; originally announced February 2022.

Comments: 10 pages, 5 figures

arXiv:2201.01732 [pdf, ps, other]

Annotating TAP responses on-the-fly against an IVOA data model

Authors: Mireille Louys, Laurent Michel, François Bonnarel, Joann Vetter

Abstract: With the success and widespread of the IVOA Table Access Protocol (1) for discovering and querying tabular data in astronomy, more than one hundred of TAP services exposing altogether 22 thousands of tables are accessible from the IVOA Registries at the time of writing. Currently the TAP protocol presents table data and metadata via a {TAP\_SCHEMA} describing the served tables with their columns and possible joins between them. We explore here how to add an information layer, so that values within table columns can be gathered and used to populate instances of objects defined in a selected IVOA data model like Photometry, Coords, Measure, Transform or the proposed MANGO container model. This information layer is provided through annotation tags which tell how the columns' values can be interpreted as attributes of instances of that model. Then when a TAP query is processed, our server add-on interprets the ADQL query string and produces on-the-fly, when possible, the TAP response as an annotated VOTable document. The FIELD elements in the table response are mapped to corresponding model elements templated for this service. This has been prototyped in Java, using the VOLLT package library and a template annotation document representing elements from the MANGO data model. This has been exercised on examples based on Vizier and Chandra catalogs.

Submitted 5 January, 2022; originally announced January 2022.

Comments: 4 pages, 2 figures, ADASS XXXI Conference Proceedings held on 24-28 October 2021

arXiv:2108.07883 [pdf, other]

doi 10.1016/j.ppnp.2021.103902

CUORE Opens the Door to Tonne-scale Cryogenics Experiments

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, F. Alessandria, K. Alfonso, E. Andreotti, F. T. Avignone III, O. Azzolini, M. Balata, I. Bandac, T. I. Banks, G. Bari, M. Barucci, J. W. Beeman, F. Bellini, G. Benato, M. Beretta, A. Bersani, D. Biare, M. Biassoni, F. Bragazzi, A. Branca, C. Brofferio, A. Bryant, A. Buccheri , et al. (184 additional authors not shown)

Abstract: The past few decades have seen major developments in the design and operation of cryogenic particle detectors. This technology offers an extremely good energy resolution - comparable to semiconductor detectors - and a wide choice of target materials, making low temperature calorimetric detectors ideal for a variety of particle physics applications. Rare event searches have continued to require ever greater exposures, which has driven them to ever larger cryogenic detectors, with the CUORE experiment being the first to reach a tonne-scale, mK-cooled, experimental mass. CUORE, designed to search for neutrinoless double beta decay, has been operational since 2017 at a temperature of about 10 mK. This result has been attained by the use of an unprecedentedly large cryogenic infrastructure called the CUORE cryostat: conceived, designed and commissioned for this purpose. In this article the main characteristics and features of the cryogenic facility developed for the CUORE experiment are highlighted. A brief introduction of the evolution of the field and of the past cryogenic facilities are given. The motivation behind the design and development of the CUORE cryogenic facility is detailed as are the steps taken toward realization, commissioning, and operation of the CUORE cryostat. The major challenges overcome by the collaboration and the solutions implemented throughout the building of the cryogenic facility will be discussed along with the potential improvements for future facilities.
The success of CUORE has opened the door to a new generation of large-scale cryogenic facilities in numerous fields of science. Broader implications of the incredible feat achieved by the CUORE collaboration on the future cryogenic facilities in various fields ranging from neutrino and dark matter experiments to quantum computing will be examined.

Submitted 2 December, 2021; v1 submitted 17 August, 2021; originally announced August 2021.

Comments: 45 pages, 14 figures

Journal ref: Prog. Part. Nucl. Phys., 122 (2021), Article 103902

arXiv:2107.10537 [pdf, other]

doi 10.1103/PhysRevApplied.17.044028

Zero- and Low-Field Sensing with Nitrogen Vacancy Centers

Authors: Philipp J. Vetter, Alastair Marshall, Genko T. Genov, Tim F. Weiss, Nico Striegler, Eva F. Großmann, Santiago Oviedo Casado, Javier Cerrillo, Javier Prior, Philipp Neumann, Fedor Jelezko

Abstract: Over the years, an enormous effort has been made to establish nitrogen vacancy (NV) centers in diamond as easily accessible and precise magnetic field sensors. However, most of their sensing protocols rely on the application of bias magnetic fields, preventing their usage in zero- or low-field experiments. We overcome this limitation by exploiting the full spin $S=1$ nature of the NV center, allowing us to detect nuclear spin signals at zero- and low-field with a linearly polarized microwave field. As conventional dynamical decoupling protocols fail in this regime, we develop new robust pulse sequences and optimized pulse pairs, which allow us to sense temperature and weak AC magnetic fields and achieve an efficient decoupling from environmental noise. Our work allows for much broader and simpler applications of NV centers as magnetic field sensors in the zero- and low-field regime and can be further extended to three-level systems in ions and atoms.

Submitted 2 February, 2022; v1 submitted 22 July, 2021; originally announced July 2021.

Journal ref: Physical Review Applied 17.4 (2022): 044028

arXiv:2104.06906 [pdf, other]

doi 10.1038/s41586-022-04497-4

Search for Majorana neutrinos exploiting millikelvin cryogenics with CUORE

Authors: D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Beretta, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, L. Canonica, X. G. Cao, S. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi, D. Chiesa , et al. (89 additional authors not shown)

Abstract: The possibility that neutrinos may be their own antiparticles, unique among the known fundamental particles, arises from the symmetric theory of fermions proposed by Ettore Majorana in 1937. Given the profound consequences of such Majorana neutrinos, among which is a potential explanation for the matter-antimatter asymmetry of the universe via leptogenesis, the Majorana nature of neutrinos commands intense experimental scrutiny globally; one of the primary experimental probes is neutrinoless double beta ($0 νββ$) decay. Here we show results from the search for $0 νββ$ decay of $^{130}$Te, using the latest advanced cryogenic calorimeters with the CUORE experiment. CUORE, operating just 10 millikelvin above absolute zero, has pushed the state of the art on three frontiers: the sheer mass held at such ultra-low temperatures, operational longevity, and the low levels of ionising radiation emanating from the cryogenic infrastructure. We find no evidence for $0 νββ$ decay and set a lower bound of $T_{1/2}^{0 ν} > 2.2 \times 10^{25}$ years at a 90% credibility interval. We discuss potential applications of the advances made with CUORE to other fields such as direct dark matter, neutrino and nuclear physics searches and large-scale quantum computing, which can benefit from sustained operation of large payloads in a low-radioactivity, ultra-low temperature cryogenic environment.

Submitted 11 April, 2022; v1 submitted 14 April, 2021; originally announced April 2021.

Journal ref: Nature 604, 53 (2022)

arXiv:2101.10702 [pdf, other]

doi 10.1140/epjc/s10052-021-09317-z

Search for Double-Beta Decay of $\mathrm{^{130}Te}$ to the $0^+$ States of $\mathrm{^{130}Xe}$ with CUORE

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, L. Canonica, X. G. Cao, S. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, E. Celi, D. Chiesa, M. Clemenza, S. Copello, C. Cosmelli, O. Cremonesi , et al. (83 additional authors not shown)

Abstract: The CUORE experiment is a large bolometric array searching for the lepton number violating neutrino-less double beta decay ($0νββ$) in the isotope $\mathrm{^{130}Te}$. In this work we present the latest results on two searches for the double beta decay (DBD) of $\mathrm{^{130}Te}$ to the first $0^{+}_2$ excited state of $\mathrm{^{130}Xe}$: the $0νββ$ decay and the Standard Model-allowed two-neutrinos double beta decay ($2νββ$). Both searches are based on a 372.5 kg$\times$yr TeO$_2$ exposure. The de-excitation gamma rays emitted by the excited Xe nucleus in the final state yield a unique signature, which can be searched for with low background by studying coincident events in two or more bolometers. The closely packed arrangement of the CUORE crystals constitutes a significant advantage in this regard. The median limit setting sensitivities at 90\% Credible Interval (C.I.) of the given searches were estimated as $\mathrm{S^{0ν}_{1/2} = 5.6 \times 10^{24} \: \mathrm{yr}}$ for the ${0νββ}$ decay and $\mathrm{S^{2ν}_{1/2} = 2.1 \times 10^{24} \: \mathrm{yr}}$ for the ${2νββ}$ decay. No significant evidence for either of the decay modes was observed and a Bayesian lower bound at $90\%$ C.I. on the decay half lives is obtained as: $\mathrm{(T_{1/2})^{0ν}_{0^+_2} > 5.9 \times 10^{24} \: \mathrm{yr}}$ for the $0νββ$ mode and $\mathrm{(T_{1/2})^{2ν}_{0^+_2} > 1.3 \times 10^{24} \: \mathrm{yr}}$ for the $2νββ$ mode.
These represent the most stringent limits on the DBD of $^{130}$Te to excited states and improve by a factor $\sim5$ the previous results on this process.

Submitted 30 July, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

Comments: 13 pages, 6 figures

arXiv:2012.11749 [pdf, other]

doi 10.1103/PhysRevLett.126.171801

Measurement of the 2$νββ$ Decay Half-life of $^{130}$Te with CUORE

Authors: CUORE Collaboration, D. Q. Adams, C. Alduino, K. Alfonso, F. T. Avignone III, O. Azzolini, G. Bari, F. Bellini, G. Benato, M. Biassoni, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, A. Caminata, A. Campani, L. Canonica, X. G. Cao, S. Capelli, L. Cappelli, L. Cardani, P. Carniti, N. Casali, D. Chiesa, M. Clemenza , et al. (88 additional authors not shown)

Abstract: We measured two-neutrino double beta decay of $^{130}$Te using an exposure of 300.7 kg$\cdot$yr accumulated with the CUORE detector. Using a Bayesian analysis to fit simulated spectra to experimental data, it was possible to disentangle all the major background sources and precisely measure the two-neutrino contribution. The half-life is in agreement with past measurements with a strongly reduced uncertainty: $T^{2ν}_{1/2} = 7.71^{+0.08}_{-0.06}\mathrm{(stat.)}^{+0.12}_{-0.15}\mathrm{(syst.)}\times10^{20}$ yr. This measurement is the most precise determination of the $^{130}$Te 2$νββ$ decay half-life to date.

Submitted 19 May, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

Comments: Published in PRL

ACM Class: J.2

Journal ref: Phys. Rev. Lett. 126, 171801 (2021)

arXiv:2011.13806 [pdf, other]

doi 10.1088/1748-0221/16/02/P02037

A CUPID Li$_{2}$$^{100}$MoO$_4$ scintillating bolometer tested in the CROSS underground facility

Authors: The CUPID Interest Group, A. Armatol, E. Armengaud, W. Armstrong, C. Augier, F. T. Avignone III, O. Azzolini, I. C. Bandac, A. S. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, M. Beretta, L. Bergé, Ch. Bourgeois, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, J. M. Calvo-Mozota, J. Camilleri , et al. (156 additional authors not shown)

Abstract: A scintillating bolometer based on a large cubic Li$_{2}$$^{100}$MoO$_4$ crystal (45 mm side) and a Ge wafer (scintillation detector) has been operated in the CROSS cryogenic facility at the Canfranc underground laboratory in Spain. The dual-readout detector is a prototype of the technology that will be used in the next-generation $0\nu2β$ experiment CUPID. The measurements were performed at 18 and 12 mK temperature in a pulse tube dilution refrigerator. This setup utilizes the same technology as the CUORE cryostat that will host CUPID and so represents an accurate estimation of the expected performance. The Li$_{2}$$^{100}$MoO$_4$ bolometer shows a high energy resolution of 6 keV FWHM at the 2615 keV $γ$ line. The detection of scintillation light for each event triggered by the Li$_{2}$$^{100}$MoO$_4$ bolometer allowed for a full separation ($\sim$8$σ$) between $γ$($β$) and $α$ events above 2 MeV. The Li$_{2}$$^{100}$MoO$_4$ crystal also shows a high internal radiopurity with $^{228}$Th and $^{226}$Ra activities of less than 3 and 8 $μ$Bq/kg, respectively. Taking also into account the advantage of a more compact and massive detector array, which can be made of cubic-shaped crystals (compared to the cylindrical ones), this test demonstrates the great potential of cubic Li$_{2}$$^{100}$MoO$_4$ scintillating bolometers for high-sensitivity searches for the $^{100}$Mo $0\nu2β$ decay in CROSS and CUPID projects.

Submitted 27 November, 2020; originally announced November 2020.

Comments: 19 pages, 7 figures, 1 table

arXiv:2011.13656 [pdf, other]

Characterization of cubic Li$_{2}$$^{100}$MoO$_4$ crystals for the CUPID experiment

Authors: A. Armatol, E. Armengaud, W. Armstrong, C. Augier, F. T. Avignone III, O. Azzolini, A. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, M. Beretta, L. Bergè, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, S. Capelli, L. Cappelli, L. Cardani, P. Carniti , et al. (147 additional authors not shown)

Abstract: The CUPID Collaboration is designing a tonne-scale, background-free detector to search for double beta decay with sufficient sensitivity to fully explore the parameter space corresponding to the inverted neutrino mass hierarchy scenario. One of the CUPID demonstrators, CUPID-Mo, has proved the potential of enriched Li$_{2}$$^{100}$MoO$_4$ crystals as suitable detectors for neutrinoless double beta decay search. In this work, we characterised cubic crystals that, compared to the cylindrical crystals used by CUPID-Mo, are more appealing for the construction of tightly packed arrays. We measured an average energy resolution of (6.7$\pm$0.6) keV FWHM in the region of interest, approaching the CUPID target of 5 keV FWHM. We assessed the identification of $α$ particles with and without a reflecting foil that enhances the scintillation light collection efficiency, proving that the baseline design of CUPID already ensures a complete suppression of this $α$-induced background contribution. We also used the collected data to validate a Monte Carlo simulation modelling the light collection efficiency, which will enable further optimisations of the detector.

Submitted 27 November, 2020; originally announced November 2020.

arXiv:2011.11726 [pdf, other]

doi 10.1103/PhysRevC.104.015501

Novel technique for the study of pile-up events in cryogenic bolometers

Authors: A. Armatol, E. Armengaud, W. Armstrong, C. Augier, F. T. Avignone III, O. Azzolini, A. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, M. Beretta, L. Bergé, M. Biassoni, J. Billard, V. Boldrini, A. Branca, C. Brofferio, C. Bucci, J. Camilleri, S. Capelli, L. Cappelli, L. Cardani, P. Carniti , et al. (144 additional authors not shown)

Abstract: Precise characterization of detector time resolution is of crucial importance for next-generation cryogenic-bolometer experiments searching for neutrinoless double-beta decay, such as CUPID, in order to reject background due to pile-up of two-neutrino double-beta decay events. In this paper, we describe a technique developed to study the pile-up rejection capability of cryogenic bolometers. Our approach, which consists of producing controlled pile-up events with a programmable waveform generator, has the benefit that we can reliably and reproducibly control the time separation and relative energy of the individual components of the generated pile-up events. The resulting data allow us to optimize and benchmark analysis strategies to discriminate between individual and pile-up pulses. We describe a test of this technique performed with a small array of detectors at the Laboratori Nazionali del Gran Sasso, in Italy; we obtain a 90% rejection efficiency against pulser-generated pile-up events with rise time of ~15ms down to time separation between the individual events of about 2ms.

Submitted 12 July, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

Journal ref: Phys. Rev. C 104, 015501 (2021)

arXiv:1903.04364 [pdf, other]

doi 10.1109/IPDPSW.2019.00092

TensorFlow Doing HPC

Authors: Steven W. D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey S. Vetter

Abstract: TensorFlow is a popular emerging open-source programming framework supporting the execution of distributed applications on heterogeneous hardware. While TensorFlow has been initially designed for developing Machine Learning (ML) applications, in fact TensorFlow aims at supporting the development of a much broader range of application kinds that are outside the ML domain and can possibly include HPC applications. However, very few experiments have been conducted to evaluate TensorFlow performance when running HPC workloads on supercomputers. This work addresses this lack by designing four traditional HPC benchmark applications: STREAM, matrix-matrix multiply, Conjugate Gradient (CG) solver and Fast Fourier Transform (FFT). We analyze their performance on two supercomputers with accelerators and evaluate the potential of TensorFlow for developing HPC applications. Our tests show that TensorFlow can fully take advantage of high performance networks and accelerators on supercomputers. Running our TensorFlow STREAM benchmark, we obtain over 50% of theoretical communication bandwidth on our testing platform. We find an approximately 2x, 1.7x and 1.8x performance improvement when increasing the number of GPUs from two to four in the matrix-matrix multiply, CG and FFT applications respectively. All our performance results demonstrate that TensorFlow has high potential of emerging also as HPC programming framework for heterogeneous supercomputers.

Submitted 11 March, 2019; originally announced March 2019.

Comments: Accepted for publication at The Ninth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES'19)

arXiv:1810.08955 [pdf, other]

Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training

Authors: Jiawen Liu, Dong Li, Gokcen Kestor, Jeffrey Vetter

Abstract: Training neural network often uses a machine learning framework such as TensorFlow and Caffe2. These frameworks employ a dataflow model where the NN training is modeled as a directed graph composed of a set of nodes. Operations in neural network training are typically implemented by the frameworks as primitives and represented as nodes in the dataflow graph. Training NN models in a dataflow-based machine learning framework involves a large number of fine-grained operations. Those operations have diverse memory access patterns and computation intensity. How to manage and schedule those operations is challenging, because we have to decide the number of threads to run each operation (concurrency control) and schedule those operations for good hardware utilization and system throughput. In this paper, we extend an existing runtime system (the TensorFlow runtime) to enable automatic concurrency control and scheduling of operations. We explore performance modeling to predict the performance of operations with various thread-level parallelism. Our performance model is highly accurate and lightweight. Leveraging the performance model, our runtime system employs a set of scheduling strategies that co-run operations to improve hardware utilization and system throughput. Our runtime system demonstrates a big performance benefit.
Comparing with using the recommended configurations for concurrency control and operation scheduling in TensorFlow, our approach achieves 33% performance (execution time) improvement on average (up to 49%) for three neural network models, and achieves high performance closing to the optimal one manually obtained by the user.

Submitted 18 February, 2019; v1 submitted 21 October, 2018; originally announced October 2018.

arXiv:1807.06019 [pdf, other]

doi 10.1117/12.2313483

Optimizing the Efficiency of Fabry-Perot Interferometers with Silicon-Substrate Mirrors

Authors: Nicholas F. Cothard, Mahiro Abe, Thomas Nikola, Gordon J. Stacey, German Cortes-Medellin, Patricio A. Gallardo, Brian J. Koopman, Michael D. Niemack, Stephen C. Parshley, Eve M. Vavagiakis, Kenneth J. Vetter

Abstract: We present the novel design of microfabricated, silicon-substrate based mirrors for use in cryogenic Fabry-Perot Interferometers (FPIs) for the mid-IR to sub-mm/mm wavelength regime. One side of the silicon substrate will have a double-layer metamaterial anti-reflection coating (ARC) anisotropically etched into it and the other side will be metalized with a reflective mesh pattern. The double-layer ARC ensures a reflectance of less than 1% at the surface substrate over the FPI bandwidth. This low reflectance is required to achieve broadband capability and to mitigate contaminating resonances from the silicon surface. Two silicon substrates with their metalized surfaces facing each other and held parallel with an adjustable separation will compose the FPI. To create an FPI with nearly uniform finesse over the FPI bandwidth, we use a combination of inductive and capacitive gold meshes evaporated onto the silicon substrate. We also consider the use of niobium as a superconducting reflective mesh for long wavelengths to eliminate ohmic losses at each reflection in the resonating cavity of the FPI and thereby increase overall transmission. We develop these silicon-substrate based FPIs for use in ground (e.g. CCAT-prime), air (e.g. HIRMES), and future space-based telescopes (e.g. the Origins Space Telescope concept). Such FPIs are well suited for spectroscopic imaging with the upcoming large IR/sub-mm/mm TES bolometer detector arrays.
Here we present the fabrication and performance of multi-layer, plasma-etched, silicon metamaterial ARC, as well as models of the mirrors and FPIs.

Submitted 16 July, 2018; originally announced July 2018.

Comments: Presented at SPIE Advances in Optical and Mechanical Technologies for Telescopes and Instrumentation III, June 14, 2018

arXiv:1803.04014 [pdf, other]

doi 10.1109/IPDPSW.2018.00091

NVIDIA Tensor Core Programmability, Performance & Precision

Authors: Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, Jeffrey S. Vetter

Abstract: The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core" that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to program NVIDIA Tensor Cores, their performances and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API, CUTLASS, a templated library based on WMMA, and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflops/s. While precision loss due to matrix multiplication with half precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using of NVIDIA Tensor Cores.

Submitted 11 March, 2018; originally announced March 2018.

Comments: This paper has been accepted by the Eighth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 2018

arXiv:1802.00819 [pdf, other]

doi 10.1103/PhysRevLett.121.060401

Controllable Non-Markovianity for a Spin Qubit in Diamond

Authors: Jan F. Haase, Philipp J. Vetter, Thomas Unden, Andrea Smirne, Joachim Rosskopf, Boris Naydenov, Alastair Stacey, Fedor Jelezko, Martin B. Plenio, Susana F. Huelga

Abstract: We present a flexible scheme to realize non-artificial non-Markovian dynamics of an electronic spin qubit, using a nitrogen-vacancy center in diamond where the inherent nitrogen spin serves as a regulator of the dynamics. By changing the population of the nitrogen spin, we show that we can smoothly tune the non-Markovianity of the electron spin's dynamic. Furthermore, we examine the decoherence dynamics induced by the spin bath to exclude other sources of non-Markovianity. The amount of collected measurement data is kept at a minimum by employing Bayesian data analysis. This allows for a precise quantification of the parameters involved in the description of the dynamics and a prediction of so far unobserved data points.

Submitted 16 August, 2018; v1 submitted 2 February, 2018; originally announced February 2018.

Comments: 12 pages, 9 figure, including supplemental material

Journal ref: Phys. Rev. Lett. 121, 060401 (2018)

arXiv:1612.03744 [pdf, ps, other]

Fault Attacks on Encrypted General Purpose Compute Platforms

Authors: Robert Buhren, Shay Gueron, Jan Nordholz, Jean-Pierre Seifert, Julian Vetter

Abstract: Adversaries with physical access to a target platform can perform cold boot or DMA attacks to extract sensitive data from the RAM. In response, several main-memory encryption schemes have been proposed to prevent such attacks. Also hardware vendors have acknowledged the threat and already announced respective hardware extensions. Intel's SGX and AMD's SME will provide means to encrypt parts of the RAM to protect security-relevant assets that reside there. Encrypting the RAM will protect the user's content against passive eavesdropping. However, the level of protection it provides in scenarios that involve an adversary who is not only able to read from RAM but can also change content in RAM is less clear. Obviously, encryption offers some protection against such an "active" adversary: from the ciphertext the adversary cannot see what value is changed in the plaintext, nor predict the system behaviour based on the changes. But is this enough to prevent an active adversary from performing malicious tasks? This paper addresses the open research question whether encryption alone is a dependable protection mechanism in practice when considering an active adversary. To this end, we first build a software based memory encryption solution on a desktop system which mimics AMD's SME. Subsequently, we demonstrate a proof-of-concept fault attack on this system, by which we are able to extract the private RSA key of a GnuPG user. Our work suggests that transparent memory encryption is not enough to prevent active attacks.

Submitted 12 December, 2016; originally announced December 2016.

arXiv:1404.4629 [pdf, ps, other]

A Survey of Methods For Analyzing and Improving GPU Energy Efficiency

Authors: Sparsh Mittal, Jeffrey S. Vetter

Abstract: Recent years have witnessed a phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to dramatic increase in their power consumption. This paper surveys research works on analyzing and improving energy efficiency of GPUs. It also provides a classification of these techniques on the basis of their main research idea. Further, it attempts to synthesize research works which compare energy efficiency of GPUs with other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of state-of-the-art in GPU power management and motivate them to architect highly energy-efficient GPUs of tomorrow.

Submitted 18 April, 2014; v1 submitted 17 April, 2014; originally announced April 2014.

Comments: Accepted with minor revision in ACM Computing Survey Journal (impact factor 3.85, five year impact of 7.85)

ACM Class: A.1; I.3.1; H.3.4

Showing 1–43 of 43 results for author: Vetter, J