A stable decoupled perfectly matched layer for the 3D wave equation using the nodal discontinuous Galerkin method

Authors: Sophia Julia Feriani, Matthias Cosnefroy, Allan Peter Engsig-Karup, Tim Warburton, Finnur Pind, Cheol-Ho Jeong

Abstract: In outdoor acoustics, the calculations of sound propagating in air can be computationally heavy if the domain is chosen large enough to fulfil the Sommerfeld radiation condition. By strategically truncating the computational domain with an efficient boundary treatment, the computational cost is lowered. One commonly used boundary treatment is the perfectly matched layer (PML) that dampens outgoing waves without polluting the computed solution in the inner domain. The purpose of this study is to propose and assess a new perfectly matched layer formulation for the 3D acoustic wave equation, using the nodal discontinuous Galerkin finite element method. The formulation is based on an efficient PML formulation that can be decoupled to further increase the computational efficiency and guarantee stability without sacrificing accuracy. This decoupled PML formulation is demonstrated to be long-time stable and an optimization procedure of the damping functions is proposed to enhance the performance of the formulation.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2311.07835 [pdf, other]

doi 10.1103/PhysRevD.110.012005

Expanding neutrino oscillation parameter measurements in NOvA using a Bayesian approach

Authors: NOvA Collaboration, M. A. Acero, B. Acharya, P. Adamson, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, N. Balashov, P. Baldi, B. A. Bambah, A. Bat, K. Bays, R. Bernstein, T. J. C. Bezerra, V. Bhatnagar, D. Bhattarai, B. Bhuyan, J. Bian, A. C. Booth, R. Bowles, B. Brahma, C. Bromberg, et al. (174 additional authors not shown)

Abstract: NOvA is a long-baseline neutrino oscillation experiment that measures oscillations in charged-current $ν_μ \rightarrow ν_μ$ (disappearance) and $ν_μ \rightarrow ν_{e}$ (appearance) channels, and their antineutrino counterparts, using neutrinos of energies around 2 GeV over a distance of 810 km. In this work we reanalyze the dataset first examined in our previous paper [Phys. Rev. D 106, 032004 (2022)] using an alternative statistical approach based on Bayesian Markov Chain Monte Carlo. We measure oscillation parameters consistent with the previous results. We also extend our inferences to include the first NOvA measurements of the reactor mixing angle $θ_{13}$ and the Jarlskog invariant. We use these results to quantify the strength of our inferences about CP violation, as well as to examine the effects of constraints from short-baseline measurements of $θ_{13}$ using antineutrinos from nuclear reactors when making NOvA measurements of $θ_{23}$. Our long-baseline measurement of $θ_{13}$ is also shown to be consistent with the reactor measurements, supporting the general applicability and robustness of the PMNS framework for neutrino oscillations.

Submitted 27 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 20 pages, 17 figures; version accepted by Phys. Rev. D. Data associated with this paper is available at https://doi.org/10.15484/2349444

Report number: FERMILAB-PUB-23-667-AD-CSAID-ND

Journal ref: Phys.Rev.D 110 (2024) 1, 012005

arXiv:2305.10965 [pdf, other]

Stopping Criteria for the Conjugate Gradient Algorithm in High-Order Finite Element Methods

Authors: Yichen Guo, Eric de Sturler, Tim Warburton

Abstract: We introduce three new stopping criteria that balance algebraic and discretization errors for the conjugate gradient algorithm applied to high-order finite element discretizations of Poisson problems. The current state of the art stopping criteria compare a posteriori estimates of discretization error against estimates of the algebraic error. Firstly, we propose a new error indicator derived from a recovery-based error estimator that is less computationally expensive and more reliable. Secondly, we introduce a new stopping criterion that suggests stopping when the norm of the linear residual is less than a small fraction of an error indicator derived directly from the residual. This indicator shares the same mesh size and polynomial degree scaling as the norm of the residual, resulting in a robust criterion regardless of the mesh size, the polynomial degree, and the shape regularity of the mesh. Thirdly, in solving Poisson problems with highly variable piecewise constant coefficients, we introduce a subdomain-based criterion that recommends stopping when the norm of the linear residual restricted to each subdomain is smaller than the corresponding indicator also restricted to that subdomain. Numerical experiments, including tests with anisotropic meshes and highly variable piecewise constant coefficients, demonstrate that the proposed criteria efficiently avoid both premature termination and over-solving.

Submitted 18 May, 2023; originally announced May 2023.

Comments: 22 pages, 11 figures

MSC Class: 65N30; 65N22; 65F10

arXiv:2207.14353 [pdf, other]

The Profiled Feldman-Cousins technique for confidence interval construction in the presence of nuisance parameters

Authors: M. A. Acero, B. Acharya, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, A. Bat, K. Bays, R. Bernstein, V. Bhatnagar, D. Bhattarai, B. Bhuyan, J. Bian, A. C. Booth, R. Bowles, et al. (196 additional authors not shown)

Abstract: Measuring observables to constrain models using maximum-likelihood estimation is fundamental to many physics experiments. The Profiled Feldman-Cousins method described here is a potential solution to common challenges faced in constructing accurate confidence intervals: small datasets, bounded parameters, and the need to properly handle nuisance parameters. This method achieves more accurate frequentist coverage than other methods in use, and is generally applicable to the problem of parameter estimation in neutrino oscillations and similar measurements. We describe an implementation of this method in the context of the NOvA experiment.

Submitted 1 August, 2022; v1 submitted 28 July, 2022; originally announced July 2022.

Comments: 19 pages, 12 figures

Report number: FERMILAB-PUB-22-476-ND

arXiv:2206.10585 [pdf, other]

doi 10.1103/PhysRevLett.130.051802

Measurement of the $ν_e-$Nucleus Charged-Current Double-Differential Cross Section at $\left< E_ν \right> = $ 2.4 GeV using NOvA

Authors: M. A. Acero, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, R. Bernstein, V. Bhatnagar, D. Bhattarai, B. Bhuyan, J. Bian, A. C. Booth, R. Bowles, B. Brahma, C. Bromberg, et al. (190 additional authors not shown)

Abstract: The inclusive electron neutrino charged-current cross section is measured in the NOvA near detector using $8.02\times10^{20}$ protons-on-target (POT) in the NuMI beam. The sample of GeV electron neutrino interactions is the largest analyzed to date and is limited by $\simeq$ 17\% systematic rather than the $\simeq$ 7.4\% statistical uncertainties. The double-differential cross section in final-state electron energy and angle is presented for the first time, together with the single-differential dependence on $Q^{2}$ (squared four-momentum transfer) and energy, in the range 1 GeV $ \leq E_ν < $6 GeV. Detailed comparisons are made to the predictions of the GENIE, GiBUU, NEUT, and NuWro neutrino event generators. The data do not strongly favor a model over the others consistently across all three cross sections measured, though some models have especially good or poor agreement in the single differential cross section vs. $Q^{2}$.

Submitted 21 June, 2022; originally announced June 2022.

Report number: FERMILAB-PUB-22-446-ND

arXiv:2203.10238 [pdf, other]

On the entropy projection and the robustness of high order entropy stable discontinuous Galerkin schemes for under-resolved flows

Authors: Jesse Chan, Hendrik Ranocha, Andres Rueda-Ramirez, Gregor Gassner, Tim Warburton

Abstract: High order entropy stable schemes provide improved robustness for computational simulations of fluid flows. However, additional stabilization and positivity preserving limiting can still be required for variable-density flows with under-resolved features. We demonstrate numerically that entropy stable DG methods which incorporate an "entropy projection" are less likely to require additional limiting to retain positivity for certain types of flows. We conclude by investigating potential explanations for this observed improvement in robustness.

Submitted 19 March, 2022; originally announced March 2022.

arXiv:2202.12477 [pdf, ps, other]

HipBone: A performance-portable GPU-accelerated C++ version of the NekBone benchmark

Authors: Noel Chalmers, Abhishek Mishra, Damon McDougall, Tim Warburton

Abstract: We present hipBone, an open source performance-portable proxy application for the Nek5000 (and NekRS) CFD applications. HipBone is a fully GPU-accelerated C++ implementation of the original NekBone CPU proxy application with several novel algorithmic and implementation improvements which optimize its performance on modern fine-grain parallel GPU accelerators. Our optimizations include a conversion to store the degrees of freedom of the problem in assembled form in order to reduce the amount of data moved during the main iteration and a portable implementation of the main Poisson operator kernel. We demonstrate near-roofline performance of the operator kernel on three different modern GPU accelerators from two different vendors. We present a novel algorithm for splitting the application of the Poisson operator on GPUs which aggressively hides MPI communication required for both halo exchange and assembly. Our implementation of nearest-neighbor MPI communication then leverages several different routing algorithms and GPU-Direct RDMA capabilities, when available, which improves scalability of the benchmark. We demonstrate the performance of hipBone on three different clusters housed at Oak Ridge National Laboratory, namely the Summit supercomputer and the Frontier early-access clusters, Spock and Crusher. Our tests demonstrate both portability across different clusters and very good scaling efficiency, especially on large problems.

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2110.13041 [pdf, other]

doi 10.3389/fdata.2022.787421

Applications and Techniques for Fast Machine Learning in Science

Authors: Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini, Thea Aarrestad, Steffen Bahr, Jurgen Becker, Anne-Sophie Berthold, Richard J. Bonventre, Tomas E. Muller Bravo, Markus Diefenthaler, Zhen Dong, Nick Fritzsche, Amir Gholami, Ekaterina Govorkova, Kyle J Hazelwood, et al. (62 additional authors not shown)

Abstract: In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.

Submitted 25 October, 2021; originally announced October 2021.

Comments: 66 pages, 13 figures, 5 tables

Report number: FERMILAB-PUB-21-502-AD-E-SCD

Journal ref: Front. Big Data 5, 787421 (2022)

arXiv:2110.01716 [pdf, other]

Highly Optimized Full-Core Reactor Simulations on Summit

Authors: Paul Fischer, Elia Merzari, Misun Min, Stefan Kerkemeier, Yu-Hsiang Lan, Malachi Phillips, Thilina Rathnayake, April Novak, Derek Gaston, Noel Chalmers, Tim Warburton

Abstract: Nek5000/RS is a highly-performant open-source spectral element code for simulation of incompressible and low-Mach fluid flow, heat transfer, and combustion with a particular focus on turbulent flows in complex domains. It is based on high-order discretizations that realize the same (or lower) cost per gridpoint as traditional low-order methods. State-of-the-art multilevel preconditioners, efficient high-order time-splitting methods, and runtime-adaptive communication strategies are built on a fast OCCA-based kernel library, libParanumal, to provide scalability and portability across the spectrum of current and future high-performance computing platforms. On Summit, Nek5000/RS has recently achieved a milestone in the simulation of nuclear reactors: the first full-core computational fluid dynamics simulations of reactor cores, including pebble beds with > 350,000 pebbles and 98M elements advanced in less than 0.25 seconds per Navier-Stokes timestep. With carefully tuned algorithms, it is possible to simulate a single flow-through time for a full reactor core in less than six hours on all of Summit.

Submitted 1 October, 2021; originally announced October 2021.

Comments: 9 pages, 3 figures, 6 tables

MSC Class: 35-04

ACM Class: D.0; F.2; G.2; G.4; I.6; J.2

arXiv:2109.12220 [pdf, other]

doi 10.1103/PhysRevD.107.052011

Measurement of the Double-Differential Muon-neutrino Charged-Current Inclusive Cross Section in the NOvA Near Detector

Authors: M. A. Acero, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, B. Behera, R. Bernstein, V. Bhatnagar, D. Bhattarai, B. Bhuyan, J. Bian, J. Blair, A. C. Booth, R. Bowles, et al. (181 additional authors not shown)

Abstract: We report cross-section measurements of the final-state muon kinematics for $ν_μ$ charged-current interactions in the NOvA near detector using an accumulated 8.09$\times10^{20}$ protons-on-target (POT) in the NuMI beam. We present the results as a double-differential cross section in the observed outgoing muon energy and angle, as well as single-differential cross sections in the derived neutrino energy, $E_ν$, and square of the four-momentum transfer, $Q^2$. We compare the results to inclusive cross-section predictions from various neutrino event generators via $χ^2$ calculations using a covariance matrix that accounts for bin-to-bin correlations of systematic uncertainties. These comparisons show a clear discrepancy between the data and each of the tested predictions at forward muon angle and low $Q^2$, indicating a missing suppression of the cross section in current neutrino-nucleus scattering models.

Submitted 18 July, 2023; v1 submitted 24 September, 2021; originally announced September 2021.

Report number: Fermilab PUB-21-455-ND-PPD-SCD

arXiv:2109.05072 [pdf, other]

GPU Algorithms for Efficient Exascale Discretizations

Authors: Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Ryan Bleile, Jed Brown, Jean-Sylvain Camier, Robert Carson, Noel Chalmers, Veselin Dobrev, Yohann Dudouit, Paul Fischer, Ali Karakus, Stefan Kerkemeier, Tzanio Kolev, Yu-Hsiang Lan, Elia Merzari, Misun Min, Malachi Phillips, Thilina Rathnayake, Robert Rieben, Thomas Stitt, Ananias Tomboulides, Stanimire Tomov, Vladimir Tomov, Arturo Vargas, et al. (2 additional authors not shown)

Abstract: In this paper we describe the research and development activities in the Center for Efficient Exascale Discretizations (CEED) within the US Exascale Computing Project, targeting state-of-the-art high-order finite-element algorithms for high-order applications on GPU-accelerated platforms. We discuss the GPU developments in several components of the CEED software stack, including the libCEED, MAGMA, MFEM, libParanumal, and Nek projects. We report performance and capability improvements in several CEED-enabled applications on both NVIDIA and AMD GPU systems.

Submitted 10 September, 2021; originally announced September 2021.

arXiv:2109.04996 [pdf, other]

doi 10.1177/10943420211020803

Efficient Exascale Discretizations: High-Order Finite Element Methods

Authors: Tzanio Kolev, Paul Fischer, Misun Min, Jack Dongarra, Jed Brown, Veselin Dobrev, Tim Warburton, Stanimire Tomov, Mark S. Shephard, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Noel Chalmers, Yohann Dudouit, Ali Karakus, Ian Karlin, Stefan Kerkemeier, Yu-Hsiang Lan, David Medina, Elia Merzari, Aleksandr Obabko, Will Pazner, Thilina Rathnayake, Cameron W. Smith, et al. (5 additional authors not shown)

Abstract: Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point operations to energy intensive data movement. One of the few viable approaches to achieve high efficiency in the area of PDE discretizations on unstructured grids is to use matrix-free/partially-assembled high-order finite element methods, since these methods can increase the accuracy and/or lower the computational time due to reduced data motion. In this paper we provide an overview of the research and development activities in the Center for Efficient Exascale Discretizations (CEED), a co-design center in the Exascale Computing Project that is focused on the development of next-generation discretization software and algorithms to enable a wide range of finite element applications to run efficiently on future hardware. CEED is a research partnership involving more than 30 computational scientists from two US national labs and five universities, including members of the Nek5000, MFEM, MAGMA and PETSc projects. We discuss the CEED co-design activities based on targeted benchmarks, miniapps and discretization libraries and our work on performance optimizations for large-scale GPU architectures. We also provide a broad overview of research and development activities in areas such as unstructured adaptive mesh refinement algorithms, matrix-free linear solvers, high-order data visualization, and list examples of collaborations with several ECP and external applications.

Submitted 10 September, 2021; originally announced September 2021.

Comments: 22 pages, 18 figures

arXiv:2108.08381 [pdf, other]

A Local Discontinuous Galerkin Level Set Reinitialization with Subcell Stabilization on Unstructured Meshes

Authors: Ali Karakus, Noel Chalmers, Tim Warburton

Abstract: In this paper we consider a level set reinitialization technique based on a high-order, local discontinuous Galerkin method on unstructured triangular meshes. A finite volume based subcell stabilization is used to improve the nonlinear stability of the method. Instead of the standard hyperbolic level set reinitialization, the flow of time Eikonal equation is discretized to construct an approximate signed distance function. Using the Eikonal equation removes the regularization parameter in the standard approach which allows more predictable behavior and faster convergence speeds around the interface. This makes our approach very efficient especially for banded level set formulations. A set of numerical experiments including both smooth and non-smooth interfaces indicate that the method experimentally achieves design order accuracy.

Submitted 18 August, 2021; originally announced August 2021.

Comments: 19 pages, 10 figures

arXiv:2108.08219 [pdf, other]

doi 10.1103/PhysRevD.106.032004

An Improved Measurement of Neutrino Oscillation Parameters by the NOvA Experiment

Authors: M. A. Acero, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, R. Bernstein, V. Bhatnagar, D. Bhattarai, B. Bhuyan, J. Bian, J. Blair, A. C. Booth, R. Bowles, C. Bromberg, et al. (180 additional authors not shown)

Abstract: We present new $ν_μ\rightarrowν_e$, $ν_μ\rightarrowν_μ$, $\overlineν_μ\rightarrow\overlineν_e$, and $\overlineν_μ\rightarrow\overlineν_μ$ oscillation measurements by the NOvA experiment, with a 50% increase in neutrino-mode beam exposure over the previously reported results. The additional data, combined with previously published neutrino and antineutrino data, are all analyzed using improved techniques and simulations. A joint fit to the $ν_e$, $ν_μ$, $\overlineν_e$, and $\overlineν_μ$ candidate samples within the 3-flavor neutrino oscillation framework continues to yield a best-fit point in the normal mass ordering and the upper octant of the $θ_{23}$ mixing angle, with $Δm^{2}_{32} = (2.41\pm0.07)\times 10^{-3}$ eV$^2$ and $\sin^2θ_{23} = 0.57^{+0.03}_{-0.04}$. The data disfavor combinations of oscillation parameters that give rise to a large asymmetry in the rates of $ν_e$ and $\overlineν_e$ appearance. This includes values of the CP-violating phase in the vicinity of $δ_\text{CP} = π/2$ which are excluded by $>3σ$ for the inverted mass ordering, and values around $δ_\text{CP} = 3π/2$ in the normal ordering which are disfavored at 2$σ$ confidence.

Submitted 8 August, 2022; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: 11 pages, 6 figures. Supplementary material attached (7 figures)

Report number: FERMILAB-PUB-21-373-ND

Journal ref: Phys. Rev. D 106, 032004 (2022)

arXiv:2106.06035 [pdf, other]

doi 10.1103/PhysRevD.104.063024

Extended search for supernova-like neutrinos in NOvA coincident with LIGO/Virgo detections

Authors: M. A. Acero, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair, A. C. Booth, R. Bowles, C. Bromberg, N. Buchanan, et al. (178 additional authors not shown)

Abstract: A search is performed for supernova-like neutrino interactions coincident with 76 gravitational wave events detected by the LIGO/Virgo Collaboration. For 40 of these events, full readout of the time around the gravitational wave is available from the NOvA Far Detector. For these events, we set limits on the fluence of the sum of all neutrino flavors of $F < 7(4)\times 10^{10}\mathrm{cm}^{-2}$ at 90% C.L. assuming energy and time distributions corresponding to the Garching supernova models with masses 9.6(27)$\mathrm{M}_\odot$. Under the hypothesis that any given gravitational wave event was caused by a supernova, this corresponds to a distance of $r > 29(50)$kpc at 90% C.L. Weaker limits are set for other gravitational wave events with partial Far Detector data and/or Near Detector data.

Submitted 23 August, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

Comments: 10 pages, 2 figures

Report number: FERMILAB-PUB-21-276-ND

Journal ref: Phys. Rev. D 104, 063024 (2021)

arXiv:2106.04673 [pdf, other]

doi 10.1103/PhysRevLett.127.201801

Search for active-sterile antineutrino mixing using neutral-current interactions with the NOvA experiment

Authors: M. A. Acero, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair, A. C. Booth, R. Bowles, C. Bromberg, N. Buchanan, et al. (174 additional authors not shown)

Abstract: This Letter reports results from the first long-baseline search for sterile antineutrinos mixing in an accelerator-based antineutrino-dominated beam. The rate of neutral-current interactions in the two NOvA detectors, at distances of 1 km and 810 km from the beam source, is analyzed using an exposure of $12.51\times10^{20}$ protons-on-target from the NuMI beam at Fermilab running in antineutrino mode. A total of $121$ neutral-current candidates are observed at the Far Detector, compared to a prediction of $122\pm11$(stat.)$\pm15$(syst.) assuming mixing between three active flavors. No evidence for $\barν_μ\rightarrow\barν_{s}$ oscillation is observed. Interpreting this result within a 3+1 model, constraints are placed on the mixing angles $θ_{24} < 25^{\circ}$ and $θ_{34} < 32^{\circ}$ at the 90% C.L. for $0.05$eV$^{2} \leq Δm^{2}_{41} \leq 0.5$eV$^{2}$, the range of mass splittings that produces no significant oscillations at the Near Detector. These are the first 3+1 confidence limits set using long-baseline accelerator antineutrinos.

Submitted 30 September, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

Comments: 8 pages, 4 figures

Report number: FERMILAB-PUB-21-271-ND

arXiv:2105.03848 [pdf, other]

doi 10.1103/PhysRevD.104.012014

Seasonal Variation of Multiple-Muon Cosmic Ray Air Showers Observed in the NOvA Detector on the Surface

Authors: M. A. Acero, P. Adamson, L. Aliaga, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair, A. C. Booth, R. Bowles, C. Bromberg, N. Buchanan , et al. (172 additional authors not shown)

Abstract: We report the rate of cosmic ray air showers with multiplicities exceeding 15 muon tracks recorded in the NOvA Far Detector between May 2016 and May 2018. The detector is located on the surface under an overburden of 3.6 meters water equivalent. We observe a seasonal dependence in the rate of multiple-muon showers, which varies in magnitude with multiplicity and zenith angle. During this period, the effective atmospheric temperature and surface pressure ranged from 210 K to 230 K and from 940 mbar to 990 mbar, respectively; the shower rates are anti-correlated with the variation in the effective temperature. The variations are about 30% larger for the highest multiplicities than the lowest multiplicities and 20% larger for showers near the horizon than vertical showers.

Submitted 13 July, 2021; v1 submitted 9 May, 2021; originally announced May 2021.

Report number: FERMILAB-PUB-21-224-ND

Journal ref: Phys. Rev. D 104, 012014 (2021)

arXiv:2104.05829 [pdf, other]

NekRS, a GPU-Accelerated Spectral Element Navier-Stokes Solver

Authors: Paul Fischer, Stefan Kerkemeier, Misun Min, Yu-Hsiang Lan, Malachi Phillips, Thilina Rathnayake, Elia Merzari, Ananias Tomboulides, Ali Karakus, Noel Chalmers, Tim Warburton

Abstract: The development of NekRS, a GPU-oriented thermal-fluids simulation code based on the spectral element method (SEM) is described. For performance portability, the code is based on the open concurrent compute abstraction and leverages scalable developments in the SEM code Nek5000 and in libParanumal, which is a library of high-performance kernels for high-order discretizations and PDE-based miniapps. Critical performance sections of the Navier-Stokes time advancement are addressed. Performance results on several platforms are presented, including scaling to 27,648 V100s on OLCF Summit, for calculations of up to 60B gridpoints.

Submitted 12 April, 2021; originally announced April 2021.

Comments: 14 pages, 8 figures

MSC Class: 35-04 ACM Class: D.0; F.2; G.2; G.4; I.6

arXiv:2011.11089 [pdf, other]

Entropy stable modal discontinuous Galerkin schemes and wall boundary conditions for the compressible Navier-Stokes equations

Authors: Jesse Chan, Yimin Lin, Tim Warburton

Abstract: Entropy stable schemes ensure that physically meaningful numerical solutions also satisfy a semi-discrete entropy inequality under appropriate boundary conditions. In this work, we describe a discretization of viscous terms in the compressible Navier-Stokes equations which enables a simple and explicit imposition of entropy stable no-slip (adiabatic and isothermal) and reflective (symmetry) wall boundary conditions for discontinuous Galerkin (DG) discretizations. Numerical results confirm the robustness and accuracy of the proposed approaches.

Submitted 22 November, 2020; originally announced November 2020.

arXiv:2009.10917 [pdf, ps, other]

Portable high-order finite element kernels I: Streaming Operations

Authors: Noel Chalmers, Tim Warburton

Abstract: This paper is devoted to the development of highly efficient kernels performing vector operations relevant in linear system solvers. In particular, we focus on the low arithmetic intensity operations (i.e., streaming operations) performed within the conjugate gradient iterative method, using the parameters specified in the CEED benchmark problems for high-order hexahedral finite elements. We propose a suite of new Benchmark Streaming tests to focus on the distinct streaming operations which must be performed. We implemented these new tests using the OCCA abstraction framework to demonstrate portability of these streaming operations on different GPU architectures, and propose a simple performance model for such kernels which can accurately capture data movement rates as well as kernel launch costs.

Submitted 22 September, 2020; originally announced September 2020.

arXiv:2009.10863 [pdf, other]

Initial Guesses for Sequences of Linear Systems in a GPU-Accelerated Incompressible Flow Solver

Authors: Anthony P. Austin, Noel Chalmers, Tim Warburton

Abstract: We consider several methods for generating initial guesses when iteratively solving sequences of linear systems, showing that they can be implemented efficiently in GPU-accelerated PDE solvers, specifically solvers for incompressible flow. We propose new initial guess methods based on stabilized polynomial extrapolation and compare them to the projection method of Fischer [15], showing that they are generally competitive with projection schemes despite requiring only half the storage and performing considerably less data movement and communication. Our implementations of these algorithms are freely available as part of the libParanumal collection of GPU-accelerated flow solvers.

Submitted 22 September, 2020; originally announced September 2020.

Comments: 28 pages, 5 figures

MSC Class: 65F10; 65M22

arXiv:2009.04867 [pdf, other]

doi 10.1103/PhysRevD.103.012007

Search for Slow Magnetic Monopoles with the NOvA Detector on the Surface

Authors: NOvA Collaboration, M. A. Acero, P. Adamson, L. Aliaga, T. Alion, V. Allakhverdian, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair , et al. (174 additional authors not shown)

Abstract: We report a search for a magnetic monopole component of the cosmic-ray flux in a 95-day exposure of the NOvA experiment's Far Detector, a 14 kt segmented liquid scintillator detector designed primarily to observe GeV-scale electron neutrinos. No events consistent with monopoles were observed, setting an upper limit on the flux of $2\times 10^{-14} \mathrm{cm^{-2}s^{-1}sr^{-1}}$ at 90% C.L. for monopole speed $6\times 10^{-4} < β < 5\times 10^{-3}$ and mass greater than $5\times 10^{8}$ GeV. Because of NOvA's small overburden of 3 meters water equivalent, this constraint covers a previously unexplored low-mass region.

Submitted 5 January, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 8 pages, 7 figures

Report number: FERMILAB-PUB-20-472-ND

Journal ref: Phys. Rev. D 103, 012007 (2021)

arXiv:2006.08727 [pdf, other]

doi 10.1140/epjc/s10052-020-08577-5

Adjusting Neutrino Interaction Models and Evaluating Uncertainties using NOvA Near Detector Data

Authors: NOvA Collaboration, M. A. Acero, P. Adamson, G. Agam, L. Aliaga, T. Alion, V. Allakhverdian, N. Anfimov, A. Antoshkin, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair , et al. (170 additional authors not shown)

Abstract: The two-detector design of the NOvA neutrino oscillation experiment, in which two functionally identical detectors are exposed to an intense neutrino beam, aids in canceling leading order effects of cross-section uncertainties. However, limited knowledge of neutrino interaction cross sections still gives rise to some of the largest systematic uncertainties in current oscillation measurements. We show contemporary models of neutrino interactions to be discrepant with data from NOvA, consistent with discrepancies seen in other experiments. Adjustments to neutrino interaction models in GENIE that improve agreement with our data are presented. We also describe systematic uncertainties on these models, including uncertainties on multi-nucleon interactions from a newly developed procedure using NOvA near detector data.

Submitted 10 December, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

Comments: Code implementing adjustments to GENIE 2.12.2 described in this paper is available at https://github.com/novaexperiment/NOvARwgt-public

Report number: FERMILAB-PUB-20-243-ND

Journal ref: Eur. Phys. J. C 80, 1119 (2020)

arXiv:2005.07155 [pdf, other]

doi 10.1088/1475-7516/2020/10/014

Supernova neutrino detection in NOvA

Authors: NOvA Collaboration, M. A. Acero, P. Adamson, G. Agam, L. Aliaga, T. Alion, V. Allakhverdian, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian , et al. (177 additional authors not shown)

Abstract: The NOvA long-baseline neutrino experiment uses a pair of large, segmented, liquid-scintillator calorimeters to study neutrino oscillations, using GeV-scale neutrinos from the Fermilab NuMI beam. These detectors are also sensitive to the flux of neutrinos which are emitted during a core-collapse supernova through inverse beta decay interactions on carbon at energies of $\mathcal{O}(10~\text{MeV})$. This signature provides a means to study the dominant mode of energy release for a core-collapse supernova occurring in our galaxy. We describe the data-driven software trigger system developed and employed by the NOvA experiment to identify and record neutrino data from nearby galactic supernovae. This technique has been used by NOvA to self-trigger on potential core-collapse supernovae in our galaxy, with an estimated sensitivity reaching out to 10 kpc distance while achieving a detection efficiency of 23% to 49% for supernovae from progenitor stars with masses of 9.6M$_\odot$ to 27M$_\odot$, respectively.

Submitted 29 July, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

Comments: 30 pages, 17 figures

Report number: FERMILAB-PUB-20-201-E

Journal ref: JCAP 10 (2020) 014

arXiv:2004.06722 [pdf, other]

Scalability of High-Performance PDE Solvers

Authors: Paul Fischer, Misun Min, Thilina Rathnayake, Som Dutta, Tzanio Kolev, Veselin Dobrev, Jean-Sylvain Camier, Martin Kronbichler, Tim Warburton, Kasia Swirydowicz, Jed Brown

Abstract: Performance tests and analyses are critical to effective HPC software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this paper, we explore performance and space-time trade-offs for important compute-intensive kernels of large-scale numerical solvers for PDEs that govern a wide range of physical applications. We consider a sequence of PDE-motivated bake-off problems designed to establish best practices for efficient high-order simulations across a variety of codes and platforms. We measure peak performance (degrees of freedom per second) on a fixed number of nodes and identify effective code optimization strategies for each architecture. In addition to peak performance, we identify the minimum time to solution at 80% parallel efficiency. The performance analysis is based on spectral and p-type finite elements but is equally applicable to a broad spectrum of numerical PDE discretizations, including finite difference, finite volume, and h-type finite elements.

Submitted 14 April, 2020; originally announced April 2020.

Comments: 25 pages, 54 figures

MSC Class: 35-04 ACM Class: D.0; F.2; G.2; G.4; I.6

arXiv:2001.07240 [pdf, ps, other]

doi 10.1103/PhysRevD.101.112006

Search for multi-messenger signals in NOvA coincident with LIGO/Virgo detections

Authors: NOvA Collaboration, M. A. Acero, P. Adamson, L. Aliaga, T. Alion, V. Allakhverdian, N. Anfimov, A. Antoshkin, L. Asquith, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair, A. C. Booth , et al. (155 additional authors not shown)

Abstract: Using the NOvA neutrino detectors, a broad search has been performed for any signal coincident with 28 gravitational wave events detected by the LIGO/Virgo Collaboration between September 2015 and July 2019. For all of these events, NOvA is sensitive to possible arrival of neutrinos and cosmic rays of GeV and higher energies. For five (seven) events in the NOvA Far (Near) Detector, timely public alerts from the LIGO/Virgo Collaboration allowed recording of MeV-scale events. No signal candidates were found.

Submitted 20 April, 2021; v1 submitted 20 January, 2020; originally announced January 2020.

Comments: 11 pages, 6 figures; Corrected fluence limits

Report number: FERMILAB-PUB-20-018-ND

Journal ref: Phys. Rev. D 101, 112006 (2020)

arXiv:1912.08739 [pdf, other]

doi 10.1088/1748-0221/15/03/P03035

Design and performance of a 35-ton liquid argon time projection chamber as a prototype for future very large detectors

Authors: D. L. Adams, M. Baird, G. Barr, N. Barros, A. Blake, E. Blaufuss, A. Booth, D. Brailsford, N. Buchanan, B. Carls, H. Chen, M. Convery, G. De Geronimo, T. Dealtry, R. Dharmapalan, Z. Djurcic, J. Fowler, S. Glavin, R. A. Gomes, M. C. Goodman, M. Graham, L. Greenler, A. Hahn, J. Hartnell, R. Herbst , et al. (49 additional authors not shown)

Abstract: Liquid argon time projection chamber technology is an attractive choice for large neutrino detectors, as it provides a high-resolution active target and it is expected to be scalable to very large masses. Consequently, it has been chosen as the technology for the first module of the DUNE far detector. However, the fiducial mass required for "far detectors" of the next generation of neutrino oscillation experiments far exceeds what has been demonstrated so far. Scaling to this larger mass, as well as the requirement for underground construction, places a number of additional constraints on the design. A prototype 35-ton cryostat was built at Fermi National Accelerator Laboratory to test the functionality of the components foreseen to be used in a very large far detector. The Phase I run, completed in early 2014, demonstrated that liquid argon could be maintained at sufficient purity in a membrane cryostat. A time projection chamber was installed for the Phase II run, which collected data in February and March of 2016. The Phase II run was a test of the modular anode plane assemblies with wrapped wires, cold readout electronics, and integrated photon detection systems. While the details of the design do not match exactly those chosen for the DUNE far detector, the 35-ton TPC prototype is a demonstration of the functionality of the basic components. Measurements are performed using the Phase II data to extract signal and noise characteristics and to align the detector components. A measurement of the electron lifetime is presented, and a novel technique for measuring a track's position based on pulse properties is described.

Submitted 2 March, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

Comments: 28 pages, 12 figures, accepted by JINST

arXiv:1906.04907 [pdf, other]

doi 10.1103/PhysRevLett.123.151803

First measurement of neutrino oscillation parameters using neutrinos and antineutrinos by NOvA

Authors: M. A. Acero, P. Adamson, L. Aliaga, T. Alion, V. Allakhverdian, S. Altakarli, N. Anfimov, A. Antoshkin, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, T. Blackburn, J. Blair, A. C. Booth , et al. (174 additional authors not shown)

Abstract: The NOvA experiment has made a $4.4σ$-significant observation of $\bar{ν}_{e}$ appearance in a 2 GeV $\bar{ν}_{μ}$ beam at a distance of 810 km. Using $12.33\times10^{20}$ protons on target delivered to the Fermilab NuMI neutrino beamline, the experiment recorded 27 $\bar{ν}_{μ} \rightarrow \bar{ν}_{e}$ candidates with a background of 10.3 and 102 $\bar{ν}_{μ} \rightarrow \bar{ν}_{μ}$ candidates. This new antineutrino data is combined with neutrino data to measure the oscillation parameters $|Δm^2_{32}| = 2.48^{+0.11}_{-0.06}\times10^{-3}$ eV$^2/c^4$, $\sin^2 θ_{23} = 0.56^{+0.04}_{-0.03}$ in the normal neutrino mass hierarchy and upper octant and excludes most values near $δ_{\rm CP}=π/2$ for the inverted mass hierarchy by more than 3$σ$. The data favor the normal neutrino mass hierarchy by 1.9$σ$ and $θ_{23}$ values in the upper octant by 1.6$σ$.

Submitted 14 June, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

Comments: 8 pages, 3 figures. Supplementary material attached (6 figures). To view attachments, please download and extract the gzipped tar source file listed under "Other formats". Fixed supplementary material to include just the compiled pdf not the Latex Source

Journal ref: Phys. Rev. Lett. 123, 151803 (2019)

arXiv:1904.12975 [pdf, other]

doi 10.1103/PhysRevD.99.122004

Observation of seasonal variation of atmospheric multiple-muon events in the NOvA Near Detector

Authors: M. A. Acero, P. Adamson, L. Aliaga, T. Alion, V. Allakhverdian, S. Altakarli, N. Anfimov, A. Antoshkin, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Bashar, K. Bays, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair, A. C. Booth, P. Bour , et al. (166 additional authors not shown)

Abstract: Using two years of data from the NOvA Near Detector at Fermilab, we report a seasonal variation of cosmic ray induced multiple-muon event rates which has an opposite phase to the seasonal variation in the atmospheric temperature. The strength of the seasonal multiple-muon variation is shown to increase as a function of the muon multiplicity. However, no significant dependence of the strength of the seasonal variation of the multiple-muon variation is seen as a function of the muon zenith angle, or the spatial or angular separation between the correlated muons.

Submitted 8 July, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

Journal ref: Phys. Rev. D 99, 122004 (2019)

arXiv:1902.00558 [pdf, ps, other]

doi 10.1103/PhysRevD.102.012004

Measurement of Neutrino-Induced Neutral-Current Coherent $π^0$ Production in the NOvA Near Detector

Authors: M. A. Acero, P. Adamson, L. Aliaga, T. Alion, V. Allakhverdian, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, A. Aurisano, A. Back, C. Backhouse, M. Baird, N. Balashov, P. Baldi, B. A. Bambah, S. Basher, K. Bays, B. Behera, S. Bending, R. Bernstein, V. Bhatnagar, B. Bhuyan, J. Bian, J. Blair, A. C. Booth , et al. (166 additional authors not shown)

Abstract: The cross section of neutrino-induced neutral-current coherent $π^0$ production on a carbon-dominated target is measured in the NOvA near detector. This measurement uses a narrow-band neutrino beam with an average neutrino energy of 2.7 GeV, which is of interest to ongoing and future long-baseline neutrino oscillation experiments. The measured, flux-averaged cross section is $σ = 13.8\pm0.9 (\text{stat})\pm2.3 (\text{syst}) \times 10^{-40}\,\text{cm}^2/\text{nucleus}$, consistent with model prediction. This result is the most precise measurement of neutral-current coherent $π^0$ production in the few-GeV neutrino energy region.

Submitted 9 July, 2020; v1 submitted 1 February, 2019; originally announced February 2019.

Report number: FERMILAB-PUB-19-047-ND

Journal ref: Phys. Rev. D 102, 012004 (2020)

arXiv:1808.10481 [pdf, other]

Leapfrog time-stepping for Hermite methods

Authors: Arturo Vargas, Thomas Hagstrom, Jesse Chan, Tim Warburton

Abstract: We introduce Hermite-leapfrog methods for first order wave systems. The new Hermite-leapfrog methods pair leapfrog time-stepping with the Hermite methods of Goodrich and co-authors. The new schemes stagger field variables in both time and space and are high-order accurate. We provide a detailed description of the method and demonstrate that the method conserves variable quantities in one space dimension. Higher dimensional versions of the method are constructed via a tensor product construction. Numerical evidence and rigorous analysis in one space dimension establish stability and high-order convergence. Experiments demonstrating efficient implementations on a graphics processing unit are also presented.

Submitted 30 August, 2018; originally announced August 2018.

Comments: Submitted to Journal of Scientific Computing

arXiv:1805.02082 [pdf, other]

doi 10.1016/j.jcp.2019.03.050

Discontinuous Galerkin Discretizations of the Boltzmann Equations in 2D: semi-analytic time stepping and absorbing boundary layers

Authors: A. Karakus, N. Chalmers, J. S. Hesthaven, T. Warburton

Abstract: We present an efficient nodal discontinuous Galerkin method for approximating nearly incompressible flows using the Boltzmann equations. The equations are discretized with Hermite polynomials in velocity space yielding a first order conservation law. A stabilized unsplit perfectly matched layer (PML) formulation is introduced for the resulting nonlinear flow equations. The proposed PML equations exponentially absorb the difference between the nonlinear fluctuation and the prescribed mean flow. We introduce semi-analytic time discretization methods to improve the time step restrictions in small relaxation times. We also introduce a multirate semi-analytic Adams-Bashforth method which preserves efficiency in stiff regimes. Accuracy and performance of the method are tested using distinct cases including isothermal vortex, flow around square cylinder, and wall mounted square cylinder test cases.

Submitted 5 May, 2018; originally announced May 2018.

Comments: 37 pages, 11 figures

arXiv:1804.02221 [pdf, other]

doi 10.1016/j.jcp.2018.08.038

An entropy stable discontinuous Galerkin method for the shallow water equations on curvilinear meshes with wet/dry fronts accelerated by GPUs

Authors: Niklas Wintermeyer, Andrew R. Winters, Gregor J. Gassner, Timothy Warburton

Abstract: We extend the entropy stable high order nodal discontinuous Galerkin spectral element approximation for the non-linear two dimensional shallow water equations presented by Wintermeyer et al. [N. Wintermeyer, A. R. Winters, G. J. Gassner, and D. A. Kopriva. An entropy stable nodal discontinuous Galerkin method for the two dimensional shallow water equations on unstructured curvilinear meshes with discontinuous bathymetry. Journal of Computational Physics, 340:200-242, 2017] with a shock capturing technique and a positivity preservation capability to handle dry areas. The scheme preserves the entropy inequality, is well-balanced and works on unstructured, possibly curved, quadrilateral meshes. For the shock capturing, we introduce an artificial viscosity to the equations and prove that the numerical scheme remains entropy stable. We add a positivity preserving limiter to guarantee non-negative water heights as long as the mean water height is non-negative. We prove that non-negative mean water heights are guaranteed under a certain additional time step restriction for the entropy stable numerical interface flux. We implement the method on GPU architectures using the abstract language OCCA, a unified approach to multi-threading languages. We show that the entropy stable scheme is well suited to GPUs as the necessary extra calculations do not negatively impact the runtime up to reasonably high polynomial degrees (around $N=7$). We provide numerical examples that challenge the shock capturing and positivity properties of our scheme to verify our theoretical findings.

Submitted 6 April, 2018; originally announced April 2018.

arXiv:1803.06379 [pdf, other]

doi 10.1088/1748-0221/13/06/P06022

Photon detector system timing performance in the DUNE 35-ton prototype liquid argon time projection chamber

Authors: D. L. Adams, T. Alion, J. T. Anderson, L. Bagby, M. Baird, G. Barr, N. Barros, K. Biery, A. Blake, E. Blaufuss, T. Boone, A. Booth, D. Brailsford, N. Buchanan, A. Chatterjee, M. Convery, J. Davies, T. Dealtry, P. DeLurgio, G. Deuerling, R. Dharmapalan, Z. Djurcic, G. Drake, B. Eberly, J. Freeman , et al. (53 additional authors not shown)

Abstract: The 35-ton prototype for the Deep Underground Neutrino Experiment far detector was a single-phase liquid argon time projection chamber with an integrated photon detector system, all situated inside a membrane cryostat. The detector took cosmic-ray data for six weeks during the period of February 1, 2016 to March 12, 2016. The performance of the photon detection system was checked with these data. An installed photon detector was demonstrated to measure the arrival times of cosmic-ray muons with a resolution better than 32 ns, limited by the timing of the trigger system. A measurement of the timing resolution using closely-spaced calibration pulses yielded a resolution of 15 ns for pulses at a level of 6 photo-electrons. Scintillation light from cosmic-ray muons was observed to be attenuated with increasing distance with a characteristic length of $155 \pm 28$ cm.

Submitted 5 June, 2018; v1 submitted 16 March, 2018; originally announced March 2018.

Comments: 17 pages, 8 figures. Submitted to JINST

arXiv:1801.00246 [pdf, other]

A GPU Accelerated Discontinuous Galerkin Incompressible Flow Solver

Authors: Ali Karakus, Noel Chalmers, Kasia Swirydowicz, Timothy Warburton

Abstract: We present a GPU-accelerated version of a high-order discontinuous Galerkin discretization of the unsteady incompressible Navier-Stokes equations. The equations are discretized in time using a semi-implicit scheme with explicit treatment of the nonlinear term and implicit treatment of the split Stokes operators. The pressure system is solved with a conjugate gradient method together with a fully GPU-accelerated multigrid preconditioner which is designed to minimize memory requirements and to increase overall performance. A semi-Lagrangian subcycling advection algorithm is used to shift the computational load per timestep away from the pressure Poisson solve by allowing larger timestep sizes in exchange for an increased number of advection steps. Numerical results confirm we achieve the design order accuracy in time and space. We optimize the performance of the most time-consuming kernels by tuning the fine-grain parallelism, memory utilization, and maximizing bandwidth. To assess overall performance we present an empirically calibrated roofline performance model for a target GPU to explain the achieved efficiency. We demonstrate that, in most cases, the kernels used in the solver are close to their empirically predicted roofline performance.

Submitted 7 May, 2018; v1 submitted 31 December, 2017; originally announced January 2018.

Comments: 33 pages, 10 figures

arXiv:1711.00903 [pdf, other]

Acceleration of tensor-product operations for high-order finite element methods

Authors: Kasia Świrydowicz, Noel Chalmers, Ali Karakus, Timothy Warburton

Abstract: This paper is devoted to GPU kernel optimization and performance analysis of three tensor-product operators arising in finite element methods. We provide a mathematical background to these operations and implementation details. Achieving close-to-the-peak performance for these operators requires extensive optimization because of the operators' properties: low arithmetic intensity, tiered structure, and the need to store intermediate results inside the kernel. We give a guided overview of optimization strategies and we present a performance model that allows us to compare the efficacy of these optimizations against an empirically calibrated roofline.

Submitted 13 November, 2017; v1 submitted 2 November, 2017; originally announced November 2017.

Comments: 31 pages, 11 figures

arXiv:1706.07081 [pdf, other]

The Single-Phase ProtoDUNE Technical Design Report

Authors: B. Abi, R. Acciarri, M. A. Acero, M. Adamowski, C. Adams, D. L. Adams, P. Adamson, M. Adinolfi, Z. Ahmad, C. H. Albright, T. Alion, J. Anderson, K. Anderson, C. Andreopoulos, M. P. Andrews, R. A. Andrews, J. dos Anjos, A. Ankowski, J. Anthony, M. Antonello, A. Aranda Fernandez, A. Ariga, T. Ariga, E. Arrieta Diaz, J. Asaadi , et al. (806 additional authors not shown)

Abstract: ProtoDUNE-SP is the single-phase DUNE Far Detector prototype that is under construction and will be operated at the CERN Neutrino Platform (NP) starting in 2018. ProtoDUNE-SP, a crucial part of the DUNE effort towards the construction of the first DUNE 10-kt fiducial mass far detector module (17 kt total LAr mass), is a significant experiment in its own right. With a total liquid argon (LAr) mass of 0.77 kt, it represents the largest monolithic single-phase LArTPC detector to be built to date. Its technical design is given in this report.

Submitted 27 July, 2017; v1 submitted 21 June, 2017; originally announced June 2017.

Comments: 165 pages, fix references, author list and minor numbers

arXiv:1702.04316 [pdf, ps, other]

Acceleration of the Implicit-Explicit Non-hydrostatic Unified Model of the Atmosphere (NUMA) on Manycore Processors

Authors: Daniel S. Abdi, Francis X. Giraldo, Emil M. Constantinescu, Lester E. Carr III, Lucas C. Wilcox, Timothy C. Warburton

Abstract: We present the acceleration of an IMplicit-EXplicit (IMEX) non-hydrostatic atmospheric model on manycore processors such as GPUs and Intel's MIC architecture. IMEX time integration methods sidestep the constraint imposed by the Courant-Friedrichs-Lewy condition on explicit methods through corrective implicit solves within each time step. In this work, we implement and evaluate the performance of IMEX on manycore processors relative to explicit methods. Using 3D-IMEX at Courant number C=15, we obtained a speedup of about 4X relative to an explicit time stepping method run with the maximum allowable C=1. In addition, we demonstrate a much larger speedup of 100X at C=150 using 1D-IMEX due to the unconditional stability of the method in the vertical direction. Several improvements on the IMEX procedure were necessary in order to outperform our results with explicit methods: a) reducing the number of degrees of freedom of the IMEX formulation by forming the Schur complement; b) formulating a horizontally-explicit vertically-implicit (HEVI) 1D-IMEX scheme that has a lower workload and potentially better scalability than 3D-IMEX; c) using high-order polynomial preconditioners to reduce the condition number of the resulting system; d) using a direct solver for the 1D-IMEX method by performing and storing LU factorizations once to obtain a constant cost for any Courant number. Without all of these improvements, explicit time integration methods turned out to be difficult to beat. We discuss in detail the IMEX infrastructure required for formulating and implementing efficient methods on manycore processors. Finally, we validate our results with standard benchmark problems in NWP and evaluate the performance and scalability of the IMEX method using up to 4192 GPUs and 16 Knights Landing processors.

Submitted 13 February, 2017; originally announced February 2017.

arXiv:1612.06124 [pdf, other]

doi 10.1088/1748-0221/12/03/P03014

Cryogenic CMOS Cameras for High Voltage Monitoring in Liquid Argon

Authors: Nicola McConkey, Neil Spooner, Matthew Thiesse, Michael Wallbank, Thomas Karl Warburton

Abstract: The prevalent use of large volume liquid argon detectors strongly motivates the development of novel readout and monitoring technology which functions at cryogenic temperatures. This paper presents the development of a cryogenic CMOS camera system suitable for use inside a large volume liquid argon detector for online monitoring purposes. The characterisation of the system is described in detail. The reliability of such a camera system has been demonstrated over several months, and recent data from operation within the liquid argon region of the DUNE 35t cryostat is presented. The cameras were used to monitor for high voltage breakdown inside the cryostat, with capability to observe breakdown of a liquid argon time projection chamber in situ. They were also used for detector monitoring, especially of components during cooldown.

Submitted 19 December, 2016; originally announced December 2016.

Comments: to be submitted to JINST

arXiv:1611.00102 [pdf, other]

On the penalty stabilization mechanism for upwind discontinuous Galerkin formulations of first order hyperbolic systems

Authors: Jesse Chan, T. Warburton

Abstract: Penalty fluxes are dissipative numerical fluxes for high order discontinuous Galerkin (DG) methods which depend on a penalization parameter. We investigate the dependence of the spectra of high order DG discretizations on this parameter, and show that as its value increases, the spectra of the DG discretization split into two disjoint sets of eigenvalues. One set converges to the eigenvalues of a conforming discretization, while the other set corresponds to spurious eigenvalues which are damped proportionally to the parameter. Numerical experiments also demonstrate that undamped spurious modes present in both the limit of zero and large penalization parameters are damped for moderate values of the upwind parameter.

Submitted 20 October, 2017; v1 submitted 31 October, 2016; originally announced November 2016.

Comments: In CAMWA

arXiv:1610.05023 [pdf, other]

A GPU-accelerated nodal discontinuous Galerkin method with high-order absorbing boundary conditions and corner/edge compatibility

Authors: Axel Modave, Andreas Atle, Jesse Chan, Tim Warburton

Abstract: Discontinuous Galerkin finite element schemes exhibit attractive features for accurate large-scale wave-propagation simulations on modern parallel architectures. For many applications, these schemes must be coupled with non-reflective boundary treatments to limit the size of the computational domain without losing accuracy or computational efficiency, which remains a challenging task. In this paper, we present a combination of a nodal discontinuous Galerkin method with high-order absorbing boundary conditions (HABCs) for cuboidal computational domains. Compatibility conditions are derived for HABCs intersecting at the edges and the corners of a cuboidal domain. We propose a GPU implementation of the computational procedure, which results in a multidimensional solver with equations to be solved on 0D, 1D, 2D and 3D spatial regions. Numerical results demonstrate both the accuracy and the computational efficiency of our approach.

Submitted 27 February, 2017; v1 submitted 17 October, 2016; originally announced October 2016.

arXiv:1609.09841 [pdf, ps, other]

GPU Acceleration of Hermite Methods for the Simulation of Wave Propagation

Authors: Arturo Vargas, Jesse Chan, Thomas Hagstrom, Timothy Warburton

Abstract: The Hermite methods of Goodrich, Hagstrom, and Lorenz (2006) use Hermite interpolation to construct high order numerical methods for hyperbolic initial value problems. The structure of the method has several favorable features for parallel computing. In this work, we propose algorithms that take advantage of the many-core architecture of Graphics Processing Units. The algorithm exploits the compact stencil of Hermite methods and uses data structures that allow for efficient data loads and stores. Additionally, the highly localized evolution operator of Hermite methods allows us to combine multi-stage time-stepping methods within the new algorithms incurring minimal accesses of global memory. Using a scalar linear wave equation, we study the algorithm by considering Hermite interpolation and evolution as individual kernels and alternatively combining them into a monolithic kernel. For both approaches we demonstrate strategies to increase performance. Our numerical experiments show that although a two kernel approach allows for better performance on the hardware, a monolithic kernel can offer a comparable time to solution with less global memory usage.

Submitted 30 September, 2016; originally announced September 2016.

Comments: 12 pages. Submitted to ICOSAHOM 2016 proceedings

arXiv:1608.03836 [pdf, other]

Weight-adjusted discontinuous Galerkin methods: curvilinear meshes

Authors: Jesse Chan, Russell J. Hewett, T. Warburton

Abstract: Traditional time-domain discontinuous Galerkin (DG) methods result in large storage costs at high orders of approximation due to the storage of dense elemental matrices. In this work, we propose weight-adjusted DG (WADG) methods for curvilinear meshes which reduce storage costs while retaining energy stability. A priori error estimates show that high order accuracy is preserved under sufficient conditions on the mesh, which are illustrated through convergence tests with different sequences of meshes. Numerical and computational experiments verify the accuracy and performance of WADG for a model problem on curved domains.

Submitted 12 August, 2016; originally announced August 2016.

Comments: Submitted to SISC

arXiv:1608.01944 [pdf, other]

Weight-adjusted discontinuous Galerkin methods: wave propagation in heterogeneous media

Authors: Jesse Chan, Russell J. Hewett, T. Warburton

Abstract: Time-domain discontinuous Galerkin (DG) methods for wave propagation require accounting for the inversion of dense elemental mass matrices, where each mass matrix is computed with respect to a parameter-weighted L2 inner product. In applications where the wavespeed varies spatially at a sub-element scale, these matrices are distinct over each element, necessitating additional storage. In this work, we propose a weight-adjusted DG (WADG) method which reduces storage costs by replacing the weighted L2 inner product with a weight-adjusted inner product. This equivalent inner product results in an energy stable method, but does not increase storage costs for locally varying weights. A-priori error estimates are derived, and numerical examples are given illustrating the application of this method to the acoustic wave equation with heterogeneous wavespeed.

Submitted 1 January, 2017; v1 submitted 5 August, 2016; originally announced August 2016.

Comments: Submitted to SISC

arXiv:1607.03399 [pdf, other]

Reduced storage nodal discontinuous Galerkin methods on semi-structured prismatic meshes

Authors: Jesse Chan, Zheng Wang, Russell J. Hewett, T. Warburton

Abstract: We present a high order time-domain nodal discontinuous Galerkin method for wave problems on hybrid meshes consisting of both wedge and tetrahedral elements. We allow for vertically mapped wedges which can be deformed along the extruded coordinate, and present a simple method for producing quasi-uniform wedge meshes for layered domains. We show that standard mass lumping techniques result in a loss of energy stability on meshes of vertically mapped wedges, and propose an alternative which is both energy stable and efficient. High order convergence is demonstrated, and comparisons are made with existing low-storage methods on wedges. Finally, the computational performance of the method on Graphics Processing Units is evaluated.

Submitted 31 October, 2016; v1 submitted 12 July, 2016; originally announced July 2016.

Comments: Submitted to CAMWA

arXiv:1604.08501 [pdf, ps, other]

doi 10.1145/2935323.2935325

Array Program Transformation with Loo.py by Example: High-Order Finite Elements

Authors: Andreas Klöckner, Lucas C. Wilcox, T. Warburton

Abstract: To concisely and effectively demonstrate the capabilities of our program transformation system Loo.py, we examine a transformation path from two real-world Fortran subroutines as found in a weather model to a single high-performance computational kernel suitable for execution on modern GPU hardware. Along the transformation path, we encounter kernel fusion, vectorization, prefetching, parallelization, and algorithmic changes achieved by mechanized conversion between imperative and functional/substitution-based code, among a number more. We conclude with performance results that demonstrate the effects and support the effectiveness of the applied transformations.

Submitted 13 April, 2016; originally announced April 2016.

ACM Class: D.3.4; D.1.3; G.4

Journal ref: ARRAY 2016 Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming Pages 9-16

arXiv:1602.07997 [pdf, other]

doi 10.1016/j.cageo.2016.03.008

GPU performance analysis of a nodal discontinuous Galerkin method for acoustic and elastic models

Authors: Axel Modave, Amik St-Cyr, Tim Warburton

Abstract: Finite element schemes based on discontinuous Galerkin methods possess features amenable to massively parallel computing accelerated with general purpose graphics processing units (GPUs). However, the computational performance of such schemes strongly depends on their implementation. In the past, several implementation strategies have been proposed. They are based exclusively on specialized compute kernels tuned for each operation, or they can leverage BLAS libraries that provide optimized routines for basic linear algebra operations. In this paper, we present and analyze up-to-date performance results for different implementations, tested in a unified framework on a single NVIDIA GTX980 GPU. We show that specialized kernels written with a one-node-per-thread strategy are competitive for polynomial bases up to the fifth and seventh degrees for acoustic and elastic models, respectively. For higher degrees, a strategy that makes use of the NVIDIA cuBLAS library provides better results, able to reach a net arithmetic throughput 35.7% of the theoretical peak value.

Submitted 25 February, 2016; originally announced February 2016.

Comments: Paper submitted to Computers & Geosciences, 21 pages

arXiv:1601.05471 [pdf, other]

Long-Baseline Neutrino Facility (LBNF) and Deep Underground Neutrino Experiment (DUNE) Conceptual Design Report Volume 1: The LBNF and DUNE Projects

Authors: R. Acciarri, M. A. Acero, M. Adamowski, C. Adams, P. Adamson, S. Adhikari, Z. Ahmad, C. H. Albright, T. Alion, E. Amador, J. Anderson, K. Anderson, C. Andreopoulos, M. Andrews, R. Andrews, I. Anghel, J. d. Anjos, A. Ankowski, M. Antonello, A. ArandaFernandez, A. Ariga, T. Ariga, D. Aristizabal, E. Arrieta-Diaz, K. Aryal , et al. (780 additional authors not shown)

Abstract: This document presents the Conceptual Design Report (CDR) put forward by an international neutrino community to pursue the Deep Underground Neutrino Experiment at the Long-Baseline Neutrino Facility (LBNF/DUNE), a groundbreaking science experiment for long-baseline neutrino oscillation studies and for neutrino astrophysics and nucleon decay searches. The DUNE far detector will be a very large modular liquid argon time-projection chamber (LArTPC) located deep underground, coupled to the LBNF multi-megawatt wide-band neutrino beam. DUNE will also have a high-resolution and high-precision near detector.

Submitted 20 January, 2016; originally announced January 2016.

arXiv:1601.02984 [pdf, other]

Long-Baseline Neutrino Facility (LBNF) and Deep Underground Neutrino Experiment (DUNE) Conceptual Design Report, Volume 4 The DUNE Detectors at LBNF

Authors: R. Acciarri, M. A. Acero, M. Adamowski, C. Adams, P. Adamson, S. Adhikari, Z. Ahmad, C. H. Albright, T. Alion, E. Amador, J. Anderson, K. Anderson, C. Andreopoulos, M. Andrews, R. Andrews, I. Anghel, J. d. Anjos, A. Ankowski, M. Antonello, A. ArandaFernandez, A. Ariga, T. Ariga, D. Aristizabal, E. Arrieta-Diaz, K. Aryal , et al. (779 additional authors not shown)

Abstract: A description of the proposed detector(s) for DUNE at LBNF

Submitted 12 January, 2016; originally announced January 2016.

arXiv:1512.06148 [pdf, other]

Long-Baseline Neutrino Facility (LBNF) and Deep Underground Neutrino Experiment (DUNE) Conceptual Design Report Volume 2: The Physics Program for DUNE at LBNF

Authors: DUNE Collaboration, R. Acciarri, M. A. Acero, M. Adamowski, C. Adams, P. Adamson, S. Adhikari, Z. Ahmad, C. H. Albright, T. Alion, E. Amador, J. Anderson, K. Anderson, C. Andreopoulos, M. Andrews, R. Andrews, I. Anghel, J. d. Anjos, A. Ankowski, M. Antonello, A. ArandaFernandez, A. Ariga, T. Ariga, D. Aristizabal, E. Arrieta-Diaz , et al. (780 additional authors not shown)

Abstract: The Physics Program for the Deep Underground Neutrino Experiment (DUNE) at the Fermilab Long-Baseline Neutrino Facility (LBNF) is described.

Submitted 22 January, 2016; v1 submitted 18 December, 2015; originally announced December 2015.

Showing 1–50 of 68 results for author: Warburton, T