Showing 1–48 of 48 results for author: Kaiser, H

  1. arXiv:2407.00026  [pdf, other]

    cs.DC

    Distributed astrophysics simulations using Octo-Tiger with RISC-V CPUs using HPX and Kokkos

    Authors: Patrick Diehl, Gregor Daiß, Steven R. Brandt, Alireza Kheirkhahan, Srinivas Yadav Singanaboina, Dominic Marcello, Chris Taylor, John Leidel, Hartmut Kaiser

    Abstract: In recent years, interest in RISC-V computing architectures has moved from academic to mainstream, especially in the field of High Performance Computing where energy limitations are increasingly a point of concern. The results presented in this paper are part of a longer-term evaluation of RISC-V's viability for HPC applications. In this work, we use the Octo-Tiger multi-physics, multi-scale, 3D…

    Submitted 10 May, 2024; originally announced July 2024.

  2. arXiv:2405.13101  [pdf, other]

    cs.SE cs.AI

    Evaluating AI-generated code for C++, Fortran, Go, Java, Julia, Matlab, Python, R, and Rust

    Authors: Patrick Diehl, Noujoud Nader, Steve Brandt, Hartmut Kaiser

    Abstract: This study evaluates the capabilities of ChatGPT versions 3.5 and 4 in generating code across a diverse range of programming languages. Our objective is to assess the effectiveness of these AI models for generating scientific programs. To this end, we asked ChatGPT to generate three distinct codes: a simple numerical integration, a conjugate gradient solver, and a parallel 1D stencil-based heat eq…

    Submitted 5 July, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: 9 pages, 3 figures

  3. HPX with Spack and Singularity Containers: Evaluating Overheads for HPX/Kokkos using an astrophysics application

    Authors: Patrick Diehl, Steven R. Brandt, Gregor Daiß, Hartmut Kaiser

    Abstract: Cloud computing for high performance computing resources is an emerging topic. This service is of interest to researchers who care about reproducible computing, for software packages with complex installations, and for companies or researchers who need the compute resources only occasionally or do not want to run and maintain a supercomputer on their own. The connection between HPC and containers…

    Submitted 7 May, 2024; v1 submitted 11 February, 2024; originally announced May 2024.

  4. arXiv:2404.06864  [pdf, other]

    astro-ph.SR

    Hydrodynamic simulations of WD-WD mergers and the origin of RCB stars

    Authors: Sagiv Shiber, Orsola De Marco, Patrick M. Motl, Bradley Munson, Dominic C. Marcello, Juhan Frank, Patrick Diehl, Geoffrey C. Clayton, Bennett N. Skinner, Hartmut Kaiser, Gregor Daiss, Dirk Pfluger, Jan E. Staff

    Abstract: We study the properties of double white dwarf (DWD) mergers by performing hydrodynamic simulations using the new and improved adaptive mesh refinement code Octo-Tiger. We follow the orbital evolution of DWD systems of mass ratio q=0.7 for tens of orbits until and after the merger to investigate them as a possible origin for R Coronae Borealis (RCB) type stars. We reproduce previous results, findin…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 27 pages, Submitted to MNRAS. Comments are welcome

  5. arXiv:2403.04818  [pdf, other]

    cs.LG physics.ao-ph

    Storm Surge Modeling in the AI ERA: Using LSTM-based Machine Learning for Enhancing Forecasting Accuracy

    Authors: Stefanos Giaremis, Noujoud Nader, Clint Dawson, Hartmut Kaiser, Carola Kaiser, Efstratios Nikidis

    Abstract: Physics simulation results of natural processes usually do not fully capture the real world. This is caused for instance by limits in what physical processes are simulated and to what accuracy. In this work we propose and analyze the use of an LSTM-based deep learning network machine learning (ML) architecture for capturing and predicting the behavior of the systemic error for storm surge forecast…

    Submitted 7 March, 2024; originally announced March 2024.

  6. arXiv:2401.03353  [pdf, other]

    cs.DC

    HPX -- An open source C++ Standard Library for Parallelism and Concurrency

    Authors: Thomas Heller, Patrick Diehl, Zachary Byerly, John Biddiscombe, Hartmut Kaiser

    Abstract: To achieve scalability with today's heterogeneous HPC resources, we need a dramatic shift in our thinking; MPI+X is not enough. Asynchronous Many Task (AMT) runtime systems break down the global barriers imposed by the Bulk Synchronous Programming model. HPX is an open-source, C++ Standards compliant AMT runtime system that is developed by a diverse international community of collaborators called…

    Submitted 11 August, 2023; originally announced January 2024.

    Journal ref: Proceedings of OpenSuCo 2017, Denver, Colorado USA, November 2017 (OpenSuCo 17)

  7. Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger

    Authors: Patrick Diehl, Gregor Daiss, Steven R. Brandt, Alireza Kheirkhahan, Hartmut Kaiser, Christopher Taylor, John Leidel

    Abstract: In recent years, computers based on the RISC-V architecture have raised broad interest in the high-performance computing (HPC) community. As the RISC-V community develops the core instruction set architecture (ISA) along with ISA extensions, the HPC community has been actively ensuring HPC applications and environments are supported. In this context, assessing the performance of asynchronous many-…

    Submitted 17 August, 2023; originally announced September 2023.

  8. Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust, Swift, and Java

    Authors: Patrick Diehl, Steven R. Brandt, Max Morris, Nikunj Gupta, Hartmut Kaiser

    Abstract: Many scientific high performance codes that simulate e.g. black holes, coastal waves, climate and weather, etc. rely on block-structured meshes and use finite differencing methods to iteratively solve the appropriate systems of differential equations. In this paper we investigate implementations of an extremely simple simulation of this type using various programming systems and languages. We focu…

    Submitted 10 July, 2023; v1 submitted 18 May, 2023; originally announced July 2023.

  9. Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku

    Authors: Patrick Diehl, Gregor Daiß, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Dirk Pflüger

    Abstract: The increasing availability of machines relying on non-GPU architectures, such as ARM A64FX in high-performance computing, provides a set of interesting challenges to application developers. In addition to requiring code portability across different parallelization schemes, programs targeting these architectures have to be highly adaptable in terms of compute kernel sizes to accommodate different…

    Submitted 15 March, 2023; originally announced April 2023.

  10. Stellar Mergers with HPX-Kokkos and SYCL: Methods of using an Asynchronous Many-Task Runtime System with SYCL

    Authors: Gregor Daiß, Patrick Diehl, Hartmut Kaiser, Dirk Pflüger

    Abstract: Ranging from NVIDIA GPUs to AMD GPUs and Intel GPUs: Given the heterogeneity of available accelerator cards within current supercomputers, portability is a key aspect for modern HPC applications. In Octo-Tiger, we rely on Kokkos and its various execution spaces for portable compute kernels. In turn, we use HPX to coordinate kernel launches, CPU tasks, and communication. This combination allows us…

    Submitted 8 May, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

  11. Shared memory parallelism in Modern C++ and HPX

    Authors: Patrick Diehl, Steven R. Brandt, Hartmut Kaiser

    Abstract: Parallel programming remains a daunting challenge, from the struggle to express a parallel algorithm without cluttering the underlying synchronous logic, to describing which devices to employ in a calculation, to correctness. Over the years, numerous solutions have arisen, many of them requiring new programming languages, extensions to programming languages, or the addition of pragmas. Support for…

    Submitted 9 August, 2023; v1 submitted 16 January, 2023; originally announced February 2023.

    Comments: Extended paper for the special issue

  12. From Merging Frameworks to Merging Stars: Experiences using HPX, Kokkos and SIMD Types

    Authors: Gregor Daiß, Srinivas Yadav Singanaboina, Patrick Diehl, Hartmut Kaiser, Dirk Pflüger

    Abstract: Octo-Tiger, a large-scale 3D AMR code for the merger of stars, uses a combination of HPX, Kokkos and explicit SIMD types, aiming to achieve performance-portability for a broad range of heterogeneous hardware. However, on A64FX CPUs, we encountered several missing pieces, hindering performance by causing problems with the SIMD vectorization. Therefore, we add std::experimental::simd as an option to…

    Submitted 8 May, 2023; v1 submitted 26 September, 2022; originally announced October 2022.

  13. From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels

    Authors: Gregor Daiß, Patrick Diehl, Dominic Marcello, Alireza Kheirkhahan, Hartmut Kaiser, Dirk Pflüger

    Abstract: Meeting both scalability and performance portability requirements is a challenge for any HPC application, especially for adaptively refined ones. In Octo-Tiger, an astrophysics application for the simulation of stellar mergers, we approach this with existing solutions: We employ HPX to obtain fine-grained tasks to easily distribute work and finely overlap communication and computation. For the com…

    Submitted 4 March, 2023; v1 submitted 26 September, 2022; originally announced October 2022.

  14. arXiv:2210.06437  [pdf, other]

    cs.DC

    Distributed, combined CPU and GPU profiling within HPX using APEX

    Authors: Patrick Diehl, Gregor Daiss, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Geoffrey C. Clayton, Dirk Pflueger

    Abstract: Benchmarking and comparing performance of a scientific simulation across hardware platforms is a complex task. When the simulation in question is constructed with an asynchronous, many-task (AMT) runtime offloading work to GPUs, the task becomes even more complex. In this paper, we discuss the use of a uniquely suited performance measurement library, APEX, to capture the performance behavior of a…

    Submitted 21 September, 2022; originally announced October 2022.

  15. arXiv:2208.00109  [pdf, other]

    cs.HC

    Traveler: Navigating Task Parallel Traces for Performance Analysis

    Authors: Sayef Azad Sakin, Alex Bigelow, R. Tohid, Connor Scully-Allison, Carlos Scheidegger, Steven R. Brandt, Christopher Taylor, Kevin A. Huck, Hartmut Kaiser, Katherine E. Isaacs

    Abstract: Understanding the behavior of software in execution is a key step in identifying and fixing performance issues. This is especially important in high performance computing contexts where even minor performance tweaks can translate into large savings in terms of computational resource use. To aid performance analysis, developers may collect an execution trace - a chronological log of program activit…

    Submitted 3 September, 2022; v1 submitted 29 July, 2022; originally announced August 2022.

    Comments: IEEE VIS 2022

  16. Quantifying Overheads in Charm++ and HPX using Task Bench

    Authors: Nanmiao Wu, Ioannis Gonidelis, Simeng Liu, Zane Fink, Nikunj Gupta, Karame Mohammadiporshokooh, Patrick Diehl, Hartmut Kaiser, Laxmikant V. Kale

    Abstract: Asynchronous Many-Task (AMT) runtime systems take advantage of multi-core architectures with light-weight threads, asynchronous executions, and smart scheduling. In this paper, we present the comparison of the AMT systems Charm++ and HPX with the mainstream MPI, OpenMP, and MPI+OpenMP libraries using the Task Bench benchmarks. Charm++ is a parallel programming language based on C++, supporting st…

    Submitted 21 July, 2022; originally announced July 2022.

  17. Closing the Performance Gap with Modern C++

    Authors: Thomas Heller, Hartmut Kaiser, Patrick Diehl, Dietmar Fey, Marc Alexander Schweitzer

    Abstract: On the way to Exascale, programmers face the increasing challenge of having to support multiple hardware architectures from the same code base. At the same time, portability of code and performance are increasingly difficult to achieve as hardware architectures are becoming more and more diverse. Today's heterogeneous systems often include two or more completely distinct and incompatible hardware…

    Submitted 30 May, 2022; originally announced June 2022.

  18. Octo-Tiger's New Hydro Module and Performance Using HPX+CUDA on ORNL's Summit

    Authors: Patrick Diehl, Gregor Daiß, Dominic Marcello, Kevin Huck, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Dirk Pflüger

    Abstract: Octo-Tiger is a code for modeling three-dimensional self-gravitating astrophysical fluids. It was particularly designed for the study of dynamical mass transfer between interacting binary stars. Octo-Tiger is parallelized for distributed systems using the asynchronous many-task runtime system, the C++ standard library for parallelism and concurrency (HPX) and utilizes CUDA for its gravity solver.…

    Submitted 26 July, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Cluster

  19. arXiv:2105.00027  [pdf, other]

    cs.DC cond-mat.mtrl-sci cond-mat.str-el cond-mat.supr-con

    Memory Reduction using a Ring Abstraction over GPU RDMA for Distributed Quantum Monte Carlo Solver

    Authors: Weile Wei, Eduardo D'Azevedo, Kevin Huck, Arghya Chatterjee, Oscar Hernandez, Hartmut Kaiser

    Abstract: Scientific applications that run on leadership computing facilities often face the challenge of being unable to fit leading science cases onto accelerator devices due to memory constraints (memory-bound applications). In this work, the authors studied one such US Department of Energy mission-critical condensed matter physics application, Dynamical Cluster Approximation (DCA++), and this paper disc…

    Submitted 13 May, 2021; v1 submitted 30 April, 2021; originally announced May 2021.

  20. Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application

    Authors: Patrick Diehl, Dominic Marcello, Parsa Amini, Hartmut Kaiser, Sagiv Shiber, Geoffrey C. Clayton, Juhan Frank, Gregor Daiß, Dirk Pflüger, David Eder, Alice Koniges, Kevin Huck

    Abstract: Analyzing performance within asynchronous many-task-based runtime systems is challenging because millions of tasks are launched concurrently. Especially for long-term runs the amount of data collected becomes overwhelming. We study HPX and its performance-counter framework and APEX to collect performance data and energy consumption. We added HPX application-specific performance counters to the Oct…

    Submitted 9 June, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

  21. arXiv:2101.08226  [pdf, other]

    astro-ph.IM astro-ph.SR

    Octo-Tiger: A New, 3D Hydrodynamic Code for Stellar Mergers that uses HPX Parallelisation

    Authors: Dominic C. Marcello, Sagiv Shiber, Orsola De Marco, Juhan Frank, Geoffrey C. Clayton, Patrick M. Motl, Patrick Diehl, Hartmut Kaiser

    Abstract: OCTO-TIGER is an astrophysics code to simulate the evolution of self-gravitating and rotating systems of arbitrary geometry based on the fast multipole method, using adaptive mesh refinement. OCTO-TIGER is currently optimised to simulate the merger of well-resolved stars that can be approximated by barotropic structures, such as white dwarfs or main sequence stars. The gravity solver conserves an…

    Submitted 10 August, 2021; v1 submitted 20 January, 2021; originally announced January 2021.

    Comments: 38 pages, 24 figures, Co-Lead Authors: Dominic C. Marcello and Sagiv Shiber

  22. arXiv:2010.10930  [pdf, ps, other]

    cs.DC

    Towards Distributed Software Resilience in Asynchronous Many-Task Programming Models

    Authors: Nikunj Gupta, Jackson R. Mayo, Adrian S. Lemoine, Hartmut Kaiser

    Abstract: Exceptions and errors occurring within mission critical applications due to hardware failures have a high cost. With the emerging Next Generation Platforms (NGPs), the rate of hardware failures will likely increase. Therefore, designing our applications to be resilient is a critical concern in order to retain the reliability of results while meeting the constraints on power budgets. In this paper,…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: text overlap with arXiv:2004.07203

    Report number: SAND2020-11278 C

  23. arXiv:2010.07098  [pdf, other]

    cs.DC cond-mat.mtrl-sci cond-mat.str-el cond-mat.supr-con

    Performance Analysis of a Quantum Monte Carlo Application on Multiple Hardware Architectures Using the HPX Runtime

    Authors: Weile Wei, Arghya Chatterjee, Kevin Huck, Oscar Hernandez, Hartmut Kaiser

    Abstract: This paper describes how we successfully used the HPX programming model to port the DCA++ application on multiple architectures that include POWER9, x86, ARM v8, and NVIDIA GPUs. We describe the lessons we can learn from this experience as well as the benefits of enabling the HPX in the application to improve the CPU threading part of the code, which led to an overall 21% improvement across archit…

    Submitted 19 October, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

  24. Deploying a Task-based Runtime System on Raspberry Pi Clusters

    Authors: Nikunj Gupta, Steve R. Brandt, Bibek Wagle, Nanmiao Wu, Alireza Kheirkhahan, Patrick Diehl, Hartmut Kaiser, Felix W. Baumann

    Abstract: Arm technology is becoming increasingly important in HPC. Recently, Fugaku, an Arm-based system, was awarded the number one place in the Top500 list. Raspberry Pis provide an inexpensive platform to become familiar with this architecture. However, Pis can also be useful on their own. Here we describe our efforts to configure and benchmark the use of a Raspberry Pi cluster with the HPX/Phylanx pla…

    Submitted 9 April, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

  25. Towards a Scalable and Distributed Infrastructure for Deep Learning Applications

    Authors: Bita Hasheminezhad, Shahrzad Shirzad, Nanmiao Wu, Patrick Diehl, Hannes Schulz, Hartmut Kaiser

    Abstract: Although recent scaling up approaches to training deep neural networks have proven to be effective, the computational intensity of large and complex models, as well as the availability of large-scale datasets, require deep learning frameworks to utilize scaling out techniques. Parallelization approaches and distribution requirements are not considered in the preliminary designs of most available d…

    Submitted 19 April, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

  26. arXiv:2004.07203  [pdf, other]

    cs.DC

    Implementing Software Resiliency in HPX for Extreme Scale Computing

    Authors: Nikunj Gupta, Jackson R. Mayo, Adrian S. Lemoine, Hartmut Kaiser

    Abstract: Exceptions and errors occurring within mission critical applications due to hardware failures have a high cost. With the emerging Next Generation Platforms (NGPs), the rate of hardware failures will invariably increase. Therefore, designing our applications to be resilient is a critical concern in order to retain the reliability of results while meeting the constraints on power budgets. In this pa…

    Submitted 15 April, 2020; originally announced April 2020.

    Comments: 7 pages, 5 figures

    Report number: SAND2020-3975 R

  27. arXiv:2002.07970  [pdf, other]

    cs.DC cs.PL

    Supporting OpenMP 5.0 Tasks in hpxMP -- A study of an OpenMP implementation within Task Based Runtime Systems

    Authors: Tianyi Zhang, Shahrzad Shirzad, Bibek Wagle, Adrian S. Lemoine, Patrick Diehl, Hartmut Kaiser

    Abstract: OpenMP has been the de facto standard for single node parallelism for more than a decade. Recently, asynchronous many-task runtime (AMT) systems have increased in popularity as a new programming paradigm for high performance computing applications. One of the major challenges of this new paradigm is the incompatibility of the OpenMP thread model and other AMTs. Highly optimized OpenMP-based librar…

    Submitted 18 February, 2020; originally announced February 2020.

  28. Scheduling optimization of parallel linear algebra algorithms using Supervised Learning

    Authors: G. Laberge, S. Shirzad, P. Diehl, H. Kaiser, S. Prudhomme, A. Lemoine

    Abstract: Linear algebra algorithms are used widely in a variety of domains, e.g. machine learning, numerical physics and video game graphics. For all these applications, loop-level parallelism is required to achieve high performance. However, finding the optimal way to schedule the workload between threads is a non-trivial problem because it depends on the structure of the algorithm being parallelized and…

    Submitted 25 September, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted at HPCML19

  29. From Piz Daint to the Stars: Simulation of Stellar Mergers using High-Level Abstractions

    Authors: Gregor Daiß, Parsa Amini, John Biddiscombe, Patrick Diehl, Juhan Frank, Kevin Huck, Hartmut Kaiser, Dominic Marcello, David Pfander, Dirk Pflüger

    Abstract: We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite volume grid-based hydrodynamics simulation code with Adaptive Mesh Refinement which is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems,…

    Submitted 9 August, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: Accepted at SC19

  30. An Introduction to hpxMP: A Modern OpenMP Implementation Leveraging HPX, An Asynchronous Many-Task System

    Authors: Tianyi Zhang, Shahrzad Shirzad, Patrick Diehl, R. Tohid, Weile Wei, Hartmut Kaiser

    Abstract: Asynchronous Many-task (AMT) runtime systems have gained increasing acceptance in the HPC community due to the performance improvements offered by fine-grained tasking runtime systems. At the same time, C++ standardization efforts are focused on creating higher-level interfaces able to replace OpenMP or OpenACC in modern C++ codes. These higher level functions have been adopted in standards confor…

    Submitted 5 July, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

  31. Integration of CUDA Processing within the C++ library for parallelism and concurrency (HPX)

    Authors: Patrick Diehl, Madhavan Seshadri, Thomas Heller, Hartmut Kaiser

    Abstract: Experience shows that on today's high performance systems the utilization of different acceleration cards in conjunction with a high utilization of all other parts of the system is difficult. Future architectures, like exascale clusters, are expected to aggravate this issue as the number of cores is expected to increase and memory hierarchies are expected to become deeper. One big aspect for dist…

    Submitted 26 October, 2018; originally announced October 2018.

  32. Asynchronous Execution of Python Code on Task Based Runtime Systems

    Authors: R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, Steven Brandt, Hartmut Kaiser

    Abstract: Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and artificial intelligence (AI), from utilizing performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenienc…

    Submitted 22 October, 2018; v1 submitted 17 October, 2018; originally announced October 2018.

  33. arXiv:1806.06917  [pdf, other]

    cs.DC physics.comp-ph

    An asynchronous and task-based implementation of Peridynamics utilizing HPX -- the C++ standard library for parallelism and concurrency

    Authors: Patrick Diehl, Prashant K. Jha, Hartmut Kaiser, Robert Lipton, Martin Levesque

    Abstract: On modern supercomputers, asynchronous many task systems are emerging to address the new architecture of computational nodes. Through this shift of increasing cores per node, a new programming model with a focus on handling the fine-grained parallelism of this increasing number of cores per computational node is needed. Asynchronous Many Task (AMT) runtime systems represent an emerging paradigm for…

    Submitted 28 October, 2020; v1 submitted 18 June, 2018; originally announced June 2018.

  34. arXiv:1711.01519  [pdf, other]

    cs.DC cs.AI cs.LG

    HPX Smart Executors

    Authors: Zahra Khatami, Lukas Troska, Hartmut Kaiser, J. Ramanujam, Adrian Serio

    Abstract: The performance of many parallel applications depends on loop-level parallelism. However, manually parallelizing all loops may result in degrading parallel performance, as some of them cannot scale desirably to a large number of threads. In addition, the overheads of manually tuning loop parameters might prevent an application from reaching its maximum parallel performance. We illustrate how machi…

    Submitted 4 November, 2017; originally announced November 2017.

    Comments: In Proceedings of ESPM2'17: Third International Workshop on Extreme Scale Programming Models and Middleware, Denver, CO, USA, November 12-17, 2017 (ESPM2'17), 8 pages

  35. arXiv:1703.09264  [pdf, other]

    cs.DC

    Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques

    Authors: Zahra Khatami, Hartmut Kaiser, J. Ramanujam

    Abstract: Maximizing parallelism level in applications can be achieved by minimizing overheads due to load imbalances and waiting time due to memory latencies. Compiler optimization is one of the most effective solutions to tackle this problem. The compiler is able to detect the data dependencies in an application and is able to analyze the specific sections of code for parallelization potential. However, a…

    Submitted 27 March, 2017; originally announced March 2017.

    Comments: 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017)

  36. arXiv:1611.00463  [pdf, other]

    cs.DC

    A Load-Balanced Parallel and Distributed Sorting Algorithm Implemented with PGX.D

    Authors: Zahra Khatami, Sungpack Hong, Jinsoo Lee, Siegfried Depner, Hassan Chafi, J. Ramanujam, Hartmut Kaiser

    Abstract: Sorting has been one of the most challenging studied problems in different fields of scientific research. Although many techniques and algorithms have been proposed for efficient parallel sorting, achieving the desired performance on different types of architectures with a large number of processors is still a challenging issue. Maximizing parallelism level in appl…

    Submitted 14 January, 2017; v1 submitted 1 November, 2016; originally announced November 2016.

    Comments: 8 pages, 12 figures

  37. arXiv:1510.05804  [pdf, other]

    cond-mat.stat-mech cond-mat.soft

    Mermin-Wagner fluctuations in 2D amorphous solids

    Authors: Bernd Illing, Sebastian Frischi, Herbert Kaiser, Christian Klix, Georg Maret, Peter Keim

    Abstract: In a recent comment, M. Kosterlitz described how the discrepancy about the lack of broken translational symmetry in two dimensions - doubting the existence of 2D crystals - and the first computer simulations foretelling 2D crystals at least in tiny systems, motivated him and D. Thouless to investigate melting and superfluidity in two dimensions [Jour. of Phys. Cond. Matt. 28, 481001 (2016…

    Submitted 30 October, 2016; v1 submitted 20 October, 2015; originally announced October 2015.

    Comments: 7 pages, 4 figures

    Journal ref: Proc. Natl. Acad. Sci. 114, 2440 (2017)

  38. arXiv:1205.5055  [pdf, other]

    cs.DC

    Neutron Star Evolutions using Tabulated Equations of State with a New Execution Model

    Authors: Matthew Anderson, Maciej Brodowicz, Hartmut Kaiser, Bryce Adelstein-Lelbach, Thomas Sterling

    Abstract: The addition of nuclear and neutrino physics to general relativistic fluid codes allows for a more realistic description of hot nuclear matter in neutron star and black hole systems. This additional microphysics requires that each processor have access to large tables of data, such as equations of state, and in large simulations the memory required to store these tables locally can become excessiv…

    Submitted 22 May, 2012; originally announced May 2012.

    Comments: 9 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:1110.1131

  39. arXiv:1110.1131  [pdf, other]

    cs.DC

    Adaptive Mesh Refinement for Astrophysics Applications with ParalleX

    Authors: Matthew Anderson, Maciej Brodowicz, Hartmut Kaiser, Bryce Adelstein-Lelbach, Thomas Sterling

    Abstract: Several applications in astrophysics require adequately resolving many physical and temporal scales which vary over several orders of magnitude. Adaptive mesh refinement techniques address this problem effectively but often result in constrained strong scaling performance. The ParalleX execution model is an experimental execution model that aims to expose new forms of program parallelism and elimi…

    Submitted 5 October, 2011; originally announced October 2011.

  40. arXiv:1109.5201  [pdf, other]

    cs.DC

    An Application Driven Analysis of the ParalleX Execution Model

    Authors: Matthew Anderson, Maciej Brodowicz, Hartmut Kaiser, Thomas Sterling

    Abstract: Exascale systems, expected to emerge by the end of the next decade, will require the exploitation of billion-way parallelism at multiple hierarchical levels in order to achieve the desired sustained performance. The task of assessing future machine performance is approached by identifying the factors which currently challenge the scalability of parallel applications. It is suggested that the root…

    Submitted 23 September, 2011; originally announced September 2011.

    Comments: 9 Figures

  41. Improving the scalability of parallel N-body applications with an event driven constraint based execution model

    Authors: Chirag Dekate, Matthew Anderson, Maciej Brodowicz, Hartmut Kaiser, Bryce Adelstein-Lelbach, Thomas Sterling

    Abstract: The scalability and efficiency of graph applications are significantly constrained by conventional systems and their supporting programming models. Technology trends like multicore, manycore, and heterogeneous system architectures are introducing further challenges and possibilities for emerging application domains such as graph applications. This paper explores the space of effective parallel exe…

    Submitted 23 September, 2011; originally announced September 2011.

    Comments: 11 figures

    Journal ref: International Journal of High Performance Computing Applications, April 11, 2012

  42. arXiv:0803.4170  [pdf, ps, other]

    physics.ins-det cond-mat.mtrl-sci nucl-ex

    Neutronic Design and Measured Performance of the Low Energy Neutron Source (LENS) Target Moderator Reflector Assembly

    Authors: C. M. Lavelle, D. V. Baxter, A. Bogdanov, V. P. Derenchuk, H. Kaiser, M. B. Leuschner, M. A. Lone, W. Lozowski, H. Nann, B. v. Przewoski, N. Remmes, T. Rinckel, Y. Shin, W. M. Snow, P. E. Sokol

    Abstract: The Low Energy Neutron Source (LENS) is an accelerator-based pulsed cold neutron facility under construction at the Indiana University Cyclotron Facility (IUCF). The idea behind LENS is to produce pulsed cold neutron beams starting with ~MeV neutrons from (p,n) reactions in Be which are moderated to meV energies and extracted from a small solid angle for use in neutron instruments which can oper…

    Submitted 28 March, 2008; originally announced March 2008.

    Comments: This is a preprint version of an article which has been published in Nuclear Instruments and Methods in Physics Research A 587 (2008) 324-341. http://dx.doi.org/10.1016/j.nima.2007.12.044

    Journal ref: Nucl.Instrum.Meth.A587:324-341,2008

  43. arXiv:math/0701132  [pdf, ps, other]

    math.AP

    Classical solutions of drift-diffusion equations for semiconductor devices: the 2d case

    Authors: Hans-Christoph Kaiser, Hagen Neidhardt, Joachim Rehberg

    Abstract: We regard drift-diffusion equations for semiconductor devices in Lebesgue spaces. To that end we reformulate the (generalized) van Roosbroeck system as an evolution equation for the potentials to the driving forces of the currents of electrons and holes. This evolution equation falls into a class of quasi-linear parabolic systems which allow unique, local in time solution in certain Lebesgue spa…

    Submitted 4 January, 2007; originally announced January 2007.

    Report number: WIAS Preprint No. 1189 (2006)

    MSC Class: 35K45; 35K50; 35K55; 35K57; 78A35

  44. Measuring the Neutron's Mean Square Charge Radius Using Neutron Interferometry

    Authors: F. E. Wietfeldt, M. Huber, T. C. Black, H. Kaiser, M. Arif, D. L. Jacobson, S. A. Werner

    Abstract: The neutron is electrically neutral, but its substructure consists of charged quarks so it may have an internal charge distribution. In fact it is known to have a negative mean square charge radius (MSCR), the second moment of the radial charge density. In other words the neutron has a positive core and negative skin. In the first Born approximation the neutron MSCR can be simply related to the…

    Submitted 14 September, 2005; originally announced September 2005.

    Comments: 5 pages, 2 figures

  45. Inertia of Intrinsic Spin

    Authors: Bahram Mashhoon, Helmut Kaiser

    Abstract: The state of a particle in space and time is characterized by its mass and spin, which therefore determine the inertial properties of the particle. The coupling of intrinsic spin with rotation is examined and the corresponding inertial effects of intrinsic spin are studied. An experiment to measure directly the spin-rotation coupling via neutron interferometry is analyzed in detail.

    Submitted 18 November, 2005; v1 submitted 24 August, 2005; originally announced August 2005.

    Comments: 3 pages, 1 figure, contribution to Festschrift honoring Samuel A. Werner; v2: slightly expanded version accepted for publication in Proc. Int. Conf. Neutron Scattering 2005 (scheduled for publication in the regular edition of Physica B, July 2006)

    Journal ref: Physica B385 (2006) 1381-1383

  46. Precision neutron interferometric measurements and updated evaluations of the n-p and n-d coherent neutron scattering lengths

    Authors: K. Schoen, D. L. Jacobson, M. Arif, P. R. Huffman, T. C. Black, W. M. Snow, S. K. Lamoreaux, H. Kaiser, S. A. Werner

    Abstract: We have performed high-precision measurements of the coherent neutron scattering lengths of gas phase molecular hydrogen and deuterium using neutron interferometry. After correcting for molecular binding and multiple scattering from the molecule, we find b_{np} = (-3.7384 +/- 0.0020) fm and b_{nd} = (6.6649 +/- 0.0040) fm. Our results are in agreement with the world average of previous Measureme…

    Submitted 10 June, 2003; originally announced June 2003.

    Comments: 22 pages, 19 figures

    Journal ref: Physical Review C 67, 044005 (2003)

  47. Precision neutron interferometric measurement of the nd coherent neutron scattering length and consequences for models of three-nucleon forces

    Authors: T. C. Black, P. R. Huffman, D. L. Jacobson, W. M. Snow, K. Schoen, M. Arif, H. Kaiser, S. K. Lamoreaux, S. A. Werner

    Abstract: We have performed the first high precision measurement of the coherent neutron scattering length of deuterium in a pure sample using neutron interferometry. We find b_nd = (6.665 +/- 0.004) fm in agreement with the world average of previous measurements using different techniques, b_nd = (6.6730 +/- 0.0045) fm. We compare the new world average for the nd coherent scattering length b_nd = (6.669…

    Submitted 20 May, 2003; originally announced May 2003.

    Comments: 4 pages, 4 figures

    Journal ref: Phys.Rev.Lett. 90 (2003) 192502

  48. Orientation of Vortices in a Superconducting Thin-Film: Quantitative Comparison of Spin-Polarized Neutron Reflectivity and Magnetization

    Authors: S. -W. Han, J. Farmer, H. Kaiser, P. F. Miceli, I. V. Roshchin, L. H. Greene

    Abstract: We present a quantitative comparison of the magnetization measured by spin-polarized neutron reflectivity (SPNR) and DC magnetometry on a 1370 Å-thick Nb superconducting film. As a function of magnetic field applied in the film plane, SPNR exhibits reversible behavior whereas the DC magnetization shows substantial hysteresis. The difference between these measurements is attributed to a rotation…

    Submitted 17 August, 2000; v1 submitted 16 August, 2000; originally announced August 2000.

    Comments: 12 pages, 8 figures, It will be printed in PRB, Oct. 2000

    Journal ref: Phys. Rev. B 62, 9784 (2000)