Performance Portable Monte Carlo Particle Transport on Intel, NVIDIA, and AMD GPUs

Authors: John Tramm, Paul Romano, Patrick Shriwise, Amanda Lund, Johannes Doerfert, Patrick Steinbrecher, Andrew Siegel, Gavin Ridley

Abstract: OpenMC is an open source Monte Carlo neutral particle transport application that has recently been ported to GPU using the OpenMP target offloading model. We examine the performance of OpenMC at scale on the Frontier, Polaris, and Aurora supercomputers, demonstrating that performance portability has been achieved by OpenMC across all three major GPU vendors (AMD, NVIDIA, and Intel). OpenMC's GPU performance is compared to both the traditional CPU-based version of OpenMC as well as several other state-of-the-art CPU-based Monte Carlo particle transport applications. We also provide historical context by analyzing OpenMC's performance on several legacy GPU and CPU architectures. This work includes some of the first published results for a scientific simulation application at scale on a supercomputer featuring Intel's Max series "Ponte Vecchio" GPUs. It is also one of the first demonstrations of a large scientific production application using the OpenMP target offloading model to achieve high performance on all three major GPU platforms.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2311.01739 [pdf, other]

Efficient Algorithms for Monte Carlo Particle Transport on AI Accelerator Hardware

Authors: John Tramm, Bryce Allen, Kazutomo Yoshii, Andrew Siegel, Leighton Wilson

Abstract: The recent trend toward deep learning has led to the development of a variety of highly innovative AI accelerator architectures. One such architecture, the Cerebras Wafer-Scale Engine 2 (WSE-2), features 40 GB of on-chip SRAM, making it a potentially attractive platform for latency- or bandwidth-bound HPC simulation workloads. In this study, we examine the feasibility of performing continuous energy Monte Carlo (MC) particle transport on the WSE-2 by porting a key kernel from the MC transport algorithm to Cerebras's CSL programming model. New algorithms for minimizing communication costs and for handling load balancing are developed and tested. The WSE-2 is found to run 130 times faster than a highly optimized CUDA version of the kernel run on an NVIDIA A100 GPU -- significantly outpacing the expected performance increase given the difference in transistor counts between the architectures.

Submitted 6 November, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

ACM Class: D.1.3; J.2

arXiv:2306.04249 [pdf, other]

DEMIST: A deep-learning-based task-specific denoising approach for myocardial perfusion SPECT

Authors: Md Ashequr Rahman, Zitong Yu, Richard Laforest, Craig K. Abbey, Barry A. Siegel, Abhinav K. Jha

Abstract: There is an important need for methods to process myocardial perfusion imaging (MPI) SPECT images acquired at lower radiation dose and/or acquisition time such that the processed images improve observer performance on the clinical task of detecting perfusion defects. To address this need, we build upon concepts from model-observer theory and our understanding of the human visual system to propose a Detection task-specific deep-learning-based approach for denoising MPI SPECT images (DEMIST). The approach, while performing denoising, is designed to preserve features that influence observer performance on detection tasks. We objectively evaluated DEMIST on the task of detecting perfusion defects using a retrospective study with anonymized clinical data in patients who underwent MPI studies across two scanners (N = 338). The evaluation was performed at low-dose levels of 6.25%, 12.5% and 25% and using an anthropomorphic channelized Hotelling observer. Performance was quantified using area under the receiver operating characteristics curve (AUC). Images denoised with DEMIST yielded significantly higher AUC compared to corresponding low-dose images and images denoised with a commonly used task-agnostic DL-based denoising method. Similar results were observed with stratified analysis based on patient sex and defect type. Additionally, DEMIST improved visual fidelity of the low-dose images as quantified using root mean squared error and structural similarity index metric.
A mathematical analysis revealed that DEMIST preserved features that assist in detection tasks while improving the noise properties, resulting in improved observer performance. The results provide strong evidence for further clinical evaluation of DEMIST to denoise low-count images in MPI SPECT.

Submitted 25 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

arXiv:2303.02110 [pdf]

Need for Objective Task-based Evaluation of Deep Learning-Based Denoising Methods: A Study in the Context of Myocardial Perfusion SPECT

Authors: Zitong Yu, Md Ashequr Rahman, Richard Laforest, Thomas H. Schindler, Robert J. Gropler, Richard L. Wahl, Barry A. Siegel, Abhinav K. Jha

Abstract: Artificial intelligence-based methods have generated substantial interest in nuclear medicine. An area of significant interest has been using deep-learning (DL)-based approaches for denoising images acquired with lower doses, shorter acquisition times, or both. Objective evaluation of these approaches is essential for clinical application. DL-based approaches for denoising nuclear-medicine images have typically been evaluated using fidelity-based figures of merit (FoMs) such as RMSE and SSIM. However, these images are acquired for clinical tasks and thus should be evaluated based on their performance in these tasks. Our objectives were to (1) investigate whether evaluation with these FoMs is consistent with objective clinical-task-based evaluation; (2) provide a theoretical analysis for determining the impact of denoising on signal-detection tasks; (3) demonstrate the utility of virtual clinical trials (VCTs) to evaluate DL-based methods. A VCT to evaluate a DL-based method for denoising myocardial perfusion SPECT (MPS) images was conducted. The impact of DL-based denoising was evaluated using fidelity-based FoMs and AUC, which quantified performance on detecting perfusion defects in MPS images as obtained using a model observer with anthropomorphic channels. Based on fidelity-based FoMs, denoising using the considered DL-based method led to significantly superior performance. However, based on ROC analysis, denoising did not improve, and in fact, often degraded detection-task performance.
The results motivate the need for objective task-based evaluation of DL-based denoising approaches. Further, this study shows how VCTs provide a mechanism to conduct such evaluations. Finally, our theoretical treatment reveals insights into the reasons for the limited performance of the denoising approach.

Submitted 1 April, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

arXiv:2303.00212 [pdf, other]

A task-specific deep-learning-based denoising approach for myocardial perfusion SPECT

Authors: Md Ashequr Rahman, Zitong Yu, Barry A. Siegel, Abhinav K. Jha

Abstract: Deep-learning (DL)-based methods have shown significant promise in denoising myocardial perfusion SPECT images acquired at low dose. For clinical application of these methods, evaluation on clinical tasks is crucial. Typically, these methods are designed to minimize some fidelity-based criterion between the predicted denoised image and some reference normal-dose image. However, while promising, studies have shown that these methods may have limited impact on the performance of clinical tasks in SPECT. To address this issue, we use concepts from the literature on model observers and our understanding of the human visual system to propose a DL-based denoising approach designed to preserve observer-related information for detection tasks. The proposed method was objectively evaluated on the task of detecting perfusion defect in myocardial perfusion SPECT images using a retrospective study with anonymized clinical data. Our results demonstrate that the proposed method yields improved performance on this detection task compared to using low-dose images. The results show that by preserving task-specific information, DL may provide a mechanism to improve observer performance in low-dose myocardial perfusion SPECT.

Submitted 28 February, 2023; originally announced March 2023.

arXiv:2205.09296 [pdf, other]

doi 10.1038/s41598-022-26921-5

Opinion Manipulation on Farsi Twitter

Authors: Amirhossein Farzam, Parham Moradi, Saeedeh Mohammadi, Zahra Padar, Alexandra A. Siegel

Abstract: For Iranians and the Iranian diaspora, the Farsi Twittersphere provides an important alternative to state media and an outlet for political discourse. But this understudied online space has become an opinion manipulation battleground, with diverse actors using inauthentic accounts to advance their goals and shape online narratives. Examining trending discussions crossing social cleavages in Iran, we explore how the dynamics of opinion manipulation differ across diverse issue areas. Our analysis suggests that opinion manipulation by inauthentic accounts is more prevalent in divisive political discussions than non-divisive or apolitical discussions. We show how Twitter's network structures help to reinforce the content propagated by clusters of inauthentic accounts in divisive political discussions. Analyzing both the content and structure of online discussions in the Iranian Twittersphere, this work contributes to a growing body of literature exploring the dynamics of online opinion manipulation, while improving our understanding of how information is controlled in the digital age.

Submitted 8 March, 2023; v1 submitted 18 May, 2022; originally announced May 2022.

Comments: 23 pages, 12 figures, two appendices

Journal ref: Scientific Reports, 13(1), 333 (2023)

arXiv:2004.08382 [pdf]

From Horseback Riding to Changing the World: UX Competence as a Journey

Authors: Omar Sosa-Tzec, Erik Stolterman Bergqvist, Marty A. Siegel

Abstract: In this paper, we explore the notion of competence in UX based on the perspective of practitioners. As a result of this exploration, we observed four domains through which we conceptualize a plane of sources of competence that describes the ways a UX practitioner develops competence. Based on this plane, we present the idea of competence as a journey, one whose furthest stage implies an urge towards transforming society and UX practice.

Submitted 16 April, 2020; originally announced April 2020.

Comments: 5 pages, 2 Figures

ACM Class: H.5.m

arXiv:2003.00317 [pdf, other]

doi 10.1088/1361-6560/ac01f4

A Bayesian approach to tissue-fraction estimation for oncological PET segmentation

Authors: Ziping Liu, Joyce C. Mhlanga, Richard Laforest, Paul-Robert Derenoncourt, Barry A. Siegel, Abhinav K. Jha

Abstract: Tumor segmentation in oncological PET is challenging, a major reason being the partial-volume effects that arise due to low system resolution and finite voxel size. The latter results in tissue-fraction effects, i.e. voxels contain a mixture of tissue classes. Conventional segmentation methods are typically designed to assign each voxel in the image as belonging to a certain tissue class. Thus, these methods are inherently limited in modeling tissue-fraction effects. To address the challenge of accounting for partial-volume effects, and in particular, tissue-fraction effects, we propose a Bayesian approach to tissue-fraction estimation for oncological PET segmentation. Specifically, this Bayesian approach estimates the posterior mean of fractional volume that the tumor occupies within each voxel of the image. The proposed method, implemented using a deep-learning-based technique, was first evaluated using clinically realistic 2-D simulation studies with known ground truth, in the context of segmenting the primary tumor in PET images of patients with lung cancer. The evaluation studies demonstrated that the method accurately estimated the tumor-fraction areas and significantly outperformed widely used conventional PET segmentation methods, including a U-net-based method, on the task of segmenting the tumor. In addition, the proposed method was relatively insensitive to partial-volume effects and yielded reliable tumor segmentation for different clinical-scanner configurations.
The method was then evaluated using clinical images of patients with stage IIB/III non-small cell lung cancer from ACRIN 6668/RTOG 0235 multi-center clinical trial. Here, the results showed that the proposed method significantly outperformed all other considered methods and yielded accurate tumor segmentation on patient images with Dice similarity coefficient (DSC) of 0.82 (95 % CI: [0.78, 0.86]).

Submitted 27 May, 2022; v1 submitted 29 February, 2020; originally announced March 2020.

Journal ref: Phys. Med. Biol. 66 124002 (2021)

arXiv:1909.03632 [pdf, other]

Improving the scalabiliy of neutron cross-section lookup codes on multicore NUMA system

Authors: Kazutomo Yoshii, John Tramm, Andrew Siegel, Pete Beckman

Abstract: We use the XSBench proxy application, a memory-intensive OpenMP program, to explore the source of on-node scalability degradation of a popular Monte Carlo (MC) reactor physics benchmark on non-uniform memory access (NUMA) systems. As background, we present the details of XSBench, a performance abstraction "proxy app" for the full MC simulation, as well as the internal design of the Linux kernel. We explain how the physical memory allocation inside the kernel affects the multicore scalability of XSBench. On a sixteen-core, two-socket NUMA testbed, the scaling efficiency is improved from a nonoptimized 70% to an optimized 95%, and the optimized version consumes 25% less energy than does the nonoptimized version. In addition to the NUMA optimization we evaluate a page-size optimization to XSBench and observe a 1.5x performance improvement, compared with a nonoptimized one.

Submitted 9 September, 2019; originally announced September 2019.

arXiv:1808.04149 [pdf, ps, other]

Hybrid Metabolic Network Completion

Authors: Clémence Frioux, Torsten Schaub, Sebastian Schellhorn, Anne Siegel, Philipp Wanko

Abstract: Metabolic networks play a crucial role in biology since they capture all chemical reactions in an organism. While there are networks of high quality for many model organisms, networks for less studied organisms are often of poor quality and suffer from incompleteness. To this end, we introduced in previous work an ASP-based approach to metabolic network completion. Although this qualitative approach allows for restoring moderately degraded networks, it fails to restore highly degraded ones. This is because it ignores quantitative constraints capturing reaction rates. To address this problem, we propose a hybrid approach to metabolic network completion that integrates our qualitative ASP approach with quantitative means for capturing reaction rates. We begin by formally reconciling existing stoichiometric and topological approaches to network completion in a unified formalism. With it, we develop a hybrid ASP encoding and rely upon the theory reasoning capacities of the ASP system clingo for solving the resulting logic program with linear constraints over reals. We empirically evaluate our approach by means of the metabolic network of Escherichia coli. Our analysis shows that our novel approach yields greatly superior results than obtainable from purely qualitative or quantitative approaches. Under consideration in Theory and Practice of Logic Programming (TPLP).

Submitted 13 August, 2018; originally announced August 2018.

Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)

arXiv:1610.01390 [pdf]

Reliability of PET/CT shape and heterogeneity features in functional and morphological components of Non-Small Cell Lung Cancer tumors: a repeatability analysis in a prospective multi-center cohort

Authors: Marie-Charlotte Desseroit, Florent Tixier, Wolfgang Weber, Barry A Siegel, Catherine Cheze Le Rest, Dimitris Visvikis, Mathieu Hatt

Abstract: Purpose: The main purpose of this study was to assess the reliability of shape and heterogeneity features in both Positron Emission Tomography (PET) and low-dose Computed Tomography (CT) components of PET/CT. A secondary objective was to investigate the impact of image quantization. Material and methods: A Health Insurance Portability and Accountability Act-compliant secondary analysis of deidentified prospectively acquired PET/CT test-retest datasets of 74 patients from multi-center Merck and ACRIN trials was performed. Metabolically active volumes were automatically delineated on PET with the Fuzzy Locally Adaptive Bayesian algorithm. 3DSlicer™ was used to semi-automatically delineate the anatomical volumes on low-dose CT components. Two quantization methods were considered: a quantization into a set number of bins (quantizationB) and an alternative quantization with bins of fixed width (quantizationW). Four shape descriptors, ten first-order metrics and 26 textural features were computed. Bland-Altman analysis was used to quantify repeatability. Features were subsequently categorized as very reliable, reliable, moderately reliable and poorly reliable with respect to the corresponding volume variability. Results: Repeatability was highly variable amongst features. Numerous metrics were identified as poorly or moderately reliable. Others were (very) reliable in both modalities, and in all categories (shape, 1st-, 2nd- and 3rd-order metrics). Image quantization played a major role in the repeatability of the features.
Features were more reliable in PET with quantizationB, whereas quantizationW showed better results in CT. Conclusion: The test-retest repeatability of shape and heterogeneity features in PET and low-dose CT varied greatly amongst metrics. The level of repeatability also depended strongly on the quantization step, with different optimal choices for each modality. The repeatability of PET and low-dose CT features should be carefully taken into account when selecting metrics to build multiparametric models.

Submitted 5 October, 2016; originally announced October 2016.

Comments: Journal of Nuclear Medicine, Society of Nuclear Medicine, 2016

arXiv:1401.0704 [pdf, other]

doi 10.1017/etds.2014.141

A combinatorial approach to products of Pisot substitutions

Authors: Valérie Berthé, Jérémie Bourdon, Timo Jolivet, Anne Siegel

Abstract: We define a generic algorithmic framework to prove pure discrete spectrum for the substitutive symbolic dynamical systems associated with some infinite families of Pisot substitutions. We focus on the families obtained as finite products of the three-letter substitutions associated with the multidimensional continued fraction algorithms of Brun and Jacobi-Perron. Our tools consist in a reformulation of some combinatorial criteria (coincidence conditions), in terms of properties of discrete plane generation using multidimensional (dual) substitutions. We also deduce some topological and dynamical properties of the Rauzy fractals, of the underlying symbolic dynamical systems, as well as some number-theoretical properties of the associated Pisot numbers.

Submitted 26 June, 2014; v1 submitted 3 January, 2014; originally announced January 2014.

Comments: 32 pages, v2 with many corrections and improvements

Journal ref: Ergod. Th. Dynam. Sys. 36 (2016) 1757-1794

arXiv:1210.0690 [pdf, other]

doi 10.1007/978-3-642-33636-2_20

Revisiting the Training of Logic Models of Protein Signaling Networks with a Formal Approach based on Answer Set Programming

Authors: Santiago Videla, Carito Guziolowski, Federica Eduati, Sven Thiele, Niels Grabe, Julio Saez-Rodriguez, Anne Siegel

Abstract: A fundamental question in systems biology is the construction and training to data of mathematical models. Logic formalisms have become very popular to model signaling networks because their simplicity allows us to model large systems encompassing hundreds of proteins. An approach to train (Boolean) logic models to high-throughput phospho-proteomics data was recently introduced and solved using optimization heuristics based on stochastic methods. Here we demonstrate how this problem can be solved using Answer Set Programming (ASP), a declarative problem solving paradigm, in which a problem is encoded as a logical program such that its answer sets represent solutions to the problem. ASP offers significant improvements over heuristic methods in terms of efficiency and scalability: it guarantees global optimality of solutions and provides a complete set of solutions. We illustrate the application of ASP with in silico cases based on realistic networks and data.

Submitted 22 December, 2012; v1 submitted 2 October, 2012; originally announced October 2012.

Journal ref: CMSB - 10th Computational Methods in Systems Biology 2012 7605 (2012) 342-361

arXiv:1108.5574 [pdf, other]

Substitutive Arnoux-Rauzy sequences have pure discrete spectrum

Authors: Valérie Berthé, Timo Jolivet, Anne Siegel

Abstract: We prove that the symbolic dynamical system generated by a purely substitutive Arnoux-Rauzy sequence is measurably conjugate to a toral translation. The proof is based on an explicit construction of a fundamental domain with fractal boundary (a Rauzy fractal) for this toral translation.

Submitted 26 June, 2014; v1 submitted 29 August, 2011; originally announced August 2011.

Comments: 19 pages, v2 includes some corrections to match the published version, and a mistake in the graph of Fig. 1 has been corrected

Journal ref: Uniform Distribution Theory 7 (2012), no. 1, 173-197

arXiv:1101.1784 [pdf, other]

doi 10.1051/ita/2014008

Connectedness of fractals associated with Arnoux-Rauzy substitutions

Authors: Valérie Berthé, Timo Jolivet, Anne Siegel

Abstract: Rauzy fractals are compact sets with fractal boundary that can be associated with any unimodular Pisot irreducible substitution. These fractals can be defined as the Hausdorff limit of a sequence of compact sets, where each set is a renormalized projection of a finite union of faces of unit cubes. We exploit this combinatorial definition to prove the connectedness of the Rauzy fractal associated with any finite product of three-letter Arnoux-Rauzy substitutions.

Submitted 26 June, 2014; v1 submitted 10 January, 2011; originally announced January 2011.

Comments: 15 pages, v2 includes minor corrections to match the published version

Journal ref: RAIRO-Theor. Inf. Appl. 48 (2014) 249-266

Showing 1–15 of 15 results for author: Siegel, A