Search | arXiv e-print repository

arXiv:2311.02010 [pdf, other]

A cast of thousands: How the IDEAS Productivity project has advanced software productivity and sustainability

Authors: Lois Curfman McInnes, Michael Heroux, David E. Bernholdt, Anshu Dubey, Elsa Gonsiorowski, Rinku Gupta, Osni Marques, J. David Moulton, Hai Ah Nam, Boyana Norris, Elaine M. Raybourn, Jim Willenbring, Ann Almgren, Ross Bartlett, Kita Cranfill, Stephen Fickas, Don Frederick, William Godoy, Patricia Grubel, Rebecca Hartman-Baker, Axel Huebl, Rose Lynch, Addi Malviya Thakur, Reed Milewicz, Mark C. Miller , et al. (9 additional authors not shown)

Abstract: Computational and data-enabled science and engineering are revolutionizing advances throughout science and society, at all scales of computing. For example, teams in the U.S. DOE Exascale Computing Project have been tackling new frontiers in modeling, simulation, and analysis by exploiting unprecedented exascale computing capabilities-building an advanced software ecosystem that supports next-gene… ▽ More Computational and data-enabled science and engineering are revolutionizing advances throughout science and society, at all scales of computing. For example, teams in the U.S. DOE Exascale Computing Project have been tackling new frontiers in modeling, simulation, and analysis by exploiting unprecedented exascale computing capabilities-building an advanced software ecosystem that supports next-generation applications and addresses disruptive changes in computer architectures. However, concerns are growing about the productivity of the developers of scientific software, its sustainability, and the trustworthiness of the results that it produces. Members of the IDEAS project serve as catalysts to address these challenges through fostering software communities, incubating and curating methodologies and resources, and disseminating knowledge to advance developer productivity and software sustainability. This paper discusses how these synergistic activities are advancing scientific discovery-mitigating technical risks by building a firmer foundation for reproducible, sustainable science at all scales of computing, from laptops to clusters to exascale and beyond. △ Less

Submitted 16 February, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

Comments: 12 pages, 1 figure

arXiv:2303.04630 [pdf]

doi 10.1038/s41746-023-00895-8

Mining the contribution of intensive care clinical course to outcome after traumatic brain injury

Authors: Shubhayu Bhattacharyay, Pier Francesco Caruso, Cecilia Åkerlund, Lindsay Wilson, Robert D Stevens, David K Menon, Ewout W Steyerberg, David W Nelson, Ari Ercole, the CENTER-TBI investigators/participants

Abstract: Existing methods to characterise the evolving condition of traumatic brain injury (TBI) patients in the intensive care unit (ICU) do not capture the context necessary for individualising treatment. Here, we integrate all heterogenous data stored in medical records (1,166 pre-ICU and ICU variables) to model the individualised contribution of clinical course to six-month functional outcome on the Gl… ▽ More Existing methods to characterise the evolving condition of traumatic brain injury (TBI) patients in the intensive care unit (ICU) do not capture the context necessary for individualising treatment. Here, we integrate all heterogenous data stored in medical records (1,166 pre-ICU and ICU variables) to model the individualised contribution of clinical course to six-month functional outcome on the Glasgow Outcome Scale - Extended (GOSE). On a prospective cohort (n=1,550, 65 centres) of TBI patients, we train recurrent neural network models to map a token-embedded time series representation of all variables (including missing values) to an ordinal GOSE prognosis every two hours. The full range of variables explains up to 52% (95% CI: 50%-54%) of the ordinal variance in functional outcome. Up to 91% (95% CI: 90%-91%) of this explanation is derived from pre-ICU and admission information (i.e., static variables). Information collected in the ICU (i.e., dynamic variables) increases explanation (by up to 5% [95% CI: 4%-6%]), though not enough to counter poorer overall performance in longer-stay (>5.75 days) patients. Highest-contributing variables include physician-based prognoses, CT features, and markers of neurological function. Whilst static information currently accounts for the majority of functional outcome explanation after TBI, data-driven analysis highlights investigative avenues to improve dynamic characterisation of longer-stay patients. Moreover, our modelling strategy proves useful for converting large patient records into interpretable time series with missing data integration and minimal processing. △ Less

Submitted 1 August, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

Journal ref: npj Digit. Med. 6, 154 (2023)

arXiv:2202.04801 [pdf]

doi 10.1371/journal.pone.0270973

The leap to ordinal: detailed functional prognosis after traumatic brain injury with a flexible modelling approach

Authors: Shubhayu Bhattacharyay, Ioan Milosevic, Lindsay Wilson, David K. Menon, Robert D. Stevens, Ewout W. Steyerberg, David W. Nelson, Ari Ercole, the CENTER-TBI investigators/participants

Abstract: When a patient is admitted to the intensive care unit (ICU) after a traumatic brain injury (TBI), an early prognosis is essential for baseline risk adjustment and shared decision making. TBI outcomes are commonly categorised by the Glasgow Outcome Scale-Extended (GOSE) into 8, ordered levels of functional recovery at 6 months after injury. Existing ICU prognostic models predict binary outcomes at… ▽ More When a patient is admitted to the intensive care unit (ICU) after a traumatic brain injury (TBI), an early prognosis is essential for baseline risk adjustment and shared decision making. TBI outcomes are commonly categorised by the Glasgow Outcome Scale-Extended (GOSE) into 8, ordered levels of functional recovery at 6 months after injury. Existing ICU prognostic models predict binary outcomes at a certain threshold of GOSE (e.g., prediction of survival [GOSE>1] or functional independence [GOSE>4]). We aimed to develop ordinal prediction models that concurrently predict probabilities of each GOSE score. From a prospective cohort (n=1,550, 65 centres) in the ICU stratum of the Collaborative European NeuroTrauma Effectiveness Research in TBI (CENTER-TBI) patient dataset, we extracted all clinical information within 24 hours of ICU admission (1,151 predictors) and 6-month GOSE scores. We analysed the effect of 2 design elements on ordinal model performance: (1) the baseline predictor set, ranging from a concise set of 10 validated predictors to a token-embedded representation of all possible predictors, and (2) the modelling strategy, from ordinal logistic regression to multinomial deep learning. With repeated k-fold cross-validation, we found that expanding the baseline predictor set significantly improved ordinal prediction performance while increasing analytical complexity did not. Half of these gains could be achieved with the addition of 8 high-impact predictors (2 demographic variables, 4 protein biomarkers, and 2 severity assessments) to the concise set. At best, ordinal models achieved 0.76 (95% CI: 0.74-0.77) ordinal discrimination ability (ordinal c-index) and 57% (95% CI: 54%-60%) explanation of ordinal variation in 6-month GOSE (Somers' D). Our results motivate the search for informative predictors for higher GOSE and the development of ordinal dynamic prediction models. △ Less

Submitted 4 May, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: 68 pages, 4 figures, 4 tables, 1 appendix, 6 supplementary figures, 4 supplementary tables, 3 supplementary methods, 1 supplementary result

ACM Class: J.3; I.2.0; I.5.1

Journal ref: PLOS ONE 17:7 (2022) e0270973

arXiv:2002.03309 [pdf]

doi 10.1016/j.accpm.2021.101015

A Physiology-Driven Computational Model for Post-Cardiac Arrest Outcome Prediction

Authors: Han B. Kim, Hieu Nguyen, Qingchu **, Sharmila Tamby, Tatiana Gelaf Romer, Eric Sung, Ran Liu, Joseph Greenstein, Jose I. Suarez, Christian Storm, Raimond Winslow, Robert D. Stevens

Abstract: Patients resuscitated from cardiac arrest (CA) face a high risk of neurological disability and death, however pragmatic methods are lacking for accurate and reliable prognostication. The aim of this study was to build computational models to predict post-CA outcome by leveraging high-dimensional patient data available early after admission to the intensive care unit (ICU). We hypothesized that mod… ▽ More Patients resuscitated from cardiac arrest (CA) face a high risk of neurological disability and death, however pragmatic methods are lacking for accurate and reliable prognostication. The aim of this study was to build computational models to predict post-CA outcome by leveraging high-dimensional patient data available early after admission to the intensive care unit (ICU). We hypothesized that model performance could be enhanced by integrating physiological time series (PTS) data and by training machine learning (ML) classifiers. We compared three models integrating features extracted from the electronic health records (EHR) alone, features derived from PTS collected in the first 24hrs after ICU admission (PTS24), and models integrating PTS24 and EHR. Outcomes of interest were survival and neurological outcome at ICU discharge. Combined EHR-PTS24 models had higher discrimination (area under the receiver operating characteristic curve [AUC]) than models which used either EHR or PTS24 alone, for the prediction of survival (AUC 0.85, 0.80 and 0.68 respectively) and neurological outcome (0.87, 0.83 and 0.78). The best ML classifier achieved higher discrimination than the reference logistic regression model (APACHE III) for survival (AUC 0.85 vs 0.70) and neurological outcome prediction (AUC 0.87 vs 0.75). Feature analysis revealed previously unknown factors to be associated with post-CA recovery. Results attest to the effectiveness of ML models for post-CA predictive modeling and suggest that PTS recorded in very early phase after resuscitation encode short-term outcome probabilities. △ Less

Submitted 11 February, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

Comments: 51 pages, 7 figures, 4 supplementary figures

ACM Class: J.3; I.2.1; I.6.4; G.3

Journal ref: Anaesthesia Critical Care & Pain Medicine 41.1 (2022): 101015

arXiv:1904.09538 [pdf, other]

doi 10.1177/1094342020921340

A mechanism for balancing accuracy and scope in cross-machine black-box GPU performance modeling

Authors: James D. Stevens, Andreas Klöckner

Abstract: The ability to model, analyze, and predict execution time of computations is an important building block supporting numerous efforts, such as load balancing, performance optimization, and automated performance tuning for high performance, parallel applications. In today's increasingly heterogeneous computing environment, this task must be accomplished efficiently across multiple architectures, inc… ▽ More The ability to model, analyze, and predict execution time of computations is an important building block supporting numerous efforts, such as load balancing, performance optimization, and automated performance tuning for high performance, parallel applications. In today's increasingly heterogeneous computing environment, this task must be accomplished efficiently across multiple architectures, including massively parallel coprocessors like GPUs. To address this challenge, we present an approach for constructing customizable, cross-machine performance models for GPU kernels, including a mechanism to automatically and symbolically gather performance-relevant kernel operation counts, a tool for formulating mathematical models using these counts, and a customizable parameterized collection of benchmark kernels used to calibrate models to GPUs in a black-box fashion. Our approach empowers a user to manage trade-offs between model accuracy, evaluation speed, and generalizability. A user can define a model and customize the calibration process, making it as simple or complex as desired, and as application-targeted or general as desired. To evaluate our approach, we demonstrate both linear and nonlinear models; each example models execution times for multiple variants of a particular computation: two matrix multiplication variants, four Discontinuous Galerkin (DG) differentiation operation variants, and two 2-D five-point finite difference stencil variants. For each variant, we present accuracy results on GPUs from multiple vendors and hardware generations. We view this customizable approach as a response to a central question in GPU performance modeling: how can we model GPU performance in a cost-explanatory fashion while maintaining accuracy, evaluation speed, portability, and ease of use, an attribute we believe precludes manual collection of kernel or hardware statistics. △ Less

Submitted 19 June, 2020; v1 submitted 20 April, 2019; originally announced April 2019.

Comments: 25 pages, 9 figures

Journal ref: The International Journal of High Performance Computing Applications, June 2020

Showing 1–5 of 5 results for author: Stevens, D