-
Benchmark datasets driving artificial intelligence development fail to capture the needs of medical professionals
Authors:
Kathrin Blagec,
Jakob Kraiger,
Wolfgang Frühwirt,
Matthias Samwald
Abstract:
Publicly accessible benchmarks that allow for assessing and comparing model performances are important drivers of progress in artificial intelligence (AI). While recent advances in AI capabilities hold the potential to transform medical practice by assisting and augmenting the cognitive processes of healthcare professionals, the coverage of clinically relevant tasks by AI benchmarks is largely unc…
▽ More
Publicly accessible benchmarks that allow for assessing and comparing model performances are important drivers of progress in artificial intelligence (AI). While recent advances in AI capabilities hold the potential to transform medical practice by assisting and augmenting the cognitive processes of healthcare professionals, the coverage of clinically relevant tasks by AI benchmarks is largely unclear. Furthermore, there is a lack of systematized meta-information that allows clinical AI researchers to quickly determine accessibility, scope, content and other characteristics of datasets and benchmark datasets relevant to the clinical domain.
To address these issues, we curated and released a comprehensive catalogue of datasets and benchmarks pertaining to the broad domain of clinical and biomedical natural language processing (NLP), based on a systematic review of literature and online resources. A total of 450 NLP datasets were manually systematized and annotated with rich metadata, such as targeted tasks, clinical applicability, data types, performance metrics, accessibility and licensing information, and availability of data splits. We then compared tasks covered by AI benchmark datasets with relevant tasks that medical practitioners reported as highly desirable targets for automation in a previous empirical study.
Our analysis indicates that AI benchmarks of direct clinical relevance are scarce and fail to cover most work activities that clinicians want to see addressed. In particular, tasks associated with routine documentation and patient data administration workflows are not represented despite significant associated workloads. Thus, currently available AI benchmarks are improperly aligned with desired targets for AI automation in clinical settings, and novel benchmarks should be created to fill these gaps.
△ Less
Submitted 12 May, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Towards better healthcare: What could and should be automated?
Authors:
Wolfgang Frühwirt,
Paul Duckworth
Abstract:
While artificial intelligence (AI) and other automation technologies might lead to enormous progress in healthcare, they may also have undesired consequences for people working in the field. In this interdisciplinary study, we capture empirical evidence of not only what healthcare work could be automated, but also what should be automated. We quantitatively investigate these research questions by…
▽ More
While artificial intelligence (AI) and other automation technologies might lead to enormous progress in healthcare, they may also have undesired consequences for people working in the field. In this interdisciplinary study, we capture empirical evidence of not only what healthcare work could be automated, but also what should be automated. We quantitatively investigate these research questions by utilizing probabilistic machine learning models trained on thousands of ratings, provided by both healthcare practitioners and automation experts. Based on our findings, we present an analytical tool (Automatability-Desirability Matrix) to support policymakers and organizational leaders in develo** practical strategies on how to harness the positive power of automation technologies, while accompanying change and empowering stakeholders in a participatory fashion.
△ Less
Submitted 21 October, 2019;
originally announced October 2019.
-
Bayesian deep neural networks for low-cost neurophysiological markers of Alzheimer's disease severity
Authors:
Wolfgang Fruehwirt,
Adam D. Cobb,
Martin Mairhofer,
Leonard Weydemann,
Heinrich Garn,
Reinhold Schmidt,
Thomas Benke,
Peter Dal-Bianco,
Gerhard Ransmayr,
Markus Waser,
Dieter Grossegger,
Pengfei Zhang,
Georg Dorffner,
Stephen Roberts
Abstract:
As societies around the world are ageing, the number of Alzheimer's disease (AD) patients is rapidly increasing. To date, no low-cost, non-invasive biomarkers have been established to advance the objectivization of AD diagnosis and progression assessment. Here, we utilize Bayesian neural networks to develop a multivariate predictor for AD severity using a wide range of quantitative EEG (QEEG) mark…
▽ More
As societies around the world are ageing, the number of Alzheimer's disease (AD) patients is rapidly increasing. To date, no low-cost, non-invasive biomarkers have been established to advance the objectivization of AD diagnosis and progression assessment. Here, we utilize Bayesian neural networks to develop a multivariate predictor for AD severity using a wide range of quantitative EEG (QEEG) markers. The Bayesian treatment of neural networks both automatically controls model complexity and provides a predictive distribution over the target function, giving uncertainty bounds for our regression task. It is therefore well suited to clinical neuroscience, where data sets are typically sparse and practitioners require a precise assessment of the predictive uncertainty. We use data of one of the largest prospective AD EEG trials ever conducted to demonstrate the potential of Bayesian deep learning in this domain, while comparing two distinct Bayesian neural network approaches, i.e., Monte Carlo dropout and Hamiltonian Monte Carlo.
△ Less
Submitted 13 December, 2018; v1 submitted 12 December, 2018;
originally announced December 2018.