Search | arXiv e-print repository

arXiv:2405.19230 [pdf, other]

Valid Conformal Prediction for Dynamic GNNs

Authors: Ed Davis, Ian Gallagher, Daniel John Lawson, Patrick Rubin-Delanchy

Abstract: Graph neural networks (GNNs) are powerful black-box models which have shown impressive empirical performance. However, without any form of uncertainty quantification, it can be difficult to trust such models in high-risk scenarios. Conformal prediction aims to address this problem, however, an assumption of exchangeability is required for its validity which has limited its applicability to static… ▽ More Graph neural networks (GNNs) are powerful black-box models which have shown impressive empirical performance. However, without any form of uncertainty quantification, it can be difficult to trust such models in high-risk scenarios. Conformal prediction aims to address this problem, however, an assumption of exchangeability is required for its validity which has limited its applicability to static graphs and transductive regimes. We propose to use unfolding, which allows any existing static GNN to output a dynamic graph embedding with exchangeability properties. Using this, we extend the validity of conformal prediction to dynamic GNNs in both transductive and semi-inductive regimes. We provide a theoretical guarantee of valid conformal prediction in these cases and demonstrate the empirical validity, as well as the performance gains, of unfolded GNNs against standard GNN architectures on both simulated and real datasets. △ Less

Submitted 29 May, 2024; originally announced May 2024.

Comments: 17 pages, 8 figures

MSC Class: 62H30

arXiv:2404.15166 [pdf, other]

Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication

Authors: John R. Lawson, Montgomery L. Flora, Kevin H. Goebbert, Seth N. Lyman, Corey K. Potvin, David M. Schultz, Adam J. Stepanek, Joseph E. Trujillo-Falcón

Abstract: Generative AI, such as OpenAI's GPT-4V large-language model, has rapidly entered mainstream discourse. Novel capabilities in image processing and natural-language communication may augment existing forecasting methods. Large language models further display potential to better communicate weather hazards in a style honed for diverse communities and different languages. This study evaluates GPT-4V's… ▽ More Generative AI, such as OpenAI's GPT-4V large-language model, has rapidly entered mainstream discourse. Novel capabilities in image processing and natural-language communication may augment existing forecasting methods. Large language models further display potential to better communicate weather hazards in a style honed for diverse communities and different languages. This study evaluates GPT-4V's ability to interpret meteorological charts and communicate weather hazards appropriately to the user, despite challenges of hallucinations, where generative AI delivers coherent, confident, but incorrect responses. We assess GPT-4V's competence via its web interface ChatGPT in two tasks: (1) generating a severe-weather outlook from weather-chart analysis and conducting self-evaluation, revealing an outlook that corresponds well with a Storm Prediction Center human-issued forecast; and (2) producing hazard summaries in Spanish and English from weather charts. Responses in Spanish, however, resemble direct (not idiomatic) translations from English to Spanish, yielding poorly translated summaries that lose critical idiomatic precision required for optimal communication. Our findings advocate for cautious integration of tools like GPT-4V in meteorology, underscoring the necessity of human oversight and development of trustworthy, explainable AI. △ Less

Submitted 22 April, 2024; originally announced April 2024.

Comments: Supplementary material PDF attached. Submitted to Artificial Intelligence for the Earth Systems (American Meteorological Society) on 18 April 2024

arXiv:2401.10804 [pdf, other]

Endovascular Detection of Catheter-Thrombus Contact by Vacuum Excitation

Authors: Jared Lawson, Madison Veliky, Colette P. Abah, Mary S. Dietrich, Rohan Chitale, Nabil Simaan

Abstract: Objective: The objective of this work is to introduce and demonstrate the effectiveness of a novel sensing modality for contact detection between an off-the-shelf aspiration catheter and a thrombus. Methods: A custom robotic actuator with a pressure sensor was used to generate an oscillatory vacuum excitation and sense the pressure inside the extracorporeal portion of the catheter. Vacuum pressure… ▽ More Objective: The objective of this work is to introduce and demonstrate the effectiveness of a novel sensing modality for contact detection between an off-the-shelf aspiration catheter and a thrombus. Methods: A custom robotic actuator with a pressure sensor was used to generate an oscillatory vacuum excitation and sense the pressure inside the extracorporeal portion of the catheter. Vacuum pressure profiles and robotic motion data were used to train a support vector machine (SVM) classification model to detect contact between the aspiration catheter tip and a mock thrombus. Validation consisted of benchtop accuracy verification, as well as user study comparison to the current standard of angiographic presentation. Results: Benchtop accuracy of the sensing modality was shown to be 99.67%. The user study demonstrated statistically significant improvement in identifying catheter-thrombus contact compared to the current standard. The odds ratio of successful detection of clot contact was 2.86 (p=0.03) when using the proposed sensory method compared to without it. Conclusion: The results of this work indicate that the proposed sensing modality can offer intraoperative feedback to interventionalists that can improve their ability to detect contact between the distal tip of a catheter and a thrombus. Significance: By offering a relatively low-cost technology that affords off-the-shelf aspiration catheters as clot-detecting sensors, interventionalists can improve the first-pass effect of the mechanical thrombectomy procedure while reducing procedural times and mental burden. △ Less

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2311.09251 [pdf, other]

A Simple and Powerful Framework for Stable Dynamic Network Embedding

Authors: Ed Davis, Ian Gallagher, Daniel John Lawson, Patrick Rubin-Delanchy

Abstract: In this paper, we address the problem of dynamic network embedding, that is, representing the nodes of a dynamic network as evolving vectors within a low-dimensional space. While the field of static network embedding is wide and established, the field of dynamic network embedding is comparatively in its infancy. We propose that a wide class of established static network embedding methods can be us… ▽ More In this paper, we address the problem of dynamic network embedding, that is, representing the nodes of a dynamic network as evolving vectors within a low-dimensional space. While the field of static network embedding is wide and established, the field of dynamic network embedding is comparatively in its infancy. We propose that a wide class of established static network embedding methods can be used to produce interpretable and powerful dynamic network embeddings when they are applied to the dilated unfolded adjacency matrix. We provide a theoretical guarantee that, regardless of embedding dimension, these unfolded methods will produce stable embeddings, meaning that nodes with identical latent behaviour will be exchangeable, regardless of their position in time or space. We additionally define a hypothesis testing framework which can be used to evaluate the quality of a dynamic network embedding by testing for planted structure in simulated networks. Using this, we demonstrate that, even in trivial cases, unstable methods are often either conservative or encode incorrect structure. In contrast, we demonstrate that our suite of stable unfolded methods are not only more interpretable but also more powerful in comparison to their unstable counterparts. △ Less

Submitted 14 November, 2023; originally announced November 2023.

Comments: 33 pages, 9 figures

MSC Class: 62H15 (Primary) 62H30; 62M10; 62G99 (Secondary)

arXiv:2304.12165 [pdf, other]

Model-Based Pose Estimation of Steerable Catheters under Bi-Plane Image Feedback

Authors: Jared Lawson, Rohan Chitale, Nabil Simaan

Abstract: Small catheters undergo significant torsional deflections during endovascular interventions. A key challenge in enabling robot control of these catheters is the estimation of their bending planes. This paper considers approaches for estimating these bending planes based on bi-plane image feedback. The proposed approaches attempt to minimize error between either the direct (position-based) or insta… ▽ More Small catheters undergo significant torsional deflections during endovascular interventions. A key challenge in enabling robot control of these catheters is the estimation of their bending planes. This paper considers approaches for estimating these bending planes based on bi-plane image feedback. The proposed approaches attempt to minimize error between either the direct (position-based) or instantaneous (velocity-based) kinematics with the reconstructed kinematics from bi-plane image feedback. A comparison between these methods is carried out on a setup using two cameras in lieu of a bi-plane fluoroscopy setup. The results show that the position-based approach is less susceptible to segmentation noise and works best when the segment is in a non-straight configuration. These results suggest that estimation of the bending planes can be accompanied with errors under 30 degrees. Considering that the torsional buildup of these catheters can be more than 180 degrees, we believe that this method can be used for catheter control with improved safety due to the reduction of this uncertainty. △ Less

Submitted 24 April, 2023; originally announced April 2023.

Comments: 8 pages, 5 figures, International Conference on Robotics and Automation (ICRA)

arXiv:2210.10526 [pdf, other]

Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing

Authors: Georgios Rizos, Jenna Lawson, Simon Mitchell, Pranay Shah, Xin Wen, Cristina Banks-Leite, Robert Ewers, Bjoern W. Schuller

Abstract: We focus on using the predictive uncertainty signal calculated by Bayesian neural networks to guide learning in the self-same task the model is being trained on. Not opting for costly Monte Carlo sampling of weights, we propagate the approximate hidden variance in an end-to-end manner, throughout a variational Bayesian adaptation of a ResNet with attention and squeeze-and-excitation blocks, in ord… ▽ More We focus on using the predictive uncertainty signal calculated by Bayesian neural networks to guide learning in the self-same task the model is being trained on. Not opting for costly Monte Carlo sampling of weights, we propagate the approximate hidden variance in an end-to-end manner, throughout a variational Bayesian adaptation of a ResNet with attention and squeeze-and-excitation blocks, in order to identify data samples that should contribute less into the loss value calculation. We, thus, propose uncertainty-aware, data-specific label smoothing, where the smoothing probability is dependent on this epistemic uncertainty. We show that, through the explicit usage of the epistemic uncertainty in the loss calculation, the variational model is led to improved predictive and calibration performance. This core machine learning methodology is exemplified at wildlife call detection, from audio recordings made via passive acoustic monitoring equipment in the animals' natural habitats, with the future goal of automating large scale annotation in a trustworthy manner. △ Less

Submitted 19 October, 2022; originally announced October 2022.

arXiv:2008.13145 [pdf, other]

Performance portability through machine learning guided kernel selection in SYCL libraries

Authors: John Lawson

Abstract: Automatically tuning parallel compute kernels allows libraries and frameworks to achieve performance on a wide range of hardware, however these techniques are typically focused on finding optimal kernel parameters for particular input sizes and parameters. General purpose compute libraries must be able to cater to all inputs and parameters provided by a user, and so these techniques are of limited… ▽ More Automatically tuning parallel compute kernels allows libraries and frameworks to achieve performance on a wide range of hardware, however these techniques are typically focused on finding optimal kernel parameters for particular input sizes and parameters. General purpose compute libraries must be able to cater to all inputs and parameters provided by a user, and so these techniques are of limited use. Additionally, parallel programming frameworks such as SYCL require that the kernels be deployed in a binary format embedded within the library. As such it is impractical to deploy a large number of possible kernel configurations without inflating the library size. Machine learning methods can be used to mitigate against both of these problems and provide performance for general purpose routines with a limited number of kernel configurations. We show that unsupervised clustering methods can be used to select a subset of the possible kernels that should be deployed and that simple classification methods can be trained to select from these kernels at runtime to give good performance. As these techniques are fully automated, relying only on benchmark data, the tuning process for new hardware or problems does not require any developer effort or expertise. △ Less

Submitted 30 August, 2020; originally announced August 2020.

Comments: 13 pages, 7 figures, 2 tables

arXiv:2003.06795 [pdf, other]

Towards automated kernel selection in machine learning systems: A SYCL case study

Authors: John Lawson

Abstract: Automated tuning of compute kernels is a popular area of research, mainly focused on finding optimal kernel parameters for a problem with fixed input sizes. This approach is good for deploying machine learning models, where the network topology is constant, but machine learning research often involves changing network topologies and hyperparameters. Traditional kernel auto-tuning has limited impac… ▽ More Automated tuning of compute kernels is a popular area of research, mainly focused on finding optimal kernel parameters for a problem with fixed input sizes. This approach is good for deploying machine learning models, where the network topology is constant, but machine learning research often involves changing network topologies and hyperparameters. Traditional kernel auto-tuning has limited impact in this case; a more general selection of kernels is required for libraries to accelerate machine learning research. In this paper we present initial results using machine learning to select kernels in a case study deploying high performance SYCL kernels in libraries that target a range of heterogeneous devices from desktop GPUs to embedded accelerators. The techniques investigated apply more generally and could similarly be integrated with other heterogeneous programming systems. By combining auto-tuning and machine learning these kernel selection processes can be deployed with little developer effort to achieve high performance on new hardware. △ Less

Submitted 15 March, 2020; originally announced March 2020.

Comments: 4 pages, 4 figures, 1 table. Accepted to AsHES workshop at IPDPS 2020

arXiv:1904.05347 [pdf, other]

Cross-Platform Performance Portability Using Highly Parametrized SYCL Kernels

Authors: John Lawson, Mehdi Goli, Duncan McBain, Daniel Soutar, Louis Sugy

Abstract: Over recent years heterogeneous systems have become more prevalent across HPC systems, with over 100 supercomputers in the TOP500 incorporating GPUs or other accelerators. These hardware platforms have different performance characteristics and optimization requirements. In order to make the most of multiple accelerators a developer has to provide implementations of their algorithms tuned for each… ▽ More Over recent years heterogeneous systems have become more prevalent across HPC systems, with over 100 supercomputers in the TOP500 incorporating GPUs or other accelerators. These hardware platforms have different performance characteristics and optimization requirements. In order to make the most of multiple accelerators a developer has to provide implementations of their algorithms tuned for each device. Hardware vendors provide libraries targeting their devices specifically, which provide good performance but frequently have different API designs, hampering portability. The SYCL programming model allows users to write heterogeneous programs using completely standard C++, and so developers have access to the power of C++ templates when develo** compute kernels. In this paper we show that by writing highly parameterized kernels for matrix multiplies and convolutions we achieve performance competitive with vendor implementations across different architectures. Furthermore, tuning for new devices amounts to choosing the combinations of kernel parameters that perform best on the hardware. △ Less

Submitted 10 April, 2019; originally announced April 2019.

Comments: 11 pages, 9 figures, 4 tables

arXiv:1904.04174 [pdf, other]

doi 10.1145/3318170.3318183

Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

Authors: Rod Burns, John Lawson, Duncan McBain, Daniel Soutar

Abstract: Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks' effectiveness in the fields of image recognition and natural language processing stems primarily from the vast amounts of data available to companies a… ▽ More Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks' effectiveness in the fields of image recognition and natural language processing stems primarily from the vast amounts of data available to companies and researchers, coupled with the huge amounts of compute power available in modern accelerators such as GPUs, FPGAs and ASICs. There are a number of approaches available to developers for utilizing GPGPU technologies such as SYCL, OpenCL and CUDA, however many applications require the same low level mathematical routines. Libraries dedicated to accelerating these common routines allow developers to easily make full use of the available hardware without requiring low level knowledge of the hardware themselves, however such libraries are often provided by hardware manufacturers for specific hardware such as cuDNN for Nvidia hardware or MIOpen for AMD hardware. SYCL-DNN is a new open-source library dedicated to providing accelerated routines for neural network operations which are hardware and vendor agnostic. Built on top of the SYCL open standard and written entirely in standard C++, SYCL-DNN allows a user to easily accelerate neural network code for a wide range of hardware using a modern C++ interface. The library is tested on AMD's OpenCL for GPU, Intel's OpenCL for CPU and GPU, ARM's OpenCL for Mali GPUs as well as ComputeAorta's OpenCL for R-Car CV engine and host CPU. In this talk we will present performance figures for SYCL-DNN on this range of hardware, and discuss how high performance was achieved on such a varied set of accelerators with such different hardware features. △ Less

Submitted 8 April, 2019; originally announced April 2019.

Comments: 4 pages, 3 figures. In International Workshop on OpenCL (IWOCL '19), May 13-15, 2019, Boston

Showing 1–10 of 10 results for author: Lawson, J