-
Self-training superconducting neuromorphic circuits using reinforcement learning rules
Authors:
M. L. Schneider,
E. M. Jué,
M. R. Pufall,
K. Segall,
C. W. Anderson
Abstract:
Reinforcement learning algorithms are used in a wide range of applications, from gaming and robotics to autonomous vehicles. In this paper we describe a set of reinforcement learning-based local weight update rules and their implementation in superconducting hardware. Using SPICE circuit simulations, we implement a small-scale neural network with a learning time of order one nanosecond. This netwo…
▽ More
Reinforcement learning algorithms are used in a wide range of applications, from gaming and robotics to autonomous vehicles. In this paper we describe a set of reinforcement learning-based local weight update rules and their implementation in superconducting hardware. Using SPICE circuit simulations, we implement a small-scale neural network with a learning time of order one nanosecond. This network can be trained to learn new functions simply by changing the target output for a given set of inputs, without the need for any external adjustments to the network. In this implementation the weights are adjusted based on the current state of the overall network response and locally stored information about the previous action. This removes the need to program explicit weight values in these networks, which is one of the primary challenges that analog hardware implementations of neural networks face. The adjustment of weights is based on a global reinforcement signal that obviates the need for circuitry to back-propagate errors.
△ Less
Submitted 29 April, 2024;
originally announced April 2024.
-
Eclectic Rule Extraction for Explainability of Deep Neural Network based Intrusion Detection Systems
Authors:
Jesse Ables,
Nathaniel Childers,
William Anderson,
Sudip Mittal,
Shahram Rahimi,
Ioana Banicescu,
Maria Seale
Abstract:
This paper addresses trust issues created from the ubiquity of black box algorithms and surrogate explainers in Explainable Intrusion Detection Systems (X-IDS). While Explainable Artificial Intelligence (XAI) aims to enhance transparency, black box surrogate explainers, such as Local Interpretable Model-Agnostic Explanation (LIME) and SHapley Additive exPlanation (SHAP), are difficult to trust. Th…
▽ More
This paper addresses trust issues created from the ubiquity of black box algorithms and surrogate explainers in Explainable Intrusion Detection Systems (X-IDS). While Explainable Artificial Intelligence (XAI) aims to enhance transparency, black box surrogate explainers, such as Local Interpretable Model-Agnostic Explanation (LIME) and SHapley Additive exPlanation (SHAP), are difficult to trust. The black box nature of these surrogate explainers makes the process behind explanation generation opaque and difficult to understand. To avoid this problem, one can use transparent white box algorithms such as Rule Extraction (RE). There are three types of RE algorithms: pedagogical, decompositional, and eclectic. Pedagogical methods offer fast but untrustworthy white-box explanations, while decompositional RE provides trustworthy explanations with poor scalability. This work explores eclectic rule extraction, which strikes a balance between scalability and trustworthiness. By combining techniques from pedagogical and decompositional approaches, eclectic rule extraction leverages the advantages of both, while mitigating some of their drawbacks. The proposed Hybrid X-IDS architecture features eclectic RE as a white box surrogate explainer for black box Deep Neural Networks (DNN). The presented eclectic RE algorithm extracts human-readable rules from hidden layers, facilitating explainable and trustworthy rulesets. Evaluations on UNSW-NB15 and CIC-IDS-2017 datasets demonstrate the algorithm's ability to generate rulesets with 99.9% accuracy, mimicking DNN outputs. The contributions of this work include the hybrid X-IDS architecture, the eclectic rule extraction algorithm applicable to intrusion detection datasets, and a thorough analysis of performance and explainability, demonstrating the trade-offs involved in rule extraction speed and accuracy.
△ Less
Submitted 18 January, 2024;
originally announced January 2024.
-
An R package for parametric estimation of causal effects
Authors:
Joshua Wolff Anderson,
Cyril Rakovski
Abstract:
This article explains the usage of R package CausalModels, which is publicly available on the Comprehensive R Archive Network. While packages are available for sufficiently estimating causal effects, there lacks a package that provides a collection of structural models using the conventional statistical approach developed by Hernan and Robins (2020). CausalModels addresses this deficiency of softw…
▽ More
This article explains the usage of R package CausalModels, which is publicly available on the Comprehensive R Archive Network. While packages are available for sufficiently estimating causal effects, there lacks a package that provides a collection of structural models using the conventional statistical approach developed by Hernan and Robins (2020). CausalModels addresses this deficiency of software in R concerning causal inference by offering tools for methods that account for biases in observational data without requiring extensive statistical knowledge. These methods should not be ignored and may be more appropriate or efficient in solving particular problems. While implementations of these statistical models are distributed among a number of causal packages, CausalModels introduces a simple and accessible framework for a consistent modeling pipeline among a variety of statistical methods for estimating causal effects in a single R package. It consists of common methods including standardization, IP weighting, G-estimation, outcome regression, instrumental variables and propensity matching.
△ Less
Submitted 17 July, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Attention Hybrid Variational Net for Accelerated MRI Reconstruction
Authors:
Guoyao Shen,
Boran Hao,
Mengyu Li,
Chad W. Farris,
Ioannis Ch. Paschalidis,
Stephan W. Anderson,
Xin Zhang
Abstract:
The application of compressed sensing (CS)-enabled data reconstruction for accelerating magnetic resonance imaging (MRI) remains a challenging problem. This is due to the fact that the information lost in k-space from the acceleration mask makes it difficult to reconstruct an image similar to the quality of a fully sampled image. Multiple deep learning-based structures have been proposed for MRI r…
▽ More
The application of compressed sensing (CS)-enabled data reconstruction for accelerating magnetic resonance imaging (MRI) remains a challenging problem. This is due to the fact that the information lost in k-space from the acceleration mask makes it difficult to reconstruct an image similar to the quality of a fully sampled image. Multiple deep learning-based structures have been proposed for MRI reconstruction using CS, both in the k-space and image domains as well as using unrolled optimization methods. However, the drawback of these structures is that they are not fully utilizing the information from both domains (k-space and image). Herein, we propose a deep learning-based attention hybrid variational network that performs learning in both the k-space and image domain. We evaluate our method on a well-known open-source MRI dataset and a clinical MRI dataset of patients diagnosed with strokes from our institution to demonstrate the performance of our network. In addition to quantitative evaluation, we undertook a blinded comparison of image quality across networks performed by a subspecialty trained radiologist. Overall, we demonstrate that our network achieves a superior performance among others under multiple reconstruction tasks.
△ Less
Submitted 21 June, 2023;
originally announced June 2023.
-
REGARD: Rules of EngaGement for Automated cybeR Defense to aid in Intrusion Response
Authors:
Damodar Panigrahi,
William Anderson,
Joshua Whitman,
Sudip Mittal,
Benjamin A Blakely
Abstract:
Automated Intelligent Cyberdefense Agents (AICAs) that are part Intrusion Detection Systems (IDS) and part Intrusion Response Systems (IRS) are being designed to protect against sophisticated and automated cyber-attacks. An AICA based on the ideas of Self-Adaptive Autonomic Computing Systems (SA-ACS) can be considered as a managing system that protects a managed system like a personal computer, we…
▽ More
Automated Intelligent Cyberdefense Agents (AICAs) that are part Intrusion Detection Systems (IDS) and part Intrusion Response Systems (IRS) are being designed to protect against sophisticated and automated cyber-attacks. An AICA based on the ideas of Self-Adaptive Autonomic Computing Systems (SA-ACS) can be considered as a managing system that protects a managed system like a personal computer, web application, critical infrastructure, etc. An AICA, specifically the IRS components, can compute a wide range of potential responses to meet its security goals and objectives, such as taking actions to prevent the attack from completing, restoring the system to comply with the organizational security policy, containing or confining an attack, attack eradication, deploying forensics measures to enable future attack analysis, counterattack, and so on. To restrict its activities in order to minimize collateral/organizational damage, such an automated system must have set Rules of Engagement (RoE). Automated systems must determine which operations can be completely automated (and when), which actions require human operator confirmation, and which actions must never be undertaken. In this paper, to enable this control functionality over an IRS, we create Rules of EngaGement for Automated cybeR Defense (REGARD) system which holds a set of Rules of Engagement (RoE) to protect the managed system according to the instructions provided by the human operator. These rules help limit the action of the IRS on the managed system in compliance with the recommendations of the domain expert. We provide details of execution, management, operation, and conflict resolution for Rules of Engagement (RoE) to constrain the actions of an automated IRS. We also describe REGARD system implementation, security case studies for cyber defense, and RoE demonstrations.
△ Less
Submitted 23 May, 2023;
originally announced May 2023.
-
Explainable Intrusion Detection Systems Using Competitive Learning Techniques
Authors:
Jesse Ables,
Thomas Kirby,
Sudip Mittal,
Ioana Banicescu,
Shahram Rahimi,
William Anderson,
Maria Seale
Abstract:
The current state of the art systems in Artificial Intelligence (AI) enabled intrusion detection use a variety of black box methods. These black box methods are generally trained using Error Based Learning (EBL) techniques with a focus on creating accurate models. These models have high performative costs and are not easily explainable. A white box Competitive Learning (CL) based eXplainable Intru…
▽ More
The current state of the art systems in Artificial Intelligence (AI) enabled intrusion detection use a variety of black box methods. These black box methods are generally trained using Error Based Learning (EBL) techniques with a focus on creating accurate models. These models have high performative costs and are not easily explainable. A white box Competitive Learning (CL) based eXplainable Intrusion Detection System (X-IDS) offers a potential solution to these problem. CL models utilize an entirely different learning paradigm than EBL approaches. This different learning process makes the CL family of algorithms innately explainable and less resource intensive. In this paper, we create an X-IDS architecture that is based on DARPA's recommendation for explainable systems. In our architecture we leverage CL algorithms like, Self Organizing Maps (SOM), Growing Self Organizing Maps (GSOM), and Growing Hierarchical Self Organizing Map (GHSOM). The resulting models can be data-mined to create statistical and visual explanations. Our architecture is tested using NSL-KDD and CIC-IDS-2017 benchmark datasets, and produces accuracies that are 1% - 3% less than EBL models. However, CL models are much more explainable than EBL models. Additionally, we use a pruning process that is able to significantly reduce the size of these CL based models. By pruning our models, we are able to increase prediction speeds. Lastly, we analyze the statistical and visual explanations generated by our architecture, and we give a strategy that users could use to help navigate the set of explanations. These explanations will help users build trust with an Intrusion Detection System (IDS), and allow users to discover ways to increase the IDS's potency.
△ Less
Submitted 30 March, 2023;
originally announced March 2023.
-
A Topic Modeling Approach to Classifying Open Street Map Health Clinics and Schools in Sub-Saharan Africa
Authors:
Joshua W. Anderson,
Luis Iñaki Alberro Encina,
Tina George Karippacheril,
Jonathan Hersh,
Cadence Stringer
Abstract:
Data deprivation, or the lack of easily available and actionable information on the well-being of individuals, is a significant challenge for the develo** world and an impediment to the design and operationalization of policies intended to alleviate poverty. In this paper we explore the suitability of data derived from OpenStreetMap to proxy for the location of two crucial public services: schoo…
▽ More
Data deprivation, or the lack of easily available and actionable information on the well-being of individuals, is a significant challenge for the develo** world and an impediment to the design and operationalization of policies intended to alleviate poverty. In this paper we explore the suitability of data derived from OpenStreetMap to proxy for the location of two crucial public services: schools and health clinics. Thanks to the efforts of thousands of digital humanitarians, online map** repositories such as OpenStreetMap contain millions of records on buildings and other structures, delineating both their location and often their use. Unfortunately much of this data is locked in complex, unstructured text rendering it seemingly unsuitable for classifying schools or clinics. We apply a scalable, unsupervised learning method to unlabeled OpenStreetMap building data to extract the location of schools and health clinics in ten countries in Africa. We find the topic modeling approach greatly improves performance versus reliance on structured keys alone. We validate our results by comparing schools and clinics identified by our OSM method versus those identified by the WHO, and describe OSM coverage gaps more broadly.
△ Less
Submitted 22 December, 2022;
originally announced December 2022.
-
Synthetic Image Data for Deep Learning
Authors:
Jason W. Anderson,
Marcin Ziolkowski,
Ken Kennedy,
Amy W. Apon
Abstract:
Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification semantic segmentation models. In this work, we explore how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle. We use this dataset to quantify the effectiveness of s…
▽ More
Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification semantic segmentation models. In this work, we explore how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle. We use this dataset to quantify the effectiveness of synthetic augmentation using U-net and Double-U-net models. We found that, for this domain, synthetic images were an effective technique for augmenting limited sets of real training data. We observed that models trained on purely synthetic images had a very low mean prediction IoU on real validation images. We also observed that adding even very small amounts of real images to a synthetic dataset greatly improved accuracy, and that models trained on datasets augmented with synthetic images were more accurate than those trained on real images alone. Finally, we found that in use cases that benefit from incremental training or model specialization, pretraining a base model on synthetic images provided a sizeable reduction in the training cost of transfer learning, allowing up to 90\% of the model training to be front-loaded.
△ Less
Submitted 12 December, 2022;
originally announced December 2022.
-
Designing an Artificial Immune System inspired Intrusion Detection System
Authors:
William Anderson,
Kaneesha Moore,
Jesse Ables,
Sudip Mittal,
Shahram Rahimi,
Ioana Banicescu,
Maria Seale
Abstract:
The Human Immune System (HIS) works to protect a body from infection, illness, and disease. This system can inspire cybersecurity professionals to design an Artificial Immune System (AIS) based Intrusion Detection System (IDS). These biologically inspired algorithms using Self/Nonself and Danger Theory can directly augmentIDS designs and implementations. In this paper, we include an examination in…
▽ More
The Human Immune System (HIS) works to protect a body from infection, illness, and disease. This system can inspire cybersecurity professionals to design an Artificial Immune System (AIS) based Intrusion Detection System (IDS). These biologically inspired algorithms using Self/Nonself and Danger Theory can directly augmentIDS designs and implementations. In this paper, we include an examination into the elements of design necessary for building an AIS-IDS framework and present an architecture to create such systems.
△ Less
Submitted 16 August, 2022;
originally announced August 2022.
-
Creating an Explainable Intrusion Detection System Using Self Organizing Maps
Authors:
Jesse Ables,
Thomas Kirby,
William Anderson,
Sudip Mittal,
Shahram Rahimi,
Ioana Banicescu,
Maria Seale
Abstract:
Modern Artificial Intelligence (AI) enabled Intrusion Detection Systems (IDS) are complex black boxes. This means that a security analyst will have little to no explanation or clarification on why an IDS model made a particular prediction. A potential solution to this problem is to research and develop Explainable Intrusion Detection Systems (X-IDS) based on current capabilities in Explainable Art…
▽ More
Modern Artificial Intelligence (AI) enabled Intrusion Detection Systems (IDS) are complex black boxes. This means that a security analyst will have little to no explanation or clarification on why an IDS model made a particular prediction. A potential solution to this problem is to research and develop Explainable Intrusion Detection Systems (X-IDS) based on current capabilities in Explainable Artificial Intelligence (XAI). In this paper, we create a Self Organizing Maps (SOMs) based X-IDS system that is capable of producing explanatory visualizations. We leverage SOM's explainability to create both global and local explanations. An analyst can use global explanations to get a general idea of how a particular IDS model computes predictions. Local explanations are generated for individual datapoints to explain why a certain prediction value was computed. Furthermore, our SOM based X-IDS was evaluated on both explanation generation and traditional accuracy tests using the NSL-KDD and the CIC-IDS-2017 datasets.
△ Less
Submitted 15 July, 2022;
originally announced July 2022.
-
Explainable Intrusion Detection Systems (X-IDS): A Survey of Current Methods, Challenges, and Opportunities
Authors:
Subash Neupane,
Jesse Ables,
William Anderson,
Sudip Mittal,
Shahram Rahimi,
Ioana Banicescu,
Maria Seale
Abstract:
The application of Artificial Intelligence (AI) and Machine Learning (ML) to cybersecurity challenges has gained traction in industry and academia, partially as a result of widespread malware attacks on critical systems such as cloud infrastructures and government institutions. Intrusion Detection Systems (IDS), using some forms of AI, have received widespread adoption due to their ability to hand…
▽ More
The application of Artificial Intelligence (AI) and Machine Learning (ML) to cybersecurity challenges has gained traction in industry and academia, partially as a result of widespread malware attacks on critical systems such as cloud infrastructures and government institutions. Intrusion Detection Systems (IDS), using some forms of AI, have received widespread adoption due to their ability to handle vast amounts of data with a high prediction accuracy. These systems are hosted in the organizational Cyber Security Operation Center (CSoC) as a defense tool to monitor and detect malicious network flow that would otherwise impact the Confidentiality, Integrity, and Availability (CIA). CSoC analysts rely on these systems to make decisions about the detected threats. However, IDSs designed using Deep Learning (DL) techniques are often treated as black box models and do not provide a justification for their predictions. This creates a barrier for CSoC analysts, as they are unable to improve their decisions based on the model's predictions. One solution to this problem is to design explainable IDS (X-IDS).
This survey reviews the state-of-the-art in explainable AI (XAI) for IDS, its current challenges, and discusses how these challenges span to the design of an X-IDS. In particular, we discuss black box and white box approaches comprehensively. We also present the tradeoff between these approaches in terms of their performance and ability to produce explanations. Furthermore, we propose a generic architecture that considers human-in-the-loop which can be used as a guideline when designing an X-IDS. Research recommendations are given from three critical viewpoints: the need to define explainability for IDS, the need to create explanations tailored to various stakeholders, and the need to design metrics to evaluate explanations.
△ Less
Submitted 13 July, 2022;
originally announced July 2022.
-
Are Metrics Enough? Guidelines for Communicating and Visualizing Predictive Models to Subject Matter Experts
Authors:
Ashley Suh,
Gabriel Appleby,
Erik W. Anderson,
Luca Finelli,
Remco Chang,
Dylan Cashman
Abstract:
Presenting a predictive model's performance is a communication bottleneck that threatens collaborations between data scientists and subject matter experts. Accuracy and error metrics alone fail to tell the whole story of a model - its risks, strengths, and limitations - making it difficult for subject matter experts to feel confident in their decision to use a model. As a result, models may fail i…
▽ More
Presenting a predictive model's performance is a communication bottleneck that threatens collaborations between data scientists and subject matter experts. Accuracy and error metrics alone fail to tell the whole story of a model - its risks, strengths, and limitations - making it difficult for subject matter experts to feel confident in their decision to use a model. As a result, models may fail in unexpected ways or go entirely unused, as subject matter experts disregard poorly presented models in favor of familiar, yet arguably substandard methods. In this paper, we describe an iterative study conducted with both subject matter experts and data scientists to understand the gaps in communication between these two groups. We find that, while the two groups share common goals of understanding the data and predictions of the model, friction can stem from unfamiliar terms, metrics, and visualizations - limiting the transfer of knowledge to SMEs and discouraging clarifying questions being asked during presentations. Based on our findings, we derive a set of communication guidelines that use visualization as a common medium for communicating the strengths and weaknesses of a model. We provide a demonstration of our guidelines in a regression modeling scenario and elicit feedback on their use from subject matter experts. From our demonstration, subject matter experts were more comfortable discussing a model's performance, more aware of the trade-offs for the presented model, and better equipped to assess the model's risks - ultimately informing and contextualizing the model's use beyond text and numbers.
△ Less
Submitted 27 March, 2023; v1 submitted 11 May, 2022;
originally announced May 2022.
-
Machine Learning: Algorithms, Models, and Applications
Authors:
Jaydip Sen,
Sidra Mehtab,
Rajdeep Sen,
Abhishek Dutta,
Pooja Kherwa,
Saheel Ahmed,
Pranay Berry,
Sahil Khurana,
Sonali Singh,
David W. W Cadotte,
David W. Anderson,
Kalum J. Ost,
Racheal S. Akinbo,
Oladunni A. Daramola,
Bongs Lainjo
Abstract:
Recent times are witnessing rapid development in machine learning algorithm systems, especially in reinforcement learning, natural language processing, computer and robot vision, image processing, speech, and emotional processing and understanding. In tune with the increasing importance and relevance of machine learning models, algorithms, and their applications, and with the emergence of more inn…
▽ More
Recent times are witnessing rapid development in machine learning algorithm systems, especially in reinforcement learning, natural language processing, computer and robot vision, image processing, speech, and emotional processing and understanding. In tune with the increasing importance and relevance of machine learning models, algorithms, and their applications, and with the emergence of more innovative uses cases of deep learning and artificial intelligence, the current volume presents a few innovative research works and their applications in real world, such as stock trading, medical and healthcare systems, and software automation. The chapters in the book illustrate how machine learning and deep learning algorithms and models are designed, optimized, and deployed. The volume will be useful for advanced graduate and doctoral students, researchers, faculty members of universities, practicing data scientists and data engineers, professionals, and consultants working on the broad areas of machine learning, deep learning, and artificial intelligence.
△ Less
Submitted 6 January, 2022;
originally announced January 2022.
-
UnProjection: Leveraging Inverse-Projections for Visual Analytics of High-Dimensional Data
Authors:
Mateus Espadoto,
Gabriel Appleby,
Ashley Suh,
Dylan Cashman,
Mingwei Li,
Carlos Scheidegger,
Erik W Anderson,
Remco Chang,
Alexandru C Telea
Abstract:
Projection techniques are often used to visualize high-dimensional data, allowing users to better understand the overall structure of multi-dimensional spaces on a 2D screen. Although many such methods exist, comparably little work has been done on generalizable methods of inverse-projection -- the process of map** the projected points, or more generally, the projection space back to the origina…
▽ More
Projection techniques are often used to visualize high-dimensional data, allowing users to better understand the overall structure of multi-dimensional spaces on a 2D screen. Although many such methods exist, comparably little work has been done on generalizable methods of inverse-projection -- the process of map** the projected points, or more generally, the projection space back to the original high-dimensional space. In this paper we present NNInv, a deep learning technique with the ability to approximate the inverse of any projection or map**. NNInv learns to reconstruct high-dimensional data from any arbitrary point on a 2D projection space, giving users the ability to interact with the learned high-dimensional representation in a visual analytics system. We provide an analysis of the parameter space of NNInv, and offer guidance in selecting these parameters. We extend validation of the effectiveness of NNInv through a series of quantitative and qualitative analyses. We then demonstrate the method's utility by applying it to three visualization tasks: interactive instance interpolation, classifier agreement, and gradient visualization.
△ Less
Submitted 2 November, 2021;
originally announced November 2021.
-
HyperNP: Interactive Visual Exploration of Multidimensional Projection Hyperparameters
Authors:
Gabriel Appleby,
Mateus Espadoto,
Rui Chen,
Samuel Goree,
Alexandru Telea,
Erik W Anderson,
Remco Chang
Abstract:
Projection algorithms such as t-SNE or UMAP are useful for the visualization of high dimensional data, but depend on hyperparameters which must be tuned carefully. Unfortunately, iteratively recomputing projections to find the optimal hyperparameter value is computationally intensive and unintuitive due to the stochastic nature of these methods. In this paper we propose HyperNP, a scalable method…
▽ More
Projection algorithms such as t-SNE or UMAP are useful for the visualization of high dimensional data, but depend on hyperparameters which must be tuned carefully. Unfortunately, iteratively recomputing projections to find the optimal hyperparameter value is computationally intensive and unintuitive due to the stochastic nature of these methods. In this paper we propose HyperNP, a scalable method that allows for real-time interactive hyperparameter exploration of projection methods by training neural network approximations. HyperNP can be trained on a fraction of the total data instances and hyperparameter configurations and can compute projections for new data and hyperparameters at interactive speeds. HyperNP is compact in size and fast to compute, thus allowing it to be embedded in lightweight visualization systems such as web browsers. We evaluate the performance of the HyperNP across three datasets in terms of performance and speed. The results suggest that HyperNP is accurate, scalable, interactive, and appropriate for use in real-world settings.
△ Less
Submitted 25 June, 2021;
originally announced June 2021.
-
Quantum Self-Supervised Learning
Authors:
Ben Jaderberg,
Lewis W. Anderson,
Weidi Xie,
Samuel Albanie,
Martin Kiffner,
Dieter Jaksch
Abstract:
The resurgence of self-supervised learning, whereby a deep learning model generates its own supervisory signal from the data, promises a scalable way to tackle the dramatically increasing size of real-world data sets without human annotation. However, the staggering computational complexity of these methods is such that for state-of-the-art performance, classical hardware requirements represent a…
▽ More
The resurgence of self-supervised learning, whereby a deep learning model generates its own supervisory signal from the data, promises a scalable way to tackle the dramatically increasing size of real-world data sets without human annotation. However, the staggering computational complexity of these methods is such that for state-of-the-art performance, classical hardware requirements represent a significant bottleneck to further progress. Here we take the first steps to understanding whether quantum neural networks could meet the demand for more powerful architectures and test its effectiveness in proof-of-principle hybrid experiments. Interestingly, we observe a numerical advantage for the learning of visual representations using small-scale quantum neural networks over equivalently structured classical networks, even when the quantum circuits are sampled with only 100 shots. Furthermore, we apply our best quantum model to classify unseen images on the ibmq\_paris quantum computer and find that current noisy devices can already achieve equal accuracy to the equivalent classical model on downstream tasks.
△ Less
Submitted 4 April, 2022; v1 submitted 26 March, 2021;
originally announced March 2021.
-
MRI correlates of chronic symptoms in mild traumatic brain injury
Authors:
Cailey I. Kerley,
Kurt G. Schilling,
Justin Blaber,
Beth Miller,
Allen Newton,
Adam W. Anderson,
Bennett A. Landman,
Tonia S. Rex
Abstract:
Veterans with mild traumatic brain injury (mTBI) have reported auditory and visual dysfunction that persists beyond the acute incident. The etiology behind these symptoms is difficult to characterize with current clinical imaging. These functional deficits may be caused by shear injury or micro-bleeds, which can be detected with special imaging modalities. We explore these hypotheses in a pilot st…
▽ More
Veterans with mild traumatic brain injury (mTBI) have reported auditory and visual dysfunction that persists beyond the acute incident. The etiology behind these symptoms is difficult to characterize with current clinical imaging. These functional deficits may be caused by shear injury or micro-bleeds, which can be detected with special imaging modalities. We explore these hypotheses in a pilot study of multi-parametric MRI. We extract over 1,000 imaging and clinical metrics and project them to a low-dimensional space, where we can discriminate between healthy controls and patients with mTBI. We also show correlations between the metric representations and patient symptoms.
△ Less
Submitted 22 June, 2020; v1 submitted 6 December, 2019;
originally announced December 2019.
-
Deep Learning Captures More Accurate Diffusion Fiber Orientations Distributions than Constrained Spherical Deconvolution
Authors:
Vishwesh Nath,
Kurt G. Schilling,
Colin B. Hansen,
Prasanna Parvathaneni,
Allison E. Hainline,
Camilo Bermudez,
Andrew J. Plassard,
Vaibhav Janve,
Yurui Gao,
Justin A. Blaber,
Iwona Stępniewska,
Adam W. Anderson,
Bennett A. Landman
Abstract:
Confocal histology provides an opportunity to establish intra-voxel fiber orientation distributions that can be used to quantitatively assess the biological relevance of diffusion weighted MRI models, e.g., constrained spherical deconvolution (CSD). Here, we apply deep learning to investigate the potential of single shell diffusion weighted MRI to explain histologically observed fiber orientation…
▽ More
Confocal histology provides an opportunity to establish intra-voxel fiber orientation distributions that can be used to quantitatively assess the biological relevance of diffusion weighted MRI models, e.g., constrained spherical deconvolution (CSD). Here, we apply deep learning to investigate the potential of single shell diffusion weighted MRI to explain histologically observed fiber orientation distributions (FOD) and compare the derived deep learning model with a leading CSD approach. This study (1) demonstrates that there exists additional information in the diffusion signal that is not currently exploited by CSD, and (2) provides an illustrative data-driven model that makes use of this information.
△ Less
Submitted 13 November, 2019;
originally announced November 2019.
-
AI Assisted Annotator using Reinforcement Learning
Authors:
V. Ratna Saripalli,
Gopal Avinash,
Dibyajyoti Pati,
Michael Potter,
Charles W. Anderson
Abstract:
Healthcare data suffers from both noise and lack of ground truth. The cost of data increases as it is cleaned and annotated in healthcare. Unlike other data sets, medical data annotation, which is critical to accurate ground truth, requires medical domain expertise for a better patient outcome. In this work, we report on the use of reinforcement learning to mimic the decision making process of ann…
▽ More
Healthcare data suffers from both noise and lack of ground truth. The cost of data increases as it is cleaned and annotated in healthcare. Unlike other data sets, medical data annotation, which is critical to accurate ground truth, requires medical domain expertise for a better patient outcome. In this work, we report on the use of reinforcement learning to mimic the decision making process of annotators for medical events, to automate annotation and labelling. The reinforcement agent learns to annotate alarm data based on annotations done by an expert. Our method shows promising results on medical alarm data sets. We trained DQN and A2C agents using the data from monitoring devices annotated by an expert. Initial results from these RL agents learning the expert annotation behavior are promising. The A2C agent performs better in terms of learning the sparse events in a given state, thereby choosing more right actions compared to DQN agent. To the best of our knowledge, this is the first reinforcement learning application for the automation of medical events annotation, which has far-reaching practical use.
△ Less
Submitted 11 June, 2020; v1 submitted 2 October, 2019;
originally announced October 2019.
-
Enabling Multi-Shell b-Value Generalizability of Data-Driven Diffusion Models with Deep SHORE
Authors:
Vishwesh Nath,
Ilwoo Lyu,
Kurt G. Schilling,
Prasanna Parvathaneni,
Colin B. Hansen,
Yucheng Tang,
Yuankai Huo,
Vaibhav A. Janve,
Yurui Gao,
Iwona Stepniewska,
Adam W. Anderson,
Bennett A. Landman
Abstract:
Intra-voxel models of the diffusion signal are essential for interpreting organization of the tissue environment at micrometer level with data at millimeter resolution. Recent advances in data driven methods have enabled direct compari-son and optimization of methods for in-vivo data with externally validated histological sections with both 2-D and 3-D histology. Yet, all existing methods make lim…
▽ More
Intra-voxel models of the diffusion signal are essential for interpreting organization of the tissue environment at micrometer level with data at millimeter resolution. Recent advances in data driven methods have enabled direct compari-son and optimization of methods for in-vivo data with externally validated histological sections with both 2-D and 3-D histology. Yet, all existing methods make limiting assumptions of either (1) model-based linkages between b-values or (2) limited associations with single shell data. We generalize prior deep learning models that used single shell spherical harmonic transforms to integrate the re-cently developed simple harmonic oscillator reconstruction (SHORE) basis. To enable learning on the SHORE manifold, we present an alternative formulation of the fiber orientation distribution (FOD) object using the SHORE basis while rep-resenting the observed diffusion weighted data in the SHORE basis. To ensure consistency of hyper-parameter optimization for SHORE, we present our Deep SHORE approach to learn on a data-optimized manifold. Deep SHORE is evalu-ated with eight-fold cross-validation of a preclinical MRI-histology data with four b-values. Generalizability of in-vivo human data is evaluated on two separate 3T MRI scanners. Specificity in terms of angular correlation (ACC) with the preclinical data improved on single shell: 0.78 relative to 0.73 and 0.73, multi-shell: 0.80 relative to 0.74 (p < 0.001). In the in-vivo human data, Deep SHORE was more consistent across scanners with 0.63 relative to other multi-shell methods 0.39, 0.52 and 0.57 in terms of ACC. In conclusion, Deep SHORE is a promising method to enable data driven learning with DW-MRI under conditions with varying b-values, number of diffusion shells, and gradient directions per shell.
△ Less
Submitted 22 February, 2020; v1 submitted 14 July, 2019;
originally announced July 2019.
-
Inter-Scanner Harmonization of High Angular Resolution DW-MRI using Null Space Deep Learning
Authors:
Vishwesh Nath,
Prasanna Parvathaneni,
Colin B. Hansen,
Allison E. Hainline,
Camilo Bermudez,
Samuel Remedios,
Justin A. Blaber,
Kurt G. Schilling,
Ilwoo Lyu,
Vaibhav Janve,
Yurui Gao,
Iwona Stepniewska,
Baxter P. Rogers,
Allen T. Newton,
L. Taylor Davis,
Jeff Luci,
Adam W. Anderson,
Bennett A. Landman
Abstract:
Diffusion-weighted magnetic resonance imaging (DW-MRI) allows for non-invasive imaging of the local fiber architecture of the human brain at a millimetric scale. Multiple classical approaches have been proposed to detect both single (e.g., tensors) and multiple (e.g., constrained spherical deconvolution, CSD) fiber population orientations per voxel. However, existing techniques generally exhibit l…
▽ More
Diffusion-weighted magnetic resonance imaging (DW-MRI) allows for non-invasive imaging of the local fiber architecture of the human brain at a millimetric scale. Multiple classical approaches have been proposed to detect both single (e.g., tensors) and multiple (e.g., constrained spherical deconvolution, CSD) fiber population orientations per voxel. However, existing techniques generally exhibit low reproducibility across MRI scanners. Herein, we propose a data-driven tech-nique using a neural network design which exploits two categories of data. First, training data were acquired on three squirrel monkey brains using ex-vivo DW-MRI and histology of the brain. Second, repeated scans of human subjects were acquired on two different scanners to augment the learning of the network pro-posed. To use these data, we propose a new network architecture, the null space deep network (NSDN), to simultaneously learn on traditional observed/truth pairs (e.g., MRI-histology voxels) along with repeated observations without a known truth (e.g., scan-rescan MRI). The NSDN was tested on twenty percent of the histology voxels that were kept completely blind to the network. NSDN significantly improved absolute performance relative to histology by 3.87% over CSD and 1.42% over a recently proposed deep neural network approach. More-over, it improved reproducibility on the paired data by 21.19% over CSD and 10.09% over a recently proposed deep approach. Finally, NSDN improved gen-eralizability of the model to a third in vivo human scanner (which was not used in training) by 16.08% over CSD and 10.41% over a recently proposed deep learn-ing approach. This work suggests that data-driven approaches for local fiber re-construction are more reproducible, informative and precise and offers a novel, practical method for determining these models.
△ Less
Submitted 9 October, 2018;
originally announced October 2018.
-
Software Must Move! A Description of the Software Assembly Line
Authors:
Martin J. McGowan III,
William L. Anderson
Abstract:
This paper describes a set of tools for automating and controlling the development and maintenance of software systems. The mental model is a software assembly line. Program design and construction take place at individual programmer workstations. Integration of individual software components takes place at subsequent stations on the assembly line. Software is moved automatically along the assembl…
▽ More
This paper describes a set of tools for automating and controlling the development and maintenance of software systems. The mental model is a software assembly line. Program design and construction take place at individual programmer workstations. Integration of individual software components takes place at subsequent stations on the assembly line. Software is moved automatically along the assembly line toward final packaging. Software under construction or maintenance is divided into packages. Each package of software is composed of a recipe and ingredients. Some new terms are introduced to describe the ingredients. The recipe specifies how ingredients are transformed into products. The benefits of the Software Assembly Line for development, maintenance, and management of large-scale computer systems are explained.
△ Less
Submitted 10 June, 2010;
originally announced June 2010.