-
A Review of Large Language Models and Autonomous Agents in Chemistry
Authors:
Mayk Caldas Ramos,
Christopher J. Collison,
Andrew D. White
Abstract:
Large language models (LLMs) are emerging as a powerful tool in chemistry across multiple domains. In chemistry, LLMs are able to accurately predict properties, design new molecules, optimize synthesis pathways, and accelerate drug and material discovery. A core emerging idea is combining LLMs with chemistry-specific tools like synthesis planners and databases, leading to so-called "agents." This review covers LLMs' recent history, current capabilities, design, challenges specific to chemistry, and future directions. Particular attention is given to agents and their emergence as a cross-chemistry paradigm. Agents have proven effective in diverse domains of chemistry, but challenges remain. It is unclear if creating domain-specific versus generalist agents and developing autonomous pipelines versus "co-pilot" systems will accelerate chemistry. An emerging direction is the development of multi-agent systems using a human-in-the-loop approach. Due to the incredibly fast development of this field, a repository has been built to keep track of the latest studies: https://github.com/ur-whitelab/LLMs-in-science.
Submitted 26 June, 2024;
originally announced July 2024.
-
Machine Learning Visualization Tool for Exploring Parameterized Hydrodynamics
Authors:
C. F. Jekel,
D. M. Sterbentz,
T. M. Stitt,
P. Mocz,
R. N. Rieben,
D. A. White,
J. L. Belof
Abstract:
We are interested in the computational study of shock hydrodynamics, i.e. problems involving compressible solids, liquids, and gases that undergo large deformation. These problems are dynamic and nonlinear and can exhibit complex instabilities. Due to advances in high performance computing it is possible to parameterize a hydrodynamic problem and perform a computational study yielding $\mathcal{O}\left({\rm TB}\right)$ of simulation state data. We present an interactive machine learning tool that can be used to compress, browse, and interpolate these large simulation datasets. This tool allows computational scientists and researchers to quickly visualize "what-if" situations, perform sensitivity analyses, and optimize complex hydrodynamic experiments.
Submitted 19 June, 2024;
originally announced June 2024.
-
Using LLMs for Tabletop Exercises within the Security Domain
Authors:
Sam Hays,
Dr. Jules White
Abstract:
Tabletop exercises are a crucial component of many companies' strategies to test and evaluate their preparedness for security incidents in a realistic way. Traditionally led by external firms specializing in cybersecurity, these exercises can be costly, time-consuming, and may not always align precisely with the client's specific needs. Large Language Models (LLMs) like ChatGPT offer a compelling alternative. They enable faster iteration, provide rich and adaptable simulations, and offer infinite patience in handling feedback and recommendations. This approach can enhance the efficiency and relevance of security preparedness exercises.
Submitted 3 March, 2024;
originally announced March 2024.
-
Employing LLMs for Incident Response Planning and Review
Authors:
Sam Hays,
Dr. Jules White
Abstract:
Incident Response Planning (IRP) is essential for effective cybersecurity management, requiring detailed documentation (or playbooks) to guide security personnel during incidents. Yet, creating comprehensive IRPs is often hindered by challenges such as complex systems, high turnover rates, and legacy technologies lacking documentation. This paper argues that, despite these obstacles, the development, review, and refinement of IRPs can be significantly enhanced through the utilization of Large Language Models (LLMs) like ChatGPT. By leveraging LLMs for tasks such as drafting initial plans, suggesting best practices, and identifying documentation gaps, organizations can overcome resource constraints and improve their readiness for cybersecurity incidents. We discuss the potential of LLMs to streamline IRP processes, while also considering the limitations and the need for human oversight in ensuring the accuracy and relevance of generated content. Our findings contribute to the cybersecurity field by demonstrating a novel approach to enhancing IRP with AI technologies, offering practical insights for organizations seeking to bolster their incident response capabilities.
Submitted 2 March, 2024;
originally announced March 2024.
-
Reducing Usefulness of Stolen Credentials in SSO Contexts
Authors:
Sam Hays,
Michael Sandborn,
Dr. Jules White
Abstract:
Approximately 61% of cyber attacks involve adversaries in possession of valid credentials. Attackers acquire credentials through various means, including phishing, dark web data drops, password reuse, etc. Multi-factor authentication (MFA) helps to thwart attacks that use valid credentials, but attackers still commonly breach systems by tricking users into accepting MFA step-up requests through techniques such as "MFA Bombing", where multiple requests are sent to a user until they accept one. Currently, there are several solutions to this problem, each with varying levels of security and increasing invasiveness on user devices. This paper proposes a token-based enrollment architecture that is less invasive to user devices than mobile device management, but still offers strong protection against use of stolen credentials and MFA attacks.
Submitted 21 January, 2024;
originally announced January 2024.
-
PaperQA: Retrieval-Augmented Generative Agent for Scientific Research
Authors:
Jakub Lála,
Odhran O'Donoghue,
Aleksandar Shtedritski,
Sam Cox,
Samuel G. Rodriques,
Andrew D. White
Abstract:
Large Language Models (LLMs) generalize well across language tasks, but suffer from hallucinations and uninterpretability, making it difficult to assess their accuracy without ground-truth. Retrieval-Augmented Generation (RAG) models have been proposed to reduce hallucinations and provide provenance for how an answer was generated. Applying such models to the scientific literature may enable large-scale, systematic processing of scientific knowledge. We present PaperQA, a RAG agent for answering questions over the scientific literature. PaperQA is an agent that performs information retrieval across full-text scientific articles, assesses the relevance of sources and passages, and uses RAG to provide answers. Viewing this agent as a question answering model, we find it exceeds the performance of existing LLMs and LLM agents on current science QA benchmarks. To push the field closer to how humans perform research on scientific literature, we also introduce LitQA, a more complex benchmark that requires retrieval and synthesis of information from full-text scientific papers across the literature. Finally, we demonstrate that PaperQA matches expert human researchers on LitQA.
Submitted 14 December, 2023; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Integration and Implementation Strategies for AI Algorithm Deployment with Smart Routing Rules and Workflow Management
Authors:
Barbaros Selnur Erdal,
Vikash Gupta,
Mutlu Demirer,
Kim H. Fair,
Richard D. White,
Jeff Blair,
Barbara Deichert,
Laurie Lafleur,
Ming Melvin Qin,
David Bericat,
Brad Genereaux
Abstract:
This paper reviews the challenges hindering the widespread adoption of artificial intelligence (AI) solutions in the healthcare industry, focusing on computer vision applications for medical imaging, and how interoperability and enterprise-grade scalability can be used to address these challenges. The complex nature of healthcare workflows, intricacies in managing large and secure medical imaging data, and the absence of standardized frameworks for AI development pose significant barriers and require a new paradigm to address them.
The role of interoperability is examined in this paper as a crucial factor in connecting disparate applications within healthcare workflows. Standards such as DICOM, Health Level 7 (HL7), and Integrating the Healthcare Enterprise (IHE) are highlighted as foundational for common imaging workflows. A specific focus is placed on the role of DICOM gateways, with Smart Routing Rules and Workflow Management leading transformational efforts in this area.
To drive enterprise scalability, new tools are needed. Project MONAI, established in 2019, is introduced as an initiative aiming to redefine the development of medical AI applications. The MONAI Deploy App SDK, a component of Project MONAI, is identified as a key tool in simplifying the packaging and deployment process, enabling repeatable, scalable, and standardized deployment patterns for AI applications.
The abstract underscores the potential impact of successful AI adoption in healthcare, offering physicians both life-saving and time-saving insights and driving efficiencies in radiology department workflows. The collaborative efforts between academia and industry are emphasized as essential for advancing the adoption of healthcare AI solutions.
Submitted 21 November, 2023; v1 submitted 17 November, 2023;
originally announced November 2023.
-
Rating-based Reinforcement Learning
Authors:
Devin White,
Mingkang Wu,
Ellen Novoseller,
Vernon J. Lawhern,
Nicholas Waytowich,
Yongcan Cao
Abstract:
This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies based on synthetic ratings and real human ratings to evaluate the effectiveness and benefits of the new rating-based reinforcement learning approach.
Submitted 29 January, 2024; v1 submitted 30 July, 2023;
originally announced July 2023.
-
CliniDigest: A Case Study in Large Language Model Based Large-Scale Summarization of Clinical Trial Descriptions
Authors:
Renee D. White,
Tristan Peng,
Pann Sripitak,
Alexander Rosenberg Johansen,
Michael Snyder
Abstract:
A clinical trial is a study that evaluates new biomedical interventions. To design new trials, researchers draw inspiration from those current and completed. In 2022, there were on average more than 100 clinical trials submitted to ClinicalTrials.gov every day, with each trial having a mean of approximately 1500 words [1]. This makes it nearly impossible to keep up to date. To mitigate this issue, we have created a batch clinical trial summarizer called CliniDigest using GPT-3.5. CliniDigest is, to our knowledge, the first tool able to provide real-time, truthful, and comprehensive summaries of clinical trials. CliniDigest can reduce up to 85 clinical trial descriptions (approximately 10,500 words) into a concise 200-word summary with references and limited hallucinations. We have tested CliniDigest on its ability to summarize 457 trials divided across 27 medical subdomains. For each field, CliniDigest generates summaries of $\mu=153,\ \sigma=69$ words, each of which utilizes $\mu=54\%,\ \sigma=30\%$ of the sources. A more comprehensive evaluation is planned and outlined in this paper.
Submitted 31 July, 2023; v1 submitted 26 July, 2023;
originally announced July 2023.
-
Predicting small molecules solubilities on endpoint devices using deep ensemble neural networks
Authors:
Mayk Caldas Ramos,
Andrew D. White
Abstract:
Aqueous solubility is a valuable yet challenging property to predict. Computing solubility using first-principles methods requires accounting for the competing effects of entropy and enthalpy, resulting in long computations for relatively poor accuracy. Data-driven approaches, such as deep learning, offer improved accuracy and computational efficiency but typically lack uncertainty quantification. Additionally, ease of use remains a concern for any computational technique, resulting in the sustained popularity of group-based contribution methods. In this work, we addressed these problems with a deep learning model with predictive uncertainty that runs on a static website (without a server). This approach moves computing needs onto the website visitor without requiring installation, removing the need to pay for and maintain servers. Our model achieves satisfactory results in solubility prediction. Furthermore, we demonstrate how to create molecular property prediction models that balance uncertainty and ease of use. The code is available at https://github.com/ur-whitelab/mol.dev, and the model is usable at https://mol.dev.
Submitted 7 March, 2024; v1 submitted 11 July, 2023;
originally announced July 2023.
-
14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon
Authors:
Kevin Maik Jablonka,
Qianxiang Ai,
Alexander Al-Feghali,
Shruti Badhwar,
Joshua D. Bocarsly,
Andres M Bran,
Stefan Bringuier,
L. Catherine Brinson,
Kamal Choudhary,
Defne Circi,
Sam Cox,
Wibe A. de Jong,
Matthew L. Evans,
Nicolas Gastellu,
Jerome Genzling,
María Victoria Gil,
Ankur K. Gupta,
Zhi Hong,
Alishba Imran,
Sabine Kruschwitz,
Anne Labarre,
Jakub Lála,
Tao Liu,
Steven Ma,
Sauradeep Majumdar
, et al. (28 additional authors not shown)
Abstract:
Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon.
This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and developing new educational applications.
The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.
Submitted 14 July, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
Active Learning in Symbolic Regression with Physical Constraints
Authors:
Jorge Medina,
Andrew D. White
Abstract:
Evolutionary symbolic regression (SR) fits a symbolic equation to data, which gives a concise interpretable model. We explore using SR as a method to propose which data to gather in an active learning setting with physical constraints. SR with active learning proposes which experiments to do next. Active learning is done with query by committee, where the Pareto frontier of equations is the committee. The physical constraints improve proposed equations in very low data settings. These approaches reduce the data required for SR and achieve state-of-the-art results in data required to rediscover known equations.
Submitted 18 May, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
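The query-by-committee step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the three hand-written equations are hypothetical stand-ins for a Pareto frontier of symbolic regression candidates, and variance of the committee's predictions is assumed as the disagreement measure.

```python
from statistics import pvariance


def query_by_committee(committee, candidates):
    """Return the candidate input where the committee's predictions disagree most.

    Disagreement is measured here as the population variance of the
    predictions (an assumed choice; other measures are possible).
    """
    def disagreement(x):
        return pvariance([model(x) for model in committee])

    return max(candidates, key=disagreement)


# Hypothetical committee: three candidate equations that fit the same
# small-x data about equally well but diverge elsewhere.
committee = [
    lambda x: 2.0 * x,
    lambda x: x ** 2,
    lambda x: 2.0 * x + 0.1 * x ** 3,
]

# Hypothetical pool of experiments that could be run next.
candidates = [0.0, 1.0, 2.0, 3.0]

next_x = query_by_committee(committee, candidates)
print(next_x)
```

Measuring the selected point would then be expected to discriminate most strongly between the competing equations, which is the intuition behind using the committee to propose the next experiment.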
-
Censoring chemical data to mitigate dual use risk
Authors:
Quintina L. Campbell,
Jonathan Herington,
Andrew D. White
Abstract:
The dual use of machine learning applications, where models can be used for both beneficial and malicious purposes, presents a significant challenge. This has recently become a particular concern in chemistry, where chemical datasets containing sensitive labels (e.g. toxicological information) could be used to develop predictive models that identify novel toxins or chemical warfare agents. To mitigate dual use risks, we propose a model-agnostic method of selectively noising datasets while preserving the utility of the data for training deep neural networks in a beneficial region. We evaluate the effectiveness of the proposed method across least squares, a multilayer perceptron, and a graph neural network. Our findings show selectively noised datasets can induce model variance and bias in predictions for sensitive labels with control, suggesting the safe sharing of datasets containing sensitive information is feasible. We also find omitting sensitive data often increases model variance sufficiently to mitigate dual use. This work is proposed as a foundation for future research on enabling more secure and collaborative data sharing practices and safer machine learning applications in chemistry.
Submitted 20 April, 2023;
originally announced April 2023.
-
Bayesian Optimization of Catalysts With In-context Learning
Authors:
Mayk Caldas Ramos,
Shane S. Michtavy,
Marc D. Porosoff,
Andrew D. White
Abstract:
Large language models (LLMs) are able to do accurate classification with zero or only a few examples (in-context learning). We show a prompting system that enables regression with uncertainty for in-context learning with frozen LLM (GPT-3, GPT-3.5, and GPT-4) models, allowing predictions without features or architecture tuning. By incorporating uncertainty, our approach enables Bayesian optimization for catalyst or molecule optimization using natural language, eliminating the need for training or simulation. Here, we performed the optimization using the synthesis procedure of catalysts to predict properties. Working with natural language mitigates synthesizability difficulties since the literal synthesis procedure is the model's input. We showed that in-context learning could improve past a model context window (maximum number of tokens the model can process at once) as data is gathered via example selection, allowing the model to scale better. Although our method does not outperform all baselines, it requires zero training, feature selection, and minimal computing while maintaining satisfactory performance. We also find Gaussian Process Regression on text embeddings is strong at Bayesian optimization. The code is available in our GitHub repository: https://github.com/ur-whitelab/BO-LIFT
Submitted 11 April, 2023;
originally announced April 2023.
-
Recent advances in the Self-Referencing Embedding Strings (SELFIES) library
Authors:
Alston Lo,
Robert Pollice,
AkshatKumar Nigam,
Andrew D. White,
Mario Krenn,
Alán Aspuru-Guzik
Abstract:
String-based molecular representations play a crucial role in cheminformatics applications, and with the growing success of deep learning in chemistry, have been readily adopted into machine learning pipelines. However, traditional string-based representations such as SMILES are often prone to syntactic and semantic errors when produced by generative models. To address these problems, a novel representation, SELF-referencIng Embedded Strings (SELFIES), was proposed that is inherently 100% robust, alongside an accompanying open-source implementation. Since then, we have generalized SELFIES to support a wider range of molecules and semantic constraints and streamlined its underlying grammar. We have implemented this updated representation in subsequent versions of selfies, where we have also made major advances with respect to design, efficiency, and supported features. Hence, we present the current status of selfies (version 2.1.1) in this manuscript.
Submitted 7 February, 2023;
originally announced February 2023.
-
Current State of Community-Driven Radiological AI Deployment in Medical Imaging
Authors:
Vikash Gupta,
Barbaros Selnur Erdal,
Carolina Ramirez,
Ralf Floca,
Laurence Jackson,
Brad Genereaux,
Sidney Bryson,
Christopher P Bridge,
Jens Kleesiek,
Felix Nensa,
Rickmer Braren,
Khaled Younis,
Tobias Penzkofer,
Andreas Michael Bucher,
Ming Melvin Qin,
Gigon Bae,
Hyeonhoon Lee,
M. Jorge Cardoso,
Sebastien Ourselin,
Eric Kerfoot,
Rahul Choudhury,
Richard D. White,
Tessa Cook,
David Bericat,
Matthew Lungren
, et al. (2 additional authors not shown)
Abstract:
Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
Submitted 8 May, 2023; v1 submitted 29 December, 2022;
originally announced December 2022.
-
Using Conservation Laws to Infer Deep Learning Model Accuracy of Richtmyer-Meshkov Instabilities
Authors:
Charles F. Jekel,
Dane M. Sterbentz,
Sylvie Aubry,
Youngsoo Choi,
Daniel A. White,
Jonathan L. Belof
Abstract:
Richtmyer-Meshkov Instability (RMI) is a complicated phenomenon that occurs when a shockwave passes through a perturbed interface. Over a thousand hydrodynamic simulations were performed to study the formation of RMI for a parameterized high velocity impact. Deep learning was used to learn the temporal mapping of initial geometric perturbations to the full-field hydrodynamic solutions of density and velocity. The continuity equation was used to include physical information into the loss function; however, this resulted in only very minor improvements at the cost of additional training complexity. Predictions from the deep learning model appear to accurately capture temporal RMI formations for a variety of geometric conditions within the domain. First principle physical laws were investigated to infer the accuracy of the model's predictive capability. While the continuity equation appeared to show no correlation with the accuracy of the model, conservation of mass and momentum were weakly correlated with accuracy. Since conservation laws can be quickly calculated from the deep learning model, they may be useful in applications where a relative accuracy measure is needed.
Submitted 18 July, 2022;
originally announced August 2022.
-
Exploring the Intersection between Neural Architecture Search and Continual Learning
Authors:
Mohamed Shahawy,
Elhadj Benkhelifa,
David White
Abstract:
Despite the significant advances achieved in Artificial Neural Networks (ANNs), their design process remains notoriously tedious, depending primarily on intuition, experience and trial-and-error. This human-dependent process is often time-consuming and prone to errors. Furthermore, the models are generally bound to their training contexts, with no consideration of their surrounding environments. Continual adaptiveness and automation of neural networks are of paramount importance to several domains where model accessibility is limited after deployment (e.g., IoT devices, self-driving vehicles, etc.). Additionally, even accessible models require frequent maintenance post-deployment to overcome issues such as Concept/Data Drift, which can be cumbersome and restrictive. By leveraging and combining approaches from Neural Architecture Search (NAS) and Continual Learning (CL), more robust and adaptive agents can be developed. This study conducts the first extensive review on the intersection between NAS and CL, formalizing the prospective Continually-Adaptive Neural Networks (CANNs) paradigm and outlining research directions for lifelong autonomous ANNs.
Submitted 15 June, 2023; v1 submitted 11 June, 2022;
originally announced June 2022.
-
Neural Network Layers for Prediction of Positive Definite Elastic Stiffness Tensors
Authors:
Charles F. Jekel,
Kenneth E. Swartz,
Daniel A. White,
Daniel A. Tortorelli,
Seth E. Watts
Abstract:
Machine learning models can be used to predict physical quantities like homogenized elasticity stiffness tensors, which must always be symmetric positive definite (SPD) based on conservation arguments. Two datasets of homogenized elasticity tensors of lattice materials are presented as examples, where it is desired to obtain models that map unit cell geometric and material parameters to their homogenized stiffness. Fitting a model to SPD data does not guarantee the model's predictions will remain SPD. Existing Cholesky factorization and eigendecomposition schemes are abstracted in this work as transformation layers which enforce the SPD condition. These layers can be included in many popular machine learning models to enforce SPD behavior. This work investigates the effects that different positivity functions have on the layers and how their inclusion affects model accuracy. Commonly used models are considered, including polynomials, radial basis functions, and neural networks. Ultimately it is shown that a single SPD layer improves the model's average prediction accuracy.
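As a rough illustration of the Cholesky-style transformation layer mentioned in this abstract, the sketch below maps an unconstrained vector of n(n+1)/2 values to an SPD matrix by filling a lower-triangular factor, passing its diagonal through a positivity function, and forming L L^T. This is a minimal sketch under stated assumptions and not the authors' implementation: the function names, the row-major fill order, the choice of softplus as the positivity function, and the small `eps` offset are all hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); strictly positive for finite x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cholesky_spd_layer(raw, n, positivity=softplus, eps=1e-8):
    """Map n(n+1)/2 unconstrained values to an n x n SPD matrix (hypothetical helper).

    The values fill a lower-triangular factor L row by row. Diagonal entries
    are passed through `positivity` and shifted by `eps`, so L is invertible
    and C = L L^T is symmetric positive definite.
    """
    if len(raw) != n * (n + 1) // 2:
        raise ValueError("expected n(n+1)/2 raw values")
    L = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            value = raw[k]
            L[i][j] = positivity(value) + eps if i == j else value
            k += 1
    # C[i][j] = sum_m L[i][m] * L[j][m], i.e. C = L L^T
    return [[sum(L[i][m] * L[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Arbitrary unconstrained outputs, e.g. from the last layer of a regressor.
    C = cholesky_spd_layer([-3.0, 0.5, -1.0, 2.0, -0.7, 0.0], n=3)
    for row in C:
        print(row)
```

Swapping `positivity` for another strictly positive function (for example `math.exp`) changes how the diagonal is constrained without affecting the SPD guarantee, which is the kind of design choice the abstract says is investigated.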
Submitted 25 March, 2022;
originally announced March 2022.
-
Digital Fingerprinting of Microstructures
Authors:
Michael D. White,
Alexander Tarakanov,
Christopher P. Race,
Philip J. Withers,
Kody J. H. Law
Abstract:
Finding efficient means of fingerprinting microstructural information is a critical step towards harnessing data-centric machine learning approaches. A statistical framework is systematically developed for compressed characterisation of a population of images, which includes some classical computer vision methods as special cases. The focus is on materials microstructure. The ultimate purpose is to rapidly fingerprint sample images in the context of various high-throughput design/make/test scenarios. This includes, but is not limited to, quantification of the disparity between microstructures for quality control, classifying microstructures, predicting materials properties from image data and identifying potential processing routes to engineer new materials with specific properties. Here, we consider microstructure classification and utilise the resulting features over a range of related machine learning tasks, namely supervised, semi-supervised, and unsupervised learning.
The approach is applied to two distinct datasets to illustrate various aspects and some recommendations are made based on the findings. In particular, methods that leverage transfer learning with convolutional neural networks (CNNs), pretrained on the ImageNet dataset, are generally shown to outperform other methods. Additionally, dimensionality reduction of these CNN-based fingerprints is shown to have negligible impact on classification accuracy for the supervised learning approaches considered. In situations where there is a large dataset with only a handful of images labelled, graph-based label propagation to unlabelled data is shown to be favourable over discarding unlabelled data and performing supervised learning. In particular, label propagation by Poisson learning is shown to be highly effective at low label rates.
Submitted 22 January, 2024; v1 submitted 25 March, 2022;
originally announced March 2022.
-
A multi-reconstruction study of breast density estimation using Deep Learning
Authors:
Vikash Gupta,
Mutlu Demirer,
Robert W. Maxwell,
Richard D. White,
Barbaros Selnur Erdal
Abstract:
Breast density estimation is one of the key tasks in recognizing individuals predisposed to breast cancer. It is often challenging because of low contrast and fluctuations in mammograms' fatty tissue background. Most of the time, the breast density is estimated manually where a radiologist assigns one of the four density categories decided by the Breast Imaging and Reporting Data Systems (BI-RADS). There have been efforts in the direction of automating a breast density classification pipeline.
Breast density estimation is one of the key tasks performed during a screening exam. Dense breasts are more susceptible to breast cancer. The density estimation is challenging because of low contrast and fluctuations in mammograms' fatty tissue background. Traditional mammograms are being replaced by tomosynthesis and its other low radiation dose variants (for example Hologic's Intelligent 2D and C-View). Because of the low-dose requirement, increasingly more screening centers are favoring the Intelligent 2D view and C-View. Deep-learning studies for breast density estimation use only a single modality for training a neural network. However, doing so restricts the number of images in the dataset. In this paper, we show that a neural network trained on all the modalities at once performs better than a neural network trained on any single modality. We discuss these results using the area under the receiver operating characteristic curves.
Submitted 10 October, 2022; v1 submitted 16 February, 2022;
originally announced February 2022.
-
Cascading Neural Network Methodology for Artificial Intelligence-Assisted Radiographic Detection and Classification of Lead-Less Implanted Electronic Devices within the Chest
Authors:
Mutlu Demirer,
Richard D. White,
Vikash Gupta,
Ronnie A. Sebro,
Barbaros S. Erdal
Abstract:
Background & Purpose: Chest X-Ray (CXR) use in pre-MRI safety screening for Lead-Less Implanted Electronic Devices (LLIEDs), easily overlooked or misidentified on a frontal view (often only acquired), is common. Although most LLIED types are "MRI conditional": 1. Some are stringently conditional; 2. Different conditional types have specific patient- or device- management requirements; and 3. Particular types are "MRI unsafe". This work focused on developing CXR interpretation-assisting Artificial Intelligence (AI) methodology with: 1. 100% detection for LLIED presence/location; and 2. High classification in LLIED typing. Materials & Methods: Data-mining (03/1993-02/2021) produced an AI Model Development Population (1,100 patients/4,871 images) creating 4,924 LLIED Region-Of-Interests (ROIs) (with image-quality grading) used in Training, Validation, and Testing. For developing the cascading neural network (detection via Faster R-CNN and classification via Inception V3), "ground-truth" CXR annotation (ROI labeling per LLIED), as well as inference display (as Generated Bounding Boxes (GBBs)), relied on a GPU-based graphical user interface. Results: To achieve 100% LLIED detection, probability threshold reduction to 0.00002 was required by Model 1, resulting in increasing GBBs per LLIED-related ROI. Targeting LLIED-type classification following detection of all LLIEDs, Model 2 multi-classified to reach high-performance while decreasing falsely positive GBBs. Despite 24% suboptimal ROI image quality, classification was correct in 98.9% and AUCs for the 9 LLIED-types were 1.00 for 8 and 0.92 for 1. For all misclassification cases: 1. None involved stringently conditional or unsafe LLIEDs; and 2. Most were attributable to suboptimal images. Conclusion: This project successfully developed a LLIED-related AI methodology supporting: 1. 100% detection; and 2. Typically 100% type classification.
Submitted 26 April, 2022; v1 submitted 25 August, 2021;
originally announced August 2021.
-
Towards establishing formal verification and inductive code synthesis in the PLC domain
Authors:
Matthias Weiß,
Philipp Marks,
Benjamin Maschler,
Dustin White,
Pascal Kesseli,
Michael Weyrich
Abstract:
Nowadays, formal methods are used in various areas for the verification of programs or for code generation from models in order to increase the quality of software and to reduce costs. However, there are still fields in which formal methods have not been widely adopted, despite the large set of possible benefits offered. This is the case for the area of programmable logic controllers (PLC). This article aims to evaluate the potential of formal methods in the context of PLC development. For this purpose, the general concepts of formal methods are first introduced and then transferred to the PLC area, resulting in an engineering-oriented description of the technology that is based on common concepts from PLC development. Based on this description, PLC professionals with varying degrees of experience were interviewed for their perspective on the topic and to identify possible use cases within the PLC domain. The survey results indicate the technology's high potential in the PLC area, either as a tool to directly support the developer or as a key element within a model-based systems engineering toolchain. The evaluation of the survey results is performed with the aid of a demo application that communicates with the Totally Integrated Automation Portal from Siemens and generates programs via Fastsynth, a model-based open source code generator. Benchmarks based on an industry-related PLC project show satisfactory synthesis times and a successful integration into the workflow of a PLC developer.
Submitted 30 June, 2021;
originally announced June 2021.
-
IRIS: A Low Duty Cycle Cross-Layer Protocol for Long-Range Wireless Sensor Networks with Low Power Budget
Authors:
Yi Chu,
Paul Mitchell,
David Grace,
Jonathan Roberts,
Dominic White,
Tautvydas Mickus
Abstract:
This paper presents a cross-layer protocol (IRIS) designed for long-range pipeline Wireless Sensor Networks with extremely low power budget, typically seen in a range of monitoring applications. IRIS uses ping packets initiated by a base station to travel through the multi-hop network and carry monitoring information. The protocol is able to operate with less than 1% duty cycle, thereby conforming to ISM band spectrum regulations in the 868MHz band. The duty cycle can be flexibly configured to meet other regulations/power budgets as well as to improve the route forming performance. Simulation results show guaranteed route formation in different network topologies with various protocol configurations. System robustness against unreliable wireless connections and node failures is also demonstrated by simulations.
Submitted 1 December, 2020;
originally announced December 2020.
-
Metaheuristics "In the Large"
Authors:
Jerry Swan,
Steven Adriaensen,
Alexander E. I. Brownlee,
Kevin Hammond,
Colin G. Johnson,
Ahmed Kheiri,
Faustyna Krawiec,
J. J. Merelo,
Leandro L. Minku,
Ender Özcan,
Gisele L. Pappa,
Pablo García-Sánchez,
Kenneth Sörensen,
Stefan Voß,
Markus Wagner,
David R. White
Abstract:
Following decades of sustained improvement, metaheuristics are one of the great success stories of optimization research. However, in order for research in metaheuristics to avoid fragmentation and a lack of reproducibility, there is a pressing need for stronger scientific and computational infrastructure to support the development, analysis and comparison of new approaches. We argue that, via principled choice of infrastructure support, the field can pursue a higher level of scientific enquiry. We describe our vision and report on progress, showing how the adoption of common protocols for all metaheuristics can help liberate the potential of the field, easing the exploration of the design space of metaheuristics.
Submitted 3 June, 2021; v1 submitted 19 November, 2020;
originally announced November 2020.
-
Deep Learning-Based Automatic Detection of Poorly Positioned Mammograms to Minimize Patient Return Visits for Repeat Imaging: A Real-World Application
Authors:
Vikash Gupta,
Clayton Taylor,
Sarah Bonnet,
Luciano M. Prevedello,
Jeffrey Hawley,
Richard D White,
Mona G Flores,
Barbaros Selnur Erdal
Abstract:
Screening mammograms are a routine imaging exam performed to detect breast cancer in its early stages to reduce morbidity and mortality attributed to this disease. In order to maximize the efficacy of breast cancer screening programs, proper mammographic positioning is paramount. Proper positioning ensures adequate visualization of breast tissue and is necessary for effective breast cancer detection. Therefore, breast-imaging radiologists must assess each mammogram for the adequacy of positioning before providing a final interpretation of the examination; this often necessitates return patient visits for additional imaging. In this paper, we propose a deep learning-algorithm method that mimics and automates this decision-making process to identify poorly positioned mammograms. Our objective for this algorithm is to assist mammography technologists in recognizing inadequately positioned mammograms in real time, improve the quality of mammographic positioning and performance, and ultimately reduce repeat visits for patients with initially inadequate imaging. The proposed model showed a true positive rate for detecting correct positioning of 91.35% in the mediolateral oblique view and 95.11% in the craniocaudal view. In addition to these results, we also present an automatically generated report which can aid the mammography technologist in taking corrective measures during the patient visit.
Submitted 28 September, 2020;
originally announced September 2020.
-
Democratizing Artificial Intelligence in Healthcare: A Study of Model Development Across Two Institutions Incorporating Transfer Learning
Authors:
Vikash Gupta,
Holger Roth,
Varun Buch,
Marcio A. B. C. Rockenbach,
Richard D White,
Dong Yang,
Olga Laur,
Brian Ghoshhajra,
Ittai Dayan,
Daguang Xu,
Mona G. Flores,
Barbaros Selnur Erdal
Abstract:
The training of deep learning models typically requires extensive data, which are not readily available as large well-curated medical-image datasets for development of artificial intelligence (AI) models applied in Radiology. Recognizing the potential for transfer learning (TL) to allow a fully trained model from one institution to be fine-tuned by another institution using a much smaller local dataset, this report describes the challenges, methodology, and benefits of TL within the context of developing an AI model for a basic use-case, segmentation of Left Ventricular Myocardium (LVM) on images from 4-dimensional coronary computed tomography angiography. Ultimately, our results from comparisons of LVM segmentation predicted by a model locally trained using random initialization, versus one training-enhanced by TL, showed that a use-case model initiated by TL can be developed with sparse labels with acceptable performance. This process reduces the time required to build a new model in the clinical environment at a different institution.
Submitted 25 September, 2020;
originally announced September 2020.
-
Artificial Intelligence to Assist in Exclusion of Coronary Atherosclerosis during CCTA Evaluation of Chest-Pain in the Emergency Department: Preparing an Application for Real-World Use
Authors:
Richard D. White,
Barbaros S. Erdal,
Mutlu Demirer,
Vikash Gupta,
Matthew T. Bigelow,
Engin Dikici,
Sema Candemir,
Mauricio S. Galizia,
Jessica L. Carpenter,
Thomas P. O'Donnell,
Abdul H. Halabi,
Luciano M. Prevedello
Abstract:
Coronary Computed Tomography Angiography (CCTA) evaluation of chest-pain patients in an Emergency Department (ED) is considered appropriate. While a negative CCTA interpretation supports direct patient discharge from an ED, labor-intensive analyses are required, with accuracy in jeopardy from distractions. We describe the development of an Artificial Intelligence (AI) algorithm and workflow for assisting interpreting physicians in CCTA screening for the absence of coronary atherosclerosis. The two-phase approach consisted of (1) Phase 1 - focused on the development and preliminary testing of an algorithm for vessel-centerline extraction classification in a balanced study population (n = 500 with 50% disease prevalence) derived by retrospective random case selection; and (2) Phase 2 - concerned with simulated-clinical Trialing of the developed algorithm on a per-case basis in a more real-world study population (n = 100 with 28% disease prevalence) from an ED chest-pain series. This allowed pre-deployment evaluation of the AI-based CCTA screening application which provides a vessel-by-vessel graphic display of algorithm inference results integrated into a clinically capable viewer. Algorithm performance evaluation used Area Under the Receiver-Operating-Characteristic Curve (AUC-ROC); confusion matrices reflected ground-truth vs AI determinations. The vessel-based algorithm demonstrated strong performance with AUC-ROC = 0.96. In both Phase 1 and Phase 2, independent of disease prevalence differences, negative predictive values at the case level were very high at 95%. The rate of completion of the algorithm workflow process (96% with inference results in 55-80 seconds) in Phase 2 depended on adequate image quality. There is potential for this AI application to assist in CCTA interpretation to help extricate atherosclerosis from chest-pain presentations.
Submitted 10 August, 2020;
originally announced August 2020.
-
Graph Neural Network Based Coarse-Grained Mapping Prediction
Authors:
Zhiheng Li,
Geemi P. Wellawatte,
Maghesree Chakraborty,
Heta A. Gandhi,
Chenliang Xu,
Andrew D. White
Abstract:
The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation. It is still an open question about what is optimal for this choice and there is a need for theory. The current state-of-the-art method is mapping operators manually selected by experts. In this work, we demonstrate an automated approach by viewing this problem as supervised learning where we seek to reproduce the mapping operators produced by experts. We present a graph neural network based CG mapping predictor called DEEP SUPERVISED GRAPH PARTITIONING MODEL (DSGPM) that treats mapping operators as a graph segmentation problem. DSGPM is trained on a novel dataset, Human-annotated Mappings (HAM), consisting of 1,206 molecules with expert annotated mapping operators. HAM can be used to facilitate further research in this area. Our model uses a novel metric learning objective to produce high-quality atomic features that are used in spectral clustering. The results show that the DSGPM outperforms state-of-the-art methods in the field of graph segmentation. Finally, we find that predicted CG mapping operators indeed result in good CG MD models when used in simulation.
Submitted 19 August, 2021; v1 submitted 24 June, 2020;
originally announced July 2020.
-
Predicting Rate of Cognitive Decline at Baseline Using a Deep Neural Network with Multidata Analysis
Authors:
Sema Candemir,
Xuan V. Nguyen,
Luciano M. Prevedello,
Matthew T. Bigelow,
Richard D. White,
Barbaros S. Erdal
Abstract:
Purpose: This study investigates whether a machine-learning-based system can predict the rate of cognitive decline in mildly cognitively impaired patients by processing only the clinical and imaging data collected at the initial visit.
Approach: We built a predictive model based on a supervised hybrid neural network utilizing a 3-Dimensional Convolutional Neural Network to perform volume analysis of Magnetic Resonance Imaging and integration of non-imaging clinical data at the fully connected layer of the architecture. The experiments are conducted on the Alzheimer's Disease Neuroimaging Initiative dataset.
Results: Experimental results confirm that there is a correlation between cognitive decline and the data obtained at the first visit. The system achieved an area under the receiver operator curve (AUC) of 0.70 for cognitive decline class prediction.
Conclusion: To our knowledge, this is the first study that predicts slowly deteriorating/stable or rapidly deteriorating classes by processing routinely collected baseline clinical and demographic data (Baseline MRI, Baseline MMSE, Scalar Volumetric data, Age, Gender, Education, Ethnicity, and Race). The training data is built based on MMSE-rate values. Unlike the studies in the literature that focus on predicting Mild Cognitive Impairment-to-Alzheimer's disease conversion and disease classification, we approach the problem as an early prediction of cognitive decline rate in MCI patients.
Submitted 5 October, 2020; v1 submitted 23 February, 2020;
originally announced February 2020.
-
Automated Coronary Artery Atherosclerosis Detection and Weakly Supervised Localization on Coronary CT Angiography with a Deep 3-Dimensional Convolutional Neural Network
Authors:
Sema Candemir,
Richard D. White,
Mutlu Demirer,
Vikash Gupta,
Matthew T. Bigelow,
Luciano M. Prevedello,
Barbaros S. Erdal
Abstract:
We propose a fully automated algorithm based on a deep learning framework enabling screening of a coronary computed tomography angiography (CCTA) examination for confident detection of the presence or absence of coronary artery atherosclerosis. The system starts with extracting the coronary arteries and their branches from CCTA datasets and representing them with multi-planar reformatted volumes; pre-processing and augmentation techniques are then applied to increase the robustness and generalization ability of the system. A 3-dimensional convolutional neural network (3D-CNN) is utilized to model pathological changes (e.g., atherosclerotic plaques) in coronary vessels. The system learns the discriminatory features between vessels with and without atherosclerosis. The discriminative features at the final convolutional layer are visualized with a saliency map approach to provide visual clues related to atherosclerosis likelihood and location. We have evaluated the system on a reference dataset representing 247 patients with atherosclerosis and 246 patients free of atherosclerosis. With five-fold cross-validation, an Accuracy = 90.9%, Positive Predictive Value = 58.8%, Sensitivity = 68.9%, Specificity of 93.6%, and Negative Predictive Value (NPV) = 96.1% are achieved at the artery/branch level with threshold 0.5. The average area under the receiver operating characteristic curve is 0.91. The system indicates a high NPV, which may be potentially useful for assisting interpreting physicians in excluding coronary atherosclerosis in patients with acute chest pain.
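For readers less familiar with the screening metrics quoted in this abstract, the sketch below shows how accuracy, sensitivity, specificity, PPV, and NPV follow from the four cells of a binary confusion matrix. It is a minimal illustration only: the function name is hypothetical, and the counts in the usage example are invented for the arithmetic and are not taken from the paper.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return standard binary screening metrics from confusion-matrix counts.

    tp/fn: diseased vessels called positive/negative.
    tn/fp: disease-free vessels called negative/positive.
    Assumes every denominator is non-zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    for name, value in screening_metrics(tp=40, fp=10, tn=140, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on how many positives and negatives are in the evaluated set, a high NPV can coexist with a modest PPV when negatives dominate, as in the artery/branch-level figures reported above.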
Submitted 7 June, 2020; v1 submitted 26 November, 2019;
originally announced November 2019.
-
Investigating Active Learning and Meta-Learning for Iterative Peptide Design
Authors:
Rainier Barrett,
Andrew D. White
Abstract:
Often the development of novel functional peptides is not amenable to high throughput or purely computational screening methods. Peptides must be synthesized one at a time in a process that does not generate large amounts of data. One way this method can be improved is by ensuring that each experiment provides the best improvement in both peptide properties and predictive modeling accuracy. Here, we study the effectiveness of active learning, optimizing experiment order, and meta-learning, transferring knowledge between contexts, to reduce the number of experiments necessary to build a predictive model. We present a multi-task benchmark database of peptides designed to advance these methods for experimental design. Each task is binary classification of peptides represented as a sequence string. We find neither active learning method tested to be better than random choice. The meta-learning method Reptile was found to improve average accuracy across datasets. Combining meta-learning with active learning offers inconsistent benefits.
Submitted 10 December, 2020; v1 submitted 20 November, 2019;
originally announced November 2019.
-
Are Quantitative Features of Lung Nodules Reproducible at Different CT Acquisition and Reconstruction Parameters?
Authors:
Barbaros S. Erdal,
Mutlu Demirer,
Chiemezie C. Amadi,
Gehan F. M. Ibrahim,
Thomas P. O'Donnell,
Rainer Grimmer,
Andreas Wimmer,
Kevin J. Little,
Vikash Gupta,
Matthew T. Bigelow,
Luciano M. Prevedello,
Richard D. White
Abstract:
Consistency and duplicability in Computed Tomography (CT) output is essential to quantitative imaging for lung cancer detection and monitoring. This study of CT-detected lung nodules investigated the reproducibility of volume-, density-, and texture-based features (outcome variables) over routine ranges of radiation-dose, reconstruction kernel, and slice thickness. CT raw data of 23 nodules were reconstructed using 320 acquisition/reconstruction conditions (combinations of 4 doses, 10 kernels, and 8 thicknesses). Scans at 12.5%, 25%, and 50% of protocol dose were simulated; reduced-dose and full-dose data were reconstructed using conventional filtered back-projection and iterative-reconstruction kernels at a range of thicknesses (0.6-5.0 mm). Full-dose/B50f kernel reconstructions underwent expert segmentation for reference Region-Of-Interest (ROI) and nodule volume per thickness; each ROI was applied to 40 corresponding images (combinations of 4 doses and 10 kernels). Typical texture analysis metrics (including 5 histogram features, 13 Gray Level Co-occurrence Matrix, 5 Run Length Matrix, 2 Neighboring Gray-Level Dependence Matrix, and 2 Neighborhood Gray-Tone Difference Matrix) were computed per ROI. Reconstruction conditions resulting in no significant change in volume, density, or texture metrics were identified as "compatible pairs" for a given outcome variable. Our results indicate that as thickness increases, volumetric reproducibility decreases, while reproducibility of histogram- and texture-based features across different acquisition and reconstruction parameters improves. In order to achieve concomitant reproducibility of volumetric and radiomic results across studies, balanced standardization of the imaging acquisition parameters is required.
Submitted 14 August, 2019;
originally announced August 2019.
-
Optimising Trotter-Suzuki Decompositions for Quantum Simulation Using Evolutionary Strategies
Authors:
Benjamin D. M. Jones,
George O. O'Brien,
David R. White,
Earl T. Campbell,
John A. Clark
Abstract:
One of the most promising applications of near-term quantum computing is the simulation of quantum systems, a classically intractable task. Quantum simulation requires computationally expensive matrix exponentiation; Trotter-Suzuki decomposition of this exponentiation enables efficient simulation to a desired accuracy on a quantum computer. We apply the Covariance Matrix Adaptation Evolutionary Strategy (CMA-ES) algorithm to optimise the Trotter-Suzuki decompositions of a canonical quantum system, the Heisenberg Chain; we reduce simulation error by around 60%. We introduce this problem to the computational search community, show that an evolutionary optimisation approach is robust across runs and problem instances, and find that optimisation results generalise to the simulation of larger systems.
Submitted 23 April, 2019; v1 submitted 2 April, 2019;
originally announced April 2019.
-
DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning
Authors:
Alex Olsen,
Dmitry A. Konovalov,
Bronson Philippa,
Peter Ridd,
Jake C. Wood,
Jamie Johns,
Wesley Banks,
Benjamin Girgenti,
Owen Kenny,
James Whinney,
Brendan Calvert,
Mostafa Rahimi Azghadi,
Ronald D. White
Abstract:
Robotic weed control has seen increased research of late with its potential for boosting productivity in agriculture. The majority of works focus on developing robotics for croplands, ignoring the weed management problems facing rangeland stock farmers. Perhaps the greatest obstacle to widespread uptake of robotic weed control is the robust classification of weed species in their natural environment. The unparalleled successes of deep learning make it an ideal candidate for recognising various weed species in the complex rangeland environment. This work contributes the first large, public, multiclass image dataset of weed species from the Australian rangelands, allowing for the development of robust classification methods to make robotic weed control viable. The DeepWeeds dataset consists of 17,509 labelled images of eight nationally significant weed species native to eight locations across northern Australia. This paper presents a baseline for classification performance on the dataset using the benchmark deep learning models, Inception-v3 and ResNet-50. These models achieved an average classification accuracy of 95.1% and 95.7%, respectively. We also demonstrate real time performance of the ResNet-50 architecture, with an average inference time of 53.4 ms per image. These strong results bode well for future field implementation of robotic weed control methods in the Australian rangelands.
Submitted 14 February, 2019; v1 submitted 9 October, 2018;
originally announced October 2018.
-
An Overview of Schema Theory
Authors:
David White
Abstract:
The purpose of this paper is to give an introduction to the field of Schema Theory written by a mathematician and for mathematicians. In particular, we endeavor to highlight areas of the field which might be of interest to a mathematician, to point out some related open problems, and to suggest some large-scale projects. Schema theory seeks to give a theoretical justification for the efficacy of the field of genetic algorithms, so readers who have studied genetic algorithms stand to gain the most from this paper. However, nothing beyond basic probability theory is assumed of the reader, and for this reason we write in a fairly informal style.
Because the mathematics behind the theorems in schema theory is relatively elementary, we focus more on the motivation and philosophy. Many of these results have been proven elsewhere, so this paper is designed to serve a primarily expository role. We attempt to cast known results in a new light, which makes the suggested future directions natural. This involves devoting a substantial amount of time to the history of the field.
We hope that this exposition will entice some mathematicians to do research in this area, that it will serve as a road map for researchers new to the field, and that it will help explain how schema theory developed. Furthermore, we hope that the results collected in this document will serve as a useful reference. Finally, as far as the author knows, the questions raised in the final section are new.
Submitted 12 January, 2014;
originally announced January 2014.
-
Pennants for Descriptors
Authors:
Howard D. White,
Philipp Mayr
Abstract:
We present a new technique (called pennants) for displaying the descriptors related to a descriptor across literatures, rather than in a thesaurus. It has definite implications for online searching and browsing. Pennants, named for the flag they resemble, are a form of algorithmic prediction. Their cognitive base is in relevance theory (RT) from linguistic pragmatics (Sperber & Wilson 1995).
Submitted 14 October, 2013;
originally announced October 2013.
-
Minimal Dirichlet energy partitions for graphs
Authors:
Braxton Osting,
Chris D. White,
Edouard Oudet
Abstract:
Motivated by a geometric problem, we introduce a new non-convex graph partitioning objective where the optimality criterion is given by the sum of the Dirichlet eigenvalues of the partition components. A relaxed formulation is identified and a novel rearrangement algorithm is proposed, which we show is strictly decreasing and converges in a finite number of iterations to a local minimum of the relaxed objective function. Our method is applied to several clustering problems on graphs constructed from synthetic data, MNIST handwritten digits, and manifold discretizations. The model has a semi-supervised extension and provides a natural representative for the clusters as well.
Submitted 20 May, 2014; v1 submitted 22 August, 2013;
originally announced August 2013.
-
Traversals of Infinite Graphs with Random Local Orientations
Authors:
David White
Abstract:
We introduce the notion of a "random basic walk" on an infinite graph, give numerous examples, list potential applications, and provide detailed comparisons between the random basic walk and existing generalizations of simple random walks. We define analogues in the setting of random basic walks of the notions of recurrence and transience in the theory of simple random walks, and we study the question of which graphs have a cycling random basic walk and which a transient random basic walk.
We prove that cycles of arbitrary length are possible in any regular graph, but that they are unlikely. We give upper bounds on the expected number of vertices a random basic walk will visit on the infinite graphs studied and on their finite analogues of sufficiently large size. We then study random basic walks on complete graphs and prove that, on this class, random basic walks asymptotically visit a constant fraction of the nodes. We end with numerous conjectures and problems for future study, as well as ideas for how to approach these problems.
Submitted 5 August, 2013;
originally announced August 2013.