-
Multi-Patch Isogeometric Convolution Hierarchical Deep-learning Neural Network
Authors:
Lei Zhang,
Chanwook Park,
T. J. R. Hughes,
Wing Kam Liu
Abstract:
A seamless integration of neural networks with Isogeometric Analysis (IGA) was first introduced in [1] under the name of Hierarchical Deep-learning Neural Network (HiDeNN) and has systematically evolved into Isogeometric Convolution HiDeNN (in short, C-IGA) [2]. C-IGA achieves higher-order approximations without increasing the number of degrees of freedom. Due to the Kronecker delta property of C-IGA shape functions, one can refine the mesh in the physical domain as in the standard finite element method (FEM) while maintaining the exact geometrical mapping of IGA. In this article, C-IGA theory is generalized for multi-CAD-patch systems with a mathematical investigation of the compatibility conditions at patch interfaces and convergence of error estimates. Two compatibility conditions (nodal compatibility and G^0 (i.e., global C^0) compatibility) are presented and validated through numerical examples.
Submitted 5 June, 2024;
originally announced June 2024.
-
Challenges and Opportunities for Large-Scale Exploration with Air-Ground Teams using Semantics
Authors:
Fernando Cladera,
Ian D. Miller,
Zachary Ravichandran,
Varun Murali,
Jason Hughes,
M. Ani Hsieh,
C. J. Taylor,
Vijay Kumar
Abstract:
One common and desirable application of robots is exploring potentially hazardous and unstructured environments. Air-ground collaboration offers a synergistic approach to addressing such exploration challenges. In this paper, we demonstrate a system for large-scale exploration using a team of aerial and ground robots. Our system uses semantics as a lingua franca and relies on fully opportunistic communications. We highlight the unique challenges arising from this approach, explain our system architecture, and showcase lessons learned during our experiments. All our code is open-source, encouraging researchers to use it and build upon it.
Submitted 12 May, 2024;
originally announced May 2024.
-
From Rigid to Soft Robotic Approaches for Minimally Invasive Neurosurgery
Authors:
Kieran Gilday,
Irena Zubak,
Andreas Raabe,
Josie Hughes
Abstract:
Robotic assistance has significantly improved the outcomes of open microsurgery and rigid endoscopic surgery, but has yet to make an impact in flexible endoscopic neurosurgery. Some of the most common intracranial procedures for treatment of hydrocephalus and tumors stand to benefit from the increased dexterity and reduced invasiveness offered by robotic systems that can navigate the deep ventricular system of the brain. We review a spectrum of flexible robotic devices, from the traditional highly actuated approach to more novel and bio-inspired mechanisms for safe navigation. For each technology, we identify the operating principle and evaluate the potential for minimally invasive surgical applications. Overall, rigid-type continuum robots have seen the most development; however, approaches that combine rigid and soft robotic principles into innovative devices are ideally situated to address safety and complexity limitations after future design evolution. We also observe a number of related challenges in the field, from surgeon-robot interfaces to robot evaluation procedures. Fundamentally, the challenges revolve around a guarantee of safety in robotic devices with the prerequisites to assist and improve upon surgical tasks. With innovative designs, materials and evaluation techniques emerging, we see potential impacts in the next 5-10 years.
Submitted 22 April, 2024;
originally announced April 2024.
-
Robust Anthropomorphic Robotic Manipulation through Biomimetic Distributed Compliance
Authors:
Kai Junge,
Josie Hughes
Abstract:
The impressive capabilities of humans to robustly perform manipulation rely on compliant interactions, enabled through the structure and materials spatially distributed in our hands. We propose that by mimicking this distributed compliance in an anthropomorphic robotic hand, open-loop manipulation robustness increases, and we observe the emergence of human-like behaviours. To achieve this, we introduce the ADAPT Hand, equipped with tunable compliance throughout the skin, fingers, and wrist. Through extensive automated pick-and-place tests, we show that the grasping robustness closely mirrors an estimated geometric theoretical limit, while 'stress-testing' the robot hand to perform 800+ grasps. Finally, 24 items with largely varying geometries are grasped in a constrained environment with a success rate of 93%. We demonstrate that hand-object self-organization behavior underlies this extreme robustness, where the hand automatically exhibits different grasp types depending on object geometries. Furthermore, the robot grasp type mimics a natural human grasp with a direct similarity of 68%.
Submitted 14 April, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data
Authors:
Matthias Gerstgrasser,
Rylan Schaeffer,
Apratim Dey,
Rafael Rafailov,
Henry Sleight,
John Hughes,
Tomasz Korbak,
Rajashree Agrawal,
Dhruv Pai,
Andrey Gromov,
Daniel A. Roberts,
Diyi Yang,
David L. Donoho,
Sanmi Koyejo
Abstract:
The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investigations into model-data feedback loops proposed that such loops would lead to a phenomenon termed model collapse, under which performance progressively degrades with each model-data feedback iteration until fitted models become useless. However, those studies largely assumed that new data replace old data over time, whereas an arguably more realistic assumption is that data accumulate over time. In this paper, we ask: what effect does accumulating data have on model collapse? We empirically study this question by pretraining sequences of language models on text corpora. We confirm that replacing the original real data with each generation's synthetic data does indeed tend towards model collapse, then demonstrate that accumulating the successive generations of synthetic data alongside the original real data avoids model collapse; these results hold across a range of model sizes, architectures, and hyperparameters. We obtain similar results for deep generative models on other types of real data: diffusion models for molecule conformation generation and variational autoencoders for image generation. To understand why accumulating data can avoid model collapse, we use an analytically tractable framework introduced by prior work in which a sequence of linear models is fit to the previous models' outputs. Previous work used this framework to show that if data are replaced, the test error increases with the number of model-fitting iterations; we extend this argument to prove that if data instead accumulate, the test error has a finite upper bound independent of the number of iterations, meaning model collapse no longer occurs.
Submitted 29 April, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Detection of subclinical atherosclerosis by image-based deep learning on chest x-ray
Authors:
Guglielmo Gallone,
Francesco Iodice,
Alberto Presta,
Davide Tore,
Ovidio de Filippo,
Michele Visciano,
Carlo Alberto Barbano,
Alessandro Serafini,
Paola Gorrini,
Alessandro Bruno,
Walter Grosso Marra,
James Hughes,
Mario Iannaccone,
Paolo Fonio,
Attilio Fiandrotti,
Alessandro Depaoli,
Marco Grangetto,
Gaetano Maria de Ferrari,
Fabrizio D'Ascenzo
Abstract:
Aims. To develop a deep-learning based system for recognition of subclinical atherosclerosis on a plain frontal chest x-ray. Methods and Results. A deep-learning algorithm to predict coronary artery calcium (CAC) score (the AI-CAC model) was developed on 460 chest x-rays (80% training cohort, 20% internal validation cohort) of primary prevention patients (58.4% male, median age 63 [51-74] years) with available paired chest x-ray and chest computed tomography (CT) indicated for any clinical reason and performed within 3 months. The CAC score calculated on chest CT was used as ground truth. The model was validated on a temporally independent cohort of 90 patients from the same institution (external validation). The diagnostic accuracy of the AI-CAC model assessed by the area under the curve (AUC) was the primary outcome. Overall, median AI-CAC score was 35 (0-388) and 28.9% of patients had no AI-CAC. AUC of the AI-CAC model to identify a CAC>0 was 0.90 in the internal validation cohort and 0.77 in the external validation cohort. Sensitivity was consistently above 92% in both cohorts. In the overall cohort (n=540), among patients with AI-CAC=0, a single ASCVD event occurred, after 4.3 years. Patients with AI-CAC>0 had significantly higher Kaplan-Meier estimates for ASCVD events (13.5% vs. 3.4%, log-rank p=0.013). Conclusion. The AI-CAC model seems to accurately detect subclinical atherosclerosis on chest x-ray with elevated sensitivity, and to predict ASCVD events with elevated negative predictive value. Adoption of the AI-CAC model to refine CV risk stratification or as an opportunistic screening tool requires prospective evaluation.
Submitted 27 March, 2024;
originally announced March 2024.
-
Quadcopter Team Configurable Motion Guided by a Quadruped
Authors:
Mohammad Ghufran,
Sourish Tetakayala,
Jack Hughes,
Aron Wilson,
Hossein Rastgoftar
Abstract:
The paper focuses on modeling and experimental evaluation of a quadcopter team configurable coordination guided by a single quadruped robot. We consider the quadcopter team as particles of a two-dimensional deformable body and propose a two-dimensional affine transformation model for safe and collision-free configurable coordination of this heterogeneous robotic system. The proposed affine transformation is decomposed into a translation, which is specified by the quadruped's global position, and a configurable motion of the quadcopters, which is determined by a nonsingular Jacobian matrix, so that the quadcopter team can safely navigate a constrained environment while avoiding collision. We propose two methods to experimentally evaluate the proposed heterogeneous robot coordination model. The first method measures the real positions of the quadcopters, the quadruped, and environmental objects, all with respect to the global coordinate system. The second method, on the other hand, measures position with respect to the local coordinate system fixed on the dog robot, which in turn enables safe planning of the Jacobian matrix of the quadcopter team while the world virtually approaches the robotic system.
Submitted 20 March, 2024;
originally announced March 2024.
-
Anomaly Detection in Offshore Wind Turbine Structures using Hierarchical Bayesian Modelling
Authors:
S. M. Smith,
A. J. Hughes,
T. A. Dardeno,
L. A. Bull,
N. Dervilis,
K. Worden
Abstract:
Population-based structural health monitoring (PBSHM) aims to share information between members of a population. An offshore wind (OW) farm could be considered as a population of nominally identical wind-turbine structures. However, benign variations exist among members, such as differences in geometry, sea-bed conditions and temperature. These factors could influence structural properties and therefore the dynamic response, making it more difficult to detect structural problems via traditional SHM techniques. This paper explores the use of a hierarchical Bayesian model to infer expected soil stiffness distributions at both population and local levels, as a basis to perform anomaly detection, in the form of scour, for new and existing turbines. To do this, observations of natural frequency will be generated as though they are from a small population of wind turbines. Differences between individual observations will be introduced by postulating distributions over the soil stiffness and measurement noise, as well as by reducing soil depth (to represent scour) in the case of anomaly detection.
Submitted 29 February, 2024;
originally announced February 2024.
-
Debating with More Persuasive LLMs Leads to More Truthful Answers
Authors:
Akbir Khan,
John Hughes,
Dan Valentine,
Laura Ruis,
Kshitij Sachan,
Ansh Radhakrishnan,
Edward Grefenstette,
Samuel R. Bowman,
Tim Rocktäschel,
Ethan Perez
Abstract:
Common methods for aligning large language models (LLMs) with desired behaviour heavily rely on human-labelled data. However, as models grow increasingly sophisticated, they will surpass human expertise, and the role of human evaluation will evolve into non-experts overseeing experts. In anticipation of this, we ask: can weaker models assess the correctness of stronger models? We investigate this question in an analogous setting, where stronger models (experts) possess the necessary information to answer questions and weaker models (non-experts) lack this information. The method we evaluate is debate, where two LLM experts each argue for a different answer, and a non-expert selects the answer. We find that debate consistently helps both non-expert models and humans answer questions, achieving 76% and 88% accuracy respectively (naive baselines obtain 48% and 60%). Furthermore, optimising expert debaters for persuasiveness in an unsupervised manner improves non-expert ability to identify the truth in debates. Our results provide encouraging empirical evidence for the viability of aligning models with debate in the absence of ground truth.
Submitted 30 May, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
-
Monitoring-Supported Value Generation for Managing Structures and Infrastructure Systems
Authors:
Antonios Kamariotis,
Eleni Chatzi,
Daniel Straub,
Nikolaos Dervilis,
Kai Goebel,
Aidan J. Hughes,
Geert Lombaert,
Costas Papadimitriou,
Konstantinos G. Papakonstantinou,
Matteo Pozzi,
Michael Todd,
Keith Worden
Abstract:
To maximize its value, the design, development and implementation of Structural Health Monitoring (SHM) should focus on its role in facilitating decision support. In this position paper, we offer perspectives on the synergy between SHM and decision-making. We propose a classification of SHM use cases aligning with various dimensions that are closely linked to the respective decision contexts. The types of decisions that have to be supported by the SHM system within these settings are discussed along with the corresponding challenges. We provide an overview of different classes of models that are required for integrating SHM in the decision-making process to support management and operation and maintenance of structures and infrastructure systems. Fundamental decision-theoretic principles and state-of-the-art methods for optimizing maintenance and operational decision-making under uncertainty are briefly discussed. Finally, we offer a viewpoint on the appropriate course of action for quantifying, validating and maximizing the added value generated by SHM. This work aspires to synthesize the different perspectives of the SHM, Prognostic Health Management (PHM), and reliability communities, and deliver a roadmap towards monitoring-based decision support.
Submitted 4 January, 2024;
originally announced February 2024.
-
A simple and efficient hybrid discretization approach to alleviate membrane locking in isogeometric thin shells
Authors:
Roger A. Sauer,
Zhihui Zou,
Thomas J. R. Hughes
Abstract:
This work presents a new hybrid discretization approach to alleviate membrane locking in isogeometric finite element formulations for Kirchhoff-Love shells. The approach is simple, and requires no additional dofs and no static condensation. It does not increase the bandwidth of the tangent matrix and is effective for both linear and nonlinear problems. It combines isogeometric surface discretizations with classical Lagrange-based surface discretizations, and can thus be run with existing isogeometric finite element codes. Also, the stresses can be recovered straightforwardly. The effectiveness of the proposed approach in alleviating, if not eliminating, membrane locking is demonstrated through the rigorous study of the convergence behavior of several classical benchmark problems. Accuracy gains are particularly large in the membrane stresses. The approach is formulated here for quadratic NURBS, but an extension to other discretization types can be anticipated. The same applies to other constraints and associated locking phenomena.
Submitted 28 December, 2023;
originally announced December 2023.
-
Lessons Learned from Efforts to Standardize Streaming In SQL
Authors:
Sabina Petride,
Dan Sotolongo,
Jan Michels,
Andrew Witkowski,
Cara Haas,
Jim Hughes
Abstract:
Acknowledging the reality of streaming languages and platforms overlapping with SQL and database systems, in 2019 INCITS Data Management established an Expert Group with the focused mission to initiate the process of standardizing streaming support in SQL. Over time, the roster included companies like Actian, Alibaba, Amazon Web Services, Confluent, dbt Labs, Google, Hazelcast, IBM, Materialize, Microsoft, Oracle, Snowflake, SQLstream and Timeplus. Over a span of more than one year, representatives of each company presented key features of their streaming product or, in some cases, multiple streaming products. These were live technical Q&A sessions accompanied by summary or position papers, which are unquestionably valuable. As expected, substantial time was spent clarifying what common terms meant in each system and setting up a glossary. These sessions were followed by clarification notes and debates, and by decisions that appeared mandatory to allow further progress. This first phase was followed by a second phase, which consisted of group meetings in which the expert group (EG) agreed on the main exit criteria topics that a streaming solution in SQL must address, and in which position papers and follow-ups were written and discussed. This paper summarizes these group efforts, up to the summer of 2023.
Submitted 6 November, 2023;
originally announced November 2023.
-
Quantifying the value of information transfer in population-based SHM
Authors:
Aidan J. Hughes,
Jack Poole,
Nikolaos Dervilis,
Paul Gardner,
Keith Worden
Abstract:
Population-based structural health monitoring (PBSHM) seeks to address some of the limitations associated with data scarcity that arise in traditional SHM. A tenet of the population-based approach to SHM is that information can be shared between sufficiently similar structures in order to improve predictive models. Transfer learning techniques, such as domain adaptation, have been shown to be a highly useful technology for sharing information between structures when developing statistical classifiers for PBSHM. Nonetheless, transfer-learning techniques are not without their pitfalls. In some circumstances, for example if the data distributions associated with the structures within a population are dissimilar, applying transfer-learning methods can be detrimental to classification performance; this phenomenon is known as negative transfer. Given the potentially severe consequences of negative transfer, it is prudent for engineers to ask the question 'when, what, and how should one transfer between structures?'.
The current paper aims to demonstrate a transfer-strategy decision process for a classification task for a population of simulated structures in the context of a representative SHM maintenance problem, supported by domain adaptation. The transfer decision framework is based upon the concept of expected value of information transfer. In order to compute the expected value of information transfer, predictions must be made regarding the classification (and decision performance) in the target domain following information transfer. In order to forecast the outcome of transfers, a probabilistic regression is used here to predict classification performance from a proxy for structural similarity based on the modal assurance criterion.
Submitted 6 November, 2023;
originally announced November 2023.
-
The Data Lakehouse: Data Warehousing and More
Authors:
Dipankar Mazumdar,
Jason Hughes,
JB Onofre
Abstract:
Relational Database Management Systems designed for Online Analytical Processing (RDBMS-OLAP) have been foundational to democratizing data and enabling analytical use cases such as business intelligence and reporting for many years. However, RDBMS-OLAP systems present some well-known challenges. They are primarily optimized only for relational workloads, lead to proliferation of data copies which can become unmanageable, and since the data is stored in proprietary formats, it can lead to vendor lock-in, restricting access to engines, tools, and capabilities beyond what the vendor offers. As the demand for data-driven decision making surges, the need for a more robust data architecture to address these challenges becomes ever more critical. Cloud data lakes have addressed some of the shortcomings of RDBMS-OLAP systems, but they present their own set of challenges. More recently, organizations have often followed a two-tier architectural approach to take advantage of both these platforms, leveraging both cloud data lakes and RDBMS-OLAP systems. However, this approach brings additional challenges, complexities, and overhead. This paper discusses how a data lakehouse, a new architectural approach, achieves the same benefits of an RDBMS-OLAP and cloud data lake combined, while also providing additional advantages. We take today's data warehousing and break it down into implementation independent components, capabilities, and practices. We then take these aspects and show how a lakehouse architecture satisfies them. Then, we go a step further and discuss what additional capabilities and benefits a lakehouse architecture provides over an RDBMS-OLAP.
Submitted 12 October, 2023;
originally announced October 2023.
-
Sharing Information Between Machine Tools to Improve Surface Finish Forecasting
Authors:
Daniel R. Clarkson,
Lawrence A. Bull,
Tina A. Dardeno,
Chandula T. Wickramarachchi,
Elizabeth J. Cross,
Timothy J. Rogers,
Keith Worden,
Nikolaos Dervilis,
Aidan J. Hughes
Abstract:
At present, most surface-quality prediction methods can only perform single-task prediction which results in under-utilised datasets, repetitive work and increased experimental costs. To counter this, the authors propose a Bayesian hierarchical model to predict surface-roughness measurements for a turning machining process. The hierarchical model is compared to multiple independent Bayesian linear regression models to showcase the benefits of partial pooling in a machining setting with respect to prediction accuracy and uncertainty quantification.
Submitted 9 October, 2023;
originally announced October 2023.
-
Patient-specific computational forecasting of prostate cancer growth during active surveillance using an imaging-informed biomechanistic model
Authors:
Guillermo Lorenzo,
Jon S. Heiselman,
Michael A. Liss,
Michael I. Miga,
Hector Gomez,
Thomas E. Yankeelov,
Alessandro Reali,
Thomas J. R. Hughes
Abstract:
Active surveillance (AS) is a suitable management option for newly-diagnosed prostate cancer (PCa), which usually presents low to intermediate clinical risk. Patients enrolled in AS have their tumor closely monitored via longitudinal multiparametric magnetic resonance imaging (mpMRI), serum prostate-specific antigen tests, and biopsies. Hence, the patient is prescribed treatment when these tests identify progression to higher-risk PCa. However, current AS protocols rely on detecting tumor progression through direct observation according to standardized monitoring strategies. This approach limits the design of patient-specific AS plans and may lead to the late detection and treatment of tumor progression. Here, we propose to address these issues by leveraging personalized computational predictions of PCa growth. Our forecasts are obtained with a spatiotemporal biomechanistic model informed by patient-specific longitudinal mpMRI data. Our results show that our predictive technology can represent and forecast the global tumor burden for individual patients, achieving concordance correlation coefficients ranging from 0.93 to 0.99 across our cohort (n=7). Additionally, we identify a model-based biomarker of higher-risk PCa: the mean proliferation activity of the tumor (p=0.041). Using logistic regression, we construct a PCa risk classifier based on this biomarker that achieves an area under the receiver operating characteristic curve of 0.83. We further show that coupling our tumor forecasts with this PCa risk classifier enables the early identification of PCa progression to higher-risk disease by more than one year. Thus, we posit that our predictive technology constitutes a promising clinical decision-making tool to design personalized AS plans for PCa patients.
Submitted 29 September, 2023;
originally announced October 2023.
-
Exploring API Behaviours Through Generated Examples
Authors:
Stefan Karlsson,
John Hughes,
Robbert Jongeling,
Adnan Causevic,
Daniel Sundmark
Abstract:
Understanding the behaviour of a system's API can be hard. Giving users access to relevant examples of how an API behaves has been shown to make this easier for them. In addition, such examples can be used to verify expected behaviour or identify unwanted behaviours.
Methods for automatically generating examples have existed for a long time. However, state-of-the-art methods rely on either white-box information, such as source code, or on formal specifications of the system behaviour. But what if you do not have access to either, for example when interacting with a third-party API?
In this paper, we present an approach to automatically generate relevant examples of behaviours of an API, without requiring either source code or a formal specification of behaviour.
Evaluation on an industry-grade REST API shows that our method can produce small and relevant examples that can help engineers to understand the system under exploration.
Submitted 29 August, 2023;
originally announced August 2023.
-
A decision framework for selecting information-transfer strategies in population-based SHM
Authors:
Aidan J. Hughes,
Jack Poole,
Nikolaos Dervilis,
Paul Gardner,
Keith Worden
Abstract:
Decision-support for the operation and maintenance of structures provides significant motivation for the development and implementation of structural health monitoring (SHM) systems. Unfortunately, the limited availability of labelled training data hinders the development of the statistical models on which these decision-support systems rely. Population-based SHM seeks to mitigate the impact of data scarcity by using transfer learning techniques to share information between individual structures within a population. The current paper proposes a decision framework for selecting transfer strategies based upon a novel concept -- the expected value of information transfer -- such that negative transfer is avoided. By avoiding negative transfer, and by optimising information transfer strategies using the transfer-decision framework, one can reduce the costs associated with operating and maintaining structures, and improve safety.
Submitted 13 July, 2023;
originally announced July 2023.
-
Can Large Language Models design a Robot?
Authors:
Francesco Stella,
Cosimo Della Santina,
Josie Hughes
Abstract:
Large Language Models can lead researchers in the design of robots.
Submitted 15 March, 2023;
originally announced March 2023.
-
Towards risk-informed PBSHM: Populations as hierarchical systems
Authors:
Aidan J. Hughes,
Paul Gardner,
Keith Worden
Abstract:
The prospect of informed and optimal decision-making regarding the operation and maintenance (O&M) of structures provides impetus to the development of structural health monitoring (SHM) systems. A probabilistic risk-based framework for decision-making has already been proposed. However, in order to learn the statistical models necessary for decision-making, measured data from the structure of interest are required. Unfortunately, these data are seldom available across the range of environmental and operational conditions necessary to ensure good generalisation of the model.
Recently, technologies have been developed that overcome this challenge by extending SHM to populations of structures, such that valuable knowledge may be transferred between instances of structures that are sufficiently similar. This new approach is termed population-based structural health monitoring (PBSHM).
The current paper presents a formal representation of populations of structures, such that risk-based decision processes may be specified within them. The population-based representation is an extension to the hierarchical representation of a structure used within the probabilistic risk-based decision framework to define fault trees. The result is a series, consisting of systems of systems ranging from the individual component level up to an inventory of heterogeneous populations. The current paper considers an inventory of wind farms as a motivating example and highlights the inferences and decisions that can be made within the hierarchical representation.
Submitted 13 March, 2023;
originally announced March 2023.
-
VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs
Authors:
Geonhwa Jeong,
Sana Damani,
Abhimanyu Rajeshkumar Bambhaniya,
Eric Qin,
Christopher J. Hughes,
Sreenivas Subramoney,
Hyesoon Kim,
Tushar Krishna
Abstract:
Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with several companies (Arm, Intel, IBM) announcing products with specialized matrix engines accessible via GEMM instructions. CPUs are pervasive and need to handle diverse requirements across DL workloads running in edge/HPC/cloud platforms. Therefore, as DL workloads embrace sparsity to reduce the computations and memory size of models, it is also imperative for CPUs to add support for sparsity to avoid under-utilization of the dense matrix engine and inefficient usage of the caches and registers. This work presents VEGETA, a set of ISA and microarchitecture extensions over dense matrix engines to support flexible structured sparsity for CPUs, enabling programmable support for diverse DL models with varying degrees of sparsity. Compared to the state-of-the-art (SOTA) dense matrix engine in CPUs, a VEGETA engine provides 1.09x, 2.20x, 3.74x, and 3.28x speed-ups when running 4:4 (dense), 2:4, 1:4, and unstructured (95%) sparse DNN layers.
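As a rough illustration of the N:M structured sparsity patterns named in the abstract above (for example 2:4, where at most 2 of every 4 consecutive weights are nonzero), the following Python sketch prunes one row of weights by magnitude. The function name `prune_n_of_m` and the magnitude-based selection rule are assumptions made for illustration; this is not the VEGETA ISA, nor necessarily how the paper produces its sparse layers.

```python
def prune_n_of_m(row, n=2, m=4):
    """Keep the n largest-magnitude values in each group of m consecutive
    values and zero the rest (hypothetical helper, not from the paper)."""
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    if not 0 <= n <= m:
        raise ValueError("n must satisfy 0 <= n <= m")
    pruned = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Positions of the n entries with the largest absolute value.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


weights = [1.0, -5.0, 2.0, 0.5, 3.0, 3.5, -0.1, 4.0]
print(prune_n_of_m(weights, n=2, m=4))
```

With `n=4, m=4` the row is returned unchanged, which corresponds to the dense 4:4 case listed in the abstract.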
Submitted 23 February, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Dynamic and Distributed Optimization for the Allocation of Aerial Swarm Vehicles
Authors:
Jason Hughes,
Dominic Larkin,
Charles O'Donnell,
Christopher Korpela
Abstract:
Optimal transport (OT) is a framework that can guide the design of efficient resource allocation strategies in a network of multiple sources and targets. This paper applies discrete OT to a swarm of UAVs in a novel way to achieve appropriate task allocation and execution. Drone swarm deployments already operate in multiple domains where sensors are used to gain knowledge of an environment [1]. Use cases such as chemical and radiation detection, and thermal and RGB imaging, create a specific need for an algorithm that considers parameters on both the UAV and waypoint side and allows for updating the matching scheme as the swarm gains information from the environment. Additionally, the need for a centralized planner can be removed by using a distributed algorithm that can dynamically update based on changes in the swarm network or parameters. To this end, we develop a dynamic and distributed OT algorithm that matches a UAV to the optimal waypoint based on one parameter at the UAV and another parameter at the waypoint. We show the convergence and allocation of the algorithm through a case study and test the algorithm's effectiveness against a greedy assignment algorithm in simulation.
Submitted 30 November, 2022;
originally announced November 2022.
-
Security Investment Over Networks with Bounded Rational Agents: Analysis and Distributed Algorithm
Authors:
Jason Hughes,
Juntao Chen
Abstract:
This paper considers the security investment problem over a network in which the resource owners aim to allocate their constrained security resources to heterogeneous targets strategically. Investing in each target makes it less vulnerable, thus lowering its probability of a successful attack. However, humans tend to perceive such probabilities inaccurately, yielding bounded rational behaviors, a phenomenon frequently observed in their decision-making when facing uncertainties. We capture this human nature through the lens of cumulative prospect theory and establish a behavioral resource allocation framework to account for the human's misperception in security investment. We analyze how this misperception behavior affects the resource allocation plan by comparing it with the accurate perception counterpart. The network can become highly complex with a large number of participating agents. To this end, we further develop a fully distributed algorithm to compute the behavioral security investment strategy efficiently. Finally, we corroborate our results and illustrate the impacts of humans' bounded rationality on the resource allocation scheme using case studies.
Submitted 17 February, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
-
Differentially Private ADMM-Based Distributed Discrete Optimal Transport for Resource Allocation
Authors:
Jason Hughes,
Juntao Chen
Abstract:
Optimal transport (OT) is a framework that can guide the design of efficient resource allocation strategies in a network of multiple sources and targets. To ease the computational complexity of large-scale transport design, we first develop a distributed algorithm based on the alternating direction method of multipliers (ADMM). However, such a distributed algorithm is vulnerable to sensitive information leakage when an attacker intercepts the transport decisions communicated between nodes during the distributed ADMM updates. To this end, we propose a privacy-preserving distributed mechanism based on output variable perturbation by adding appropriate randomness to each node's decision before it is shared with other corresponding nodes at each update instance. We show that the developed scheme is differentially private, which prevents the adversary from inferring the node's confidential information even knowing the transport decisions. Finally, we corroborate the effectiveness of the devised algorithm through case studies.
Submitted 30 November, 2022;
originally announced November 2022.
-
Control and Morphology Optimization of Passive Asymmetric Structures for Robotic Swimming
Authors:
Nana Obayashi,
Andrea Vicari,
Kai Junge,
Kamran Shakir,
Josie Hughes
Abstract:
Aquatic creatures exhibit remarkable adaptations of their body to efficiently interact with the surrounding fluid. The tight coupling between their morphology, motion, and the environment is highly complex but serves as a valuable example when creating biomimetic structures in soft robotic swimmers. We focus on the use of asymmetry in structures to aid thrust generation and maneuverability. Designs of structures with asymmetric profiles are explored so that we can use morphology to `shape' the thrust generation. We propose combining simple simulation with automatic data-driven methods to explore their interactions with the fluid. The asymmetric structure with its co-optimized morphology and controller is able to produce 2.5 times the useful thrust compared to a baseline symmetric structure. Furthermore, these asymmetric feather-like arms are validated on a robotic system capable of forward swimming motion, while the same robot fitted with a plain feather is not able to move forward.
Submitted 24 November, 2022;
originally announced November 2022.
-
Proprioceptive Sensing of Soft Tentacles with Model Based Reconstruction for Controller Optimization
Authors:
Andrea Vicari,
Nana Obayashi,
Francesco Stella,
Gaetan Raynaud,
Karen Mulleners,
Cosimo Della Santina,
Josie Hughes
Abstract:
The success of soft robots in displaying emergent behaviors is tightly linked to the compliant interaction with the environment. However, to exploit such phenomena, proprioceptive sensing methods which do not hinder their softness are needed. In this work we propose a new sensing approach for soft underwater slender structures based on embedded pressure sensors and use a learning-based pipeline to link the sensor readings to the shape of the soft structure. Using two different modeling techniques, we compare the pose reconstruction accuracy and identify the optimal approach. Using the proprioceptive sensing capabilities we show how this information can be used to assess the swimming performance over a number of metrics, namely swimming thrust, tip deflection, and the traveling wave index. We conclude by demonstrating the robustness of the embedded sensor on a free swimming soft robotic squid swimming at a maximum velocity of 9.5 cm/s, with the absolute tip deflection being predicted within an error less than 9% without the aid of external sensors.
Submitted 24 November, 2022;
originally announced November 2022.
-
Piecewise Affine Curvature model: a reduced-order model for soft robot-environment interaction beyond PCC
Authors:
Francesco Stella,
Qinghua Guan,
Jinsong Leng,
Cosimo Della Santina,
Josie Hughes
Abstract:
Soft robots are celebrated for their propensity to enable compliant and complex robot-environment interactions. Soft robotic manipulators, or slender continuum structure robots, have the potential to exploit these interactions to enable new exploration and manipulation capabilities and safe human-robot interactions. However, the interactions, or perturbations by external forces, cause the soft structure to deform in an infinite degree of freedom (DOF) space. To control such a system, reduced-order models are needed; typically, models consider piecewise sections of constant curvature, although external forces often deform the structure out of the constant curvature hypothesis. In this work we perform an analysis of the trade-off between computational treatability and modelling accuracy. We then propose a new kinematic model, the Piecewise Affine Curvature (PAC), which we validate theoretically and experimentally, showing that this higher-order model better captures the configuration of a soft continuum body robot when perturbed by external forces. In comparison to the current state-of-the-art Piecewise Constant Curvature (PCC) model, we demonstrate up to 30% reduction in error for the end position of a soft continuum body robot.
Submitted 18 November, 2022;
originally announced November 2022.
-
Variationally Mimetic Operator Networks
Authors:
Dhruv Patel,
Deep Ray,
Michael R. A. Abdelmalik,
Thomas J. R. Hughes,
Assad A. Oberai
Abstract:
In recent years operator networks have emerged as promising deep learning tools for approximating the solution to partial differential equations (PDEs). These networks map input functions that describe material properties, forcing functions and boundary data to the solution of a PDE. This work describes a new architecture for operator networks that mimics the form of the numerical solution obtained from an approximate variational or weak formulation of the problem. The application of these ideas to a generic elliptic PDE leads to a variationally mimetic operator network (VarMiON). Like the conventional Deep Operator Network (DeepONet) the VarMiON is also composed of a sub-network that constructs the basis functions for the output and another that constructs the coefficients for these basis functions. However, in contrast to the DeepONet, the architecture of these sub-networks in the VarMiON is precisely determined. An analysis of the error in the VarMiON solution reveals that it contains contributions from the error in the training data, the training error, the quadrature error in sampling input and output functions, and a "covering error" that measures the distance between the test input functions and the nearest functions in the training dataset. It also depends on the stability constants for the exact solution operator and its VarMiON approximation. The application of the VarMiON to a canonical elliptic PDE and a nonlinear PDE reveals that for approximately the same number of network parameters, on average the VarMiON incurs smaller errors than a standard DeepONet and a recently proposed multiple-input operator network (MIONet). Further, its performance is more robust to variations in input functions, the techniques used to sample the input and output functions, the techniques used to construct the basis functions, and the number of input functions.
Submitted 29 August, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
-
Mitigating sampling bias in risk-based active learning via an EM algorithm
Authors:
Aidan J. Hughes,
Lawrence A. Bull,
Paul Gardner,
Nikolaos Dervilis,
Keith Worden
Abstract:
Risk-based active learning is an approach to developing statistical classifiers for online decision-support. In this approach, data-label querying is guided according to the expected value of perfect information for incipient data points. For SHM applications, the value of information is evaluated with respect to a maintenance decision process, and the data-label querying corresponds to the inspection of a structure to determine its health state. Sampling bias is a known issue within active-learning paradigms; this occurs when an active learning process over- or undersamples specific regions of a feature-space, thereby resulting in a training set that is not representative of the underlying distribution. This bias ultimately degrades decision-making performance, and as a consequence, results in unnecessary costs incurred. The current paper outlines a risk-based approach to active learning that utilises a semi-supervised Gaussian mixture model. The semi-supervised approach counteracts sampling bias by incorporating pseudo-labels for unlabelled data via an EM algorithm. The approach is demonstrated on a numerical example representative of the decision processes found in SHM.
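To make the quantity that guides querying more concrete, here is a minimal Python sketch of the textbook expected value of perfect information (EVPI) for a discrete decision: the expected utility when the state is known before acting, minus the best expected utility when acting on the prior alone. The function name `evpi`, the two-state health model, and the utility numbers are all illustrative assumptions; the paper evaluates EVPI within its own maintenance decision process, which is not reproduced here.

```python
def evpi(probs, utility):
    """Textbook EVPI for a discrete decision problem.

    probs[s]      -- probability of state s (assumed to sum to 1)
    utility[a][s] -- utility of taking action a when the true state is s
    """
    # Best expected utility when acting on the current belief only.
    best_without_info = max(
        sum(p * u for p, u in zip(probs, row)) for row in utility
    )
    # Expected utility if the state were revealed before acting.
    best_with_info = sum(
        p * max(row[s] for row in utility) for s, p in enumerate(probs)
    )
    return best_with_info - best_without_info


# Hypothetical numbers: states are (healthy, damaged).
probs = [0.8, 0.2]
utility = [
    [0.0, -100.0],   # action 0: do nothing
    [-10.0, -10.0],  # action 1: repair
]
print(evpi(probs, utility))
```

In a risk-based active-learning loop, a label (an inspection) would plausibly be requested only when a value like this exceeds the cost of inspecting; that thresholding rule is likewise an assumption here.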
Submitted 25 June, 2022;
originally announced June 2022.
-
Improving decision-making via risk-based active learning: Probabilistic discriminative classifiers
Authors:
Aidan J. Hughes,
Paul Gardner,
Lawrence A. Bull,
Nikolaos Dervilis,
Keith Worden
Abstract:
Gaining the ability to make informed decisions on operation and maintenance of structures provides motivation for the implementation of structural health monitoring (SHM) systems. However, descriptive labels for measured data corresponding to health-states of the monitored system are often unavailable. This issue limits the applicability of fully-supervised machine learning paradigms for the development of statistical classifiers to be used in decision-support in SHM systems. One approach to dealing with this problem is risk-based active learning. In such an approach, data-label querying is guided according to the expected value of perfect information for incipient data points. For risk-based active learning in SHM, the value of information is evaluated with respect to a maintenance decision process, and the data-label querying corresponds to the inspection of a structure to determine its health state.
In the context of SHM, risk-based active learning has only been considered for generative classifiers. The current paper demonstrates several advantages of using an alternative type of classifier -- discriminative models. Using the Z24 Bridge dataset as a case study, it is shown that discriminative classifiers have benefits, in the context of SHM decision-support, including improved robustness to sampling bias, and reduced expenditure on structural inspections.
Submitted 23 June, 2022;
originally announced June 2022.
-
Electrocardiographic Deep Learning for Predicting Post-Procedural Mortality
Authors:
David Ouyang,
John Theurer,
Nathan R. Stein,
J. Weston Hughes,
Pierre Elias,
Bryan He,
Neal Yuan,
Grant Duffy,
Roopinder K. Sandhu,
Joseph Ebinger,
Patrick Botting,
Melvin Jujjavarapu,
Brian Claggett,
James E. Tooley,
Tim Poterucha,
Jonathan H. Chen,
Michael Nurok,
Marco Perez,
Adler Perotte,
James Y. Zou,
Nancy R. Cook,
Sumeet S. Chugh,
Susan Cheng,
Christine M. Albert
Abstract:
Background. Pre-operative risk assessments used in clinical practice are limited in their ability to identify risk for post-operative mortality. We hypothesize that electrocardiograms contain hidden risk markers that can help prognosticate post-operative mortality. Methods. In a derivation cohort of 45,969 pre-operative patients (age 59 ± 19 years, 55 percent women), a deep learning algorithm was developed to leverage waveform signals from pre-operative ECGs to discriminate post-operative mortality. Model performance was assessed in a holdout internal test dataset and in two external hospital cohorts and compared with the Revised Cardiac Risk Index (RCRI) score. Results. In the derivation cohort, there were 1,452 deaths. The algorithm discriminates mortality with an AUC of 0.83 (95% CI 0.79-0.87), surpassing the discrimination of the RCRI score with an AUC of 0.67 (CI 0.61-0.72) in the held-out test cohort. Patients determined to be high risk by the deep learning model's risk prediction had an unadjusted odds ratio (OR) of 8.83 (5.57-13.20) for post-operative mortality as compared to an unadjusted OR of 2.08 (CI 0.77-3.50) for post-operative mortality for RCRI greater than 2. The deep learning algorithm performed similarly for patients undergoing cardiac surgery with an AUC of 0.85 (CI 0.77-0.92), non-cardiac surgery with an AUC of 0.83 (0.79-0.88), and catheterization or endoscopy suite procedures with an AUC of 0.76 (0.72-0.81). The algorithm similarly discriminated risk for mortality in two separate external validation cohorts from independent healthcare systems with AUCs of 0.79 (0.75-0.83) and 0.75 (0.74-0.76) respectively. Conclusion. The findings demonstrate how a novel deep learning algorithm, applied to pre-operative ECGs, can improve discrimination of post-operative mortality.
Submitted 30 April, 2022;
originally announced May 2022.
-
Deep electric field predictions by drift-reduced Braginskii theory with plasma-neutral interactions based upon experimental images of boundary turbulence
Authors:
Abhilash Mathews,
Jerry Hughes,
James Terry,
Seung-Gyou Baek
Abstract:
We present 2-dimensional turbulent electric field calculations via physics-informed deep learning consistent with (i) drift-reduced Braginskii theory under the framework of an axisymmetric fusion plasma with purely toroidal field and (ii) experimental estimates of the fluctuating electron density and temperature on open field lines obtained from analysis of gas puff imaging of a discharge on the Alcator C-Mod tokamak. The inclusion of effects from the locally puffed atomic helium on particle and energy sources within the reduced plasma turbulence model is found to strengthen correlations between the electric field and electron pressure. The neutrals are also directly associated with broadening the distribution of turbulent field amplitudes and increasing ${\bf E \times B}$ shearing rates. This demonstrates a novel approach in plasma experiments by solving for nonlinear dynamics consistent with partial differential equations and data without encoding explicit boundary or initial conditions.
Submitted 28 November, 2022; v1 submitted 25 April, 2022;
originally announced April 2022.
-
Tight limits and completions from Dedekind-MacNeille to Lambek-Isbell
Authors:
Dusko Pavlovic,
Dominic J. D. Hughes
Abstract:
While any infimum in a poset can also be computed as a supremum, and vice versa, categorical limits and colimits do not always approximate each other. If I approach a point from below, and you approach it from above, then we will surely meet if we live in a poset, but we may miss each other in a category. Can we characterize the limits and the colimits that approximate each other, and guarantee that we will meet? Such limits and colimits are called *tight*. Some critically important network applications depend on them. This paper characterizes tight limits and colimits, and describes tight completions, derived by applying the nucleus construction to adjunctions between loose completions. Just as the Dedekind-MacNeille completion of a poset preserves any existing infima and suprema, the tight completion of a category preserves any existing tight limits and colimits and is therefore idempotent.
Submitted 20 April, 2022;
originally announced April 2022.
-
VaxEquity: A Data-Driven Risk Assessment and Optimization Framework for Equitable Vaccine Distribution
Authors:
Navpreet Kaur,
Jason Hughes,
Juntao Chen
Abstract:
With the continuous rise of COVID-19 cases worldwide, it is imperative to ensure that all vulnerable countries lacking vaccine resources can receive sufficient support to contain the risks. COVAX is such an initiative, operated by the WHO to supply vaccines to the countries most in need. One critical problem faced by COVAX is how to distribute the limited amount of vaccines to these countries in the most efficient and equitable manner. This paper aims to address this challenge by first proposing a data-driven risk assessment and prediction model and then developing a decision-making framework to support the strategic vaccine distribution. The machine learning-based risk prediction model characterizes how the risk is influenced by the underlying essential factors, e.g., the vaccination level among the population in each COVAX country. This predictive model is then leveraged to design the optimal vaccine distribution strategy that simultaneously minimizes the resulting risks while maximizing the vaccination coverage in the countries targeted by COVAX. Finally, we corroborate the proposed framework using case studies with real-world data.
Submitted 18 January, 2022;
originally announced January 2022.
-
On robust risk-based active-learning algorithms for enhanced decision support
Authors:
Aidan J. Hughes,
Lawrence A. Bull,
Paul Gardner,
Nikolaos Dervilis,
Keith Worden
Abstract:
Classification models are a fundamental component of physical-asset management technologies such as structural health monitoring (SHM) systems and digital twins. Previous work introduced risk-based active learning, an online approach for the development of statistical classifiers that takes into account the decision-support context in which they are applied. Decision-making is considered by preferentially querying data labels according to expected value of perfect information (EVPI). Although several benefits are gained by adopting a risk-based active learning approach, including improved decision-making performance, the algorithms suffer from issues relating to sampling bias as a result of the guided querying process. This sampling bias ultimately manifests as a decline in decision-making performance during the later stages of active learning, which in turn corresponds to lost resource/utility.
The current paper proposes two novel approaches to counteract the effects of sampling bias: semi-supervised learning, and discriminative classification models. These approaches are first visualised using a synthetic dataset, then subsequently applied to an experimental case study, specifically, the Z24 Bridge dataset. The semi-supervised learning approach is shown to have variable performance, with robustness to sampling bias dependent on the suitability of the generative distributions selected for the model with respect to each dataset. In contrast, the discriminative classifiers are shown to have excellent robustness to the effects of sampling bias. Moreover, it was found that the number of inspections made during a monitoring campaign, and therefore resource expenditure, could be reduced with the careful selection of the statistical classifiers used within a decision-supporting monitoring system.
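The expected value of perfect information (EVPI) used to guide querying can be illustrated with a toy decision problem. The sketch below is not the authors' formulation: the states, actions, probabilities and utilities are all made-up assumptions, and only the textbook definition (expected utility when the state is known before acting, minus the best expected utility without that knowledge) is used.

```python
def evpi(prior, utility):
    """Expected value of perfect information for a discrete decision problem.

    prior:   dict mapping state -> probability (assumed to sum to 1)
    utility: dict mapping action -> dict mapping state -> utility
    """
    # Best expected utility when acting on the prior alone.
    best_without_info = max(
        sum(prior[s] * u[s] for s in prior) for u in utility.values()
    )
    # Expected utility when the true state is revealed before acting.
    best_with_info = sum(
        prior[s] * max(u[s] for u in utility.values()) for s in prior
    )
    return best_with_info - best_without_info


# Hypothetical numbers: a structure is probably healthy, repair is cheap
# relative to an undetected failure.
prior = {"healthy": 0.9, "damaged": 0.1}
utility = {
    "do_nothing": {"healthy": 0.0, "damaged": -100.0},
    "repair": {"healthy": -10.0, "damaged": -10.0},
}
value_of_inspection = evpi(prior, utility)
print(value_of_inspection)
```

Under these assumed numbers an inspection would be worth querying only if it costs less than the printed value, which mirrors how a label query is mapped onto an inspection in the abstract.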
Submitted 12 July, 2022; v1 submitted 7 January, 2022;
originally announced January 2022.
-
Deriving Distributive Laws for Graded Linear Types
Authors:
Jack Hughes,
Michael Vollmer,
Dominic Orchard
Abstract:
The recent notion of graded modal types provides a framework for extending type theories with fine-grained data-flow reasoning. The Granule language explores this idea in the context of linear types. In this practical setting, we observe that the presence of graded modal types can introduce an additional impediment when programming: when composing programs, it is often necessary to 'distribute' data types over graded modalities, and vice versa. In this paper, we show how to automatically derive these distributive laws as combinators for programming. We discuss the implementation and use of this automated deriving procedure in Granule, providing easy access to these distributive combinators. This work is also applicable to Linear Haskell (which retrofits Haskell with linear types via grading) and we apply our technique there to provide the same automatically derived combinators. Along the way, we discuss interesting considerations for pattern matching analysis via graded linear types. Lastly, we show how other useful structural combinators can also be automatically derived.
Submitted 30 December, 2021;
originally announced December 2021.
-
RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU
Authors:
Geonhwa Jeong,
Eric Qin,
Ananda Samajdar,
Christopher J. Hughes,
Sreenivas Subramoney,
Hyesoon Kim,
Tushar Krishna
Abstract:
As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency. Systolic arrays have been the premier architectural choice as matrix engines in offload accelerators. However, we demonstrate that incorporating them inside CPUs can introduce under-utilization and stalls due to limited register storage to amortize the fill and drain times of the array. To address this, we propose RASA, Register-Aware Systolic Array. We develop techniques to divide an execution stage into several sub-stages and overlap instructions to hide overheads and run them concurrently. RASA-based designs improve performance significantly with negligible area and power overhead.
Submitted 4 October, 2021;
originally announced October 2021.
-
Turbulent field fluctuations in gyrokinetic and fluid plasmas
Authors:
Abhilash Mathews,
Noah Mandell,
Manaure Francisquez,
Jerry Hughes,
Ammar Hakim
Abstract:
A key uncertainty in the design and development of magnetic confinement fusion energy reactors is predicting edge plasma turbulence. An essential step in overcoming this uncertainty is the validation in accuracy of reduced turbulent transport models. Drift-reduced Braginskii two-fluid theory is one such set of reduced equations that has for decades simulated boundary plasmas in experiment, but significant questions exist regarding its predictive ability. To this end, using a novel physics-informed deep learning framework, we demonstrate the first ever direct quantitative comparisons of turbulent field fluctuations between electrostatic two-fluid theory and electromagnetic gyrokinetic modelling with good overall agreement found in magnetized helical plasmas at low normalized pressure. This framework is readily adaptable to experimental and astrophysical environments, and presents a new technique for the numerical validation and discovery of reduced global plasma turbulence models.
Submitted 6 October, 2021; v1 submitted 20 July, 2021;
originally announced July 2021.
-
On risk-based active learning for structural health monitoring
Authors:
A. J. Hughes,
L. A. Bull,
P. Gardner,
R. J. Barthorpe,
N. Dervilis,
K. Worden
Abstract:
A primary motivation for the development and implementation of structural health monitoring systems is the prospect of gaining the ability to make informed decisions regarding the operation and maintenance of structures and infrastructure. Unfortunately, descriptive labels for measured data corresponding to health-state information for the structure of interest are seldom available prior to the implementation of a monitoring system. This issue limits the applicability of the traditional supervised and unsupervised approaches to machine learning in the development of statistical classifiers for decision-supporting SHM systems.
The current paper presents a risk-based formulation of active learning, in which the querying of class-label information is guided by the expected value of said information for each incipient data point. When applied to structural health monitoring, the querying of class labels can be mapped onto the inspection of a structure of interest in order to determine its health state. In the current paper, the risk-based active learning process is explained and visualised via a representative numerical example and subsequently applied to the Z24 Bridge benchmark. The results of the case studies indicate that a decision-maker's performance can be improved via the risk-based active learning of a statistical classifier, such that the decision process itself is taken into account.
Submitted 16 November, 2021; v1 submitted 12 May, 2021;
originally announced May 2021.
-
Wall Detection Via IMU Data Classification In Autonomous Quadcopters
Authors:
Jason Hughes,
Damian Lyons
Abstract:
An autonomous drone flying near obstacles needs to be able to detect and avoid the obstacles or it will collide with them. In prior work, drones can detect and avoid walls using data from camera, ultrasonic or laser sensors mounted either on the drone or in the environment. It is not always possible to instrument the environment, and sensors added to the drone consume payload and power - both of which are constrained for drones.
This paper studies how data mining classification techniques can be used to predict where an obstacle is in relation to the drone based only on monitoring air disturbance. We modeled the airflow of the rotors physically to deduce higher-level features for classification. Data was collected from the drone's IMU while it was flying with a wall directly to its left, front and right, as well as with no walls present. In total, 18 higher-level features were produced from the raw data. We used an 80%/20% train-test split with the RandomForest (RF), K-Nearest Neighbor (KNN) and GradientBoosting (GB) classifiers. Our results show that the RF classifier can predict, with 90% accuracy, the direction of a wall relative to the drone.
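The evaluation protocol described above (an 80%/20% split followed by classification into four wall positions) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic stand-ins for the 18 IMU-derived features, the class names are hypothetical, and a plain standard-library k-nearest-neighbour classifier is used in place of whatever implementation the authors ran.

```python
import random
from collections import Counter

# Hypothetical class names for the four flight conditions.
LABELS = ["left", "front", "right", "none"]


def make_synthetic_data(n_per_class=50, n_features=18, seed=0):
    """Fabricated stand-in for the 18 higher-level features; not real IMU data."""
    rng = random.Random(seed)
    data = []
    for k, label in enumerate(LABELS):
        for _ in range(n_per_class):
            features = [rng.gauss(float(k), 1.0) for _ in range(n_features)]
            data.append((features, label))
    return data


def train_test_split(data, train_fraction=0.8, seed=0):
    """Shuffle a copy of the data and cut it into train and test parts."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k=5):
    """Majority vote among the k training points closest to x."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))

    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test samples whose predicted label matches the true one."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


data = make_synthetic_data()
train, test = train_test_split(data)
acc = accuracy(train, test)
print(len(train), len(test), acc)
```

The accuracy printed here says nothing about the paper's results; it only reflects how well separated the synthetic classes were made.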
Submitted 29 March, 2021;
originally announced March 2021.
-
Quantitative in vivo imaging to enable tumor forecasting and treatment optimization
Authors:
Guillermo Lorenzo,
David A. Hormuth II,
Angela M. Jarrett,
Ernesto A. B. F. Lima,
Shashank Subramanian,
George Biros,
J. Tinsley Oden,
Thomas J. R. Hughes,
Thomas E. Yankeelov
Abstract:
Current clinical decision-making in oncology relies on averages of large patient populations to assess both tumor status and treatment outcomes. However, cancers exhibit an inherent evolving heterogeneity that requires an individual approach based on rigorous and precise predictions of cancer growth and treatment response. To this end, we advocate the use of quantitative in vivo imaging data to calibrate mathematical models for the personalized forecasting of tumor development. In this chapter, we summarize the main data types available from both common and emerging in vivo medical imaging technologies, and how these data can be used to obtain patient-specific parameters for common mathematical models of cancer. We then outline computational methods designed to solve these models, thereby enabling their use for producing personalized tumor forecasts in silico, which, ultimately, can be used to not only predict response, but also optimize treatment. Finally, we discuss the main barriers to making the above paradigm a clinical reality.
Submitted 24 February, 2021;
originally announced February 2021.
-
A comparison of matrix-free isogeometric Galerkin and collocation methods for Karhunen–Loève expansion
Authors:
Michal Lukasz Mika,
René Rinke Hiemstra,
Thomas Joseph Robert Hughes,
Dominik Schillinger
Abstract:
Numerical computation of the Karhunen–Loève expansion is computationally challenging in terms of both memory requirements and computing time. We compare two state-of-the-art methods that claim to efficiently solve for the K–L expansion: (1) the matrix-free isogeometric Galerkin method using interpolation based quadrature proposed by the authors in [1] and (2) our new matrix-free implementation of the isogeometric collocation method proposed in [2]. Two three-dimensional benchmark problems indicate that the Galerkin method performs significantly better for smooth covariance kernels, while the collocation method performs slightly better for rough covariance kernels.
Submitted 3 January, 2021;
originally announced January 2021.
-
The Quad Layout Immersion: A Mathematically Equivalent Representation of a Surface Quadrilateral Layout
Authors:
Kendrick M. Shepherd,
René R. Hiemstra,
Thomas J. R. Hughes
Abstract:
Quadrilateral layouts on surfaces are valuable in texture mapping, and essential in generation of quadrilateral meshes and in fitting splines. Previous work has characterized such layouts as a special metric on a surface or as a meromorphic quartic differential with finite trajectories. In this work, a surface quadrilateral layout is alternatively characterized as a special immersion of a cut representation of the surface into the Euclidean plane. We call this a quad layout immersion. This characterization, while posed in smooth topology, naturally generalizes to piecewise-linear representations. As such, it mathematically describes and generalizes integer grid maps, which are common in computer graphics settings. Finally, the utility of the representation is demonstrated by computationally extracting quadrilateral layouts on surfaces of interest.
Submitted 16 December, 2020;
originally announced December 2020.
-
A matrix-free isogeometric Galerkin method for Karhunen-Loève approximation of random fields using tensor product splines, tensor contraction and interpolation based quadrature
Authors:
Michal Lukasz Mika,
Thomas Joseph Robert Hughes,
Dominik Schillinger,
Peter Wriggers,
René Rinke Hiemstra
Abstract:
The Karhunen-Loève series expansion (KLE) decomposes a stochastic process into an infinite series of pairwise uncorrelated random variables and pairwise $L^2$-orthogonal functions. For any given truncation order of the infinite series the basis is optimal in the sense that the total mean squared error is minimized. The orthogonal basis functions are determined as the solution of an eigenvalue problem corresponding to the homogeneous Fredholm integral equation of the second kind, which is computationally challenging for several reasons. Firstly, a Galerkin discretization requires numerical integration over a $2d$ dimensional domain, where $d$, in this work, denotes the spatial dimension. Secondly, the main system matrix of the discretized weak-form is dense. Consequently, the computational complexity of classical finite element formation and assembly procedures as well as the memory requirements of direct solution techniques become quickly computationally intractable with increasing polynomial degree, number of elements and degrees of freedom. The objective of this work is to significantly reduce several of the computational bottlenecks associated with numerical solution of the KLE. We present a matrix-free solution strategy, which is embarrassingly parallel and scales favorably with problem size and polynomial degree. Our approach is based on (1) an interpolation based quadrature that minimizes the required number of quadrature points; (2) an inexpensive reformulation of the generalized eigenvalue problem into a standard eigenvalue problem; and (3) a matrix-free and parallel matrix-vector product for iterative eigenvalue solvers. Two higher-order three-dimensional benchmarks illustrate exceptional computational performance combined with high accuracy and robustness.
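For orientation, the objects named in this abstract can be written out in standard textbook form. The notation below (mean $\mu$, covariance kernel $C$, domain $D$, basis functions $N_j$, matrices $A$ and $Z$, truncation order $M$) is assumed for illustration and is not taken from the paper.

```latex
% Truncated Karhunen-Loève expansion of a random field u with mean \mu and
% covariance kernel C on a domain D \subset \mathbb{R}^d, where the \xi_i are
% pairwise uncorrelated random variables and the \phi_i are L^2-orthogonal:
\[
  u(x,\theta) \approx \mu(x) + \sum_{i=1}^{M} \sqrt{\lambda_i}\,\phi_i(x)\,\xi_i(\theta),
  \qquad
  \int_{D} C(x,x')\,\phi_i(x')\,\mathrm{d}x' = \lambda_i\,\phi_i(x).
\]
% A Galerkin discretization with basis functions N_1,\dots,N_n leads to the
% generalized eigenvalue problem A v = \lambda Z v, with
\[
  A_{jk} = \int_{D}\int_{D} N_j(x)\,C(x,x')\,N_k(x')\,\mathrm{d}x'\,\mathrm{d}x,
  \qquad
  Z_{jk} = \int_{D} N_j(x)\,N_k(x)\,\mathrm{d}x.
\]
```

The double integral in $A_{jk}$ runs over $D \times D$, which is the $2d$-dimensional integration the abstract refers to, and because $C(x,x')$ generally couples every pair of points, $A$ is dense.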
Submitted 21 February, 2021; v1 submitted 27 November, 2020;
originally announced November 2020.
-
Reality-assisted evolution of soft robots through large-scale physical experimentation: a review
Authors:
Toby Howison,
Simon Hauser,
Josie Hughes,
Fumiya Iida
Abstract:
In this review we introduce the framework of reality-assisted evolution to summarize a growing trend towards combining model-based and model-free approaches to improve the design of physically embodied soft robots. In silico, data-driven models build, adapt and improve representations of the target system using real-world experimental data. By simulating huge numbers of virtual robots using these data-driven models, optimization algorithms can illuminate multiple design candidates for transference to the real world. In reality, large-scale physical experimentation facilitates the fabrication, testing and analysis of multiple candidate designs. Automated assembly and reconfigurable modular systems enable significantly higher numbers of real-world design evaluations than previously possible. Large volumes of ground-truth data gathered via physical experimentation can be returned to the virtual environment to improve data-driven models and guide optimization. Grounding the design process in physical experimentation ensures the complexity of virtual robot designs does not outpace the model limitations or available fabrication technologies. We outline key developments in the design of physically embodied soft robots under the framework of reality-assisted evolution.
Submitted 16 October, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Nucleus I: Adjunction spectra in recommender systems and descent
Authors:
Dusko Pavlovic,
Dominic J. D. Hughes
Abstract:
Recommender systems build user profiles using concept analysis of usage matrices. The concepts are mined as spectra and form Galois connections. Descent is a general method for spectral decomposition in algebraic geometry and topology which also leads to generalized Galois connections. Both recommender systems and descent theory are vast research areas, separated by a technical gap so large that trying to establish a link would seem foolish. Yet a formal link emerged, all on its own, bottom-up, against authors' intentions and better judgment. Familiar problems of data analysis led to a novel solution in category theory. The present paper arose from a series of earlier efforts to provide a top-down account of these developments.
Submitted 21 October, 2023; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Hierarchical Quantized Autoencoders
Authors:
Will Williams,
Sam Ringer,
Tom Ash,
John Hughes,
David MacLeod,
Jamie Dougherty
Abstract:
Despite progress in training neural networks for lossy image compression, current approaches fail to maintain both perceptual quality and abstract features at very low bitrates. Encouraged by recent success in learning discrete representations with Vector Quantized Variational Autoencoders (VQ-VAEs), we motivate the use of a hierarchy of VQ-VAEs to attain high factors of compression. We show that the combination of stochastic quantization and hierarchical latent structure aids likelihood-based image compression. This leads us to introduce a novel objective for training hierarchical VQ-VAEs. Our resulting scheme produces a Markovian series of latent variables that reconstruct images of high perceptual quality which retain semantically meaningful features. We provide qualitative and quantitative evaluations on the CelebA and MNIST datasets.
Submitted 16 October, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
The divergence-conforming immersed boundary method: Application to vesicle and capsule dynamics
Authors:
Hugo Casquero,
Carles Bona-Casas,
Deepesh Toshniwal,
Thomas J. R. Hughes,
Hector Gomez,
Yongjie Jessica Zhang
Abstract:
We extend the recently introduced divergence-conforming immersed boundary (DCIB) method [1] to fluid-structure interaction (FSI) problems involving closed co-dimension one solids. We focus on capsules and vesicles, whose discretization is particularly challenging due to the higher-order derivatives that appear in their formulations. In two-dimensional settings, we employ cubic B-splines with periodic knot vectors to obtain discretizations of closed curves with C^2 inter-element continuity. In three-dimensional settings, we use analysis-suitable bi-cubic T-splines to obtain discretizations of closed surfaces with at least C^1 inter-element continuity. Large spurious changes of the fluid volume inside closed co-dimension one solids are a well-known issue for IB methods. The DCIB method results in volume changes orders of magnitude lower than conventional IB methods. This is a byproduct of discretizing the velocity-pressure pair with divergence-conforming B-splines, which lead to negligible incompressibility errors at the Eulerian level. The higher inter-element continuity of divergence-conforming B-splines is also crucial to avoid the quadrature/interpolation errors of IB methods becoming the dominant discretization error. Benchmark and application problems of vesicle and capsule dynamics are solved, including mesh-independence studies and comparisons with other numerical methods.
Submitted 22 January, 2020;
originally announced January 2020.
-
An adaptive space-time phase field formulation for dynamic fracture of brittle shells based on LR NURBS
Authors:
Karsten Paul,
Christopher Zimmermann,
Kranthi K. Mandadapu,
Thomas J. R. Hughes,
Chad M. Landis,
Roger A. Sauer
Abstract:
We present an adaptive space-time phase field formulation for dynamic fracture of brittle shells. Their deformation is characterized by the Kirchhoff-Love thin shell theory using a curvilinear surface description. All kinematical objects are defined on the shell's mid-plane. The evolution equation for the phase field is determined by the minimization of an energy functional based on Griffith's theory of brittle fracture. Membrane and bending contributions to the fracture process are modeled separately and a thickness integration is established for the latter. The coupled system consists of two nonlinear fourth-order PDEs and all quantities are defined on an evolving two-dimensional manifold. Since the weak form requires $C^1$-continuity, isogeometric shape functions are used. The mesh is adaptively refined based on the phase field using Locally Refinable (LR) NURBS. Time is discretized based on a generalized-$α$ method using adaptive time-stepping, and the discretized coupled system is solved with a monolithic Newton-Raphson scheme. The interaction between surface deformation and crack evolution is demonstrated by several numerical examples showing dynamic crack propagation and branching.
Submitted 18 June, 2020; v1 submitted 25 June, 2019;
originally announced June 2019.
-
Learning Semantic Vector Representations of Source Code via a Siamese Neural Network
Authors:
David Wehr,
Halley Fede,
Eleanor Pence,
Bo Zhang,
Guilherme Ferreira,
John Walczyk,
Joseph Hughes
Abstract:
The abundance of open-source code, coupled with the success of recent advances in deep learning for natural language processing, has given rise to a promising new application of machine learning to source code. In this work, we explore the use of a Siamese recurrent neural network model on Python source code to create vectors which capture the semantics of code. We evaluate the quality of embeddings by identifying which problem from a programming competition the code solves. Our model significantly outperforms a bag-of-tokens embedding, providing promising results for improving code embeddings that can be used in future software engineering tasks.
Submitted 26 April, 2019;
originally announced April 2019.