-
Bayesian grey-box identification of nonlinear convection effects in heat transfer dynamics
Authors:
Wouter M. Kouw,
Caspar Gruijthuijsen,
Lennart Blanken,
Enzo Evers,
Timothy Rogers
Abstract:
We propose a computational procedure for identifying convection in heat transfer dynamics. The procedure is based on a Gaussian process latent force model, consisting of a white-box component (i.e., known physics) for the conduction and linear convection effects and a Gaussian process that acts as a black-box component for the nonlinear convection effects. States are inferred through Bayesian smoothing and we obtain approximate posterior distributions for the kernel covariance function's hyperparameters using Laplace's method. The nonlinear convection function is recovered from the Gaussian process states using a Bayesian regression model. We validate the procedure by simulation error using the identified nonlinear convection function, on both data from a simulated system and measurements from a physical assembly.
Submitted 1 July, 2024;
originally announced July 2024.
-
Large Language Models estimate fine-grained human color-concept associations
Authors:
Kushin Mukherjee,
Timothy T. Rogers,
Karen B. Schloss
Abstract:
Concepts, both abstract and concrete, elicit a distribution of association strengths across perceptual color space, which influence aspects of visual cognition ranging from object recognition to interpretation of information visualizations. While prior work has hypothesized that color-concept associations may be learned from the cross-modal statistical structure of experience, it has been unclear whether natural environments possess such structure or, if so, whether learning systems are capable of discovering and exploiting it without strong prior constraints. We addressed these questions by investigating the ability of GPT-4, a multimodal large language model, to estimate human-like color-concept associations without any additional training. Starting with human color-concept association ratings for a set of 71 colors spanning perceptual color space (UW-71) and concepts that varied in abstractness, we assessed how well association ratings generated by GPT-4 could predict human ratings. GPT-4 ratings were correlated with human ratings, with performance comparable to state-of-the-art methods for automatically estimating color-concept associations from images. Variability in GPT-4's performance across concepts could be explained by the specificity of the concept's color-concept association distribution. This study suggests that high-order covariances between language and perception, as expressed in the natural environment of the internet, contain sufficient information to support learning of human-like color-concept associations, and provides an existence proof that a learning system can encode such associations without initial constraints. The work further shows that GPT-4 can be used to efficiently estimate distributions of color associations for a broad range of concepts, potentially serving as a critical tool for designing effective and intuitive information visualizations.
Submitted 4 May, 2024;
originally announced June 2024.
-
Beyond Demographics: Aligning Role-playing LLM-based Agents Using Human Belief Networks
Authors:
Yun-Shiuan Chuang,
Zach Studdiford,
Krirk Nirunwiroj,
Agam Goyal,
Vincent V. Frigo,
Sijia Yang,
Dhavan Shah,
Junjie Hu,
Timothy T. Rogers
Abstract:
Creating human-like large language model (LLM) agents is crucial for faithful social simulation. Having LLMs role-play based on demographic information sometimes improves human likeness but often does not. This study assessed whether LLM alignment with human behavior can be improved by integrating information from empirically-derived human belief networks. Using data from a human survey, we estimated a belief network encompassing 18 topics loading on two non-overlapping latent factors. We then seeded LLM-based agents with an opinion on one topic, and assessed the alignment of their expressed opinions on the remaining test topics with corresponding human data. Role-playing based on demographic information alone did not align LLM and human opinions, but seeding the agent with a single belief greatly improved alignment for topics related in the belief network, and not for topics outside the network. These results suggest a novel path for human-LLM belief alignment in work seeking to simulate and understand patterns of belief distributions in society.
Submitted 24 June, 2024;
originally announced June 2024.
-
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning
Authors:
Jifan Zhang,
Lalit Jain,
Yang Guo,
Jiayi Chen,
Kuan Lok Zhou,
Siddharth Suresh,
Andrew Wagenmaker,
Scott Sievert,
Timothy Rogers,
Kevin Jamieson,
Robert Mankoff,
Robert Nowak
Abstract:
We present a novel multimodal preference dataset for creative tasks, consisting of over 250 million human ratings on more than 2.2 million captions, collected through crowdsourcing rating data for The New Yorker's weekly cartoon caption contest over the past eight years. This unique dataset supports the development and evaluation of multimodal large language models and preference-based fine-tuning algorithms for humorous caption generation. We propose novel benchmarks for judging the quality of model-generated captions, utilizing both GPT4 and human judgments to establish ranking-based evaluation strategies. Our experimental results highlight the limitations of current fine-tuning methods, such as RLHF and DPO, when applied to creative tasks. Furthermore, we demonstrate that even state-of-the-art models like GPT4 and Claude currently underperform top human contestants in generating humorous captions. As we conclude this extensive data collection effort, we release the entire preference dataset to the research community, fostering further advancements in AI humor generation and evaluation.
Submitted 15 June, 2024;
originally announced June 2024.
-
Multiple-input, multiple-output modal testing of a Hawk T1A aircraft: A new full-scale dataset for structural health monitoring
Authors:
James Wilson,
Max D. Champneys,
Matt Tipuric,
Robin Mills,
David J. Wagg,
Timothy J. Rogers
Abstract:
The use of measured vibration data from structures has a long history of enabling the development of methods for inference and monitoring. In particular, applications based on system identification and structural health monitoring have risen to prominence over recent decades and promise significant benefits when implemented in practice. However, significant challenges remain in the development of these methods. The introduction of realistic, full-scale datasets will be an important contribution to overcoming these challenges. This paper presents a new benchmark dataset capturing the dynamic response of a decommissioned BAE Systems Hawk T1A. The dataset reflects the behaviour of a complex structure with a history of service that can still be tested in controlled laboratory conditions, using a variety of known loading and damage simulation conditions. As such, it provides a key stepping stone between simple laboratory test structures and in-service structures. In this paper, the Hawk structure is described in detail, alongside a comprehensive summary of the experimental work undertaken. Following this, key descriptive highlights of the dataset are presented, before a discussion of the research challenges that the data present. Using the dataset, non-linearity in the structure is demonstrated, as well as the sensitivity of the structure to damage of different types. The dataset is highly applicable to many academic enquiries and additional analysis techniques which will enable further advancement of vibration-based engineering techniques.
Submitted 7 June, 2024;
originally announced June 2024.
-
Baseline Results for Selected Nonlinear System Identification Benchmarks
Authors:
Max D. Champneys,
Gerben I. Beintema,
Roland Tóth,
Maarten Schoukens,
Timothy J. Rogers
Abstract:
Nonlinear system identification remains an important open challenge across research and academia. Large numbers of novel approaches are published each year, each presenting improvements or extensions to existing methods. It is natural, therefore, to consider how one might choose between these competing models. Benchmark datasets provide one clear way to approach this question. However, to make meaningful inferences based on benchmark performance, it is important to understand how well a new method performs compared with the results available from well-established methods. This paper presents a set of ten baseline techniques and their relative performances on five popular benchmarks. The aim of this contribution is to stimulate thought and discussion regarding objective comparison of identification methodologies.
Submitted 17 May, 2024;
originally announced May 2024.
-
Probabilistic Numeric SMC Sampling for Bayesian Nonlinear System Identification in Continuous Time
Authors:
Joe D. Longbottom,
Max D. Champneys,
Timothy J. Rogers
Abstract:
In engineering, accurately modeling nonlinear dynamic systems from data contaminated by noise is both essential and complex. Established Sequential Monte Carlo (SMC) methods, used for the Bayesian identification of these systems, facilitate the quantification of uncertainty in the parameter identification process. A significant challenge in this context is the numerical integration of continuous-time ordinary differential equations (ODEs), crucial for aligning theoretical models with discretely sampled data. This integration introduces additional numerical uncertainty, a factor that is often overlooked. To address this issue, the field of probabilistic numerics combines numerical methods, such as numerical integration, with probabilistic modeling to offer a more comprehensive analysis of total uncertainty. By retaining the accuracy of classical deterministic methods, these probabilistic approaches offer a deeper understanding of the uncertainty inherent in the inference process. This paper demonstrates the application of a probabilistic numerical method for solving ODEs in the joint parameter-state identification of nonlinear dynamic systems. The presented approach efficiently identifies latent states and system parameters from noisy measurements, while simultaneously incorporating probabilistic solutions to the ODE in the identification challenge. The methodology's primary advantage lies in its capability to produce posterior distributions over system parameters, thereby representing the inherent uncertainties in both the data and the identification process.
Submitted 23 April, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
The Delusional Hedge Algorithm as a Model of Human Learning from Diverse Opinions
Authors:
Yun-Shiuan Chuang,
Jerry Zhu,
Timothy T. Rogers
Abstract:
Whereas cognitive models of learning often assume direct experience with both the features of an event and with a true label or outcome, much of everyday learning arises from hearing the opinions of others, without direct access to either the experience or the ground truth outcome. We consider how people can learn which opinions to trust in such scenarios by extending the hedge algorithm: a classic solution for learning from diverse information sources. We first introduce a semi-supervised variant we call the delusional hedge capable of learning from both supervised and unsupervised experiences. In two experiments, we examine the alignment between human judgments and predictions from the standard hedge, the delusional hedge, and a heuristic baseline model. Results indicate that humans effectively incorporate both labeled and unlabeled information in a manner consistent with the delusional hedge algorithm -- suggesting that human learners not only gauge the accuracy of information sources but also their consistency with other reliable sources. The findings advance our understanding of human learning from diverse opinions, with implications for the development of algorithms that better capture how people learn to weigh conflicting information sources.
Submitted 21 February, 2024;
originally announced February 2024.
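For readers unfamiliar with the hedge algorithm mentioned in the abstract above, the following is a minimal sketch of the standard (fully supervised) Hedge update only. It is not the paper's delusional hedge variant, whose details are not given in this listing. The function name `hedge`, the learning rate `eta`, and the toy loss table are assumptions chosen for illustration.

```python
import math

def hedge(expert_losses, eta=0.5):
    """Standard Hedge: multiplicative-weights update over experts.

    expert_losses: list of rounds; each round is a list with one loss
    in [0, 1] per expert. eta is an assumed learning rate.
    Returns the final normalized trust weights.
    """
    n_experts = len(expert_losses[0])
    weights = [1.0] * n_experts
    for losses in expert_losses:
        # Each expert's weight shrinks exponentially with its loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(weights)
    return [w / total for w in weights]

# Toy data (hypothetical): expert 0 is always right, experts 1 and 2
# each err in two of three rounds.
rounds = [[0.0, 1.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
probs = hedge(rounds)
```

In this toy run the always-correct source ends with the largest weight, and the two sources with equal cumulative loss end with equal weights.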
-
Learning interactions to boost human creativity with bandits and GPT-4
Authors:
Ara Vartanian,
Xiaoxi Sun,
Yun-Shiuan Chuang,
Siddharth Suresh,
Xiao** Zhu,
Timothy T. Rogers
Abstract:
This paper considers how interactions with AI algorithms can boost human creative thought. We employ a psychological task that demonstrates limits on human creativity, namely semantic feature generation: given a concept name, respondents must list as many of its features as possible. Human participants typically produce only a fraction of the features they know before getting "stuck." In experiments with humans and with a language AI (GPT-4) we contrast behavior in the standard task versus a variant in which participants can ask for algorithmically-generated hints. Algorithm choice is administered by a multi-armed bandit whose reward indicates whether the hint helped generate more features. Humans and the AI show similar benefits from hints, and remarkably, bandits learning from AI responses prefer the same prompting strategy as those learning from human behavior. The results suggest that strategies for boosting human creativity via computer interactions can be learned by bandits run on groups of simulated participants.
Submitted 16 November, 2023;
originally announced November 2023.
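The abstract above describes a multi-armed bandit that chooses among hint-generating algorithms and receives a reward when a hint helps. The listing does not say which bandit algorithm the authors used, so the sketch below assumes a simple epsilon-greedy bandit with binary rewards. The function name, the success probabilities, and all parameter values are hypothetical.

```python
import random

def epsilon_greedy(reward_probs, rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over hint strategies (an assumed algorithm).

    reward_probs: hypothetical chance that each strategy's hint helps.
    Returns per-arm pull counts and estimated mean rewards.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random strategy
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        # Reward is 1 if the simulated hint "helped", else 0.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

counts, values = epsilon_greedy([0.1, 0.8])
best_arm = max(range(len(values)), key=lambda a: values[a])
```

With these assumed probabilities the bandit concentrates its pulls on the second strategy, which mirrors the idea of learning a preferred prompting strategy from observed outcomes.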
-
The Wisdom of Partisan Crowds: Comparing Collective Intelligence in Humans and LLM-based Agents
Authors:
Yun-Shiuan Chuang,
Siddharth Suresh,
Nikunj Harlalka,
Agam Goyal,
Robert Hawkins,
Sijia Yang,
Dhavan Shah,
Junjie Hu,
Timothy T. Rogers
Abstract:
Human groups are able to converge on more accurate beliefs through deliberation, even in the presence of polarization and partisan bias -- a phenomenon known as the "wisdom of partisan crowds." Generated agents powered by Large Language Models (LLMs) are increasingly used to simulate human collective behavior, yet few benchmarks exist for evaluating their dynamics against the behavior of human groups. In this paper, we examine the extent to which the wisdom of partisan crowds emerges in groups of LLM-based agents that are prompted to role-play as partisan personas (e.g., Democrat or Republican). We find that they not only display human-like partisan biases, but also converge to more accurate beliefs through deliberation as humans do. We then identify several factors that interfere with convergence, including the use of chain-of-thought prompting and a lack of detail in personas. Conversely, fine-tuning on human data appears to enhance convergence. These findings show the potential and limitations of LLM-based agents as a model of human collective intelligence.
Submitted 16 February, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Evolving Domain Adaptation of Pretrained Language Models for Text Classification
Authors:
Yun-Shiuan Chuang,
Yi Wu,
Dhruv Gupta,
Rheeya Uppaal,
Ananya Kumar,
Luhang Sun,
Makesh Narsimhan Sreedhar,
Sijia Yang,
Timothy T. Rogers,
Junjie Hu
Abstract:
Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection. This study benchmarks the effectiveness of evolving domain adaptation (EDA) strategies, notably self-training, domain-adversarial training, and domain-adaptive pretraining, with a focus on an incremental self-training method. Our analysis across various datasets reveals that this incremental method excels at adapting PLMs to EDS, outperforming traditional domain adaptation techniques. These findings highlight the importance of continually updating PLMs to ensure their effectiveness in real-world applications, paving the way for future research into PLM robustness against the natural temporal evolution of language.
Submitted 16 November, 2023;
originally announced November 2023.
-
Simulating Opinion Dynamics with Networks of LLM-based Agents
Authors:
Yun-Shiuan Chuang,
Agam Goyal,
Nikunj Harlalka,
Siddharth Suresh,
Robert Hawkins,
Sijia Yang,
Dhavan Shah,
Junjie Hu,
Timothy T. Rogers
Abstract:
Accurately simulating human opinion dynamics is crucial for understanding a variety of societal phenomena, including polarization and the spread of misinformation. However, the agent-based models (ABMs) commonly used for such simulations often over-simplify human behavior. We propose a new approach to simulating opinion dynamics based on populations of Large Language Models (LLMs). Our findings reveal a strong inherent bias in LLM agents towards producing accurate information, leading simulated agents to consensus in line with scientific reality. This bias limits their utility for understanding resistance to consensus views on issues like climate change. After inducing confirmation bias through prompt engineering, however, we observed opinion fragmentation in line with existing agent-based modeling and opinion dynamics research. These insights highlight the promise and limitations of LLM agents in this domain and suggest a path forward: refining LLMs with real-world discourse to better simulate the evolution of human beliefs.
Submitted 31 March, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Sharing Information Between Machine Tools to Improve Surface Finish Forecasting
Authors:
Daniel R. Clarkson,
Lawrence A. Bull,
Tina A. Dardeno,
Chandula T. Wickramarachchi,
Elizabeth J. Cross,
Timothy J. Rogers,
Keith Worden,
Nikolaos Dervilis,
Aidan J. Hughes
Abstract:
At present, most surface-quality prediction methods can only perform single-task prediction which results in under-utilised datasets, repetitive work and increased experimental costs. To counter this, the authors propose a Bayesian hierarchical model to predict surface-roughness measurements for a turning machining process. The hierarchical model is compared to multiple independent Bayesian linear regression models to showcase the benefits of partial pooling in a machining setting with respect to prediction accuracy and uncertainty quantification.
Submitted 9 October, 2023;
originally announced October 2023.
-
A spectrum of physics-informed Gaussian processes for regression in engineering
Authors:
Elizabeth J Cross,
Timothy J Rogers,
Daniel J Pitchforth,
Samuel J Gibson,
Matthew R Jones
Abstract:
Despite the growing availability of sensing and data in general, we remain unable to fully characterise many in-service engineering systems and structures from a purely data-driven approach. The vast data and resources available to capture human activity are unmatched in our engineered world, and, even in cases where data could be referred to as "big," they will rarely hold information across operational windows or life spans. This paper pursues the combination of machine learning technology and physics-based reasoning to enhance our ability to make predictive models with limited data. By explicitly linking the physics-based view of stochastic processes with a data-based regression approach, a spectrum of possible Gaussian process models is introduced that enables the incorporation of different levels of expert knowledge of a system. Examples illustrate how these approaches can significantly reduce reliance on data collection whilst also increasing the interpretability of the model, another important consideration in this context.
Submitted 19 September, 2023;
originally announced September 2023.
-
Computational Agent-based Models in Opinion Dynamics: A Survey on Social Simulations and Empirical Studies
Authors:
Yun-Shiuan Chuang,
Timothy T. Rogers
Abstract:
Understanding how individuals change their attitudes, beliefs, and opinions under the social influence of other people is vital because of its wide implications. A core methodology used to study attitude change under social influence is the agent-based model (ABM). The goal of this review paper is to compare and contrast existing ABMs, which I classify into two families: deductive ABMs and inductive ABMs. The former subsumes social simulation studies, and the latter involves human experiments. To facilitate the comparison between ABMs of different formulations, I propose a general unified formulation, in which all ABMs can be viewed as special cases. In addition, I show the connections between deductive ABMs and inductive ABMs, and point out their strengths and limitations. At the end of the paper, I identify underexplored areas and suggest future research directions.
Submitted 6 June, 2023;
originally announced June 2023.
-
A Robust Probabilistic Approach to Stochastic Subspace Identification
Authors:
Brandon J. O'Connell,
Timothy J. Rogers
Abstract:
Modal parameter estimation of operational structures is often a challenging task when confronted with unwanted distortions (outliers) in field measurements. Atypical observations present a problem to operational modal analysis (OMA) algorithms, such as stochastic subspace identification (SSI), severely biasing parameter estimates and resulting in misidentification of the system. Despite this predicament, no simple mechanism currently exists capable of dealing with such anomalies in SSI. Addressing this problem, this paper first introduces a novel probabilistic formulation of stochastic subspace identification (Prob-SSI), realised using probabilistic projections. Mathematically, the equivalence between this model and the classic algorithm is demonstrated. This fresh perspective, viewing SSI as a problem in probabilistic inference, lays the necessary mathematical foundation to enable a plethora of new, more sophisticated OMA approaches. To this end, a statistically robust SSI algorithm (robust Prob-SSI) is developed, capable of providing a principled and automatic way of handling outlying or anomalous data in the measured timeseries, such as may occur in field recordings, e.g. intermittent sensor dropout. Robust Prob-SSI is shown to outperform conventional SSI when confronted with 'corrupted' data, exhibiting improved identification performance and higher levels of confidence in the found poles when viewing consistency (stabilisation) diagrams. Similar benefits are also demonstrated on the Z24 Bridge benchmark dataset, highlighting enhanced performance on measured systems.
Submitted 26 May, 2023;
originally announced May 2023.
-
Proof of principle for a self-governing prediction and forecasting reward algorithm
Authors:
J. O. Gonzalez-Hernandez,
Jonathan Marino,
Ted Rogers,
Brandon Velasco
Abstract:
We use Monte Carlo techniques to simulate an organized prediction competition between a group of scientific experts acting under the influence of a "self-governing" prediction reward algorithm. Our aim is to illustrate the advantages of a specific type of reward distribution rule that is designed to address some of the limitations of traditional forecast scoring rules. The primary extension of this algorithm as compared with standard forecast scoring is that it incorporates measures of both group consensus and question relevance directly into the reward distribution algorithm. Our model of the prediction competition includes parameters that control both the level of bias from prior beliefs and the influence of the reward incentive. The Monte Carlo simulations demonstrate that, within the simplifying assumptions of the model, experts collectively approach belief in objectively true facts, so long as reward influence is high and the bias stays below a critical threshold. The purpose of this work is to motivate further research into prediction reward algorithms that combine standard forecasting measures with factors like bias and consensus.
Submitted 8 May, 2023;
originally announced May 2023.
-
PAO: A general particle swarm algorithm with exact dynamics and closed-form transition densities
Authors:
Max D. Champneys,
Timothy J. Rogers
Abstract:
A great deal of research has been conducted into meta-heuristic optimisation methods that are able to find global optima in settings where gradient-based optimisers have traditionally struggled. Of these, so-called particle swarm optimisation (PSO) approaches have proven to be highly effective in a number of application areas. Given the maturity of the PSO field, it is likely that novel variants of the PSO algorithm stand to offer only marginal gains in terms of performance -- there is, after all, no free lunch. Instead of only chasing performance on suites of benchmark optimisation functions, it is argued herein that research effort is better placed in the pursuit of algorithms that also have other useful properties. In this work, a highly general, interpretable variant of the PSO algorithm -- particle attractor algorithm (PAO) -- is proposed. Furthermore, the algorithm is designed such that the transition densities (describing the motions of the particles from one generation to the next) can be computed exactly in closed form for each step. Access to closed-form transition densities has important ramifications for the closely related field of Sequential Monte Carlo (SMC). In order to demonstrate that the useful properties do not come at the cost of performance, PAO is compared to several other state-of-the-art heuristic optimisation algorithms in a benchmark comparison study.
Submitted 28 April, 2023;
originally announced April 2023.
-
Semantic Feature Verification in FLAN-T5
Authors:
Siddharth Suresh,
Kushin Mukherjee,
Timothy T. Rogers
Abstract:
This study evaluates the potential of a large language model for aiding in generation of semantic feature norms - a critical tool for evaluating conceptual structure in cognitive science. Building from an existing human-generated dataset, we show that machine-verified norms capture aspects of conceptual structure beyond what is expressed in human norms alone, and better explain human judgments of semantic similarity amongst items that are distally related. The results suggest that LLMs can greatly enhance traditional methods of semantic feature norm verification, with implications for our understanding of conceptual representation in humans and machines.
Submitted 11 April, 2023;
originally announced April 2023.
-
Human-machine cooperation for semantic feature listing
Authors:
Kushin Mukherjee,
Siddharth Suresh,
Timothy T. Rogers
Abstract:
Semantic feature norms, lists of features that concepts do and do not possess, have played a central role in characterizing human conceptual knowledge, but require extensive human labor. Large language models (LLMs) offer a novel avenue for the automatic generation of such feature lists, but are prone to significant error. Here, we present a new method for combining a learned model of human lexical-semantics from limited data with LLM-generated data to efficiently generate high-quality feature norms.
Submitted 11 April, 2023;
originally announced April 2023.
-
Conceptual structure coheres in human cognition but not in large language models
Authors:
Siddharth Suresh,
Kushin Mukherjee,
Xizheng Yu,
Wei-Chun Huang,
Lisa Padua,
Timothy T Rogers
Abstract:
Neural network models of language have long been used as a tool for developing hypotheses about conceptual representation in the mind and brain. For many years, such use involved extracting vector-space representations of words and using distances among these to predict or understand human behavior in various semantic tasks. Contemporary large language models (LLMs), however, make it possible to interrogate the latent structure of conceptual representations using experimental methods nearly identical to those commonly used with human participants. The current work utilizes three common techniques borrowed from cognitive psychology to estimate and compare the structure of concepts in humans and a suite of LLMs. In humans, we show that conceptual structure is robust to differences in culture, language, and method of estimation. Structures estimated from LLM behavior, while individually fairly consistent with those estimated from human behavior, vary much more depending upon the particular task used to generate responses--across tasks, estimates of conceptual structure from the very same model cohere less with one another than do human structure estimates. These results highlight an important difference between contemporary LLMs and human cognition, with implications for understanding some fundamental limitations of contemporary machine language.
Submitted 10 November, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Physically Meaningful Uncertainty Quantification in Probabilistic Wind Turbine Power Curve Models as a Damage Sensitive Feature
Authors:
J. H. Mclean,
M. R. Jones,
B. J. O'Connell,
A. E Maguire,
T. J. Rogers
Abstract:
A wind turbine's power curve is easily accessible damage sensitive data, and as such is a key part of structural health monitoring in wind turbines. Power curve models can be constructed in a number of ways, but the authors argue that probabilistic methods carry inherent benefits in this use case, such as uncertainty quantification and allowing uncertainty propagation analysis. Many probabilistic power curve models have a key limitation in that they are not physically meaningful - they return mean and uncertainty predictions outside of what is physically possible (the maximum and minimum power outputs of the wind turbine). This paper investigates the use of two bounded Gaussian Processes in order to produce physically meaningful probabilistic power curve models. The first model investigated was a warped heteroscedastic Gaussian process, and was found to be ineffective due to specific shortcomings of the Gaussian Process in relation to the warping function. The second model - an approximated Gaussian Process with a Beta likelihood - was highly successful and demonstrated that a working bounded probabilistic model results in better predictive uncertainty than a corresponding unbounded one without meaningful loss in predictive accuracy. Such a bounded model thus offers increased accuracy for performance monitoring and increased operator confidence in the model due to guaranteed physical plausibility.
Submitted 30 September, 2022;
originally announced September 2022.
-
Physics-informed machine learning for Structural Health Monitoring
Authors:
Elizabeth J Cross,
Samuel J Gibson,
Matthew R Jones,
Daniel J Pitchforth,
Sikai Zhang,
Timothy J Rogers
Abstract:
The use of machine learning in Structural Health Monitoring is becoming more common, as many of the inherent tasks (such as regression and classification) in developing condition-based assessment fall naturally into its remit. This chapter introduces the concept of physics-informed machine learning, where one adapts ML algorithms to account for the physical insight an engineer will often have of the structure they are attempting to model or assess. The chapter will demonstrate how grey-box models, which combine simple physics-based models with data-driven ones, can improve predictive capability in an SHM setting. A particular strength of the approach demonstrated here is the capacity of the models to generalise, with enhanced predictive capability in different regimes. This is a key issue when life-time assessment is a requirement, or when monitoring data do not span the operational conditions a structure will undergo.
The chapter will provide an overview of physics-informed ML, introducing a number of new approaches for grey-box modelling in a Bayesian setting. The main ML tool discussed will be Gaussian process regression; we will demonstrate how physical assumptions/models can be incorporated through constraints, through the mean function and kernel design, and finally in a state-space setting. A range of SHM applications will be demonstrated, from loads monitoring tasks for off-shore and aerospace structures, through to performance monitoring for long-span bridges.
Submitted 30 June, 2022;
originally announced June 2022.
-
GaLeNet: Multimodal Learning for Disaster Prediction, Management and Relief
Authors:
Rohit Saha,
Mengyi Fang,
Angeline Yasodhara,
Kyryl Truskovskyi,
Azin Asgarian,
Daniel Homola,
Raahil Shah,
Frederik Dieleman,
Jack Weatheritt,
Thomas Rogers
Abstract:
After a natural disaster, such as a hurricane, millions are left in need of emergency assistance. To allocate resources optimally, human planners need to accurately analyze data that can flow in large volumes from several sources. This motivates the development of multimodal machine learning frameworks that can integrate multiple data sources and leverage them efficiently. To date, the research community has mainly focused on unimodal reasoning to provide granular assessments of the damage. Moreover, previous studies mostly rely on post-disaster images, which may take several days to become available. In this work, we propose a multimodal framework (GaLeNet) for assessing the severity of damage by complementing pre-disaster images with weather data and the trajectory of the hurricane. Through extensive experiments on data from two hurricanes, we demonstrate (i) the merits of multimodal approaches compared to unimodal methods, and (ii) the effectiveness of GaLeNet at fusing various modalities. Furthermore, we show that GaLeNet can leverage pre-disaster images in the absence of post-disaster images, preventing substantial delays in decision making.
Submitted 18 June, 2022;
originally announced June 2022.
-
Constraining Gaussian processes for physics-informed acoustic emission mapping
Authors:
Matthew R Jones,
Timothy J Rogers,
Elizabeth J Cross
Abstract:
The automated localisation of damage in structures is a challenging but critical ingredient in the path towards predictive or condition-based maintenance of high value structures. The use of acoustic emission time of arrival mapping is a promising approach to this challenge, but is severely hindered by the need to collect a dense set of artificial acoustic emission measurements across the structure, resulting in a lengthy and often impractical data acquisition process. In this paper, we consider the use of physics-informed Gaussian processes for learning these maps to alleviate this problem. In the approach, the Gaussian process is constrained to the physical domain such that information relating to the geometry and boundary conditions of the structure is embedded directly into the learning process, returning a model that guarantees that any predictions made satisfy physically-consistent behaviour at the boundary. A number of scenarios that arise when training measurement acquisition is limited are considered, including where training data are sparse and where they have limited coverage over the structure of interest. Using a complex plate-like structure as an experimental case study, we show that our approach significantly reduces the burden of data collection, where it is seen that incorporation of boundary condition knowledge significantly improves predictive accuracy as training observations are reduced, particularly when training measurements are not available across all parts of the structure.
Submitted 3 June, 2022;
originally announced June 2022.
-
Bayesian Modelling of Multivalued Power Curves from an Operational Wind Farm
Authors:
L. A. Bull,
P. A. Gardner,
T. J. Rogers,
N. Dervilis,
E. J. Cross,
E. Papatheou,
A. E. Maguire,
C. Campos,
K. Worden
Abstract:
Power curves capture the relationship between wind speed and output power for a specific wind turbine. Accurate regression models of this function prove useful in monitoring, maintenance, design, and planning. In practice, however, the measurements do not always correspond to the ideal curve: power curtailments will appear as (additional) functional components. Such multivalued relationships cannot be modelled by conventional regression, and the associated data are usually removed during pre-processing. The current work suggests an alternative method to infer multivalued relationships in curtailed power data. Using a population-based approach, an overlapping mixture of probabilistic regression models is applied to signals recorded from turbines within an operational wind farm. The model is shown to provide an accurate representation of practical power data across the population.
Submitted 30 November, 2021;
originally announced November 2021.
-
A Latent Restoring Force Approach to Nonlinear System Identification
Authors:
Timothy J. Rogers,
Tobias Friis
Abstract:
Identification of nonlinear dynamic systems remains a significant challenge across engineering. This work suggests an approach based on Bayesian filtering to extract and identify the contribution of an unknown nonlinear term in the system which can be seen as an alternative viewpoint on restoring force surface type approaches. To achieve this identification, the contribution which is the nonlinear restoring force is modelled, initially, as a Gaussian process in time. That Gaussian process is converted into a state-space model and combined with the linear dynamic component of the system. Then, by inference of the filtering and smoothing distributions, the internal states of the system and the nonlinear restoring force can be extracted. In possession of these states a nonlinear model can be constructed. The approach is demonstrated to be effective in both a simulated case study and on an experimental benchmark dataset.
Submitted 30 June, 2022; v1 submitted 22 September, 2021;
originally announced September 2021.
-
Grey-box models for wave loading prediction
Authors:
Daniel J Pitchforth,
Timothy J Rogers,
Ulf T Tygesen,
Elizabeth J Cross
Abstract:
The quantification of wave loading on offshore structures and components is a crucial element in the assessment of their useful remaining life. In many applications the well-known Morison's equation is employed to estimate the forcing from waves with assumed particle velocities and accelerations. This paper develops a grey-box modelling approach to improve the predictions of the force on structural members. A grey-box model intends to exploit the enhanced predictive capabilities of data-based modelling whilst retaining physical insight into the behaviour of the system; in the context of the work carried out here, this can be considered as physics-informed machine learning. There are a number of possible approaches to establish a grey-box model. This paper demonstrates two means of combining physics (white box) and data-based (black box) components; one where the model is a simple summation of the two components, the second where the white-box prediction is fed into the black box as an additional input. Here Morison's equation is used as the physics-based component in combination with a data-based Gaussian process NARX - a dynamic variant of the more well-known Gaussian process regression. Two key challenges with employing the GP-NARX formulation that are addressed here are the selection of appropriate lag terms and the proper treatment of uncertainty propagation within the dynamic GP. The best performing grey-box model, the residual modelling GP-NARX, was able to achieve a 29.13% and 5.48% relative reduction in NMSE over Morison's Equation and a black-box GP-NARX respectively, alongside significant benefits in extrapolative capabilities of the model, in circumstances of low dataset coverage.
Submitted 30 June, 2021; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Probabilistic Inference for Structural Health Monitoring: New Modes of Learning from Data
Authors:
Lawrence A. Bull,
Paul Gardner,
Timothy J. Rogers,
Elizabeth J. Cross,
Nikolaos Dervilis,
Keith Worden
Abstract:
In data-driven SHM, the signals recorded from systems in operation can be noisy and incomplete. Data corresponding to each of the operational, environmental, and damage states are rarely available a priori; furthermore, labelling to describe the measurements is often unavailable. In consequence, the algorithms used to implement SHM should be robust and adaptive, while accommodating for missing information in the training-data -- such that new information can be included if it becomes available. By reviewing novel techniques for statistical learning (introduced in previous work), it is argued that probabilistic algorithms offer a natural solution to the modelling of SHM data in practice. In three case-studies, probabilistic methods are adapted for applications to SHM signals -- including semi-supervised learning, active learning, and multi-task learning.
Submitted 2 March, 2021;
originally announced March 2021.
-
Structured Machine Learning Tools for Modelling Characteristics of Guided Waves
Authors:
Marcus Haywood-Alexander,
Nikolaos Dervilis,
Keith Worden,
Elizabeth J. Cross,
Robin S. Mills,
Timothy J. Rogers
Abstract:
The use of ultrasonic guided waves to probe materials/structures for damage continues to increase in popularity for non-destructive evaluation (NDE) and structural health monitoring (SHM). The use of high-frequency waves such as these offers an advantage over low-frequency methods from their ability to detect damage on a smaller scale. However, in order to assess damage in a structure, and implement any NDE or SHM tool, knowledge of the behaviour of a guided wave throughout the material/structure is important (especially when designing sensor placement for SHM systems). Determining this behaviour is extremely difficult in complex materials, such as fibre-matrix composites, where unique phenomena such as continuous mode conversion take place. This paper introduces a novel method for modelling the feature-space of guided waves in a composite material. This technique is based on a data-driven model, where prior physical knowledge can be used to create structured machine learning tools; where constraints are applied to provide said structure. The method shown makes use of Gaussian processes, a full Bayesian analysis tool, and in this paper it is shown how physical knowledge of the guided waves can be utilised in modelling using an ML tool. This paper shows that through careful consideration when applying machine learning techniques, more robust models can be generated which offer advantages such as extrapolation ability and physical interpretation.
Submitted 5 January, 2021;
originally announced January 2021.
-
A Bayesian methodology for localising acoustic emission sources in complex structures
Authors:
Matthew R. Jones,
Tim J. Rogers,
Keith Worden,
Elizabeth J. Cross
Abstract:
In the field of structural health monitoring (SHM), the acquisition of acoustic emissions to localise damage sources has emerged as a popular approach. Despite recent advances, the task of locating damage within composite materials and structures that contain non-trivial geometrical features still poses a significant challenge. Within this paper, a Bayesian source localisation strategy that is robust to these complexities is presented. Under this new framework, a Gaussian process is first used to learn the relationship between source locations and the corresponding difference-in-time-of-arrival values for a number of sensor pairings. As an acoustic emission event with an unknown origin is observed, a mapping is then generated that quantifies the likelihood of the emission location across the surface of the structure. The new probabilistic mapping offers multiple benefits, leading to a localisation strategy that is more informative than deterministic predictions or single-point estimates with an associated confidence bound. The performance of the approach is investigated on a structure with numerous complex geometrical features and demonstrates a favourable performance in comparison to other similar localisation methods.
Submitted 20 December, 2020;
originally announced December 2020.
-
Loss convergence in a causal Bayesian neural network of retail firm performance
Authors:
F. Trevor Rogers
Abstract:
We extend the empirical results from the structural equation model (SEM) published in the paper Assortment Planning for Retail Buying, Retail Store Operations, and Firm Performance [1] by implementing the directed acyclic graph as a causal Bayesian neural network. Neural network convergence is shown to improve with the removal of the node with the weakest SEM path when variational inference is provided by perturbing weights with Flipout layers, while results from perturbing weights at the output with the Vadam optimizer are inconclusive.
Submitted 29 August, 2020;
originally announced August 2020.
-
Evaluation of an AI System for the Detection of Diabetic Retinopathy from Images Captured with a Handheld Portable Fundus Camera: the MAILOR AI study
Authors:
T W Rogers,
J Gonzalez-Bueno,
R Garcia Franco,
E Lopez Star,
D Méndez Marín,
J Vassallo,
V C Lansingh,
S Trikha,
N Jaccard
Abstract:
Objectives: To evaluate the performance of an Artificial Intelligence (AI) system (Pegasus, Visulytix Ltd., UK), at the detection of Diabetic Retinopathy (DR) from images captured by a handheld portable fundus camera.
Methods: A cohort of 6,404 patients (~80% with diabetes mellitus) was screened for retinal diseases using a handheld portable fundus camera (Pictor Plus, Volk Optical Inc., USA) at the Mexican Advanced Imaging Laboratory for Ocular Research. The images were graded for DR by specialists according to the Scottish DR grading scheme. The performance of the AI system was evaluated, retrospectively, in assessing Referable DR (RDR) and Proliferative DR (PDR) and compared to the performance on a publicly available desktop camera benchmark dataset.
Results: For RDR detection, Pegasus performed with an 89.4% (95% CI: 88.0-90.7) Area Under the Receiver Operating Characteristic (AUROC) curve for the MAILOR cohort, compared to an AUROC of 98.5% (95% CI: 97.8-99.2) on the benchmark dataset. This difference was statistically significant. Moreover, no statistically significant difference was found in performance for PDR detection with Pegasus achieving an AUROC of 94.3% (95% CI: 91.0-96.9) on the MAILOR cohort and 92.2% (95% CI: 89.4-94.8) on the benchmark dataset.
Conclusions: Pegasus showed good transferability for the detection of PDR from a curated desktop fundus camera dataset to real-world clinical practice with a handheld portable fundus camera. However, there was a substantial, and statistically significant, decrease in the diagnostic performance for RDR when using the handheld device.
Submitted 18 August, 2019;
originally announced August 2019.
-
Evaluation of an AI system for the automated detection of glaucoma from stereoscopic optic disc photographs: the European Optic Disc Assessment Study
Authors:
Thomas W. Rogers,
Nicolas Jaccard,
Francis Carbonaro,
Hans G. Lemij,
Koenraad A. Vermeer,
Nicolaas J. Reus,
Sameer Trikha
Abstract:
Objectives: To evaluate the performance of a deep learning based Artificial Intelligence (AI) software for detection of glaucoma from stereoscopic optic disc photographs, and to compare this performance to the performance of a large cohort of ophthalmologists and optometrists.
Methods: A retrospective study evaluating the diagnostic performance of an AI software (Pegasus v1.0, Visulytix Ltd., London, UK) and comparing it to that of 243 European ophthalmologists and 208 British optometrists, as determined in previous studies, for the detection of glaucomatous optic neuropathy from 94 stereoscopic photographic slides scanned into digital format.
Results: Pegasus was able to detect glaucomatous optic neuropathy with an accuracy of 83.4% (95% CI: 77.5-89.2). This is comparable to an average ophthalmologist accuracy of 80.5% (95% CI: 67.2-93.8) and average optometrist accuracy of 80% (95% CI: 67-88) on the same images. In addition, the AI system had an intra-observer agreement (Cohen's Kappa, $κ$) of 0.74 (95% CI: 0.63-0.85), compared to 0.70 (range: -0.13-1.00; 95% CI: 0.67-0.73) and 0.71 (range: 0.08-1.00) for ophthalmologists and optometrists, respectively. There was no statistically significant difference between the performance of the deep learning system and ophthalmologists or optometrists.
Conclusion: The AI system obtained a diagnostic performance and repeatability comparable to that of the ophthalmologists and optometrists. We conclude that deep learning based AI systems, such as Pegasus, demonstrate significant promise in the assisted detection of glaucomatous optic neuropathy.
Submitted 4 June, 2019;
originally announced June 2019.
-
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator
Authors:
Jonathan Lew,
Deval Shah,
Suchita Pati,
Shaylin Cattell,
Mengchi Zhang,
Amruth Sandhupatla,
Christopher Ng,
Negar Goli,
Matthew D. Sinclair,
Timothy G. Rogers,
Tor Aamodt
Abstract:
Most deep neural networks deployed today are trained using GPUs via high-level frameworks such as TensorFlow and PyTorch. This paper describes changes we made to the GPGPU-Sim simulator to enable it to run PyTorch by running PTX kernels included in NVIDIA's cuDNN library. We use the resulting modified simulator, which has been made available publicly with this paper, to study some simple deep learning workloads. With our changes to GPGPU-Sim's functional simulation model, we find that GPGPU-Sim's performance model running a cuDNN-enabled implementation of LeNet for MNIST reports results within 30% of real hardware. Using GPGPU-Sim's AerialVision performance analysis tool we observe that cuDNN API calls contain many varying phases and appear to include potentially inefficient microarchitecture behaviour such as DRAM partition bank camping, at least when executed on GPGPU-Sim's current performance model.
Submitted 26 January, 2019; v1 submitted 18 November, 2018;
originally announced November 2018.
-
Exploring Modern GPU Memory System Design Challenges through Accurate Modeling
Authors:
Mahmoud Khairy,
Jain Akshay,
Tor Aamodt,
Timothy G. Rogers
Abstract:
This paper explores the impact of simulator accuracy on architecture design decisions in the general-purpose graphics processing unit (GPGPU) space. We perform a detailed, quantitative analysis of the most popular publicly available GPU simulator, GPGPU-Sim, against our enhanced version of the simulator, updated to model the memory system of modern GPUs in more detail. Our enhanced GPU model is able to describe the NVIDIA Volta architecture in sufficient detail to reduce error in memory system event counters by as much as 66X. The reduced error in the memory system further reduces execution time error versus real hardware by 2.5X. To demonstrate the accuracy of our enhanced model against a real machine, we perform a counter-by-counter validation against an NVIDIA TITAN V Volta GPU, demonstrating the relative accuracy of the new simulator versus the publicly available model.
We go on to demonstrate that the simpler model discounts the importance of advanced memory system designs such as out-of-order memory access scheduling, while overstating the impact of more heavily researched areas like L1 cache bypassing. Our results demonstrate that it is important for the academic community to enhance the level of detail in architecture simulators as system complexity continues to grow. As part of this detailed correlation and modeling effort, we developed a new Correlator toolset that includes a consolidation of applications from a variety of popular GPGPU benchmark suites, designed to run in reasonable simulation times. The Correlator also includes a database of hardware profiling results for all these applications on NVIDIA cards ranging from Fermi to Volta and a toolchain that enables users to gather correlation statistics and create detailed counter-by-counter hardware correlation plots with minimal effort.
Submitted 16 October, 2018;
originally announced October 2018.
-
Know When to Fold 'Em: Self-Assembly of Shapes by Folding in Oritatami
Authors:
Erik D. Demaine,
Jacob Hendricks,
Meagan Olsen,
Matthew J. Patitz,
Trent A. Rogers,
Nicolas Schabanel,
Shinnosuke Seki,
Hadley Thomas
Abstract:
An oritatami system (OS) is a theoretical model of self-assembly via co-transcriptional folding. It consists of a growing chain of beads which can form bonds with each other as they are transcribed. During the transcription process, the $δ$ most recently produced beads dynamically fold so as to maximize the number of bonds formed, self-assembling into a shape incrementally. The parameter $δ$ is called the delay and is related to the transcription rate in nature.
This article initiates the study of shape self-assembly using oritatami. A shape is a connected set of points in the triangular lattice. We first show that oritatami systems differ fundamentally from tile-assembly systems by exhibiting a family of infinite shapes that can be tile-assembled but cannot be folded by any OS. As it is NP-hard in general to determine whether there is an OS that folds into (self-assembles) a given finite shape, we explore the folding of upscaled versions of finite shapes. We show that any shape can be folded from a constant size seed, at any scale n >= 3, by an OS with delay 1. We also show that any shape can be folded at the smaller scale 2 by an OS with unbounded delay. This leads us to investigate the influence of delay and to prove that, for all δ > 2, there are shapes that can be folded (at scale 1) with delay δ but not with delay δ'<δ. These results serve as a foundation for the study of shape-building in this new model of self-assembly, and have the potential to provide better understanding of cotranscriptional folding in biology, as well as improved abilities of experimentalists to design artificial systems that self-assemble via this complex dynamical process.
Submitted 13 July, 2018; v1 submitted 12 July, 2018;
originally announced July 2018.
-
Thermodynamic Binding Networks
Authors:
David Doty,
Trent A. Rogers,
David Soloveichik,
Chris Thachuk,
Damien Woods
Abstract:
Strand displacement and tile assembly systems are designed to follow prescribed kinetic rules (i.e., exhibit a specific time-evolution). However, the expected behavior in the limit of infinite time--known as thermodynamic equilibrium--is often incompatible with the desired computation. Basic physical chemistry implicates this inconsistency as a source of unavoidable error. Can the thermodynamic equilibrium be made consistent with the desired computational pathway? In order to formally study this question, we introduce a new model of molecular computing in which computation is driven by the thermodynamic driving forces of enthalpy and entropy. To ensure greatest generality we do not assume that there are any constraints imposed by geometry and treat monomers as unstructured collections of binding sites. In this model we design Boolean AND/OR formulas, as well as a self-assembling binary counter, where the thermodynamically favored states are exactly the desired final output configurations. Though inspired by DNA nanotechnology, the model is sufficiently general to apply to a wide variety of chemical systems.
Submitted 22 September, 2017;
originally announced September 2017.
-
Automated detection of smuggled high-risk security threats using Deep Learning
Authors:
Nicolas Jaccard,
Thomas W. Rogers,
Edward J. Morton,
Lewis D. Griffin
Abstract:
The security infrastructure is ill-equipped to detect and deter the smuggling of non-explosive devices that enable terror attacks such as those recently perpetrated in western Europe. The detection of so-called "small metallic threats" (SMTs) in cargo containers currently relies on statistical risk analysis, intelligence reports, and visual inspection of X-ray images by security officers. The latter is very slow and unreliable due to the difficulty of the task: objects potentially spanning less than 50 pixels have to be detected in images containing more than 2 million pixels against very complex and cluttered backgrounds. In this contribution, we demonstrate for the first time the use of Convolutional Neural Networks (CNNs), a type of Deep Learning, to automate the detection of SMTs in full-size X-ray images of cargo containers. Novel approaches for dataset augmentation allowed us to train CNNs from scratch despite the scarcity of data available. We report fewer than 6% false alarms when detecting 90% of SMTs synthetically concealed in stream-of-commerce images, which corresponds to an improvement of over an order of magnitude over conventional approaches such as Bag-of-Words (BoWs). The proposed scheme offers potentially super-human performance for a fraction of the time it would take for a security officer to carry out visual inspection (processing time is approximately 3.5s per container image).
Submitted 9 September, 2016;
originally announced September 2016.
-
Universal Simulation of Directed Systems in the abstract Tile Assembly Model Requires Undirectedness
Authors:
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers
Abstract:
As a mathematical model of self-assembling systems, Winfree's abstract Tile Assembly Model (aTAM) is a remarkable platform for studying the behaviors and powers of self-assembling systems. Capable of Turing universal computation, the aTAM allows algorithmic self-assembly, in which the components can be designed so that the rules governing their behaviors force them to inherently execute prescribed algorithms as they combine. Adding to its completeness, the aTAM was shown to also be intrinsically universal, which means that there exists a single tile set such that for any arbitrary input aTAM system, that tile set can be configured into a seed structure which will then cause self-assembly using that tile set to simulate the input system, capturing its full dynamics modulo only a scale factor. However, the universal simulator previously given makes use of nondeterminism in terms of tile types placed in several key locations when different assembly sequences are followed, even when simulating a directed system, meaning one that has exactly one unique terminal assembly. The question then became whether or not that nondeterminism is fundamentally required. Here, we answer that in the affirmative: the class of directed systems in the aTAM is not intrinsically universal, meaning there is no universal simulator for directed systems which itself is always directed. This provides insight into the role of nondeterminism in self-assembly, which is itself a fundamentally nondeterministic process. To achieve this result we leverage powerful results of computational complexity hierarchies, including tight bounds on both best and worst-case complexities of decidable languages, to design systems with precisely controllable space resources available to embedded computations. We also develop novel techniques for designing systems containing subsystems with disjoint, mutually exclusive computational powers.
Submitted 9 August, 2016;
originally announced August 2016.
-
Automated X-ray Image Analysis for Cargo Security: Critical Review and Future Promise
Authors:
Thomas W. Rogers,
Nicolas Jaccard,
Edward J. Morton,
Lewis D. Griffin
Abstract:
We review the relatively immature field of automated image analysis for X-ray cargo imagery. There is increasing demand for automated analysis methods that can assist in the inspection and selection of containers, due to the ever-growing volumes of traded cargo and the increasing concerns that customs- and security-related threats are being smuggled across borders by organised crime and terrorist networks. We split the field into the classical pipeline of image preprocessing and image understanding. Preprocessing includes: image manipulation; quality improvement; Threat Image Projection (TIP); and material discrimination and segmentation. Image understanding includes: Automated Threat Detection (ATD); and Automated Contents Verification (ACV). We identify several gaps in the literature that need to be addressed and propose ideas for future research. Where the current literature is sparse we borrow from the single-view, multi-view, and CT X-ray baggage domains, which have some characteristics in common with X-ray cargo.
Submitted 2 August, 2016;
originally announced August 2016.
-
Detection of concealed cars in complex cargo X-ray imagery using Deep Learning
Authors:
Nicolas Jaccard,
Thomas W. Rogers,
Edward J. Morton,
Lewis D. Griffin
Abstract:
Non-intrusive inspection systems based on X-ray radiography techniques are routinely used at transport hubs to ensure the conformity of cargo content with the supplied shipping manifest. As trade volumes increase and regulations become more stringent, manual inspection by trained operators is less and less viable due to low throughput. Machine vision techniques can assist operators in their task by automating parts of the inspection workflow. Since cars are routinely involved in trafficking, export fraud, and tax evasion schemes, they represent an attractive target for automated detection and flagging for subsequent inspection by operators. In this contribution, we describe a method for the detection of cars in X-ray cargo images based on trained-from-scratch Convolutional Neural Networks. By introducing an oversampling scheme that suitably addresses the low number of car images available for training, we achieved 100% car image classification rate for a false positive rate of 1-in-454. Cars that were partially or completely obscured by other goods, a modus operandi frequently adopted by criminals, were correctly detected. We believe that this level of performance suggests that the method is suitable for deployment in the field. It is expected that the generic object detection workflow described can be extended to other object classes given the availability of suitable training data.
Submitted 9 September, 2016; v1 submitted 26 June, 2016;
originally announced June 2016.
-
Hierarchical Self-Assembly of Fractals with Signal-Passing Tiles
Authors:
Jacob Hendricks,
Meagan Olsen,
Matthew J. Patitz,
Trent A. Rogers,
Hadley Thomas
Abstract:
In this paper, we present high-level overviews of tile-based self-assembling systems capable of producing complex, infinite, aperiodic structures known as discrete self-similar fractals. Fractals have a variety of interesting mathematical and structural properties, and by utilizing the bottom-up growth paradigm of self-assembly to create them we not only learn important techniques for building such complex structures, we also gain insight into how similar structural complexity arises in natural self-assembling systems. Our results fundamentally leverage hierarchical assembly processes, and use as our building blocks square "tile" components which are capable of activating and deactivating their binding "glues" a constant number of times each, based only on local interactions. We provide the first constructions capable of building arbitrary discrete self-similar fractals at scale factor 1, and many at temperature 1 (i.e. "non-cooperatively"), including the Sierpinski triangle.
Submitted 22 December, 2016; v1 submitted 6 June, 2016;
originally announced June 2016.
-
The Simulation Powers and Limitations of Higher Temperature Hierarchical Self-Assembly Systems
Authors:
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers
Abstract:
In this paper, we extend existing results about simulation and intrinsic universality in a model of tile-based self-assembly. Namely, we work within the 2-Handed Assembly Model (2HAM), which is a model of self-assembly in which assemblies are formed by square tiles that are allowed to combine, using glues along their edges, individually or as pairs of arbitrarily large assemblies in a hierarchical manner, and we explore the abilities of these systems to simulate each other when the simulating systems have a higher "temperature" parameter, which is a system wide threshold dictating how many glue bonds must be formed between two assemblies to allow them to combine. It has previously been shown that systems with lower temperatures cannot simulate arbitrary systems with higher temperatures, and also that systems at some higher temperatures can simulate those at particular lower temperatures, creating an infinite set of infinite hierarchies of 2HAM systems with strictly increasing simulation power within each hierarchy. These previous results relied on two different definitions of simulation, one (strong simulation) seemingly more restrictive than the other (standard simulation), but which have previously not been proven to be distinct. Here we prove distinctions between them by first fully characterizing the set of pairs of temperatures such that the high temperature systems are intrinsically universal for the lower temperature systems (i.e. one tile set at the higher temperature can simulate any at the lower) using strong simulation. This includes the first impossibility result for simulation downward in temperature. We then show that lower temperature systems which cannot be simulated by higher temperature systems using the strong definition, can in fact be simulated using the standard definition, proving the distinction between the types of simulation.
Submitted 15 March, 2015;
originally announced March 2015.
-
Replication of arbitrary hole-free shapes via self-assembly with signal-passing tiles
Authors:
Andrew Alseth,
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers
Abstract:
In this paper, we investigate the abilities of systems of self-assembling tiles which can each pass a constant number of signals to their immediate neighbors to create replicas of input shapes. Namely, we work within the Signal-passing Tile Assembly Model (STAM), and we provide a universal STAM tile set which is capable of creating unbounded numbers of assemblies of shapes identical to those of input assemblies. The shapes of the input assemblies can be arbitrary 2-dimensional hole-free shapes. This improves on previous shape replication results in self-assembly, which relied on models requiring multiple assembly stages and/or bins and could replicate only more constrained shapes, as well as on a previous version of this result that required input shapes to be represented at scale factor 2.
Submitted 3 April, 2022; v1 submitted 4 March, 2015;
originally announced March 2015.
-
Computing in continuous space with self-assembling polygonal tiles
Authors:
Oscar Gilbert,
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers
Abstract:
In this paper we investigate the computational power of the polygonal tile assembly model (polygonal TAM) at temperature 1, i.e. in non-cooperative systems. The polygonal TAM is an extension of Winfree's abstract tile assembly model (aTAM) which not only allows for square tiles (as in the aTAM) but also allows for tile shapes that are polygons. Although a number of self-assembly results have shown computational universality at temperature 1, these are the first results to do so by fundamentally relying on tile placements in continuous, rather than discrete, space. With the square tiles of the aTAM, it is conjectured that the class of temperature 1 systems is not computationally universal. Here we show that the class of systems whose tiles are composed of a regular polygon P with n > 6 sides is computationally universal. On the other hand, we show that the class of systems whose tiles consist of a regular polygon P with n <= 6 cannot compute using any known techniques. In addition, we show a number of classes of systems whose tiles consist of a non-regular polygon with n >= 3 sides are computationally universal.
Submitted 18 August, 2015; v1 submitted 1 March, 2015;
originally announced March 2015.
-
Universal Computation with Arbitrary Polyomino Tiles in Non-Cooperative Self-Assembly
Authors:
Sándor P. Fekete,
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers,
Robert T. Schweller
Abstract:
In this paper we explore the power of geometry to overcome the limitations of non-cooperative self-assembly. We define a generalization of the abstract Tile Assembly Model (aTAM), such that a tile system consists of a collection of polyomino tiles, the Polyomino Tile Assembly Model (polyTAM), and investigate the computational powers of polyTAM systems at temperature 1, where attachment among tiles occurs without glue cooperation. Systems composed of the unit-square tiles of the aTAM at temperature 1 are believed to be incapable of Turing universal computation (while cooperative systems, with temperature > 1, are able). As our main result, we prove that for any polyomino $P$ of size 3 or greater, there exists a temperature-1 polyTAM system containing only shape-$P$ tiles that is computationally universal. Our proof leverages the geometric properties of these larger (relative to the aTAM) tiles and their abilities to effectively utilize geometric blocking of particular growth paths of assemblies, while allowing others to complete.
To round out our main result, we provide strong evidence that size-1 (i.e. aTAM tiles) and size-2 polyomino systems are unlikely to be computationally universal by showing that such systems are incapable of geometric bit-reading, which is a technique common to all currently known temperature-1 computationally universal systems. We further show that larger polyominoes with a limited number of binding positions are unlikely to be computationally universal, as they are only as powerful as temperature-1 aTAM systems. Finally, we connect our work with other work on domino self-assembly to show that temperature-1 assembly with at least 2 distinct shapes, regardless of the shapes or their sizes, allows for universal computation.
Submitted 18 August, 2014; v1 submitted 14 August, 2014;
originally announced August 2014.
-
Reflections on Tiles (in Self-Assembly)
Authors:
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers
Abstract:
We define the Reflexive Tile Assembly Model (RTAM), which is obtained from the abstract Tile Assembly Model (aTAM) by allowing tiles to reflect across their horizontal and/or vertical axes. We show that the class of directed temperature-1 RTAM systems is not computationally universal, which is conjectured but unproven for the aTAM, and like the aTAM, the RTAM is computationally universal at temperature 2. We then show that at temperature 1, when starting from a single tile seed, the RTAM is capable of assembling n x n squares for n odd using only n tile types, but incapable of assembling n x n squares for n even. Moreover, we show that n is a lower bound on the number of tile types needed to assemble n x n squares for n odd in the temperature-1 RTAM. The conjectured lower bound for temperature-1 aTAM systems is 2n-1. Finally, we give preliminary results toward the classification of which finite connected shapes in Z^2 can be assembled (strictly or weakly) by a singly seeded (i.e. seed of size 1) RTAM system, including a complete classification of which finite connected shapes can be strictly assembled by a "mismatch-free" singly seeded RTAM system.
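As a rough illustration of what "reflecting across the horizontal and/or vertical axes" means for a square tile, the sketch below enumerates the orientations a single tile can take under those reflections. The dictionary representation of a tile and the function names are assumptions made for this example; they are not taken from the paper, and the sketch does not model the RTAM's glue-binding rules.

```python
from typing import Dict, List

# Hypothetical representation: side ("N", "E", "S", "W") -> glue label.
Tile = Dict[str, str]


def reflect_horizontal(tile: Tile) -> Tile:
    """Flip across the horizontal axis: north and south sides trade places."""
    return {"N": tile["S"], "E": tile["E"], "S": tile["N"], "W": tile["W"]}


def reflect_vertical(tile: Tile) -> Tile:
    """Flip across the vertical axis: east and west sides trade places."""
    return {"N": tile["N"], "E": tile["W"], "S": tile["S"], "W": tile["E"]}


def orientations(tile: Tile) -> List[Tile]:
    """Return the distinct orientations reachable by the two reflections."""
    candidates = [
        tile,
        reflect_horizontal(tile),
        reflect_vertical(tile),
        reflect_vertical(reflect_horizontal(tile)),
    ]
    unique: List[Tile] = []
    for candidate in candidates:
        if candidate not in unique:  # symmetric tiles collapse to fewer orientations
            unique.append(candidate)
    return unique
```

A tile with four different glues has four distinct orientations under this scheme, while a tile whose glues are all identical has only one.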
Submitted 11 March, 2015; v1 submitted 23 April, 2014;
originally announced April 2014.
-
Doubles and Negatives are Positive (in Self-Assembly)
Authors:
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers
Abstract:
In the abstract Tile Assembly Model (aTAM), the phenomenon of cooperation occurs when the attachment of a new tile to a growing assembly requires it to bind to more than one tile already in the assembly. Often referred to as "temperature-2" systems, those which employ cooperation are known to be quite powerful (i.e. they are computationally universal and can build an enormous variety of shapes and structures). Conversely, aTAM systems which do not enforce cooperative behavior, a.k.a. "temperature-1" systems, are conjectured to be relatively very weak, likely to be unable to perform complex computations or algorithmically direct the process of self-assembly. Nonetheless, a variety of models based on slight modifications to the aTAM have been developed in which temperature-1 systems are in fact capable of Turing universal computation through a restricted notion of cooperation. Despite that power, though, several of those models have previously been proven to be unable to perform or simulate the stronger form of cooperation exhibited by temperature-2 aTAM systems.
In this paper, we first prove that another model in which temperature-1 systems are computationally universal, namely the restricted glue TAM (rgTAM) in which tiles are allowed to have edges which exhibit repulsive forces, is also unable to simulate the strongly cooperative behavior of the temperature-2 aTAM. We then show that by combining the properties of two such models, the Dupled Tile Assembly Model (DTAM) and the rgTAM into the DrgTAM, we derive a model which is actually more powerful at temperature-1 than the aTAM at temperature-2. Specifically, the DrgTAM, at temperature-1, can simulate any aTAM system of any temperature, and it also contains systems which cannot be simulated by any system in the aTAM.
Submitted 15 March, 2014;
originally announced March 2014.
-
The Power of Duples (in Self-Assembly): It's Not So Hip To Be Square
Authors:
Jacob Hendricks,
Matthew J. Patitz,
Trent A. Rogers,
Scott M. Summers
Abstract:
In this paper we define the Dupled abstract Tile Assembly Model (DaTAM), which is a slight extension to the abstract Tile Assembly Model (aTAM) that allows for not only the standard square tiles, but also "duple" tiles which are rectangles pre-formed by the joining of two square tiles. We show that the addition of duples allows for powerful behaviors of self-assembling systems at temperature 1, meaning systems which exclude the requirement of cooperative binding by tiles (i.e., the requirement that a tile must be able to bind to at least 2 tiles in an existing assembly if it is to attach). Cooperative binding is conjectured to be required in the standard aTAM for Turing universal computation and the efficient self-assembly of shapes, but we show that in the DaTAM these behaviors can in fact be exhibited at temperature 1. We then show that the DaTAM doesn't provide asymptotic improvements over the aTAM in its ability to efficiently build thin rectangles. Finally, we present a series of results which prove that the temperature-2 aTAM and temperature-1 DaTAM have mutually exclusive powers. That is, each is able to self-assemble shapes that the other can't, and each has systems which cannot be simulated by the other. Beyond being of purely theoretical interest, these results have practical motivation as duples have already proven to be useful in laboratory implementations of DNA-based tiles.
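The difference between temperature 1 and cooperative (temperature 2) binding described above can be made concrete with a small sketch of the usual aTAM threshold rule: a tile may attach at an empty position when the summed strength of its glues that match adjacent tiles reaches the temperature. The data layout and the names `binding_strength` and `can_attach` are assumptions for illustration only; the sketch covers plain square tiles and does not model duples.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation, chosen for this example only.
Glue = Tuple[str, int]            # (label, strength)
Tile = Dict[str, Optional[Glue]]  # side ("N", "E", "S", "W") -> glue or None
Pos = Tuple[int, int]

OFFSETS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}


def binding_strength(assembly: Dict[Pos, Tile], tile: Tile, pos: Pos) -> int:
    """Sum the strengths of glues on `tile` that match the abutting neighbor glues."""
    total = 0
    for side, (dx, dy) in OFFSETS.items():
        neighbor = assembly.get((pos[0] + dx, pos[1] + dy))
        if neighbor is None:
            continue  # no tile on this side, so nothing to bind to
        mine = tile.get(side)
        theirs = neighbor.get(OPPOSITE[side])
        if mine is not None and mine == theirs:
            total += mine[1]
    return total


def can_attach(assembly: Dict[Pos, Tile], tile: Tile, pos: Pos, temperature: int) -> bool:
    """A tile attaches at an empty position if its binding strength meets the temperature."""
    return pos not in assembly and binding_strength(assembly, tile, pos) >= temperature


# Usage: a tile with two strength-1 glues needs both neighbors at temperature 2.
west = {"E": ("x", 1)}
south = {"N": ("y", 1)}
new_tile = {"W": ("x", 1), "S": ("y", 1)}
both = {(0, 1): west, (1, 0): south}
only_west = {(0, 1): west}
print(can_attach(both, new_tile, (1, 1), 2))       # True: cooperative binding
print(can_attach(only_west, new_tile, (1, 1), 2))  # False: one strength-1 bond is not enough
print(can_attach(only_west, new_tile, (1, 1), 1))  # True: temperature 1 needs a single bond
```

With strength-1 glues, the temperature-2 check succeeds only when two neighbors cooperate, which is the requirement that temperature-1 systems drop.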
Submitted 6 March, 2014; v1 submitted 18 February, 2014;
originally announced February 2014.