
Amortized Active Causal Induction with Deep Reinforcement Learning

Authors: Yashas Annadani, Panagiotis Tigas, Stefan Bauer, Adam Foster

Abstract: We present Causal Amortized Active Structure Learning (CAASL), an active intervention design policy that can select interventions that are adaptive, real-time and that does not require access to the likelihood. This policy, an amortized network based on the transformer, is trained with reinforcement learning on a simulator of the design environment, and a reward function that measures how close the true causal graph is to a causal graph posterior inferred from the gathered data. On synthetic data and a single-cell gene expression simulator, we demonstrate empirically that the data acquired through our policy results in a better estimate of the underlying causal graph than alternative strategies. Our design policy successfully achieves amortized intervention design on the distribution of the training environment while also generalizing well to distribution shifts in test-time design environments. Further, our policy also demonstrates excellent zero-shot generalization to design environments with dimensionality higher than that during training, and to intervention types that it has not been trained on.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2404.17249 [pdf, other]

Making Better Use of Unlabelled Data in Bayesian Active Learning

Authors: Freddie Bickford Smith, Adam Foster, Tom Rainforth

Abstract: Fully supervised models are predominant in Bayesian active learning. We argue that their neglect of the information present in unlabelled data harms not just predictive performance but also decisions about what data to acquire. Our proposed solution is a simple framework for semi-supervised Bayesian active learning. We find it produces better-performing models than either conventional Bayesian active learning or semi-supervised learning with randomly acquired data. It is also easier to scale up than the conventional approach. As well as supporting a shift towards semi-supervised models, our findings highlight the importance of studying models and acquisition methods in conjunction.

Submitted 26 April, 2024; originally announced April 2024.

Comments: Published at AISTATS 2024

arXiv:2402.02846 [pdf, other]

Machine Learning Resistant Amorphous Silicon Physically Unclonable Functions (PUFs)

Authors: Velat Kilic, Neil Macfarlane, Jasper Stround, Samuel Metais, Milad Alemohammad, A. Brinton Cooper, Amy C. Foster, Mark A. Foster

Abstract: We investigate usage of nonlinear wave chaotic amorphous silicon (a-Si) cavities as physically unclonable functions (PUF). Machine learning attacks on integrated electronic PUFs have been demonstrated to be very effective at modeling PUF behavior. Such attacks on integrated a-Si photonic PUFs are investigated through application of algorithms including linear regression, k-nearest neighbor, decision tree ensembles (random forests and gradient boosted trees), and deep neural networks (DNNs). We found that DNNs performed the best among all the algorithms studied but still failed to completely break the a-Si PUF security which we quantify through a private information metric. Furthermore, machine learning resistance of a-Si PUFs were found to be directly related to the strength of their nonlinear response.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2304.08151 [pdf, other]

Prediction-Oriented Bayesian Active Learning

Authors: Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth

Abstract: Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score. We highlight that this can be suboptimal from the perspective of predictive performance. For example, BALD lacks a notion of an input distribution and so is prone to prioritise data of limited relevance. To address this we propose the expected predictive information gain (EPIG), an acquisition function that measures information gain in the space of predictions rather than parameters. We find that using EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models, and thus provides an appealing drop-in replacement.

Submitted 17 April, 2023; originally announced April 2023.

Comments: Published at AISTATS 2023

arXiv:2302.14545 [pdf, ps, other]

Modern Bayesian Experimental Design

Authors: Tom Rainforth, Adam Foster, Desi R Ivanova, Freddie Bickford Smith

Abstract: Bayesian experimental design (BED) provides a powerful and general framework for optimizing the design of experiments. However, its deployment often poses substantial computational challenges that can undermine its practical use. In this review, we outline how recent advances have transformed our ability to overcome these challenges and thus utilize BED effectively, before discussing some key areas for future development in the field.

Submitted 29 November, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

Comments: Accepted for publication in Statistical Science

arXiv:2302.14015 [pdf, other]

CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design

Authors: Desi R. Ivanova, Joel Jennings, Tom Rainforth, Cheng Zhang, Adam Foster

Abstract: We formalize the problem of contextual optimization through the lens of Bayesian experimental design and propose CO-BED -- a general, model-agnostic framework for designing contextual experiments using information-theoretic principles. After formulating a suitable information-based objective, we employ black-box variational methods to simultaneously estimate it and optimize the designs in a single stochastic gradient scheme. In addition, to accommodate discrete actions within our framework, we propose leveraging continuous relaxation schemes, which can naturally be integrated into our variational objective. As a result, CO-BED provides a general and automated solution to a wide range of contextual optimization problems. We illustrate its effectiveness in a number of experiments, where CO-BED demonstrates competitive performance even when compared to bespoke, model-specific alternatives.

Submitted 13 July, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: Proceedings of the 40th International Conference on Machine Learning (ICML 2023); 9 pages, 7 figures

arXiv:2302.10607 [pdf, other]

Differentiable Multi-Target Causal Bayesian Experimental Design

Authors: Yashas Annadani, Panagiotis Tigas, Desi R. Ivanova, Andrew Jesson, Yarin Gal, Adam Foster, Stefan Bauer

Abstract: We introduce a gradient-based approach for the problem of Bayesian optimal experimental design to learn causal models in a batch setting -- a critical component for causal discovery from finite data where interventions can be costly or risky. Existing methods rely on greedy approximations to construct a batch of experiments while using black-box methods to optimize over a single target-state pair to intervene with. In this work, we completely dispose of the black-box optimization techniques and greedy heuristics and instead propose a conceptually simple end-to-end gradient-based optimization procedure to acquire a set of optimal intervention target-state pairs. Such a procedure enables parameterization of the design space to efficiently optimize over a batch of multi-target-state interventions, a setting which has hitherto not been explored due to its complexity. We demonstrate that our proposed method outperforms baselines and existing acquisition strategies in both single-target and multi-target settings across a number of synthetic datasets.

Submitted 2 June, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: Camera-ready version ICML 2023

arXiv:2207.11297 [pdf]

doi 10.1002/mrm.29574

Accelerated and Quantitative 3D Semisolid MT/CEST Imaging using a Generative Adversarial Network (GAN-CEST)

Authors: Jonah Weigand-Whittier, Maria Sedykh, Kai Herz, Jaume Coll-Font, Anna N. Foster, Elizabeth R. Gerstner, Christopher Nguyen, Moritz Zaiss, Christian T. Farrar, Or Perlman

Abstract: Purpose: To substantially shorten the acquisition time required for quantitative 3D chemical exchange saturation transfer (CEST) and semisolid magnetization transfer (MT) imaging and allow for rapid chemical exchange parameter map reconstruction. Methods: Three-dimensional CEST and MT magnetic resonance fingerprinting (MRF) datasets of L-arginine phantoms, whole-brains, and calf muscles from healthy volunteers, cancer patients, and cardiac patients were acquired using 3T clinical scanners at 3 different sites, using 3 different scanner models and coils. A generative adversarial network supervised framework (GAN-CEST) was then designed and trained to learn the mapping from a reduced input data space to the quantitative exchange parameter space, while preserving perceptual and quantitative content. Results: The GAN-CEST 3D acquisition time was 42-52 seconds, 70% shorter than CEST-MRF. The quantitative reconstruction of the entire brain took 0.8 seconds. An excellent agreement was observed between the ground truth and GAN-based L-arginine concentration and pH values (Pearson's r > 0.97, NRMSE < 1.5%). GAN-CEST images from a brain-tumor subject yielded a semi-solid volume fraction and exchange rate NRMSE of 3.8$\pm$1.3% and 4.6$\pm$1.3%, respectively, and SSIM of 96.3$\pm$1.6% and 95.0$\pm$2.4%, respectively. The mapping of the calf-muscle exchange parameters in a cardiac patient, yielded NRMSE < 7% and SSIM > 94% for the semi-solid exchange parameters. In regions with large susceptibility artifacts, GAN-CEST has demonstrated improved performance and reduced noise compared to MRF. Conclusion: GAN-CEST can substantially reduce the acquisition time for quantitative semisolid MT/CEST mapping, while retaining performance even when facing pathologies and scanner models that were not available during training.

Submitted 5 August, 2023; v1 submitted 22 July, 2022; originally announced July 2022.

Comments: This project received funding from NIH Grants R01-CA203873, R01-EB03008, P41-RR14075, and the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 836752 (OncoViroMRI). This paper reflects only the author's view, and the European Research Executive Agency is not responsible for any use that may be made of the information it contains

Journal ref: Magn Reson Med. 2023;89:1901-1914

arXiv:2207.05250 [pdf, other]

Efficient Real-world Testing of Causal Decision Making via Bayesian Experimental Design for Contextual Optimisation

Authors: Desi R. Ivanova, Joel Jennings, Cheng Zhang, Adam Foster

Abstract: The real-world testing of decisions made using causal machine learning models is an essential prerequisite for their successful application. We focus on evaluating and improving contextual treatment assignment decisions: these are personalised treatments applied to e.g. customers, each with their own contextual information, with the aim of maximising a reward. In this paper we introduce a model-agnostic framework for gathering data to evaluate and improve contextual decision making through Bayesian Experimental Design. Specifically, our method is used for the data-efficient evaluation of the regret of past treatment assignments. Unlike approaches such as A/B testing, our method avoids assigning treatments that are known to be highly sub-optimal, whilst engaging in some exploration to gather pertinent information. We achieve this by introducing an information-based design objective, which we optimise end-to-end. Our method applies to discrete and continuous treatments. Comparing our information-theoretic approach to baselines in several simulation studies demonstrates the superior performance of our proposed approach.

Submitted 11 July, 2022; originally announced July 2022.

Comments: ICML 2022 Workshop on Adaptive Experimental Design and Active Learning in the Real World. 16 pages, 5 figures

arXiv:2206.00051 [pdf, other]

Learning Instance-Specific Augmentations by Capturing Local Invariances

Authors: Ning Miao, Tom Rainforth, Emile Mathieu, Yann Dubois, Yee Whye Teh, Adam Foster, Hyunjik Kim

Abstract: We introduce InstaAug, a method for automatically learning input-specific augmentations from data. Previous methods for learning augmentations have typically assumed independence between the original input and the transformation applied to that input. This can be highly restrictive, as the invariances we hope our augmentation will capture are themselves often highly input dependent. InstaAug instead introduces a learnable invariance module that maps from inputs to tailored transformation parameters, allowing local invariances to be captured. This can be simultaneously trained alongside the downstream model in a fully end-to-end manner, or separately learned for a pre-trained model. We empirically demonstrate that InstaAug learns meaningful input-dependent augmentations for a wide range of transformation classes, which in turn provides better performance on both supervised and self-supervised tasks.

Submitted 30 May, 2023; v1 submitted 31 May, 2022; originally announced June 2022.

arXiv:2202.03422 [pdf]

Development of a deep learning platform for optimising sheet stamping geometries subject to manufacturing constraints

Authors: Hamid Reza Attar, Alistair Foster, Nan Li

Abstract: The latest sheet stamping processes enable efficient manufacturing of complex shape structural components that have high stiffness to weight ratios, but these processes can introduce defects. To assist component design for stamping processes, this paper presents a novel deep-learning-based platform for optimising 3D component geometries. The platform adopts a non-parametric modelling approach that is capable of optimising arbitrary geometries from multiple geometric parameterisation schema. This approach features the interaction of two neural networks: 1) a geometry generator and 2) a manufacturing performance evaluator. The generator predicts continuous 3D signed distance fields (SDFs) for geometries of different classes, and each SDF is conditioned on a latent vector. The zero-level-set of each SDF implicitly represents a generated geometry. Novel training strategies for the generator are introduced and include a new loss function which is tailored for sheet stamping applications. These strategies enable the differentiable generation of high quality, large scale component geometries with tight local features for the first time. The evaluator maps a 2D projection of these generated geometries to their post-stamping physical (e.g., strain) distributions. Manufacturing constraints are imposed based on these distributions and are used to formulate a novel objective function for optimisation. A new gradient-based optimisation technique is employed to iteratively update the latent vectors, and therefore geometries, to minimise this objective function and thus meet the manufacturing constraints. Case studies based on optimising box geometries subject to a sheet thinning constraint for a hot stamping process are presented and discussed. The results show that expressive geometric changes are achievable, and that these changes are driven by stamping performance.

Submitted 4 February, 2022; originally announced February 2022.

Comments: 44 pages, 35 figures

arXiv:2202.02195 [pdf, other]

Deep End-to-end Causal Inference

Authors: Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang

Abstract: Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment and policy making. However, research on causal discovery has evolved separately from inference methods, preventing straight-forward combination of methods from both fields. In this work, we develop Deep End-to-end Causal Inference (DECI), a single flow-based non-linear additive noise model that takes in observational data and can perform both causal discovery and inference, including conditional average treatment effect (CATE) estimation. We provide a theoretical guarantee that DECI can recover the ground truth causal graph under standard causal discovery assumptions. Motivated by application impact, we extend this model to heterogeneous, mixed-type data with missing values, allowing for both continuous and discrete treatment decisions. Our results show the competitive performance of DECI when compared to relevant baselines for both causal discovery and (C)ATE estimation in over a thousand experiments on both synthetic datasets and causal machine learning benchmarks across data-types and levels of missingness.

Submitted 20 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

arXiv:2111.02329 [pdf, other]

Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods

Authors: Desi R. Ivanova, Adam Foster, Steven Kleinegesse, Michael U. Gutmann, Tom Rainforth

Abstract: We introduce implicit Deep Adaptive Design (iDAD), a new method for performing adaptive experiments in real-time with implicit models. iDAD amortizes the cost of Bayesian optimal experimental design (BOED) by learning a design policy network upfront, which can then be deployed quickly at the time of the experiment. The iDAD network can be trained on any model which simulates differentiable samples, unlike previous design policy work that requires a closed form likelihood and conditionally independent experiments. At deployment, iDAD allows design decisions to be made in milliseconds, in contrast to traditional BOED approaches that require heavy computation during the experiment itself. We illustrate the applicability of iDAD on a number of experiments, and show that it provides a fast and effective mechanism for performing adaptive design with implicit models.

Submitted 3 November, 2021; originally announced November 2021.

Comments: 33 pages, 8 figures. Published as a conference paper at NeurIPS 2021

arXiv:2107.07004 [pdf, other]

Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of Adverse Weather Conditions for 3D Object Detection

Authors: Velat Kilic, Deepti Hegde, Vishwanath Sindagi, A. Brinton Cooper, Mark A. Foster, Vishal M. Patel

Abstract: Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars. However, they are known to be sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR). As a result, lidar-based object detectors trained on data captured in normal weather tend to perform poorly in such scenarios. However, collecting and labelling sufficient training data in a diverse range of adverse weather conditions is laborious and prohibitively expensive. To address this issue, we propose a physics-based approach to simulate lidar point clouds of scenes in adverse weather conditions. These augmented datasets can then be used to train lidar-based detectors to improve their all-weather reliability. Specifically, we introduce a hybrid Monte-Carlo based approach that treats (i) the effects of large particles by placing them randomly and comparing their back reflected power against the target, and (ii) attenuation effects on average through calculation of scattering efficiencies from the Mie theory and particle size distributions. Retraining networks with this augmented data improves mean average precision evaluated on real world rainy scenes and we observe greater improvement in performance with our model relative to existing models from the literature. Furthermore, we evaluate recent state-of-the-art detectors on the simulated weather conditions and present an in-depth analysis of their performance.

Submitted 14 July, 2021; originally announced July 2021.

arXiv:2106.10052 [pdf, other]

On Contrastive Representations of Stochastic Processes

Authors: Emile Mathieu, Adam Foster, Yee Whye Teh

Abstract: Learning representations of stochastic processes is an emerging problem in machine learning with applications from meta-learning to physical object models to time series. Typical methods rely on exact reconstruction of observations, but this approach breaks down as observations become high-dimensional or noise distributions become complex. To address this, we propose a unifying framework for learning contrastive representations of stochastic processes (CReSP) that does away with exact reconstruction. We dissect potential use cases for stochastic process representations, and propose methods that accommodate each. Empirically, we show that our methods are effective for learning representations of periodic functions, 3D objects and dynamical processes. Our methods tolerate noisy high-dimensional observations better than traditional approaches, and the learned representations transfer to a range of downstream tasks.

Submitted 29 October, 2021; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: NeurIPS 2021 Camera ready

arXiv:2106.08161 [pdf, other]

Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness

Authors: Adam Foster, Árpi Vezér, Craig A Glastonbury, Páidí Creed, Sam Abujudeh, Aaron Sim

Abstract: Learning meaningful representations of data that can address challenges such as batch effect correction and counterfactual inference is a central problem in many domains including computational biology. Adopting a Conditional VAE framework, we show that marginal independence between the representation and a condition variable plays a key role in both of these challenges. We propose the Contrastive Mixture of Posteriors (CoMP) method that uses a novel misalignment penalty defined in terms of mixtures of the variational posteriors to enforce this independence in latent space. We show that CoMP has attractive theoretical properties compared to previous approaches, and we prove counterfactual identifiability of CoMP under additional assumptions. We demonstrate state-of-the-art performance on a set of challenging tasks including aligning human tumour samples with cancer cell-lines, predicting transcriptome-level perturbation responses, and batch correction on single-cell RNA sequencing data. We also find parallels to fair representation learning and demonstrate that CoMP is competitive on a common task in the field.

Submitted 26 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: Published as a conference paper (long presentation) at ICML 2022

arXiv:2104.13199 [pdf]

Rapid feasibility assessment of components formed through hot stamping: A deep learning approach

Authors: Hamid Reza Attar, Haosu Zhou, Alistair Foster, Nan Li

Abstract: The novel non-isothermal Hot Forming and cold die Quenching (HFQ) process can enable the cost-effective production of complex shaped, high strength aluminium alloy panel components. However, the unfamiliarity of designing for the new process prevents its widescale adoption in industrial settings. Recent research efforts focus on the development of advanced material models for finite element simulations, used to assess the feasibility of new component designs for the HFQ process. However, FE simulations take place late in design processes, require forming process expertise and are unsuitable for early-stage design explorations. To address these limitations, this study presents a novel application of a Convolutional Neural Network (CNN) based surrogate as a means of rapid manufacturing feasibility assessment for components to be formed using the HFQ process. A diverse dataset containing variations in component geometry, blank shapes, and processing parameters, together with corresponding physical fields is generated and used to train the model. The results show that near indistinguishable full field predictions are obtained in real time from the model when compared with HFQ simulations. This technique provides an invaluable tool to aid component design and decision making at the onset of a design process for complex-shaped components formed under HFQ conditions.

Submitted 20 April, 2021; originally announced April 2021.

arXiv:2103.02438 [pdf, other]

Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design

Authors: Adam Foster, Desi R. Ivanova, Ilyas Malik, Tom Rainforth

Abstract: We introduce Deep Adaptive Design (DAD), a method for amortizing the cost of adaptive Bayesian experimental design that allows experiments to be run in real-time. Traditional sequential Bayesian optimal experimental design approaches require substantial computation at each stage of the experiment. This makes them unsuitable for most real-world applications, where decisions must typically be made quickly. DAD addresses this restriction by learning an amortized design network upfront and then using this to rapidly run (multiple) adaptive experiments at deployment time. This network represents a design policy which takes as input the data from previous steps, and outputs the next design using a single forward pass; these design decisions can be made in milliseconds during the live experiment. To train the network, we introduce contrastive information bounds that are suitable objectives for the sequential setting, and propose a customized network architecture that exploits key symmetries. We demonstrate that DAD successfully amortizes the process of experimental design, outperforming alternative strategies on a number of problems.

Submitted 11 June, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: Published as a conference paper at ICML 2021

arXiv:2010.09515 [pdf, other]

Improving Transformation Invariance in Contrastive Representation Learning

Authors: Adam Foster, Rattana Pukdee, Tom Rainforth

Abstract: We propose methods to strengthen the invariance properties of representations obtained by contrastive learning. While existing approaches implicitly induce a degree of invariance as representations are learned, we look to more directly enforce invariance in the encoding process. To this end, we first introduce a training objective for contrastive learning that uses a novel regularizer to control how the representation changes under transformation. We show that representations trained with this objective perform better on downstream tasks and are more robust to the introduction of nuisance transformations at test time. Second, we propose a change to how test time representations are generated by introducing a feature averaging approach that combines encodings from multiple transformations of the original input, finding that this leads to across the board performance gains. Finally, we introduce the novel Spirograph dataset to explore our ideas in the context of a differentiable generative process with multiple downstream tasks, showing that our techniques for learning invariance are highly beneficial.

Submitted 22 March, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

Comments: Published as a conference paper at ICLR 2021

arXiv:1911.00294 [pdf, other]

A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments

Authors: Adam Foster, Martin Jankowiak, Matthew O'Meara, Yee Whye Teh, Tom Rainforth

Abstract: We introduce a fully stochastic gradient based approach to Bayesian optimal experimental design (BOED). Our approach utilizes variational lower bounds on the expected information gain (EIG) of an experiment that can be simultaneously optimized with respect to both the variational and design parameters. This allows the design process to be carried out through a single unified stochastic gradient ascent procedure, in contrast to existing approaches that typically construct a pointwise EIG estimator, before passing this estimator to a separate optimizer. We provide a number of different variational objectives including the novel adaptive contrastive estimation (ACE) bound. Finally, we show that our gradient-based approaches are able to provide effective design optimization in substantially higher dimensional settings than existing approaches.

Submitted 27 February, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

Comments: Published as a conference paper at AISTATS 2020

arXiv:1904.08875 [pdf, other]

doi 10.1016/j.cpc.2019.106949

DScribe: Library of Descriptors for Machine Learning in Materials Science

Authors: Lauri Himanen, Marc O. J. Jäger, Eiaki V. Morooka, Filippo Federici Canova, Yashasvi S. Ranawat, David Z. Gao, Patrick Rinke, Adam S. Foster

Abstract: DScribe is a software package for machine learning that provides popular feature transformations ("descriptors") for atomistic materials simulations. DScribe accelerates the application of machine learning for atomistic property prediction by providing user-friendly, off-the-shelf descriptor implementations. The package currently contains implementations for Coulomb matrix, Ewald sum matrix, sine matrix, Many-body Tensor Representation (MBTR), Atom-centered Symmetry Function (ACSF) and Smooth Overlap of Atomic Positions (SOAP). Usage of the package is illustrated for two different applications: formation energy prediction for solids and ionic charge prediction for atoms in organic molecules. The package is freely available under the open-source Apache License 2.0.

Submitted 18 April, 2019; originally announced April 2019.

Journal ref: Comp. Phys. Comm. 247 (2020) 106949

arXiv:1903.05480 [pdf, other]

Variational Bayesian Optimal Experimental Design

Authors: Adam Foster, Martin Jankowiak, Eli Bingham, Paul Horsfall, Yee Whye Teh, Tom Rainforth, Noah Goodman

Abstract: Bayesian optimal experimental design (BOED) is a principled framework for making efficient use of limited experimental resources. Unfortunately, its applicability is hampered by the difficulty of obtaining accurate estimates of the expected information gain (EIG) of an experiment. To address this, we introduce several classes of fast EIG estimators by building on ideas from amortized variational inference. We show theoretically and empirically that these estimators can provide significant gains in speed and accuracy over previous approaches. We further demonstrate the practicality of our approach on a number of end-to-end experiments.

Submitted 14 January, 2020; v1 submitted 13 March, 2019; originally announced March 2019.

Comments: Published as a conference paper at the Thirty-third Conference on Neural Information Processing Systems, Vancouver 2019. https://papers.nips.cc/paper/9553-variational-bayesian-optimal-experimental-design.pdf

arXiv:1807.03113 [pdf, other]

Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks

Authors: Benjamin Bloem-Reddy, Adam Foster, Emile Mathieu, Yee Whye Teh

Abstract: Empirical evidence suggests that heavy-tailed degree distributions occurring in many real networks are well-approximated by power laws with exponents $\eta$ that may take values either less than or greater than two. Models based on various forms of exchangeability are able to capture power laws with $\eta < 2$, and admit tractable inference algorithms; we draw on previous results to show that $\eta > 2$ cannot be generated by the forms of exchangeability used in existing random graph models. Preferential attachment models generate power law exponents greater than two, but have been of limited use as statistical models due to the inherent difficulty of performing inference in non-exchangeable models. Motivated by this gap, we design and implement inference algorithms for a recently proposed class of models that generates $\eta$ of all possible values. We show that although they are not exchangeable, these models have probabilistic structure amenable to inference. Our methods make a large class of previously intractable models useful for statistical inference.

Submitted 9 July, 2018; originally announced July 2018.

Comments: Accepted for publication in the proceedings of Conference on Uncertainty in Artificial Intelligence (UAI) 2018

arXiv:1711.02222 [pdf]

Information-Dense Nonlinear Photonic Physical Unclonable Function

Authors: Brian C. Grubel, Bryan T. Bosworth, Michael R. Kossey, A. Brinton Cooper, Mark A. Foster, Amy C. Foster

Abstract: We present a comprehensive investigation into the complexity of a new private key storage apparatus: a novel silicon photonic physical unclonable function (PUF) based on ultrafast nonlinear optical interactions in a chaotic silicon microcavity that is both unclonable and impossible to emulate. This device provides remarkable improvements to total information content (raw cryptographic material), information density, and key generation rates over existing optical scattering PUFs and is also more easily integrated with both CMOS electronics and telecommunications hardware. Our device exploits the natural nonlinear optical behavior of silicon to neutralize commonly used attacks against PUFs and vastly enhance device complexity. We confirm this phenomenon with thorough experimental results on prototype devices and present a detailed estimate of their total information content. Our compact, micron-scale approach represents an entirely new generation of ultrafast and high information density photonic PUF devices that can be directly incorporated into integrated circuits to ensure authenticity and provide secure physical storage of private key material.

Submitted 6 November, 2017; originally announced November 2017.

arXiv:1711.01439 [pdf]

doi 10.1364/OE.26.004710

Secure Communications using Nonlinear Silicon Photonic Keys

Authors: Brian C. Grubel, Bryan T. Bosworth, Michael R. Kossey, A. Brinton Cooper, Mark A. Foster, Amy C. Foster

Abstract: We present a secure communication system constructed using pairs of nonlinear photonic physical unclonable functions (PUFs) that harness physical chaos in integrated silicon micro-cavities. Compared to a large, electronically stored one-time pad, our method provisions large amounts of information within the intrinsically complex nanostructure of the micro-cavities. By probing a micro-cavity with a rapid sequence of spectrally-encoded ultrafast optical pulses and measuring the lightwave responses, we experimentally demonstrate the ability to extract 2.4 Gb of key material from a single micro-cavity device. Subsequently, in a secure communications experiment with pairs of devices, we achieve bit error rates below $10^{-5}$ at code rates of up to 0.1. The PUFs' responses are never transmitted over the channel or stored in digital memory, thus enhancing security of the system. Additionally, the micro-cavity PUFs are extremely small, inexpensive, robust, and fully compatible with telecommunications infrastructure, components, and electronic fabrication. This approach can serve one-time pad or public key exchange applications where high security is required.

Submitted 5 February, 2018; v1 submitted 4 November, 2017; originally announced November 2017.

Comments: 12 pages. Replaced with revised version

Showing 1–25 of 25 results for author: Foster, A