Search | arXiv e-print repository

Switching Autoregressive Low-rank Tensor Models

Authors: Hyun Dong Lee, Andrew Warrington, Joshua I. Glaser, Scott W. Linderman

Abstract: An important problem in time-series analysis is modeling systems with time-varying dynamics. Probabilistic models with joint continuous and discrete latent states offer interpretable, efficient, and experimentally useful descriptions of such data. Commonly used models include autoregressive hidden Markov models (ARHMMs) and switching linear dynamical systems (SLDSs), each with its own advantages and disadvantages. ARHMMs permit exact inference and easy parameter estimation, but are parameter intensive when modeling long dependencies, and hence are prone to overfitting. In contrast, SLDSs can capture long-range dependencies in a parameter efficient way through Markovian latent dynamics, but present an intractable likelihood and a challenging parameter estimation task. In this paper, we propose switching autoregressive low-rank tensor (SALT) models, which retain the advantages of both approaches while ameliorating the weaknesses. SALT parameterizes the tensor of an ARHMM with a low-rank factorization to control the number of parameters and allow longer range dependencies without overfitting. We prove theoretical and discuss practical connections between SALT, linear dynamical systems, and SLDSs. We empirically demonstrate quantitative advantages of SALT models on a range of simulated and real prediction tasks, including behavioral and neural datasets. Furthermore, the learned low-rank tensor provides novel insights into temporal dependencies within each discrete state.

Submitted 6 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

arXiv:2208.04933 [pdf, other]

Simplified State Space Layers for Sequence Modeling

Authors: Jimmy T. H. Smith, Andrew Warrington, Scott W. Linderman

Abstract: Models using structured state space sequence (S4) layers have achieved state-of-the-art performance on long-range sequence modeling tasks. An S4 layer combines linear state space models (SSMs), the HiPPO framework, and deep learning to achieve high performance. We build on the design of the S4 layer and introduce a new state space layer, the S5 layer. Whereas an S4 layer uses many independent single-input, single-output SSMs, the S5 layer uses one multi-input, multi-output SSM. We establish a connection between S5 and S4, and use this to develop the initialization and parameterization used by the S5 model. The result is a state space layer that can leverage efficient and widely implemented parallel scans, allowing S5 to match the computational efficiency of S4, while also achieving state-of-the-art performance on several long-range sequence modeling tasks. S5 averages 87.4% on the long range arena benchmark, and 98.5% on the most difficult Path-X task.

Submitted 3 March, 2023; v1 submitted 9 August, 2022; originally announced August 2022.

arXiv:2206.05952 [pdf, other]

SIXO: Smoothing Inference with Twisted Objectives

Authors: Dieterich Lawson, Allan Raventós, Andrew Warrington, Scott Linderman

Abstract: Sequential Monte Carlo (SMC) is an inference algorithm for state space models that approximates the posterior by sampling from a sequence of target distributions. The target distributions are often chosen to be the filtering distributions, but these ignore information from future observations, leading to practical and theoretical limitations in inference and model learning. We introduce SIXO, a method that instead learns targets that approximate the smoothing distributions, incorporating information from all observations. The key idea is to use density ratio estimation to fit functions that warp the filtering distributions into the smoothing distributions. We then use SMC with these learned targets to define a variational objective for model and proposal learning. SIXO yields provably tighter log marginal lower bounds and offers significantly more accurate posterior inferences and parameter estimates in a variety of domains.

Submitted 20 June, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

Comments: v2: Updates for clarity throughout. Results unchanged

arXiv:2012.15566 [pdf, other]

Robust Asymmetric Learning in POMDPs

Authors: Andrew Warrington, J. Wilder Lavington, Adam Ścibior, Mark Schmidt, Frank Wood

Abstract: Policies for partially observed Markov decision processes can be efficiently learned by imitating policies for the corresponding fully observed Markov decision processes. Unfortunately, existing approaches for this kind of imitation learning have a serious flaw: the expert does not know what the trainee cannot see, and so may encourage actions that are sub-optimal, even unsafe, under partial information. We derive an objective to instead train the expert to maximize the expected reward of the imitating agent policy, and use it to construct an efficient algorithm, adaptive asymmetric DAgger (A2D), that jointly trains the expert and the agent. We show that A2D produces an expert policy that the agent can safely imitate, in turn outperforming policies learned by imitating a fixed expert.

Submitted 1 July, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: ICML 2021

arXiv:2003.13221 [pdf, other]

doi 10.3389/frai.2021.550603

Planning as Inference in Epidemiological Models

Authors: Frank Wood, Andrew Warrington, Saeid Naderiparizi, Christian Weilbach, Vaden Masrani, William Harvey, Adam Scibior, Boyan Beronov, John Grefenstette, Duncan Campbell, Ali Nasseri

Abstract: In this work we demonstrate how to automate parts of the infectious disease-control policy-making process via performing inference in existing epidemiological models. The kind of inference tasks undertaken include computing the posterior distribution over controllable, via direct policy-making choices, simulation model parameters that give rise to acceptable disease progression outcomes. Among other things, we illustrate the use of a probabilistic programming language that automates inference in existing simulators. Neither the full capabilities of this tool for automating inference nor its utility for planning is widely disseminated at the current time. Timely gains in understanding about how such simulation-based models and inference automation tools applied in support of policymaking could lead to less economically damaging policy prescriptions, particularly during the current COVID-19 pandemic.

Submitted 15 September, 2021; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: Revisions

Journal ref: Front Artif Intell. 2021; 4: 550603

arXiv:2003.12908 [pdf, other]

Coping With Simulators That Don't Always Return

Authors: Andrew Warrington, Saeid Naderiparizi, Frank Wood

Abstract: Deterministic models are approximations of reality that are easy to interpret and often easier to build than stochastic alternatives. Unfortunately, as nature is capricious, observational data can never be fully explained by deterministic models in practice. Observation and process noise need to be added to adapt deterministic models to behave stochastically, such that they are capable of explaining and extrapolating from noisy data. We investigate and address computational inefficiencies that arise from adding process noise to deterministic simulators that fail to return for certain inputs; a property we describe as "brittle." We show how to train a conditional normalizing flow to propose perturbations such that the simulator succeeds with high probability, increasing computational efficiency.

Submitted 28 March, 2020; originally announced March 2020.

Comments: AISTATS 2020 camera ready, version 1.0

arXiv:1907.11075 [pdf, other]

The Virtual Patch Clamp: Imputing C. elegans Membrane Potentials from Calcium Imaging

Authors: Andrew Warrington, Arthur Spencer, Frank Wood

Abstract: We develop a stochastic whole-brain and body simulator of the nematode roundworm Caenorhabditis elegans (C. elegans) and show that it is sufficiently regularizing to allow imputation of latent membrane potentials from partial calcium fluorescence imaging observations. This is the first attempt we know of to "complete the circle," where an anatomically grounded whole-connectome simulator is used to impute a time-varying "brain" state at single-cell fidelity from covariates that are measurable in practice. The sequential Monte Carlo (SMC) method we employ not only enables imputation of said latent states but also presents a strategy for learning simulator parameters via variational optimization of the noisy model evidence approximation provided by SMC. Our imputation and parameter estimation experiments were conducted on distributed systems using novel implementations of the aforementioned techniques applied to synthetic data of dimension and type representative of that which are measured in laboratories currently.

Submitted 24 July, 2019; originally announced July 2019.

Comments: Includes Supplementary Materials

arXiv:1805.00890 [pdf, other]

Generalising Cost-Optimal Particle Filtering

Authors: Andrew Warrington, Neil Dhir

Abstract: We present an instance of the optimal sensor scheduling problem with the additional relaxation that our observer makes active choices whether or not to observe and how to observe. We mask the nodes in a directed acyclic graph of the model that are observable, effectively optimising whether or not an observation should be made at each time step. The reason for this is simple: it is prudent to seek to reduce sensor costs, since resources (e.g. hardware, personnel and time) are finite. Consequently, rather than treating our plant as if it had infinite sensing resources, we seek to jointly maximise the utility of each perception. This reduces resource expenditure by explicitly minimising an observation-associated cost (e.g. battery use) while also facilitating the potential to yield better state estimates by virtue of being able to use more perceptions in noisy or unpredictable regions of state-space (e.g. a busy traffic junction). We present a general formalisation and notation of this problem, capable of encompassing much of the prior art. To illustrate our formulation, we pose and solve two example problems in this domain. Finally we suggest active areas of research to improve and further generalise this approach.

Submitted 2 May, 2018; originally announced May 2018.

Comments: ICRA 2018 Workshop on Informative Path Planning and Adaptive Sampling

arXiv:1710.11397 [pdf, ps, other]

Updating the VESICLE-CNN Synapse Detector

Authors: Andrew Warrington, Frank Wood

Abstract: We present an updated version of the VESICLE-CNN algorithm presented by Roncal et al. (2014). The original implementation makes use of a patch-based approach. This methodology is known to be slow due to repeated computations. We update this implementation to be fully convolutional through the use of dilated convolutions, recovering the expanded field of view achieved through the use of strided maxpools, but without a degradation of spatial resolution. This updated implementation performs as well as the original implementation, but with a $600\times$ speedup at test time. We release source code and data into the public domain.

Submitted 31 October, 2017; originally announced October 2017.

Comments: Submitted as a two-sided extended abstract to the NIPS 2017 workshop BigNeuro 2017: Analyzing brain data from nano to macroscale

arXiv:1709.06181 [pdf, other]

On Nesting Monte Carlo Estimators

Authors: Tom Rainforth, Robert Cornish, Hongseok Yang, Andrew Warrington, Frank Wood

Abstract: Many problems in machine learning and statistics involve nested expectations and thus do not permit conventional Monte Carlo (MC) estimation. For such problems, one must nest estimators, such that terms in an outer estimator themselves involve calculation of a separate, nested, estimation. We investigate the statistical implications of nesting MC estimators, including cases of multiple levels of nesting, and establish the conditions under which they converge. We derive corresponding rates of convergence and provide empirical evidence that these rates are observed in practice. We further establish a number of pitfalls that can arise from naive nesting of MC estimators, provide guidelines about how these can be avoided, and lay out novel methods for reformulating certain classes of nested expectation problems into single expectations, leading to improved convergence rates. We demonstrate the applicability of our work by using our results to develop a new estimator for discrete Bayesian experimental design problems and derive error bounds for a class of variational objectives.

Submitted 23 May, 2018; v1 submitted 18 September, 2017; originally announced September 2017.

Comments: To appear at International Conference on Machine Learning 2018

arXiv:1408.5327 [pdf, ps, other]

doi 10.1117/12.2055295

Alternative approach to precision narrow-angle astrometry for Antarctic long baseline interferometry

Authors: Y. Kok, M. J. Ireland, A. C. Rizzuto, P. G. Tuthill, J. G. Robertson, B. A. Warrington, W. J. Tango

Abstract: The conventional approach to high-precision narrow-angle astrometry using a long baseline interferometer is to directly measure the fringe packet separation of a target and a nearby reference star. This is done by means of a technique known as phase-referencing which requires a network of dual beam combiners and laser metrology systems. Using an alternative approach that does not rely on phase-referencing, the narrow-angle astrometry of several closed binary stars (with separation less than 2$"$), as described in this paper, was carried out by observing the fringe packet crossing event of the binary systems. Such an event occurs twice every sidereal day when the line joining the two stars of the binary is perpendicular to the projected baseline of the interferometer. Observation of these events is well suited for an interferometer in Antarctica. Proof of concept observations were carried out at the Sydney University Stellar Interferometer (SUSI) with targets selected according to its geographical location. Narrow-angle astrometry using this indirect approach has achieved sub-100 micro-arcsecond precision.

Submitted 22 August, 2014; originally announced August 2014.

Comments: SPIE Astronomical Telescopes and Instrumentation conference, June 2014, Paper ID 9146-103, 17 pages, 13 figures

arXiv:1311.7497 [pdf, ps, other]

Phase-referenced Interferometry and Narrow-angle Astrometry with SUSI

Authors: Y. Kok, M. J. Ireland, P. G. Tuthill, J. G. Robertson, B. A. Warrington, A. C. Rizzuto, W. J. Tango

Abstract: The Sydney University Stellar Interferometer (SUSI) now incorporates a new beam combiner, called the Microarcsecond University of Sydney Companion Astrometry instrument (MUSCA), for the purpose of high precision differential astrometry of bright binary stars. Operating in the visible wavelength regime where photon-counting and post-processing fringe tracking is possible, MUSCA will be used in tandem with SUSI's primary beam combiner, Precision Astronomical Visible Observations (PAVO), to record high spatial resolution fringes and thereby measure the separation of fringe packets of binary stars. In its current phase of development, the dual beam combiner configuration has successfully demonstrated for the first time a dual-star phase-referencing operation in visible wavelengths. This paper describes the beam combiner optics and hardware, the network of metrology systems employed to measure every non-common path between the two beam combiners and also reports on a recent narrow-angle astrometric observation of $δ$ Orionis A (HR 1852) as the project enters its on-sky testing phase.

Submitted 29 November, 2013; originally announced November 2013.

Comments: 35 pages, 51 figures, accepted for publication in JAI

arXiv:1309.3811 [pdf, other]

doi 10.1093/mnras/stt1690

Long-Baseline Interferometric Multiplicity Survey of the Sco-Cen OB Association

Authors: A. C. Rizzuto, M. J. Ireland, J. G. Robertson, Y. Kok, P. G. Tuthill, B. A. Warrington, X. Haubois, W. J. Tango, B. Norris, T. ten Brummelaar, A. L. Kraus, A. Jacob, C. Laliberte-Houdeville

Abstract: We present the first multiplicity-dedicated long baseline optical interferometric survey of the Scorpius-Centaurus-Lupus-Crux association. We used the Sydney University Stellar Interferometer to undertake a survey for new companions to 58 Sco-Cen B-type stars and have detected 24 companions at separations ranging from 7-130 mas, 14 of which are new detections. Furthermore, we use a Bayesian analysis and all available information in the literature to determine the multiplicity distribution of the 58 stars in our sample, showing that the companion frequency is F = 1.35 and the mass ratio distribution is best described as a power law with exponent equal to -0.46, agreeing with previous Sco-Cen high mass work and differing significantly from lower-mass stars in Tau-Aur. Based on our analysis, we estimate that among young B-type stars in moving groups, up to 23% are apparently single stars. This has strong implications for the understanding of high-mass star formation, which requires angular momentum dispersal through some mechanism such as formation of multiple systems.

Submitted 15 September, 2013; originally announced September 2013.

Comments: 7 figures, 5 tables, accepted for publication in MNRAS

arXiv:1304.0086 [pdf, ps, other]

doi 10.1364/AO.52.002808

A low cost scheme for high precision dual-wavelength laser metrology

Authors: Yitping Kok, Michael J. Ireland, J. Gordon Robertson, Peter G. Tuthill, Benjamin A. Warrington, William J. Tango

Abstract: A novel method capable of delivering relative optical path length metrology with nanometer precision is demonstrated. Unlike conventional dual-wavelength metrology which employs heterodyne detection, the method developed in this work utilizes direct detection of interference fringes of two He-Ne lasers as well as a less precise stepper motor open-loop position control system to perform its measurement. Although the method may be applicable to a variety of circumstances, the specific application where this metrology is essential is in an astrometric optical long baseline stellar interferometer dedicated to precise measurement of stellar positions. In our example application of this metrology to a narrow-angle astrometric interferometer, measurement of nanometer precision could be achieved without frequency-stabilized lasers although the use of such lasers would extend the range of optical path length the metrology can accurately measure. Implementation of the method requires very little additional optics or electronics, thus minimizing cost and effort of implementation. Furthermore, the optical path traversed by the metrology lasers is identical with that of the starlight or science beams, even down to using the same photodetectors, thereby minimizing the non-common-path between metrology and science channels.

Submitted 30 March, 2013; originally announced April 2013.

Comments: 17 pages, 4 figures, accepted for publication in Applied Optics

arXiv:1303.3658 [pdf]

doi 10.1117/12.924946

Science and Technology Progress at the Sydney University Stellar Interferometer

Authors: J. Gordon Robertson, Michael J. Ireland, William J. Tango, Peter G. Tuthill, Benjamin A. Warrington, Yitping Kok, Aaron C. Rizzuto, Anthony Cheetham, Andrew P. Jacob

Abstract: This paper presents an overview of recent progress at the Sydney University Stellar Interferometer (SUSI). Development of the third-generation PAVO beam combiner has continued. The MUSCA beam combiner for high-precision differential astrometry using visible light phase referencing is under active development and will be the subject of a separate paper. Because SUSI was one of the pioneering interferometric instruments, some of its original systems are old and have become difficult to maintain. We are undertaking a campaign of modernization of systems: (1) an upgrade of the Optical Path Length Compensator IR laser metrology counter electronics from a custom system which uses an obsolete single-board computer to a modern one based on an FPGA interfaced to a Linux computer - in addition to improving maintainability, this upgrade should allow smoother motion and higher carriage speeds; (2) the replacement of the aged single-board computer local controllers for the siderostats and the longitudinal dispersion compensator has been completed; (3) the large beam reducing telescope has been replaced with a pair of smaller units with separate accessible foci. Examples of scientific results are also included.

Submitted 14 March, 2013; originally announced March 2013.

Comments: 10 pages, 9 Figures

Journal ref: Proc. SPIE 8445-21, 2012

Showing 1–15 of 15 results for author: Warrington, A