Search | arXiv e-print repository

Policy-Guided Diffusion

Authors: Matthew Thomas Jackson, Michael Tryfan Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob Foerster

Abstract: In many real-world settings, agents must learn from an offline dataset gathered by some prior behavior policy. Such a setting naturally leads to distribution shift between the behavior policy and the target policy being trained - requiring policy conservatism to avoid instability and overestimation bias. Autoregressive world models offer a different solution to this by generating synthetic, on-policy experience. However, in practice, model rollouts must be severely truncated to avoid compounding error. As an alternative, we propose policy-guided diffusion. Our method uses diffusion models to generate entire trajectories under the behavior distribution, applying guidance from the target policy to move synthetic experience further on-policy. We show that policy-guided diffusion models a regularized form of the target distribution that balances action likelihood under both the target and behavior policies, leading to plausible trajectories with high target policy probability, while retaining a lower dynamics error than an offline world model baseline. Using synthetic experience from policy-guided diffusion as a drop-in substitute for real data, we demonstrate significant improvements in performance across a range of standard offline reinforcement learning algorithms and environments. Our approach provides an effective alternative to autoregressive offline world models, opening the door to the controllable generation of synthetic training data.

Submitted 9 April, 2024; originally announced April 2024.

Comments: Previously at the NeurIPS 2023 Workshop on Robot Learning

arXiv:2210.04843 [pdf, other]

Multi-Modal Fusion by Meta-Initialization

Authors: Matthew T. Jackson, Shreshth A. Malik, Michael T. Matthews, Yousuf Mohamed-Ahmed

Abstract: When experience is scarce, models may have insufficient information to adapt to a new task. In this case, auxiliary information - such as a textual description of the task - can enable improved task inference and adaptation. In this work, we propose an extension to the Model-Agnostic Meta-Learning algorithm (MAML), which allows the model to adapt using auxiliary information as well as task experience. Our method, Fusion by Meta-Initialization (FuMI), conditions the model initialization on auxiliary information using a hypernetwork, rather than learning a single, task-agnostic initialization. Furthermore, motivated by the shortcomings of existing multi-modal few-shot learning benchmarks, we constructed iNat-Anim - a large-scale image classification dataset with succinct and visually pertinent textual class descriptions. On iNat-Anim, FuMI significantly outperforms uni-modal baselines such as MAML in the few-shot regime. The code for this project and a dataset exploration tool for iNat-Anim are publicly available at https://github.com/s-a-malik/multi-few .

Submitted 10 October, 2022; originally announced October 2022.

Comments: The first two authors contributed equally

arXiv:1903.09335 [pdf, other]

doi 10.1088/1873-7005/ab693d

Early- and late-time evolution of Rayleigh-Taylor instability in a finite-sized domain by means of group theory analysis

Authors: Annie Naveh, Miccal T. Matthews, Snezhana I. Abarzhi

Abstract: We have developed a theoretical analysis to systematically study the late-time evolution of the Rayleigh-Taylor instability in a finite-sized spatial domain. The nonlinear dynamics of fluids with similar and contrasting densities are considered for two-dimensional flows driven by sustained acceleration. The flows are periodic in the plane normal to the direction of acceleration and have no external mass sources. Group theory analysis is applied to accurately account for the mode coupling. Asymptotic nonlinear solutions are found to describe the inter-facial dynamics far from and near the boundaries. The influence of the size of the domain on the diagnostic parameters of the flow is identified. In particular, it is shown that in a finite-sized domain the flow is slower compared to the spatially extended case. The direct link between the multiplicity of solutions and the inter-facial shear function is explored. It is suggested that the inter-facial shear function acts as a natural parameter to the family of analytic solutions.

Submitted 21 March, 2019; originally announced March 2019.

Comments: 31 pages

Journal ref: 2020 Fluid Dynamics Research 52, 025504

Showing 1–3 of 3 results for author: Matthews, M T