High-dimensional reinforcement learning for optimization and control of ultracold quantum gases
Authors:
Nicholas Milson,
Arina Tashchilina,
Tian Ooi,
Anna Czarnecka,
Zaheen F. Ahmad,
Lindsay J. LeBlanc
Abstract:
Machine-learning techniques are emerging as a valuable tool in experimental physics, and among them, reinforcement learning offers the potential to control high-dimensional, multistage processes in the presence of fluctuating environments. In this experimental work, we apply reinforcement learning to the preparation of an ultracold quantum gas to realize a consistent and large number of atoms at microkelvin temperatures. This reinforcement learning agent determines an optimal set of thirty control parameters in a dynamically changing environment that is characterized by thirty sensed parameters. By comparing this method to that of training supervised-learning regression models, as well as to human-driven control schemes, we find that both machine learning approaches accurately predict the number of cooled atoms and both result in occasional superhuman control schemes. However, only the reinforcement learning method achieves consistent outcomes, even in the presence of a dynamic environment.
Submitted 29 December, 2023; v1 submitted 9 August, 2023;
originally announced August 2023.
Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
Authors:
Zaheen Farraz Ahmad,
Levi H. S. Lelis,
Michael Bowling
Abstract:
Sample-based planning is a powerful family of algorithms for generating intelligent behavior from a model of the environment. Generating good candidate actions is critical to the success of sample-based planners, particularly in continuous or large action spaces. Typically, candidate action generation exhausts the action space, uses domain knowledge, or, more recently, involves learning a stochastic policy to provide such search guidance. In this paper we explore explicitly learning a candidate action generator by optimizing a novel objective, marginal utility. The marginal utility of an action generator measures the increase in value of an action over previously generated actions. We validate our approach in both curling, a challenging stochastic domain with continuous state and action spaces, and a location game with a discrete but large action space. We show that a generator trained with the marginal utility objective outperforms hand-coded schemes built on substantial domain knowledge, trained stochastic policies, and other natural objectives for generating actions for sample-based planners.
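As a rough illustration of the marginal utility idea described in the abstract, the sketch below scores an ordered list of candidate actions by how much each one improves on the best value seen so far. It is one possible reading of the sentence above, not the paper's implementation: the function name `marginal_utilities`, the `baseline` argument, and the assumption that action values are already available as plain numbers are all hypothetical.

```python
from typing import List, Sequence


def marginal_utilities(values: Sequence[float], baseline: float = 0.0) -> List[float]:
    """Return, for each candidate action in generation order, the amount by
    which its value exceeds the best value among earlier candidates.

    `values` holds one estimated value per generated action (assumed given).
    `baseline` is a hypothetical reference value for the first action, which
    has no predecessors to compare against.
    """
    gains: List[float] = []
    best_so_far = baseline
    for value in values:
        # An action only contributes if it beats everything generated before it.
        gain = max(0.0, value - best_so_far)
        gains.append(gain)
        best_so_far = max(best_so_far, value)
    return gains


if __name__ == "__main__":
    # Four candidate actions with illustrative (made-up) value estimates.
    print(marginal_utilities([1.0, 3.0, 2.0, 5.0]))
```

Under this reading, a candidate that is worse than an earlier one contributes nothing, so the gains sum to the improvement of the best candidate over the baseline.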
Submitted 17 June, 2020; v1 submitted 10 June, 2020;
originally announced June 2020.