-
Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention
Authors:
Abhishek Gupta,
Justin Yu,
Tony Z. Zhao,
Vikash Kumar,
Aaron Rovinsky,
Kelvin Xu,
Thomas Devlin,
Sergey Levine
Abstract:
Reinforcement Learning (RL) algorithms can in principle acquire complex robotic skills by learning from large amounts of data in the real world, collected via trial and error. However, most RL algorithms use a carefully engineered setup in order to collect data, requiring human supervision and intervention to provide episodic resets. This is particularly evident in challenging robotics problems, such as dexterous manipulation. To make data collection scalable, such applications require reset-free algorithms that are able to learn autonomously, without explicit instrumentation or human intervention. Most prior work in this area handles single-task learning. However, we might also want robots that can perform large repertoires of skills. At first, this would appear to only make the problem harder. However, the key observation we make in this work is that an appropriately chosen multi-task RL setting actually alleviates the reset-free learning challenge, with minimal additional machinery required. In effect, solving a multi-task problem can directly solve the reset-free problem since different combinations of tasks can serve to perform resets for other tasks. By learning multiple tasks together and appropriately sequencing them, we can effectively learn all of the tasks together reset-free. This type of multi-task learning can effectively scale reset-free learning schemes to much more complex problems, as we demonstrate in our experiments. We propose a simple scheme for multi-task learning that tackles the reset-free learning problem, and show its effectiveness at learning to solve complex dexterous manipulation tasks in both hardware and simulation without any explicit resets. This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
Submitted 22 April, 2021;
originally announced April 2021.
-
The Simons Observatory: Metamaterial Microwave Absorber (MMA) and its Cryogenic Applications
Authors:
Zhilei Xu,
Grace E. Chesmore,
Shunsuke Adachi,
Aamir M. Ali,
Andrew Bazarko,
Gabriele Coppi,
Mark Devlin,
Tom Devlin,
Simon R. Dicker,
Patricio A. Gallardo,
Joseph E. Golec,
Jon E. Gudmundsson,
Kathleen Harrington,
Makoto Hattori,
Anna Kofman,
Kenji Kiuchi,
Akito Kusaka,
Michele Limon,
Frederick Matsuda,
Jeff McMahon,
Federico Nati,
Michael D. Niemack,
Shreya Sutariya,
Aritoki Suzuki,
Grant P. Teply
, et al. (4 additional authors not shown)
Abstract:
Controlling stray light at millimeter wavelengths requires special optical design and selection of absorptive materials that should be compatible with cryogenic operating environments. While a wide selection of absorptive materials exists, these typically exhibit high indices of refraction and reflect/scatter a significant fraction of light before absorption. For many lower index materials such as commercial microwave absorbers, their applications in cryogenic environments are challenging. In this paper, we present a new tool to control stray light: metamaterial microwave absorber tiles. These tiles comprise an outer metamaterial layer that approximates a lossy gradient index anti-reflection coating. They are fabricated via injection molding of commercially available carbon-loaded polyurethane (25% by mass). The injection molding technology enables mass production at low cost. The design of these tiles is presented, along with thermal tests to 1 K. Room temperature optical measurements verify their control of reflectance to less than 1% up to 65$^\circ$ angles of incidence, and control of wide angle scattering below 0.01%. The dielectric properties of the bulk carbon-loaded material used in the tiles are also measured at different temperatures, confirming that the material maintains similar dielectric properties down to 3 K.
Submitted 22 February, 2021; v1 submitted 5 October, 2020;
originally announced October 2020.
-
Search for First-Generation Scalar Leptoquarks in $p \bar{p}$ collisions at $\sqrt{s}$ = 1.96 TeV
Authors:
The CDF Collaboration,
D. Acosta,
J. Adelman,
T. Affolder,
T. Akimoto,
M. G. Albrow,
D. Ambrose,
S. Amerio,
D. Amidei,
A. Anastassov,
K. Anikeev,
A. Annovi,
J. Antos,
M. Aoki,
G. Apollinari,
T. Arisawa,
J-F. Arguin,
A. Artikov,
W. Ashmanskas,
A. Attal,
F. Azfar,
P. Azzi-Bacchetta,
N. Bacchetta,
H. Bachacou,
W. Badgett
, et al. (605 additional authors not shown)
Abstract:
We report on a search for pair production of first-generation scalar leptoquarks ($LQ$) in $p \bar{p}$ collisions at $\sqrt{s}$ = 1.96 TeV using an integrated luminosity of 203 pb$^{-1}$ collected at the Fermilab Tevatron collider by the CDF experiment. We observe no evidence for $LQ$ production in the topologies arising from $LQ \overline{LQ} \to eqeq$ and $LQ \overline{LQ} \to eq νq$, and derive 95% C.L. upper limits on the $LQ$ production cross section. The results are combined with those obtained from a separately reported CDF search in the topology arising from $LQ \overline{LQ} \to νq νq$, and 95% C.L. lower limits on the $LQ$ mass as a function of $β = BR(LQ \to eq)$ are derived. The limits are 236, 205 and 145 GeV/c$^2$ for $β$ = 1, $β$ = 0.5 and $β$ = 0.1, respectively.
Submitted 29 June, 2005;
originally announced June 2005.