-
Learning Force Control for Legged Manipulation
Authors:
Tifanny Portela,
Gabriel B. Margolis,
Yandong Ji,
Pulkit Agrawal
Abstract:
Controlling contact forces during interactions is critical for locomotion and manipulation tasks. While sim-to-real reinforcement learning (RL) has succeeded in many contact-rich problems, current RL methods achieve forceful interactions implicitly without explicitly regulating forces. We propose a method for training RL policies for direct force control without requiring access to force sensing. We showcase our method on a whole-body control platform of a quadruped robot with an arm. Such force control enables us to perform gravity compensation and impedance control, unlocking compliant whole-body manipulation. The learned whole-body controller with variable compliance makes it intuitive for humans to teleoperate the robot by only commanding the manipulator, and the robot's body adjusts automatically to achieve the desired position and force. Consequently, a human teleoperator can easily demonstrate a wide variety of loco-manipulation tasks. To the best of our knowledge, we provide the first deployment of learned whole-body force control in legged manipulators, paving the way for more versatile and adaptable legged robots.
Submitted 20 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
Learning to See Physical Properties with Active Sensing Motor Policies
Authors:
Gabriel B. Margolis,
Xiang Fu,
Yandong Ji,
Pulkit Agrawal
Abstract:
Knowledge of terrain's physical properties inferred from color images can aid in making efficient robotic locomotion plans. However, unlike image classification, it is unintuitive for humans to label image patches with physical properties. Without labeled data, building a vision system that takes as input the observed terrain and predicts physical properties remains challenging. We present a method that overcomes this challenge by self-supervised labeling of images captured by robots during real-world traversal with physical property estimators trained in simulation. To ensure accurate labeling, we introduce Active Sensing Motor Policies (ASMP), which are trained to explore locomotion behaviors that increase the accuracy of estimating physical parameters. For instance, the quadruped robot learns to swipe its foot against the ground to estimate the friction coefficient accurately. We show that the visual system trained with a small amount of real-world traversal data accurately predicts physical parameters. The trained system is robust and works even with overhead images captured by a drone despite being trained on data collected by cameras attached to a quadruped robot walking on the ground.
Submitted 2 November, 2023;
originally announced November 2023.
-
DribbleBot: Dynamic Legged Manipulation in the Wild
Authors:
Yandong Ji,
Gabriel B. Margolis,
Pulkit Agrawal
Abstract:
DribbleBot (Dexterous Ball Manipulation with a Legged Robot) is a legged robotic system that can dribble a soccer ball under the same real-world conditions as humans (i.e., in-the-wild). We adopt the paradigm of training policies in simulation using reinforcement learning and transferring them into the real world. We overcome critical challenges of accounting for variable ball motion dynamics on different terrains and perceiving the ball using body-mounted cameras under the constraints of onboard computing. Our results provide evidence that current quadruped platforms are well-suited for studying dynamic whole-body control problems involving simultaneous locomotion and manipulation directly from sensory observations.
Submitted 3 April, 2023;
originally announced April 2023.
-
Walk These Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior
Authors:
Gabriel B Margolis,
Pulkit Agrawal
Abstract:
Learned locomotion policies can rapidly adapt to diverse environments similar to those experienced during training but lack a mechanism for fast tuning when they fail in an out-of-distribution test environment. This necessitates a slow and iterative cycle of reward and environment redesign to achieve good performance on a new task. As an alternative, we propose learning a single policy that encodes a structured family of locomotion strategies that solve training tasks in different ways, resulting in Multiplicity of Behavior (MoB). Different strategies generalize differently and can be chosen in real time for new tasks or environments, bypassing the need for time-consuming retraining. We release a fast, robust open-source MoB locomotion controller, Walk These Ways, that can execute diverse gaits with variable footswing, posture, and speed, unlocking diverse downstream tasks: crouching, hopping, high-speed running, stair traversal, bracing against shoves, rhythmic dance, and more. Video and code release: https://gmargo11.github.io/walk-these-ways/
Submitted 6 December, 2022;
originally announced December 2022.
-
Rapid Locomotion via Reinforcement Learning
Authors:
Gabriel B Margolis,
Ge Yang,
Kartik Paigwar,
Tao Chen,
Pulkit Agrawal
Abstract:
Agile maneuvers such as sprinting and high-speed turning in the wild are challenging for legged robots. We present an end-to-end learned controller that achieves record agility for the MIT Mini Cheetah, sustaining speeds up to 3.9 m/s. This system runs and turns fast on natural terrains like grass, ice, and gravel and responds robustly to disturbances. Our controller is a neural network trained in simulation via reinforcement learning and transferred to the real world. The two key components are (i) an adaptive curriculum on velocity commands and (ii) an online system identification strategy for sim-to-real transfer leveraged from prior work. Videos of the robot's behaviors are available at: https://agility.csail.mit.edu/
Submitted 5 May, 2022;
originally announced May 2022.
-
Learning to Jump from Pixels
Authors:
Gabriel B. Margolis,
Tao Chen,
Kartik Paigwar,
Xiang Fu,
Donghyun Kim,
Sangbae Kim,
Pulkit Agrawal
Abstract:
Today's robotic quadruped systems can robustly walk over a diverse range of rough but continuous terrains, where the terrain elevation varies gradually. Locomotion on discontinuous terrains, such as those with gaps or obstacles, presents a complementary set of challenges. In discontinuous settings, it becomes necessary to plan ahead using visual inputs and to execute agile behaviors beyond robust walking, such as jumps. Such dynamic motion results in significant motion of onboard sensors, which introduces a new set of challenges for real-time visual processing. The requirement for agility and terrain awareness in this setting reinforces the need for robust control. We present Depth-based Impulse Control (DIC), a method for synthesizing highly agile visually-guided locomotion behaviors. DIC affords the flexibility of model-free learning but regularizes behavior through explicit model-based optimization of ground reaction forces. We evaluate the proposed method both in simulation and in the real world.
Submitted 28 October, 2021;
originally announced October 2021.
-
The High Energy Behavior of the Forward Scattering Parameters---An Amplitude Analysis Update
Authors:
M. M. Block,
A. R. White,
B. Margolis
Abstract:
Utilizing the most recent experimental data, we reanalyze high energy $\bar{p}p$ and $pp$ data, using the asymptotic amplitude analysis, under the assumption that we have reached `asymptopia'. This analysis gives strong evidence for a $\log \,(s/s_0)$ dependence at {\em current} energies and {\em not} $\log^2 (s/s_0)$, and also demonstrates that odderons are {\em not} necessary to explain the experimental data.
Submitted 13 October, 1995;
originally announced October 1995.
-
CP-Violation and the Quark Mass Matrices
Authors:
B. Margolis,
S. Punch,
C. Hamzaoui
Abstract:
We study a class of quark mass matrix models with the $\cal CP$-violating phase determined through making the $\cal CP$-violating parameter $J$ an extremum. These models assume that $m_u \ll m_c \ll m_t$ and that $m_d \ll m_s \ll m_b$. They have $\left|{V_{ub}\over{V_{cb}}}\right|\approx{\sqrt{m_u \over{m_c}}}$, $\left|{V_{td}\over{V_{ts}}}\right|\approx{\sqrt{m_d \over{m_s}}}$ and $|V_{us}|\approx|V_{cd}|\approx{\sqrt{{m_d \over{m_s}}+{m_u \over{m_c}}}}$. The Wolfenstein parameters $ρ$ and $η$ are found to be related by $ρ\approxη^2$. Finally, we examine a special class of such models where the masses are constrained to be roughly in geometric progression. Further application of the extremal condition to $J$ then leads to ${\sqrt{m_d \over{m_s}}}\approx 3{\sqrt{m_u \over{m_c}}}$ and hence $η\approx\frac{1}{3}$, for maximal $J$.
Submitted 1 June, 1995;
originally announced June 1995.
-
The High Energy Behavior of the Forward Scattering Parameters $σ_{\rm total}$, $ρ$, and $B$
Authors:
M. M. Block,
F. Halzen,
B. Margolis,
A. R. White
Abstract:
Utilizing the most recent experimental data, we reanalyze high energy $\bar{p}p$ and $pp$ data, using two distinct (and {\em dissimilar}) analysis techniques: (1) asymptotic amplitude analysis, under the assumption that we have reached `asymptopia', and (2) an eikonal model whose amplitudes are designed to mimic real QCD amplitudes. The former gives strong evidence for a $\log \,(s/s_0)$ dependence at {\em current} energies and {\em not} $\log^2 (s/s_0)$, and demonstrates that odderons are {\em not} necessary to explain the experimental data. The latter gives a unitary model for extrapolation into true `asymptopia' from current energies, allowing us to predict the values of the total cross section at future supercolliders. Using our QCD-model, we obtain $\sigma_{\rm tot}(16\,\, {\rm TeV})=109\pm4$\,mb and $\sigma_{\rm tot}(40\,\, {\rm TeV})=124\pm4$\,mb.
Submitted 15 December, 1994;
originally announced December 1994.
-
Quark model calculation of $η\to l^+ l^-$ to all orders in the bound state relative momentum
Authors:
B. Margolis,
J. Ng,
M. Phipps,
H. D. Trottier
Abstract:
The electromagnetic box diagram for the leptonic decays of pseudoscalar mesons in the quark model is evaluated to all orders in ${\bf p} / m_q$, where ${\bf p}$ is the relative three-momentum of the quark-antiquark pair and $m_q$ is the quark mass. We compute $B_P \equiv Γ(η\to l^+ l^-) / Γ(η\to γγ)$ using a popular nonrelativistic (NR) harmonic oscillator wave function, and with a relativistic momentum space wave function that we derive from the MIT bag model. We also compare with a calculation in the limit of extreme NR binding due to Bergström. Numerical calculations of $B_P$ using these three parameterizations of the wave function agree to within a few percent over a wide kinematical range. We find that the quark model leads in a natural way to a negligible value for the ratio of dispersive to absorptive parts of the electromagnetic amplitude for $η\to μ^+ μ^-$ (unitary bound). However, we find substantial deviations from the unitary bound in other kinematical regions, such as $η,π^0 \to e^+ e^-$. These quark models yield $B(η\to μ^+μ^-) \approx 4.3 \times 10^{-6}$, within errors of the recent SATURNE measurement of $(5.1 \pm 0.8) \times 10^{-6}$, $B(η\to e^+ e^-) \approx 6.3 \times 10^{-9}$, and $B(π^0 \to e^+ e^-) \approx 1.0 \times 10^{-7}$.
Submitted 22 October, 1992;
originally announced October 1992.
-
Can perturbative QCD predict a substantial part of diffractive LHC/SSC physics?
Authors:
J. R. Cudell,
B. Margolis
Abstract:
We examine a model of hadronic diffractive scattering which interpolates between perturbative QCD and non-perturbative fits. We restrict the perturbative QCD resummation to the large transverse momentum region, and use a simple Regge-pole parametrization in the infrared region. This picture allows us to account for existing data, and to estimate the size of the perturbative contribution to future diffractive measurements. At LHC and SSC energies, we find that a cut-off BFKL equation can lead to a measurable perturbative component in traditionally soft processes. In particular, we show that the total pp cross section could become as large as 228 mb (160 mb) and the rho parameter as large as 0.23 (0.24) at the SSC (LHC).
Submitted 13 September, 1992;
originally announced September 1992.
-
Perturbative QCD Corrections to the Soft Pomeron
Authors:
J. R. Cudell,
B. Margolis
Abstract:
We study the interface between soft and hard QCD at high energy and small momentum transfer. At LHC and SSC energies, we find that a cutoff BFKL equation leads one to expect a measurable perturbative component in traditionally soft processes. We show that the total cross section could become as large as 175 mb (122 mb) and the rho parameter 0.40 (0.25) at the SSC (LHC).
Submitted 19 June, 1992;
originally announced June 1992.