Showing 1–50 of 208 results for author: Tomizuka, M

  1. arXiv:2407.01531  [pdf, other]

    cs.RO cs.LG

    Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

    Authors: Yixiao Wang, Yifei Zhang, Mingxiao Huo, Ran Tian, Xiang Zhang, Yichen Xie, Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding, Masayoshi Tomizuka

    Abstract: The increasing complexity of tasks in robotics demands efficient strategies for multitask and continual learning. Traditional models typically rely on a universal policy for all tasks, facing challenges such as high computational costs and catastrophic forgetting when learning new tasks. To address these issues, we introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP). B… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00898  [pdf, other]

    cs.RO

    Residual-MPPI: Online Policy Customization for Continuous Control

    Authors: Pengcheng Wang, Chenran Li, Catherine Weaver, Kenta Kawamoto, Masayoshi Tomizuka, Chen Tang, Wei Zhan

    Abstract: Policies learned through Reinforcement Learning (RL) and Imitation Learning (IL) have demonstrated significant potential in achieving advanced performance in continuous control tasks. However, in real-world environments, it is often necessary to further customize a trained policy when there are additional requirements that were unforeseen during the original training phase. It is possible to fine-… ▽ More

    Submitted 3 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

  3. arXiv:2406.16258  [pdf, other]

    cs.RO cs.AI cs.LG

    MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention

    Authors: Yuxin Chen, Chen Tang, Chenran Li, Ran Tian, Peter Stone, Masayoshi Tomizuka, Wei Zhan

    Abstract: Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's execution and provides interventions as feedback. However, existing methods often fail to utilize the prior policy efficiently to facilitate learning, thu… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

    ACM Class: I.2.6; I.2.9

  4. arXiv:2406.12303  [pdf, other]

    cs.CV

    Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment

    Authors: Yiheng Li, Heyang Jiang, Akio Kodaira, Masayoshi Tomizuka, Kurt Keutzer, Chenfeng Xu

    Abstract: In this paper, we point out suboptimal noise-data mapping leads to slow training of diffusion models. During diffusion training, current methods diffuse each image across the entire noise space, resulting in a mixture of all images at every point in the noise layer. We emphasize that this random mixture of noise-data mapping complicates the optimization of the denoising function in diffusion model… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  5. arXiv:2406.04666  [pdf, other]

    cs.RO

    Underactuated Control of Multiple Soft Pneumatic Actuators via Stable Inversion

    Authors: Wu-Te Yang, Burak Kurkcu, Masayoshi Tomizuka

    Abstract: Soft grippers, with their inherent compliance and adaptability, show advantages for delicate and versatile manipulation tasks in robotics. This paper presents a novel approach to underactuated control of multiple soft actuators, specifically focusing on the synchronization of soft fingers within a soft gripper. Utilizing a single syringe pump as the actuation mechanism, we address the challenge of… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 10 pages, 7 figures

  6. arXiv:2406.04534  [pdf, other]

    cs.LG

    Strategically Conservative Q-Learning

    Authors: Yutaka Shimizu, Joey Hong, Sergey Levine, Masayoshi Tomizuka

    Abstract: Offline reinforcement learning (RL) is a compelling paradigm to extend RL's practical utility by leveraging pre-collected, static datasets, thereby avoiding the limitations associated with collecting online interactions. The major difficulty in offline RL is mitigating the impact of approximation errors when encountering out-of-distribution (OOD) actions; doing so ineffectively will lead to polici… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  7. arXiv:2405.20323  [pdf, other]

    cs.CV cs.AI

    $\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

    Authors: Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

    Abstract: Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving. Despite the efficacy of Neural Radiance Fields (NeRF) for driving scenes, 3D Gaussian Splatting (3DGS) emerges as a promising direction due to its faster speed and more explicit representation. However, most existing street 3DGS methods require tracked 3D vehicle b… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Code is available at: https://github.com/nnanhuang/S3Gaussian/

  8. arXiv:2405.15757  [pdf, other]

    cs.CV cs.MM

    Looking Backward: Streaming Video-to-Video Translation with Feature Banks

    Authors: Feng Liang, Akio Kodaira, Chenfeng Xu, Masayoshi Tomizuka, Kurt Keutzer, Diana Marculescu

    Abstract: This paper introduces StreamV2V, a diffusion model that achieves real-time streaming video-to-video (V2V) translation with user prompts. Unlike prior V2V methods using batches to process limited frames, we opt to process frames in a streaming fashion, to support unlimited frames. At the heart of StreamV2V lies a backward-looking principle that relates the present to the past. This is realized by m… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Project page: https://jeff-liangf.github.io/projects/streamv2v

  9. arXiv:2405.01333  [pdf, other]

    cs.RO cs.CV

    NeRF in Robotics: A Survey

    Authors: Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang

    Abstract: Meticulous 3D environment representations have been a longstanding goal in computer vision and robotics fields. The recent emergence of neural implicit representations has introduced radical innovation to this field as implicit representations enable numerous capabilities. Among these, the Neural Radiance Field (NeRF) has sparked a trend because of the huge representational advantages, such as sim… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 21 pages, 19 figures

  10. arXiv:2404.06605  [pdf, other]

    cs.CV

    RoadBEV: Road Surface Reconstruction in Bird's Eye View

    Authors: Tong Zhao, Lei Yang, Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Yintao Wei

    Abstract: Road surface conditions, especially geometry profiles, enormously affect driving performance of autonomous vehicles. Vision-based online road reconstruction promisingly captures road information in advance. Existing solutions like monocular depth estimation and stereo matching suffer from modest performance. The recent technique of Bird's-Eye-View (BEV) perception provides immense potential to mor… ▽ More

    Submitted 20 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Dataset page: https://thu-rsxd.com/rsrd Code: https://github.com/ztsrxh/RoadBEV

  11. arXiv:2404.04772  [pdf, other]

    cs.RO

    Efficient Reinforcement Learning of Task Planners for Robotic Palletization through Iterative Action Masking Learning

    Authors: Zheng Wu, Yichuan Li, Wei Zhan, Changliu Liu, Yun-Hui Liu, Masayoshi Tomizuka

    Abstract: The development of robotic systems for palletization in logistics scenarios is of paramount importance, addressing critical efficiency and precision demands in supply chain management. This paper investigates the application of Reinforcement Learning (RL) in enhancing task planning for such robotic systems. Confronted with the substantial challenge of a vast action space, which is a significant im… ▽ More

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 8 pages, 8 figures

  12. arXiv:2404.00237  [pdf, other]

    cs.RO

    Joint Pedestrian Trajectory Prediction through Posterior Sampling

    Authors: Haotian Lin, Yixiao Wang, Mingxiao Huo, Chensheng Peng, Zhiyuan Liu, Masayoshi Tomizuka

    Abstract: Joint pedestrian trajectory prediction has long grappled with the inherent unpredictability of human behaviors. Recent investigations employing variants of conditional diffusion models in trajectory prediction have exhibited notable success. Nevertheless, the heavy dependence on accurate historical data results in their vulnerability to noise disturbances and data incompleteness. To improve the ro… ▽ More

    Submitted 30 March, 2024; originally announced April 2024.

  13. arXiv:2403.20001  [pdf, other]

    cs.RO

    Adaptive Energy Regularization for Autonomous Gait Transition and Energy-Efficient Quadruped Locomotion

    Authors: Boyuan Liang, Lingfeng Sun, Xinghao Zhu, Bike Zhang, Ziyin Xiong, Chenran Li, Koushil Sreenath, Masayoshi Tomizuka

    Abstract: In reinforcement learning for legged robot locomotion, crafting effective reward strategies is crucial. Pre-defined gait patterns and complex reward systems are widely used to stabilize policy training. Drawing from the natural locomotion behaviors of humans and animals, which adapt their gaits to minimize energy consumption, we propose a simplified, energy-centric reward strategy to foster the de… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 8 pages, 5 figures

  14. arXiv:2403.18960  [pdf, other]

    cs.RO

    Robust In-Hand Manipulation with Extrinsic Contacts

    Authors: Boyuan Liang, Kei Ota, Masayoshi Tomizuka, Devesh Jha

    Abstract: We present in-hand manipulation tasks where a robot moves an object in grasp, maintains its external contact mode with the environment, and adjusts its in-hand pose simultaneously. The proposed manipulation task leads to complex contact interactions which can be very susceptible to uncertainties in kinematic and physical parameters. Therefore, we propose a robust in-hand manipulation method, which… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted at ICRA 24

  15. DBPF: A Framework for Efficient and Robust Dynamic Bin-Picking

    Authors: Yichuan Li, Junkai Zhao, Yixiao Li, Zheng Wu, Rui Cao, Masayoshi Tomizuka, Yunhui Liu

    Abstract: Efficiency and reliability are critical in robotic bin-picking as they directly impact the productivity of automated industrial processes. However, traditional approaches, demanding static objects and fixed collisions, lead to deployment limitations, operational inefficiencies, and process unreliability. This paper introduces a Dynamic Bin-Picking Framework (DBPF) that challenges traditional stati… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 8 pages, 5 figures. This paper has been accepted by IEEE RA-L on 2024-03-24. See the supplementary video at youtube: https://youtu.be/n5af2VsKhkg

  16. PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search

    Authors: Chensheng Peng, Zhaoyu Zeng, Jinling Gao, Jundong Zhou, Masayoshi Tomizuka, Xinbing Wang, Chenghu Zhou, Nanyang Ye

    Abstract: Multiple object tracking is a critical task in autonomous driving. Existing works primarily focus on the heuristic design of neural networks to obtain high accuracy. As tracking accuracy improves, however, neural networks become increasingly complex, posing challenges for their practical application in real driving scenarios due to the high level of latency. In this paper, we explore the use of th… ▽ More

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: IEEE Robotics and Automation Letters 2024. Code is available at https://github.com/PholyPeng/PNAS-MOT

    Journal ref: IEEE Robotics and Automation Letters, 2024

  17. arXiv:2403.12676  [pdf, other]

    cs.RO

    In-Hand Following of Deformable Linear Objects Using Dexterous Fingers with Tactile Sensing

    Authors: Mingrui Yu, Boyuan Liang, Xiang Zhang, Xinghao Zhu, Xiang Li, Masayoshi Tomizuka

    Abstract: Most research on deformable linear object (DLO) manipulation assumes rigid grasping. However, beyond rigid grasping and re-grasping, in-hand following is also an essential skill that humans use to dexterously manipulate DLOs, which requires continuously changing the grasp point by in-hand sliding while holding the DLO to prevent it from falling. Achieving such a skill is very challenging for robot… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

  18. arXiv:2403.08125  [pdf, other]

    cs.CV

    Q-SLAM: Quadric Representations for Monocular SLAM

    Authors: Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan

    Abstract: Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries. Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise, yet these methods typically focus on novel view synthesis rather than precise 3D geometry modeling. This focus results in a significant disconnect between NeRF applications, i.e., novel-view synthesis and the requirement… ▽ More

    Submitted 12 March, 2024; originally announced March 2024.

  19. arXiv:2403.07470  [pdf, other]

    cs.RO cs.PL

    DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models

    Authors: Yuanfei Lin, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Matthias Althoff

    Abstract: Motion planners are essential for the safe operation of automated vehicles across various scenarios. However, no motion planning algorithm has achieved perfection in the literature, and improving its performance is often time-consuming and labor-intensive. To tackle the aforementioned issues, we present DrPlanner, the first framework designed to automatically diagnose and repair motion planners us… ▽ More

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: @2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  20. arXiv:2403.06086  [pdf, other]

    cs.AI cs.RO

    Towards Generalizable and Interpretable Motion Prediction: A Deep Variational Bayes Approach

    Authors: Juanwu Lu, Wei Zhan, Masayoshi Tomizuka, Yeping Hu

    Abstract: Estimating the potential behavior of the surrounding human-driven vehicles is crucial for the safety of autonomous vehicles in a mixed traffic flow. Recent state-of-the-art achieved accurate prediction using deep neural networks. However, these end-to-end models are usually black boxes with weak interpretability and generalizability. This paper proposes the Goal-based Neural Variational Agent (GNe… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted at AISTATS 2024

  21. arXiv:2403.06041  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG cs.MA

    MATRIX: Multi-Agent Trajectory Generation with Diverse Contexts

    Authors: Zhuo Xu, Rui Zhou, Yida Yin, Huidong Gao, Masayoshi Tomizuka, Jiachen Li

    Abstract: Data-driven methods have great advantages in modeling complicated human behavioral dynamics and dealing with many human-robot interaction applications. However, collecting massive and annotated real-world human datasets has been a laborious task, especially for highly interactive scenarios. On the other hand, algorithmic data generation methods are usually limited by their model capacities, making… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: IEEE International Conference on Robotics and Automation (ICRA 2024)

  22. arXiv:2402.18897  [pdf, other]

    cs.RO

    Contact-Implicit Model Predictive Control for Dexterous In-hand Manipulation: A Long-Horizon and Robust Approach

    Authors: Yongpeng Jiang, Mingrui Yu, Xinghao Zhu, Masayoshi Tomizuka, Xiang Li

    Abstract: Dexterous in-hand manipulation is an essential skill of production and life. Nevertheless, the highly stiff and mutable features of contacts cause limitations to real-time contact discovery and inference, which degrades the performance of model-based methods. Inspired by recent advancements in contact-rich locomotion and manipulation, this paper proposes a novel model-based approach to control dex… ▽ More

    Submitted 11 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 7 pages, 8 figures, submitted to IROS2024

  23. arXiv:2402.16836  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV

    PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models

    Authors: Dingkun Guo, Yuqi Xiang, Shuqi Zhao, Xinghao Zhu, Masayoshi Tomizuka, Mingyu Ding, Wei Zhan

    Abstract: Robotic grasping is a fundamental aspect of robot functionality, defining how robots interact with objects. Despite substantial progress, its generalizability to counter-intuitive or long-tailed scenarios, such as objects with uncommon materials or shapes, remains a challenge. In contrast, humans can easily apply their intuitive physics to grasp skillfully and change grasps efficiently, even for o… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

  24. arXiv:2402.15583  [pdf, other]

    cs.CV cs.LG

    Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving

    Authors: Yichen Xie, Hongge Chen, Gregory P. Meyer, Yong Jae Lee, Eric M. Wolff, Masayoshi Tomizuka, Wei Zhan, Yuning Chai, Xin Huang

    Abstract: Due to the lack of depth cues in images, multi-frame inputs are important for the success of vision-based perception, prediction, and planning in autonomous driving. Observations from different angles enable the recovery of 3D object states from 2D image inputs if we can identify the same instance in different input frames. However, the dynamic nature of autonomous driving scenes leads to signific… ▽ More

    Submitted 23 February, 2024; originally announced February 2024.

  25. arXiv:2402.14194  [pdf, other]

    cs.LG cs.RO

    BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay

    Authors: Catherine Weaver, Chen Tang, Ce Hao, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan

    Abstract: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions. In many robotic tasks, such as autonomous racing, imitated policies must model complex environment dynamics and human decision-making. Sequence modeling is highly effective in capturing intricate patterns of motion sequences but struggles to adapt to new environments or distribution shifts that… ▽ More

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Preprint

  26. arXiv:2402.08931  [pdf, other]

    cs.CV

    Depth-aware Volume Attention for Texture-less Stereo Matching

    Authors: Tong Zhao, Mingyu Ding, Wei Zhan, Masayoshi Tomizuka, Yintao Wei

    Abstract: Stereo matching plays a crucial role in 3D perception and scenario understanding. Despite the proliferation of promising methods, addressing texture-less and texture-repetitive conditions remains challenging due to the insufficient availability of rich geometric and semantic information. In this paper, we propose a lightweight volume refinement scheme to tackle the texture deterioration in practic… ▽ More

    Submitted 26 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: 10 pages, 6 figures

  27. arXiv:2401.15315  [pdf, other]

    cs.RO

    Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving

    Authors: Zhiyu Huang, Chen Tang, Chen Lv, Masayoshi Tomizuka, Wei Zhan

    Abstract: Effective decision-making in autonomous driving relies on accurate inference of other traffic agents' future behaviors. To achieve this, we propose an online belief-update-based behavior prediction model and an efficient planner for Partially Observable Markov Decision Processes (POMDPs). We develop a Transformer-based prediction model, enhanced with a recurrent neural memory model, to dynamically… ▽ More

    Submitted 17 June, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

    Comments: IEEE Robotics and Automation Letters

  28. arXiv:2401.00391  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Controllable Adversaries

    Authors: Wei-Jer Chang, Francesco Pittaluga, Masayoshi Tomizuka, Wei Zhan, Manmohan Chandraker

    Abstract: Evaluating the performance of autonomous vehicle planning algorithms necessitates simulating long-tail safety-critical traffic scenarios. However, traditional methods for generating such scenarios often fall short in terms of controllability and realism and neglect the dynamics of agent interactions. To mitigate these limitations, we introduce SAFE-SIM, a novel diffusion-based controllable closed-… ▽ More

    Submitted 15 June, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: Under Review

    ACM Class: I.2.9; I.2.6

  29. arXiv:2312.11598  [pdf, other]

    cs.RO cs.CV cs.LG

    SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

    Authors: Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding, Ping Luo

    Abstract: Diffusion models have demonstrated strong potential for robotic trajectory planning. However, generating coherent trajectories from high-level instructions remains challenging, especially for long-range composition tasks requiring multiple sequential skills. We propose SkillDiffuser, an end-to-end hierarchical planning framework integrating interpretable skill learning with conditional diffusion p… ▽ More

    Submitted 28 March, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted by CVPR 2024. Camera ready version. Project page: https://skilldiffuser.github.io/

  30. arXiv:2312.10571  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection

    Authors: Xinghao Zhu, Devesh K. Jha, Diego Romeres, Lingfeng Sun, Masayoshi Tomizuka, Anoop Cherian

    Abstract: Automating the assembly of objects from their parts is a complex problem with innumerable applications in manufacturing, maintenance, and recycling. Unlike existing research, which is limited to target segmentation, pose regression, or using fixed target blueprints, our work presents a holistic multi-level framework for part assembly planning consisting of part assembly sequence inference, part mo… ▽ More

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Supplementary video is available at https://www.youtube.com/watch?v=XNYkWSHkAaU&ab_channel=MitsubishiElectricResearchLabs%28MERL%29

  31. arXiv:2312.06876  [pdf, other]

    cs.RO cs.AI

    Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks

    Authors: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddarth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka, Diego Romeres

    Abstract: Designing robotic agents to perform open vocabulary tasks has been the long-standing goal in robotics and AI. Recently, Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. However, planning for these tasks in the presence of uncertainties is challenging as it requires "chain-of-thought" reasoning, aggregating inform… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 22 pages, 4 figures

  32. arXiv:2311.07499  [pdf, other]

    cs.RO

    Bridging the Sim-to-Real Gap with Dynamic Compliance Tuning for Industrial Insertion

    Authors: Xiang Zhang, Masayoshi Tomizuka, Hui Li

    Abstract: Contact-rich manipulation tasks often exhibit a large sim-to-real gap. For instance, industrial assembly tasks frequently involve tight insertions where the clearance is less than 0.1 mm and can even be negative when dealing with a deformable receptacle. This narrow clearance leads to complex contact dynamics that are difficult to model accurately in simulation, making it challenging to transfer s… ▽ More

    Submitted 1 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted by ICRA 24

  33. arXiv:2311.02527  [pdf, other]

    cs.RO

    Nonlinear Modeling for Soft Pneumatic Actuators via Data-Driven Parameter Estimation

    Authors: Wu-Te Yang, Hannah Stuart, Burak Kurkcu, Masayoshi Tomizuka

    Abstract: Precise modeling of soft robots remains a challenge due to their infinite-dimensional nature governed by partial differential equations. This paper introduces an innovative approach for modeling soft pneumatic actuators, employing a nonlinear framework through data-driven parameter estimation. The research begins by introducing Ludwick's Law, providing an accurate representation of the large deflectio… ▽ More

    Submitted 14 February, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: 7 pages, 8 figures

  34. arXiv:2310.13168  [pdf, other]

    cs.RO

    Optimized Design of a Soft Actuator Considering Force/Torque, Bendability, and Controllability via an Approximated Structure

    Authors: Wu-Te Yang, Burak Kurkcu, Masayoshi Tomizuka

    Abstract: This paper introduces a novel design method that enhances the force/torque, bendability, and controllability of soft pneumatic actuators (SPAs). The complex structure of the soft actuator is simplified by approximating it as a cantilever beam. This allows us to derive approximated nonlinear kinematic models and a dynamical model, which is explored to understand the correlation between natural freq… ▽ More

    Submitted 12 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: 18 pages, 11 figures

  35. arXiv:2310.10509  [pdf, other]

    cs.RO

    Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning

    Authors: Xiang Zhang, Changhao Wang, Lingfeng Sun, Zheng Wu, Xinghao Zhu, Masayoshi Tomizuka

    Abstract: Learning contact-rich manipulation skills is essential. Such skills require the robots to interact with the environment with feasible manipulation trajectories and suitable compliance control parameters to enable safe and stable contact. However, learning these skills is challenging due to data inefficiency in the real world and the sim-to-real gap in simulation. In this paper, we introduce a hybr… ▽ More

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Conference on Robot Learning (CoRL) 2023

  36. arXiv:2310.09899  [pdf, other]

    cs.RO

    Generalizable whole-body global manipulation of deformable linear objects by dual-arm robot in 3-D constrained environments

    Authors: Mingrui Yu, Kangchen Lv, Changhao Wang, Yongpeng Jiang, Masayoshi Tomizuka, Xiang Li

    Abstract: Constrained environments are common in practical applications of manipulating deformable linear objects (DLOs), where movements of both DLOs and robots should be constrained. This task is high-dimensional and highly constrained owing to the highly deformable DLOs, dual-arm robots with high degrees of freedom, and 3-D complex environments, which render global planning challenging. Furthermore, accu… ▽ More

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Project website: https://mingrui-yu.github.io/DLO_planning_2

  37. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method… ▽ More

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  38. arXiv:2310.07932  [pdf, other]

    cs.RO cs.AI cs.CV

    What Matters to You? Towards Visual Representation Alignment for Robot Learning

    Authors: Ran Tian, Chenfeng Xu, Masayoshi Tomizuka, Jitendra Malik, Andrea Bajcsy

    Abstract: When operating in service of people, robots need to optimize rewards aligned with end-user preferences. Since robots will rely on raw perceptual inputs like RGB images, their rewards will inevitably use visual representations. Recently there has been excitement in using representations from pre-trained visual models, but key to making these work in robotics is fine-tuning, which is typically done… ▽ More

    Submitted 15 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  39. arXiv:2310.07218  [pdf, other]

    cs.MA cs.AI

    Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization

    Authors: Yuxin Chen, Chen Tang, Ran Tian, Chenran Li, Jinning Li, Masayoshi Tomizuka, Wei Zhan

    Abstract: Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL). The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario. A quantitative examination of this relationship sheds light on effectively training agents for diverse scenarios. In this study, we present the Level of Influence (LoI), a metric quantifyi… ▽ More

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 12 pages, 6 figures

    ACM Class: I.2.6

  40. arXiv:2310.03888  [pdf, other]

    cs.RO

    Frequency Domain Analysis of Nonlinear Series Elastic Actuator via Describing Function

    Authors: Motohiro Hirao, Burak Kurkcu, Alireza Ghanbarpour, Masayoshi Tomizuka

    Abstract: Nonlinear stiffness SEAs (NSEAs) inspired by biological muscles offer promise in achieving adaptable stiffness for assistive robots. While assistive robots are often designed and compared based on torque capability and control bandwidth, NSEAs have not been systematically designed in the frequency domain due to their nonlinearity. The describing function, an analytical concept for nonlinear system… ▽ More

    Submitted 24 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: accepted by 2023 IEEE ROBIO conference

  41. arXiv:2310.03026  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving

    Authors: Hao Sha, Yao Mu, Yuxuan Jiang, Li Chen, Chenfeng Xu, Ping Luo, Shengbo Eben Li, Masayoshi Tomizuka, Wei Zhan, Mingyu Ding

    Abstract: Existing learning-based autonomous driving (AD) systems face challenges in comprehending high-level information, generalizing to rare events, and providing interpretability. To address these problems, this work employs Large Language Models (LLMs) as a decision-making component for complex AD scenarios that require human commonsense understanding. We devise cognitive pathways to enable comprehensi…

    Submitted 13 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

  42. arXiv:2310.03023  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Human-oriented Representation Learning for Robotic Manipulation

    Authors: Mingxiao Huo, Mingyu Ding, Chenfeng Xu, Thomas Tian, Xinghao Zhu, Yao Mu, Lingfeng Sun, Masayoshi Tomizuka, Wei Zhan

    Abstract: Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks. We advocate that such a representation automatically arises from simultaneously learning about multiple simple perceptual skills that are critical for everyday scenarios (e.g., hand detection, state estimate, etc.) and is better suited fo…

    Submitted 4 October, 2023; originally announced October 2023.

  43. arXiv:2310.02648  [pdf, other]

    cs.RO

    Long-Term Dynamic Window Approach for Kinodynamic Local Planning in Static and Crowd Environments

    Authors: Zhiqiang Jian, Songyi Zhang, Lingfeng Sun, Wei Zhan, Nanning Zheng, Masayoshi Tomizuka

    Abstract: Local planning for a differential wheeled robot is designed to generate kinodynamic feasible actions that guide the robot to a goal position along the navigation path while avoiding obstacles. Reactive, predictive, and learning-based methods are widely used in local planning. However, few of them can fit static and crowd environments while satisfying kinodynamic constraints simultaneously. To solv…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 9 pages, 7 figures

    Journal ref: 2023 IEEE RA-L

  44. arXiv:2310.02625  [pdf, other]

    cs.RO

    Adaptive Spatio-Temporal Voxels Based Trajectory Planning for Autonomous Driving in Highway Traffic Flow

    Authors: Zhiqiang Jian, Songyi Zhang, Lingfeng Sun, Wei Zhan, Masayoshi Tomizuka, Nanning Zheng

    Abstract: Trajectory planning is crucial for the safe driving of autonomous vehicles in highway traffic flow. Currently, some advanced trajectory planning methods utilize spatio-temporal voxels to construct feasible regions and then convert trajectory planning into optimization problem solving based on the feasible regions. However, these feasible region construction methods cannot adapt to the changes in d…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 8 pages, 5 figures

    Journal ref: IEEE ITSC 2023

  45. arXiv:2310.02264  [pdf, other]

    cs.RO cs.CL cs.CV cs.LG

    Generalizable Long-Horizon Manipulations with Large Language Models

    Authors: Haoyu Zhou, Mingyu Ding, Weikun Peng, Masayoshi Tomizuka, Lin Shao, Chuang Gan

    Abstract: This work introduces a framework harnessing the capabilities of Large Language Models (LLMs) to generate primitive task conditions for generalizable long-horizon manipulations with novel objects and unseen tasks. These task conditions serve as guides for the generation and adjustment of Dynamic Movement Primitives (DMP) trajectories for long-horizon task execution. We further create a challenging…

    Submitted 3 October, 2023; originally announced October 2023.

  46. arXiv:2310.02262  [pdf, other]

    cs.CV cs.GR cs.RO

    RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving

    Authors: Tong Zhao, Chenfeng Xu, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Yintao Wei

    Abstract: This paper addresses the growing demands for safety and comfort in intelligent robot systems, particularly autonomous vehicles, where road conditions play a pivotal role in overall driving performance. For example, reconstructing road surfaces helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems. We introduce the Road Surface Reconstruction Data…

    Submitted 3 October, 2023; originally announced October 2023.

  47. arXiv:2310.01740  [pdf, other]

    cs.RO

    Control of Soft Pneumatic Actuators with Approximated Dynamical Modeling

    Authors: Wu-Te Yang, Burak Kurkcu, Motohiro Hirao, Lingfeng Sun, Xinghao Zhu, Zhizhou Zhang, Grace X. Gu, Masayoshi Tomizuka

    Abstract: This paper introduces a full system modeling strategy for a syringe pump and soft pneumatic actuators (SPAs). The soft actuator is conceptualized as a beam structure, utilizing a second-order bending model. The equation of natural frequency is derived from Euler's bending theory, while the damping ratio is estimated by fitting step responses of soft pneumatic actuators. Evaluation of model uncertai…

    Submitted 19 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 8 pages, 10 figures, accepted by 2023 IEEE ROBIO conference

  48. arXiv:2310.01614  [pdf, other]

    cs.RO cs.MA

    Distributed Multi-agent Interaction Generation with Imagined Potential Games

    Authors: Lingfeng Sun, Pin-Yun Hung, Changhao Wang, Masayoshi Tomizuka, Zhuo Xu

    Abstract: Interactive behavior modeling of multiple agents is an essential challenge in simulation, especially in scenarios when agents need to avoid collisions and cooperate at the same time. Humans can interact with others without explicit communication and navigate in scenarios when cooperation is required. In this work, we aim to model human interactions in this realistic setting, where each agent acts…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 8 pages, 7 figures

  49. arXiv:2309.17342  [pdf, other]

    cs.CV cs.LG

    Towards Free Data Selection with General-Purpose Models

    Authors: Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan

    Abstract: A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly. In this paper, we challenge this status quo by designing a dist…

    Submitted 14 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: accepted by NeurIPS 2023

  50. arXiv:2309.10121  [pdf, other]

    cs.CV

    Pre-training on Synthetic Driving Data for Trajectory Prediction

    Authors: Yiheng Li, Seth Z. Zhao, Chenfeng Xu, Chen Tang, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan

    Abstract: Accumulating substantial volumes of real-world driving data proves pivotal in the realm of trajectory forecasting for autonomous driving. Given the heavy reliance of current trajectory forecasting models on data-driven methodologies, we aim to tackle the challenge of learning general trajectory forecasting representations under limited data availability. We propose to augment both HD maps and traj…

    Submitted 19 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.