License: arXiv.org perpetual non-exclusive license
arXiv:2403.10309v1 [cs.RO] 15 Mar 2024

Revolutionizing Packaging: A Robotic Bagging Pipeline with Constraint-aware Structure-of-Interest Planning

Jiaming Qi$^{1}$, Peng Zhou$^{1}$, Pai Zheng$^{2}$, Hongmin Wu$^{3}$, Chenguang Yang$^{4}$, David Navarro-Alarcon$^{2}$, and Jia Pan$^{1}$

This work is supported by the Innovation and Technology Commission of the HKSAR Government under the InnoHK initiative. (Corresponding author: Jia Pan.)
$^{1}$The University of Hong Kong, Hong Kong. e-mail: [email protected]
$^{2}$The Hong Kong Polytechnic University, Hong Kong.
$^{3}$Guangdong Academy of Sciences, China.
$^{4}$University of Liverpool, United Kingdom.

Abstract

Bagging operations, common in packaging and assisted living applications, are challenging due to a bag’s complex deformable properties. To address this, we develop a robotic system for automated bagging tasks using an adaptive structure-of-interest (SOI) manipulation approach. Our method relies on real-time visual feedback to dynamically adjust manipulation without requiring prior knowledge of bag materials or dynamics. We present a robust pipeline featuring state estimation for SOIs using Gaussian Mixture Models (GMM), SOI generation via optimization-based bagging techniques, SOI motion planning with Constrained Bidirectional Rapidly-exploring Random Trees (CBiRRT), and dual-arm manipulation coordinated by Model Predictive Control (MPC). Experiments demonstrate the system’s ability to achieve precise, stable bagging of various objects using adaptive coordination of the manipulators. The proposed framework advances the capability of dual-arm robots to perform more sophisticated automation of common tasks involving interactions with deformable objects.

I Introduction

The field of deformable object manipulation (DOM) has garnered considerable attention for its potential to automate many advanced tasks in human environments. Everyday objects, from garments to soft furnishings, exhibit highly deformable behaviors that complicate their automatic handling. Providing robots with sufficient dexterity to manipulate this type of object is crucial for their seamless integration into daily human environments. However, because such objects have infinite degrees of freedom and nonlinear dynamics, most research has focused on simpler cases, such as 1-D and 2-D deformable bodies. The manipulation of complex 3-D deformable structures, such as common household bags (whose topology is modelled as a 2-torus), remains an underexplored problem in the robotics research community.

To address this gap in the literature, we introduce a dual-arm robotic system driven by constraint-aware structure-of-interest (SOI) planning, which extends DOM to 3-D objects. The system performs intricate tasks such as robotic bagging with precision and adaptability, a step towards more sophisticated automation in DOM research.

Refer to caption
Figure 1: The dual-arm robot grasps the two handles of a fabric bag to manipulate the SOI (i.e., the opening rim) for the bagging task.

Our approach is based on the insight that, similar to the concept of Region of Interest (ROI) in image processing, complete state estimation of a manipulated deformable object is not essential for robotic interaction. For a specific deformable object manipulation task, it is sufficient to estimate the state of only the critical structure-related components. Take, for instance, a robotic bagging task: the opening rim of the fabric bag can be considered the Structure of Interest (SOI). By concentrating state estimation on the opening rim alone, the robotic system can successfully accomplish the bagging task. This targeted approach simplifies state estimation and improves the efficiency and effectiveness of manipulation. The system is specifically designed to automate bagging, a common yet challenging operation in both industrial and everyday contexts. The core of our approach is the use of two robotic arms that work in unison, guided by a planning system that accounts for the constraints imposed by the object’s structure and the desired final state. This is achieved using 3D-printed connectors, which allow the robots to manipulate the bag with a high level of precision and stability.

Our contributions are as follows:

  • We propose a constraint-aware SOI planning framework that enables dual-arm robots to perform complex bagging tasks by manipulating a bag over an object to achieve a desired configuration.

  • We integrate an adaptive vision-based control system that does not require prior knowledge of the bag’s material properties or system dynamics, making the setup more flexible and broadly applicable.

  • We present a comprehensive methodological framework that encompasses SOI state estimation, bagging SOI generation, SOI planning, and motion planning, evidencing the system’s adaptability and sensitivity to environmental constraints.

II Related Work

The manipulation of deformable objects by robotic systems has been an area of increasing interest within the robotics community [1]. Early research efforts primarily addressed the manipulation of 1-D and 2-D deformable objects [2], using techniques such as tension-based strategies [3] and computational geometry [4] to model and control the behaviour of ropes, cloths, and sheets [5].

With the shift towards 3D deformable objects, researchers have explored various methods to handle increased complexity [6]. Notably, work by [7] delved into the dynamics of soft body manipulation using dual-arm robots, while [8] focused on non-prehensile manipulation techniques for cloth folding tasks [9]. Both approaches laid the groundwork for understanding the intricate interplay between robotic control and deformable object dynamics.

Recent advancements in vision-based control systems, such as those by [10], have shown how real-time feedback can enhance the adaptability of robots to the unpredictable nature of deformable objects [11, 12]. These works have informed the development of our constraint-aware SOI planning, which integrates real-time visual servoing to adjust the robot’s actions on the fly.

Our work builds on these foundational studies and takes a significant step forward by focusing on dual-arm manipulation for the specific task of robotic bagging, a complex application that has received limited attention thus far. We leverage the principles of constraint-aware planning to address the intricate problem of enveloping 3D objects with a deformable bag, which requires a high level of coordination and sensitivity to the dynamic constraints of the object and its environment.

Refer to caption
Figure 2: The dual-arm robot grasps the two handles of a deformable fabric bag to manipulate the SOI (i.e., the opening rim) for the bagging task.

III Problem Statement

Notation. The subscript $(\cdot)_{t}$ denotes the discrete-time instant. $\mathbf{I}_{n\times m}$ is the $n\times m$ matrix of ones, and $\mathbf{E}_{n}$ is the identity matrix. $\mathbf{L}_{n}$ is the lower triangular matrix of $\mathbf{I}_{n\times n}$, and $\otimes$ is the Kronecker product. ${}^{\mathcal{F}_{x}}\mathbf{p}$ is a point $\mathbf{p}$ expressed in the frame $\mathcal{F}_{x}$. In this work, unless otherwise specified, all points are expressed in the world frame $\mathcal{F}_{w}$, and the frame superscript is omitted for clarity.

In this work, we consider a novel robotic bagging task, as illustrated in Fig. 2. We propose a dual-arm robotic system in which two robot manipulators grasp a deformable fabric bag to envelop a baggable object, denoted by $\mathbf{B}$, which is suspended in the air. The system’s objective is to manipulate the fabric bag from an initial state to a goal state in which the bag completely covers the object. We posit that it is unnecessary to estimate the state of the entire fabric bag; instead, focusing on the Structure of Interest (SOI) of the bag, specifically the bag’s opening rim, is sufficient for this task. Consequently, we define the SOI state of the bagging task as the opening rim of the bag, represented as $\mathbf{x}$, which is captured by a depth camera in an eye-to-hand configuration. In contrast to [13], which uses the entire point cloud as the SOI, we opt for a simpler representation by selecting contour keypoints to depict our SOI:

$$\mathbf{x}=\left[\mathbf{x}_{1}^{\intercal},\ldots,\mathbf{x}_{n_{x}}^{\intercal}\right]^{\intercal}\in\mathbb{R}^{3n_{x}},\quad\mathbf{x}_{i}=\left[x_{i},y_{i},z_{i}\right]^{\intercal}\in\mathbb{R}^{3} \tag{1}$$

where $n_{x}$ denotes the number of contour keypoints, and $\mathbf{x}_{i}$ represents the Cartesian coordinates of the $i$-th point in $\mathcal{F}_{w}$.

The task’s goal is to manipulate the SOI of the bag from its initial state $\mathbf{x}_{0}$ to the target state $\mathbf{x}^{*}$. We address this problem by planning with constraints: 1) the SOI keypoints should approximate the shape of an oval; 2) the perimeter formed by the SOI keypoints must remain constant, indicating that the size of the fabric bag’s opening rim does not change during manipulation. Depending on whether the bag is in contact with the object $\mathbf{B}$, we divide the planning into pre-bagging and bagging stages:

$$\mathcal{G}:=\left[\mathcal{G}_{\text{pre-bagging}},\mathcal{G}_{\text{bagging}}\right]\Big|\underbrace{\mathbf{g}_{0},\mathbf{g}_{1},\ldots,\mathbf{g}^{\dagger}}_{\text{pre-bagging}},\underbrace{\mathbf{g}^{\dagger},\ldots,\mathbf{g}^{\ast}}_{\text{bagging}}, \tag{2}$$

where $\mathbf{g}^{\dagger}$ is an SOI shape tailored to the baggable object $\mathbf{B}$ that can perfectly envelop its bottom. To reach each subgoal $\mathbf{g}_{i}$, we employ an MPC-based shape servoing approach to generate the velocity command $\mathbf{u}$ based on a measurable error function $\mathcal{E}_{\text{subgoal}}(\cdot)$ between the resulting SOI state $\mathbf{x}_{i}$ and the current subgoal $\mathbf{g}_{i}$:

$$\mathbf{u}_{i}=\underset{\mathbf{u}\in\mathcal{A}}{\arg\min}\ \mathcal{E}_{\text{subgoal}}(\mathbf{x}_{i},\mathbf{g}_{i}). \tag{3}$$

The bagging task is thus accomplished through a sequence of actions $\{\mathbf{u}_{1},\mathbf{u}_{2},\ldots,\mathbf{u}^{*}\}$.
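
As a minimal illustration of the SOI representation in (1) and the constant-perimeter constraint, the following Python sketch stacks contour keypoints into the vector $\mathbf{x}$ and measures the length of the closed contour they form. The function names, the tolerance `tol`, and the squared-distance form of the subgoal error are illustrative assumptions; the text above does not specify the form of $\mathcal{E}_{\text{subgoal}}$.

```python
import numpy as np


def stack_keypoints(keypoints):
    """Stack n_x keypoints of shape (n_x, 3) into the SOI vector x of length 3*n_x, as in Eq. (1)."""
    return np.asarray(keypoints, dtype=float).reshape(-1)


def contour_perimeter(x):
    """Length of the closed polyline through the keypoints stored in x."""
    pts = np.asarray(x, dtype=float).reshape(-1, 3)
    # np.roll pairs each keypoint with its successor and the last one with the first.
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.linalg.norm(edges, axis=1).sum())


def perimeter_preserved(x, rim_length, tol=1e-2):
    """Check the constant-perimeter constraint up to an assumed tolerance (same unit as x)."""
    return abs(contour_perimeter(x) - rim_length) <= tol


def subgoal_error(x, g):
    """Assumed error between the SOI state x and a subgoal g: squared Euclidean distance."""
    diff = np.asarray(x, dtype=float) - np.asarray(g, dtype=float)
    return float(diff @ diff)


# Hypothetical rim with four keypoints on a unit square.
square = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
x = stack_keypoints(square)
print(x.shape, contour_perimeter(x), perimeter_preserved(x, rim_length=4.0))
```

A planner could reject any candidate subgoal for which `perimeter_preserved` is false, since such a shape would require the rim to stretch or shrink.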

Refer to caption
Figure 3: Schematic diagram of the dual-arm manipulation approach for the bagging task. The algorithm commands the robot to manipulate the bag into the specified shape so that it covers the bottom part of $\mathbf{B}$, i.e., to bag the object. Experiments are conducted in Cartesian space.

IV Methodology

In this section, we propose a dual-arm manipulation approach for the bagging task, which comprises four modules: 1) SOI State Estimation: extracting a meaningful representation of the Structure of Interest (SOI) from the raw, dense, and noisy point cloud. 2) Bagging SOI Generation: generating a pre-enclosing shape $\mathbf{x}_{\ast}$ that covers the bottom part of $\mathbf{B}$. 3) SOI Planning: generating a collision-free deformation path $\mathcal{G}$ from $\mathbf{x}_{0}$ to $\mathbf{x}_{\ast}$. 4) Motion Planning: formulating the bagging process as shape servoing and driving the dual-arm robot along $\mathcal{G}$ to complete the bagging task. Fig. 3 presents the block diagram of the proposed manipulation approach.

IV-A SOI State Estimation

In this work, the bag’s SOI state is defined as a sequence of contour keypoints, i.e., $\mathcal{Q}_{t}=\{\mathbf{x}_{t}^{i}\},\ i\in[1,n_{x}]$. The raw point cloud perceived by the depth camera is $\mathcal{P}_{t}=\{\mathbf{p}_{t}^{i}\},\ i\in[1,n_{p}]$, where usually $n_{p}\gg n_{x}$. The state estimator aims to obtain the concise representation $\mathbf{x}_{t}^{i}$ by aligning $\mathcal{Q}_{t}$ to $\mathcal{P}_{t}$ in real time.

We adopt the approach in [14], i.e., structure preserved registration (SPR), which formulates the alignment process as a probability density estimation problem for a Gaussian mixture model (GMM). $\mathcal{P}_{t}$ is treated as a set of points randomly sampled from the GMM, and the centroids of the Gaussian components are taken as $\mathcal{Q}_{t}$. Considering that $\mathcal{P}_{t}$ is dense, noisy, and contains outliers, a uniform distribution is added to the GMM as the $(n_{x}+1)$-th mixture component. The sampling probability of $\mathbf{p}^{m}_{t}$ is:

$$\kappa(\mathbf{p}^{m}_{t})=\sum_{n=1}^{n_{x}+1}\kappa(n)\,\kappa(\mathbf{p}^{m}_{t}\,|\,n) \tag{4}$$

where $\kappa(n)$ is the sampling weight of the $n$-th mixture component, and $\kappa(\mathbf{p}^{m}_{t}\,|\,n)$ denotes the sampling probability of $\mathbf{p}^{m}_{t}$ from the $n$-th mixture component. Both are given in [14].

The optimal estimate of $\mathcal{Q}_{t}$ can be obtained by maximizing the log-likelihood function $\mathcal{O}$ of the observation process:

$$\mathcal{O}(\mathbf{x}_{t}^{n})=\sum_{m=1}^{n_{p}}\sum_{n=1}^{n_{x}+1}\kappa(n\,|\,\mathbf{p}_{t}^{m})\log\left(\kappa(n)\,\kappa(\mathbf{p}_{t}^{m}\,|\,n)\right) \tag{5}$$

The maximization of (5) can be carried out with the EM algorithm [14]. The optimal result $\mathbf{x}_{t}^{n,\ast}$ is regarded as the concise SOI representation of the bag. Fig. 4 visualizes the GMM-based representation, where the black dots are $\mathcal{P}_{t}$, and the red dots connected by the blue line are the estimated $\mathcal{Q}_{t}$.
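
To make the EM iteration concrete, the following Python sketch fits keypoint centroids to a point cloud using a GMM with one extra uniform component for outliers. It is a simplified illustration, not the SPR method of [14]: it omits the structure-preserving regularization, and it assumes equal mixing weights, a shared isotropic variance, and a uniform density of $1/n_{p}$ (a convention borrowed from Coherent Point Drift). The function name `em_keypoints` and its parameters are hypothetical.

```python
import numpy as np


def em_keypoints(points, init_centroids, outlier_weight=0.1, n_iters=50):
    """Estimate centroids Q (n_x, d) from points P (n_p, d) by EM on a GMM plus a uniform term."""
    P = np.asarray(points, dtype=float)
    Q = np.array(init_centroids, dtype=float)  # copied, so the caller's array is untouched
    n_p, d = P.shape
    n_x = Q.shape[0]

    def sq_dists(centroids):
        # Squared distance between every point and every centroid, shape (n_p, n_x).
        return ((P[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

    sigma2 = sq_dists(Q).sum() / (d * n_p * n_x)  # initial shared variance
    uniform = outlier_weight / n_p                # assumed outlier density

    for _ in range(n_iters):
        # E-step: posterior of each Gaussian component given each point.
        norm = (2.0 * np.pi * sigma2) ** (d / 2.0)
        gauss = (1.0 - outlier_weight) / n_x * np.exp(-sq_dists(Q) / (2.0 * sigma2)) / norm
        post = gauss / (gauss.sum(axis=1, keepdims=True) + uniform)

        # M-step: centroids become posterior-weighted means; the variance is refit.
        mass = post.sum(axis=0)
        has_mass = mass > 1e-12  # leave a centroid in place if no point supports it
        Q[has_mass] = (post.T @ P)[has_mass] / mass[has_mass, None]
        sigma2 = max((post * sq_dists(Q)).sum() / (d * post.sum()), 1e-9)

    return Q, sigma2


# Synthetic cloud: noisy samples around four hypothetical rim keypoints.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
cloud = np.vstack([c + 0.02 * rng.standard_normal((50, 3)) for c in centers])
keypoints, variance = em_keypoints(cloud, centers + 0.1)
```

In a tracking loop, the estimate from the previous frame would be a natural choice for `init_centroids`, which keeps the keypoint ordering consistent over time.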

Refer to caption
Figure 4: The visual description of the GMM-based representation.
Refer to caption
Figure 5: (a) Visualization of the bagging SOI. (b) Projection of $\mathcal{F}_{m}$ in $\mathcal{F}_{w}$, with the three constraints visualized. The red square is the centroid of ${}^{\mathcal{F}_{m}}\Omega^{v}$ and the blue one is that of ${}^{\mathcal{F}_{m}}\Omega^{e}$.

IV-B Bagging SOI Generation

This section introduces how to generate two shapes: a bagging SOI $\mathbf{x}_{\dagger}$ covering the bottom of $\mathbf{B}$, and the goal SOI $\mathbf{x}_{\ast}$ surrounding the entire lower part of $\mathbf{B}$. As the dual-arm robot manipulates the bag in an almost symmetrical way, an elliptical configuration is used as a reference to determine $\mathbf{x}_{\dagger}$ and $\mathbf{x}_{\ast}$. In this work, the bottom vertex points $\mathbf{V}=[\mathbf{v}_{1},\ldots,\mathbf{v}_{n_{v}}]$ of $\mathbf{B}$ are assumed to be coplanar, and $\mathbf{v}_{i}\in\mathbb{R}^{3}$ is the Cartesian coordinate of the $i$-th vertex point in $\mathcal{F}_{w}$. The key idea is to convert the SOI generation in $\mathcal{F}_{w}$ into a 2D generation problem in the $xy$-plane of the mapping frame $\mathcal{F}_{m}$, which is built on the plane where $\mathbf{V}$ is located.

Step 1: Calculate $\mathcal{F}_{m}$ on the plane containing $\mathbf{V}$.

Two auxiliary vectors are given as $\xi_{i}=\mathbf{v}_{i}-\bar{\mathbf{v}},\ i=1,2$, where $\bar{\mathbf{v}}$ is the centroid of $\mathbf{V}$. The $z$-axis of $\mathcal{F}_{m}$ is calculated as $\mathbf{a}=(\xi_{1}\times\xi_{2})/\|\xi_{1}\times\xi_{2}\|$. Any vertex can be selected to determine the $y$-axis; here, $\mathbf{o}=(\mathbf{v}_{3}-\bar{\mathbf{v}})/\|\mathbf{v}_{3}-\bar{\mathbf{v}}\|$. Then, the $x$-axis is given as $\mathbf{n}=(\mathbf{o}\times\mathbf{a})/\|\mathbf{o}\times\mathbf{a}\|$. For the uniqueness of $\mathcal{F}_{m}$, $\mathbf{p}=\bar{\mathbf{v}}$ is taken as the origin of $\mathcal{F}_{m}$. The transformation matrix from $\mathcal{F}_{w}$ to $\mathcal{F}_{m}$ is constructed as:

$${}^{\mathcal{F}_{w}}\mathbf{T}_{\mathcal{F}_{m}}=\left[\begin{array}{cccc}\mathbf{n}&\mathbf{o}&\mathbf{a}&\mathbf{p}\\ 0&0&0&1\end{array}\right]\in\mathbb{R}^{4\times 4} \tag{6}$$

Adopting (6) to map $\mathbf{V}$ into $\mathcal{F}_{m}$ yields ${}^{\mathcal{F}_{m}}\mathbf{V}=\{{}^{\mathcal{F}_{m}}\mathbf{v}_{i}\}\in\mathbb{R}^{n_{v}\times 3}$, where ${}^{\mathcal{F}_{m}}\mathbf{v}_{i}=[{}^{\mathcal{F}_{m}}v_{i,x},{}^{\mathcal{F}_{m}}v_{i,y},{}^{\mathcal{F}_{m}}v_{i,z}]$; the centroid of ${}^{\mathcal{F}_{m}}\mathbf{V}$ is denoted as ${}^{\mathcal{F}_{m}}\bar{\mathbf{v}}$. Normally, the $z$-coordinates of ${}^{\mathcal{F}_{m}}\mathbf{V}$ are close to zero, as $\mathbf{V}$ is assumed to be coplanar.
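
Step 1 can be sketched in a few lines of Python. The snippet builds the axes $\mathbf{n}$, $\mathbf{o}$, $\mathbf{a}$ and the origin $\mathbf{p}$ from the vertex array and assembles the matrix in (6). It assumes that the columns of the matrix express the axes of $\mathcal{F}_{m}$ in $\mathcal{F}_{w}$, so world points are mapped into $\mathcal{F}_{m}$ with its inverse; the function names are hypothetical.

```python
import numpy as np


def mapping_frame(V):
    """Build the 4x4 matrix of Eq. (6) from coplanar vertices V of shape (n_v, 3), n_v >= 3."""
    V = np.asarray(V, dtype=float)
    v_bar = V.mean(axis=0)                  # centroid, used as the origin p
    xi1, xi2 = V[0] - v_bar, V[1] - v_bar   # auxiliary vectors
    a = np.cross(xi1, xi2)                  # plane normal -> z-axis
    a /= np.linalg.norm(a)                  # degenerate if v_1, v_2 and the centroid are collinear
    o = V[2] - v_bar                        # in-plane direction -> y-axis
    o /= np.linalg.norm(o)
    n = np.cross(o, a)                      # completes the right-handed frame -> x-axis
    n /= np.linalg.norm(n)
    T = np.eye(4)
    T[:3, 0], T[:3, 1], T[:3, 2], T[:3, 3] = n, o, a, v_bar
    return T


def to_mapping_frame(T, V):
    """Express world-frame points in the mapping frame (assumed convention: inverse of T)."""
    V = np.asarray(V, dtype=float)
    V_h = np.hstack([V, np.ones((len(V), 1))])  # homogeneous coordinates
    return (np.linalg.inv(T) @ V_h.T).T[:, :3]


# Hypothetical object bottom: a unit square lying in the plane z = 2.
V = [[0, 0, 2], [1, 0, 2], [1, 1, 2], [0, 1, 2]]
T = mapping_frame(V)
V_m = to_mapping_frame(T, V)
print(np.round(V_m, 6))
```

For coplanar vertices, the third column of `V_m` is zero up to rounding, which is what allows the ellipse to be generated purely in the $xy$-plane of $\mathcal{F}_{m}$.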

Step 2: Calculate the bagging ellipse in the $xy$-plane of $\mathcal{F}_{m}$.

The 2D ellipse parametric equation is constructed as:

$$\begin{aligned}x&=\tau_{x}+\rho_{a}\cos(\theta)\cos(\alpha)-\rho_{b}\sin(\theta)\sin(\alpha)\\ y&=\tau_{y}+\rho_{a}\cos(\theta)\sin(\alpha)+\rho_{b}\sin(\theta)\cos(\alpha)\end{aligned} \tag{7}$$

where $(\tau_{x},\tau_{y})$ is the center of the ellipse, $\rho_{a},\rho_{b}$ are the semi-axis lengths, $\theta\in[0,2\pi]$ is the parameter, and $\alpha$ is the rotation angle. Letting $\theta_{i}=2\pi i/1800,\ i\in[1800]$, the generated 2D ellipse is given as ${}^{\mathcal{F}_{m}}\Omega^{e}:=\{(x_{i},y_{i})\,|\,\theta_{i}\}\in\mathbb{R}^{1800\times 2}$, and the $xy$-coordinates of ${}^{\mathcal{F}_{m}}\mathbf{V}$ are extracted as ${}^{\mathcal{F}_{m}}\Omega^{v}\in\mathbb{R}^{n_{v}\times 2}$.
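
The sampling of the parametric ellipse in (7) can be sketched as follows. The function name `sample_ellipse` and the numerical values in the usage lines are illustrative assumptions; the number of samples follows the 1800 used in the text.

```python
import numpy as np


def sample_ellipse(tau_x, tau_y, rho_a, rho_b, alpha, n_samples=1800):
    """Sample the rotated ellipse of Eq. (7) at theta_i = 2*pi*i/n_samples, i = 1..n_samples."""
    i = np.arange(1, n_samples + 1)
    theta = 2.0 * np.pi * i / n_samples
    x = tau_x + rho_a * np.cos(theta) * np.cos(alpha) - rho_b * np.sin(theta) * np.sin(alpha)
    y = tau_y + rho_a * np.cos(theta) * np.sin(alpha) + rho_b * np.sin(theta) * np.cos(alpha)
    return np.column_stack([x, y])  # shape (n_samples, 2)


# Hypothetical ellipse: center (0.1, -0.2), semi-axes 0.3 and 0.2, rotated by 30 degrees.
omega_e = sample_ellipse(0.1, -0.2, 0.3, 0.2, np.pi / 6)
print(omega_e.shape, omega_e.mean(axis=0))
```

Because the samples are uniform in $\theta$ over a full period, their mean coincides with the ellipse center, which is a convenient sanity check on the generated points.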

The 2D ellipse standard equation is constructed as:

$$ {}^{\mathcal{F}_m}f_s(x, y) = \frac{\left((x - \tau_x)\cos\alpha + (y - \tau_y)\sin\alpha\right)^2}{\rho_a^2} + \frac{\left((\tau_x - x)\sin\alpha + (y - \tau_y)\cos\alpha\right)^2}{\rho_b^2} \geq 0 \tag{8} $$

Whether a point $(x, y)$ lies inside the ellipse can be judged by (8), which is used to construct the subsequent constraints. Let $\omega$ denote the perimeter of the bag’s rim. The cost function that enforces the perimeter limitation is constructed as:

$$ \mathcal{J}_1(\tau_x, \tau_y, \rho_a, \rho_b, \alpha) = \left\| 2\pi\sqrt{(\rho_a^2 + \rho_b^2)/2} - \omega \right\|^2 \tag{9} $$
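To make (9) concrete, the following Python sketch evaluates the perimeter cost for given axis lengths. The function name `perimeter_cost` is hypothetical. The term $2\pi\sqrt{(\rho_a^2 + \rho_b^2)/2}$ is a common closed-form approximation of the ellipse perimeter and is exact when the ellipse is a circle.

```python
import math

def perimeter_cost(rho_a, rho_b, omega):
    """Squared mismatch between the approximate ellipse perimeter and omega, as in (9)."""
    approx_perimeter = 2.0 * math.pi * math.sqrt((rho_a ** 2 + rho_b ** 2) / 2.0)
    return (approx_perimeter - omega) ** 2
```

For a unit circle and a rim perimeter of $2\pi$, the cost vanishes; any mismatch between the approximate perimeter and $\omega$ is penalized quadratically.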

Further, three additional constraints are defined as:

Constraint $C_1$: it regulates how ${}^{\mathcal{F}_m}\Omega^e$ covers ${}^{\mathcal{F}_m}\Omega^v$:

$$ 0 \leq {}^{\mathcal{F}_m}f_s({}^{\mathcal{F}_m}v_{i,x}, {}^{\mathcal{F}_m}v_{i,y}) \leq \lambda_1, \quad i \in [1, \ldots, n_v] \tag{10} $$

where $\lambda_1$ controls the enclosing degree. The smaller $\lambda_1$ is, the more pronounced the center alignment between ${}^{\mathcal{F}_m}\Omega^e$ and ${}^{\mathcal{F}_m}\Omega^v$. The default value is $\lambda_1 = 0.87$.

Constraint $C_2$: it limits the Euclidean distance between the centers of ${}^{\mathcal{F}_m}\Omega^e$ and ${}^{\mathcal{F}_m}\Omega^v$:

$$ 0 \leq \left\| [\tau_x, \tau_y, 0] - {}^{\mathcal{F}_m}\bar{\mathbf{v}} \right\| \leq \lambda_2 \tag{11} $$

where $\lambda_2$ specifies the proximity of the two centers and has a control effect similar to that of $\lambda_1$. The default value is $\lambda_2 = 0.003$.

Constraint $C_3$: it adjusts the parallelism of the respective principal axes of ${}^{\mathcal{F}_m}\Omega^e$ and ${}^{\mathcal{F}_m}\Omega^v$, denoted as $\eta_e \in \mathbb{R}^2$ and $\eta_v \in \mathbb{R}^2$ and calculated by PCA [15]. The inner product is then used to evaluate the parallelism:

$$ -\lambda_3 \leq |\mathrm{dot}(\eta_e, \eta_v)| - 1 \leq \lambda_3 \tag{12} $$

where $\lambda_3$ controls the degree of parallelism. Since only the parallelism matters and the direction (same or opposite) is ignored, we take the absolute value of the inner product and subtract 1. The default value is $\lambda_3 = 0.0001$.
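The three constraints can be checked for a candidate parameter set with a short Python sketch such as the one below. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the vertices are assumed to have zero $z$-coordinate in $\mathcal{F}_m$ (so (11) reduces to a 2D distance), and PCA is replaced by the equivalent closed-form principal direction of a $2 \times 2$ covariance matrix.

```python
import math

def f_s(x, y, tau_x, tau_y, rho_a, rho_b, alpha):
    """Implicit ellipse function of (8): below 1 inside, equal to 1 on the boundary."""
    dx, dy = x - tau_x, y - tau_y
    along = dx * math.cos(alpha) + dy * math.sin(alpha)
    across = -dx * math.sin(alpha) + dy * math.cos(alpha)
    return along ** 2 / rho_a ** 2 + across ** 2 / rho_b ** 2

def centroid_2d(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def principal_axis(points):
    """Unit direction of largest variance (closed form for 2D covariance)."""
    mx, my = centroid_2d(points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(angle), math.sin(angle))

def check_constraints(params, ellipse_pts, vertices,
                      lam1=0.87, lam2=0.003, lam3=1e-4):
    """Return (C1, C2, C3) as booleans for params = (tau_x, tau_y, rho_a, rho_b, alpha)."""
    tau_x, tau_y = params[0], params[1]
    # C1, eq. (10): every vertex lies well inside the ellipse.
    c1 = all(0.0 <= f_s(vx, vy, *params) <= lam1 for vx, vy in vertices)
    # C2, eq. (11): ellipse center close to the vertex centroid.
    vx_bar, vy_bar = centroid_2d(vertices)
    c2 = math.hypot(tau_x - vx_bar, tau_y - vy_bar) <= lam2
    # C3, eq. (12): principal axes parallel up to sign.
    eta_e, eta_v = principal_axis(ellipse_pts), principal_axis(vertices)
    dot = eta_e[0] * eta_v[0] + eta_e[1] * eta_v[1]
    c3 = -lam3 <= abs(dot) - 1.0 <= lam3
    return c1, c2, c3
```

In a constrained optimizer, each boolean would typically be replaced by the corresponding inequality residual; the boolean form is used here only to keep the example short.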

The optimal values $(\tau_x^*, \tau_y^*, \rho_a^*, \rho_b^*, \alpha^*)$ are obtained by minimizing $\mathcal{J}_1$ subject to the three constraints $C_1, C_2, C_3$. A nonlinear optimizer is adopted to solve this problem, which gives ${}^{\mathcal{F}_m}\Omega^{e,\ast} := \{(x_i, y_i) \,|\, \tau_x^*, \tau_y^*, \rho_a^*, \rho_b^*, \alpha^*\}$. A zero column is then concatenated horizontally to make ${}^{\mathcal{F}_m}\Omega^{e,\ast}$ three-dimensional.

Step 3: Bagging SOI Generation.

Similarly, (6) is adopted to map ${}^{\mathcal{F}_m}\Omega^{e,\ast}$ into $\mathcal{F}_w$ and obtain ${}^{\mathcal{F}_w}\Omega^{e,\ast}$, whose dimension should be consistent with that of $\mathbf{x}$. Thus, farthest point sampling (FPS) [16] extracts $n_x$ samples from ${}^{\mathcal{F}_w}\Omega^{e,\ast}$ to obtain:

$$ \mathbf{x}_{\dagger} = \mathrm{FPS}({}^{\mathcal{F}_w}\Omega^{e,\ast}, \ n_x) \in \mathbb{R}^{n_x \times 3} \tag{13} $$
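A minimal Python sketch of farthest point sampling, as used in (13), is given below. It is a generic textbook variant; the function name and the choice of the first point (`start_index`) are assumptions, since the paper does not state how the seed point is selected.

```python
import math

def farthest_point_sampling(points, n_samples, start_index=0):
    """Greedily pick n_samples points that are mutually far apart."""
    if not 0 < n_samples <= len(points):
        raise ValueError("n_samples must be in 1..len(points)")
    selected = [start_index]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start_index]) for p in points]
    while len(selected) < n_samples:
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        for j, p in enumerate(points):
            d = math.dist(p, points[next_index])
            if d < min_dist[j]:
                min_dist[j] = d
    return [points[i] for i in selected]
```

Note that the samples are returned in selection order, not in order along the ellipse; a downstream step that relies on point ordering would need to re-sort them.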

Step 4: Goal SOI Generation.

Note that $\mathbf{x}_{\dagger}$ is coplanar with $\mathbf{V}$ and is regarded as a transient shape, i.e., $\mathbf{x}_{\dagger}$ lies at the bottom of $\mathbf{B}$. Our goal is to generate a shape that surrounds the bottom part of $\mathbf{B}$, so we simply translate $\mathbf{x}_{\dagger}$ along $\mathbf{a}$ by a safety threshold $\gamma$ to get $\mathbf{x}_{\ast}$, which yields

$$ \mathbf{x}_{\ast} = \mathbf{x}_{\dagger} + \gamma \cdot \mathbf{a} \tag{14} $$

Finally, $\mathbf{x}_{\ast}$ is the goal SOI, covering the bottom part of $\mathbf{B}$. Fig. 5 visualizes the bagging/goal SOI and the three constraints.

Remark 1.

In this work, $\mathbf{a}$ should form an acute angle with the positive direction of the $z$-axis, which can be checked by calculating $\mathrm{dot}(\mathbf{a}, [0, 0, 1])$. If the result is negative, $\mathbf{a}$ is reversed.
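The translation in (14) together with the sign check of Remark 1 can be sketched in a few lines of Python. The function name `goal_soi` is illustrative, and $\mathbf{a}$ is assumed to be a unit vector so that $\gamma$ is a metric offset.

```python
def goal_soi(x_dagger, a, gamma):
    """Translate the bagging SOI along a by gamma, flipping a if it points against +z."""
    # dot(a, [0, 0, 1]) equals the z-component of a.
    if a[2] < 0:
        a = [-component for component in a]
    return [[p[k] + gamma * a[k] for k in range(3)] for p in x_dagger]
```

For example, with $\mathbf{a} = [0, 0, -1]$ the vector is reversed first, so every point is lifted by $\gamma$ along the positive $z$-axis.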

Figure 6: Illustration of SOI planning. $\mathbf{x}_0$ and $\mathbf{x}_{\dagger}$ are the initial and bagging SOI. $\mathbf{x}_{\ast}$ is the goal SOI covering the bottom part of $\mathbf{B}$.

IV-C SOI Planning

In this section, we introduce how to generate a collision-free deformation path $\mathcal{G}$ of the bag from the initial SOI $\mathbf{x}_0$ to the goal SOI $\mathbf{x}_{\ast}$ via the bagging SOI $\mathbf{x}_{\dagger}$, which serves as the pre-enclosing configuration of $\mathbf{B}$. Note that $\mathcal{G}$ includes two stages, $\mathbf{x}_0 \rightarrow \mathbf{x}_{\dagger}$ and $\mathbf{x}_{\dagger} \rightarrow \mathbf{x}_{\ast}$. Then, $\mathcal{G}$ serves as the desired trajectory of the subsequent controller to assist in completing the bagging task. The SOI planning is conducted in the world frame $\mathcal{F}_w$. The studied planning task can be seen as shape planning of the bag’s SOI, which yields a global trajectory guiding the robot.

The 3D ellipse parametric equation is constructed as:

$$ {}^{\mathcal{F}_w}f_p(\mathbf{c}, \beta_a, \beta_b, \mathbf{u}, \mathbf{v}) = \mathbf{c} + \beta_a \cos(\theta)\mathbf{u} + \beta_b \sin(\theta)\mathbf{v} \tag{15} $$

where $\mathbf{c} = [c_x, c_y, c_z]$ is the centroid, $\beta_a$ and $\beta_b$ determine the semi-major and semi-minor axis lengths, respectively, $\theta \in [0, 2\pi]$ is the parametric angle, and $\mathbf{u} \in \mathbb{R}^3, \mathbf{v} \in \mathbb{R}^3$ are the direction vectors. Let $\theta_i = 2\pi i/2000$, $i \in [1, 2000]$. The 3D ellipse in $\mathcal{F}_w$ is constructed as:

$$ \Upsilon_e := \{\Upsilon_i = (x_i, y_i, z_i) \,|\, \theta_i \leftarrow \text{(15)}\} \in \mathbb{R}^{2000 \times 3} \tag{16} $$

The perimeter of $\Upsilon_e$ is numerically obtained as $\chi = \sum_{i=2}^{2001} \|\Upsilon_i - \Upsilon_{i-1}\|$, with $\Upsilon_{2001} = \Upsilon_1$ to close the loop.
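The sampling in (15) and (16) and the numerical perimeter $\chi$ can be sketched in Python as follows. Function names are illustrative; $\mathbf{u}$ and $\mathbf{v}$ are assumed to be orthonormal, and the perimeter is computed as the length of the closed polyline through the samples.

```python
import math

def sample_ellipse_3d(c, beta_a, beta_b, u, v, n=2000):
    """Sample n points of the 3D ellipse (15) at theta_i = 2*pi*i/n, i = 1..n."""
    points = []
    for i in range(1, n + 1):
        theta = 2.0 * math.pi * i / n
        points.append(tuple(
            c[k] + beta_a * math.cos(theta) * u[k] + beta_b * math.sin(theta) * v[k]
            for k in range(3)
        ))
    return points

def closed_perimeter(points):
    """Length of the closed polyline: consecutive segments plus last-to-first."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))
```

With 2000 samples, the polyline length of a unit circle differs from $2\pi$ by only a few parts per million, which is well below the tolerances used in the constraints of this section.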

Figure 7: Visualization of mapping a randomly sampled configuration in the raw space onto the stable manifold. The blue dotted line represents the randomly sampled configuration, and the yellow one is the stable configuration on the manifold obtained through ProjectStableConfig.

Projection of Stable Configuration Manifold

The bag’s raw configuration space, in which $\mathbf{x}_t$ lives, has a dimensionality of $3n_x$. However, its stable states are confined to a specific subspace of this larger space, known as a manifold. Performing the planning on this manifold, which contains the stable states of the bag, therefore enhances the credibility of the plan. It is challenging to reach this manifold through random sampling in the raw space, as the dimension of the raw space significantly exceeds that of the stable space. To find this constraint manifold from the current shape configuration, a projection method is employed, which allows for a more targeted exploration of the stable states [17].

A random sample in the raw space is projected onto the stable manifold by formulating the projection as a local energy minimization problem, with $\mathbf{x}_t$ as the initial value:

$$ \mathbf{x}_t^{\mathrm{stable}} = \mathop{\arg\min}_{\mathbf{x}} \ \mathcal{J}_2(\mathbf{x}), \quad \mathrm{s.t.} \ \ \mathbf{x} = \mathbf{x}_t \tag{17} $$

A geometric index is used as the projection model, where the cost function $\mathcal{J}_2$ for a configuration $\mathbf{x}$ is:

$$ \mathcal{J}_2(\mathbf{c}, \beta_a, \beta_b, \mathbf{u}, \mathbf{v}) = |\mathrm{CD}(\Upsilon_e, \mathbf{x}_t)|^2 \tag{18} $$

where $\mathrm{CD}(\cdot)$ is the Chamfer Distance, which evaluates the similarity between two unordered point sets of different sizes.
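For illustration, a simple Python sketch of the Chamfer Distance and the cost (18) is given below. Several Chamfer variants exist (summed or averaged, squared or unsquared distances); the symmetric, averaged, unsquared form used here is an assumption, as the paper does not specify which variant it adopts. Function names are hypothetical.

```python
import math

def chamfer_distance(set_a, set_b):
    """Symmetric Chamfer Distance: mean nearest-neighbor distance in both directions."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(set_a, set_b) + one_way(set_b, set_a)

def projection_cost(ellipse_points, x_t):
    """Cost J_2 of (18): squared Chamfer Distance between the ellipse samples and x_t."""
    return chamfer_distance(ellipse_points, x_t) ** 2
```

The brute-force nearest-neighbor search costs $O(|A||B|)$ per evaluation, which is acceptable for a few thousand points but would usually be replaced by a spatial index inside an optimization loop.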

Constraint $C_4$: it ensures that the perimeter $\chi$ of $\Upsilon_e$ on the manifold is consistent with the bag rim’s perimeter $\omega$:

$$ 1 - \lambda_4 \leq \chi / \omega \leq 1 + \lambda_4 \tag{19} $$

where $\lambda_4$ controls the allowed deviation of the ratio between $\chi$ and the ground-truth perimeter $\omega$. The default value is set to $\lambda_4 = 0.001$.

Constraint $C_5$: it ensures that the projection is completed without large offsets:

$$ 0 \leq \|\mathbf{c} - \bar{\mathbf{x}}_t\| < \lambda_5 \tag{20} $$

where $\bar{\mathbf{x}}_t$ is the centroid of $\mathbf{x}_t$. The default value is $\lambda_5 = 0.01$.
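Constraints (19) and (20) are simple to evaluate once $\chi$, $\omega$, $\mathbf{c}$ and $\mathbf{x}_t$ are available. The following Python sketch, with hypothetical function names, shows one way to check them.

```python
import math

def centroid_3d(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def check_c4_c5(chi, omega, c, x_t, lam4=0.001, lam5=0.01):
    """Return (C4, C5): perimeter ratio within tolerance, center offset below lam5."""
    c4 = 1.0 - lam4 <= chi / omega <= 1.0 + lam4
    c5 = math.dist(c, centroid_3d(x_t)) < lam5
    return c4, c5
```

With the default $\lambda_4 = 0.001$, the projected ellipse may deviate from the rim perimeter by at most 0.1 percent.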

The optimal parameters of (15) are obtained by minimizing $\mathcal{J}_2$ subject to (19). Afterwards, $\mathbf{x}_t^{\mathrm{stable}} = {}^{\mathcal{F}_w}f_p(\mathbf{c}^*, \beta_a^*, \beta_b^*, \mathbf{u}^*, \mathbf{v}^*)$. Additional constraints can be incorporated to cater to specific tasks. The projection from the raw configuration space onto a neighboring stable manifold is denoted as $\mathbf{x}^{\mathrm{stable}} = \mathrm{ProjectStableConfig}(\mathbf{x}_t)$. Four examples of ProjectStableConfig are shown in Fig. 7.

Step 1: Pre-Bagging SOI Planning

Our shape planning algorithm follows the same workflow as the Constrained Bi-directional Rapidly-Exploring Random Tree (CBiRRT) [17]. ProjectStableConfig ensures the validity of nodes. CBiRRT maintains two independent trees, growing from the initial configuration and the goal configuration, respectively. Both trees expand and explore the configuration space, gradually moving towards each other until they connect and generate the final path. For our planning, the bag’s state $\mathbf{x}_t$ is regarded as a tree node. The procedure of CBiRRT is introduced in [18]; interested readers may refer to it.

In planning, each random node $\mathcal{G}_{\mathrm{rand}}$ is projected onto the stable manifold using ProjectStableConfig before the next planning step. If the bag is in collision, $\mathcal{G}_{\mathrm{rand}}$ is discarded and regenerated. Unlike [18], the Chamfer Distance is used to calculate the distance between two nodes.

Similar to [17], the constrained extension is denoted as $\mathcal{G}_{\mathrm{reached}} = \mathrm{ConstrainedExtend}(\mathcal{G}_{\mathrm{from}}, \mathcal{G}_{\mathrm{to}})$. This function aims to make progress from $\mathcal{G}_{\mathrm{from}}$ towards $\mathcal{G}_{\mathrm{to}}$ while adhering to the constraints imposed by the planning problem. At each step, a new configuration of the bag is generated by interpolating from the last reached configuration $\mathbf{x}_{\mathrm{last}}$ towards $\mathbf{x}_{\mathrm{to}}$ with a small step size. To preserve the overall shape of the bag and prevent excessive stretching, a displacement limit on the relative deformation of the bag’s rim is enforced. Afterwards, a stable configuration $\mathbf{x}_{\mathrm{new}}$ is obtained by applying ProjectStableConfig to the interpolated configuration.
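The interpolate-then-project loop described above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: `project` and `in_collision` are hypothetical callables standing in for ProjectStableConfig and the collision checker, the node distance is a mean point-wise distance between corresponding points instead of the Chamfer Distance, and the rim displacement limit is omitted.

```python
import math

def node_distance(x_a, x_b):
    """Mean distance between corresponding points (assumes ordered SOI points)."""
    return sum(math.dist(p, q) for p, q in zip(x_a, x_b)) / len(x_a)

def constrained_extend(x_from, x_to, project, in_collision, step=0.01, max_iters=1000):
    """Step from x_from towards x_to, projecting each interpolated configuration."""
    path = [x_from]
    x_last = x_from
    for _ in range(max_iters):
        remaining = node_distance(x_last, x_to)
        if remaining <= step:
            path.append(x_to)  # target reached within one step
            break
        ratio = step / remaining
        x_interp = [tuple(p[k] + ratio * (q[k] - p[k]) for k in range(3))
                    for p, q in zip(x_last, x_to)]
        x_new = project(x_interp)  # stand-in for ProjectStableConfig
        if x_new is None or in_collision(x_new):
            break  # projection failed or configuration collides
        if node_distance(x_new, x_to) >= remaining:
            break  # projection undid the progress; stop extending
        path.append(x_new)
        x_last = x_new
    return path
```

The last element of the returned list plays the role of $\mathcal{G}_{\mathrm{reached}}$: it equals the target when the extension succeeds and the last valid configuration otherwise.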

After planning, the bidirectional path of CBiRRT is extracted as the deformation path $\mathcal{G}$ and then refined for the subsequent smooth dual-arm manipulation. The pre-bagging path is constructed as $\mathcal{G}_{\mathrm{pre\text{-}bagging}} := \{\mathbf{g}_0, \mathbf{g}_1, \ldots, \mathbf{g}_{\dagger}\}$.

Step 2: Bagging SOI Planning

This stage plans the deformation path from $\mathbf{x}_{\dagger}$ to $\mathbf{x}_{\ast}$, adopting the same planning procedure as Step 1. The bagging path is constructed as $\mathcal{G}_{\mathrm{bagging}} := \{\mathbf{g}_{\dagger}, \ldots, \mathbf{g}_{\ast}\}$.

The final path $\mathcal{G}$ from $\mathbf{x}_0$ to $\mathbf{x}_{\ast}$ is constructed as:

$$ \mathcal{G} := \{ [\mathcal{G}_{\mathrm{pre\text{-}bagging}}, \mathcal{G}_{\mathrm{bagging}}] \,|\, \underbrace{\mathbf{g}_0, \mathbf{g}_1, \ldots, \mathbf{g}_{\dagger}}_{\mathrm{pre\text{-}bagging}}, \underbrace{\mathbf{g}_{\dagger}, \ldots, \mathbf{g}_{\ast}}_{\mathrm{bagging}} \} \tag{21} $$

IV-D Motion Planning

The proposed bagging manipulation approach proceeds as follows: (a) the vision system perceives $\mathbf{B}$, and the robot generates the bagging/goal SOI; (b) a collision-free deformation path $\mathcal{G}$ is obtained using CBiRRT; (c) the dual-arm robot completes the bagging task along $\mathcal{G}$ in a constrained environment. To provide a clear and intuitive visual effect, the robot adopts 3D translation and 3D rotation. The end-effectors’ pose is denoted by $\mathbf{r} = [\mathbf{p}^{[\mathrm{left}]}, \mathbf{p}^{[\mathrm{right}]}] \in \mathbb{R}^{12}$. We assume that the material properties of the bag and the robot movements remain relatively stable during the manipulation process, and that the robot executes the given velocity commands accurately and without delay. For the controller design, we formulate this manipulation process as shape servoing [19], i.e., tiny movements of the robot produce tiny deformations of the bag. Inspired by [20], the local first-order kinematic model is obtained as:

$$ \mathbf{y}_t = \mathbf{J}_t \mathbf{u}_t, \quad \mathbf{y}_t = \mathbf{x}_t - \mathbf{x}_{t-1}, \quad \mathbf{u}_t = \mathbf{r}_t - \mathbf{r}_{t-1} \tag{22} $$

where $\mathbf{J}_t$ is the deformation Jacobian matrix (DJM), which represents the kinematic relationship between $\mathbf{y}_t$ and $\mathbf{u}_t$. We assume that $\mathbf{J}_t$ maintains full column rank during the manipulation task, which is straightforward to fulfill in practice since the dimension of $\mathbf{x}$ is significantly greater than that of $\mathbf{u}$. Since the bag has strong unknown nonlinearity, it is difficult to obtain an accurate analytical expression of $\mathbf{J}_t$. Therefore, the Broyden approach is used to compute local approximations of $\mathbf{J}_t$ in real time instead of identifying the full mechanical model:

$$\hat{\mathbf{J}}_t = \hat{\mathbf{J}}_{t-1} + \varepsilon \cdot \frac{(\mathbf{y}_t - \hat{\mathbf{J}}_{t-1}\mathbf{u}_t)\,\mathbf{u}_t^{\intercal}}{\mathbf{u}_t^{\intercal}\mathbf{u}_t} \in \mathbb{R}^{3n_x \times 12} \tag{23}$$

where $\varepsilon \in (0, 1]$ regulates the convergence speed.
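As an illustration of (23), the following is a minimal NumPy sketch of the rank-one update. The function name `broyden_update`, the `min_step` guard against near-zero motion, and the synthetic linear test system are assumptions made for this example; they are not taken from the paper's implementation.

```python
import numpy as np


def broyden_update(J_prev, y, u, eps=0.5, min_step=1e-9):
    """One step of the rank-one update in (23).

    J_prev : (3*n_x, 12) previous Jacobian estimate
    y      : (3*n_x,)    measured shape change x_t - x_{t-1}
    u      : (12,)       applied pose change   r_t - r_{t-1}
    eps    : gain in (0, 1]
    """
    denom = float(u @ u)
    if denom < min_step:           # assumption: skip the update for near-zero motion
        return J_prev
    residual = y - J_prev @ u      # prediction error of the old estimate
    return J_prev + eps * np.outer(residual, u) / denom


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_x = 4                                    # hypothetical number of SOI points
    J_true = rng.normal(size=(3 * n_x, 12))    # synthetic linear "bag" for the demo
    J_hat = np.zeros_like(J_true)
    for _ in range(300):
        u = 0.01 * rng.normal(size=12)         # small random pose increments
        y = J_true @ u                         # noise-free shape change
        J_hat = broyden_update(J_hat, y, u, eps=1.0)
    print("estimation error:", np.linalg.norm(J_hat - J_true))
```

With $\varepsilon = 1$ the updated estimate reproduces the latest measurement exactly, i.e., $\hat{\mathbf{J}}_t \mathbf{u}_t = \mathbf{y}_t$; a smaller $\varepsilon$ gives up this property in exchange for less sensitivity to measurement noise.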

Since $\mathcal{G}$ contains a sequence of trajectories, we adopt model predictive control (MPC) to drive the dual-arm robot along $\mathcal{G}$. To reduce the computational burden, $\mathbf{J}_t$ is assumed to be estimated accurately, such that $\mathbf{y}_t = \hat{\mathbf{J}}_t \mathbf{u}_t$. Two prediction vectors are defined as follows:

$$\bar{\mathbf{x}}_t = \{\mathbf{x}_{t+i|t}\} \in \mathbb{R}^{3n_x h}, \quad \bar{\mathbf{u}}_t = \{\mathbf{u}_{t+i-1|t}\} \in \mathbb{R}^{12h}, \quad i \in [1, h] \tag{24}$$

where $\bar{\mathbf{x}}_t$ and $\bar{\mathbf{u}}_t$ represent the predictions of $\mathbf{x}_t$ and $\mathbf{u}_t$ over the next $h$ periods, respectively. $\mathbf{x}_{t+i|t}$ and $\mathbf{u}_{t+i|t}$ denote the $i$th predictions of $\mathbf{x}_t$ and $\mathbf{u}_t$ from the time instant $t$, where $\mathbf{x}_{t|t} = \mathbf{x}_t$ and $\mathbf{u}_{t|t} = \mathbf{u}_t$ must hold. $\bar{\mathbf{x}}_t$ can be calculated from $\hat{\mathbf{J}}_t$ by noting that $\hat{\mathbf{J}}_t \approx \hat{\mathbf{J}}_{t+h}$ holds during the period $[t, t+h]$ (which is reasonable, given the slow manipulation of the bag). In this way, $\bar{\mathbf{x}}_t$ is computed in the augmented form:

$$\bar{\mathbf{x}}_t = \mathbf{D}\mathbf{x}_t + \boldsymbol{\Theta}\bar{\mathbf{u}}_t, \quad \mathbf{D} = \mathbf{I}_{h \times 1} \otimes \mathbf{E}_{3n_x}, \quad \boldsymbol{\Theta} = \mathbf{L}_h \otimes \hat{\mathbf{J}}_t \tag{25}$$
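For reference, (25) can be recovered by unrolling the incremental model over the horizon with the input indexing of (24). The derivation below assumes that $\mathbf{I}_{h \times 1}$ denotes an $h \times 1$ column of ones, $\mathbf{E}_{3n_x}$ the identity matrix, and $\mathbf{L}_h$ the $h \times h$ lower-triangular matrix of ones. These readings are not stated explicitly in the text; they are the ones under which (25) agrees with (24).

```latex
\begin{aligned}
\mathbf{x}_{t+i|t}
  &= \mathbf{x}_{t+i-1|t} + \hat{\mathbf{J}}_t \mathbf{u}_{t+i-1|t}
   = \mathbf{x}_t + \hat{\mathbf{J}}_t \sum_{j=1}^{i} \mathbf{u}_{t+j-1|t},
   \qquad i = 1, \ldots, h, \\[4pt]
\bar{\mathbf{x}}_t
  &= \underbrace{\begin{bmatrix}
       \mathbf{E}_{3n_x} \\ \mathbf{E}_{3n_x} \\ \vdots \\ \mathbf{E}_{3n_x}
     \end{bmatrix}}_{\mathbf{D}} \mathbf{x}_t
   + \underbrace{\begin{bmatrix}
       \hat{\mathbf{J}}_t & \mathbf{0} & \cdots & \mathbf{0} \\
       \hat{\mathbf{J}}_t & \hat{\mathbf{J}}_t & \cdots & \mathbf{0} \\
       \vdots & \vdots & \ddots & \vdots \\
       \hat{\mathbf{J}}_t & \hat{\mathbf{J}}_t & \cdots & \hat{\mathbf{J}}_t
     \end{bmatrix}}_{\boldsymbol{\Theta}} \bar{\mathbf{u}}_t .
\end{aligned}
```

The first line uses the constant-Jacobian approximation $\hat{\mathbf{J}}_{t+i} \approx \hat{\mathbf{J}}_t$ over the horizon; stacking it for $i = 1, \ldots, h$ gives the second line.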

The target $\bar{\mathbf{x}}_t^{*}$ is constructed as $\bar{\mathbf{x}}_t^{*} = [\mathcal{G}_{t+1}, \mathcal{G}_{t+2}, \ldots, \mathcal{G}_{t+h}]$. The optimization objective for $\bar{\mathbf{u}}_t$ is formulated as:

$$\mathcal{Q}(\bar{\mathbf{u}}_t) = (\bar{\mathbf{x}}_t - \bar{\mathbf{x}}_t^{*})^{\intercal} \boldsymbol{\Lambda}_1 (\bar{\mathbf{x}}_t - \bar{\mathbf{x}}_t^{*}) + \bar{\mathbf{u}}_t^{\intercal} \boldsymbol{\Lambda}_2 \bar{\mathbf{u}}_t \tag{26}$$

where $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ are symmetric positive-definite matrices regulating the convergence speed and the smoothness of $\bar{\mathbf{u}}_t$, respectively. Finally, $\mathbf{u}_t$ is obtained by the receding horizon:

$$\mathbf{u}_t = [\mathbf{E}_{12}, \mathbf{0}, \ldots, \mathbf{0}] \cdot \bar{\mathbf{u}}_t \in \mathbb{R}^{12} \tag{27}$$
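The steps (25) to (27) can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' controller: it assumes the unconstrained case, where setting the gradient of (26) to zero gives $(\boldsymbol{\Theta}^{\intercal}\boldsymbol{\Lambda}_1\boldsymbol{\Theta} + \boldsymbol{\Lambda}_2)\bar{\mathbf{u}}_t = \boldsymbol{\Theta}^{\intercal}\boldsymbol{\Lambda}_1(\bar{\mathbf{x}}_t^{*} - \mathbf{D}\mathbf{x}_t)$, it takes $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ as scaled identities, and it reads $\mathbf{I}_{h \times 1}$ as a column of ones and $\mathbf{L}_h$ as a lower-triangular matrix of ones. The name `mpc_step`, the weights, and the synthetic plant in the demo are hypothetical.

```python
import numpy as np


def mpc_step(x_t, J_hat, targets, lam1=1.0, lam2=1e-3):
    """Return the first command u_t of the horizon, following (25)-(27).

    x_t     : (m,)   current SOI state, m = 3*n_x
    J_hat   : (m, p) current Jacobian estimate (p = 12 for the dual-arm pose)
    targets : (h, m) path nodes G_{t+1}, ..., G_{t+h}
    lam1, lam2 : scalar weights, Lambda_1 = lam1*I and Lambda_2 = lam2*I (assumption)
    """
    h, m = targets.shape
    p = J_hat.shape[1]
    D = np.kron(np.ones((h, 1)), np.eye(m))            # stacks x_t h times
    Theta = np.kron(np.tril(np.ones((h, h))), J_hat)   # accumulates the commands
    err0 = targets.reshape(-1) - D @ x_t               # x_bar* - D x_t
    # Zero gradient of (26) in the unconstrained case:
    H = lam1 * Theta.T @ Theta + lam2 * np.eye(h * p)
    u_bar = np.linalg.solve(H, lam1 * Theta.T @ err0)
    return u_bar[:p]                                   # receding horizon, (27)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_x, h = 8, 3
    J = rng.normal(size=(3 * n_x, 12))                 # synthetic linear plant
    x = np.zeros(3 * n_x)
    goal = J @ rng.normal(size=12)                     # reachable goal shape
    path = np.linspace(x, goal, 21)[1:]                # nodes G_1, ..., G_20
    for t in range(len(path)):
        window = path[t:t + h]
        if len(window) < h:                            # pad with the final node
            pad = np.repeat(path[-1:], h - len(window), axis=0)
            window = np.vstack([window, pad])
        x = x + J @ mpc_step(x, J, window)
    print("final tracking error:", np.linalg.norm(x - goal))
```

Only the first block of $\bar{\mathbf{u}}_t$ is applied at each step, and the problem is solved again at the next step with the window shifted by one node.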
Figure 8: Experimental setup. (a) Overview of the experimental setup used to validate our SOI-based control for the dual-arm bagging task. (b)-(e) Four sample bags used for dual-arm manipulation. (f)-(i) Four baggable objects, i.e., coffee box, canned pineapple, grapefruit, and 3D-printed triangular prism.

V Experiments

V-A Experimental Setup

Fig. 8 shows the experimental setup used to validate the proposed SOI-based control of the dual-arm bagging task, including four baggable objects $\mathbf{B}$. A D455 camera is mounted in an eye-to-hand configuration and observes the manipulation process from a top-down perspective at a resolution of 640×480. Visual perception is processed with OpenCV on a Linux-based PC, and the point cloud $\mathcal{P}_t$ is obtained through the RealSense libraries. The dual CR5 robots are equipped with 3D-printed holders that grasp both ends of the bag, secured with zip ties in advance, and we assume that the bag is not dropped during manipulation. A custom bag with a green rim is adopted for ease of perception. The velocity command $\mathbf{u}_t$ is subject to a hard saturation to meet the assumption in Sec. III and to ensure the validity of the estimate of $\mathbf{J}_t$. The motion control algorithm is implemented on ROS and runs with a servo-control loop of around 11 Hz.

We use a professional 3D scanner (model: CR-Scan Ferret Pro) to obtain the vertex points $\mathbf{V}$ of each $\mathbf{B}$. ArUco markers are attached to $\mathbf{B}$ so that the robot can identify the object type through the camera before manipulation and then load the corresponding configuration of $\mathbf{V}$.

V-B Evaluation of GMM-based State Estimation

In this section, we verify the GMM-based state estimator introduced in (5), which aims to extract a clean state $\mathcal{Q}_t$ from the raw, dense, and noisy point cloud $\mathcal{P}_t$. Because the bag has a distinct rim, $\mathcal{P}_t$ can be obtained easily. The EM algorithm [14] is used to solve $\mathcal{O}(\mathbf{x}_t^n)$, which yields the concise $\mathcal{Q}_t$.
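To make the idea concrete, the sketch below fits a small set of Gaussian centroids to a dense point cloud with a generic EM loop. It is not the objective in (5), which is defined earlier in the paper; it assumes equally weighted components with one shared isotropic variance, and the function names and the farthest-point initialization are choices made for this example only.

```python
import numpy as np


def farthest_point_init(points, k):
    """Pick k well-spread points as initial centroids (deterministic)."""
    idx = [0]
    d2 = ((points - points[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        idx.append(int(d2.argmax()))
        d2 = np.minimum(d2, ((points - points[idx[-1]]) ** 2).sum(axis=1))
    return points[idx].copy()


def gmm_centroids(points, k, iters=50):
    """Fit k equally weighted isotropic Gaussians to `points` with EM.

    points : (N, d) raw point cloud (plays the role of P_t)
    returns: (k, d) component means (play the role of Q_t)
    """
    n, d = points.shape
    means = farthest_point_init(points, k)
    var = points.var(axis=0).sum() / d              # shared isotropic variance
    for _ in range(iters):
        # E-step: responsibilities from squared distances (shifted for stability)
        d2 = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_r = -0.5 * d2 / var
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means, then the shared variance
        mass = resp.sum(axis=0) + 1e-12
        means = (resp.T @ points) / mass[:, None]
        d2 = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = max((resp * d2).sum() / (n * d), 1e-9)
    return means
```

Calling `gmm_centroids(cloud, k)` on an `(N, 3)` array of rim points returns `k` centroids that summarize the cloud; in the paper's setting `k` would correspond to the number of SOI points $n_x$.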

Fig. 9 shows the extraction results of the GMM-based state estimator; the bag's rim is marked in green, as shown in Fig. 9a. The results in Fig. 9b show that the GMM-based state estimator produces a relatively complete $\mathcal{Q}_t$ whose points are equidistantly distributed, which is consistent with the uniform distribution assumption (4). Furthermore, Fig. 9c evaluates ProjectStableConfig in (17). The results show that ProjectStableConfig finds a stable manifold under the current shape configuration $\mathcal{Q}_t$, shown as the red curve in Fig. 9c distributed along $\mathcal{Q}_t$. This demonstrates that ProjectStableConfig can find a stable manifold projection, which is helpful for subsequent planning and control.

Figure 9: Extraction results of the GMM-based state estimator. (a) The bag's SOI. (b) The blue dots are $\mathcal{P}_t$, and the red ones are $\mathcal{Q}_t$ estimated by GMM. (c) The smoothed state obtained by refining $\mathcal{Q}_t$ using ProjectStableConfig.
Figure 10: (a)-(f) Generation of the bagging SOI $\mathbf{x}_{\dagger}$. (g)-(l) Deformation path $\mathcal{G}$.

V-C Evaluation of Bagging SOI Generation

In this section, we evaluate the bagging SOI generation presented in Sec. IV-B, which generates a pre-enclosed shape $\mathbf{x}_{\dagger}$ to cover the bottom of $\mathbf{B}$. Six types of baggable objects with known $\mathbf{V}$ are adopted. The parameters are set to $\omega = 0.68$, $\lambda_1 = 0.85$, $\lambda_2 = 0.005$, $\lambda_3 = 0.001$.

Fig. 10a to Fig. 10f show the results of the bagging SOI $\mathbf{x}_{\dagger}$. The blue points represent $\mathbf{V}$, and the red curve is $\mathbf{x}_{\dagger}$ obtained through (13). The results show that $\mathbf{x}_{\dagger}$ satisfies the perimeter $\omega$ and surrounds $\mathbf{V}$ to the greatest extent along the principal axis of $\mathbf{V}$. Moreover, $\mathbf{x}_{\dagger}$ is evenly distributed around $\mathbf{V}$, which verifies the regulation of $C_1$. As $\mathbf{x}_{\dagger}$ is generated by a parametric equation, it is continuous, which is helpful for subsequent SOI planning. In the experiments, we also found that adjusting $\lambda_1$ of constraint $C_1$ (10) and $\lambda_3$ of constraint $C_3$ (12) efficiently regulates $\mathbf{x}_{\dagger}$ to adapt to various $\mathbf{B}$, so that $\mathbf{x}_{\dagger}$ can best meet different task requirements. The average values of $C_1$, $C_2$, and $C_3$ are 0.731, 0.004, and 0.0001, respectively.

V-D Evaluation of SOI Planning

In this section, we evaluate the SOI planning presented in Sec. IV-C, which aims to generate a collision-free deformation path $\mathcal{G}$ from the initial SOI $\mathbf{x}_0$ to the goal SOI $\mathbf{x}_{\ast}$ via the bagging SOI $\mathbf{x}_{\dagger}$. The parameters are $\lambda_4 = 0.002$ and $\lambda_5 = 0.02$.

Fig. 10g to Fig. 10l give six planning results $\mathcal{G}$ for different configurations of $(\mathbf{x}_0, \mathbf{x}_{\dagger}, \mathbf{x}_{\ast})$. The gradient curves from blue to red are $\mathcal{G}_{\rm pre\text{-}bagging}$, and those from red to green are $\mathcal{G}_{\rm bagging}$. Because ProjectStableConfig is used, each node in $\mathcal{G}$ is smooth and satisfies the physical perimeter constraint $\omega$. Since the distance from $\mathbf{x}_{\dagger}$ to $\mathbf{x}_{\ast}$ is short and the path is close to $\mathbf{B}$, $\mathcal{G}_{\rm bagging}$ shows a certain degree of fluctuation. Fig. 10j shows that CBiRRT can generate an effective deformation path $\mathcal{G}$ even when there are obstacles; the black cuboid is the user-defined obstacle. The planning results show that a continuous $\mathcal{G}$ can be obtained using CBiRRT, and (19) guarantees the perimeter limitation of each node in $\mathcal{G}$. This supports the rationality of the optimization formulation (18), to which various constraints can be added to improve the planning accuracy and meet the task requirements.

Note that the proposed SOI planning is a top-level planning framework: given the initial SOI $\mathbf{x}_0$, the bagging SOI $\mathbf{x}_{\dagger}$, and the goal SOI $\mathbf{x}_{\ast}$, it plans the two-stage deformation trajectories, i.e., $\mathcal{G}_{\rm pre\text{-}bagging}$ and $\mathcal{G}_{\rm bagging}$. In this article, $\mathbf{x}_{\ast}$ is obtained by simply translating $\mathbf{x}_{\dagger}$, but in practice $\mathbf{x}_{\ast}$ may take a more complex form.

Figure 11: Manipulation trajectories of the bag’s SOI in the bagging task. The last column gives the deformation error of each step of MPC.

V-E Dual-arm Bagging Manipulation

The dual-arm bagging experiments are conducted to evaluate the proposed SOI-based bagging manipulation approach. Four types of baggable objects are used, i.e., coffee box, canned pineapple, grapefruit, and 3D-printed triangular prism, for Exp 1 to Exp 4, respectively. In each experiment, the dual CR5 robots manipulate the bag to first deform along $\mathcal{G}_{\rm pre\text{-}bagging}$ to $\mathbf{x}_{\dagger}$ and then deform along $\mathcal{G}_{\rm bagging}$ to $\mathbf{x}_{\ast}$, which completes the bagging task. To analyze the bagging approach, we compare against two planning algorithms (FFG-RRT [21], TS-RRT [22]) and two manipulation algorithms (IBVS [23], SSVS [24]), which serve as baselines for the planner of Sec. IV-C and the controller of Sec. IV-D, respectively.

Fig. 11 shows the bagging results of the four experiments, with each row representing a type of $\mathbf{B}$. The first five columns of each row show the deformation process, the sixth column shows $\mathcal{G}$, and the last column shows the deformation error $\|\mathbf{x}_t - \mathcal{G}_t\|$ at each step of MPC. To compare performance quantitatively, three indicators are introduced, i.e., planning success rate, planning time, and manipulation success rate; the first two assess the planning algorithms and the third assesses the control algorithms. Table I gives the detailed comparative results.

The planning success rate shows that CBiRRT outperforms its counterparts with an acceptable computation time, while FFG-RRT is the fastest. This is because FFG-RRT explores forward directly and heads to the desired configuration as quickly as possible, while CBiRRT conducts bidirectional exploration based on stability, which results in more exploration steps. In terms of manipulation success rate, the MPC used in this article achieves the highest value, while the other two control approaches (IBVS and SSVS) are lower in every experiment. This is because the desired command in traditional shape servoing is stationary, whereas that of our bagging task is a sequence of deformation trajectories. Tracking such a sequence matches the way MPC operates and helps keep the tracking stable over the future prediction horizon. The manipulation results demonstrate the effectiveness of MPC in such robot manipulation tasks.

Besides, $\mathbf{x}_{\dagger}$ acts as an intermediate buffer shape that divides the entire bagging task into two subtasks, namely pre-bagging and bagging, thereby improving the manipulation success rate.

Table I: Performance of Different Sensorimotor Models on Different Tasks for Motor-robot Experiments
Exp 1: coffee box; Exp 2: canned pineapple; Exp 3: grapefruit; Exp 4: triangular prism. PSR: planning success rate; PT: planning time (s); MSR: manipulation success rate.

| Method | Exp 1 PSR | Exp 1 PT (s) | Exp 1 MSR | Exp 2 PSR | Exp 2 PT (s) | Exp 2 MSR | Exp 3 PSR | Exp 3 PT (s) | Exp 3 MSR | Exp 4 PSR | Exp 4 PT (s) | Exp 4 MSR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FFG-RRT [21] | 6/10 | 3.87 ± 1.97 | 8/8 | 8/10 | 2.37 ± 0.87 | 8/8 | 7/10 | 3.89 ± 1.18 | 8/8 | 6/10 | 3.58 ± 1.11 | 8/8 |
| TS-RRT [22] | 7/10 | 6.32 ± 1.08 | 8/8 | 8/10 | 5.58 ± 1.13 | 8/8 | 9/10 | 6.85 ± 0.56 | 8/8 | 7/10 | 7.32 ± 1.34 | 8/8 |
| IBVS [23] | - | - | 4/8 | - | - | 7/8 | - | - | 5/8 | - | - | 6/8 |
| SSVS [24] | - | - | 5/8 | - | - | 7/8 | - | - | 6/8 | - | - | 7/8 |
| Ours | 9/10 | 5.13 ± 1.26 | 8/8 | 10/10 | 4.21 ± 0.98 | 8/8 | 10/10 | 4.98 ± 1.93 | 8/8 | 9/10 | 5.32 ± 1.56 | 8/8 |
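As a reading aid, the per-experiment counts in Table I can be pooled over the four experiments. The short script below does this arithmetic; the counts are copied from the table, while the pooling itself and the row labels "Ours (CBiRRT)" and "Ours (MPC)" are conveniences of this example and not a statistic reported by the authors. The FFG-RRT and TS-RRT rows, which report 8/8 manipulation success in every experiment, are left out of the manipulation comparison.

```python
# Success counts copied from Table I, ordered Exp 1 to Exp 4.
planning = {
    "FFG-RRT": [(6, 10), (8, 10), (7, 10), (6, 10)],
    "TS-RRT": [(7, 10), (8, 10), (9, 10), (7, 10)],
    "Ours (CBiRRT)": [(9, 10), (10, 10), (10, 10), (9, 10)],
}
manipulation = {
    "IBVS": [(4, 8), (7, 8), (5, 8), (6, 8)],
    "SSVS": [(5, 8), (7, 8), (6, 8), (7, 8)],
    "Ours (MPC)": [(8, 8), (8, 8), (8, 8), (8, 8)],
}


def pooled(counts):
    """Sum successes and trials over the experiments."""
    successes = sum(s for s, _ in counts)
    trials = sum(n for _, n in counts)
    return successes, trials


for name, counts in {**planning, **manipulation}.items():
    s, n = pooled(counts)
    print(f"{name}: {s}/{n}")
```

Pooled this way, planning succeeds in 38/40 trials for CBiRRT against 27/40 for FFG-RRT and 31/40 for TS-RRT, and manipulation succeeds in 32/32 trials for MPC against 22/32 for IBVS and 25/32 for SSVS.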

VI Conclusion

Our study introduced a dual-arm robotic system for automating bagging tasks, employing a novel constraint-aware SOI planning approach for manipulating 3D deformable objects. The system’s innovation lies in its targeted SOI state estimation, which simplifies the control of the bag’s opening rim, enhancing task efficiency. Key contributions include a flexible, adaptive vision-based control system and a comprehensive framework demonstrating the system’s adaptability to environmental constraints. This research not only advances DOM in handling complex tasks but also has potential implications for enhancing robotic assistance in everyday activities. Future efforts will aim to improve system adaptability and extend its application to further realize the benefits of robotic automation in diverse real-world settings.

References

  • [1] A. Gonnochenko, A. Semochkin et al., “Coinbot: Intelligent robotic coin bag manipulation using artificial brain,” in 2021 7th International Conference on Automation, Robotics and Applications (ICARA). IEEE, 2021, pp. 67–74.
  • [2] M. Saha and P. Isto, “Manipulation planning for deformable linear objects,” IEEE Trans. on Robotics, vol. 23, no. 6, pp. 1141–1150, 2007.
  • [3] M. Kudo, Y. Nasu, K. Mitobe, and B. Borovac, “Multi-arm robot control system for manipulation of flexible materials in sewing operation,” Mechatronics, vol. 10, no. 3, pp. 371–402, 2000.
  • [4] R. Alami, T. Simeon, and J.-P. Laumond, “A geometrical approach to planning manipulation tasks. the case of discrete placements and grasps,” in The fifth international symposium on Robotics research. MIT Press, 1990, pp. 453–463.
  • [5] A. Nair, D. Chen, P. Agrawal, P. Isola, P. Abbeel, J. Malik, and S. Levine, “Combining self-supervised learning and imitation for vision-based rope manipulation,” in 2017 IEEE international conference on robotics and automation (ICRA). IEEE, 2017, pp. 2146–2153.
  • [6] D. Seita, P. Florence, J. Tompson, E. Coumans, V. Sindhwani, K. Goldberg, and A. Zeng, “Learning to rearrange deformable cables, fabrics, and bags with goal-conditioned transporter networks,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 4568–4575.
  • [7] L. Wijayarathne, Z. Zhou, Y. Zhao, and F. L. Hammond, “Real-time deformable-contact-aware model predictive control for force-modulated manipulation,” IEEE Transactions on Robotics, 2023.
  • [8] F. Zhang and Y. Demiris, “Visual-tactile learning of garment unfolding for robot-assisted dressing,” IEEE Robotics and Automation Letters, 2023.
  • [9] Z. Weng, P. Zhou, H. Yin, A. Kravberg, A. Varava, D. Navarro-Alarcon, and D. Kragic, “Interactive perception for deformable object manipulation,” 2024.
  • [10] L. Y. Chen, B. Shi et al., “Autobag: Learning to open plastic bags and insert objects,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 3918–3925.
  • [11] A. Bahety, S. Jain et al., “Bag all you need: Learning a generalizable bagging strategy for heterogeneous objects,” in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2023, pp. 960–967.
  • [12] N. Gu, Z. Zhang, R. He, and L. Yu, “Shakingbot: dynamic manipulation for bagging,” Robotica, vol. 42, no. 3, pp. 775–791, 2024.
  • [13] P. Zhou, P. Zheng et al., “Bimanual deformable bag manipulation using a structure-of-interest based latent dynamics model,” arXiv preprint arXiv:2401.11432, 2024.
  • [14] T. Tang and M. Tomizuka, “Track deformable objects from point clouds with structure preserved registration,” The International Journal of Robotics Research, vol. 41, no. 6, pp. 599–614, 2022.
  • [15] B. M. S. Hasan and A. M. Abdulazeez, “A review of principal component analysis algorithm for dimensionality reduction,” Journal of Soft Computing and Data Mining, vol. 2, no. 1, pp. 20–30, 2021.
  • [16] X. Yan, C. Zheng, Z. Li, S. Wang, and S. Cui, “Pointasnl: Robust point clouds processing using nonlocal neural networks with adaptive sampling,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 5589–5598.
  • [17] D. Berenson, S. S. Srinivasa, D. Ferguson, and J. J. Kuffner, “Manipulation planning on constraint manifolds,” in 2009 IEEE international conference on robotics and automation. IEEE, 2009, pp. 625–632.
  • [18] M. Yu, K. Lv et al., “A coarse-to-fine framework for dual-arm manipulation of deformable linear objects with whole-body obstacle avoidance,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 10153–10159.
  • [19] J. Qi, G. Ran, B. Wang, J. Liu, W. Ma, P. Zhou, and D. Navarro-Alarcon, “Adaptive shape servoing of elastic rods using parameterized regression features and auto-tuning motion controls,” IEEE Robotics and Automation Letters, 2023.
  • [20] J. Qi, G. Ma et al., “Contour moments based manipulation of composite rigid-deformable objects with finite time model estimation and shape/position control,” IEEE/ASME Transactions on Mechatronics, 2021.
  • [21] O. Roussel, M. Taïx, and T. Bretl, “Motion planning for a deformable linear object,” in European workshop on deformable object manipulation, 2014, pp. 153–158.
  • [22] C. Suh, T. T. Um et al., “Tangent space rrt: A randomized planning algorithm on constraint manifolds,” in 2011 IEEE International Conference on Robotics and Automation. IEEE, 2011, pp. 4968–4973.
  • [23] X. Ren, H. Li, and Y. Li, “Image-based visual servoing control of robot manipulators using hybrid algorithm with feature constraints,” IEEE Access, vol. 8, pp. 223495–223508, 2020.
  • [24] M. Hao and Z. Sun, “A universal state-space approach to uncalibrated model-free visual servoing,” IEEE/ASME Transactions on Mechatronics, vol. 17, no. 5, pp. 833–846, 2011.