-
A Novel Simulation-Based Quality Metric for Evaluating Grasps on 3D Deformable Objects
Authors:
Tran Nguyen Le,
Jens Lundell,
Fares J. Abu-Dakka,
Ville Kyrki
Abstract:
Evaluation of grasps on deformable 3D objects is a little-studied problem, even if the applicability of rigid object grasp quality measures for deformable ones is an open question. A central issue with most quality measures is their dependence on contact points which for deformable objects depend on the deformations. This paper proposes a grasp quality measure for deformable objects that uses information about object deformation to calculate the grasp quality. Grasps are evaluated by simulating the deformations during grasping and predicting the contacts between the gripper and the grasped object. The contact information is then used as input for a new grasp quality metric to quantify the grasp quality. The approach is benchmarked against two classical rigid-body quality metrics on over 600 grasps in the Isaac Gym simulation and over 50 real-world grasps. Experimental results show an average improvement of 18% in the grasp success rate for deformable objects compared to the classical rigid-body quality metrics.
Submitted 23 March, 2022;
originally announced March 2022.
-
Long lifetime supersolid in a two-component dipolar Bose-Einstein condensate
Authors:
Shaoxiong Li,
Uyen Ngoc Le,
Hiroki Saito
Abstract:
Recent studies on supersolidity in a single-component Bose-Einstein condensate (BEC) have relied on the Lee-Huang-Yang (LHY) correction for stabilization of self-bound droplets, which however involves a high density inside the droplets, limiting the lifetime of the supersolid. Here we propose a two-component mixture of dipolar and nondipolar BECs, such as an $^{166}$Er-$^{87}$Rb mixture, to create and stabilize a supersolid without the LHY correction, which can suppress the atomic loss and may allow observation of the long-time dynamics of the supersolid. In such a system, supersolidity can be controlled by the difference in the trap centers between the two components.
Submitted 18 March, 2022;
originally announced March 2022.
-
3D-UCaps: 3D Capsules Unet for Volumetric Image Segmentation
Authors:
Tan Nguyen,
Binh-Son Hua,
Ngan Le
Abstract:
Medical image segmentation has so far achieved promising results with Convolutional Neural Networks (CNNs). However, it is arguable that in traditional CNNs, the pooling layer tends to discard important information such as positions. Moreover, CNNs are sensitive to rotation and affine transformation. Capsule network is a data-efficient network design proposed to overcome such limitations by replacing pooling layers with dynamic routing and convolutional strides, which aims to preserve the part-whole relationships. Capsule network has shown great performance in image recognition and natural language processing, but applications for medical image segmentation, particularly volumetric image segmentation, have been limited. In this work, we propose 3D-UCaps, a 3D voxel-based Capsule network for medical volumetric image segmentation. We build the concept of capsules into a CNN by designing a network with two pathways: the first pathway is encoded by 3D Capsule blocks, whereas the second pathway is decoded by 3D CNN blocks. 3D-UCaps therefore inherits the merits of both the Capsule network, which preserves the spatial relationship, and CNNs, which learn visual representation. We conducted experiments on various datasets, including iSeg-2017, LUNA16, Hippocampus, and Cardiac, to demonstrate the robustness of 3D-UCaps, where our method outperforms previous Capsule networks and 3D-Unets.
Submitted 16 March, 2022;
originally announced March 2022.
-
Point-Unet: A Context-aware Point-based Neural Network for Volumetric Segmentation
Authors:
Ngoc-Vuong Ho,
Tan Nguyen,
Gia-Han Diep,
Ngan Le,
Binh-Son Hua
Abstract:
Medical image analysis using deep learning has recently been prevalent, showing great performance for various downstream tasks including medical image segmentation and its sibling, volumetric image segmentation. In particular, a typical volumetric segmentation network relies strongly on a voxel grid representation, which treats volumetric data as a stack of individual voxel `slices' and makes learning to segment a voxel grid as straightforward as extending existing image-based segmentation networks to the 3D domain. However, a voxel grid representation requires a large memory footprint and expensive test-time computation, limiting the scalability of the solutions. In this paper, we propose Point-Unet, a novel method that incorporates the efficiency of deep learning with 3D point clouds into volumetric segmentation. Our key idea is to first predict the regions of interest in the volume by learning an attentional probability map, which is then used for sampling the volume into a sparse point cloud that is subsequently segmented using a point-based neural network. We have conducted experiments on the medical volumetric segmentation task with both a small-scale dataset, Pancreas, and the large-scale datasets of the BraTS18, BraTS19, and BraTS20 challenges. A comprehensive benchmark on different metrics has shown that our context-aware Point-Unet robustly outperforms the SOTA voxel-based networks in accuracy, memory usage during training, and time consumption during testing. Our code is available at https://github.com/VinAIResearch/Point-Unet.
Submitted 28 February, 2024; v1 submitted 16 March, 2022;
originally announced March 2022.
-
Meta-Learning of NAS for Few-shot Learning in Medical Image Applications
Authors:
Viet-Khoa Vo-Ho,
Kashu Yamazaki,
Hieu Hoang,
Minh-Triet Tran,
Ngan Le
Abstract:
Deep learning methods have been successful in solving tasks in machine learning and have made breakthroughs in many sectors owing to their ability to automatically extract features from unstructured data. However, their performance relies on manual trial-and-error processes for selecting an appropriate network architecture, hyperparameters for training, and pre-/post-procedures. Even though it has been shown that network architecture plays a critical role in learning feature representations from data and in the final performance, searching for the best network architecture is computationally intensive and heavily relies on researchers' experience. Automated machine learning (AutoML) and its advanced techniques, e.g., Neural Architecture Search (NAS), have been promoted to address those limitations. Beyond general computer vision tasks, NAS has also motivated various applications in multiple areas including medical imaging. In medical imaging, NAS has made significant progress in improving the accuracy of image classification, segmentation, reconstruction, and more. However, NAS requires the availability of large annotated data, considerable computation resources, and pre-defined tasks. To address such limitations, meta-learning has been adopted in the scenarios of few-shot learning and multiple tasks. In this book chapter, we first present a brief review of NAS by discussing well-known approaches in search space, search strategy, and evaluation strategy. We then introduce various NAS approaches in medical imaging with different applications such as classification, segmentation, detection, reconstruction, etc. Meta-learning in NAS for few-shot learning and multiple tasks is then explained. Finally, we describe several open problems in NAS.
Submitted 16 March, 2022;
originally announced March 2022.
-
CapsNet for Medical Image Segmentation
Authors:
Minh Tran,
Viet-Khoa Vo-Ho,
Kyle Quinn,
Hien Nguyen,
Khoa Luu,
Ngan Le
Abstract:
Convolutional Neural Networks (CNNs) have been successful in solving tasks in computer vision including medical image segmentation due to their ability to automatically extract features from unstructured data. However, CNNs are sensitive to rotation and affine transformation and their success relies on huge-scale labeled datasets capturing various input variations. This network paradigm has posed challenges at scale because acquiring annotated data for medical segmentation is expensive and subject to strict privacy regulations. Furthermore, visual representation learning with CNNs has its own flaws, e.g., it is arguable that the pooling layer in traditional CNNs tends to discard positional information and CNNs tend to fail on input images that differ in orientations and sizes. Capsule network (CapsNet) is a recent architecture that has achieved better robustness in representation learning by replacing pooling layers with dynamic routing and convolutional strides, and it has shown promising results on popular tasks such as classification, recognition, segmentation, and natural language processing. Different from CNNs, which result in scalar outputs, CapsNet returns vector outputs, which aim to preserve the part-whole relationships. In this work, we first introduce the limitations of CNNs and fundamentals of CapsNet. We then provide recent developments of CapsNet for the task of medical image segmentation. We finally discuss various effective network architectures to implement a CapsNet for both 2D images and 3D volumetric medical image segmentation.
Submitted 16 March, 2022;
originally announced March 2022.
-
ABN: Agent-Aware Boundary Networks for Temporal Action Proposal Generation
Authors:
Khoa Vo,
Kashu Yamazaki,
Sang Truong,
Minh-Triet Tran,
Akihiro Sugimoto,
Ngan Le
Abstract:
Temporal action proposal generation (TAPG) aims to estimate temporal intervals of actions in untrimmed videos, which is challenging yet plays an important role in many tasks of video analysis and understanding. Despite the great achievement in TAPG, most existing works ignore the human perception of interaction between agents and the surrounding environment by applying a deep learning model as a black-box to the untrimmed videos to extract video visual representation. Therefore, capturing these interactions between agents and the environment is beneficial and can potentially improve the performance of TAPG. In this paper, we propose a novel framework named Agent-Aware Boundary Network (ABN), which consists of two sub-networks: (i) an Agent-Aware Representation Network to obtain both agent-agent and agents-environment relationships in the video representation, and (ii) a Boundary Generation Network to estimate the confidence score of temporal intervals. In the Agent-Aware Representation Network, the interactions between agents are expressed through a local pathway, which operates at a local level to focus on the motions of agents, whereas the overall perception of the surroundings is expressed through a global pathway, which operates at a global level to perceive the effects of agents-environment. Comprehensive evaluations on 20-action THUMOS-14 and 200-action ActivityNet-1.3 datasets with different backbone networks (i.e., C3D, SlowFast and Two-Stream) show that our proposed ABN robustly outperforms state-of-the-art methods regardless of the employed backbone network on TAPG. We further examine the proposal quality by leveraging proposals generated by our method onto temporal action detection (TAD) frameworks and evaluate their detection performances. The source code can be found at https://github.com/vhvkhoa/TAPG-AgentEnvNetwork.git.
Submitted 16 March, 2022;
originally announced March 2022.
-
Doubly-polarized WZ hadronic cross sections at NLO QCD+EW accuracy
Authors:
Duc Ninh Le,
Julien Baglio
Abstract:
We present new results for next-to-leading order (NLO) electroweak (EW) corrections to double polarization signals in the $WZ$ production channel at the LHC using the $e^+ν_e μ^+ μ^-$ final state. It is found that the EW corrections are most sizable in the transverse momentum distributions of the doubly longitudinal polarization, being around -10% compared to the NLO QCD prediction at $p_{T,e}\approx 200$ GeV, which is in the accessible energy range of the current LHC data.
Submitted 2 March, 2022;
originally announced March 2022.
-
Tombo Propeller: Bio-Inspired Deformable Structure toward Collision-Accommodated Control for Drones
Authors:
Son Tien Bui,
Quan Khanh Luu,
Dinh Quang Nguyen,
Nhat Dinh Minh Le,
Giuseppe Loianno,
Van Anh Ho
Abstract:
There is a growing need for vertical take-off and landing vehicles, including drones, which are safe to use and can adapt to collisions. The risks of damage by collision, to humans, obstacles in the environment, and drones themselves, are significant. This has prompted a search into nature for a highly resilient structure that can inform a design of propellers to reduce those risks and enhance safety. Inspired by the flexibility and resilience of dragonfly wings, we propose a novel design for a biomimetic drone propeller called the Tombo propeller. Here, we report on the design and fabrication process of this biomimetic propeller that can accommodate collisions and recover quickly, while maintaining sufficient thrust force to hover and fly. We describe the development of an aerodynamic model and experiments conducted to investigate performance characteristics for various configurations of the propeller morphology, and related properties, such as generated thrust force, thrust force deviation, collision force, recovery time, lift-to-drag ratio, and noise. Finally, we design and showcase a control strategy for a drone equipped with Tombo propellers that collides in mid-air with an obstacle, recovers from the collision, and continues flying. The results show that the maximum collision force generated by the proposed Tombo propeller is less than two-thirds that of a traditional rigid propeller, which suggests the concrete possibility of employing deformable propellers for drones flying in a cluttered environment. This research can contribute to the morphological design of flying vehicles for agile and resilient performance.
Submitted 14 February, 2022;
originally announced February 2022.
-
Finding Approximately Convex Ropes in the Plane
Authors:
Le Hong Trang,
Nguyen Thi Le,
Phan Thanh An
Abstract:
The convex rope problem is to find a counterclockwise or clockwise convex rope starting at the vertex a and ending at the vertex b of a simple polygon P, where a is a vertex of the convex hull of P and b is visible from infinity. The convex rope mentioned is the shortest path joining a and b that does not enter the interior of P. In this paper, the problem is reformulated as one of finding such a shortest path in a simple polygon and is solved by the method of multiple shooting. We then show that if the collinear condition of the method holds at all shooting points, then these shooting points form the shortest path. Otherwise, the sequence of paths obtained by the update of the method converges to the shortest path. The algorithm is implemented in C++ for numerical experiments.
Submitted 5 March, 2023; v1 submitted 17 January, 2022;
originally announced January 2022.
-
SS-3DCapsNet: Self-supervised 3D Capsule Networks for Medical Segmentation on Less Labeled Data
Authors:
Minh Tran,
Loi Ly,
Binh-Son Hua,
Ngan Le
Abstract:
Capsule network is a recent new deep network architecture that has been applied successfully for medical image segmentation tasks. This work extends capsule networks for volumetric medical image segmentation with self-supervised learning. To improve on the problem of weight initialization compared to previous capsule networks, we leverage self-supervised learning for capsule networks pre-training, where our pretext-task is optimized by self-reconstruction. Our capsule network, SS-3DCapsNet, has a UNet-based architecture with a 3D Capsule encoder and 3D CNNs decoder. Our experiments on multiple datasets including iSeg-2017, Hippocampus, and Cardiac demonstrate that our 3D capsule network with self-supervised pre-training considerably outperforms previous capsule networks and 3D-UNets.
Submitted 28 March, 2022; v1 submitted 15 January, 2022;
originally announced January 2022.
-
Tailoring Drug Mobility by Photothermal Heating of Graphene Plasmons
Authors:
Anh D. Phan,
Nguyen K. Ngan,
Do T. Nga,
Nam B. Le,
Chu Viet Ha
Abstract:
We propose a theoretical approach to quantitatively determine the photothermally driven enhancement of molecular mobility of graphene-indomethacin mixtures under infrared laser irradiation. Graphene plasmons absorb incident electromagnetic energy and dissipate it as heat. The absorbed energy depends on the optical properties of graphene plasmons, which are sensitive to structural parameters, and on the concentration of plasmonic nanostructures. Using a theoretical model, we calculate temperature gradients of the bulk drug with different concentrations of graphene plasmons. From these, we determine the temperature dependence of structural molecular relaxation and diffusion of indomethacin and find how the heating process significantly enhances the drug mobility.
Submitted 28 December, 2021;
originally announced December 2021.
-
DAM-AL: Dilated Attention Mechanism with Attention Loss for 3D Infant Brain Image Segmentation
Authors:
Dinh-Hieu Hoang,
Gia-Han Diep,
Minh-Triet Tran,
Ngan T. H Le
Abstract:
While Magnetic Resonance Imaging (MRI) has played an essential role in infant brain analysis, segmenting MRI into a number of tissues such as gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) is crucial and complex due to the extremely low intensity contrast between tissues at around 6-9 months of age as well as amplified noise, myelination, and incomplete volume. In this paper, we tackle those limitations by developing a new deep learning model, named DAM-AL, which contains two main contributions, i.e., dilated attention mechanism and hard-case attention loss. Our DAM-AL network is designed with skip block layers and atrous block convolution. It contains both channel-wise attention at high-level context features and spatial attention at low-level spatial structural features. Our attention loss consists of two terms corresponding to region information and hard samples attention. Our proposed DAM-AL has been evaluated on the infant brain iSeg 2017 dataset and the experiments have been conducted on both validation and testing sets. We have benchmarked DAM-AL on Dice coefficient and ASD metrics and compared it with state-of-the-art methods.
Submitted 27 December, 2021;
originally announced December 2021.
-
Bohr sets in sumsets I: Compact groups
Authors:
Anh N. Le,
Thái Hoàng Lê
Abstract:
Let $G$ be a compact abelian group and $φ_1, φ_2, φ_3$ be continuous endomorphisms on $G$. Under certain natural assumptions on the $φ_i$'s, we prove the existence of Bohr sets in the sumset $φ_1(A) + φ_2(A) + φ_3(A)$, where $A$ is either a set of positive Haar measure, or comes from a finite partition of $G$. The first result generalizes theorems of Bogolyubov and Bergelson-Ruzsa. As a variant of the second result, we show that for any partition $\mathbb{Z} = \bigcup_{i=1}^r A_i$, there exists an $i$ such that $A_i - A_i + sA_i$ contains a Bohr set for any $s \in \mathbb{Z} \setminus \{ 0 \}$. The latter is a step toward an open question of Katznelson and Ruzsa.
Submitted 30 December, 2021; v1 submitted 22 December, 2021;
originally announced December 2021.
-
Monte Carlo calculation of the organ equivalent dose and effective dose due to immersion in a 16N beta source in air using the ICRP Reference Phantoms
Authors:
Jose M. Gomez-Ros,
Montserrat Moraleda,
Pedro Arce,
Duc-Ky Bui,
Thi-My-Linh Dang,
Laurent Desorgher,
Han Sung Kim,
Dragana Krstic,
Michal Kuc,
Ngoc-Thiem Le,
Yi-Kang Lee,
Ngoc-Quynh Nguyen,
Dragoslav Nikezic,
Katarzyna Tyminska,
Tomas Vrba
Abstract:
This work summarises the results of a comparison organized by EURADOS focused on the usage of the ICRP Reference Computational Phantoms. This activity aimed to provide training for the implementation of voxel phantoms in Monte Carlo radiation transport codes and the calculation of the dose equivalent in organs and the effective dose. This particular case describes a scenario of immersion in a 16N beta source distributed in the air of a room with concrete walls where the phantom is located. Seven participants took part in the comparison of results using GEANT4, TRIPOLI-4 and MCNP family codes, and a general problem was detected when calculating the dose to skeletal tissue and the remainder tissue. After a process of feedback with the participants, the errors were corrected and the final results reached an agreement of +/-5%.
Submitted 7 December, 2021;
originally announced December 2021.
-
Hadamard-type inequalities for $k$-positive matrices
Authors:
Nam Q. Le
Abstract:
We establish Hadamard-type inequalities for a class of symmetric matrices called $k$-positive matrices for which the $m$-th elementary symmetric functions of their eigenvalues are positive for all $m\leq k$. These matrices arise naturally in the study of $k$-Hessian equations in Partial Differential Equations. For each $k$-positive matrix, we show that the sum of its principal minors of size $k$ is not larger than the $k$-th elementary symmetric function of their diagonal entries. The case $k=n$ corresponds to the classical Hadamard inequality for positive definite matrices. Some consequences are also obtained.
Submitted 12 December, 2021; v1 submitted 2 December, 2021;
originally announced December 2021.
-
Training Experimentally Robust and Interpretable Binarized Regression Models Using Mixed-Integer Programming
Authors:
Sanjana Tule,
Nhi Ha Lan Le,
Buser Say
Abstract:
In this paper, we explore a model-based approach to training robust and interpretable binarized regression models for multiclass classification tasks using Mixed-Integer Programming (MIP). Our MIP model balances the optimization of prediction margin and model size by using a weighted objective that minimizes the total margin of incorrectly classified training instances, maximizes the total margin of correctly classified training instances, and maximizes the overall model regularization. We conduct two sets of experiments to test the classification accuracy of our MIP model over standard and corrupted versions of multiple classification datasets, respectively. In the first set of experiments, we show that our MIP model outperforms an equivalent Pseudo-Boolean Optimization (PBO) model and achieves competitive results to Logistic Regression (LR) and Gradient Descent (GD) in terms of classification accuracy over the standard datasets. In the second set of experiments, we show that our MIP model outperforms the other models (i.e., GD and LR) in terms of classification accuracy over the majority of the corrupted datasets. Finally, we visually demonstrate the interpretability of our MIP model in terms of its learned parameters over the MNIST dataset. Overall, we show the effectiveness of training robust and interpretable binarized regression models using MIP.
Submitted 19 March, 2022; v1 submitted 1 December, 2021;
originally announced December 2021.
-
Spin and valley ordering of fractional quantum Hall states in monolayer graphene
Authors:
Ngoc Duc Le,
Thierry Jolicoeur
Abstract:
We study spin and valley ordering in the quantum Hall fractions in monolayer graphene at Landau level filling factors $ν_G=-2+n/3$ $(n=2,4,5)$. We use exact diagonalizations on the spherical as well as toroidal geometry by taking into account the effect of realistic anisotropies that break the spin/valley symmetry of the pure Coulomb interaction. We also use a variational method based on eigenstates of the fully $SU(4)$ symmetric limit. For all the fractions we study there are two-component states for which the competing phases are generalizations of those occurring at neutrality $ν_G=0$. They are ferromagnetic, antiferromagnetic, charge-density wave and Kekulé phases, depending on the values of Ising or XY anisotropies in valley space. The varying spin-valley content of the states leads to ground state quantum numbers that are different from the $ν_G=0$ case. For filling factor $ν_G=-2+5/3$ there is a parent state in the $SU(4)$ limit which has a flavor content $(1,1/3,1/3,0)$ where the two components that are one-third filled form a two-component singlet. The addition of anisotropies leads to the formation of new states that have no counterpart at $ν_G=0$. While some of them are predicted by the variational approach, we find notably that negative Ising-like valley anisotropy leads to the formation of a state which is a singlet in both spin and valley space and lies beyond the reach of the variational method. Also, fully spin-polarized two-component states at $ν=-2+4/3$ and $ν=-2+5/3$ display an emergent $SU(2)$ valley symmetry because they do not feel point-contact anisotropies. We discuss implications for current experiments concerning possible spin transitions.
Submitted 30 November, 2021;
originally announced November 2021.
-
Unsteady mass transfer from a core-shell cylinder in crossflow
Authors:
Clément Bielinski,
Nam Le,
Badr Kaoui
Abstract:
Mass transfer from a composite cylinder - made of an inner core and an outer enveloping semipermeable shell - under channel crossflow is studied numerically using two-dimensional lattice-Boltzmann simulations. The core is initially loaded with a solute that diffuses passively through the shell towards the fluid. The cylinder internal structure and the initial condition considered in this study differ and thus complement the classical studies dealing with homogeneous uncoated cylinders whose surfaces are sustained at either constant concentration or constant mass flux. Here, the cylinder acts as a reservoir endowed with a shell that controls the leakage rate of the encapsulated solute. The transition from steady to unsteady laminar flow regime, around the cylinder, alters the released solute spatial distribution and the mass transfer efficiency, which is characterized by the Sherwood number (the dimensionless mass transfer coefficient). Moreover, the reservoir involves unsteady and continuous boundary conditions, which lead to unsteady and nonuniform distribution of both the concentration and the mass flux at the cylinder surface. The effect of adding a coating shell is highlighted, for a given ratio of the cylinder diameter to the channel width, by extracting a correlation from the computed data set. This new correlation shows explicit dependency of the Sherwood number upon the shell solute permeability (the shell mass transfer coefficient).
Submitted 22 November, 2021;
originally announced November 2021.
-
Global-Local Attention for Emotion Recognition
Authors:
Nhat Le,
Khanh Nguyen,
Anh Nguyen,
Bac Le
Abstract:
Human emotion recognition is an active research area in artificial intelligence and has made substantial progress over the past few years. Many recent works mainly focus on facial regions to infer human affect, while the surrounding context information is not effectively utilized. In this paper, we propose a new deep network to effectively recognize human emotions using a novel global-local attention mechanism. Our network is designed to extract features from both facial and context regions independently, then learn them together using the attention module. In this way, both facial and contextual information are used to infer human emotions, thereby enhancing the discrimination of the classifier. Extensive experiments show that our method surpasses the current state-of-the-art methods on recent emotion datasets by a fair margin. Qualitatively, our global-local attention module can extract more meaningful attention maps than previous methods. The source code and trained model of our network are available at https://github.com/minhnhatvt/glamor-net
Submitted 7 November, 2021;
originally announced November 2021.
-
AGN lifetimes in UV-selected galaxies: a clue to supermassive black hole-host galaxy coevolution
Authors:
Xiaozhi Lin,
Yongquan Xue,
Guanwen Fang,
Lulu Fan,
Huynh Anh N. Le,
Ashraf Ayubinia
Abstract:
The coevolution between supermassive black holes (SMBHs) and their host galaxies has been proposed for more than a decade, albeit with little direct evidence about black hole accretion activities regulating galaxy star formation at $z>1$. In this paper, we study the lifetimes of X-ray active galactic nuclei (AGNs) in $UV$-selected red sequence (RS), blue cloud (BC) and green valley (GV) galaxies, finding that AGN accretion activities are most prominent in GV galaxies at $z\sim1.5-2$, compared with RS and BC galaxies. We also compare AGN accretion timescales with typical color transition timescales of $UV$-selected galaxies. We find that the lifetime of GV galaxies at $z\sim1.5-2$ is very close to the typical timescale when the AGNs residing in them stay in the high-accretion-rate mode at these redshifts; for BC galaxies, the consistency between the color transition timescale and the black hole strong accretion lifetime is more likely to happen at lower redshifts ($z<1$). Our results support the scenario where AGN accretion activities govern $UV$ color transitions of host galaxies, making galaxies and their central SMBHs coevolve with each other.
Submitted 26 October, 2021; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Robust optimal control of interacting multi-qubit systems for quantum sensing
Authors:
Nguyen H. Le,
Max Cykiert,
Eran Ginossar
Abstract:
Realising high fidelity entangled states in controlled quantum many-body systems is challenging due to experimental uncertainty in a large number of physical quantities. We develop a robust optimal control method for achieving this goal in finite-size multi-qubit systems despite significant uncertainty in multiple parameters. We demonstrate its effectiveness in the generation of the Greenberger-Horne-Zeilinger state on a star graph of capacitively coupled transmons, and discuss its crucial role for achieving the Heisenberg limit of precision in quantum sensing.
Submitted 24 October, 2021;
originally announced October 2021.
-
AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation
Authors:
Khoa Vo,
Hyekang Joo,
Kashu Yamazaki,
Sang Truong,
Kris Kitani,
Minh-Triet Tran,
Ngan Le
Abstract:
Humans typically perceive the establishment of an action in a video through the interaction between an actor and the surrounding environment. An action only starts when the main actor in the video begins to interact with the environment, while it ends when the main actor stops the interaction. Despite the great progress in temporal action proposal generation, most existing works ignore the aforementioned fact and leave their model learning to propose actions as a black box. In this paper, we make an attempt to simulate that ability of a human by proposing the Actor-Environment Interaction (AEI) network to improve the video representation for temporal action proposal generation. AEI contains two modules, i.e., a perception-based visual representation (PVR) and a boundary-matching module (BMM). PVR represents each video snippet by taking human-human relations and human-environment relations into consideration using the proposed adaptive attention mechanism. Then, the video representation is taken by BMM to generate action proposals. AEI is comprehensively evaluated on the ActivityNet-1.3 and THUMOS-14 datasets, on temporal action proposal and detection tasks, with two boundary-matching architectures (i.e., CNN-based and GCN-based) and two classifiers (i.e., Unet and P-GCN). Our AEI robustly outperforms the state-of-the-art methods with remarkable performance and generalization for both temporal action proposal generation and temporal action detection.
Submitted 24 October, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Scalable and robust quantum computing on qubit arrays with fixed coupling
Authors:
Nguyen H. Le,
Max Cykiert,
Eran Ginossar
Abstract:
We propose a scheme for scalable and robust quantum computing on two-dimensional arrays of qubits with fixed longitudinal coupling. This opens the possibility for bypassing the device complexity associated with tunable couplers required in conventional quantum computing hardware. Our approach is based on driving a subarray of qubits such that the total multi-qubit Hamiltonian can be decomposed into a sum of commuting few-qubit blocks, and then efficient optimization of the unitary evolution within each block. Each driving pulse can implement a target gate on the driven qubits, and at the same time implement identity gates on the neighbouring undriven qubits, cancelling any unwanted evolution due to the constant qubit-qubit interaction. We show that it is possible to realise a universal set of quantum gates with high fidelity on the basis blocks, and by shifting the driving pattern one can realise an arbitrary quantum circuit on the array. Allowing for imperfect Hamiltonian characterisation, we use robust optimal control to obtain fidelities around 99.99% despite 1% uncertainty in the qubit-qubit and drive-qubit couplings, and a detuning uncertainty at 0.1% of the qubit-qubit coupling strength. This robust feature is crucial for scaling up as parameter uncertainty is significant in large devices.
Submitted 27 June, 2022; v1 submitted 14 October, 2021;
originally announced October 2021.
-
Semi-Supervised Adversarial Discriminative Domain Adaptation
Authors:
Thai-Vu Nguyen,
Anh Nguyen,
Nghia Le,
Bac Le
Abstract:
Domain adaptation is a promising method for training a powerful deep neural network when labeled data are absent. More precisely, domain adaptation addresses the limitation called dataset bias or domain shift, which arises when the training dataset and testing dataset are extremely different. Adversarial adaptation is becoming popular among domain adaptation methods. Relying on the idea of GANs, adversarial domain adaptation tries to minimize the distribution discrepancy between training and testing datasets based on an adversarial objective. However, some conventional adversarial domain adaptation methods cannot handle large domain shifts between two datasets, or their generalization ability is poor. In this paper, we propose an improved adversarial domain adaptation method called Semi-Supervised Adversarial Discriminative Domain Adaptation (SADDA), which can overcome the limitations of other domain adaptation methods. We also show that SADDA performs better than other adversarial adaptation methods and illustrate the promise of our method on digit classification and emotion recognition problems.
Submitted 19 October, 2022; v1 submitted 27 September, 2021;
originally announced September 2021.
-
Deformation-Aware Data-Driven Grasp Synthesis
Authors:
Tran Nguyen Le,
Jens Lundell,
Fares J. Abu-Dakka,
Ville Kyrki
Abstract:
Grasp synthesis for 3D deformable objects remains a little-explored topic, most works aiming to minimize deformations. However, deformations are not necessarily harmful -- humans are, for example, able to exploit deformations to generate new potential grasps. How to achieve that on a robot is, however, an open question. This paper proposes an approach that uses object stiffness information in addition to depth images for synthesizing high-quality grasps. We achieve this by incorporating object stiffness as an additional input to a state-of-the-art deep grasp planning network. We also curate a new synthetic dataset of grasps on objects of varying stiffness using the Isaac Gym simulator for training the network. We experimentally validate and compare our proposed approach against the case where we do not incorporate object stiffness on a total of 2800 grasps in simulation and 420 grasps on a real Franka Emika Panda. The experimental results show significant improvement in grasp success rate using the proposed approach on a wide range of objects with varying shapes, sizes, and stiffness. Furthermore, we demonstrate that the approach can generate different grasping strategies for different stiffness values, such as pinching for soft objects and caging for hard objects. Together, the results clearly show the value of incorporating stiffness information when grasping objects of varying stiffness.
Submitted 11 September, 2021;
originally announced September 2021.
-
Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey
Authors:
Ngan Le,
Vidhiwar Singh Rathour,
Kashu Yamazaki,
Khoa Luu,
Marios Savvides
Abstract:
Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have demonstrated the remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision. We start with comprehending the theories of deep learning, reinforcement learning, and deep reinforcement learning. We then propose a categorization of deep reinforcement learning methodologies and discuss their advantages and limitations. In particular, we divide deep reinforcement learning into seven main categories according to their applications in computer vision, i.e., (i) landmark localization; (ii) object detection; (iii) object tracking; (iv) registration on both 2D image and 3D image volumetric data; (v) image segmentation; (vi) video analysis; and (vii) other applications. Each of these categories is further analyzed with reinforcement learning techniques, network design, and performance. Moreover, we provide a comprehensive analysis of the existing publicly available datasets and examine source code availability. Finally, we present some open issues and discuss future research directions on deep reinforcement learning in computer vision.
Submitted 25 August, 2021;
originally announced August 2021.
-
Studying magnetic fields and dust in M17 using polarized thermal dust emission observed by SOFIA/HAWC+
Authors:
Thuong Duc Hoang,
Nguyen Bich Ngoc,
Pham Ngoc Diep,
Le Ngoc Tram,
Thiem Hoang,
Wanggi Lim,
Dieu D. Nguyen,
Ngan Le,
Nguyen Thi Phuong,
Nguyen Fuda,
Tuan Van Bui,
Kate Pattle,
Gia Bao Truong Le,
Hien Phan,
Nguyen Chau Giang
Abstract:
We report the highest spatial resolution measurement of magnetic fields in M17 using thermal dust polarization taken by SOFIA/HAWC+ centered at 154 $μ$m wavelength. Using the Davis-Chandrasekhar-Fermi method, we found the presence of strong magnetic fields of $980 \pm 230\;μ$G and $1665 \pm 885\;μ$G in lower-density (M17-N) and higher-density (M17-S) regions, respectively. The magnetic field morphology in M17-N possibly mimics the fields in gravitational collapse molecular cores while in M17-S the fields run perpendicular to the matter structure and display a pillar and an asymmetric hourglass shape. The mean values of the magnetic field strength are used to determine the Alfvénic Mach numbers ($\mathcal{M_A}$) of M17-N and M17-S, which turn out to be sub-Alfvénic, i.e., magnetic fields dominate turbulence. We calculate the mass-to-flux ratio, $λ$, and obtain $λ=0.07$ for M17-N and $0.28$ for M17-S. The sub-critical values of $λ$ are in agreement with the lack of massive stars formed in M17. To study dust physics, we analyze the relationship between the dust polarization fraction, $p$, and the thermal emission intensity, $I$, gas column density, $N({\rm H_2})$, and dust temperature, $T_{\rm d}$. The polarization fraction decreases with intensity as $I^{-α}$ with $α= 0.51$. The polarization fraction also decreases with increasing $N({\rm H_2})$, which can be explained by the decrease of grain alignment by radiative torques (RATs) toward denser regions with a weaker radiation field and/or tangling of magnetic fields. The polarization fraction tends to increase with $T_{\rm d}$ first and then decreases when $T_{\rm d} > 50$ K. The latter feature, seen in M17-N, where the gas density changes slowly with $T_{\rm d}$, is consistent with the RAT disruption effect.
Submitted 12 November, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Image coding for machines: an end-to-end learned approach
Authors:
Nam Le,
Honglei Zhang,
Francesco Cricri,
Ramin Ghaznavi-Youvalari,
Esa Rahtu
Abstract:
Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of images generated per day, a question arises: how much better would an image codec targeting machine-consumption perform against state-of-the-art codecs targeting human-consumption? In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned. In particular, we propose a set of training strategies that address the delicate problem of balancing competing loss functions, such as computer vision task losses, image distortion losses, and rate loss. Our experimental results show that our NN-based codec outperforms the state-of-the-art Versatile Video Coding (VVC) standard on the object detection and instance segmentation tasks, achieving -37.87% and -32.90% of BD-rate gain, respectively, while being fast thanks to its compact size. To the best of our knowledge, this is the first end-to-end learned machine-targeted image codec.
Submitted 30 August, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Learned Image Coding for Machines: A Content-Adaptive Approach
Authors:
Nam Le,
Honglei Zhang,
Francesco Cricri,
Ramin Ghaznavi-Youvalari,
Hamed Rezazadegan Tavakoli,
Esa Rahtu
Abstract:
Today, according to the Cisco Annual Internet Report (2018-2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos represents a new challenge and opens up new perspectives in the context of data compression. One possible solution approach consists of adapting current human-targeted image and video coding standards to the use case of machine consumption. Another approach consists of developing completely new compression paradigms and architectures for machine-to-machine communications. In this paper, we focus on image compression and present an inference-time content-adaptive finetuning scheme that optimizes the latent representation of an end-to-end learned image codec, aimed at improving the compression efficiency for machine-consumption. The conducted experiments show that our online finetuning brings an average bitrate saving (BD-rate) of -3.66% with respect to our pretrained image codec. In particular, at low bitrate points, our proposed method results in a significant bitrate saving of -9.85%. Overall, our pretrained-and-then-finetuned system achieves -30.54% BD-rate over the state-of-the-art image/video codec Versatile Video Coding (VVC).
Submitted 13 October, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
BiMaL: Bijective Maximum Likelihood Approach to Domain Adaptation in Semantic Scene Segmentation
Authors:
Thanh-Dat Truong,
Chi Nhan Duong,
Ngan Le,
Son Lam Phung,
Chase Rainwater,
Khoa Luu
Abstract:
Semantic segmentation aims to predict pixel-level labels. It has become a popular task in various computer vision applications. While fully supervised segmentation methods have achieved high accuracy on large-scale vision datasets, they are unable to generalize well to a new test environment or a new domain. In this work, we first introduce a new Un-aligned Domain Score to measure the efficiency of a learned model on a new target domain in an unsupervised manner. Then, we present the new Bijective Maximum Likelihood (BiMaL) loss that is a generalized form of the Adversarial Entropy Minimization without any assumption about pixel independence. We have evaluated the proposed BiMaL on two domains. The proposed BiMaL approach consistently outperforms the SOTA methods on empirical experiments on "SYNTHIA to Cityscapes", "GTA5 to Cityscapes", and "SYNTHIA to Vistas".
Submitted 6 August, 2021;
originally announced August 2021.
-
The Right to Talk: An Audio-Visual Transformer Approach
Authors:
Thanh-Dat Truong,
Chi Nhan Duong,
The De Vu,
Hoang Anh Pham,
Bhiksha Raj,
Ngan Le,
Khoa Luu
Abstract:
Turn-taking has played an essential role in structuring the regulation of a conversation. The task of identifying the main speaker (who is properly taking his/her turn of speaking) and the interrupters (who are interrupting or reacting to the main speaker's utterances) remains a challenging task. Although some prior methods have partially addressed this task, there still remain some limitations. Firstly, a direct association of Audio and Visual features may limit the correlations to be extracted due to different modalities. Secondly, the relationship across temporal segments helping to maintain the consistency of localization, separation, and conversation contexts is not effectively exploited. Finally, the interactions between speakers that usually contain the tracking and anticipatory decisions about the transition to a new speaker are usually ignored. Therefore, this work introduces a new Audio-Visual Transformer approach to the problem of localization and highlighting the main speaker in both audio and visual channels of a multi-speaker conversation video in the wild. The proposed method exploits different types of correlations presented in both visual and audio signals. The temporal audio-visual relationships across spatial-temporal space are anticipated and optimized via the self-attention mechanism in a Transformer structure. Moreover, a newly collected dataset is introduced for the main speaker detection. To the best of our knowledge, it is one of the first studies that is able to automatically localize and highlight the main speaker in both visual and audio channels in multi-speaker conversation videos.
Submitted 6 August, 2021;
originally announced August 2021.
-
Toward a better understanding of activation volume and dynamic decoupling of glass-forming liquids under compression
Authors:
Anh D. Phan,
Nguyen K. Ngan,
Nam B. Le,
Le T. M. Thanh
Abstract:
We theoretically investigate physical properties of the pressure-induced activation volume and dynamic decoupling of ternidazole, glycerol, and probucol by the Elastically Collective Nonlinear Langevin Equation theory. Based on the predicted temperature dependence of activated relaxation under various compression, the activation volume is determined to characterize effects of pressure on molecular dynamics of materials. We find that the decoupling of the structural relaxation time of compressed systems from their bulk uncompressed value is governed by the power-law rule. The decoupling exponent exponentially grows with pressure below 2 GPa. The decoupling exponent and activation volume are intercorrelated and have a connection with the differential activation free energy. We numerically and mathematically analyze relationships among these quantities to explain many results in previous experiments and simulations.
Submitted 28 July, 2021;
originally announced July 2021.
-
Towards synthesizing grasps for 3D deformable objects with physics-based simulation
Authors:
Tran Nguyen Le,
Jens Lundell,
Fares J. Abu-Dakka,
Ville Kyrki
Abstract:
Grasping deformable objects is not well researched due to the complexity in modelling and simulating the dynamic behavior of such objects. However, with the rapid development of physics-based simulators that support soft bodies, the research gap between rigid and deformable objects is getting smaller. To leverage the capability of such simulators and to challenge the assumption that has guided robotic grasping research so far, i.e., object rigidity, we proposed a deep-learning based approach that generates stiffness-dependent grasps. Our network is trained on purely synthetic data generated from a physics-based simulator. The same simulator is also used to evaluate the trained network. The results show improvement in terms of grasp ranking and grasp success rate. Furthermore, our network can adapt the grasps based on the stiffness. We are currently validating the proposed approach on a larger test dataset in simulation and on a physical robot.
Submitted 19 July, 2021;
originally announced July 2021.
-
Agent-Environment Network for Temporal Action Proposal Generation
Authors:
Viet-Khoa Vo-Ho,
Ngan Le,
Kashu Yamazaki,
Akihiro Sugimoto,
Minh-Triet Tran
Abstract:
Temporal action proposal generation is an essential and challenging task that aims at localizing temporal intervals containing human actions in untrimmed videos. Most existing approaches are unable to follow the human cognitive process of understanding the video context due to the lack of an attention mechanism to express the concept of an action, an agent who performs the action, or the interaction between the agent and the environment. Based on the action definition that a human, known as an agent, interacts with the environment and performs an action that affects the environment, we propose a contextual Agent-Environment Network (AEN). Our proposed contextual AEN involves (i) an agent pathway, operating at a local level to tell which humans/agents are acting, and (ii) an environment pathway, operating at a global level to tell how the agents interact with the environment. Comprehensive evaluations on the 20-action THUMOS-14 and 200-action ActivityNet-1.3 datasets with different backbone networks, i.e., C3D and SlowFast, show that our method robustly outperforms state-of-the-art methods regardless of the employed backbone network.
Submitted 16 March, 2022; v1 submitted 17 July, 2021;
originally announced July 2021.
-
FedXGBoost: Privacy-Preserving XGBoost for Federated Learning
Authors:
Nhan Khanh Le,
Yang Liu,
Quang Minh Nguyen,
Qingchen Liu,
Fangzhou Liu,
Quanwei Cai,
Sandra Hirche
Abstract:
Federated learning is a distributed machine learning framework that enables collaborative training across multiple parties while ensuring data privacy. Practical adaptation of XGBoost, the state-of-the-art tree boosting framework, to federated learning remains limited due to the high cost incurred by conventional privacy-preserving methods. To address the problem, we propose two variants of federated XGBoost with privacy guarantees: FedXGBoost-SMM and FedXGBoost-LDP. Our first protocol, FedXGBoost-SMM, deploys an enhanced secure matrix multiplication method to preserve privacy with lossless accuracy and lower overhead than encryption-based techniques. Developed independently, the second protocol, FedXGBoost-LDP, is heuristically designed with noise perturbation for local differential privacy, and empirically evaluated on real-world and synthetic datasets.
Submitted 12 August, 2021; v1 submitted 20 June, 2021;
originally announced June 2021.
-
Optimal boundary regularity for some singular Monge-Ampère equations on bounded convex domains
Authors:
Nam Q. Le
Abstract:
By constructing explicit supersolutions, we obtain the optimal global Hölder regularity for several singular Monge-Ampère equations on general bounded open convex domains, including those related to complete affine hyperbolic spheres and proper affine hyperspheres. Our analysis reveals that certain singular-looking equations, such as $ \det D^2 u = |u|^{-n-2-k} (x\cdot Du -u)^{-k} $ with zero boundary data, have an unexpectedly degenerate nature.
Submitted 20 April, 2021;
originally announced April 2021.
-
Integer solutions of $a^2+ab+b^2=7^n$
Authors:
Tien Nam Le,
Tung Tho Nguyen
Abstract:
In this article we give two different proofs of the fact that there exist relatively prime positive integers $a,b$ such that $a^2+ab+b^2=7^n$.
Submitted 18 April, 2021;
originally announced April 2021.
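The statement in the abstract above can be checked numerically for small exponents. The following is only a brute-force sketch and not one of the article's proofs; the function name `coprime_solutions` and the convention $a \le b$ are assumptions made for illustration. For each candidate $a$ it solves the quadratic in $b$ and keeps coprime positive pairs.

```python
from math import gcd, isqrt

def coprime_solutions(n):
    """Return coprime positive pairs (a, b), a <= b, with a^2 + ab + b^2 = 7^n.

    Hypothetical helper for illustration: a brute-force check, not a proof.
    """
    target = 7 ** n
    solutions = []
    for a in range(1, isqrt(target) + 1):
        # b solves b^2 + a*b + (a^2 - target) = 0, so the discriminant
        # 4*target - 3*a^2 must be a perfect square.
        disc = 4 * target - 3 * a * a
        s = isqrt(disc)
        if s * s != disc or (s - a) % 2 != 0:
            continue
        b = (s - a) // 2
        if b >= a and gcd(a, b) == 1:
            solutions.append((a, b))
    return solutions

for n in range(1, 6):
    print(n, coprime_solutions(n))
```

For example, $n=1$ gives $(1,2)$ since $1+2+4=7$, and $n=2$ gives $(3,5)$ since $9+15+25=49$.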
-
Sublacunary sets and interpolation sets for nilsequences
Authors:
Anh N. Le
Abstract:
A set $E \subset \mathbb{N}$ is an interpolation set for nilsequences if every bounded function on $E$ can be extended to a nilsequence on $\mathbb{N}$. Following a theorem of Strzelecki, every lacunary set is an interpolation set for nilsequences. We show that sublacunary sets are not interpolation sets for nilsequences. Furthermore, we prove that the union of an interpolation set for nilsequences and a finite set is an interpolation set for nilsequences. Lastly, we provide a new class of interpolation sets for Bohr almost periodic sequences and, as a result, obtain a new example of an interpolation set for $2$-step nilsequences which is not an interpolation set for Bohr almost periodic sequences.
Submitted 24 March, 2021;
originally announced March 2021.
-
Roughness Index and Roughness Distance for Benchmarking Medical Segmentation
Authors:
Vidhiwar Singh Rathour,
Kashu Yamakazi,
T. Hoang Ngan Le
Abstract:
Medical image segmentation is one of the most challenging tasks in medical image analysis and has been widely developed for many clinical applications. Most of the existing metrics were first designed for natural images and then extended to medical images. While the object surface plays an important role in medical segmentation and quantitative analysis, e.g., analyzing a brain tumor surface or measuring gray matter volume, most of the existing metrics are limited when it comes to analyzing the object surface, especially in describing the surface smoothness or roughness of a given volumetric object or in analyzing topological errors. In this paper, we first analyze the pros and cons of all existing medical image segmentation metrics, especially on volumetric data. We then propose an appropriate roughness index and roughness distance for medical image segmentation analysis and evaluation. Our proposed method addresses two kinds of segmentation errors, i.e., (i) topological errors on the boundary/surface and (ii) irregularities on the boundary/surface. The contribution of this work is four-fold: we (i) detect irregular spikes/holes on a surface, (ii) propose a roughness index to measure the surface roughness of a given object, (iii) propose a roughness distance to measure the distance between two boundaries/surfaces by utilizing the proposed roughness index, and (iv) suggest an algorithm that helps to remove the irregular spikes/holes to smooth the surface. Our proposed roughness index and roughness distance are built upon the solid surface roughness parameter, which has been successfully developed in civil engineering.
Submitted 23 March, 2021;
originally announced March 2021.
-
Analytic Filter Function Derivatives for Quantum Optimal Control
Authors:
Isabel Nha Minh Le,
Julian D. Teske,
Tobias Hangleiter,
Pascal Cerfontaine,
Hendrik Bluhm
Abstract:
Auto-correlated noise appears in many solid state qubit systems and hence needs to be taken into account when developing gate operations for quantum information processing. However, explicitly simulating this kind of noise is often less efficient than approximate methods. Here, we focus on the filter function formalism, which allows the computation of gate fidelities in the presence of auto-correlated classical noise. Hence, this formalism can be combined with optimal control algorithms to design control pulses that optimally implement quantum gates. To enable the use of gradient-based algorithms with fast convergence, we present analytically derived filter function gradients with respect to control pulse amplitudes, and analyze the computational complexity of our results. When comparing pulse optimization using our derivatives to a gradient-free approach, we find that the gradient-based method is roughly two orders of magnitude faster for our test cases. We also provide a modular computational implementation compatible with quantum optimal control packages.
Submitted 19 April, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Invertible Residual Network with Regularization for Effective Medical Image Segmentation
Authors:
Kashu Yamazaki,
Vidhiwar Singh Rathour,
T. Hoang Ngan Le
Abstract:
Deep Convolutional Neural Networks (CNNs), e.g., Residual Networks (ResNets), have been used successfully for many computer vision tasks, but are difficult to scale to 3D volumetric medical data. Memory is increasingly often the bottleneck when training 3D CNNs. Recently, invertible neural networks have been applied to significantly reduce the activation memory footprint when training neural networks with backpropagation, thanks to invertible functions that allow retrieving a layer's input from its output without storing intermediate activations in memory to perform the backpropagation.
Among many successful network architectures, 3D Unet has been established as a standard architecture for volumetric medical segmentation. Thus, we choose 3D Unet as a baseline for a non-invertible network and then extend it with the invertible residual network. In this paper, we propose two versions of the invertible Residual Network, namely the Partially Invertible Residual Network (Partially-InvRes) and the Fully Invertible Residual Network (Fully-InvRes). In Partially-InvRes, the invertible residual layer is defined by a technique called additive coupling, whereas in Fully-InvRes, both invertible upsampling and downsampling operations are learned based on squeezing (known as pixel shuffle). Furthermore, to avoid overfitting caused by limited training data, a variational auto-encoder (VAE) branch is added to reconstruct the input volumetric data itself. Our results indicate that by using partially/fully invertible networks as the central workhorse in volumetric segmentation, we not only reduce memory overhead but also achieve segmentation performance comparable to that of the non-invertible 3D Unet. We have demonstrated the proposed networks on various volumetric datasets such as iSeg 2019 and BraTS 2020.
Submitted 16 March, 2021;
originally announced March 2021.
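The additive coupling named in the abstract above can be illustrated with a minimal sketch. It assumes the common two-branch form in which the input is split into halves `x1` and `x2`; the residual functions `f` and `g` below are hypothetical stand-ins for learned sub-networks, and nothing here is taken from the authors' implementation. The point is that the input is recovered exactly from the output by subtraction, so intermediate activations need not be stored.

```python
import math

def f(x):
    # Hypothetical residual branch; any function works, it need not be invertible.
    return [math.tanh(v) for v in x]

def g(x):
    # Second hypothetical residual branch.
    return [0.5 * v * v for v in x]

def forward(x1, x2):
    # Additive coupling: each half is shifted by a function of the other half.
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def inverse(y1, y2):
    # Undo the shifts in reverse order to recover the inputs.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

x1, x2 = [0.5, -1.0], [2.0, 0.25]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(r1, r2)  # matches x1, x2 up to floating-point rounding
```

Because only additions and subtractions of the branch outputs are inverted, `f` and `g` can be arbitrary functions.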
-
Deep reinforcement learning in medical imaging: A literature review
Authors:
S. Kevin Zhou,
Hoang Ngan Le,
Khoa Luu,
Hien V. Nguyen,
Nicholas Ayache
Abstract:
Deep reinforcement learning (DRL) augments the reinforcement learning framework, which learns a sequence of actions that maximizes the expected reward, with the representational power of deep neural networks. Recent works have demonstrated the great potential of DRL in medicine and healthcare. This paper presents a literature review of DRL in medical imaging. We start with a comprehensive tutorial of DRL, including the latest model-free and model-based algorithms. We then cover existing DRL applications for medical imaging, which are roughly divided into three main categories: (i) parametric medical image analysis tasks including landmark detection, object/lesion detection, registration, and view plane localization; (ii) solving optimization tasks including hyperparameter tuning, selecting augmentation strategies, and neural architecture search; and (iii) miscellaneous applications including surgical gesture segmentation, personalized mobile health intervention, and computational model personalization. The paper concludes with discussions of future perspectives.
Submitted 5 March, 2021;
originally announced March 2021.
-
Study of phonon transport across several Si/Ge interfaces using full-band phonon Monte Carlo simulation
Authors:
N. D. Le,
B. Davier,
N. Izitounene,
P. Dollfus,
J. Saint-Martin
Abstract:
A Full Band Monte Carlo simulator has been developed to consider phonon transmission across interfaces that are perpendicular to the heat flux. This solver of the Boltzmann transport equation, which does not require any assumption on the shape of the phonon distribution, can naturally consider all phonon transport regimes from the diffusive to the fully ballistic regime. Hence, this simulator is used to study single and double Si/Ge heterostructures from the micrometer scale down to the nanometer scale, i.e., in all phonon transport regimes from fully diffusive to ballistic. A methodology to estimate the thermal conductivities and the thermal interfaces is presented.
Submitted 5 May, 2022; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Self-Supervised Learning via multi-Transformation Classification for Action Recognition
Authors:
Duc Quang Vu,
Ngan T. H. Le,
Jia-Ching Wang
Abstract:
Self-supervised tasks have been utilized to build useful representations that can be used in downstream tasks when annotation is unavailable. In this paper, we introduce a self-supervised video representation learning method based on multi-transformation classification to efficiently classify human actions. Self-supervised learning on various transformations not only provides richer contextual information but also makes the visual representation more robust to the transforms. The spatio-temporal representation of the video is learned in a self-supervised manner by classifying seven different transformations, i.e., rotation, clip inversion, permutation, split, join transformation, color switch, frame replacement, noise addition. First, seven different video transformations are applied to video clips. Then 3D convolutional neural networks are utilized to extract features for the clips, and these features are processed to classify the pseudo-labels. We use the models learned in the pretext tasks as pre-trained models and fine-tune them to recognize human actions in the downstream task. We have conducted experiments on the UCF101 and HMDB51 datasets with C3D and 3D Resnet-18 as backbone networks. The experimental results show that our proposed framework outperforms other SOTA self-supervised action recognition approaches. The code will be made publicly available.
Submitted 20 February, 2021;
originally announced February 2021.
-
Position-controlled functionalization of vacancies in silicon by single-ion implanted germanium atoms
Authors:
Simona Achilli,
Nguyen H. Le,
Guido Fratesi,
Nicola Manini,
Giovanni Onida,
Marco Turchetti,
Giorgio Ferrari,
Takahiro Shinada,
Takashi Tanii,
Enrico Prati
Abstract:
Special point defects in semiconductors have been envisioned as suitable components for quantum-information technology. The identification of new deep centers in silicon that can be easily activated and controlled is a main target of the research in the field. Vacancy-related complexes are suitable to provide deep electronic levels but they are hard to control spatially. In the spirit of investigating solid state devices with intentional vacancy-related defects at controlled positions, here we report on the functionalization of silicon vacancies by implanting Ge atoms through single-ion implantation, producing Ge-vacancy (GeV) complexes. We investigate the quantum transport through an array of GeV complexes in a silicon-based transistor. By exploiting a model based on an extended Hubbard Hamiltonian derived from ab-initio results, we find anomalous activation energy values of the thermally activated conductance of both quasi-localized and delocalized many-body states, compared to conventional dopants. We identify such states, forming the upper Hubbard band, as responsible for the experimental sub-threshold transport across the transistor. The combination of our model with the single-ion implantation method enables future research for the engineering of GeV complexes towards the creation of spatially controllable individual defects in silicon for applications in quantum information technologies.
Submitted 4 February, 2021; v1 submitted 2 February, 2021;
originally announced February 2021.
-
Additive averages of multiplicative correlation sequences and applications
Authors:
Sebastián Donoso,
Anh N. Le,
Joel Moreira,
Wenbo Sun
Abstract:
We study sets of recurrence, in both measurable and topological settings, for actions of $(\mathbb{N},\times)$ and $(\mathbb{Q}^{>0},\times)$. In particular, we show that autocorrelation sequences of positive functions arising from multiplicative systems have positive additive averages. We also give criteria for when sets of the form $\{(an+b)^{\ell}/(cn+d)^{\ell}: n \in \mathbb{N}\}$ are sets of multiplicative recurrence, and consequently we recover two recent results in number theory regarding completely multiplicative functions and the Omega function.
Submitted 26 April, 2022; v1 submitted 7 January, 2021;
originally announced January 2021.
-
Energy Efficiency Maximization in RIS-Aided Cell-Free Network with Limited Backhaul
Authors:
Quang Nhat Le,
Van-Dinh Nguyen,
Octavia A. Dobre,
Ruiqin Zhao
Abstract:
Integrating the reconfigurable intelligent surface in a cell-free (RIS-CF) network is an effective solution to improve the capacity and coverage of future wireless systems with low cost and power consumption. The reflecting coefficients of RISs can be programmed to enhance signals received at users. This letter addresses a joint design of transmit beamformers at access points and reflecting coefficients at RISs to maximize the energy efficiency (EE) of RIS-CF networks, taking into account the limited backhaul capacity constraints. Because the resulting nonconvex problem is computationally very challenging, we develop a simple yet efficient alternating descent algorithm for its solution. Numerical results verify that the EE of RIS-CF networks is greatly improved, showing the benefit of using RISs.
Submitted 8 March, 2021; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Grant-Free Random Access in Machine-Type Communication: Approaches and Challenges
Authors:
Jinho Choi,
Jie Ding,
Ngoc Phuc Le,
Zhiguo Ding
Abstract:
Massive machine-type communication (MTC) is expected to play a key role in supporting Internet of Things (IoT) applications such as smart cities, smart factories, and connected vehicles through cellular networks. MTC is characterized by a large number of MTC devices and their sparse activities, which are difficult to support with conventional approaches and motivate the design of new access technologies. In particular, in the 5th generation (5G), grant-free or 2-step random access schemes are introduced to make MTC more efficient by reducing signaling overhead. In this article, we first introduce grant-free random access and discuss how it can be modified with massive multiple-input multiple-output (MIMO) to exploit a high spatial multiplexing gain. We then explain preamble designs that can improve the performance and variations based on the notion of non-orthogonal multiple access (NOMA). Finally, design challenges of grant-free random access towards next generation cellular systems are presented.
Submitted 18 December, 2020;
originally announced December 2020.
-
Multi-FinGAN: Generative Coarse-To-Fine Sampling of Multi-Finger Grasps
Authors:
Jens Lundell,
Enric Corona,
Tran Nguyen Le,
Francesco Verdoja,
Philippe Weinzaepfel,
Gregory Rogez,
Francesc Moreno-Noguer,
Ville Kyrki
Abstract:
While there exist many methods for manipulating rigid objects with parallel-jaw grippers, grasping with multi-finger robotic hands remains a quite unexplored research topic. Reasoning and planning collision-free trajectories on the additional degrees of freedom of several fingers represents an important challenge that, so far, involves computationally costly and slow processes. In this work, we present Multi-FinGAN, a fast generative multi-finger grasp sampling method that synthesizes high quality grasps directly from RGB-D images in about a second. We achieve this by training in an end-to-end fashion a coarse-to-fine model composed of a classification network that distinguishes grasp types according to a specific taxonomy and a refinement network that produces refined grasp poses and joint angles. We experimentally validate and benchmark our method against a standard grasp-sampling method on 790 grasps in simulation and 20 grasps on a real Franka Emika Panda. All experimental results using our method show consistent improvements both in terms of grasp quality metrics and grasp success rate. Remarkably, our approach is up to 20-30 times faster than the baseline, a significant improvement that opens the door to feedback-based grasp re-planning and task-informative grasping. Code is available at https://irobotics.aalto.fi/multi-fingan/.
Submitted 15 March, 2021; v1 submitted 17 December, 2020;
originally announced December 2020.