Showing 1–50 of 69 results for author: Kasaei, H

  1. arXiv:2406.18746 [pdf, other]

    cs.RO

    Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models

    Authors: Georgios Tziafas, Hamidreza Kasaei

    Abstract: Large Language Models (LLMs) have emerged as a new paradigm for embodied reasoning and control, most recently by generating robot policy code that utilizes a custom library of vision and control primitive skills. However, prior arts fix their skills library and steer the LLM with carefully hand-crafted prompt engineering, limiting the agent to a stationary range of addressable tasks. In this work,…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Published ICRA-24

  2. arXiv:2406.18742 [pdf, other]

    cs.CV cs.RO

    3D Feature Distillation with Object-Centric Priors

    Authors: Georgios Tziafas, Yucheng Xu, Zhibin Li, Hamidreza Kasaei

    Abstract: Grounding natural language to the physical world is a ubiquitous topic with a wide range of applications in computer vision and robotics. Recently, 2D vision-language models such as CLIP have been widely popularized, due to their impressive capabilities for open-vocabulary grounding in 2D images. Recent works aim to elevate 2D CLIP features to 3D via feature distillation, but either learn neural f…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Submitted CoRL-24

  3. arXiv:2406.18722 [pdf, other]

    cs.RO cs.CV

    Towards Open-World Grasping with Large Vision-Language Models

    Authors: Georgios Tziafas, Hamidreza Kasaei

    Abstract: The ability to grasp objects in-the-wild from open-ended language instructions constitutes a fundamental challenge in robotics. An open-world grasping system should be able to combine high-level contextual with low-level physical-geometric reasoning in order to be applicable in arbitrary scenarios. Recent works exploit the web-scale knowledge inherent in large language models (LLMs) to plan and re…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Submitted CoRL24

  4. arXiv:2402.16045 [pdf, other]

    cs.RO

    Harnessing the Synergy between Pushing, Grasping, and Throwing to Enhance Object Manipulation in Cluttered Scenarios

    Authors: Hamidreza Kasaei, Mohammadreza Kasaei

    Abstract: In this work, we delve into the intricate synergy among non-prehensile actions like pushing, and prehensile actions such as grasping and throwing, within the domain of robotic manipulation. We introduce an innovative approach to learning these synergies by leveraging model-free deep reinforcement learning. The robot's workflow involves detecting the pose of the target object and the basket at each…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted at the 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

  5. arXiv:2311.05779 [pdf, other]

    cs.RO cs.CV

    Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter

    Authors: Georgios Tziafas, Yucheng Xu, Arushi Goel, Mohammadreza Kasaei, Zhibin Li, Hamidreza Kasaei

    Abstract: Robots operating in human-centric environments require the integration of visual grounding and grasping capabilities to effectively manipulate objects based on user instructions. This work focuses on the task of referring grasp synthesis, which predicts a grasp pose for an object referred through natural language in cluttered scenes. Existing approaches often employ multi-stage pipelines that firs…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Poster CoRL 2023. Dataset and code available here: https://github.com/gtziafas/OCID-VLG

  6. arXiv:2310.16123 [pdf, other]

    cs.LG

    Anchor Space Optimal Transport: Accelerating Batch Processing of Multiple OT Problems

    Authors: Jianming Huang, Xun Su, Zhongxi Fang, Hiroyuki Kasai

    Abstract: The optimal transport (OT) theory provides an effective way to compare probability distributions on a defined metric space, but it suffers from cubic computational complexity. Although the Sinkhorn's algorithm greatly reduces the computational complexity of OT solutions, the solutions of multiple OT problems are still time-consuming and memory-consuming in practice. However, many works on the comp…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 26 pages, 4 figures, 6 tables

  7. arXiv:2310.07937 [pdf, other]

    cs.RO cs.AI

    Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using Large Language Models

    Authors: Bangguo Yu, Hamidreza Kasaei, Ming Cao

    Abstract: In advanced human-robot interaction tasks, visual target navigation is crucial for autonomous robots navigating unknown environments. While numerous approaches have been developed in the past, most are designed for single-robot operations, which often suffer from reduced efficiency and robustness due to environmental complexities. Furthermore, learning policies for multi-robot collaboration are re…

    Submitted 25 December, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 7 pages, 4 figures, conference

  8. arXiv:2307.00247 [pdf, other]

    math.OC cs.LG

    Safe Screening for Unbalanced Optimal Transport

    Authors: Xun Su, Zhongxi Fang, Hiroyuki Kasai

    Abstract: This paper introduces a framework that utilizes the Safe Screening technique to accelerate the optimization process of the Unbalanced Optimal Transport (UOT) problem by proactively identifying and eliminating zero elements in the sparse solutions. We demonstrate the feasibility of applying Safe Screening to the UOT problem with $\ell_2$-penalty and KL-penalty by conducting an analysis of the solut…

    Submitted 1 July, 2023; originally announced July 2023.

  9. arXiv:2306.15919 [pdf, other]

    cs.CV cs.AI

    Fine-grained 3D object recognition: an approach and experiments

    Authors: Junhyung Jo, Hamidreza Kasaei

    Abstract: Three-dimensional (3D) object recognition technology is being used as a core technology in advanced technologies such as autonomous driving of automobiles. There are two sets of approaches for 3D object recognition: (i) hand-crafted approaches like Global Orthographic Object Descriptor (GOOD), and (ii) deep learning-based approaches such as MobileNet and VGG. However, it is needed to know which of…

    Submitted 28 June, 2023; originally announced June 2023.

  10. arXiv:2304.07236 [pdf, other]

    cs.RO

    Learning Perceptive Bipedal Locomotion over Irregular Terrain

    Authors: Bart van Marum, Matthia Sabatelli, Hamidreza Kasaei

    Abstract: In this paper we propose a novel bipedal locomotion controller that uses noisy exteroception to traverse a wide variety of terrains. Building on the cutting-edge advancements in attention based belief encoding for quadrupedal locomotion, our work extends these methods to the bipedal domain, resulting in a robust and reliable internal belief of the terrain ahead despite noisy sensor inputs. Additio…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: 8 pages, 10 figures

  11. Frontier Semantic Exploration for Visual Target Navigation

    Authors: Bangguo Yu, Hamidreza Kasaei, Ming Cao

    Abstract: This work focuses on the problem of visual target navigation, which is very important for autonomous robots as it is closely related to high-level tasks. To find a special object in unknown environments, classical and learning-based approaches are fundamental components of navigation that have been investigated thoroughly in the past. However, due to the difficulty in the representation of complic…

    Submitted 25 December, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: 7 pages

    Journal ref: 2023 IEEE International Conference on Robotics and Automation (ICRA)

  12. L3MVN: Leveraging Large Language Models for Visual Target Navigation

    Authors: Bangguo Yu, Hamidreza Kasaei, Ming Cao

    Abstract: Visual target navigation in unknown environments is a crucial problem in robotics. Despite extensive investigation of classical and learning-based approaches in the past, robots lack common-sense knowledge about household objects and layouts. Prior state-of-the-art approaches to this task rely on learning the priors during the training and typically require significant expensive resources and time…

    Submitted 25 December, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: 7 pages

    Journal ref: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  13. arXiv:2303.05323 [pdf, other]

    cs.CV

    Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE

    Authors: Yucheng Xu, Li Nanbo, Arushi Goel, Zijian Guo, Zonghai Yao, Hamidreza Kasaei, Mohammadreze Kasaei, Zhibin Li

    Abstract: Videos depict the change of complex dynamical systems over time in the form of discrete image sequences. Generating controllable videos by learning the dynamical system is an important yet underexplored topic in the computer vision community. This paper presents a novel framework, TiV-ODE, to generate highly controllable videos from a static image and a text caption. Specifically, our framework le…

    Submitted 4 April, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

  14. arXiv:2302.07824 [pdf, other]

    cs.RO

    Instance-wise Grasp Synthesis for Robotic Grasping

    Authors: Yucheng Xu, Mohammadreza Kasaei, Hamidreza Kasaei, Zhibin Li

    Abstract: Generating high-quality instance-wise grasp configurations provides critical information of how to grasp specific objects in a multi-object environment and is of high importance for robot manipulation tasks. This work proposed a novel Single-Stage Grasp (SSG) synthesis network, which performs high-quality instance-wise grasp synthesis in a single stage: instance mask and…

    Submitted 15 February, 2023; originally announced February 2023.

  15. arXiv:2301.07037 [pdf, other]

    cs.CV cs.AI

    Explain What You See: Open-Ended Segmentation and Recognition of Occluded 3D Objects

    Authors: H. Ayoobi, H. Kasaei, M. Cao, R. Verbrugge, B. Verheij

    Abstract: Local-HDP (for Local Hierarchical Dirichlet Process) is a hierarchical Bayesian method that has recently been used for open-ended 3D object category recognition. This method has been proven to be efficient in real-time robotic applications. However, the method is not robust to a high degree of occlusion. We address this limitation in two steps. First, we propose a novel semantic 3D object-parts se…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: Accepted at ICRA 2023 Conference

  16. arXiv:2210.04613 [pdf, other]

    cs.CV cs.AI

    Enhancing Fine-Grained 3D Object Recognition using Hybrid Multi-Modal Vision Transformer-CNN Models

    Authors: Songsong Xiong, Georgios Tziafas, Hamidreza Kasaei

    Abstract: Robots operating in human-centered environments, such as retail stores, restaurants, and households, are often required to distinguish between similar objects in different contexts with a high degree of accuracy. However, fine-grained object recognition remains a challenge in robotics due to the high intra-category and low inter-category dissimilarities. In addition, the limited number of fine-gra…

    Submitted 6 March, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  17. arXiv:2210.03628 [pdf, other]

    cs.RO cs.AI

    GraspCaps: A Capsule Network Approach for Familiar 6DoF Object Grasping

    Authors: Tomas van der Velde, Hamed Ayoobi, Hamidreza Kasaei

    Abstract: As robots become more widely available outside industrial settings, the need for reliable object grasping and manipulation is increasing. In such environments, robots must be able to grasp and manipulate novel objects in various situations. This paper presents GraspCaps, a novel architecture based on Capsule Networks for generating per-point 6D grasp configurations for familiar objects. GraspCaps…

    Submitted 29 November, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: Submitted to CVPR 2023, Supplementary video: https://youtu.be/d13rEhKgApI?si=EhgbDI84nlXL5V2M

  18. arXiv:2210.00858 [pdf, other]

    cs.RO cs.AI cs.HC

    Enhancing Interpretability and Interactivity in Robot Manipulation: A Neurosymbolic Approach

    Authors: Georgios Tziafas, Hamidreza Kasaei

    Abstract: In this paper we present a neurosymbolic architecture for coupling language-guided visual reasoning with robot manipulation. A non-expert human user can prompt the robot using unconstrained natural language, providing a referring expression (REF), a question (VQA), or a grasp action instruction. The system tackles all cases in a task-agnostic fashion through the utilization of a shared library of…

    Submitted 7 May, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Submitted T-RO

  19. arXiv:2210.00843 [pdf, other]

    cs.CV cs.RO

    Early or Late Fusion Matters: Efficient RGB-D Fusion in Vision Transformers for 3D Object Recognition

    Authors: Georgios Tziafas, Hamidreza Kasaei

    Abstract: The Vision Transformer (ViT) architecture has established its place in computer vision literature, however, training ViTs for RGB-D object recognition remains an understudied topic, viewed in recent literature only through the lens of multi-task pretraining in multiple vision modalities. Such approaches are often computationally intensive, relying on the scale of multiple pretraining datasets to a…

    Submitted 7 March, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Submitted IROS 23. Supplementary video here: https://youtu.be/L2gkDPkHsfo

  20. arXiv:2210.00803 [pdf, other]

    cs.RO cs.AI

    IPPO: Obstacle Avoidance for Robotic Manipulators in Joint Space via Improved Proximal Policy Optimization

    Authors: Yongliang Wang, Hamidreza Kasaei

    Abstract: Reaching tasks with random targets and obstacles is a challenging task for robotic manipulators. In this study, we propose a novel model-free reinforcement learning approach based on proximal policy optimization (PPO) for training a deep policy to map the task space to the joint space of a 6-DoF manipulator. To facilitate the training process in a large workspace, we develop an efficient represent…

    Submitted 9 February, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  21. arXiv:2210.00609 [pdf, other]

    cs.RO

    Throwing Objects into A Moving Basket While Avoiding Obstacles

    Authors: Hamidreza Kasaei, Mohammadreza Kasaei

    Abstract: The capabilities of a robot will be increased significantly by exploiting throwing behavior. In particular, throwing will enable robots to rapidly place the object into the target basket, located outside its feasible kinematic space, without traveling to the desired location. In previous approaches, the robot often learned a parameterized throwing kernel through analytical approaches, imitation le…

    Submitted 2 October, 2022; originally announced October 2022.

    Comments: The video of our experiments can be found at https://youtu.be/VmIFF__c_84

  22. arXiv:2207.04216 [pdf, other]

    cs.LG cs.AI

    Wasserstein Graph Distance Based on $L_1$-Approximated Tree Edit Distance between Weisfeiler-Lehman Subtrees

    Authors: Zhongxi Fang, Jianming Huang, Xun Su, Hiroyuki Kasai

    Abstract: The Weisfeiler-Lehman (WL) test is a widely used algorithm in graph machine learning, including graph kernels, graph metrics, and graph neural networks. However, it focuses only on the consistency of the graph, which means that it is unable to detect slight structural differences. Consequently, this limits its ability to capture structural information, which also limits the performance of existing…

    Submitted 1 May, 2023; v1 submitted 9 July, 2022; originally announced July 2022.

  23. arXiv:2205.13846 [pdf, ps, other]

    cs.LG cs.AI math.OC

    On the Convergence of Semi-Relaxed Sinkhorn with Marginal Constraint and OT Distance Gaps

    Authors: Takumi Fukunaga, Hiroyuki Kasai

    Abstract: This paper presents consideration of the Semi-Relaxed Sinkhorn (SR-Sinkhorn) algorithm for the semi-relaxed optimal transport (SROT) problem, which relaxes one marginal constraint of the standard OT problem. For evaluation of how the constraint relaxation affects the algorithm behavior and solution, it is vitally necessary to present the theoretical convergence analysis in terms not only of the fu…

    Submitted 27 May, 2022; originally announced May 2022.

  24. Block-coordinate Frank-Wolfe algorithm and convergence analysis for semi-relaxed optimal transport problem

    Authors: Takumi Fukunaga, Hiroyuki Kasai

    Abstract: The optimal transport (OT) problem has been used widely for machine learning. It is necessary for computation of an OT problem to solve linear programming with tight mass-conservation constraints. These constraints prevent its application to large-scale problems. To address this issue, loosening such constraints enables us to propose the relaxed-OT method using a faster algorithm. This approach ha…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2022). arXiv admin note: substantial text overlap with arXiv:2103.05857

  25. arXiv:2205.12089 [pdf, other]

    cs.CV cs.AI

    Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution

    Authors: Georgios Tziafas, Hamidreza Kasaei

    Abstract: Service robots should be able to interact naturally with non-expert human users, not only to help them in various tasks but also to receive guidance in order to resolve ambiguities that might be present in the instruction. We consider the task of visual grounding, where the agent segments an object from a crowded scene given a natural language description. Modern holistic approaches to visual grou…

    Submitted 10 July, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted CoLLAs 2022

  26. arXiv:2205.01982 [pdf, other]

    cs.RO cs.LG

    Lifelong Ensemble Learning based on Multiple Representations for Few-Shot Object Recognition

    Authors: Hamidreza Kasaei, Songsong Xiong

    Abstract: Service robots are integrating more and more into our daily lives to help us with various tasks. In such environments, robots frequently face new objects while working in the environment and need to learn them in an open-ended fashion. Furthermore, such robots must be able to recognize a wide range of object categories. In this paper, we present a lifelong ensemble learning approach based on multi…

    Submitted 9 January, 2024; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: The paper has been accepted for publication in the Robotics and Autonomous Systems journal

  27. arXiv:2203.02511 [pdf, other]

    cs.RO

    Self-Supervised Learning for Joint Pushing and Grasping Policies in Highly Cluttered Environments

    Authors: Yongliang Wang, Kamal Mokhtar, Cock Heemskerk, Hamidreza Kasaei

    Abstract: Robots often face situations where grasping a goal object is desirable but not feasible due to other present objects preventing the grasp action. We present a deep Reinforcement Learning approach to learn grasping and pushing policies for manipulating a goal object in highly cluttered environments to address this problem. In particular, a dual Reinforcement Learning model approach is proposed, whi…

    Submitted 16 March, 2024; v1 submitted 4 March, 2022; originally announced March 2022.

    Comments: This paper has been accepted for publication at the ICRA2024 conference

  28. arXiv:2109.11544 [pdf, other]

    cs.RO cs.CV cs.LG

    Lifelong 3D Object Recognition and Grasp Synthesis Using Dual Memory Recurrent Self-Organization Networks

    Authors: Krishnakumar Santhakumar, Hamidreza Kasaei

    Abstract: Humans learn to recognize and manipulate new objects in lifelong settings without forgetting the previously gained knowledge under non-stationary and sequential conditions. In autonomous systems, the agents also need to mitigate similar behavior to continually learn the new object categories and adapt to new environments. In most conventional deep neural networks, this is not possible due to the p…

    Submitted 23 January, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

  29. arXiv:2106.01866 [pdf, other]

    cs.RO cs.CV

    Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains

    Authors: Hamidreza Kasaei, Sha Luo, Remo Sasso, Mohammadreza Kasaei

    Abstract: To aid humans in everyday tasks, robots need to know which objects exist in the scene, where they are, and how to grasp and manipulate them in different situations. Therefore, object recognition and grasping are two key functionalities for autonomous robots. Most state-of-the-art approaches treat object recognition and grasping as two separate problems, even though both use visual input. Furthermo…

    Submitted 6 December, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: arXiv admin note: text overlap with arXiv:2103.10997

  30. arXiv:2103.13834 [pdf, other]

    cs.RO cs.AI

    Self-Imitation Learning by Planning

    Authors: Sha Luo, Hamidreza Kasaei, Lambert Schomaker

    Abstract: Imitation learning (IL) enables robots to acquire skills quickly by transferring expert knowledge, which is widely adopted in reinforcement learning (RL) to initialize exploration. However, in long-horizon motion planning tasks, a challenging problem in deploying IL and RL methods is how to generate and collect massive, broadly distributed data such that these methods can generalize effectively. I…

    Submitted 26 March, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

  31. arXiv:2103.10997 [pdf, other]

    cs.RO

    MVGrasp: Real-Time Multi-View 3D Object Grasping in Highly Cluttered Environments

    Authors: Hamidreza Kasaei, Mohammadreza Kasaei

    Abstract: Nowadays robots play an increasingly important role in our daily life. In human-centered environments, robots often encounter piles of objects, packed items, or isolated objects. Therefore, a robot must be able to grasp and manipulate different objects in various situations to help humans with daily tasks. In this paper, we propose a multi-view deep learning approach to handle robust object graspi…

    Submitted 5 October, 2022; v1 submitted 19 March, 2021; originally announced March 2021.

    Comments: The video of our experiments can be found here: https://youtu.be/c-4lzjbF7fY

  32. arXiv:2103.09863 [pdf, other]

    cs.RO

    MORE: Simultaneous Multi-View 3D Object Recognition and Pose Estimation

    Authors: Tommaso Parisotto, Subhaditya Mukherjee, Hamidreza Kasaei

    Abstract: Simultaneous object recognition and pose estimation are two key functionalities for robots to safely interact with humans as well as environments. Although both object recognition and pose estimation use visual input, most state-of-the-art tackles them as two separate problems since the former needs a view-invariant representation while object pose estimation necessitates a view-dependent descript…

    Submitted 7 April, 2023; v1 submitted 17 March, 2021; originally announced March 2021.

  33. arXiv:2103.09720 [pdf, other]

    cs.CV cs.AI

    Few-Shot Visual Grounding for Natural Human-Robot Interaction

    Authors: Giorgos Tziafas, Hamidreza Kasaei

    Abstract: Natural Human-Robot Interaction (HRI) is one of the key components for service robots to be able to work in human-centric environments. In such dynamic environments, the robot needs to understand the intention of the user to accomplish a task successfully. Towards addressing this point, we propose a software architecture that segments a target object from a crowded scene, indicated verbally by a h…

    Submitted 31 March, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

    Comments: 6 pages, 4 figures, ICARSC2021 accepted

  34. arXiv:2103.05857 [pdf, ps, other]

    cs.LG math.OC

    Fast block-coordinate Frank-Wolfe algorithm for semi-relaxed optimal transport

    Authors: Takumi Fukunaga, Hiroyuki Kasai

    Abstract: Optimal transport (OT), which provides a distance between two probability distributions by considering their spatial locations, has been applied to widely diverse applications. Computing an OT problem requires solution of linear programming with tight mass-conservation constraints. This requirement hinders its application to large-scale problems. To alleviate this issue, the recently proposed rela…

    Submitted 9 March, 2021; originally announced March 2021.

  35. arXiv:2103.00902 [pdf, other]

    cs.LG math.OC

    Manifold optimization for non-linear optimal transport problems

    Authors: Bamdev Mishra, N T V Satyadev, Hiroyuki Kasai, Pratik Jawanpuria

    Abstract: Optimal transport (OT) has recently found widespread interest in machine learning. It allows to define novel distances between probability measures, which have shown promise in several applications. In this work, we discuss how to computationally approach general non-linear OT problems within the framework of Riemannian manifold optimization. The basis of this is the manifold of doubly stochastic…

    Submitted 8 October, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: technical report, change in title, addition of experiments

  36. arXiv:2012.03612 [pdf, ps, other]

    cs.LG cs.AI cs.DS stat.ML

    LCS Graph Kernel Based on Wasserstein Distance in Longest Common Subsequence Metric Space

    Authors: Jianming Huang, Zhongxi Fang, Hiroyuki Kasai

    Abstract: For graph learning tasks, many existing methods utilize a message-passing mechanism where vertex features are updated iteratively by aggregation of neighbor information. This strategy provides an efficient means for graph features extraction, but obtained features after many iterations might contain too much information from other vertices, and tend to be similar to each other. This makes their re…

    Submitted 29 October, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Journal ref: Signal Processing, Vol.189, 2021

  37. arXiv:2011.12542 [pdf, ps, other]

    cs.LG stat.ML

    Wasserstein k-means with sparse simplex projection

    Authors: Takumi Fukunaga, Hiroyuki Kasai

    Abstract: This paper presents a proposal of a faster Wasserstein $k$-means algorithm for histogram data by reducing Wasserstein distance computations and exploiting sparse simplex projection. We shrink data samples, centroids, and the ground cost matrix, which leads to considerable reduction of the computations used to solve optimal transport problems without loss of clustering quality. Furthermore, we dyna…

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: Accepted in ICPR2020

  38. arXiv:2011.12532 [pdf, ps, other]

    cs.LG stat.ML

    Consistency-aware and Inconsistency-aware Graph-based Multi-view Clustering

    Authors: Mitsuhiko Horie, Hiroyuki Kasai

    Abstract: Multi-view data analysis has gained increasing popularity because multi-view data are frequently encountered in machine learning applications. A simple but promising approach for clustering of multi-view data is multi-view clustering (MVC), which has been developed extensively to classify given subjects into some clustered groups by learning latent common features that are shared across multi-view…

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: Accepted in EUSIPCO2020

  39. arXiv:2010.14773 [pdf, ps, other]

    cs.LG

    Graph embedding using multi-layer adjacent point merging model

    Authors: Jianming Huang, Hiroyuki Kasai

    Abstract: For graph classification tasks, many traditional kernel methods focus on measuring the similarity between graphs. These methods have achieved great success on resolving graph isomorphism problems. However, in some classification problems, the graph class depends on not only the topological similarity of the whole graph, but also constituent subgraph patterns. To this end, we propose a novel graph…

    Submitted 17 February, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021). arXiv admin note: text overlap with arXiv:2012.03612

  40. arXiv:2009.09235 [pdf, other]

    cs.CV cs.RO

    Open-Ended Fine-Grained 3D Object Categorization by Combining Shape and Texture Features in Multiple Colorspaces

    Authors: Nils Keunecke, S. Hamidreza Kasaei

    Abstract: As a consequence of an ever-increasing number of service robots, there is a growing demand for highly accurate real-time 3D object recognition. Considering the expansion of robot applications in more complex and dynamic environments, it is evident that it is not possible to pre-program all object categories and anticipate all exceptions in advance. Therefore, robots should have the functionality to…

    Submitted 28 May, 2021; v1 submitted 19 September, 2020; originally announced September 2020.

  41. arXiv:2009.07213 [pdf, other]

    cs.CV cs.LG cs.RO

    3D_DEN: Open-ended 3D Object Recognition using Dynamically Expandable Networks

    Authors: Sudhakaran Jain, Hamidreza Kasaei

    Abstract: Service robots, in general, have to work independently and adapt to the dynamic changes happening in the environment in real-time. One important aspect in such scenarios is to continually learn to recognize newer object categories when they become available. This combines two main research problems namely continual learning and 3D object recognition. Most of the existing research approaches includ…

    Submitted 15 March, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

  42. arXiv:2009.01152 [pdf, other]

    cs.CV stat.ML

    Local-HDP: Interactive Open-Ended 3D Object Categorization in Real-Time Robotic Scenarios

    Authors: H. Ayoobi, H. Kasaei, M. Cao, R. Verbrugge, B. Verheij

    Abstract: We introduce a non-parametric hierarchical Bayesian approach for open-ended 3D object categorization, named the Local Hierarchical Dirichlet Process (Local-HDP). This method allows an agent to learn independent topics for each category incrementally and to adapt to the environment in time. Hierarchical Bayesian approaches like Latent Dirichlet Allocation (LDA) can transform low-level features to h…

    Submitted 11 April, 2021; v1 submitted 2 September, 2020; originally announced September 2020.

    Comments: 13 pages

  43. arXiv:2003.08151 [pdf, other]

    cs.RO cs.CV

    The State of Lifelong Learning in Service Robots: Current Bottlenecks in Object Perception and Manipulation

    Authors: S. Hamidreza Kasaei, Jorik Melsen, Floris van Beers, Christiaan Steenkist, Klemen Voncina

    Abstract: Service robots are appearing more and more in our daily life. The development of service robots combines multiple fields of research, from object perception to object manipulation. The state-of-the-art continues to improve to make a proper coupling between object perception and manipulation. This coupling is necessary for service robots not only to perform various tasks in a reasonable amount of t…

    Submitted 6 May, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

  44. arXiv:2002.03892  [pdf, other]

    cs.RO

    Learning to Grasp 3D Objects using Deep Residual U-Nets

    Authors: Yikun Li, Lambert Schomaker, S. Hamidreza Kasaei

    Abstract: Grasp synthesis is one of the challenging tasks for any robot object manipulation task. In this paper, we present a new deep learning-based grasp synthesis approach for 3D objects. In particular, we propose an end-to-end 3D Convolutional Neural Network to predict the objects' graspable areas. We named our approach Res-U-Net since the architecture of the network is designed based on U-Net structure…

    Submitted 12 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  45. arXiv:2002.03779  [pdf, other]

    cs.RO cs.CV

    Investigating the Importance of Shape Features, Color Constancy, Color Spaces and Similarity Measures in Open-Ended 3D Object Recognition

    Authors: S. Hamidreza Kasaei, Maryam Ghorbani, Jits Schilperoort, Wessel van der Rest

    Abstract: Despite the recent success of state-of-the-art 3D object recognition approaches, service robots frequently fail to recognize many objects in real human-centric environments. For these robots, object recognition is a challenging task due to the high demand for accurate and real-time response under changing and unpredictable environmental conditions. Most of the recent approaches use either th…

    Submitted 26 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  46. Accelerating Reinforcement Learning for Reaching using Continuous Curriculum Learning

    Authors: Sha Luo, Hamidreza Kasaei, Lambert Schomaker

    Abstract: Reinforcement learning has shown great promise in the training of robot behavior due to its sequential decision-making characteristics. However, the enormous amount of interactive and informative training data required is the major stumbling block for progress. In this study, we focus on accelerating reinforcement learning (RL) training and improving the performance of multi-goal reaching ta…

    Submitted 21 December, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  47. arXiv:1912.09539  [pdf, other]

    cs.RO cs.AI cs.CV

    Interactive Open-Ended Learning for 3D Object Recognition

    Authors: S. Hamidreza Kasaei

    Abstract: The thesis contributes in several important ways to the research area of 3D object category learning and recognition. To cope with the mentioned limitations, we look at human cognition, in particular at the fact that human beings learn to recognize object categories ceaselessly over time. This ability to refine knowledge from the set of accumulated experiences facilitates the adaptation to new env…

    Submitted 19 December, 2019; originally announced December 2019.

    Comments: PhD thesis

  48. arXiv:1907.12924  [pdf, other]

    cs.CV cs.RO

    Look Further to Recognize Better: Learning Shared Topics and Category-Specific Dictionaries for Open-Ended 3D Object Recognition

    Authors: S. Hamidreza Kasaei

    Abstract: Service robots are expected to operate effectively in human-centric environments for long periods of time. In such realistic scenarios, fine-grained object categorization is as important as basic-level object categorization. We tackle this problem by proposing an open-ended object recognition approach which concurrently learns both the object categories and the local features for encoding objects.…

    Submitted 26 July, 2019; originally announced July 2019.

    Comments: arXiv admin note: text overlap with arXiv:1902.03057

  49. arXiv:1907.10932  [pdf, other]

    cs.RO

    Object Perception and Grasping in Open-Ended Domains

    Authors: S. Hamidreza Kasaei

    Abstract: Nowadays service robots are leaving the structured and completely known environments and entering human-centric settings. For these robots, object perception and grasping are two challenging tasks due to the high demand for accurate and real-time responses. Although many problems have already been understood and solved successfully, many challenges still remain. Open-ended learning is one of these…

    Submitted 25 July, 2019; originally announced July 2019.

  50. arXiv:1906.10436  [pdf, other]

    math.OC cs.LG

    Riemannian optimization on the simplex of positive definite matrices

    Authors: Bamdev Mishra, Hiroyuki Kasai, Pratik Jawanpuria

    Abstract: In this work, we generalize the probability simplex constraint to matrices, i.e., $\mathbf{X}_1 + \mathbf{X}_2 + \ldots + \mathbf{X}_K = \mathbf{I}$, where $\mathbf{X}_i \succeq 0$ is a symmetric positive semidefinite matrix of size $n\times n$ for all $i \in \{1,\ldots,K\}$. By assuming positive definiteness of the matrices, we show that the constraint set arising from the matrix simplex has the s…

    Submitted 17 November, 2020; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: 12th OPT Workshop on Optimization for Machine Learning at NeurIPS 2020