Showing 1–13 of 13 results for author: Stepanova, K

  1. arXiv:2404.15194  [pdf, other]

    cs.RO cs.CV

    Closed Loop Interactive Embodied Reasoning for Robot Manipulation

    Authors: Michal Nazarczuk, Jan Kristof Behrens, Karla Stepanova, Matej Hoffmann, Krystian Mikolajczyk

    Abstract: Embodied reasoning systems integrate robotic hardware and cognitive processes to perform complex tasks typically in response to a natural language query about a specific physical environment. This usually involves changing the belief about the scene or physically interacting and changing the scene (e.g. 'Sort the objects from lightest to heaviest'). In order to facilitate the development of such s…

    Submitted 23 April, 2024; originally announced April 2024.

  2. arXiv:2404.01932  [pdf, other]

    cs.RO cs.LG

    Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks

    Authors: Gabriela Sejnova, Michal Vavrecka, Karla Stepanova

    Abstract: In this work, we focus on unsupervised vision-language-action mapping in the area of robotic manipulation. Recently, multiple approaches employing pre-trained large language and vision models have been proposed for this task. However, they are computationally demanding and require careful fine-tuning of the produced outputs. A more lightweight alternative would be the implementation of multimodal…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 7 pages, 5 figures, 2 tables, conference

  3. arXiv:2404.01702  [pdf, other]

    cs.HC cs.RO

    Tell and show: Combining multiple modalities to communicate manipulation tasks to a robot

    Authors: Petr Vanc, Radoslav Skoviera, Karla Stepanova

    Abstract: As human-robot collaboration is becoming more widespread, there is a need for a more natural way of communicating with the robot. This includes combining data from several modalities together with the context of the situation and background knowledge. Current approaches to communication typically rely only on a single modality or are often very rigid and not robust to missing, misaligned, or noisy…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 8 pages, 8 figures

  4. arXiv:2312.06280  [pdf, other]

    cs.LG cs.AI

    Adaptive Compression of the Latent Space in Variational Autoencoders

    Authors: Gabriela Sejnova, Michal Vavrecka, Karla Stepanova

    Abstract: Variational Autoencoders (VAEs) are powerful generative models that have been widely used in various fields, including image and text generation. However, one of the known challenges in using VAEs is the model's sensitivity to its hyperparameters, such as the latent space size. This paper presents a simple extension of VAEs for automatically determining the optimal latent space size during the tra…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 10 pages, 4 figures

  5. Communicating human intent to a robotic companion by multi-type gesture sentences

    Authors: Petr Vanc, Jan Kristof Behrens, Karla Stepanova, Vaclav Hlavac

    Abstract: Human-Robot collaboration in home and industrial workspaces is on the rise. However, the communication between robots and humans is a bottleneck. Although people use a combination of different types of gestures to complement speech, only a few robotic systems utilize gestures for communication. In this paper, we propose a gesture pseudo-language and show how multiple types of gestures can be combi…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 7 pages, 9 figures

  6. Context-aware robot control using gesture episodes

    Authors: Petr Vanc, Jan Kristof Behrens, Karla Stepanova

    Abstract: Collaborative robots became a popular tool for increasing productivity in partly automated manufacturing plants. Intuitive robot teaching methods are required to quickly and flexibly adapt the robot programs to new tasks. Gestures have an essential role in human communication. However, in human-robot-interaction scenarios, gesture-based user interfaces are so far used rarely, and if they employ a…

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 7 pages, 8 figures, accepted for ICRA 2023

  7. Imitrob: Imitation Learning Dataset for Training and Evaluating 6D Object Pose Estimators

    Authors: Jiri Sedlar, Karla Stepanova, Radoslav Skoviera, Jan K. Behrens, Matus Tuna, Gabriela Sejnova, Josef Sivic, Robert Babuska

    Abstract: This paper introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera. Despite the significant progress of 6D pose estimation methods, their performance is usually limited for heavily occluded objects, which is a common case in imitation learning, where the object is typically partially occluded by the…

    Submitted 5 April, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: The dataset and code are publicly available at http://imitrob.ciirc.cvut.cz/imitrobdataset.php

    Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 5, pp. 2788-2795, 2023

  8. arXiv:2209.03048  [pdf, other]

    cs.LG

    Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit

    Authors: Gabriela Sejnova, Michal Vavrecka, Karla Stepanova

    Abstract: Multimodal Variational Autoencoders (VAEs) have been the subject of intense research in the past years as they can integrate multiple modalities into a joint representation and can thus serve as a promising tool for both data classification and generation. Several approaches toward multimodal VAE learning have been proposed so far, their comparison and evaluation have however been rather inconsist…

    Submitted 24 November, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

  9. arXiv:2204.06343  [pdf, other]

    cs.RO

    Single-grasp deformable object discrimination: the effect of gripper morphology, sensing modalities, and action parameters

    Authors: Michal Pliska, Shubhan Patni, Michal Mares, Pavel Stoudek, Zdenek Straka, Karla Stepanova, Matej Hoffmann

    Abstract: In haptic object discrimination, the effect of gripper embodiment, action parameters, and sensory channels has not been systematically studied. We used two anthropomorphic hands and two 2-finger grippers to grasp two sets of deformable objects. On the object classification task, we found: (i) among classifiers, SVM on sensory features and LSTM on raw time series performed best across all grippers;…

    Submitted 2 February, 2024; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: 12 pages, 9 figures

    ACM Class: I.2.9

  10. Automatic self-contained calibration of an industrial dual-arm robot with cameras using self-contact, planar constraints, and self-observation

    Authors: Karla Stepanova, Jakub Rozlivek, Frantisek Puciow, Pavel Krsek, Tomas Pajdla, Matej Hoffmann

    Abstract: We present a robot kinematic calibration method that combines complementary calibration approaches: self-contact, planar constraints, and self-observation. We analyze the estimation of the end effector parameters, joint offsets of the manipulators, and calibration of the complete kinematic chain (DH parameters). The results are compared with ground truth measurements provided by a laser tracker. O…

    Submitted 24 September, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: 25 pages, 29 figures

    Journal ref: Robotics and Computer-Integrated Manufacturing 2022, Volume 73, 102250

  11. arXiv:1901.08335  [pdf, other]

    cs.HC cs.RO

    Teaching robots to imitate a human with no on-teacher sensors. What are the key challenges?

    Authors: Radoslav Skoviera, Karla Stepanova, Michael Tesar, Gabriela Sejnova, Jiri Sedlar, Michal Vavrecka, Robert Babuska, Josef Sivic

    Abstract: In this paper, we consider the problem of learning object manipulation tasks from human demonstration using RGB or RGB-D cameras. We highlight the key challenges in capturing sufficiently good data with no tracking devices - starting from sensor selection and accurate 6DoF pose estimation to natural language processing. In particular, we focus on two showcases: gluing task with a glue gun and simp…

    Submitted 24 January, 2019; originally announced January 2019.

    Journal ref: The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2018, Workshop on: Towards Intelligent Social Robots: From Naive Robots to Robot Sapiens http://intelligent-social-robots-ws.com/materials/

  12. Robot self-calibration using multiple kinematic chains -- a simulation study on the iCub humanoid robot

    Authors: Karla Stepanova, Tomas Pajdla, Matej Hoffmann

    Abstract: Mechanism calibration is an important and non-trivial task in robotics. Advances in sensor technology make affordable but increasingly accurate devices such as cameras and tactile sensors available, making it possible to perform automated self-contained calibration relying on redundant information in these sensory streams. In this work, we use a simulated iCub humanoid robot with a stereo camera s…

    Submitted 31 August, 2020; v1 submitted 18 May, 2018; originally announced May 2018.

    Comments: 8 pages; 8 figures; substantially revised version compared to previous - all data and results are new

    MSC Class: 68T40 Robotics

    Journal ref: IEEE Robotics and Automation Letters 4(2), 1900-1907 (2019)

  13. arXiv:1706.02490  [pdf, other]

    cs.NE cs.AI cs.CL cs.LG cs.RO

    Where is my forearm? Clustering of body parts from simultaneous tactile and linguistic input using sequential mapping

    Authors: Karla Stepanova, Matej Hoffmann, Zdenek Straka, Frederico B. Klein, Angelo Cangelosi, Michal Vavrecka

    Abstract: Humans and animals are constantly exposed to a continuous stream of sensory information from different modalities. At the same time, they form more compressed representations like concepts or symbols. In species that use language, this process is further structured by this interaction, where a mapping between the sensorimotor concepts and linguistic elements needs to be established. There is evide…

    Submitted 8 June, 2017; originally announced June 2017.

    Comments: pp. 155-162